Memory Usage of Different VCL Indexing Engines
The user can select different engines to build an index file for descriptors. The indices are used for similarity search (KNN) queries with an input descriptor of D dimensions. Choosing the right engine and index depends on the available system memory, the total number of descriptors, the dimension of the descriptors, the insert and search overhead, and the target accuracy for the specific use case.
- FAISS Engine: The VCL library supports the `FLAT-L2`, `FLAT-IP`, `IVFFLAT-L2`, and `IVFFLAT-IP` indices.
  - The `FLAT` index is the only index that guarantees exact results (for both the `L2` and `IP` metrics); it provides the baseline for the results of the other indices. It does not compress the vectors and therefore has the largest memory footprint of all the indices. For `N` descriptors with `D` dimensions each, the index memory footprint is `N x D x 4` bytes (the size of a float). For example, for 10,000 embeddings each of dimension 1K, the memory footprint of the `FLAT` index will be 40MB (a footprint sketch follows this list).
  - The `IVFFLAT` index returns approximate results for similarity search queries. It speeds up search by clustering the descriptors (organizing them into buckets) and returning the nearest neighbors found within the searched clusters. The `nprobe` parameter determines the number of clusters searched before the nearest-neighbor results are returned; increasing `nprobe` increases the search latency linearly. As for the memory footprint, since `IVFFLAT` stores the exact descriptors uncompressed (like the `FLAT` index), it also uses `N x D x 4` bytes to store `N` descriptors with `D` dimensions each. For example, for 10,000 embeddings each of dimension 1K, the memory footprint of the `IVFFLAT` index will also be 40MB (a FAISS usage sketch showing `nprobe` follows the list).
- FLINNG Engine: The VCL library supports `FLINNG-L2` and `FLINNG-IP`. FLINNG is an approximate index that returns approximate results for nearest-neighbor queries. It does not store the exact descriptors; instead, it represents the index with randomized hash-based data structures. FLINNG is attractive for datasets with a large number of dimensions (the curse of dimensionality, e.g., >1000 dimensions) and when the total number of descriptors is also very large (millions). FLINNG will have lower accuracy than the other engines when the number of dimensions is small. FLINNG uses an almost fixed amount of memory irrespective of the total number of vectors. The parameter `num_rows` (the number of hash tables used) is important for query time: both the query time and the indexing time are linearly proportional to it, and the returned query accuracy also increases asymptotically with `num_rows`. The total memory footprint of FLINNG is approximately `(2^cells_per_row) x num_rows x 8B + 2^num_hash_tables x 8B + N x 8B` (see the formula sketch after this list). The default parameters of the library are set in `src/vcl/DescriptorParams.h` and can be overridden in the function call that creates the FLINNG index. The default parameters should support 10M embeddings and should consume around 4GB of memory, irrespective of the dimensionality of the descriptors.
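To make the FLAT/IVFFLAT arithmetic above concrete, here is a minimal sketch (plain Python, not part of the VCL API) that computes the uncompressed footprint:

```python
def flat_index_footprint_bytes(num_descriptors: int, dimensions: int) -> int:
    """Uncompressed FLAT/IVFFLAT footprint: N x D x 4 bytes (float32 descriptors)."""
    return num_descriptors * dimensions * 4

# 10,000 embeddings of dimension 1K -> 40,000,000 bytes (~40MB), as in the example above.
print(flat_index_footprint_bytes(10_000, 1_000))
```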
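The `nprobe` trade-off can also be illustrated with the FAISS library directly; the sketch below uses the `faiss` Python package rather than the VCL wrapper, and the values of `d`, `nlist`, and `nprobe` are only illustrative:

```python
import numpy as np
import faiss  # underlying library used by the VCL FAISS engine

d = 128                                                # descriptor dimension
xb = np.random.random((10_000, d)).astype("float32")   # database descriptors
xq = np.random.random((5, d)).astype("float32")        # query descriptors

# Exact baseline: FLAT index with the L2 metric.
flat = faiss.IndexFlatL2(d)
flat.add(xb)

# Approximate search: IVFFLAT clusters the descriptors into nlist buckets.
nlist = 100
quantizer = faiss.IndexFlatL2(d)
ivf = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_L2)
ivf.train(xb)   # learn the cluster centroids
ivf.add(xb)

# nprobe = number of clusters scanned per query: a larger nprobe gives
# higher recall at the cost of (roughly linearly) higher search latency.
ivf.nprobe = 10
distances, ids = ivf.search(xq, 5)   # 5 nearest neighbors per query
```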
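Similarly, the FLINNG footprint formula can be evaluated with a short sketch; the parameter values below are hypothetical placeholders chosen only to show the shape of the formula, not the actual defaults from `src/vcl/DescriptorParams.h`:

```python
def flinng_footprint_bytes(num_descriptors: int, num_rows: int,
                           cells_per_row: int, num_hash_tables: int) -> int:
    """Approximate FLINNG footprint:
    (2^cells_per_row) * num_rows * 8B + (2^num_hash_tables) * 8B + N * 8B.
    """
    return ((2 ** cells_per_row) * num_rows * 8
            + (2 ** num_hash_tables) * 8
            + num_descriptors * 8)

# Hypothetical parameter values (not the library defaults).
print(flinng_footprint_bytes(num_descriptors=10_000_000,
                             num_rows=4,
                             cells_per_row=22,
                             num_hash_tables=14))
```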