Library Features

Here we present the main library features: the supported index types, distance functions, and data types.

Index Types

SVS supports the following index types:

Graphs for static datasets

The Vamana graph (in Python, in C++) enables fast in-memory graph-based similarity search with high accuracy for static databases, where the database is fixed and never updated.

Graphs for streaming data

The DynamicVamana graph (in Python) enables fast in-memory graph-based similarity search with high accuracy for streaming data, where the database is built dynamically by adding and removing vectors.
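
A rough sketch of this streaming workflow in Python follows; the pysvs.DynamicVamana method names and signatures used here are assumptions for illustration, so consult the reference documentation for the exact API.

import numpy as np
import pysvs

# Placeholder data: 1000 random 128-dimensional vectors with ids 0..999.
initial_vectors = np.random.rand(1000, 128).astype(np.float32)
initial_ids = np.arange(1000)

parameters = pysvs.VamanaBuildParameters(graph_max_degree = 64, window_size = 128)

# Build the dynamic index from an initial batch of vectors (assumed signature).
index = pysvs.DynamicVamana.build(
    parameters, initial_vectors, initial_ids, pysvs.DistanceType.L2, num_threads = 4
)

# Add and remove vectors as the stream evolves (assumed method names).
new_vectors = np.random.rand(100, 128).astype(np.float32)
new_ids = np.arange(1000, 1100)
index.add(new_vectors, new_ids)
index.delete(new_ids[:50])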

Flat Index

The flat index (in Python, in C++) can be used to run an exhaustive search, which is useful, for example, to compute the ground-truth nearest neighbors for a dataset.
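
For instance, a minimal sketch of exhaustive search in Python (the data file name is a placeholder, and the pysvs.Flat constructor arguments shown should be checked against the reference):

import numpy as np
import pysvs

# Load the dataset and create a flat index; search is exhaustive, so the
# returned neighbors are exact and can serve as the ground truth.
data_loader = pysvs.VectorDataLoader("example_data", pysvs.DataType.float32)
index = pysvs.Flat(data_loader, distance = pysvs.DistanceType.L2, num_threads = 4)

queries = np.random.rand(10, 128).astype(np.float32)  # placeholder queries
I, D = index.search(queries, 10)  # ids and distances of the 10 nearest neighbors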

Distance functions

SVS supports the distance functions listed in Built-In Distance Functors (see pysvs.DistanceType for the corresponding Python classes). The distance function is specified at index creation time through the corresponding index constructors. In the case of the Vamana index, it must also be specified when the graph is built (see pysvs.Vamana.build and svs::Vamana::build() for details).
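
For example, in Python the distance function is selected with a pysvs.DistanceType value passed to the build call. A minimal sketch mirroring the build example later in this section (the data file name is a placeholder):

import pysvs

parameters = pysvs.VamanaBuildParameters(graph_max_degree = 64, window_size = 128)
index = pysvs.Vamana.build(
    parameters,
    pysvs.VectorDataLoader("example_data", pysvs.DataType.float32),
    pysvs.DistanceType.L2,  # distance used to build and search the graph
    num_threads = 4
)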

Data types

The supported data types are float32, float16, int8, and uint8. Other data types might work, but their performance has not been tested.

The data type can be set independently for the database vectors and the query vectors. For example, one could compress the database vectors to float16, which provides a 2x storage reduction, often with negligible accuracy loss, and keep the queries in float32.

In Python

The data type for the database vectors is specified by the data_type argument when the vectors are loaded with pysvs.VectorDataLoader. The data type for the query vectors is specified in the query_type argument for the corresponding index constructors (pysvs.Vamana, pysvs.Flat).
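
A sketch combining a float16 database with float32 queries follows; the paths are placeholders and the constructor argument order is an assumption, so check the pysvs.Vamana reference for the exact signature.

import pysvs

# Database vectors stored on disk as float16 (2x storage reduction).
data_loader = pysvs.VectorDataLoader("data_f16.svs", pysvs.DataType.float16)

# Queries kept in float32 via the query_type argument (assumed layout and order).
index = pysvs.Vamana(
    "example_config",  # index configuration directory
    "example_graph",   # graph directory
    data_loader,
    distance = pysvs.DistanceType.L2,
    query_type = pysvs.DataType.float32,
)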

In C++

The data type of the database vectors is specified by the first template argument of svs::VectorDataLoader:

svs::VectorDataLoader<float>("data_f32.svs")

For details on setting the data type of the query vectors, see Vamana Index and Flat Orchestrator.


Warning

This will not perform any dataset conversion. If a dataset was saved to disk as float16 data, for example, then it must be loaded with data_type = pysvs.DataType.float16 in Python or svs::Float16 in C++.

The supported data type combinations for (queries, database vectors) include (float32, float32), (float32, float16), (uint8, uint8), and (int8, int8), among others.

Vector compression

The memory footprint can be reduced, and the search performance improved, by combining the graph search with the Locally-adaptive Vector Quantization (LVQ) [ABHT23] approach. The library supports performing online vector compression with LVQ, as well as loading an index with a previously compressed dataset. See How to Choose Compression Parameters for more details about LVQ and how to set its parameters.

Search with compressed vectors

See Search using vector compression for details on how to use LVQ for search.

Graph building with compressed vectors

LVQ-compressed vectors can be used to build the graph, thus reducing the memory footprint not only for search but also for indexing. Depending on the dataset, the search accuracy can be almost unchanged even when the graph is built with highly compressed vectors using LVQ-4 or LVQ-8, that is, using only 4 or 8 bits per vector component [ABHT23].

To build the graph using compressed vectors, follow the same procedure as for uncompressed graph building. First, define the build parameters:

In Python

# Define the parameters for building the graph index.
parameters = pysvs.VamanaBuildParameters(
    graph_max_degree = 64,
    window_size = 128,
)

In C++

auto parameters = svs::index::vamana::VamanaBuildParameters{
    1.2,  // alpha
    64,   // graph max degree
    128,  // search window size
    1024, // max candidate pool size
    60,   // prune to degree
    true, // full search history
};

then load the dataset using the loader corresponding to the chosen compression type (details for Python, details for C++):

In Python

data_loader = pysvs.VectorDataLoader(
    os.path.join(test_data_dir, "example_data"),  # Uncompressed data
    pysvs.DataType.float32,
    dims = 128    # Passing dimensionality is optional
)
B1 = 4    # Number of bits for the first level LVQ quantization
B2 = 8    # Number of bits for the residuals quantization
padding = 32
strategy = pysvs.LVQStrategy.Turbo
compressed_loader = pysvs.LVQLoader(data_loader,
    primary=B1,
    residual=B2,
    strategy=strategy, # Passing the strategy is optional.
    padding=padding # Passing padding is optional.
)

In C++

// Quantization
size_t padding = 32;
namespace lvq = svs::quantization::lvq;

// Wrap the compressor object in a lazy functor.
// This will defer loading and compression of the LVQ dataset until the threadpool
// used in the index has been created.
auto compressor = svs::lib::Lazy([=](svs::threads::ThreadPool auto& threadpool) {
    auto data = svs::VectorDataLoader<float, 128>("example_data").load();
    return lvq::LVQDataset<8, 0, 128>::compress(data, threadpool, padding);
});

and finally, call the build function with the chosen compression loader:

In Python

# Build the index.
index = pysvs.Vamana.build(
    parameters,
    compressed_loader,
    pysvs.DistanceType.L2,
    num_threads = 4
)

In C++

// Compressed building
size_t num_threads = 4;  // number of threads used for graph construction
index =
    svs::Vamana::build<float>(parameters, compressor, svs::DistanceL2(), num_threads);
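
Once built, the index can be searched as usual. A brief sketch in Python (queries is assumed to be a NumPy float32 array, and the window size value is only illustrative):

# Search the index built from LVQ-compressed vectors.
index.search_window_size = 30
I, D = index.search(queries, 10)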

References

[SDSK19]

Subramanya, S.J.; Devvrit, F.; Simhadri, H.V.; Krishnaswamy, R.; Kadekodi, R.: DiskANN: Fast accurate billion-point nearest neighbor search on a single node. In: Advances in Neural Information Processing Systems 32 (2019).

[ABHT23]

Aguerrebere, C.; Bhati, I.; Hildebrand, M.; Tepper, M.; Willke, T.: Similarity search in the blink of an eye with compressed indices. In: Proceedings of the VLDB Endowment 16, 11 (2023), 3433–3446.

[AHBW24]

Aguerrebere, C.; Hildebrand, M.; Bhati, I.; Willke, T.; Tepper, M.: Locally-adaptive Quantization for Streaming Vector Search. In: arXiv preprint arXiv:2402.02044 (2024).

[MaYa18]

Malkov, Y.A. and Yashunin, D.A.: Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 42, 4 (2018), 824–836.

[JoDJ19]

Johnson, J.; Douze, M.; Jégou, H.: Billion-scale similarity search with GPUs. In: IEEE Transactions on Big Data 7, 3 (2019), 535–547.

[GSLG20]

Guo, R.; Sun, P.; Lindgren, E.; Geng, Q.; Simcha, D.; Chern, F.; Kumar, S.: Accelerating large-scale inference with anisotropic vector quantization. In: International Conference on Machine Learning. PMLR, 3887–3896 (2020).

[IwMi18]

Iwasaki, M. and Miyazaki, D.: Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data. https://github.com/yahoojapan/NGT (2018).

[AuBF20]

Aumüller, M.; Bernhardsson, E.; Faithfull, A.: ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms. In: Information Systems 87 (2020), 101374. https://doi.org/10.1016/j.is.2019.02.006

[QDLL21]

Qu, Y.; Ding, Y.; Liu, J.; Liu, K.; Ren, R.; Zhao, W.X.; Dong, D.; Wu, H. and Wang, H.: RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering. In: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 5835–5847 (2021).

[SSKS21]

Singh, A.; Subramanya, S.J.; Krishnaswamy, R.; Simhadri, H.V.: FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search. In: arXiv preprint arXiv:2105.09613 (2021).

[DGDJ24]

Douze, M.; Guzhva, A.; Deng, C.; Johnson, J.; Szilvasy, G.; Mazaré, P.E.; Lomeli, M.; Hosseini, L.; Jégou, H.: The Faiss library. In: arXiv preprint arXiv:2401.08281 (2024).

[TBAH24]

Tepper, M.; Bhati, I.; Aguerrebere, C.; Hildebrand, M.; Willke, T.: LeanVec: Search your vectors faster by making them fit. In: arXiv preprint arXiv:2312.16335 (2024).