Flat Index

class pysvs.Flat
__init__(*args, **kwargs)

Overloaded function.

  1. __init__(self: pysvs.Flat, data_loader: Union[pysvs.VectorDataLoader, pysvs.LVQLoader], distance: pysvs.DistanceType = <DistanceType.L2: 0>, query_type: pysvs.DataType = <DataType.float32: 9>, num_threads: int = 1) -> None

Load a Flat index from data stored on disk.

Parameters:
  • data_loader – The loader for the dataset.

  • distance – The distance function to use.

  • query_type – The data type of the queries.

  • num_threads – The number of threads to use for queries (can be changed after loading).

The top-level type is an abstract type backed by various specialized backends. A backend is instantiated based on its applicability to the particular problem instance.

The arguments upon which specialization is conducted are:

  • data_loader: Both the kind (type of loader) and inner aspects of the loader, such as data type, quantization type, and number of dimensions.

  • distance: The distance measure being used.

Specializations compiled into the binary are listed below.

Method 0:
  • data_loader: VectorDataLoader with element type float32 and any dimension – (union alternative 0)

  • distance: all values

Method 1:
  • data_loader: VectorDataLoader with element type float16 and any dimension – (union alternative 0)

  • distance: all values

Method 2:
  • data_loader: VectorDataLoader with element type uint8 and any dimension – (union alternative 0)

  • distance: all values

Method 3:
  • data_loader: VectorDataLoader with element type int8 and any dimension – (union alternative 0)

  • distance: all values

Method 4:
  • data_loader: LVQLoader 4x4 (sequential) with any dimensions – (union alternative 1)

  • distance: L2

Method 5:
  • data_loader: LVQLoader 8x0 (sequential) with any dimensions – (union alternative 1)

  • distance: L2

Method 6:
  • data_loader: LVQLoader 4x4 (sequential) with any dimensions – (union alternative 1)

  • distance: MIP

Method 7:
  • data_loader: LVQLoader 8x0 (sequential) with any dimensions – (union alternative 1)

  • distance: MIP
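The listing above can be read as a lookup table. The following is a minimal sketch that transcribes it into plain Python so a combination can be checked before constructing an index. The names `SPECIALIZATIONS` and `find_method` are hypothetical and not part of pysvs, and returning the first matching entry is an assumption about dispatch order rather than documented behavior.

```python
from typing import Optional

# Specializations transcribed from the listing above.
# Each entry: (loader kind, element type or LVQ layout, distance).
# A distance of None stands for "all values".
SPECIALIZATIONS = [
    ("VectorDataLoader", "float32", None),
    ("VectorDataLoader", "float16", None),
    ("VectorDataLoader", "uint8", None),
    ("VectorDataLoader", "int8", None),
    ("LVQLoader", "4x4", "L2"),
    ("LVQLoader", "8x0", "L2"),
    ("LVQLoader", "4x4", "MIP"),
    ("LVQLoader", "8x0", "MIP"),
]


def find_method(loader: str, layout: str, distance: str) -> Optional[int]:
    """Return the number of the first matching method, or None if none matches.

    Hypothetical helper for illustration; pysvs does not expose this function.
    """
    for number, (kind, elem, dist) in enumerate(SPECIALIZATIONS):
        if kind == loader and elem == layout and dist in (None, distance):
            return number
    return None


print(find_method("VectorDataLoader", "uint8", "MIP"))
print(find_method("LVQLoader", "8x0", "MIP"))
```

A result of None for an LVQLoader with a distance other than L2 or MIP reflects that the listing contains no such method.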

  2. __init__(self: pysvs.Flat, data: numpy.ndarray[float16], distance: pysvs.DistanceType, num_threads: int = 1) -> None

Construct a Flat index over the given data, returning a searchable index.

Parameters:
  • data – The dataset to index. NOTE: PySVS will maintain an internal copy of the dataset. This may change in future releases.

  • distance – The distance type to use for this dataset.

  • num_threads – The number of threads to use for searching. This value can also be changed after the index is constructed.

  3. __init__(self: pysvs.Flat, data: numpy.ndarray[numpy.float32], distance: pysvs.DistanceType, num_threads: int = 1) -> None

Construct a Flat index over the given data, returning a searchable index.

Parameters:
  • data – The dataset to index. NOTE: PySVS will maintain an internal copy of the dataset. This may change in future releases.

  • distance – The distance type to use for this dataset.

  • num_threads – The number of threads to use for searching. This value can also be changed after the index is constructed.

  4. __init__(self: pysvs.Flat, data: numpy.ndarray[numpy.uint8], distance: pysvs.DistanceType, num_threads: int = 1) -> None

Construct a Flat index over the given data, returning a searchable index.

Parameters:
  • data – The dataset to index. NOTE: PySVS will maintain an internal copy of the dataset. This may change in future releases.

  • distance – The distance type to use for this dataset.

  • num_threads – The number of threads to use for searching. This value can also be changed after the index is constructed.

  5. __init__(self: pysvs.Flat, data: numpy.ndarray[numpy.int8], distance: pysvs.DistanceType, num_threads: int = 1) -> None

Construct a Flat index over the given data, returning a searchable index.

Parameters:
  • data – The dataset to index. NOTE: PySVS will maintain an internal copy of the dataset. This may change in future releases.

  • distance – The distance type to use for this dataset.

  • num_threads – The number of threads to use for searching. This value can also be changed after the index is constructed.
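As a rough illustration of the array-based constructors, the sketch below prepares a NumPy array with one of the four listed element types and, if pysvs happens to be installed, builds an index from it. The helper `prepare_dataset` is hypothetical. The requirement of a 2-D, C-contiguous array with one vector per row is an assumption made for this example, not something the signatures above state.

```python
import numpy as np

# Element types accepted by the array-based constructors listed above.
SUPPORTED_DTYPES = (np.float16, np.float32, np.uint8, np.int8)


def prepare_dataset(array, dtype=np.float32):
    """Return a 2-D, C-contiguous array of a supported element type.

    Hypothetical helper; the layout requirement is an assumption.
    """
    if np.dtype(dtype) not in [np.dtype(t) for t in SUPPORTED_DTYPES]:
        raise TypeError(f"unsupported element type: {np.dtype(dtype)}")
    data = np.ascontiguousarray(array, dtype=dtype)
    if data.ndim != 2:
        raise ValueError("expected shape (num_vectors, dimensions)")
    return data


rng = np.random.default_rng(0)
data = prepare_dataset(rng.random((1000, 64)), np.float32)

try:
    import pysvs
except ImportError:
    pysvs = None  # the sketch still runs without the library installed

if pysvs is not None:
    # Signature taken from the overloads above: (data, distance, num_threads).
    index = pysvs.Flat(data, pysvs.DistanceType.L2, num_threads=4)
    print(index.size, index.dimensions)
```

Because PySVS keeps an internal copy of the dataset, the `data` array can be modified or released after construction without affecting the index.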

property dimensions

Return the logical number of dimensions for each vector in the dataset.

property num_threads

Get and set the number of threads used to process queries.

Type:

Read/Write (int)

property query_types

Return the query element types this index is specialized for.

search(*args, **kwargs)

Overloaded function.

  1. search(self: pysvs.Flat, queries: numpy.ndarray[numpy.float32], n_neighbors: int) -> tuple

Perform a search to return the n_neighbors approximate nearest neighbors to the query.

Parameters:
  • queries – NumPy vector or matrix representing the queries. If the argument is a vector, it will be treated as a single query. If the argument is a matrix, individual queries are assumed to be the rows of the matrix. Returned results will have a position-wise correspondence with the queries. That is, the N-th row of the returned IDs and distances will correspond to the N-th row in the query matrix.

  • n_neighbors – The number of neighbors to return for this search job.

Returns:

A tuple (I, D) where I contains the n_neighbors approximate (or exact) nearest neighbors to the queries and D contains the approximate distances.

Note: This form is returned regardless of whether the given query was a vector or a matrix.

  2. search(self: pysvs.Flat, queries: numpy.ndarray[numpy.uint8], n_neighbors: int) -> tuple

Perform a search to return the n_neighbors approximate nearest neighbors to the query.

Parameters:
  • queries – NumPy vector or matrix representing the queries. If the argument is a vector, it will be treated as a single query. If the argument is a matrix, individual queries are assumed to be the rows of the matrix. Returned results will have a position-wise correspondence with the queries. That is, the N-th row of the returned IDs and distances will correspond to the N-th row in the query matrix.

  • n_neighbors – The number of neighbors to return for this search job.

Returns:

A tuple (I, D) where I contains the n_neighbors approximate (or exact) nearest neighbors to the queries and D contains the approximate distances.

Note: This form is returned regardless of whether the given query was a vector or a matrix.

  3. search(self: pysvs.Flat, queries: numpy.ndarray[numpy.int8], n_neighbors: int) -> tuple

Perform a search to return the n_neighbors approximate nearest neighbors to the query.

Parameters:
  • queries – NumPy vector or matrix representing the queries. If the argument is a vector, it will be treated as a single query. If the argument is a matrix, individual queries are assumed to be the rows of the matrix. Returned results will have a position-wise correspondence with the queries. That is, the N-th row of the returned IDs and distances will correspond to the N-th row in the query matrix.

  • n_neighbors – The number of neighbors to return for this search job.

Returns:

A tuple (I, D) where I contains the n_neighbors approximate (or exact) nearest neighbors to the queries and D contains the approximate distances.

Note: This form is returned regardless of whether the given query was a vector or a matrix.
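To make the (I, D) convention concrete, here is a plain NumPy sketch of an exhaustive nearest-neighbor search. It is not the pysvs implementation; `reference_search` is a hypothetical reference function. Two assumptions are made for illustration: distances are squared Euclidean (L2), and the results are always 2-D, with one row per query, even when a single vector is passed.

```python
import numpy as np


def reference_search(data, queries, n_neighbors):
    """Exhaustive L2 search in NumPy; returns a tuple (I, D) of 2-D arrays."""
    data = np.asarray(data, dtype=np.float32)
    # A 1-D query is treated as a single query, i.e. a matrix with one row.
    queries = np.atleast_2d(np.asarray(queries, dtype=np.float32))
    # Squared Euclidean distance between every query and every data vector.
    diff = queries[:, None, :] - data[None, :, :]
    dists = np.einsum("qnd,qnd->qn", diff, diff)
    # Sort each row by distance and keep the n_neighbors closest entries.
    ids = np.argsort(dists, axis=1, kind="stable")[:, :n_neighbors]
    return ids, np.take_along_axis(dists, ids, axis=1)


data = np.array([[0, 0], [1, 0], [0, 2], [3, 3]], dtype=np.float32)
I, D = reference_search(data, np.array([0.9, 0.1], dtype=np.float32), 2)
print(I.shape, D.shape)  # both are (1, 2), even for a single vector query
```

Row N of `I` holds the dataset positions of the neighbors for query N, and row N of `D` holds the matching distances, which mirrors the position-wise correspondence described in the parameter documentation.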

property search_parameters

Get/set the current search parameters for the index. These parameters modify the non-algorithmic properties of search (affecting queries-per-second).

See also: pysvs.FlatSearchParameters.

Type:

Read/Write (pysvs.FlatSearchParameters)

property size

Return the number of elements in the indexed dataset.