VDMS Configuration File

VDMS uses a JSON configuration file, which can be specified when starting the server with the -cfg flag:

./vdms -cfg config-vdms.json

If no configuration file is specified, VDMS will try to open the default file (config-vdms.json) and will fail to start if that file is not found.
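Because a missing configuration file prevents the server from starting, a launcher script may want to check for the file first. The following is a minimal sketch, not part of VDMS: the helper name `vdms_command`, its `binary` and `exists` parameters, and the idea of pre-checking are all assumptions made for illustration. Only the `./vdms -cfg <file>` invocation and the default file name come from the text above.

```python
import os


def vdms_command(config_path="config-vdms.json", binary="./vdms", exists=os.path.isfile):
    """Build the argument list for starting VDMS with an explicit config file.

    Hypothetical helper: raises early if the config file is missing,
    instead of letting the server fail at start-up.
    """
    if not exists(config_path):
        raise FileNotFoundError(f"VDMS configuration file not found: {config_path}")
    return [binary, "-cfg", config_path]


if __name__ == "__main__":
    try:
        print(" ".join(vdms_command()))
    except FileNotFoundError as err:
        print(err)
```

The returned list can be passed to `subprocess.run` as is. The `exists` parameter is only there so the check can be replaced in tests.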

Parameters

All the parameters in the configuration file are optional, as VDMS has default values for all of them.

Param	Explanation	Default
autodelete_interval_s	Time interval (in seconds) to delete specific entries (should be greater than 0)	-1
autoreplicate_interval	Time interval to back up the DB folder (should be greater than 0)	-1
aws_log_level	Verbosity of the AWS logging system. Only used when the "storage_type" parameter is set to "aws". Accepted values: "off", "fatal", "error", "warn", "info", "debug", "trace"	off
backup_flag	Boolean whether to use the auto-replication thread	false
backup_path	Path to store the backup DB	`db_root_path`
blobs_path	Path to folder where blobs will be stored	blobs (db/blobs)
bucket_name	Bucket name for AWS storage	vdms_bucket
db_root_path	Path to the root folder where all files/objects will be stored	db
descriptors_path	Path to folder where descriptors will be stored	descriptors (db/descriptors)
endpoint_override	Server address (including scheme and port number) to override the S3 storage address server. This parameter is valid when the "storage_type" parameter is set to "aws" value and "use_endpoint" parameter is set to true.	`http://127.0.0.1:9000` when "use_endpoint" parameter is set to true and the "storage_type" parameter is set to "aws"
expiration_time	Time interval (in seconds) to automatically delete entries
flinng_cells_per_row	For the FLINNG indexing for descriptors, controls the number of bits in the distance sensitive LSH vector for each row	1000
flinng_hashes_per_table	For the FLINNG indexing for descriptors, controls the number of hash functions to be used per table for group testing	12
flinng_num_hash_tables	For the FLINNG indexing for descriptors, controls the number of hash tables (permutations) to be used for the dataset for group testing	10
flinng_num_rows	For the FLINNG indexing for descriptors, controls the number of distance sensitive LSH vectors	3
hnsw_efConstruction	For the HNSW indexing for descriptors, controls the breadth of the search during the index construction phase	96
hnsw_efsearch	For the HNSW indexing for descriptors, controls the breadth of the search during the search query	64
hnsw_M	For the HNSW indexing for descriptors, controls the maximum number of neighbors that each descriptor can have at each layer	48
images_path	Path to folder where images (all formats) will be stored	images (db/images)
ivf_nlist	For the IVF FLAT indexing for descriptors, specify the number of partitions to create using the k-means algorithm	16
k8s_container	Boolean whether to use Kubernetes orchestration	false
max_simultaneous_clients	Number of max simultaneous connections open	500
pmgd_num_allocators	Number of allocators when creating a new PMGD graph (only used when creating a new graph, and ignored if the graph already exists)	1
pmgd_path	Path to folder where PMGD graph will be stored	db
port	TCP port for incoming connections	55555
proxy_host	Address of the proxy (optional). Example: "a.proxy.from.intel.com"
proxy_port	Port number of the proxy. This parameter is needed when "proxy_host" parameter is set
proxy_scheme	Scheme used by the proxy. Accepted values: "http" and "https". This parameter is needed when the "proxy_host" and "proxy_port" parameters are set
query_handler	Specifies the query handler to use. Accepted values: `pmgd` (PMGD), or `neo4j` (Neo4j)	pmgd
neo4j_conn_pool_sz	Sets the pool size of neo4j client connections. This parameter is only used when "query_handler" is set to `neo4j`	32
replication_time		-1
storage_type	Database storage type. Accepted values: `local` (local storage), or `aws` (AWS S3)	local
tmp_path	Path to the temporary directory (optional)	`/tmp/tmp`
unit	Unit of the `autoreplicate_interval` variable. Accepted values: `h` (hour), `m` (minute), or `s` (seconds)	s
use_endpoint	Boolean whether to use an AWS storage mocking server (MinIO). This parameter is valid when the "storage_type" parameter is set to "aws" value	false
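To see how the defaults and the conditional parameters in the table interact, here is a small sketch that merges a user configuration with a subset of the documented defaults. It is not VDMS's own parsing or validation logic: the names `DEFAULTS`, `effective_config` and `replication_interval_seconds` are hypothetical, the error handling is an assumption, and treating a non-positive `autoreplicate_interval` as "disabled" is inferred from the "should be greater than 0" note and the default of -1.

```python
# Defaults copied from the parameter table (subset only, for illustration).
DEFAULTS = {
    "port": 55555,
    "max_simultaneous_clients": 500,
    "db_root_path": "db",
    "storage_type": "local",
    "use_endpoint": False,
    "query_handler": "pmgd",
    "autoreplicate_interval": -1,
    "unit": "s",
}

# Seconds per accepted value of the "unit" parameter.
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}

# Accepted values as listed in the table.
ALLOWED = {
    "storage_type": ("local", "aws"),
    "query_handler": ("pmgd", "neo4j"),
    "unit": tuple(UNIT_SECONDS),
}


def effective_config(user_config):
    """Overlay user settings on the defaults and check documented constraints."""
    cfg = {**DEFAULTS, **user_config}
    for key, values in ALLOWED.items():
        if cfg[key] not in values:
            raise ValueError(f"{key} must be one of {values}, got {cfg[key]!r}")
    # endpoint_override only applies to AWS storage with use_endpoint enabled.
    if cfg["storage_type"] == "aws" and cfg["use_endpoint"]:
        cfg.setdefault("endpoint_override", "http://127.0.0.1:9000")
    # The table states proxy_port is needed when proxy_host is set.
    if "proxy_host" in cfg and "proxy_port" not in cfg:
        raise ValueError("proxy_port is needed when proxy_host is set")
    return cfg


def replication_interval_seconds(cfg):
    """Convert autoreplicate_interval to seconds; None if not positive (assumed disabled)."""
    interval = cfg["autoreplicate_interval"]
    if interval <= 0:
        return None
    return interval * UNIT_SECONDS[cfg["unit"]]


if __name__ == "__main__":
    cfg = effective_config({"autoreplicate_interval": 2, "unit": "h"})
    print(cfg["port"], replication_interval_seconds(cfg))
```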

Config File Example

// VDMS Config File
// This is the run-time config file
// Sets database paths and other parameters
{
    "port": 55555,
    "cert_file": "cert.pem",
    "key_file": "key.pem",
    "ca_file": "ca.pem",
    "autoreplicate_interval":-1, // it should be > 0
    "unit":"s",
    "max_simultaneous_clients": 100,
    // "backup_path":"backups_test", // set this if you want a different path to store the backup file
    "db_root_path": "db",
    "backup_flag" : "false",
    "storage_type": "local", //local, aws, etc
    "bucket_name": "vdms_bucket",
    "more-info": "github.com/IntelLabs/vdms",
    "use_endpoint": false, // storage_type is set to local
    "endpoint_override": "http://127.0.0.1:9000",// Format "scheme://ip:port"
    "k8s_container": false,
    "proxy_host": "a.proxy.from.intel.com",
    "proxy_port": 912,
    "proxy_scheme": "http", // [http|https] valid values,
    "aws_log_level": "debug", // [off|fatal|error|warn|info|debug|trace]
    "tmp_path": "/tmp/tmp"
}
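The example above uses `//` comments, which strict JSON parsers reject; Python's standard `json` module is one of them. If external tooling needs to read such a file, one option is to strip the comments first. The sketch below does that while leaving string contents intact, so that values such as `"http://127.0.0.1:9000"` are not truncated. The function names are hypothetical, and only `//` line comments are handled (not `/* ... */` blocks).

```python
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False        # closing quote
            i += 1
        elif ch == '"':
            in_string = True             # opening quote
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip the comment up to (not including) the end of the line.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_commented_json(text):
    """Parse JSON text that may contain // line comments."""
    return json.loads(strip_line_comments(text))


SAMPLE = """
// VDMS Config File
{
    "port": 55555,
    "storage_type": "local", //local, aws, etc
    "endpoint_override": "http://127.0.0.1:9000",// Format "scheme://ip:port"
    "tmp_path": "/tmp/tmp"
}
"""

if __name__ == "__main__":
    config = load_commented_json(SAMPLE)
    print(config["port"], config["endpoint_override"])
```

Tracking whether the scanner is inside a string is what keeps the `//` in the endpoint URL from being mistaken for a comment.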

Default Directory Structure

By default, VDMS will create a directory structure as follows:

db
├── blobs
├── descriptors
├── graph
│   ├── allocator.jdb
│   ├── edges.jdb
│   ├── graph.jdb
│   ├── indexmanager.jdb
│   ├── journal.jdb
│   ├── nodes.jdb
│   ├── stringtable.jdb
│   └── transaction.jdb
└── images
    ├── jpg
    ├── png
    └── tdb
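For scripts that need to locate these folders (for example, to back them up or check disk usage), the layout can be derived from `db_root_path`. This is a sketch under assumptions: the helper `default_layout` is hypothetical, the folder names are taken from the tree above, and it assumes that changing `db_root_path` relocates all subfolders and that none of the individual path parameters (`blobs_path`, `images_path`, and so on) is overridden.

```python
from pathlib import PurePosixPath


def default_layout(db_root_path="db"):
    """Return the directories from the default tree, relative to db_root_path."""
    root = PurePosixPath(db_root_path)
    dirs = [
        root / "blobs",
        root / "descriptors",
        root / "graph",      # holds the PMGD *.jdb files shown in the tree
        root / "images",
    ]
    # Image sub-folders, one per storage format listed in the tree.
    dirs += [root / "images" / fmt for fmt in ("jpg", "png", "tdb")]
    return [str(d) for d in dirs]


if __name__ == "__main__":
    for path in default_layout():
        print(path)
```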