ArangoDB v3.13 is under development and not released yet. This documentation is not final and potentially incomplete.
HTTP interface for vector indexes
Introduced in: v3.12.4
Create a vector index
Creates a vector index for the collection collection-name, if it does not already exist.
params* object
The parameters as used by the Faiss library.
defaultNProbe integer (default: 1)
How many neighboring centroids to consider for the search results by default. The larger the number, the slower the search but the better the search results. The default is 1. You should generally use a higher value here or per query via the nProbe option of the vector similarity functions.
factory string
You can specify an index factory string that is forwarded to the underlying Faiss library, allowing you to combine different advanced options. Examples:
"IVF100_HNSW10,Flat"
"IVF100,SQ4"
"IVF10_HNSW5,Flat"
"IVF100_HNSW5,PQ256x16"
The base index must be an inverted file (IVF) to work with ArangoDB. If you don’t specify an index factory, the value is equivalent to IVF<nLists>,Flat. For more information on how to create these custom indexes, see the Faiss Wiki.
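The default described above can be sketched as a small client-side helper. This is only an illustration: the function name effective_factory is hypothetical, and the IVF check is a rough sanity test made up for this example, not the validation that ArangoDB or Faiss performs.

```python
from typing import Optional


def effective_factory(n_lists: int, factory: Optional[str] = None) -> str:
    """Return the factory string that would apply for the given settings.

    Hypothetical helper: mirrors the documented default of
    "IVF<nLists>,Flat" when no factory string is supplied.
    """
    if factory is None:
        return f"IVF{n_lists},Flat"
    # Rough client-side check (an assumption for this sketch): at least one
    # comma-separated component should name an inverted file (IVF) index.
    if not any(part.startswith("IVF") for part in factory.split(",")):
        raise ValueError(f"factory {factory!r} does not contain an IVF component")
    return factory


print(effective_factory(100))                        # IVF100,Flat
print(effective_factory(100, "IVF100_HNSW10,Flat"))  # IVF100_HNSW10,Flat
```

Resolving the default on the client is not required, since the server applies it when factory is omitted. It can still help when logging or comparing index definitions.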
nLists* integer
The number of Voronoi cells to partition the vector space into, which is also the number of centroids in the index. What value to choose depends on the data distribution and chosen metric. According to The Faiss library paper, it should be around 15 * sqrt(N), where N is the number of documents in the collection or, for cluster deployments, the number of documents in the shard. A bigger value produces more accurate results but increases the training time and thus how long it takes to build the index. It cannot be bigger than the number of documents.
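As a rough illustration of the sizing rule for nLists, the following sketch computes 15 * sqrt(N) and caps the result at the number of documents. The helper names suggested_n_lists and vector_index_params are hypothetical, and the resulting dictionary covers only the attributes described in this section. Other attributes that the endpoint may require are omitted.

```python
import json
import math


def suggested_n_lists(doc_count: int) -> int:
    """Estimate nLists as 15 * sqrt(N), capped at the document count."""
    if doc_count < 1:
        raise ValueError("doc_count must be at least 1")
    estimate = round(15 * math.sqrt(doc_count))
    # nLists cannot be bigger than the number of documents.
    return max(1, min(estimate, doc_count))


def vector_index_params(doc_count: int, default_n_probe: int = 1) -> dict:
    """Build a partial params object (hypothetical helper for this sketch)."""
    return {
        "nLists": suggested_n_lists(doc_count),
        "defaultNProbe": default_n_probe,
    }


print(json.dumps(vector_index_params(1_000_000, default_n_probe=10)))
# {"nLists": 15000, "defaultNProbe": 10}
```

Note that for fewer than 225 documents, 15 * sqrt(N) exceeds N, so the cap applies and the sketch returns the document count itself. For cluster deployments, pass the per-shard document count rather than the collection total.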