ArangoDB v3.13 is under development and not released yet. This documentation is not final and potentially incomplete.

HTTP interface for vector indexes

Introduced in: v3.12.4

Create a vector index

POST /_db/{database-name}/_api/index

Creates a vector index for the collection specified by the collection query parameter, if such an index does not already exist.

Path Parameters

database-name* string
The name of the database.

Query Parameters

collection* string
The collection name.

Request Body application/json object

fields* array of strings
A list with exactly one attribute path to specify where the vector embedding is stored in each document. The vector data needs to be populated before creating the index.
If you want to index another vector embedding attribute, you need to create a separate vector index.
inBackground boolean (default: false)
Set this option to true to keep the collection/shards available for write operations by not using an exclusive write lock for the duration of the index creation.
name string
A user-defined name for the index for easier identification. If not specified, a name is automatically generated.
parallelism integer (default: 2)
The number of threads to use for indexing.
params* object
The parameters as used by the Faiss library.
- defaultNProbe integer (default: 1)
  How many neighboring centroids to consider for the search results by default. The larger the number, the slower the search but the better the search results. You should generally use a higher value than the default of 1, either here or per query via the nProbe option of the vector similarity functions.
- dimension* integer
  The vector dimension. The attribute to index needs to have this many elements in the array that stores the vector embedding.
- factory string
  You can specify an index factory string that is forwarded to the underlying Faiss library, allowing you to combine different advanced options. Examples:
  "IVF100_HNSW10,Flat"
  "IVF100,SQ4"
  "IVF10_HNSW5,Flat"
  "IVF100_HNSW5,PQ256x16"
  The base index must be an inverted file (IVF) to work with ArangoDB. If you don’t specify an index factory, the value is equivalent to IVF<nLists>,Flat. For more information on how to create these custom indexes, see the Faiss Wiki.
- metric* string
  Possible values: "cosine", "l2"
  Whether to use cosine or l2 (Euclidean) distance calculation.
- nLists* integer
  The number of Voronoi cells to partition the vector space into, which is also the number of centroids in the index. What value to choose depends on the data distribution and the chosen metric. According to the Faiss library paper, it should be around 15 * sqrt(N), where N is the number of documents in the collection or, for cluster deployments, the number of documents in the shard. A bigger value produces more accurate results but increases the training time and thus how long it takes to build the index. It cannot be bigger than the number of documents.
- trainingIterations integer (default: 25)
  The number of iterations in the training process. Smaller values lead to a faster index creation but may yield worse search results.
type* string
The index type. Needs to be "vector".
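
As an illustration of how the request body attributes fit together, the following Python sketch assembles a request for this endpoint using only the standard library. The server address, database, collection, attribute name, dimension, and document count in the usage lines are hypothetical placeholders, the defaultNProbe value of 10 is an arbitrary illustrative choice, and authentication is left out. The helper names suggest_n_lists and build_vector_index_request are made up for this example and are not part of any ArangoDB driver.

```python
import json
import math
import urllib.parse
import urllib.request


def suggest_n_lists(doc_count: int) -> int:
    """Apply the 15 * sqrt(N) rule of thumb, capped at the document count."""
    if doc_count < 1:
        raise ValueError("the collection needs at least one document")
    return max(1, min(doc_count, round(15 * math.sqrt(doc_count))))


def build_vector_index_request(base_url, database, collection, field,
                               dimension, doc_count, metric="cosine"):
    """Prepare (but do not send) a POST request that creates a vector index."""
    body = {
        "type": "vector",          # required, must be "vector"
        "fields": [field],         # exactly one attribute path
        "inBackground": True,      # keep the collection writable during the build
        "params": {
            "metric": metric,      # "cosine" or "l2"
            "dimension": dimension,
            "nLists": suggest_n_lists(doc_count),
            "defaultNProbe": 10,   # illustrative value, higher than the default of 1
        },
    }
    url = (
        f"{base_url}/_db/{urllib.parse.quote(database, safe='')}/_api/index?"
        + urllib.parse.urlencode({"collection": collection})
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical usage: 10,000 documents with 128-dimensional embeddings.
request = build_vector_index_request(
    "http://localhost:8529", "_system", "items", "embedding", 128, 10000
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

The request object can then be sent with urllib.request.urlopen() or translated to any other HTTP client. Capping nLists at the document count reflects the constraint that it cannot be bigger than the number of documents.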

Responses

200 OK
The index exists already.
201 Created
The index has been created because no such index existed yet.
404 Not Found
The collection is unknown.
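
A client might map these status codes to outcomes roughly as follows. This is a minimal sketch using Python's standard library. The names OUTCOMES, describe_status, and send_and_describe are hypothetical, and only the three status codes listed above are covered. Other codes, such as those for authentication failures or invalid request bodies, are reported as undocumented here.

```python
import urllib.error
import urllib.request

# Status codes and meanings as listed in the Responses section.
OUTCOMES = {
    200: "index already exists",
    201: "index created",
    404: "collection unknown",
}


def describe_status(status: int) -> str:
    """Map an HTTP status code to the documented outcome."""
    return OUTCOMES.get(status, f"undocumented status {status}")


def send_and_describe(request: urllib.request.Request) -> str:
    """Send a prepared request and describe the result."""
    try:
        with urllib.request.urlopen(request) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses such as 404.
        return describe_status(error.code)
```

Because urllib raises an exception for error responses, the 404 case is handled in the except branch rather than alongside the 200 and 201 cases.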