ArangoDB v3.10 reached End of Life (EOL) and is no longer supported.
This documentation is outdated. Please see the most recent stable version.
HTTP interface for collections
The HTTP API for collections lets you create and delete collections, get information about collections, and modify certain properties of existing collections.
Addresses of collections
All collections in ArangoDB have a unique identifier and a unique name. To access a collection, use the collection name to refer to it:
http://server:port/_api/collection/<collection-name>
For example, assume that the collection identifier is 7254820 and the collection name is demo, then the URL of that collection is:
http://localhost:8529/_api/collection/demo
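The addressing scheme above can be sketched as a small helper. This is a minimal illustration, not part of any ArangoDB driver: the function name collection_url and its suffix parameter are hypothetical, and the default base address is simply the one used in the examples on this page.

```python
from urllib.parse import quote


def collection_url(name, base="http://localhost:8529", suffix=""):
    """Build the URL of a collection, optionally with a sub-resource such as 'count'."""
    # Percent-encode the name so that unusual characters cannot break the path.
    url = f"{base}/_api/collection/{quote(name, safe='')}"
    if suffix:
        url += f"/{suffix}"
    return url


print(collection_url("demo"))
print(collection_url("products", suffix="properties"))
```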
Get information about collections
List all collections
Returns an object with a result attribute containing an array with the descriptions of all collections in the current database.
By providing the optional excludeSystem query parameter with a value of true, all system collections are excluded from the response.
Examples
Return information about all collections:
curl --header 'accept: application/json' --dump - http://localhost:8529/_api/collection
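As a rough sketch of how a client might use this endpoint, the following builds the request URL with the optional excludeSystem parameter and pulls the collection names out of a response. The helper names and the sample response are hypothetical, and the code assumes that each description in result carries a name attribute.

```python
from urllib.parse import urlencode


def list_collections_url(base="http://localhost:8529", exclude_system=False):
    """Build the URL for listing collections, optionally hiding system collections."""
    url = f"{base}/_api/collection"
    if exclude_system:
        url += "?" + urlencode({"excludeSystem": "true"})
    return url


def collection_names(response):
    """Return the names found in the 'result' array of a list response."""
    return [entry["name"] for entry in response["result"]]


# Hypothetical, heavily shortened response body for illustration only.
sample = {"result": [{"name": "products"}, {"name": "demo"}]}
print(list_collections_url(exclude_system=True))
print(collection_names(sample))
```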
Get the collection information
The result is an object describing the collection with the following attributes:
id: The identifier of the collection.
name: The name of the collection.
status: The status of the collection as number.
- 3: loaded
- 5: deleted
Every other status indicates a corrupted collection.
type: The type of the collection as number.
- 2: document collection (normal case)
- 3: edge collection
isSystem: If true then the collection is a system collection.
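The numeric status and type codes listed above could be translated on the client side roughly as follows. This is only a sketch: describe_collection and the sample object are hypothetical, and the mapping covers just the codes named in this section.

```python
STATUS_NAMES = {3: "loaded", 5: "deleted"}
TYPE_NAMES = {2: "document", 3: "edge"}


def describe_collection(info):
    """Turn a collection description into a short human-readable summary."""
    # Per the list above, any status other than 3 or 5 indicates corruption.
    status = STATUS_NAMES.get(info["status"], "corrupted")
    kind = TYPE_NAMES.get(info["type"], "unknown")
    scope = "system" if info["isSystem"] else "user"
    return f"{info['name']} ({info['id']}): {status} {kind} collection, {scope}"


# Hypothetical response object using the identifier and name from the example above.
sample = {"id": "7254820", "name": "demo", "status": 3, "type": 2, "isSystem": False}
print(describe_collection(sample))
```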
Get the properties of a collection
Returns all properties of the specified collection.
200 OK
keyOptions* object
An object which contains key generation options.
allowUserKeys* boolean
If set to true, then you are allowed to supply your own key values in the _key attribute of a document. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply your own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices (Enterprise Edition only). (cluster only)
writeConcern integer
Determines how many copies of each shard are required to be in-sync on the different DB-Servers. If there are fewer than this many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database's default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Response Body application/json object
Examples
Using an identifier:
curl --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/67601/properties
Using a name:
curl --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/properties
Get the document count of a collection
Get the number of documents in a collection.
count: The number of documents stored in the specified collection.
Examples
Requesting the number of documents:
curl --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/count
Get the collection statistics
In addition to the above, the result also contains the number of documents and additional statistical information about the collection.
details boolean
Setting details to true returns extended storage engine-specific details in the figures. The details are intended for debugging ArangoDB itself and their format is subject to change. By default, details is set to false, so no details are returned and the behavior is identical to previous versions of ArangoDB. Please note that requesting details may cause additional load and thus have an impact on performance.
Examples
Using an identifier and requesting the figures of the collection:
curl --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/figures
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/figures?details=true'
Get the responsible shard for a document
Returns the ID of the shard that is responsible for the given document (if the document exists) or that would be responsible if such document existed.
The request body must contain a JSON document with at least the collection's shard key attributes set to some values.
The response is a JSON object with a shardId attribute, which contains the ID of the responsible shard.
Examples
curl -X PUT --header 'accept: application/json' --data-binary @- --dump - 'http://localhost:8529/_api/collection/testCollection/responsibleShard' <<'EOF'
{
"_key": "testkey",
"value": 23
}
EOF
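The same request can be prepared from Python with the standard library. The sketch below only builds the request object so that it can be inspected without a running server; responsible_shard_request is a hypothetical helper name, and the Content-Type header is an assumption that does not appear in the curl example.

```python
import json
from urllib.request import Request


def responsible_shard_request(collection, document, base="http://localhost:8529"):
    """Prepare (but do not send) a PUT request asking which shard owns a document."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        f"{base}/_api/collection/{collection}/responsibleShard",
        data=body,
        method="PUT",
        headers={"Accept": "application/json", "Content-Type": "application/json"},
    )


req = responsible_shard_request("testCollection", {"_key": "testkey", "value": 23})
print(req.get_method(), req.full_url)
# To send it to a running server, pass req to urllib.request.urlopen().
```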
Get the shard IDs of a collection
By default, returns a JSON array with the shard IDs of the collection.
If the details parameter is set to true, it returns a JSON object with the shard IDs as object attribute keys, and the responsible servers for each shard mapped to them. In the detailed response, the leader shards are first in the arrays.
Examples
Retrieves the list of shards:
curl --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/testCollection/shards
Retrieves the list of shards with the responsible servers:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/testCollection/shards?details=true'
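Because the leader comes first in each array of the detailed response, a client can split leaders from followers with very little code. The following is a sketch under assumptions: the shard and server IDs are made up, and the function expects only the mapping of shard IDs to server lists, since this section does not state where that mapping sits inside the full response body.

```python
def shard_leaders(detailed_shards):
    """Map each shard ID to its leader, which is listed first."""
    return {shard: servers[0] for shard, servers in detailed_shards.items() if servers}


def shard_followers(detailed_shards):
    """Map each shard ID to the remaining servers after the leader."""
    return {shard: servers[1:] for shard, servers in detailed_shards.items()}


# Hypothetical shard and server IDs for illustration only.
sample = {
    "s100001": ["PRMR-aaaa", "PRMR-bbbb"],
    "s100002": ["PRMR-bbbb", "PRMR-aaaa"],
}
print(shard_leaders(sample))
print(shard_followers(sample))
```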
Get the collection revision ID
The response will contain the collection’s latest used revision id. The revision id is a server-generated string that clients can use to check whether data in a collection has changed since the last revision check.
revision: The collection revision id as a string.
Examples
Retrieving the revision of a collection:
curl --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/revision
Get the collection checksum
Calculates a checksum of the meta-data (keys and optionally revision ids) and optionally the document data in the collection.
The checksum can be used to compare if two collections on different ArangoDB instances contain the same contents. The current revision of the collection is returned too so one can make sure the checksums are calculated for the same state of data.
By default, the checksum is only calculated on the _key system attribute of the documents contained in the collection. For edge collections, the system attributes _from and _to are also included in the calculation.
By setting the optional query parameter withRevisions to true, revision ids (_rev system attributes) are included in the checksumming.
By providing the optional query parameter withData with a value of true, the user-defined document attributes are included in the calculation too.
The response is a JSON object with the following attributes:
checksum: The calculated checksum as a number.
revision: The collection revision id as a string.
Examples
Retrieving the checksum of a collection:
curl --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/checksum
Retrieving the checksum of a collection including the collection data, but not the revisions:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/checksum?withRevisions=false&withData=true'
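A comparison of two collections could be scripted roughly as shown here. The helper names are hypothetical and the two response objects are invented sample data; the sketch only builds the request URL and compares the checksum attributes of responses that were fetched elsewhere.

```python
from urllib.parse import urlencode


def checksum_url(collection, with_revisions=False, with_data=False,
                 base="http://localhost:8529"):
    """Build the checksum URL with both optional query parameters spelled out."""
    query = urlencode({
        "withRevisions": str(with_revisions).lower(),
        "withData": str(with_data).lower(),
    })
    return f"{base}/_api/collection/{collection}/checksum?{query}"


def same_contents(left, right):
    """Compare the checksum attributes of two checksum responses."""
    return left["checksum"] == right["checksum"]


# Hypothetical responses from two different instances.
first = {"checksum": 7325056632786944834, "revision": "_example1"}
second = {"checksum": 7325056632786944834, "revision": "_example2"}
print(checksum_url("products", with_data=True))
print(same_contents(first, second))
```

Building the query string with urlencode also avoids the shell quoting issues that an unquoted & in a URL can cause on the command line.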
Create and delete collections
Create a collection
Creates a new collection with a given name. The request must contain an object with the following attributes.
cacheEnabled boolean
Whether the in-memory hash cache for documents should be enabled for this collection (default: false). Can be controlled globally with the --cache.size startup option. The cache can speed up repeated reads of the same documents via their document keys. If the same documents are not fetched often or are modified frequently, then you may disable the cache to avoid the maintenance costs.
computedValues array of objects
An optional list of objects, each representing a computed value.
expression* string
An AQL RETURN operation with an expression that computes the desired value. See Computed Value Expressions for details.
distributeShardsLike string
The name of another collection. If this property is set in a cluster, the collection copies the replicationFactor, numberOfShards, shardingStrategy, and writeConcern properties from the specified collection (referred to as the prototype collection) and distributes the shards of this collection in the same way as the shards of the other collection. In an Enterprise Edition cluster, this data co-location is utilized to optimize queries.
You need to use the same number of shardKeys as the prototype collection, but you can use different attributes.
The default is "".
Using this parameter has consequences for the prototype collection. It can no longer be dropped before the sharding-imitating collections are dropped. Equally, backups and restores of imitating collections alone generate warnings (which can be overridden) about a missing sharding prototype.
isSystem boolean
If true, create a system collection. In this case, the collection-name should start with an underscore. End-users should normally create non-system collections only. API implementors may be required to create system collections in very special occasions, but normally a regular collection will do. (The default is false)
keyOptions object
Additional options for key generation. If specified, then keyOptions should be a JSON object containing the following attributes:
allowUserKeys* boolean
If set to true, then you are allowed to supply your own key values in the _key attribute of documents. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply your own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
type* string
Specifies the type of the key generator. The currently available generators are traditional, autoincrement, uuid and padded.
The traditional key generator generates numerical keys in ascending order. The sequence of keys is not guaranteed to be gap-free.
The autoincrement key generator generates numerical keys in ascending order; the initial offset and the spacing can be configured (note: autoincrement is currently only supported for non-sharded collections). The sequence of generated keys is not guaranteed to be gap-free, because a new key is generated on every document insert attempt, not just for successful inserts.
The padded key generator generates keys of a fixed length (16 bytes) in ascending lexicographical sort order. This is ideal for the RocksDB storage engine, which slightly benefits from keys being inserted in lexicographically ascending order. The key generator can be used in a single-server or cluster. The sequence of generated keys is not guaranteed to be gap-free.
The uuid key generator generates universally unique 128 bit keys, which are stored in hexadecimal human-readable format. This key generator can be used in a single-server or cluster to generate "seemingly random" keys. The keys produced by this key generator are not lexicographically sorted.
Please note that keys are only guaranteed to be truly ascending in single server deployments and for collections that only have a single shard (that includes collections in a OneShard database). The reason is that for collections with more than a single shard, document keys are generated on Coordinator(s). For collections with a single shard, the document keys are generated on the leader DB-Server, which has full control over the key sequence.
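To see why fixed-length keys matter for lexicographical ordering, consider the following illustration. It is not the server's actual padded key generator: the zero-padded hexadecimal format and the padded helper are assumptions chosen only to show that fixed-width keys sort as strings in the same order as the numbers they encode, while variable-length numeric strings do not.

```python
def padded(value, width=16):
    """Render a number as a fixed-width, zero-padded hexadecimal string."""
    return format(value, f"0{width}x")


values = [9, 10, 255, 4096]            # ascending numeric sequence
unpadded = [str(v) for v in values]    # variable-length decimal strings
fixed = [padded(v) for v in values]    # fixed-length strings of 16 characters

# String sorting reorders the variable-length keys ...
print(sorted(unpadded))
# ... but leaves the fixed-length keys in their original ascending order.
print(sorted(fixed) == fixed)
```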
replicationFactor integer
(The default is 1): in a cluster, this attribute determines how many copies of each shard are kept on different DB-Servers. The value 1 means that only one copy (no synchronous replication) is kept. A value of k means that k-1 replicas are kept. For SatelliteCollections, it needs to be the string "satellite", which matches the replication factor to the number of DB-Servers (Enterprise Edition only).
Any two copies reside on different DB-Servers. Replication between them is synchronous, that is, every write operation to the "leader" copy is replicated to all "follower" replicas before the write operation is reported successful.
If a server fails, this is detected automatically and one of the servers holding copies takes over, usually without an error being reported.
schema object
Optional object that specifies the collection level schema for documents. The attribute keys rule, level and message must follow the rules documented in Document Schema Validation.
shardKeys string
(The default is [ "_key" ]): in a cluster, this attribute determines which document attributes are used to determine the target shard for documents. Documents are sent to shards based on the values of their shard key attributes. The values of all shard key attributes in a document are hashed, and the hash value is used to determine the target shard.
Values of shard key attributes cannot be changed once set.
shardingStrategy string
This attribute specifies the name of the sharding strategy to use for the collection. There are different sharding strategies to select from when creating a new collection. The selected shardingStrategy value remains fixed for the collection and cannot be changed afterwards. This is important to make the collection keep its sharding settings and always find documents already distributed to shards using the same initial sharding algorithm.
The available sharding strategies are:
community-compat: default sharding used by ArangoDB Community Edition before version 3.4
enterprise-compat: default sharding used by ArangoDB Enterprise Edition before version 3.4
enterprise-smart-edge-compat: default sharding used by smart edge collections in ArangoDB Enterprise Edition before version 3.4
hash: default sharding used for new collections starting from version 3.4 (excluding smart edge collections)
enterprise-hash-smart-edge: default sharding used for new smart edge collections starting from version 3.4
enterprise-hex-smart-vertex: sharding used for vertex collections of EnterpriseGraphs
If no sharding strategy is specified, the default is hash for all normal collections, enterprise-hash-smart-edge for all smart edge collections, and enterprise-hex-smart-vertex for EnterpriseGraph vertex collections (the latter two require the Enterprise Edition of ArangoDB). Manually overriding the sharding strategy does not yet provide a benefit, but it may later in case other sharding strategies are added.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices.
This feature can only be used in the Enterprise Edition.
smartJoinAttribute string
In an Enterprise Edition cluster, this attribute determines an attribute of the collection that must contain the shard key value of the referred-to SmartJoin collection. Additionally, the shard key for a document in this collection must contain the value of this attribute, followed by a colon, followed by the actual primary key of the document.
This feature can only be used in the Enterprise Edition and requires the distributeShardsLike attribute of the collection to be set to the name of another collection. It also requires the shardKeys attribute of the collection to be set to a single shard key attribute, with an additional ':' at the end. A further restriction is that whenever documents are stored or updated in the collection, the value stored in the smartJoinAttribute must be a string.
writeConcern integer
Write concern for this collection (default: 1). It determines how many copies of each shard are required to be in sync on the different DB-Servers. If there are fewer than this many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database's default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
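The relationship between replicationFactor and writeConcern can be modeled with a few lines of client-side code. This is a simplified sketch, not server logic: both function names are hypothetical, and it assumes plain integer values, so the "satellite" replication factor is out of scope.

```python
def check_write_concern(replication_factor, write_concern):
    """Reject settings where writeConcern exceeds replicationFactor."""
    if write_concern > replication_factor:
        raise ValueError(
            f"writeConcern ({write_concern}) cannot be greater than "
            f"replicationFactor ({replication_factor})"
        )


def shard_accepts_writes(in_sync_copies, write_concern):
    """Model the rule that a shard refuses writes with too few in-sync copies."""
    return in_sync_copies >= write_concern


check_write_concern(replication_factor=3, write_concern=2)          # passes silently
print(shard_accepts_writes(in_sync_copies=2, write_concern=2))
print(shard_accepts_writes(in_sync_copies=1, write_concern=2))
```

Checking the values before sending a create request gives a clearer error message than waiting for the server to reject the combination.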
200 OK
keyOptions* object
An object which contains key generation options.
allowUserKeys* boolean
If set to true, then you are allowed to supply your own key values in the _key attribute of a document. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply your own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices (Enterprise Edition only). (cluster only)
writeConcern integer
Determines how many copies of each shard are required to be in-sync on the different DB-Servers. If there are fewer than this many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database's default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Response Body application/json object
Examples
curl -X POST --header 'accept: application/json' --data-binary @- --dump - http://localhost:8529/_api/collection <<'EOF'
{
  "name": "testCollectionBasics"
}
EOF
curl -X POST --header 'accept: application/json' --data-binary @- --dump - http://localhost:8529/_api/collection <<'EOF'
{
  "name": "testCollectionEdges",
  "type": 3
}
EOF
curl -X POST --header 'accept: application/json' --data-binary @- --dump - http://localhost:8529/_api/collection <<'EOF'
{
  "name": "testCollectionUsers",
  "keyOptions": {
    "type": "autoincrement",
    "increment": 5,
    "allowUserKeys": true
  }
}
EOF
Drop a collection
Drops the collection identified by collection-name.
If the collection was successfully dropped, an object is returned with the following attributes:
error: false
id: The identifier of the dropped collection.
Examples
Using an identifier:
curl -X DELETE --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/67885
Using a name:
curl -X DELETE --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products1
Dropping a system collection:
curl -X DELETE --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/_example?isSystem=true'
Truncate a collection
Removes all documents from the collection, but leaves the indexes intact.
Examples
curl -X PUT --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/truncate
Modify collections
Load a collection
Since ArangoDB version 3.9.0 this API does nothing. Previously it used to load a collection into memory.
The request body object might optionally contain the following attribute:
count: If set, this controls whether the return value should include the number of documents in the collection. Setting count to false may speed up loading a collection. The default value for count is true.
A call to this API returns an object with the following attributes for compatibility reasons:
id: The identifier of the collection.
name: The name of the collection.
count: The number of documents inside the collection. This is only returned if the count input parameter is set to true or has not been specified.
status: The status of the collection as number.
type: The collection type. Valid types are:
- 2: document collection
- 3: edge collection
isSystem: If true then the collection is a system collection.
Examples
curl -X PUT --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/load
Unload a collection
Since ArangoDB version 3.9.0 this API does nothing. Previously it used to unload a collection from memory, while preserving all documents. When calling the API an object with the following attributes is returned for compatibility reasons:
id: The identifier of the collection.
name: The name of the collection.
status: The status of the collection as number.
type: The collection type. Valid types are:
- 2: document collection
- 3: edge collection
isSystem: If true then the collection is a system collection.
Examples
curl -X PUT --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/unload
Load collection indexes into memory
You can call this endpoint to try to cache this collection’s index entries in the main memory. Index lookups served from the memory cache can be much faster than lookups not stored in the cache, resulting in a performance boost.
The endpoint iterates over suitable indexes of the collection and stores the indexed values (not the entire document data) in memory. This is implemented for edge indexes only.
The endpoint returns as soon as the index warmup has been scheduled. The index warmup may still be ongoing in the background, even after the return value has already been sent. As all suitable indexes are scanned, it may cause significant I/O activity and background load.
This feature honors memory limits. If the indexes you want to load are smaller than your memory limit, this feature guarantees that most index values are cached. If the index is greater than your memory limit, this feature fills up values up to this limit. You cannot control which indexes of the collection should have priority over others.
It is guaranteed that the in-memory cache data is consistent with the stored index data at all times.
On success, this endpoint returns an object with attribute result set to true.
Examples
curl -X PUT --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/products/loadIndexesIntoMemory
Change the properties of a collection
Changes the properties of a collection. Only the provided attributes are updated. Collection properties cannot be changed once a collection is created except for the listed properties, as well as the collection name via the rename endpoint (but not in clusters).
cacheEnabled boolean
Whether the in-memory hash cache for documents should be enabled for this collection (default: false). Can be controlled globally with the --cache.size startup option. The cache can speed up repeated reads of the same documents via their document keys. If the same documents are not fetched often or are modified frequently, then you may disable the cache to avoid the maintenance costs.
computedValues array of objects
An optional list of objects, each representing a computed value.
expression* string
An AQL RETURN operation with an expression that computes the desired value. See Computed Value Expressions for details.
replicationFactor integer
(The default is 1): in a cluster, this attribute determines how many copies of each shard are kept on different DB-Servers. The value 1 means that only one copy (no synchronous replication) is kept. A value of k means that k-1 replicas are kept. For SatelliteCollections, it needs to be the string "satellite", which matches the replication factor to the number of DB-Servers (Enterprise Edition only).
Any two copies reside on different DB-Servers. Replication between them is synchronous, that is, every write operation to the "leader" copy is replicated to all "follower" replicas before the write operation is reported successful.
If a server fails, this is detected automatically and one of the servers holding copies takes over, usually without an error being reported.
schema object
Optional object that specifies the collection level schema for documents. The attribute keys rule, level and message must follow the rules documented in Document Schema Validation.
writeConcern integer
Write concern for this collection (default: 1). It determines how many copies of each shard are required to be in sync on the different DB-Servers. If there are fewer than this many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database's default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Examples
curl -X PUT --header 'accept: application/json' --data-binary @- --dump - http://localhost:8529/_api/collection/products/properties <<'EOF'
{
  "waitForSync": true
}
EOF
Rename a collection
Renames a collection. Expects an object with the following attribute:
name: The new name.
It returns an object with the following attributes:
id: The identifier of the collection.
name: The new name of the collection.
status: The status of the collection as number.
type: The collection type. Valid types are:
- 2: document collection
- 3: edge collection
isSystem: If true then the collection is a system collection.
If renaming the collection succeeds, then the collection is also renamed in all graph definitions inside the _graphs collection in the current database.
Examples
curl -X PUT --header 'accept: application/json' --data-binary @- --dump - http://localhost:8529/_api/collection/products1/rename <<'EOF'
{
  "name": "newname"
}
EOF
Recalculate the document count of a collection
Recalculates the document count of a collection, if it ever becomes inconsistent.
It returns an object with the following attribute:
result: true if recalculating the document count succeeded.
Compact a collection
Compacts the data of a collection in order to reclaim disk space. The operation compacts the document and index data by rewriting the underlying .sst files and only keeping the relevant entries.
Under normal circumstances, running a compact operation is not necessary, as the collection data eventually gets compacted anyway. However, in some situations, e.g. after running lots of update/replace or remove operations, the disk data for a collection may contain a lot of outdated data whose space can be reclaimed. In this case, the compaction operation can be used.
Examples
curl -X PUT --header 'accept: application/json' --dump - http://localhost:8529/_api/collection/testCollection/compact