HTTP interface for collections
The HTTP API for collections lets you create and delete collections, get information about collections, and modify certain properties of existing collections.
Addresses of collections
All collections in ArangoDB have a unique identifier and a unique name. To access a collection, use the collection name to refer to it:
http://server:port/_api/collection/<collection-name>
For example, assume that the collection identifier is 7254820 and the collection name is demo, then the URL of that collection is:
http://localhost:8529/_api/collection/demo
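The addressing scheme above can be sketched as a small helper. This is a minimal illustration, not part of the ArangoDB API: the function name collection_url, the suffix parameter, and the default endpoint are assumptions chosen for this example.

```python
from urllib.parse import quote

# Assumption: a local server on the default port used in the examples.
DEFAULT_ENDPOINT = "http://localhost:8529"


def collection_url(name, suffix="", base=DEFAULT_ENDPOINT):
    """Build the URL of a collection, optionally with a sub-resource.

    `name` is percent-encoded so that unusual characters cannot break the path.
    `suffix` is a sub-resource such as "properties" or "count".
    """
    url = f"{base.rstrip('/')}/_api/collection/{quote(name, safe='')}"
    return f"{url}/{suffix}" if suffix else url


print(collection_url("demo"))
print(collection_url("demo", suffix="properties"))
```

The same helper covers the sub-resources used later in this section (properties, count, figures, and so on) by passing them as suffix.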
Get information about collections
List all collections
Examples
Return information about all collections:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection'
Get the collection information
Get the properties of a collection
200 OK
All the collection properties.
keyOptions* object
An object which contains key generation options.
allowUserKeys* boolean
If set to true, then you are allowed to supply own key values in the _key attribute of a document. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices (Enterprise Edition only). (cluster only)
writeConcern integer
Determines how many copies of each shard are required to be in-sync on the different DB-Servers. If there are less than these many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the default writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database’s default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Response Body application/json object
Examples
Using an identifier:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/68452/properties'
Using a name:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/properties'
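The writeConcern rule in the properties above can be illustrated with a short sketch. It models only the two statements made in the description (writeConcern must not exceed replicationFactor, and a shard refuses writes when fewer copies than writeConcern are in sync). The function names are hypothetical, and the special cases for SatelliteCollections are deliberately left out.

```python
def validate_write_concern(write_concern: int, replication_factor: int) -> None:
    """Reject a writeConcern that is greater than the replicationFactor."""
    if write_concern > replication_factor:
        raise ValueError(
            f"writeConcern ({write_concern}) cannot be greater than "
            f"replicationFactor ({replication_factor})"
        )


def shard_accepts_writes(in_sync_copies: int, write_concern: int) -> bool:
    """A shard writes only if at least writeConcern copies are in sync."""
    return in_sync_copies >= write_concern


validate_write_concern(write_concern=2, replication_factor=3)  # passes silently
print(shard_accepts_writes(in_sync_copies=2, write_concern=2))  # True
print(shard_accepts_writes(in_sync_copies=1, write_concern=2))  # False
```

Each shard is evaluated on its own, which is why writes to shards with enough up-to-date copies can succeed while another shard of the same collection refuses to write.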
Get the document count of a collection
200 OK
All properties of the collection but additionally the document count.
keyOptions* object
An object which contains key generation options.
allowUserKeys* boolean
If set to true, then you are allowed to supply own key values in the _key attribute of a document. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices (Enterprise Edition only). (cluster only)
writeConcern integer
Determines how many copies of each shard are required to be in-sync on the different DB-Servers. If there are less than these many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the default writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database’s default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Response Body application/json object
Examples
Requesting the number of documents:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/count'
Get the collection statistics
details boolean (default: false)
Setting details to true will return extended storage engine-specific details to the figures. The details are intended for debugging ArangoDB itself and their format is subject to change. By default, details is set to false, so no details are returned and the behavior is identical to previous versions of ArangoDB. Please note that requesting details may cause additional load and thus have an impact on performance.
200 OK
All properties of the collection but additionally the document count and collection figures.
keyOptions* object
An object which contains key generation options.
allowUserKeys* boolean
If set to true, then you are allowed to supply own key values in the _key attribute of a document. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices (Enterprise Edition only). (cluster only)
writeConcern integer
Determines how many copies of each shard are required to be in-sync on the different DB-Servers. If there are less than these many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the default writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database’s default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Response Body application/json object
Examples
Using a name and requesting the figures of the collection, first without and then with extended details:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/figures'
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/figures?details=true'
Get the responsible shard for a document
Returns the ID of the shard that is responsible for the given document (if the document exists) or that would be responsible if such document existed.
The request body must contain a JSON document with at least the collection’s shard key attributes set to some values.
The response is a JSON object with a shardId attribute, which contains the ID of the responsible shard.
Examples
curl -X PUT --header 'accept: application/json' --data-binary @- --dump - 'http://localhost:8529/_api/collection/testCollection/responsibleShard' <<'EOF'
{
"_key": "testkey",
"value": 23
}
EOF
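The same request can be prepared from Python. The following is a minimal sketch using only the standard library. The helper name responsible_shard_request and the default endpoint are assumptions for illustration, and the request is only built here, not sent.

```python
import json
from urllib.request import Request


def responsible_shard_request(collection, document, base="http://localhost:8529"):
    """Prepare a PUT request asking which shard is responsible for `document`.

    `document` must contain at least the collection's shard key attributes.
    """
    url = f"{base}/_api/collection/{collection}/responsibleShard"
    body = json.dumps(document).encode("utf-8")
    headers = {"Accept": "application/json", "Content-Type": "application/json"}
    return Request(url, data=body, method="PUT", headers=headers)


req = responsible_shard_request("testCollection", {"_key": "testkey", "value": 23})
print(req.get_method(), req.full_url)
```

To actually send it, pass the object to urllib.request.urlopen and read the shardId attribute from the decoded JSON response. This requires a reachable cluster and is therefore not shown.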
Get the shard IDs of a collection
Returns a JSON array with the shard IDs of the collection.
If the details parameter is set to true, it returns a JSON object with the shard IDs as object attribute keys, and the responsible servers for each shard mapped to them.
In the detailed response, the leader shards come first in the arrays.
Examples
Retrieves the list of shards:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/testCollection/shards'
Retrieves the list of shards with the responsible servers:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/testCollection/shards?details=true'
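Because the leader comes first in each array of the detailed response, separating leaders from followers is a matter of splitting each list. The sketch below assumes that you have already extracted the mapping of shard IDs to server lists from the response. The shard and server names in the sample data are made up.

```python
def split_leaders_and_followers(shard_map):
    """Map each shard ID to its leader (first entry) and followers (the rest)."""
    result = {}
    for shard_id, servers in shard_map.items():
        if not servers:
            # No responsible server listed; skip rather than guess.
            continue
        leader, *followers = servers
        result[shard_id] = {"leader": leader, "followers": followers}
    return result


# Hypothetical sample data in the shape described above.
sample = {
    "s100001": ["PRMR-aaaa", "PRMR-bbbb"],
    "s100002": ["PRMR-bbbb"],
}
for shard_id, roles in split_leaders_and_followers(sample).items():
    print(shard_id, roles["leader"], roles["followers"])
```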
Get the collection revision ID
200 OK
All collection properties but additionally the collection revision.
keyOptions* object
An object which contains key generation options.
allowUserKeys* boolean
If set to true, then you are allowed to supply own key values in the _key attribute of a document. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices (Enterprise Edition only). (cluster only)
writeConcern integer
Determines how many copies of each shard are required to be in-sync on the different DB-Servers. If there are less than these many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the default writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database’s default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Response Body application/json object
Examples
Retrieving the revision of a collection:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/revision'
Get the collection checksum
Calculates a checksum of the meta-data (keys and optionally revision ids) and optionally the document data in the collection.
The checksum can be used to compare if two collections on different ArangoDB instances contain the same contents. The current revision of the collection is returned too so one can make sure the checksums are calculated for the same state of data.
By default, the checksum is only calculated on the _key system attribute of the documents contained in the collection. For edge collections, the system attributes _from and _to are also included in the calculation.
If you set the optional query parameter withRevisions to true, then revision IDs (_rev system attributes) are included in the checksum.
If you set the optional query parameter withData to true, the user-defined document attributes are included in the calculation, too.
Examples
Retrieving the checksum of a collection:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/checksum'
Retrieving the checksum of a collection including the collection data, but not the revisions:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/checksum?withRevisions=false&withData=true'
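A comparison of two collections along these lines might look like the sketch below. It assumes that the parsed response carries the checksum in an attribute named checksum. That attribute name, the helper names, and the sample values are assumptions for illustration rather than something stated in this section.

```python
from urllib.parse import urlencode


def checksum_url(collection, with_revisions=False, with_data=False,
                 base="http://localhost:8529"):
    """Build the checksum URL with both optional query parameters spelled out."""
    query = urlencode({
        "withRevisions": str(with_revisions).lower(),
        "withData": str(with_data).lower(),
    })
    return f"{base}/_api/collection/{collection}/checksum?{query}"


def same_contents(response_a, response_b):
    """Compare two parsed checksum responses requested with the same options."""
    return response_a["checksum"] == response_b["checksum"]


print(checksum_url("products", with_data=True))

# Hypothetical parsed responses from two different instances.
print(same_contents({"checksum": "123"}, {"checksum": "123"}))
```

Both responses must be requested with identical withRevisions and withData settings. Otherwise the checksums cover different inputs and cannot be compared.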
Get the available key generators
Examples
Retrieving the key generators for collections:
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/key-generators'
Create and delete collections
Create a collection
cacheEnabled boolean (default: false)
Whether the in-memory hash cache for documents should be enabled for this collection. Can be controlled globally with the --cache.size startup option. The cache can speed up repeated reads of the same documents via their document keys. If the same documents are not fetched often or are modified frequently, then you may disable the cache to avoid the maintenance costs.
computedValues array of objects
An optional list of objects, each representing a computed value.
expression* string
An AQL RETURN operation with an expression that computes the desired value. See Computed Value Expressions for details.
distributeShardsLike string (default: "")
The name of another collection. If this property is set in a cluster, the collection copies the replicationFactor, numberOfShards and shardingStrategy properties from the specified collection (referred to as the prototype collection) and distributes the shards of this collection in the same way as the shards of the other collection. In an Enterprise Edition cluster, this data co-location is utilized to optimize queries.
You need to use the same number of shardKeys as the prototype collection, but you can use different attributes.
Using this parameter has consequences for the prototype collection. It can no longer be dropped, before the sharding-imitating collections are dropped. Equally, backups and restores of imitating collections alone generate warnings (which can be overridden) about a missing sharding prototype.
isSystem boolean (default: false)
If true, create a system collection. In this case, the collection-name should start with an underscore. End-users should normally create non-system collections only. API implementors may be required to create system collections in very special occasions, but normally a regular collection will do.
keyOptions object
Additional options for key generation. If specified, then keyOptions should be a JSON object containing the following attributes:
allowUserKeys boolean
If set to true, then you are allowed to supply own key values in the _key attribute of documents. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
type string
Specifies the type of the key generator. The currently available generators are traditional, autoincrement, uuid and padded.
The traditional key generator generates numerical keys in ascending order. The sequence of keys is not guaranteed to be gap-free.
The autoincrement key generator generates numerical keys in ascending order. The initial offset and the spacing can be configured (note: autoincrement is currently only supported for non-sharded collections). The sequence of generated keys is not guaranteed to be gap-free, because a new key will be generated on every document insert attempt, not just for successful inserts.
The padded key generator generates keys of a fixed length (16 bytes) in ascending lexicographical sort order. This is ideal for the RocksDB storage engine, which slightly benefits from keys that are inserted in lexicographically ascending order. The key generator can be used in a single-server or cluster. The sequence of generated keys is not guaranteed to be gap-free.
The uuid key generator generates universally unique 128 bit keys, which are stored in hexadecimal human-readable format. This key generator can be used in a single-server or cluster to generate “seemingly random” keys. The keys produced by this key generator are not lexicographically sorted.
Please note that keys are only guaranteed to be truly ascending in single server deployments and for collections that only have a single shard (that includes collections in a OneShard database). The reason is that for collections with more than a single shard, document keys are generated on Coordinator(s). For collections with a single shard, the document keys are generated on the leader DB-Server, which has full control over the key sequence.
replicationFactor integer (default: 1)
In a cluster, this attribute determines how many copies of each shard are kept on different DB-Servers. The value 1 means that only one copy (no synchronous replication) is kept. A value of k means that k-1 replicas are kept. For SatelliteCollections, it needs to be the string "satellite", which matches the replication factor to the number of DB-Servers (Enterprise Edition only).
Any two copies reside on different DB-Servers. Replication between them is synchronous, that is, every write operation to the “leader” copy will be replicated to all “follower” replicas, before the write operation is reported successful.
If a server fails, this is detected automatically and one of the servers holding copies takes over, usually without an error being reported.
schema object
Optional object that specifies the collection level schema for documents. The attribute keys rule, level and message must follow the rules documented in Document Schema Validation.
shardKeys string (default: ["_key"])
In a cluster, this attribute determines which document attributes are used to determine the target shard for documents. Documents are sent to shards based on the values of their shard key attributes. The values of all shard key attributes in a document are hashed, and the hash value is used to determine the target shard.
Values of shard key attributes cannot be changed once set.
shardingStrategy string
This attribute specifies the name of the sharding strategy to use for the collection. There are different sharding strategies to select from when creating a new collection. The selected shardingStrategy value remains fixed for the collection and cannot be changed afterwards. This is important to make the collection keep its sharding settings and always find documents already distributed to shards using the same initial sharding algorithm.
The available sharding strategies are:
community-compat: default sharding used by ArangoDB Community Edition before version 3.4
enterprise-compat: default sharding used by ArangoDB Enterprise Edition before version 3.4
enterprise-smart-edge-compat: default sharding used by smart edge collections in ArangoDB Enterprise Edition before version 3.4
hash: default sharding used for new collections starting from version 3.4 (excluding smart edge collections)
enterprise-hash-smart-edge: default sharding used for new smart edge collections starting from version 3.4
enterprise-hex-smart-vertex: sharding used for vertex collections of EnterpriseGraphs
If no sharding strategy is specified, the default is hash for all normal collections, enterprise-hash-smart-edge for all smart edge collections, and enterprise-hex-smart-vertex for EnterpriseGraph vertex collections (the latter two require the Enterprise Edition of ArangoDB). Manually overriding the sharding strategy does not yet provide a benefit, but it may later in case other sharding strategies are added.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices.
This feature can only be used in the Enterprise Edition.
smartJoinAttribute string
In an Enterprise Edition cluster, this attribute determines an attribute of the collection that must contain the shard key value of the referred-to SmartJoin collection. Additionally, the shard key for a document in this collection must contain the value of this attribute, followed by a colon, followed by the actual primary key of the document.
This feature can only be used in the Enterprise Edition and requires the distributeShardsLike attribute of the collection to be set to the name of another collection. It also requires the shardKeys attribute of the collection to be set to a single shard key attribute, with an additional ‘:’ at the end. A further restriction is that whenever documents are stored or updated in the collection, the value stored in the smartJoinAttribute must be a string.
writeConcern integer
Determines how many copies of each shard are required to be in sync on the different DB-Servers. If there are less than these many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the default writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database’s default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
200 OK
The collection has been created.
keyOptions* object
An object which contains key generation options.
allowUserKeys* boolean
If set to true, then you are allowed to supply own key values in the _key attribute of a document. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices (Enterprise Edition only). (cluster only)
writeConcern integer
Determines how many copies of each shard are required to be in-sync on the different DB-Servers. If there are less than these many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the default writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database’s default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Response Body application/json object
Examples
curl -X POST --header 'accept: application/json' --data-binary @- --dump - 'http://localhost:8529/_api/collection' <<'EOF'
{
"name": "testCollectionBasics"
}
EOF
curl -X POST --header 'accept: application/json' --data-binary @- --dump - 'http://localhost:8529/_api/collection' <<'EOF'
{
"name": "testCollectionEdges",
"type": 3
}
EOF
curl -X POST --header 'accept: application/json' --data-binary @- --dump - 'http://localhost:8529/_api/collection' <<'EOF'
{
"name": "testCollectionUsers",
"keyOptions": {
"type": "autoincrement",
"increment": 5,
"allowUserKeys": true
}
}
EOF
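The request bodies in these examples can also be assembled programmatically. The sketch below builds a body like the ones above and checks two constraints from the keyOptions description before anything is sent: the generator type must be one of the four listed, and autoincrement is rejected for collections with more than one shard. The function name and its parameters are hypothetical. Passing numberOfShards in the body is an assumption based on the property name mentioned under distributeShardsLike.

```python
import json

KEY_GENERATORS = {"traditional", "autoincrement", "uuid", "padded"}


def create_collection_body(name, key_type="traditional", increment=None,
                           allow_user_keys=True, number_of_shards=1, edge=False):
    """Build the JSON body for creating a collection, with basic sanity checks."""
    if key_type not in KEY_GENERATORS:
        raise ValueError(f"unknown key generator: {key_type}")
    if key_type == "autoincrement" and number_of_shards > 1:
        raise ValueError("autoincrement is only supported for non-sharded collections")
    if increment is not None and key_type != "autoincrement":
        raise ValueError("increment only applies to the autoincrement generator")

    key_options = {"type": key_type, "allowUserKeys": allow_user_keys}
    if increment is not None:
        key_options["increment"] = increment

    body = {"name": name, "keyOptions": key_options}
    if edge:
        body["type"] = 3  # same value as in the edge collection example above
    if number_of_shards != 1:
        body["numberOfShards"] = number_of_shards  # assumption, see lead-in
    return body


body = create_collection_body("testCollectionUsers", key_type="autoincrement", increment=5)
print(json.dumps(body, indent=2))
```

Validating locally only catches the obvious mistakes early. The server remains the authority on which combinations it accepts.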
Drop a collection
Drops the collection identified by collection-name and all its documents.
Examples
Using an identifier:
curl -X DELETE --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/68769'
Using a name:
curl -X DELETE --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products1'
Dropping a system collection:
curl -X DELETE --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/_example?isSystem=true'
Truncate a collection
Examples
curl -X PUT --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/truncate'
Modify collections
Load a collection
Since ArangoDB version 3.9.0 this API does nothing. Previously, it used to load a collection into memory.
Examples
curl -X PUT --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/load'
Unload a collection
Since ArangoDB version 3.9.0 this API does nothing. Previously it used to unload a collection from memory, while preserving all documents.
Examples
curl -X PUT --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/unload'
Load collection indexes into memory
You can call this endpoint to try to cache this collection’s index entries in the main memory. Index lookups served from the memory cache can be much faster than lookups not stored in the cache, resulting in a performance boost.
The endpoint iterates over suitable indexes of the collection and stores the indexed values (not the entire document data) in memory. This is implemented for edge indexes only.
The endpoint returns as soon as the index warmup has been scheduled. The index warmup may still be ongoing in the background, even after the return value has already been sent. As all suitable indexes are scanned, it may cause significant I/O activity and background load.
This feature honors memory limits. If the indexes you want to load are smaller than your memory limit, this feature guarantees that most index values are cached. If the index is greater than your memory limit, this feature fills up values up to this limit. You cannot control which indexes of the collection should have priority over others.
It is guaranteed that the in-memory cache data is consistent with the stored index data at all times.
Examples
curl -X PUT --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/products/loadIndexesIntoMemory'
Change the properties of a collection
cacheEnabled boolean (default: false)
Whether the in-memory hash cache for documents should be enabled for this collection. Can be controlled globally with the --cache.size startup option. The cache can speed up repeated reads of the same documents via their document keys. If the same documents are not fetched often or are modified frequently, then you may disable the cache to avoid the maintenance costs.
computedValues array of objects
An optional list of objects, each representing a computed value.
expression* string
An AQL RETURN operation with an expression that computes the desired value. See Computed Value Expressions for details.
replicationFactor integer (default: 1)
In a cluster, this attribute determines how many copies of each shard are kept on different DB-Servers. The value 1 means that only one copy (no synchronous replication) is kept. A value of k means that k-1 replicas are kept. For SatelliteCollections, it needs to be the string "satellite", which matches the replication factor to the number of DB-Servers (Enterprise Edition only).
Any two copies reside on different DB-Servers. Replication between them is synchronous, that is, every write operation to the “leader” copy will be replicated to all “follower” replicas, before the write operation is reported successful.
If a server fails, this is detected automatically and one of the servers holding copies takes over, usually without an error being reported.
schema object
Optional object that specifies the collection level schema for documents. The attribute keys rule, level and message must follow the rules documented in Document Schema Validation.
writeConcern integer
Determines how many copies of each shard are required to be in sync on the different DB-Servers. If there are less than these many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the default writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database’s default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
200 OK
The collection has been updated successfully.
keyOptions* object
An object which contains key generation options.
allowUserKeys* boolean
If set to true, then you are allowed to supply own key values in the _key attribute of a document. If set to false, then the key generator is solely responsible for generating keys and an error is raised if you supply own key values in the _key attribute of documents.
You should not use both user-specified and automatically generated document keys in the same collection in cluster deployments for collections with more than a single shard. Mixing the two can lead to conflicts because Coordinators that auto-generate keys in this case are not aware of all keys which are already used.
smartGraphAttribute string
The attribute that is used for sharding: vertices with the same value of this attribute are placed in the same shard. All vertices are required to have this attribute set and it has to be a string. Edges derive the attribute from their connected vertices (Enterprise Edition only). (cluster only)
writeConcern integer
Determines how many copies of each shard are required to be in-sync on the different DB-Servers. If there are less than these many copies in the cluster, a shard refuses to write. Writes to shards with enough up-to-date copies succeed at the same time, however. The value of writeConcern cannot be greater than replicationFactor.
If distributeShardsLike is set, the default writeConcern is that of the prototype collection. For SatelliteCollections, the writeConcern is automatically controlled to equal the number of DB-Servers and has a value of 0. Otherwise, the default value is controlled by the current database’s default writeConcern, which uses the --cluster.write-concern startup option as default, which defaults to 1. (cluster only)
Response Body application/json object
Examples
curl -X PUT --header 'accept: application/json' --data-binary @- --dump - 'http://localhost:8529/_api/collection/products/properties' <<'EOF'
{
"waitForSync": true
}
EOF
Rename a collection
Renames a collection.
If renaming the collection succeeds, then the collection is also renamed in all graph definitions inside the _graphs collection in the current database.
Examples
curl -X PUT --header 'accept: application/json' --data-binary @- --dump - 'http://localhost:8529/_api/collection/products1/rename' <<'EOF'
{
"name": "newname"
}
EOF
Recalculate the document count of a collection
Compact a collection
Compacts the data of a collection in order to reclaim disk space. The operation will compact the document and index data by rewriting the underlying .sst files and only keeping the relevant entries.
Under normal circumstances, running a compact operation is not necessary, as the collection data will eventually get compacted anyway. However, in some situations, e.g. after running lots of update/replace or remove operations, the disk data for a collection may contain a lot of outdated data for which the space shall be reclaimed. In this case the compaction operation can be used.
Examples
curl -X PUT --header 'accept: application/json' --dump - 'http://localhost:8529/_api/collection/testCollection/compact'