HTTP interface for WAL access
The WAL Access API facilitates faster and more reliable asynchronous
replication. The API offers access to the write-ahead log or operations log
of the ArangoDB server. As a public API, these REST endpoints are only
supported on a single-server instance. While they are also available on
DB-Server instances, accessing them as a user is not supported. This API
replaces some of the APIs in /_api/replication.
{% comment -%} Since the removal of AF, DC2DC, and Leader/Follower replication, the only async replication remaining is used to initialize new DB-Servers for clusters (snapshot transfer / SynchronizeShard) - and this is done using the below endpoints. They can also be used for testing cluster replication code without running a full cluster and can thus not be removed until Replication 1 is gone. {% endcomment -%}
Get the tick ranges available in the WAL
Returns the currently available ranges of tick values for all Write-Ahead Log (WAL) files. The tick values can be used to determine if certain data (identified by tick value) are still available for replication.
The body of the response contains a JSON object.
tickMin
: minimum tick available
tickMax
: maximum tick available
time
: the server time as a string in the format YYYY-MM-DDTHH:MM:SSZ
server
: an object with the fields version and serverId
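As a rough illustration of how a client might use these values, the following Python sketch checks whether a given tick still falls inside the reported range. The function name tick_in_range and all values in the sample response are assumptions made for this example. The ticks are converted with int() so that the check works whether they arrive as numbers or as stringified numbers.

```python
def tick_in_range(range_response: dict, tick: int) -> bool:
    """Return True if `tick` lies within [tickMin, tickMax] of a range response."""
    tick_min = int(range_response["tickMin"])
    tick_max = int(range_response["tickMax"])
    return tick_min <= tick <= tick_max


# Hypothetical response body, shaped after the attributes described above.
sample_range = {
    "tickMin": "5",
    "tickMax": "184200",
    "time": "2024-01-01T12:00:00Z",
    "server": {"version": "x.y.z", "serverId": "123"},
}

if __name__ == "__main__":
    print(tick_in_range(sample_range, 100))
```

A client that finds its last applied tick outside the range would typically need a fresh initial synchronization instead of incremental tailing.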
Examples
Returns the available tick ranges.
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/range'
Get the last available tick value
Returns the last available tick value that can be served from the server’s replication log. This corresponds to the tick of the latest successful operation.
The result is a JSON object containing the attributes tick, time, and server.
tick
: contains the last available tick
time
: the server time as a string in the format YYYY-MM-DDTHH:MM:SSZ
server
: an object with the fields version and serverId
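The sketch below shows one way a Python client might turn such a response into typed values. The helper name parse_last_tick and the sample values used in the usage note are assumptions for illustration. The trailing Z in the time format is treated as UTC.

```python
from datetime import datetime, timezone
from typing import Tuple


def parse_last_tick(response: dict) -> Tuple[int, datetime]:
    """Extract the tick as an int and the server time as an aware UTC datetime."""
    tick = int(response["tick"])
    # The documented format is YYYY-MM-DDTHH:MM:SSZ; the literal "Z" marks UTC.
    server_time = datetime.strptime(response["time"], "%Y-%m-%dT%H:%M:%SZ")
    return tick, server_time.replace(tzinfo=timezone.utc)
```

For example, parse_last_tick({"tick": "184200", "time": "2024-01-01T12:00:00Z"}) yields the integer 184200 together with a timezone-aware datetime for noon UTC on that day.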
Examples
Returning the last available tick
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/lastTick'
Tail recent server operations
Returns data from the server’s write-ahead log (also named replication log). This method can be called by replication clients after an initial synchronization of data. The method returns all “recent” logged operations from the server. Clients can replay and apply these operations locally so they get to the same data state as the server.
Clients can call this method repeatedly to incrementally fetch all changes
from the server. In this case, they should provide the from value so that
they only receive the log events since their last fetch.
When the from query parameter is not used, the server returns log entries
starting at the beginning of its replication log. When the from parameter is
used, the server only returns log entries which have higher tick values than
the specified from value (note: the log entry with a tick value equal to from
is excluded). Use the from value when incrementally fetching log data.
The to query parameter can be used to optionally restrict the upper bound of
the result to a certain tick value. If used, the result contains only log
events with tick values up to (including) to. In incremental fetching, there
is no need to use the to parameter. It only makes sense in special situations,
when only parts of the change log are required.
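To make the role of from and to concrete, here is a small Python sketch that builds the request URL for the tailing endpoint. The function name tail_url and its parameter names from_tick and to_tick are hypothetical (from is a reserved word in Python); only the endpoint path and the query parameter names come from this page.

```python
from typing import Optional
from urllib.parse import urlencode


def tail_url(base: str, from_tick: Optional[int] = None,
             to_tick: Optional[int] = None) -> str:
    """Build a /_api/wal/tail URL; omitted parameters are left out entirely."""
    params = {}
    if from_tick is not None:
        params["from"] = from_tick  # exclusive lower bound
    if to_tick is not None:
        params["to"] = to_tick      # inclusive upper bound
    url = base.rstrip("/") + "/_api/wal/tail"
    query = urlencode(params)
    return url + "?" + query if query else url
```

Calling tail_url("http://localhost:8529", from_tick=184200) produces the same URL as the curl examples on this page.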
The chunkSize query parameter can be used to control the size of the result.
It must be specified in bytes. The chunkSize value is only honored
approximately. Otherwise, a chunkSize value that is too low could prevent the
server from putting even a single log entry into the result and returning it.
Therefore, the chunkSize value is only consulted after a log entry has been
written into the result. If the result size is then greater than chunkSize,
the server responds with as many log entries as there are in the response
already. If the result size is still less than chunkSize, the server tries to
return more data if there’s more data left to return.
If chunkSize is not specified, some server-side default value is used.
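The following Python sketch models the check-after-write behavior described above. It is not the server's implementation: the function name fill_chunk is made up, sizes are measured on a JSON encoding chosen for this example, and the case where the size equals chunkSize exactly is not specified on this page, so the sketch simply continues in that case.

```python
import json
from typing import Iterable, List


def fill_chunk(entries: Iterable[dict], chunk_size: int) -> List[str]:
    """Collect serialized entries; the size limit is checked only after adding one."""
    lines: List[str] = []
    size = 0
    for entry in entries:
        line = json.dumps(entry)
        lines.append(line)                      # always write the entry first
        size += len(line.encode("utf-8")) + 1   # +1 for the line break
        if size > chunk_size:                   # then consult the limit
            break
    return lines
```

Because the limit is consulted after the write, even a chunk size of 1 returns one entry, and the result may exceed the requested size by up to one entry.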
The Content-Type of the result is application/x-arango-dump. This is an
easy-to-process format, with all log events going onto separate lines in the
response body. Each log event itself is a JSON object, with at least the
following attributes:
tick
: the log event tick value
type
: the log event type
Individual log events also have additional attributes, depending on the event type. A few common attributes which are used for multiple event types are:
cuid
: globally unique id of the View or collection the event was for
db
: the database name the event was for
tid
: id of the transaction the event was contained in
data
: the original document data
For a more detailed description of the individual replication event types and their data structures, see the Operation Types.
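Since every log event occupies its own line, a response body can be decoded line by line. The Python sketch below illustrates this; the function name parse_dump and the sample body are assumptions for this example, and empty lines are skipped as a precaution.

```python
import json
from typing import List


def parse_dump(body: str) -> List[dict]:
    """Decode a line-delimited response body into a list of log events."""
    events = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing line break
        events.append(json.loads(line))
    return events


# Hypothetical body with two events, one JSON object per line.
sample_body = (
    '{"tick":"3651","type":2200,"db":"_system","tid":"556"}\n'
    '{"tick":"3652","type":2201,"db":"_system","tid":"556"}\n'
)
```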
The response also contains the following HTTP headers:
x-arango-replication-active
: whether or not the logger is active. Clients can use this flag as an indication for their polling frequency. If the logger is not active and there are no more replication events available, it might be sensible for a client to abort, or to go to sleep for a long time and try again later to check whether the logger has been activated.
x-arango-replication-lastincluded
: the tick value of the last included value in the result. In incremental log fetching, this value can be used as the from value for the following request. Note that if the result is empty, the value is 0. A value of 0 should not be used as the from value by clients in the next request (otherwise the server would return the log events from the start of the log again).
x-arango-replication-lastscanned
: the last tick the server scanned while computing the operation log. This might include operations the server did not return to you for various reasons (e.g. the value was filtered or skipped). You may use this value in the lastScanned header to allow the RocksDB storage engine to break up requests over multiple responses.
x-arango-replication-lasttick
: the last tick value the server has logged in its write-ahead log (not necessarily included in the result). By comparing the last tick and last included tick values, clients have an approximate indication of how many events there are still left to fetch.
x-arango-replication-frompresent
: is set to true if the server returned all tick values starting from the tick specified in the from parameter. If it is set to false, the server no longer had these operations and the client might have missed operations.
x-arango-replication-checkmore
: whether or not there already exists more log data which the client could fetch immediately. If there is more log data available, the client could call the tailing API again with an adjusted from value to fetch remaining log entries until there are no more. If there isn’t any more log data to fetch, the client might decide to go to sleep for a while before calling the logger again.
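A client loop could combine these headers roughly as in the Python sketch below. The function name plan_next_request is hypothetical, the header mapping is assumed to use lowercase keys, and boolean headers are assumed to carry the strings true or false. The sketch keeps the previous from value when lastincluded is 0, as advised above.

```python
from typing import Mapping, Optional, Tuple


def plan_next_request(headers: Mapping[str, str],
                      current_from: Optional[int]) -> Tuple[Optional[int], bool]:
    """Return (from value for the next request, whether to fetch again right away)."""
    last_included = int(headers.get("x-arango-replication-lastincluded", "0"))
    # 0 means the result was empty; reusing it would restart from the log's beginning.
    next_from = last_included if last_included != 0 else current_from
    check_more = headers.get("x-arango-replication-checkmore", "false").lower() == "true"
    return next_from, check_more
```

When the second element is False, the client might sleep for a while before polling again.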
Examples
No log events available
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=184200'
A few log events (One JSON document per line)
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=184200'
More events than would fit into the response
curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=184235&chunkSize=400'
Operation Types
There are several different operation types that an ArangoDB server might print.
All operations include a tick value which identifies their place in the operations log.
The numeric fields tick and tid always contain stringified numbers to avoid problems with
drivers where numbers in JSON might be mishandled.
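One practical consequence is that clients should convert these strings to integers before comparing or sorting them, because string comparison orders digits lexicographically. A minimal Python sketch with made-up sample events:

```python
# Hypothetical events; only the stringified tick matters for this example.
events = [
    {"tick": "3652", "type": 2201},
    {"tick": "385", "type": 2002},
    {"tick": "10000", "type": 2300},
]


def tick_of(event: dict) -> int:
    """Convert the stringified tick into an integer for numeric comparison."""
    return int(event["tick"])


# Numeric order: 385, 3652, 10000
ordered = sorted(events, key=tick_of)

# Lexicographic order of the raw strings would be misleading: "10000" sorts first.
naive = sorted(event["tick"] for event in events)
```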
The following operation types are used in ArangoDB:
Create Database (1100)
Create a database. Contains the field db with the database name and the field data, which contains the database definition.
{
  "tick": "2103",
  "type": 1100,
  "db": "test",
  "data": {
    "database": 337,
    "id": "337",
    "name": "test"
  }
}
Drop Database (1101)
Drop a database. Contains the field db with the database name.
{
  "tick": "3453",
  "type": 1101,
  "db": "test"
}
Create Collection (2000)
Create a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data attribute contains the collection definition.
{
  "tick": "3702",
  "db": "_system",
  "cuid": "hC0CF79DA83B4/555",
  "type": 2000,
  "data": {
    "allowUserKeys": true,
    "cacheEnabled": false,
    "cid": "555",
    "deleted": false,
    "globallyUniqueId": "hC0CF79DA83B4/555",
    "id": "555",
    "indexes": [],
    "isSystem": false,
    "keyOptions": {
      "allowUserKeys": true,
      "lastValue": 0,
      "type": "traditional"
    },
    "name": "test"
  }
}
Drop Collection (2001)
Drop a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection.
{
  "tick": "154",
  "type": 2001,
  "db": "_system",
  "cuid": "hD15F8FE99859/555"
}
Rename Collection (2002)
Rename a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data field contains the name field with the new name.
{
  "tick": "385",
  "db": "_system",
  "cuid": "hD15F8FE99859/135",
  "type": 2002,
  "data": {
    "name": "other"
  }
}
Change Collection (2003)
Change collection properties. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data attribute contains the updated collection definition.
{
  "tick": "154",
  "type": 2003,
  "db": "_system",
  "cuid": "hD15F8FE99859/555",
  "data": {
    "waitForSync": true
  }
}
Truncate Collection (2004)
Truncate a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection.
{
  "tick": "154",
  "type": 2004,
  "db": "_system",
  "cuid": "hD15F8FE99859/555"
}
Create Index (2100)
Create an index. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The field data contains the index definition.
{
  "tick": "1327",
  "type": 2100,
  "db": "_system",
  "cuid": "hD15F8FE99859/555",
  "data": {
    "deduplicate": true,
    "fields": [
      "value"
    ],
    "id": "260",
    "selectivityEstimate": 1,
    "sparse": false,
    "type": "persistent",
    "unique": false
  }
}
Drop Index (2101)
Drop an index. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The field data contains the field id with the index id.
{
  "tick": "1522",
  "type": 2101,
  "db": "_system",
  "cuid": "hD15F8FE99859/555",
  "data": {
    "id": "260"
  }
}
Create View (2110)
Create a view. Contains the field db with the database name, and cuid with the globally unique id to identify this view. The field data contains the view definition.
{
  "tick": "1833",
  "type": 2110,
  "db": "_system",
  "cuid": "hD15F8FE99859/322",
  "data": {
    "cleanupIntervalStep": 10,
    "collections": [],
    "commitIntervalMsec": 60000,
    "consolidate": {
      "segmentThreshold": 300,
      "threshold": 0.8500000238418579,
      "type": "tier"
    },
    "deleted": false,
    "globallyUniqueId": "hD15F8FE99859/322",
    "id": "322",
    "isSystem": false,
    "locale": "C",
    "name": "myview",
    "type": "arangosearch"
  }
}
Drop View (2111)
Drop a view. Contains the field db with the database name, and cuid with the globally unique id to identify this view.
{
  "tick": "3113",
  "type": 2111,
  "db": "_system",
  "cuid": "hD15F8FE99859/322"
}
Change View (2112)
Change view properties (including the name). Contains the field db with the database name and cuid with the globally unique id to identify this view. The data attribute contains the updated properties.
{
  "tick": "3014",
  "type": 2112,
  "db": "_system",
  "cuid": "hD15F8FE99859/457",
  "data": {
    "cleanupIntervalStep": 10,
    "collections": [
      135
    ],
    "commitIntervalMsec": 60000,
    "consolidate": {
      "segmentThreshold": 300,
      "threshold": 0.8500000238418579,
      "type": "tier"
    },
    "deleted": false,
    "globallyUniqueId": "hD15F8FE99859/457",
    "id": "457",
    "isSystem": false,
    "locale": "C",
    "name": "renamedview",
    "type": "arangosearch"
  }
}
Start Transaction (2200)
Mark the beginning of a transaction. Contains the field db with the database name and the field tid for the transaction id. This log entry might be followed by zero or more document operations and then either one commit or an abort operation (i.e. types 2300, 2302 and 2201 / 2202) with the same tid value.
{
  "tick": "3651",
  "type": 2200,
  "db": "_system",
  "tid": "556"
}
Commit Transaction (2201)
Mark the successful end of a transaction. Contains the field db with the database name and the field tid for the transaction id.
{
  "tick": "3652",
  "type": 2201,
  "db": "_system",
  "tid": "556"
}
Abort Transaction (2202)
Mark the abortion of a transaction. Contains the field db with the database name and the field tid for the transaction id.
{
  "tick": "3654",
  "type": 2202,
  "db": "_system",
  "tid": "556"
}
Insert / Replace Document (2300)
Insert or replace a document. Contains the field db with the database name, cuid with the globally unique id to identify the collection, and the field tid for the transaction id. The field tid might contain the value “0” to identify a single operation that is not part of a multi-document transaction. The field data contains the document. If the field _rev exists, the client can choose to perform a revision check against a locally available version of the document to ensure consistency.
{
  "tick": "196",
  "type": 2300,
  "db": "_system",
  "tid": "0",
  "cuid": "hE0E3D7BE511D/119",
  "data": {
    "_id": "users/194",
    "_key": "194",
    "_rev": "_XUJFD3C---",
    "value": "test"
  }
}
Remove Document (2302)
Remove a document. Contains the field db with the database name, cuid with the globally unique id to identify the collection and the field tid for the transaction id. The field tid might contain the value “0” to identify a single operation that is not part of a multi-document transaction. The field data contains the _key and _rev of the removed document. The client can choose to perform a revision check against a locally available version of the document to ensure consistency.
{
  "cuid": "hE0E3D7BE511D/119",
  "data": {
    "_key": "194",
    "_rev": "_XUJIbS---_"
  },
  "db": "_system",
  "tick": "397",
  "tid": "0",
  "type": 2302
}
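The transaction markers and document operations above suggest a simple replay strategy: buffer document operations per tid until the matching commit arrives, and discard them on abort. The Python sketch below is one possible way to do this, not a description of how ArangoDB itself applies the log. The function name committed_operations is hypothetical, and the choice to pass all other events (including operations with tid "0") straight through is an assumption of this example.

```python
from typing import Dict, Iterable, Iterator, List

START, COMMIT, ABORT = 2200, 2201, 2202
DOCUMENT_OPS = {2300, 2302}  # insert/replace and remove


def committed_operations(events: Iterable[dict]) -> Iterator[dict]:
    """Yield events that are safe to apply, holding back uncommitted transactions."""
    pending: Dict[str, List[dict]] = {}
    for event in events:
        kind = event["type"]
        tid = event.get("tid")
        if kind == START:
            pending[tid] = []                    # open a buffer for this transaction
        elif kind == COMMIT:
            yield from pending.pop(tid, [])      # release buffered operations in order
        elif kind == ABORT:
            pending.pop(tid, None)               # drop everything from this transaction
        elif kind in DOCUMENT_OPS and tid in pending:
            pending[tid].append(event)           # part of an open transaction
        else:
            yield event                          # tid "0" operations and other events
```

Note that operations from a committed transaction are released at the commit marker, so they can appear after standalone operations that were logged while the transaction was still open.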