HTTP interface for WAL access

The WAL Access API is used to facilitate faster and more reliable asynchronous replication. The API offers access to the write-ahead log or operations log of the ArangoDB server. As a public API, it is only supported to access these REST endpoints on a single-server instance. While these APIs are also available on DB-Server instances, accessing them as a user is not supported. This API replaces some of the APIs in /_api/replication.

{% comment -%} Since the removal of AF, DC2DC, and Leader/Follower replication, the only async replication remaining is used to initialize new DB-Servers for clusters (snapshot transfer / SynchronizeShard) - and this is done using the below endpoints. They can also be used for testing cluster replication code without running a full cluster and can thus not be removed until Replication 1 is gone. {% endcomment -%}

Get the tick ranges available in the WAL

get /_db/{database-name}/_api/wal/range

Returns the currently available ranges of tick values for all Write-Ahead Log (WAL) files. The tick values can be used to determine if certain data (identified by tick value) are still available for replication.

The body of the response contains a JSON object.

  • tickMin: minimum tick available
  • tickMax: maximum tick available
  • time: the server time as string in format YYYY-MM-DDTHH:MM:SSZ
  • server: An object with fields version and serverId
Path Parameters
  • The name of a database. The user account you authenticate with needs at least read access to this database and administrate access to the _system database.

Query Parameters
    HTTP Headers
      Responses
      • is returned if the tick ranges could be determined successfully.

      • is returned when an invalid HTTP method is used.

      • is returned if the server operations state could not be determined.

      • is returned when this operation is called on a Coordinator in a cluster deployment.

      Examples

      Returns the available tick ranges.

      curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/range'
      Show output

      Get the last available tick value

      get /_db/{database-name}/_api/wal/lastTick

      Returns the last available tick value that can be served from the server’s replication log. This corresponds to the tick of the latest successful operation.

      The result is a JSON object containing the attributes tick, time and server.

      • tick: contains the last available tick, time
      • time: the server time as string in format YYYY-MM-DDTHH:MM:SSZ
      • server: An object with fields version and serverId
      This method is not supported on a Coordinator in a cluster deployment.
      Path Parameters
      • The name of a database. The user account you authenticate with needs at least read access to this database and administrate access to the _system database.

      Query Parameters
        HTTP Headers
          Responses
          • is returned if the request was executed successfully.

          • is returned when an invalid HTTP method is used.

          • is returned if an error occurred while assembling the response.

          • is returned when this operation is called on a Coordinator in a cluster deployment.

          Examples

          Returning the first available tick

          curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/lastTick'
          Show output

          Tail recent server operations

          get /_db/{database-name}/_api/wal/tail

          Returns data from the server’s write-ahead log (also named replication log). This method can be called by replication clients after an initial synchronization of data. The method returns all “recent” logged operations from the server. Clients can replay and apply these operations locally so they get to the same data state as the server.

          Clients can call this method repeatedly to incrementally fetch all changes from the server. In this case, they should provide the from value so they only get returned the log events since their last fetch.

          When the from query parameter is not used, the server returns log entries starting at the beginning of its replication log. When the from parameter is used, the server only returns log entries which have higher tick values than the specified from value (note: the log entry with a tick value equal to from is excluded). Use the from value when incrementally fetching log data.

          The to query parameter can be used to optionally restrict the upper bound of the result to a certain tick value. If used, the result contains only log events with tick values up to (including) to. In incremental fetching, there is no need to use the to parameter. It only makes sense in special situations, when only parts of the change log are required.

          The chunkSize query parameter can be used to control the size of the result. It must be specified in bytes. The chunkSize value is only honored approximately. Otherwise, a too low chunkSize value could cause the server to not be able to put just one log entry into the result and return it. Therefore, the chunkSize value is only consulted after a log entry has been written into the result. If the result size is then greater than chunkSize, the server responds with as many log entries as there are in the response already. If the result size is still less than chunkSize, the server tries to return more data if there’s more data left to return.

          If chunkSize is not specified, some server-side default value is used.

          The Content-Type of the result is application/x-arango-dump. This is an easy-to-process format, with all log events going onto separate lines in the response body. Each log event itself is a JSON object, with at least the following attributes:

          • tick: the log event tick value

          • type: the log event type

          Individual log events also have additional attributes, depending on the event type. A few common attributes which are used for multiple events types are:

          • cuid: globally unique id of the View or collection the event was for

          • db: the database name the event was for

          • tid: id of the transaction the event was contained in

          • data: the original document data

          For a more detailed description of the individual replication event types and their data structures, see the Operation Types.

          The response also contains the following HTTP headers:

          • x-arango-replication-active: whether or not the logger is active. Clients can use this flag as an indication for their polling frequency. If the logger is not active and there are no more replication events available, it might be sensible for a client to abort, or to go to sleep for a long time and try again later to check whether the logger has been activated.

          • x-arango-replication-lastincluded: the tick value of the last included value in the result. In incremental log fetching, this value can be used as the from value for the following request. Note that if the result is empty, the value is 0. This value should not be used as from value by clients in the next request (otherwise the server would return the log events from the start of the log again).

          • x-arango-replication-lastscanned: the last tick the server scanned while computing the operation log. This might include operations the server did not returned to you due to various reasons (i.e. the value was filtered or skipped). You may use this value in the lastScanned header to allow the RocksDB storage engine to break up requests over multiple responses.

          • x-arango-replication-lasttick: the last tick value the server has logged in its write ahead log (not necessarily included in the result). By comparing the last tick and last included tick values, clients have an approximate indication of how many events there are still left to fetch.

          • x-arango-replication-frompresent: is set to true if server returned all tick values starting from the specified tick in the from parameter. Should this be set to false the server did not have these operations anymore and the client might have missed operations.

          • x-arango-replication-checkmore: whether or not there already exists more log data which the client could fetch immediately. If there is more log data available, the client could call the tailing API again with an adjusted from value to fetch remaining log entries until there are no more.

            If there isn’t any more log data to fetch, the client might decide to go to sleep for a while before calling the logger again.

          This method is not supported on a Coordinator in a cluster deployment.
          Path Parameters
          • The name of a database. The user account you authenticate with needs at least read access to this database and administrate access to the _system database.

          Query Parameters
          • Whether operations for all databases should be included. If set to false, only the operations for the current database are included. The value true is only valid on the _system database. The default is false.

          • Exclusive lower bound tick value for results. On successive calls to this API you should set this to the value returned with the x-arango-replication-lastincluded header (unless that header contains 0).

          • Inclusive upper bound tick value for results.

          • Should be set to the value of the x-arango-replication-lastscanned header or alternatively 0 on the first try. This allows the RocksDB storage engine to break up large transactions over multiple responses.

          • Approximate maximum size of the returned result.

          • The ID of the client used to tail results. The server uses this to keep operations until the client has fetched them. Must be a positive integer.

            Either syncerId or serverId is required to fetch all operations.

          • The ID of the client machine. If syncerId is unset, the server uses this to keep operations until the client has fetched them. Must be a positive integer.

            Either syncerId or serverId is required to fetch all operations.

          • Short description of the client, used for informative purposes only.

          HTTP Headers
            Responses
            • is returned if the request was executed successfully, and there are log events available for the requested range. The response body is not empty in this case.

            • is returned if the request was executed successfully, but there are no log events available for the requested range. The response body is empty in this case.

            • is returned if either the from or to values are invalid.

            • is returned when an invalid HTTP method is used.

            • is returned if an error occurred while assembling the response.

            • is returned when this operation is called on a Coordinator in a cluster deployment.

            Examples

            No log events available

            curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=184233'
            Show output

            A few log events (One JSON document per line)

            curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=184233'
            Show output

            More events than would fit into the response

            curl --header 'accept: application/json' --dump - 'http://localhost:8529/_api/wal/tail?from=184259&chunkSize=400'
            Show output

            Operation Types

            There are several different operation types thar an ArangoDB server might print. All operations include a tick value which identified their place in the operations log. The numeric fields tick and tid always contain stringified numbers to avoid problems with drivers where numbers in JSON might be mishandled.

            The following operation types are used in ArangoDB:

            Create Database (1100)

            Create a database. Contains the field db with the database name and the field data, contains the database definition.

            {
              "tick": "2103",
              "type": 1100,
              "db": "test",
              "data": {
                "database": 337,
                "id": "337",
                "name": "test"
              }
            }
            

            Drop Database (1101)

            Drop a database. Contains the field db with the database name.

            {
              "tick": "3453",
              "type": 1101,
              "db": "test"
            }
            

            Create Collection (2000)

            Create a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data attribute contains the collection definition.

            {
              "tick": "3702",
              "db": "_system",
              "cuid": "hC0CF79DA83B4/555",
              "type": 2000,
              "data": {
                "allowUserKeys": true,
                "cacheEnabled": false,
                "cid": "555",
                "deleted": false,
                "globallyUniqueId": "hC0CF79DA83B4/555",
                "id": "555",
                "indexes": [],
                "isSystem": false,
                "keyOptions": {
                  "allowUserKeys": true,
                  "lastValue": 0,
                  "type": "traditional"
                },
                "name": "test"
              }
            }
            

            Drop Collection (2001)

            Drop a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection.

            {
              "tick": "154",
              "type": 2001,
              "db": "_system",
              "cuid": "hD15F8FE99859/555"
            }
            

            Rename Collection (2002)

            Rename a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data field contains the name field with the new name

            {
              "tick": "385",
              "db": "_system",
              "cuid": "hD15F8FE99859/135",
              "type": 2002,
              "data": {
                "name": "other"
              }
            }
            

            Change Collection (2003)

            Change collection properties. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The data attribute contains the updated collection definition.

            {
              "tick": "154",
              "type": 2003,
              "db": "_system",
              "cuid": "hD15F8FE99859/555",
              "data": {
                "waitForSync": true
              }
            }
            

            Truncate Collection (2004)

            Truncate a collection. Contains the field db with the database name, and cuid with the globally unique id to identify this collection.

            {
              "tick": "154",
              "type": 2004,
              "db": "_system",
              "cuid": "hD15F8FE99859/555"
            }
            

            Create Index (2100)

            Create an index. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The field data contains the index definition.

            {
              "tick": "1327",
              "type": 2100,
              "db": "_system",
              "cuid": "hD15F8FE99859/555",
              "data": {
                "deduplicate": true,
                "fields": [
                  "value"
                ],
                "id": "260",
                "selectivityEstimate": 1,
                "sparse": false,
                "type": "persistent",
                "unique": false
              }
            }
            

            Drop Index (2101)

            Drop an index. Contains the field db with the database name, and cuid with the globally unique id to identify this collection. The field data contains the field id with the index id.

            {
              "tick": "1522",
              "type": 2101,
              "db": "_system",
              "cuid": "hD15F8FE99859/555",
              "data": {
                "id": "260"
              }
            }
            

            Create View (2110)

            Create a view. Contains the field db with the database name, and cuid with the globally unique id to identify this view. The field data contains the view definition

            {
              "tick": "1833",
              "type": 2110,
              "db": "_system",
              "cuid": "hD15F8FE99859/322",
              "data": {
                "cleanupIntervalStep": 10,
                "collections": [],
                "commitIntervalMsec": 60000,
                "consolidate": {
                  "segmentThreshold": 300,
                  "threshold": 0.8500000238418579,
                  "type": "tier"
                },
                "deleted": false,
                "globallyUniqueId": "hD15F8FE99859/322",
                "id": "322",
                "isSystem": false,
                "locale": "C",
                "name": "myview",
                "type": "arangosearch"
              }
            }
            

            Drop View (2111)

            Drop a view. Contains the field db with the database name, and cuid with the globally unique id to identify this view.

            {
              "tick": "3113",
              "type": 2111,
              "db": "_system",
              "cuid": "hD15F8FE99859/322"
            }
            

            Change View (2112)

            Change view properties (including the name). Contains the field db with the database name and cuid with the globally unique id to identify this view. The data attribute contain the updated properties.

            {
              "tick": "3014",
              "type": 2112,
              "db": "_system",
              "cuid": "hD15F8FE99859/457",
              "data": {
                "cleanupIntervalStep": 10,
                "collections": [
                  135
                ],
                "commitIntervalMsec": 60000,
                "consolidate": {
                  "segmentThreshold": 300,
                  "threshold": 0.8500000238418579,
                  "type": "tier"
                },
                "deleted": false,
                "globallyUniqueId": "hD15F8FE99859/457",
                "id": "457",
                "isSystem": false,
                "locale": "C",
                "name": "renamedview",
                "type": "arangosearch"
              }
            }
            

            Start Transaction (2200)

            Mark the beginning of a transaction. Contains the field db with the database name and the field tid for the transaction id. This log entry might be followed by zero or more document operations and then either one commit or an abort operation (i.e. types 2300, 2302 and 2201 / 2202) with the same tid value.

            {
              "tick": "3651",
              "type": 2200,
              "db": "_system",
              "tid": "556"
            }
            

            Commit Transaction (2201)

            Mark the successful end of a transaction. Contains the field db with the database name and the field tid for the transaction id.

            {
              "tick": "3652",
              "type": 2201,
              "db": "_system",
              "tid": "556"
            }
            

            Abort Transaction (2202)

            Mark the abortion of a transaction. Contains the field db with the database name and the field tid for the transaction id.

            {
              "tick": "3654",
              "type": 2202,
              "db": "_system",
              "tid": "556"
            }
            

            Insert / Replace Document (2300)

            Insert or replace a document. Contains the field db with the database name, cuid with the globally unique id to identify the collection and the field tid for the transaction id. The field tid might contain the value “0” to identify a single operation that is not part of a multi-document transaction. The field data contains the document. If the field _rev exists the client can choose to perform a revision check against a locally available version of the document to ensure consistency.

            {
              "tick": "196",
              "type": 2300,
              "db": "_system",
              "tid": "0",
              "cuid": "hE0E3D7BE511D/119",
              "data": {
                "_id": "users/194",
                "_key": "194",
                "_rev": "_XUJFD3C---",
                "value": "test"
              }
            }
            

            Remove Document (2302)

            Remove a document. Contains the field db with the database name, cuid with the globally unique id to identify the collection and the field tid for the transaction id. The field tid might contain the value “0” to identify a single operation that is not part of a multi-document transaction. The field data contains the _key and _rev of the removed document. The client can choose to perform a revision check against a locally available version of the document to ensure consistency.

            {
              "cuid": "hE0E3D7BE511D/119",
              "data": {
                "_key": "194",
                "_rev": "_XUJIbS---_"
              },
              "db": "_system",
              "tick": "397",
              "tid": "0",
              "type": 2302
            }