Administrate an Active Failover deployment
Introduction
The Active Failover setup requires almost no manual administration.
You may still need to replace, upgrade or remove individual nodes in an Active Failover setup.
Determining the current Leader
It is possible to determine the leader by asking any of the involved single-server
instances. Just send a request to the /_api/cluster/endpoints REST API.
curl http://server.domain.org:8530/_api/cluster/endpoints
{
  "error": false,
  "code": 200,
  "endpoints": [
    {
      "endpoint": "tcp://[::1]:8530"
    },
    {
      "endpoint": "tcp://[::1]:8531"
    }
  ]
}
This API returns all available endpoints. The first endpoint is defined to
be the current Leader. This API is always available and will not be blocked
with an HTTP/1.1 503 Service Unavailable response on a Follower.
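As a minimal sketch of how a client could use this response, the following Python snippet picks the first entry of the endpoints array as the Leader. The function name leader_endpoint and the SAMPLE constant are hypothetical; the sample body mirrors the response shown above.

```python
import json


def leader_endpoint(response_body: str) -> str:
    """Return the first endpoint of a /_api/cluster/endpoints response.

    The first entry is defined to be the current Leader.
    """
    data = json.loads(response_body)
    if data.get("error"):
        raise RuntimeError("endpoints API reported an error")
    endpoints = data.get("endpoints", [])
    if not endpoints:
        raise RuntimeError("endpoints API returned no endpoints")
    return endpoints[0]["endpoint"]


# Sample body copied from the example response in this section.
SAMPLE = (
    '{"error": false, "code": 200, "endpoints": ['
    '{"endpoint": "tcp://[::1]:8530"}, {"endpoint": "tcp://[::1]:8531"}]}'
)

print(leader_endpoint(SAMPLE))  # prints tcp://[::1]:8530
```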
Reading from Follower
Followers in the Active Failover setup are in a read-only mode. It is possible to read from these
followers by adding an X-Arango-Allow-Dirty-Read: true header to each request. Responses will then
automatically contain the X-Arango-Potential-Dirty-Read header so that clients can reject
accidental dirty reads.
If the driver for your programming language supports dirty reads, you can enable this option there.
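Without driver support, the header can also be set by hand. The sketch below uses only the Python standard library. The helper names dirty_read_request and is_potential_dirty_read are hypothetical, and so are the Follower URL and the collection and key in the usage comment. The check only tests for the presence of the response header, since that is all this section states about it.

```python
import urllib.request


def dirty_read_request(url: str) -> urllib.request.Request:
    """Build a GET request that allows a dirty read from a Follower."""
    request = urllib.request.Request(url)
    request.add_header("X-Arango-Allow-Dirty-Read", "true")
    return request


def is_potential_dirty_read(response_headers) -> bool:
    """Return True if the response carries X-Arango-Potential-Dirty-Read.

    Accepts any iterable of header names (a dict or an HTTPMessage, for example).
    HTTP header names are case-insensitive, so compare in lower case.
    """
    return any(
        name.lower() == "x-arango-potential-dirty-read"
        for name in response_headers
    )


# Usage against a running Follower (hypothetical URL, collection and key):
#
#   req = dirty_read_request(
#       "http://server.domain.org:8531/_api/document/mycollection/mykey")
#   with urllib.request.urlopen(req) as resp:
#       if is_potential_dirty_read(resp.headers):
#           print("this response may be a dirty read")
```

A client that must not act on stale data can use the second helper to discard such responses and retry against the Leader.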
Upgrading / Replacing / Removing a Leader
A Leader is the active server which can receive all read and write operations in an Active Failover setup.
Upgrading or removing a Leader can be tricky, because stopping the Leader's process triggers a failover. The failover may be intended here, but you will probably want to halt all writes to the Leader for a while first, so that the Follower can catch up on all operations.
After you have ensured that the follower is sufficiently caught up, you can
stop the leader process via the shutdown API or by sending a SIGTERM signal
to the process (e.g. kill <process-id>). This will trigger an orderly shutdown
and should trigger an immediate switch to the follower. If your client drivers
are configured correctly, you should notice almost no interruption in your
applications.
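The SIGTERM route and the subsequent switch can be sketched in Python as follows, assuming a POSIX system. Both function names are hypothetical. The get_leader callable is an assumption as well: it stands for any function that returns the first endpoint reported by /_api/cluster/endpoints, queried on a server that stays up during the failover.

```python
import os
import signal
import time
from typing import Callable


def request_orderly_shutdown(pid: int) -> None:
    """Send SIGTERM to a process, equivalent to `kill <process-id>`."""
    os.kill(pid, signal.SIGTERM)


def wait_for_new_leader(
    get_leader: Callable[[], str],
    old_leader: str,
    timeout: float = 30.0,
    interval: float = 0.5,
) -> str:
    """Poll get_leader() until it reports an endpoint other than old_leader.

    Raises TimeoutError if no switch is observed within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        leader = get_leader()
        if leader != old_leader:
            return leader
        if time.monotonic() >= deadline:
            raise TimeoutError("no leader switch observed within the timeout")
        time.sleep(interval)
```

Passing the lookup in as a callable keeps the polling logic independent of how the endpoints API is queried, so it can be exercised without a running deployment.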
Once you have upgraded the local server via the --database.auto-upgrade option,
you can add it to the Active Failover setup again. The server will resync automatically
with the new Leader and become a Follower.
Upgrading / Replacing / Removing a Follower
A Follower is the passive server which tries to mirror all the data stored in the Leader.
To upgrade a follower you only need to stop the process and start it
with --database.auto-upgrade. The server process will automatically resync
with the Leader after a restart.
The clean way of removing a Follower is to first start a replacement Follower (otherwise you will lose resiliency). Once the replacement is ready, you can kill the old Follower's process and remove it.
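Before removing the old Follower, it may help to confirm that the replacement appears in the /_api/cluster/endpoints response. The Python sketch below does only that. The function name lists_endpoint, the EXAMPLE body and the endpoint values in it are hypothetical. Being listed shows that the server is known to the setup; it is not proof that the server has fully caught up with the Leader.

```python
import json


def lists_endpoint(response_body: str, endpoint: str) -> bool:
    """Return True if `endpoint` appears in a /_api/cluster/endpoints response."""
    data = json.loads(response_body)
    return any(
        entry.get("endpoint") == endpoint
        for entry in data.get("endpoints", [])
    )


# Hypothetical response: the Leader first, then the old and the new Follower.
EXAMPLE = (
    '{"error": false, "code": 200, "endpoints": ['
    '{"endpoint": "tcp://[::1]:8530"}, '
    '{"endpoint": "tcp://[::1]:8531"}, '
    '{"endpoint": "tcp://[::1]:8532"}]}'
)

if lists_endpoint(EXAMPLE, "tcp://[::1]:8532"):
    print("replacement Follower is listed")
```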