AQL query logging
You can optionally let ArangoDB write metadata of finished AQL queries to a collection for later analysis
Introduced in: v3.12.2
For debugging query issues and to understand usage patterns, it can be helpful to have a persistent log of queries.
ArangoDB lets you store information about past queries to a system collection with a configurable sampling probability and retention period. This allows you to analyze the metadata such as run time, memory usage, and failure reasons directly in the database system.
ArangoDB also supports event logging to a file, syslog, or the attached terminal.
See --log.level
for
details. The relevant log topics are aql
and queries
.
Enable query logging
To activate the logging of AQL queries to the _queries
collection, you need to
enable the --query.collection-logger-enabled
startup option for arangod.
If you want queries that run in the _system
database to be logged, you
additionally need to enable the --query.collection-logger-include-system-database
startup option.
As logging all queries can have a high overhead on busy systems, only a random
sample of the total queries is typically logged. You can control the probability
with the --query.collection-logger-probability
startup option. A value of 100
logs all queries, whereas a value of 1
approximately logs every 100th query
and ignores the rest. Which queries are logged is based on randomness.
You can enable the --query.collection-logger-all-slow-queries
startup option
to always log slow queries regardless of whether they are selected for sampling
or not. You can configure the time threshold for what is considered a slow query
with the --query.slow-threshold
and --query.slow-streaming-threshold
startup options.
Use the logged metadata
Query metadata is stored in the _queries
collection in the _system
database.
This collection is created automatically if needed.
_queries
collection is a normal system collection, so it can be queried by
every user with at least read access to the _system
database and the _queries
collection. Keep this in mind when enabling the query logging as the data may
include sensitive information such as user names, query strings, and bind parameters.Only finished queries are logged to the _queries
collection. Queries are
considered finished when they have executed completely or failed with an error.
In-flight queries are not logged but should eventually become finished queries.
Each document in the _queries
collection represents a past query that has been
sampled and includes the following attributes:
id
: The internal ID of the query.database
: The name of the database the query ran in.user
: The name of the user that executed the query.query
: The query string, if the--query.tracking-with-querystring
startup option is enabled (the default istrue
). Otherwise, the value is"<hidden>"
. The query string is cut off after--query.max-artifact-log-length
characters.bindVars
: The bind parameter values used by the query, if the--query.tracking-with-bindvars
startup option is enabled (the default istrue
). If it is disabled or no bind parameters were used, the value is{}
.dataSources
: An array of collection and View names that were used in the query. Only present if--query.tracking-with-datasources
startup option is enabled (the default isfalse
).started
: An ISO 8601 date time strings with the point in time the query started executing.runTime
: The total duration of the query in seconds.state
: The state of the query (always"finished"
).stream
: Whether the query was a streaming AQL query (stream
option).modificationQuery
: Whether the query created, modified, or deleted any documents.warnings
: The number of warnings issued by the query.exitCode
: The exit code of the query (0
= success, any other exit code indicates a specific error).
You can retrieve and analyze the stored query metadata by running AQL queries on
the _queries
collection in the _system
database. You can use arbitrary
filtering, sorting, and aggregation.
Examples
Return the logged queries issued by a specific user and that ran for at least 10 seconds, sorted by start time:
FOR doc IN _queries
FILTER doc.runTime >= 10.0
FILTER doc.user == @user
SORT doc.started
RETURN doc
Group the logged queries from a specified time range by database and user and return the count, database name, and user name for each group.
FOR doc IN _queries
FILTER doc.started >= @start
FILTER doc.started < @end
COLLECT db = doc.database, user = doc.user WITH COUNT INTO count
RETURN { db, user, count }
Log retention
When query logging is enabled, obsolete entries from the _queries
collection
are purged with a configurable schedule. The --query.collection-logger-retention-time
startup option determines for how long after a query’s start time an entry is
approximately retained in the system collection. For example, a value of 86400
keeps the document around for approximately one day (86400 seconds).
The default retention time is 28800
seconds (8 hours).
The actual cleanup of the system collection runs in configurable intervals.
This ensures that the cleanup process imposes minimal load.
You can configure it with the --query.collection-logger-cleanup-interval
startup option (in milliseconds).
Log buffering
You can use the --query.collection-logger-push-interval
startup option to set
a maximum wait time after which queries are logged to the system collection.
This is a performance optimization that helps to reduce the overhead of the
query logging. For example, a value of 10000
buffers query log entries in
memory for at most 10,000 milliseconds before they are actually written to the
system collection. When additional queries complete within this interval, their
log entries are batched together into a single write operation to the
system collection. This can amortize the cost of writing the query metadata to
the system collection across multiple user queries.
The --query.collection-logger-max-buffered-queries
startup option limits the
number of query log entries to buffer in memory before they are flushed to the
system collection. Once this limit has been reached, no further query metadata
is buffered in memory and it is lost. To make this relatively unlikely, a flush
is triggered automatically once 25% of the limit is reached. However, it is
still possible that the single-thread flush operation cannot keep up with the
rate of incoming queries so that the limit is reached and some query metadata is
not logged.
Any queries that are buffered in memory and are not yet flushed out to the system collection is lost in case the arangod process shuts down or crashes. The query logging functionality should therefore not be used for auditing but rather for debugging and troubleshooting query issues, such as finding long-running queries, queries that produced warnings or errors, users that overuse the database, and so on.