Explain AQL Queries
You can explain and profile AQL queries to inspect the execution plans and to understand the performance characteristics, as well as create debug packages for reporting issues
If it is unclear how a given query will perform, clients can retrieve a query’s execution plan
from the AQL query optimizer without actually executing the query. Getting the query execution
plan from the optimizer is called explaining.
An explain throws an error if the given query is syntactically invalid. Otherwise, it
returns the execution plan and some information about what optimizations could be applied to
the query. The query is not executed.
You can explain a query using the HTTP REST API
or via arangosh.
Inspecting query plans
The explain()
method of an ArangoStatement
(db._createStatement(...).explain()
)
creates very verbose output. To get a human-readable output of the query plan,
you can use db._explain()
. You can use it as follows (without disabling syntax
highlighting with { colors: false }
):
db._explain("LET s = SLEEP(0.25) LET t = SLEEP(0.5) RETURN 1", {}, {colors: false});
Show outputQuery String (47 chars, cacheable: false):
LET s = SLEEP(0.25) LET t = SLEEP(0.5) RETURN 1
Execution plan:
Id NodeType Par Est. Comment
1 SingletonNode 1 * ROOT
4 CalculationNode ✓ 1 - LET #2 = 1 /* json expression */ /* const assignment */
2 CalculationNode 1 - LET s = SLEEP(0.25) /* simple expression */
3 CalculationNode 1 - LET t = SLEEP(0.5) /* simple expression */
5 ReturnNode 1 - RETURN #2
Indexes used:
none
Functions used:
Name Deterministic Cacheable Uses V8
SLEEP false false false
Optimization rules applied:
Id Rule Name Id Rule Name
1 move-calculations-up 2 async-prefetch
55 rule(s) executed, 1 plan(s) created, peak mem [b]: 0, exec time [s]: 0.00013
The plan contains all execution nodes that are used during a query. These nodes represent different
stages in a query. Each stage gets the input from the stage directly above (its dependencies).
The plan shows you the estimated number of items (results) for each query stage (under Est.). Each
query stage roughly equates to a line in your original query, which you can see under Comment.
Profiling queries
Sometimes when you have a complex query it can be unclear on what time is spent
during the execution, even for intermediate ArangoDB users.
By profiling a query it gets executed with special instrumentation code enabled.
It gives you all the usual information like when explaining a query, but
additionally you get the query profile, runtime statistics
and per-node statistics.
To use this in an interactive fashion in the shell, you can call
db._profileQuery()
, or use the web interface. You can use db._profileQuery()
as follows (without disabling syntax highlighting with { colors: false }
):
db._profileQuery("LET s = SLEEP(0.25) LET t = SLEEP(0.5) RETURN 1", {}, {colors: false});
Show outputQuery String (47 chars, cacheable: false):
LET s = SLEEP(0.25) LET t = SLEEP(0.5) RETURN 1
Execution plan:
Id NodeType Calls Par Items Filtered Runtime [s] Comment
1 SingletonNode 1 - 1 0 0.00000 * ROOT
4 CalculationNode 1 0 1 0 0.00000 - LET #2 = 1 /* json expression */ /* const assignment */
2 CalculationNode 1 - 1 0 0.25160 - LET s = SLEEP(0.25) /* simple expression */
3 CalculationNode 1 - 1 0 0.50315 - LET t = SLEEP(0.5) /* simple expression */
5 ReturnNode 1 - 1 0 0.00001 - RETURN #2
Indexes used:
none
Optimization rules applied:
Id Rule Name Id Rule Name
1 move-calculations-up 2 async-prefetch
Query Statistics:
Writes Exec Writes Ign Doc. Lookups Scan Full Scan Index Cache Hits/Misses Filtered Peak Mem [b] Exec Time [s]
0 0 0 0 0 0 / 0 0 0 0.75492
Query Profile:
Query Stage Duration [s] Query Stage Duration [s] Query Stage Duration [s]
initializing 0.00000 loading collections 0.00001 instantiating executors 0.00003
parsing 0.00003 instantiating plan 0.00001 executing 0.75477
optimizing ast 0.00000 optimizing plan 0.00007 finalizing 0.00003
For more information, see Profiling Queries.
Execution plans in detail
By default, the query optimizer returns what it considers to be the optimal plan. The
optimal plan is returned in the plan
attribute of the result. If explain
is
called with the allPlans
option set to true
, all plans are returned in the plans
attribute.
The result object also contains a warnings
attribute, which is an array of
warnings that occurred during optimization or execution plan creation.
Each plan in the result is an object with the following attributes:
nodes
: the array of execution nodes of the plan. See the list of
execution nodesestimatedCost
: the total estimated cost for the plan. If there are multiple
plans, the optimizer chooses the plan with the lowest total cost.collections
: an array of collections used in the queryrules
: an array of rules the optimizer applied. See the list of
optimizer rulesvariables
: array of variables used in the query (note: this may contain
internal variables created by the optimizer)
Here is an example for retrieving the execution plan of a simple query:
var stmt = db._createStatement("FOR user IN _users RETURN user");
stmt.explain();
Show output{
"plan" : {
"nodes" : [
{
"type" : "SingletonNode",
"dependencies" : [ ],
"id" : 1,
"estimatedCost" : 1,
"estimatedNrItems" : 1,
"bindParameterVariables" : {
}
},
{
"type" : "EnumerateCollectionNode",
"dependencies" : [
1
],
"id" : 2,
"estimatedCost" : 3,
"estimatedNrItems" : 1,
"isAsyncPrefetchEnabled" : true,
"random" : false,
"indexHint" : {
"forced" : false,
"lookahead" : 1,
"type" : "none"
},
"outVariable" : {
"id" : 0,
"name" : "user",
"isFullDocumentFromCollection" : true
},
"projections" : [ ],
"filterProjections" : [ ],
"count" : false,
"producesResult" : true,
"readOwnWrites" : false,
"useCache" : true,
"maxProjections" : 5,
"collection" : "_users",
"satellite" : false,
"isSatellite" : false,
"isSatelliteOf" : null
},
{
"type" : "ReturnNode",
"dependencies" : [
2
],
"id" : 3,
"estimatedCost" : 4,
"estimatedNrItems" : 1,
"inVariable" : {
"id" : 0,
"name" : "user",
"isFullDocumentFromCollection" : true
},
"count" : true
}
],
"rules" : [
"async-prefetch"
],
"estimatedCost" : 4,
"estimatedNrItems" : 1,
"collections" : [
{
"name" : "_users",
"type" : "read"
}
],
"variables" : [
{
"id" : 0,
"name" : "user",
"isFullDocumentFromCollection" : true
}
],
"isModificationQuery" : false
},
"warnings" : [ ],
"stats" : {
"rulesExecuted" : 55,
"rulesSkipped" : 0,
"plansCreated" : 1,
"peakMemoryUsage" : 0,
"executionTime" : 0.0001437970000210953
},
"cacheable" : true
}
As the output of explain()
is very detailed, it is recommended to use some
scripting to make the output less verbose:
var formatPlan = function (plan) {
return { estimatedCost: plan.estimatedCost,
nodes: plan.nodes.map(function(node) {
return node.type; }) }; };
formatPlan(stmt.explain().plan);
Show output{
"estimatedCost" : 4,
"nodes" : [
"SingletonNode",
"EnumerateCollectionNode",
"ReturnNode"
]
}
If a query contains bind parameters, they must be added to the statement before
explain()
is called:
var stmt = db._createStatement(
`FOR doc IN @@collection FILTER doc.user == @user RETURN doc`
);
stmt.bind({ "@collection" : "_users", "user" : "root" });
stmt.explain();
Show output{
"plan" : {
"nodes" : [
{
"type" : "SingletonNode",
"dependencies" : [ ],
"id" : 1,
"estimatedCost" : 1,
"estimatedNrItems" : 1,
"bindParameterVariables" : {
}
},
{
"type" : "IndexNode",
"dependencies" : [
1
],
"id" : 6,
"estimatedCost" : 2.04475,
"estimatedNrItems" : 1,
"isAsyncPrefetchEnabled" : true,
"outVariable" : {
"id" : 0,
"name" : "doc",
"isFullDocumentFromCollection" : true
},
"projections" : [ ],
"filterProjections" : [ ],
"count" : false,
"producesResult" : true,
"readOwnWrites" : false,
"useCache" : true,
"maxProjections" : 5,
"collection" : "_users",
"satellite" : false,
"isSatellite" : false,
"isSatelliteOf" : null,
"strategy" : "document",
"needsGatherNodeSort" : false,
"indexCoversProjections" : false,
"indexCoversOutProjections" : false,
"indexCoversFilterProjections" : false,
"indexes" : [
{
"id" : "40",
"type" : "hash",
"name" : "idx_1808559055555264512",
"fields" : [
"user"
],
"selectivityEstimate" : 1,
"unique" : true,
"sparse" : true,
"deduplicate" : true,
"estimates" : true,
"cacheEnabled" : false
}
],
"condition" : {
"type" : "n-ary or",
"typeID" : 63,
"subNodes" : [
{
"type" : "n-ary and",
"typeID" : 62,
"subNodes" : [
{
"type" : "compare ==",
"typeID" : 25,
"excludesNull" : false,
"subNodes" : [
{
"type" : "attribute access",
"typeID" : 35,
"name" : "user",
"subNodes" : [
{
"type" : "reference",
"typeID" : 45,
"name" : "doc",
"id" : 0,
"subqueryReference" : false
}
]
},
{
"type" : "value",
"typeID" : 40,
"value" : "root",
"vTypeID" : 4
}
]
}
]
}
]
},
"allCoveredByOneIndex" : false,
"sorted" : true,
"ascending" : true,
"evalFCalls" : true,
"waitForSync" : false,
"limit" : 0,
"isLateMaterialized" : false,
"lookahead" : 1
},
{
"type" : "ReturnNode",
"dependencies" : [
6
],
"id" : 5,
"estimatedCost" : 3.04475,
"estimatedNrItems" : 1,
"inVariable" : {
"id" : 0,
"name" : "doc",
"isFullDocumentFromCollection" : true
},
"count" : true
}
],
"rules" : [
"use-indexes",
"remove-filter-covered-by-index",
"remove-unnecessary-calculations-2",
"async-prefetch"
],
"estimatedCost" : 3.04475,
"estimatedNrItems" : 1,
"collections" : [
{
"name" : "_users",
"type" : "read"
}
],
"variables" : [
{
"id" : 1,
"name" : "1",
"isFullDocumentFromCollection" : false
},
{
"id" : 0,
"name" : "doc",
"isFullDocumentFromCollection" : true
}
],
"isModificationQuery" : false
},
"warnings" : [ ],
"stats" : {
"rulesExecuted" : 55,
"rulesSkipped" : 0,
"plansCreated" : 1,
"peakMemoryUsage" : 0,
"executionTime" : 0.0001603940000336479
},
"cacheable" : true
}
In some cases, the AQL optimizer creates multiple plans for a single query. By default
only the plan with the lowest total estimated cost is kept, and the other plans are
discarded. To retrieve all plans the optimizer has generated, explain
can be called
with the option allPlans
set to true
.
In the following example, the optimizer has created two plans:
var stmt = db._createStatement(
"FOR user IN _users FILTER user.user == 'root' RETURN user");
stmt.explain({ allPlans: true }).plans.length;
To see a slightly more compact version of the plan, the following
transformation can be applied:
stmt.explain({ allPlans: true }).plans.map(
function(plan) { return formatPlan(plan); });
Show output[
{
"estimatedCost" : 3.04475,
"nodes" : [
"SingletonNode",
"IndexNode",
"ReturnNode"
]
}
]
explain()
also accepts the following additional options:
maxPlans
: limits the maximum number of plans that are created by the AQL query optimizeroptimizer
:rules
: an array of to-be-included or to-be-excluded optimizer rules
can be put into this attribute, telling the optimizer to include or exclude
specific rules. To disable a rule, prefix its name with a -
, to enable a rule, prefix it
with a +
. There is also a pseudo-rule all
, which matches all optimizer rules.
The following example disables all optimizer rules but remove-redundant-calculations
:
stmt.explain({ optimizer: {
rules: [ "-all", "+remove-redundant-calculations" ] } });
Show output{
"plan" : {
"nodes" : [
{
"type" : "SingletonNode",
"dependencies" : [ ],
"id" : 1,
"estimatedCost" : 1,
"estimatedNrItems" : 1,
"bindParameterVariables" : {
}
},
{
"type" : "EnumerateCollectionNode",
"dependencies" : [
1
],
"id" : 2,
"estimatedCost" : 3,
"estimatedNrItems" : 1,
"random" : false,
"indexHint" : {
"forced" : false,
"lookahead" : 1,
"type" : "none"
},
"outVariable" : {
"id" : 0,
"name" : "user",
"isFullDocumentFromCollection" : true
},
"projections" : [ ],
"filterProjections" : [ ],
"count" : false,
"producesResult" : true,
"readOwnWrites" : false,
"useCache" : true,
"maxProjections" : 5,
"collection" : "_users",
"satellite" : false,
"isSatellite" : false,
"isSatelliteOf" : null
},
{
"type" : "CalculationNode",
"dependencies" : [
2
],
"id" : 3,
"estimatedCost" : 4,
"estimatedNrItems" : 1,
"expression" : {
"type" : "compare ==",
"typeID" : 25,
"excludesNull" : false,
"subNodes" : [
{
"type" : "attribute access",
"typeID" : 35,
"name" : "user",
"subNodes" : [
{
"type" : "reference",
"typeID" : 45,
"name" : "user",
"id" : 0,
"subqueryReference" : false
}
]
},
{
"type" : "value",
"typeID" : 40,
"value" : "root",
"vTypeID" : 4
}
]
},
"outVariable" : {
"id" : 1,
"name" : "1",
"isFullDocumentFromCollection" : false
},
"canThrow" : false,
"expressionType" : "simple"
},
{
"type" : "FilterNode",
"dependencies" : [
3
],
"id" : 4,
"estimatedCost" : 5,
"estimatedNrItems" : 1,
"inVariable" : {
"id" : 1,
"name" : "1",
"isFullDocumentFromCollection" : false
}
},
{
"type" : "ReturnNode",
"dependencies" : [
4
],
"id" : 5,
"estimatedCost" : 6,
"estimatedNrItems" : 1,
"inVariable" : {
"id" : 0,
"name" : "user",
"isFullDocumentFromCollection" : true
},
"count" : true
}
],
"rules" : [ ],
"estimatedCost" : 6,
"estimatedNrItems" : 1,
"collections" : [
{
"name" : "_users",
"type" : "read"
}
],
"variables" : [
{
"id" : 1,
"name" : "1",
"isFullDocumentFromCollection" : false
},
{
"id" : 0,
"name" : "user",
"isFullDocumentFromCollection" : true
}
],
"isModificationQuery" : false
},
"warnings" : [ ],
"stats" : {
"rulesExecuted" : 6,
"rulesSkipped" : 49,
"plansCreated" : 1,
"peakMemoryUsage" : 0,
"executionTime" : 0.00008838999997351493
},
"cacheable" : true
}
The contents of an execution plan are meant to be machine-readable. To get a human-readable
version of a query’s execution plan, the following commands can be used
(without disabling syntax highlighting with { colors: false }
):
var query = "FOR doc IN mycollection FILTER doc.value > 42 RETURN doc";
require("@arangodb/aql/explainer").explain(query, {colors:false});
Show outputQuery String (56 chars, cacheable: true):
FOR doc IN mycollection FILTER doc.value > 42 RETURN doc
Execution plan:
Id NodeType Par Est. Comment
1 SingletonNode 1 * ROOT
2 EnumerateCollectionNode ✓ 0 - FOR doc IN mycollection /* full collection scan */ FILTER (doc.`value` > 42) /* early pruning */
5 ReturnNode 0 - RETURN doc
Indexes used:
none
Optimization rules applied:
Id Rule Name Id Rule Name Id Rule Name
1 move-filters-into-enumerate 2 optimize-projections 3 async-prefetch
55 rule(s) executed, 1 plan(s) created, peak mem [b]: 0, exec time [s]: 0.00013
The above command prints the query’s execution plan in the ArangoShell
directly, focusing on the most important information.
If an explain provides no suitable insight into why a query does not perform as
expected, it may be reported to the ArangoDB support. In order to make this as easy
as possible, there is a built-in command in ArangoShell for packaging the query, its
bind parameters, and all data required to execute the query elsewhere.
require("@arangodb/aql/explainer").debugDump(filepath, query[, bindVars[, options]])
You can specify the following parameters:
filepath
(string): A file path to save the debug package toquery
(string): An AQL querybindVars
(object, optional): The bind parameters for the queryoptions
(object, optional): Options for the query, with two additionally
supported settings compared to db._query()
:examples
(number, optional): How many sample documents of your
collection data to include. Default: 0
anonymize
(boolean, optional): Whether all string attribute values of
the sample documents shall be substituted with strings like XXX
.
The command stores all data in a file with a configurable filename:
var query = "FOR doc IN mycollection FILTER doc.value > 42 RETURN doc";
require("@arangodb/aql/explainer").debugDump("/tmp/query-debug-info", query);
Show output2024-08-27T16:32:52.344630Z [1459-1] INFO [99d80] {general} stored query debug information in file '/tmp/query-debug-info'
Entitled users can send the generated file to the ArangoDB support to facilitate
reproduction and debugging.
If a query contains bind parameters, you need to specify them along with the query
string:
var query = "FOR doc IN @@collection FILTER doc.value > @value RETURN doc";
var bindVars = { value: 42, "@collection": "mycollection" };
require("@arangodb/aql/explainer").debugDump("/tmp/query-debug-info", query, bindVars);
Show output2024-08-27T16:32:52.351639Z [1459-1] INFO [99d80] {general} stored query debug information in file '/tmp/query-debug-info'
It is also possible to include example documents from the underlying collection in
order to make reproduction even easier. Example documents can be sent as they are, or
in an anonymized form. The number of example documents can be specified in the examples
options attribute, and should generally be kept low. The anonymize
option replaces
the contents of string attributes in the examples with XXX
. However, it does not
replace any other types of data (e.g. numeric values) or attribute names. Attribute
names in the examples are always preserved because they may be indexed and used in
queries:
var query = "FOR doc IN @@collection FILTER doc.value > @value RETURN doc";
var bind = { value: 42, "@collection": "mycollection" };
var options = { examples: 10, anonymize: true };
require("@arangodb/aql/explainer").debugDump("/tmp/query-debug-info", query, bind, options);
Show output2024-08-27T16:32:52.364978Z [1459-1] INFO [99d80] {general} stored query debug information in file '/tmp/query-debug-info'
To get a human-readable output from a debug package JSON file, you can use the
inspectDump()
method:
require("@arangodb/aql/explainer").inspectDump(inFilepath[, outFilepath])
You can specify the following parameters:
inFilepath
(string): The path to the debug package JSON fileoutFilepath
(string, optional): A path to store the formatted output in a
file instead of printing to the shell