ArangoDB v3.13 is under development and not released yet. This documentation is not final and potentially incomplete.
Incompatible changes in ArangoDB 3.10
Check the following list of potential breaking changes before upgrading to this ArangoDB version and adjust any client applications if necessary
Declaration of start vertex collections
In cluster deployments, you need to declare collections that an AQL query
implicitly reads from using the WITH
operation.
From version 3.10.0 onward, it is necessary to also declare the collections of start vertices that are used for graph traversals if you specify start vertices using strings.
In previous versions, the following query would work:
WITH managers
FOR v, e, p IN 1..2 OUTBOUND 'users/1' usersHaveManagers
RETURN { v, e, p }
Now, you need to declare the users
collection as well because the start vertex
users/1
is specified as a string and the users
collection is not otherwise
specified explicitly in a language construct:
WITH users, managers
FOR v, e, p IN 1..2 OUTBOUND 'users/1' usersHaveManagers
RETURN { v, e, p }
In the following case, the users
collection is accessed explicitly to retrieve
the start vertex. Therefore, the collection does not need to be declared using
WITH
:
WITH managers
LET startVertex = (FOR u IN users FILTER u._id == "users/1" RETURN u)[0]
FOR v, e, p IN 1..2 OUTBOUND startVertex usersHaveManagers
RETURN { v, e, p }
Empty Document Updates
ArangoDB 3.10 adds back the optimization for empty document update operations
(i.e. updates in which no attributes were specified to be updated).
Such updates were handled in a special way in ArangoDB until including 3.7.12,
so that no actual writes were performed and replicated. This
also included the _rev
value of documents not being changed on an empty
update.
ArangoDB 3.7.13 and 3.8.0 changed this behavior so that empty document updates
actually performed a write operation, leading to the replication of changes and modification
of the _rev
value of affected.
ArangoDB 3.10 adds back the optimal behavior for empty document
updates, which no longer perform a write operation, do not need to be
replicated and do not change the documents’ _rev
values.
Foxx / Server Console
Previously a call to db._version(true)
inside a Foxx app or the server console
would return a different structure than the same call from arangosh.
Foxx/server console would return { <details> }
while arangosh would return
{ server: ..., license: ..., version: ..., details: { <details> }}
.
This is now unified so that the result structure is always consistent with the
one in arangosh. Any Foxx app or script that ran in the server console which
used db._version(true)
must now be changed to use db._version(true).details
instead.
Indexes
The fulltext index type is now deprecated in favor of ArangoSearch. Fulltext indexes are still usable in this version of ArangoDB, although their usage is now discouraged.
Geo Indexes
After an upgrade to 3.10 or higher, consider to drop and recreate geo indexes. GeoJSON polygons are interpreted slightly differently (and more correctly) in the newer versions.
Legacy geo indexes will continue to work and continue to produce the
same results as in earlier versions, since they will have the option
legacyPolygons
implicitly set to true
.
Newly created indexes will have legacyPolygons
set to false
by default
and thus enable the new polygon parsing.
Note that linear rings are not normalized automatically from version 3.10 onward, following the GeoJSON standard . The “interior” of a polygon strictly conforms to the GeoJSON standard: it lies to the left of the boundary line (in the direction of travel along the boundary line on the surface of the Earth). This can be the “larger” connected component of the surface, or the smaller one. Note that this differs from the interpretation of GeoJSON polygons in version 3.9 and older:
legacyPolygons enabled | legacyPolygons disabled |
---|---|
The smaller of the two regions defined by a linear ring is interpreted as the interior of the ring. | The area to the left of the boundary ring’s path is considered to be the interior. |
A ring can at most enclose half the Earth’s surface | A ring can enclose the entire surface of the Earth |
This can mean that old polygon GeoJSON data in the database is suddenly interpreted in a different way. See Legacy Polygons for details. Also see the definition of Polygons and GeoJSON interpretation.
geojson
Analyzers
If you use geojson
Analyzers and upgrade from a version below 3.10 to a
version of 3.10 or higher, the interpretation of GeoJSON Polygons changes.
If you have polygons in your data that mean to refer to a relatively small region but have the boundary running clockwise around the intended interior, they are interpreted as intended prior to 3.10, but from 3.10 onward, they are interpreted as “the other side” of the boundary.
Whether a clockwise boundary specifies the complement of the small region intentionally or not cannot be determined automatically. Please test the new behavior manually.
See Legacy Polygons for details.
Introduced in: v3.10.5
Analyzers of the geojson
type have a new legacy
property.
The default is false
.
This option controls how GeoJSON Polygons are interpreted.
If you require the old behavior, upgrade to at least 3.10.5, drop your
geojson
Analyzers, and create new ones with legacy
set to true
.
If these Analyzers are used in arangosearch
Views, then they need to be
dropped as well before dropping the Analyzers, and recreated after creating
the new Analyzers.
legacy enabled | legacy disabled |
---|---|
The smaller of the two regions defined by a linear ring is interpreted as the interior of the ring. | The area to the left of the boundary ring’s path is considered to be the interior. |
A ring can at most enclose half the Earth’s surface | A ring can enclose the entire surface of the Earth |
Also see the definition of Polygons and the
geojson
Analyzer documentation.
Maximum Array / Object Nesting
When reading any data from JSON or VelocyPack input or when serializing any data to JSON or
VelocyPack, there is a maximum recursion depth for nested arrays and objects, which is slightly
below 200. Arrays or objects with higher nesting than this will cause Too deep nesting in Array/Object
exceptions.
The limit is also enforced when converting any server data to JavaScript in Foxx, or
when sending JavaScript input data from Foxx to a server API.
This maximum recursion depth is hard-coded in arangod and all client tools.
Validation of smartGraphAttribute
in SmartGraphs
Introduced in: v3.10.13
The attribute defined by the smartGraphAttribute
graph property is not allowed to be
changed in the documents of SmartGraph vertex collections. This is now strictly enforced.
See API Changes in ArangoDB 3.10
for details and instructions on how to repair affected attributes.
Validation of traversal collection restrictions
Introduced in: v3.9.11, v3.10.7
In AQL graph traversals, you can restrict the vertex and edge collections in the traversal options like so:
FOR v, e, p IN 1..3 OUTBOUND 'products/123' components
OPTIONS {
vertexCollections: [ "bolts", "screws" ],
edgeCollections: [ "productsToBolts", "productsToScrews" ]
}
RETURN v
If you specify collections that don’t exist, queries now fail with
a “collection or view not found” error (code 1203
and HTTP status
404 Not Found
). In previous versions, unknown vertex collections were ignored,
and the behavior for unknown edge collections was undefined.
Additionally, the collection types are now validated. If a document collection
or View is specified in edgeCollections
, an error is raised
(code 1218
and HTTP status 400 Bad Request
).
Furthermore, it is now an error if you specify a vertex collection that is not
part of the specified named graph (code 1926
and HTTP status 404 Not Found
).
It is also an error if you specify an edge collection that is not part of the
named graph’s definition or of the list of edge collections (code 1939
and
HTTP status 400 Bad Request
).
Exit code adjustments
Introduced in: v3.10.13
For some fatal errors like a required database upgrade or a failed version check,
arangod set the generic exit code of 1
. It now returns a different, more
specific exit code in these cases.
Startup Options
Handling of Invalid Startup Options
Starting with ArangoDB 3.10, the arangod executable and all other client tools use more specific process exit codes in the following situations:
- An unknown startup option name is used: Previously, the exit code was
1
. Now, the exit code when using an invalid option is3
(symbolic exit code nameEXIT_INVALID_OPTION_NAME
). - An invalid value is used for a startup option (e.g. a number that is
outside the allowed range for the option’s underlying value type, or a
string value is used for a numeric option): Previously, the exit code was
1
. Now, the exit code for these case is4
(symbolic exit code nameEXIT_INVALID_OPTION_VALUE
). - A config file is specified that does not exist: Previously, the exit code
was either
1
or6
(symbolic exit code nameEXIT_CONFIG_NOT_FOUND
). Now, the exit code in this case is always6
(EXIT_CONFIG_NOT_FOUND
). - A structurally invalid config file is used, e.g. the config file contains
a line that cannot be parsed: Previously, the exit code in this situation
was
1
. Now, it is always6
(symbolic exit code nameEXIT_CONFIG_NOT_FOUND
).
Note that this change can affect any custom scripts that check for startup
failures using the specific exit code 1
. These scripts should be adjusted so
that they check for a non-zero exit code. They can opt into more specific
error handling using the additional exit codes mentioned above, in order to
distinguish between different kinds of startup errors.
Web Interface Options
The --frontend.*
startup options were renamed to --web-interface.*
:
--frontend.proxy-request.check
is now--web-interface.proxy-request.check
--frontend.trusted-proxy
is now--web-interface.trusted-proxy
--frontend.version-check
is now--web-interface.version-check
The former startup options are still supported for backward compatibility.
RocksDB Options
The default value of the --rocksdb.cache-index-and-filter-blocks
startup option was changed
from false
to true
. This makes RocksDB track all loaded index and filter blocks in the
block cache, so they are accounted against the RocksDB’s block cache memory limit.
The default value for the --rocksdb.enforce-block-cache-size-limit
startup option was also
changed from false
to true
to make the RocksDB block cache not temporarily exceed the
configured memory limit.
These default value changes will make RocksDB adhere much better to the configured memory limit
(configurable via --rocksdb.block-cache-size
).
The changes may have a small negative impact on performance because, if the block cache is
not large enough to hold the data plus the index and filter blocks, additional disk I/O may
need to be performed compared to the previous versions.
This is a trade-off between memory usage predictability and performance, and ArangoDB 3.10
will default to more stable and predictable memory usage. If there is still unused RAM
capacity available, it may be sensible to increase the total size of the RocksDB block cache,
by increasing --rocksdb.block-cache-size
. Due to the changed configuration, the block
cache size limit will not be exceeded anymore.
It is possible to opt out of these changes and get back the memory and performance characteristics
of the previous versions by setting the --rocksdb.cache-index-and-filter-blocks
and --rocksdb.enforce-block-cache-size-limit
startup options to false
on startup.
RocksDB File Format
ArangoDB 3.10 internally switches to RocksDB’s format_version
5, which can still be
read by older versions of ArangoDB.
However, ArangoDB 3.10 uses the LZ4 compression scheme to reduce the size of RocksDB .sst
files from LSM tree level 2 onwards. This compression scheme is not supported in ArangoDB
versions before 3.10, so any database files created with ArangoDB 3.10 or higher cannot be
opened with versions before 3.10.
The internal checksum type of RocksDB .sst files has been changed to xxHash64 in ArangoDB
3.10 for a slight performance improvement.
Pregel Options
Pregel jobs now have configurable minimum, maximum and default parallelism values. You can set them by the following startup options:
--pregel.min-parallelism
: minimum parallelism usable in Pregel jobs. Defaults to1
.--pregel.max-parallelism
: maximum parallelism usable in Pregel jobs. Defaults to the number of available cores.--pregel.parallelism
: default parallelism to use in Pregel jobs. Defaults to the number of available cores divided by 4. The result will be clamped to a value between 1 and 16.
Pregel now also stores its temporary data in memory-mapped files on disk by default, whereas in previous versions the default behavior was to buffer it to RAM. Storing temporary data in memory-mapped files rather than in RAM has the advantage of lowering the RAM usage, which reduces the likelihood of out-of-memory situations. However, storing the files on disk requires disk capacity, so that instead of running out of RAM it is now possible to run out of disk space.
--pregel.memory-mapped-files
and --pregel.memory-mapped-files-location-type
.For more information on the new options, please refer to ArangoDB Server Pregel Options.
HTTP RESTful API
Validation of collections in named graphs
The /_api/gharial
endpoints for named graphs have changed:
If you reference a vertex collection in the
_from
or_to
attribute of an edge that doesn’t belong to the graph, an error with the number1947
is returned. The HTTP status code of such anERROR_GRAPH_REFERENCED_VERTEX_COLLECTION_NOT_USED
error has been changed from400
to404
. This change aligns the behavior to the similarERROR_GRAPH_EDGE_COLLECTION_NOT_USED
error (number1930
).Write operations now check if the specified vertex or edge collection is part of the graph definition. If you try to create a vertex via
POST /_api/gharial/{graph}/vertex/{collection}
but thecollection
doesn’t belong to thegraph
, then theERROR_GRAPH_REFERENCED_VERTEX_COLLECTION_NOT_USED
error is returned. If you try to create an edge viaPOST /_api/gharial/{graph}/edge/{collection}
but thecollection
doesn’t belong to thegraph
, then the error isERROR_GRAPH_EDGE_COLLECTION_NOT_USED
.
Encoding of revision IDs
Introduced in: v3.8.8, v3.9.4, v3.10.1
GET /_api/collection/<collection-name>/revision
: The revision ID was previously returned as numeric value, and now it is returned as a string value with either numeric encoding or HLC-encoding inside.GET /_api/collection/<collection-name>/checksum
: The revision ID in therevision
attribute was previously encoded as a numeric value in single server, and as a string in cluster. This is now unified so that therevision
attribute always contains a string value with either numeric encoding or HLC-encoding inside.
Client Tools
arangobench
Renamed the --concurrency
startup option to --threads
.
The following deprecated arangobench testcases have been removed from arangobench:
aqltrx
aqlv8
counttrx
deadlocktrx
multi-collection
multitrx
random-shapes
shapes
shapes-append
skiplist
stream-cursor
These test cases had been deprecated since ArangoDB 3.9.
The testcase hash
was renamed to persistent-index
to better reflect its
scope.
arangoexport
To improve naming consistency across different client tools, the existing
arangoexport startup options --query
and --query-max-runtime
were renamed
to --custom-query
and --custom-query-max-runtime
.
Using the old option names (--query
and --query-max-runtime
) is still
supported and will implicitly use the --custom-query
and
--custom-query-max-runtime
options under the hood. Client scripts should
eventually be updated to use the new option name, however.
ArangoDB Starter
The ArangoDB Starter comes with the following usability improvements:
- Headers are now added to generated command files, indicating the purpose of the file.
- The process output is now shown when errors occur during process startup.
- When passing through other database options, explicit hints are now displayed to indicate how to pass those options.
- The Starter now returns exit code
1
if it encounters any errors while starting. Previously, the exit code was0
. Note that this change can affect any custom scripts that check for startup errors or invalid command line options. These scripts can be adjusted so that they check for a non-zero exit code.