Schema Validation
How to enforce attributes and their data types for documents of individual collections using JSON Schema
While ArangoDB is schema-less, it lets you enforce certain document structures
on the collection level. The desired structure can be described in the popular
JSON Schema format (draft-4, without support for
remote schemas for security reasons). The level of validation and a custom error
message can be configured. The system attributes _key, _id, _rev, _from and _to
are ignored by the schema validation.
Enable schema validation for a collection
Schema validation can be managed via the JavaScript API, typically using arangosh, as well as via the HTTP interface.
To enable schema validation, pass the schema property either when creating a
collection or when updating the properties of an existing collection. It expects an
object with the following attributes: rule, level and message.
- The rule attribute must contain the JSON Schema description.
- level controls when the validation is applied.
- message sets the message that is used when validation fails.
var schema = {
  rule: {
    properties: { nums: { type: "array", items: { type: "number", maximum: 6 } } },
    additionalProperties: { type: "string" },
    required: ["nums"]
  },
  level: "moderate",
  message: "The document does not contain an array of numbers in attribute 'nums', or one of the numbers is greater than 6."
};
/* Create a new collection with schema */
db._create("schemaCollection", { "schema": schema });
/* Update the schema of an existing collection */
db.schemaCollection.properties({ "schema": schema });
To remove an existing schema from a collection, a schema value of either null
or {} (empty object) can be stored:
/* Remove the schema of an existing collection */
db.schemaCollection.properties({ "schema": null });
JSON Schema Rule
The rule must be a valid JSON Schema object as outlined in the specification.
See Understanding JSON Schema for a user guide on how to write JSON Schema descriptions.
System attributes are invisible to the schema validation, i.e. _key, _rev
and _id (in edge collections additionally _from and _to) do not need to be
specified in the schema. You may set additionalProperties: false to only
allow attributes described by the schema. System attributes do not fall under
this restriction.
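As a minimal sketch of this restriction, the following schema only permits the attributes it describes. The collection name strictCollection and the attribute names name and age are made up for illustration. The db._create() call is guarded so that it only runs inside arangosh, where db is defined.

```javascript
// Illustrative schema: only "name" and "age" are allowed as user attributes.
var strictSchema = {
  rule: {
    type: "object",
    properties: {
      name: { type: "string" },
      age: { type: "number" }
    },
    required: ["name"],
    // Reject any attribute that is not listed under "properties".
    // System attributes such as _key are not affected by this setting.
    additionalProperties: false
  },
  level: "strict",
  message: "Only 'name' (string, required) and 'age' (number) are allowed."
};

// Only attempt to create the (hypothetical) collection inside arangosh.
if (typeof db !== "undefined") {
  db._create("strictCollection", { schema: strictSchema });
}
```

With such a schema in place, a document like { name: "Ada", city: "London" } would be expected to fail validation because city is not described by the rule.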
Attributes with numeric values always have the type "number", even if they are
whole numbers (and internally use an integer type). If you want to restrict an
attribute to integer values, use "type": "number" together with "multipleOf": 1.
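The integer restriction described above could look roughly like this. The attribute name count is hypothetical, and the schemaCollection collection is assumed to exist already. The update only runs inside arangosh.

```javascript
// Sketch: restrict the hypothetical attribute "count" to whole numbers.
var integerSchema = {
  rule: {
    properties: {
      // "number" plus "multipleOf: 1" accepts 3 but not 3.5.
      count: { type: "number", multipleOf: 1 }
    }
  },
  level: "moderate",
  message: "The attribute 'count' must be a whole number."
};

// Only touch the collection when running inside arangosh and it exists.
if (typeof db !== "undefined" && db.schemaCollection) {
  db.schemaCollection.properties({ schema: integerSchema });
}
```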
Levels
The level controls when the validation is triggered:
- none: The rule is inactive and validation thus turned off.
- new: Only newly inserted documents are validated.
- moderate: New and modified documents must pass validation, except for modified documents where the OLD value did not pass validation already. This level is useful if you have documents which do not match your target structure, but you want to stop the insertion of more invalid documents and prohibit that valid documents are changed to invalid documents.
- strict: All new and modified documents must strictly pass validation. No exceptions are made (default).
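The four levels can be summarized as a small decision function. This is only a sketch that mirrors the descriptions above, not ArangoDB's actual implementation. The function name mustPassValidation and its parameters are made up for illustration.

```javascript
// Sketch only: returns true if a write must pass the schema rule to be accepted.
//   level               - "none", "new", "moderate" or "strict"
//   isInsert            - true for a newly inserted document, false for a modification
//   oldDocumentWasValid - whether the stored (OLD) version passed validation;
//                         only relevant for modifications
function mustPassValidation(level, isInsert, oldDocumentWasValid) {
  switch (level) {
    case "none":
      return false; // validation is turned off
    case "new":
      return isInsert; // only inserts are checked
    case "moderate":
      // inserts are always checked; modifications only if OLD was valid
      return isInsert || oldDocumentWasValid;
    case "strict":
      return true; // no exceptions
    default:
      throw new Error("unknown level: " + level);
  }
}
```

For example, under moderate a modification of a document that was already invalid is let through, whereas under strict the same write has to pass validation.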
Error message
If the schema validation for a document fails, then a generic error is raised.
You may customize the error message via the message attribute to provide a
summary of what the expected document structure is or point out common mistakes.
The schema validation cannot pinpoint which part of a rule made it fail because
this is difficult to determine and report for complex schemas. For example, when
using not and anyOf, this would result in trees of possible errors. You can
use tools like jsonschemavalidator.net to examine schema validation issues.
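To see the error that a rejected write raises, you can catch it in arangosh. The helper tryInsert below is hypothetical and not part of ArangoDB. It assumes that errors thrown in arangosh expose an errorMessage attribute and falls back to the generic message otherwise. The collection schemaCollection is assumed to carry the example schema from above.

```javascript
// Hypothetical helper: attempts an insert and reports the outcome
// instead of letting the error propagate.
function tryInsert(collection, doc) {
  try {
    collection.insert(doc);
    return { ok: true, message: null };
  } catch (err) {
    // Assumption: ArangoDB errors carry errorMessage; plain errors carry message.
    return { ok: false, message: err.errorMessage || err.message };
  }
}

// Only runs inside arangosh, where db and print are defined.
if (typeof db !== "undefined" && db.schemaCollection) {
  // 7 exceeds the maximum of 6 in the example schema, so this should be rejected.
  print(JSON.stringify(tryInsert(db.schemaCollection, { nums: [1, 7] })));
}
```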
Performance
The schema validation is executed for data-modification operations according to the levels described above. That means that it can slow down document write operations, with more complex schemas typically taking more time for the validation than very simple schemas.
Related AQL functions
The following AQL functions are available to work with schemas:
- SCHEMA_GET()
- SCHEMA_VALIDATE()
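As a sketch, assuming the functions in question are SCHEMA_GET() and SCHEMA_VALIDATE(), a document can be checked against a collection's stored schema without writing it. The collection name schemaCollection and the test document are illustrative. The query is only executed inside arangosh.

```javascript
// Validate an ad-hoc document against the schema stored for "schemaCollection".
var query = 'RETURN SCHEMA_VALIDATE(@doc, SCHEMA_GET("schemaCollection"))';
var bindVars = { doc: { nums: [1, 7] } };

// Only runs inside arangosh, where db and print are defined.
if (typeof db !== "undefined" && db.schemaCollection) {
  // The result is expected to report whether the document is valid.
  print(db._query(query, bindVars).toArray());
}
```

This can be handy for testing a rule against sample documents before switching the level from none to something stricter.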
Backup and restore
Logical backups created with arangodump include the schema configuration, which is a collection property.
When using arangorestore to restore to a collection with a defined schema, no schema validation is executed.