Vertex-Centric Indexes

You can create indexes over the _from or _to attribute and one or more additional edge attributes to improve certain graph traversals.

All edge collections in ArangoDB have a special edge index that enables fast graph operations. If you have graphs that contain supernodes (vertices with an exceptionally high number of connected edges) and you apply filters in graph traversal queries, you can create so-called vertex-centric indexes that can perform better than the default edge indexes. You can use the persistent and mdi-prefixed index types for this purpose.

Motivation

The idea of a vertex-centric index is to index a combination of a vertex, the direction, and an arbitrary set of attributes on the edges. This can be achieved by indexing the _from or _to attribute of an edge as the first field, which contains the document identifier of a vertex and implicitly captures the direction, followed by any number of other attributes of an edge.

To support traversals in OUTBOUND direction, you need to index the _from attribute as the first attribute. For the INBOUND direction, you need to use the _to attribute. To support both (ANY or mixed INBOUND and OUTBOUND directions), you need to create two indexes, one with _from and the other with _to as the first indexed attribute.
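The mapping from traversal direction to the leading index field can be sketched as a small helper. This is only an illustration under assumptions: the function name vertexCentricIndexDefinitions is hypothetical and not part of ArangoDB, and it only builds persistent index definitions in the shape that ensureIndex() accepts.

```javascript
// Hypothetical helper (not an ArangoDB API): builds persistent index
// definitions whose first field matches the traversal direction.
function vertexCentricIndexDefinitions(direction, edgeAttributes) {
  const leadingFields = {
    OUTBOUND: ["_from"],
    INBOUND: ["_to"],
    ANY: ["_from", "_to"], // both directions need one index each
  }[direction];
  if (!leadingFields) {
    throw new Error("Unknown direction: " + direction);
  }
  return leadingFields.map((first) => ({
    type: "persistent",
    fields: [first, ...edgeAttributes],
  }));
}

const definitions = vertexCentricIndexDefinitions("ANY", ["type"]);
console.log(JSON.stringify(definitions));
```

In arangosh, each returned definition could then be passed to ensureIndex() on the edge collection.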

For example, if you have an attribute called type on the edges and traverse in OUTBOUND direction, you can create a vertex-centric persistent index over ["_from", "type"] to find all edges attached to a vertex with a given type. The following query can benefit from such an index:

FOR v, e, p IN 3..5 OUTBOUND @start GRAPH @graphName
  FILTER p.edges[*].type ALL == "friend"
  RETURN v

Using the built-in edge index, ArangoDB can quickly find the list of all edges attached to the vertex, but it still has to walk through this list and check whether each edge has the attribute type == "friend". A vertex-centric index allows ArangoDB to find all edges with the attribute type == "friend" for the vertex in one go, saving the iteration to verify the condition.
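The difference between the two lookups can be illustrated with a toy model in plain JavaScript. This is a conceptual sketch only: it uses in-memory Maps and made-up sample edges, and it does not reflect how ArangoDB stores or searches its indexes internally.

```javascript
// Conceptual model only: real ArangoDB indexes are not in-memory Maps.
// All names and sample data here are made up for illustration.
const edges = [
  { _from: "vertices/1", _to: "vertices/2", type: "friend" },
  { _from: "vertices/1", _to: "vertices/3", type: "colleague" },
  { _from: "vertices/1", _to: "vertices/4", type: "friend" },
  { _from: "vertices/2", _to: "vertices/1", type: "friend" },
];

// Groups edges by a key derived from each edge.
function buildIndex(allEdges, keyOf) {
  const index = new Map();
  for (const edge of allEdges) {
    const key = keyOf(edge);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(edge);
  }
  return index;
}

// Edge-index-like lookup: keyed by _from only.
const byFrom = buildIndex(edges, (e) => e._from);
// Vertex-centric-like lookup: keyed by _from and type together.
const byFromAndType = buildIndex(edges, (e) => JSON.stringify([e._from, e.type]));

// With the edge index, every edge of the vertex is fetched and then filtered.
const candidates = byFrom.get("vertices/1") || [];
const viaEdgeIndex = candidates.filter((e) => e.type === "friend");

// With the vertex-centric index, only the matching edges are fetched.
const viaVertexCentric =
  byFromAndType.get(JSON.stringify(["vertices/1", "friend"])) || [];

// Prints "3 2 2": three edges inspected versus two fetched directly.
console.log(candidates.length, viaEdgeIndex.length, viaVertexCentric.length);
```

Both approaches return the same two edges, but the first one has to inspect every edge of the vertex. For a supernode, that candidate list can be very long.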

If you have numeric attributes on edges and want to filter by them using value ranges, perhaps in addition to filtering by a type using an equality check, you can create a vertex-centric mdi-prefixed index. Assuming the numeric attributes are called x and y, a possible query could look like this:

FOR v, e, p IN 0..3 INBOUND @start GRAPH @graphName
  OPTIONS { order: "bfs", uniqueVertices: "path" }
  FILTER p.edges[*].type ALL == "friend"
     AND p.edges[*].x ALL >= 5
     AND p.edges[*].y ALL <= 7
  RETURN p

Index creation

A vertex-centric index has to be of the type Persistent Index or prefixed Multi-dimensional index and is created like any other index of the respective type. However, in the list of fields to index, you need to use either _from or _to as the first field.

For example, if you want to create a vertex-centric index on the type attribute that supports traversing in the OUTBOUND direction, you would create the index in the following way:

db.edgeCollection.ensureIndex({ type: "persistent", fields: [ "_from", "type" ] });

If you want to create a vertex-centric index on multi-dimensional data in the x and y attributes with a type attribute as prefix and support traversing in the INBOUND direction, you would create an index as follows:

db.edgeCollection.ensureIndex({
  type: "mdi-prefixed",
  prefixFields: ["_to", "type"],
  fields: [ "x", "y" ],
  fieldValueTypes: "double"
});

All options that are supported by persistent or multi-dimensional indexes are supported by the vertex-centric index as well.
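As a rough sketch of the rule above, the following helper checks whether an index description has the shape of a vertex-centric index. The function name isVertexCentric is hypothetical and not an ArangoDB API. It assumes that index descriptions expose type, fields, and prefixFields as in the definitions shown above, and it only inspects the first leading field.

```javascript
// Hypothetical helper (not an ArangoDB API): decides whether an index
// description has the shape of a vertex-centric index as described above.
function isVertexCentric(index) {
  // mdi-prefixed indexes list the leading attributes in prefixFields,
  // persistent indexes list them in fields. Other types do not qualify.
  const leading =
    index.type === "mdi-prefixed" ? index.prefixFields :
    index.type === "persistent" ? index.fields :
    undefined;
  if (!Array.isArray(leading) || leading.length === 0) {
    return false;
  }
  // Only the first field is checked here.
  return leading[0] === "_from" || leading[0] === "_to";
}

const examples = [
  { type: "persistent", fields: ["_from", "type"] },
  { type: "mdi-prefixed", prefixFields: ["_to", "type"], fields: ["x", "y"] },
  { type: "persistent", fields: ["type", "_from"] }, // _from is not first
  { type: "edge", fields: ["_from", "_to"] },        // built-in edge index
];

// Prints [ true, true, false, false ]
console.log(examples.map(isVertexCentric));
```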

Index usage

The AQL optimizer can decide to use a vertex-centric index whenever suitable. However, it is not guaranteed that this index is used. The optimizer may estimate that another index, in particular the built-in edge index, is a better fit.

The optimizer considers vertex-centric indexes in pattern matching queries:

FOR v, e, p IN 3..5 OUTBOUND @start GRAPH @graphName
  FILTER p.edges[*].type ALL == "friend"
  RETURN v

It also considers them when you iterate over an edge collection directly and explicitly filter on _from or _to, respectively, and the other indexed attributes:

FOR edge IN edgeCollection
  FILTER edge._from == "vertices/123456" AND edge.type == "friend"
  RETURN edge