RecallGraph - A versioning data store for time-variant graph data.
Project Source @ GitHub
RecallGraph is a versioned-graph data store - it retains all changes that its data (vertices and edges) have gone through to reach their current state. It supports point-in-time graph traversals, letting the user query any past state of the graph just as easily as the present.
It is a Foxx Microservice for ArangoDB that features VCS-like semantics in many parts of its interface, and is backed by a transactional event tracker. It is currently being developed and tested on ArangoDB v3.5 and v3.6, with support for v3.7 in the pipeline.
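To make the idea of a point-in-time traversal concrete, here is a minimal, self-contained Python sketch. It is not RecallGraph's implementation or API: the `EdgeVersion` class, the `traverse_at` function, and the numeric timestamps are all hypothetical names chosen for illustration. It only assumes that each edge records when it was created and (optionally) deleted, so that a walk can be restricted to edges that existed at a chosen instant.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class EdgeVersion:
    """Hypothetical record: one edge plus the interval during which it existed."""
    source: str
    target: str
    created_at: float
    deleted_at: Optional[float] = None  # None means the edge still exists

    def alive_at(self, t: float) -> bool:
        # The edge is visible from its creation up to (not including) its deletion.
        return self.created_at <= t and (self.deleted_at is None or t < self.deleted_at)


def traverse_at(edges, start, t):
    """Breadth-first walk that follows only the edges that existed at time t."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for e in edges:
            if e.source == node and e.alive_at(t) and e.target not in seen:
                seen.append(e.target)
                queue.append(e.target)
    return seen


# Illustrative history: b->c is deleted at t=5, a->d is created at t=6.
history = [
    EdgeVersion("a", "b", created_at=1.0),
    EdgeVersion("b", "c", created_at=2.0, deleted_at=5.0),
    EdgeVersion("a", "d", created_at=6.0),
]

past = traverse_at(history, "a", 3.0)     # the graph as it looked at t=3
present = traverse_at(history, "a", 7.0)  # the graph as it looks at t=7
```

The same start vertex yields different reachable sets depending on the timestamp, which is the behaviour the description above refers to as querying a past state of the graph.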
To get an idea of where such a data store might be used, see:
Also check out the recordings/slides below:
RecallGraph presented @ ArangoDB Online Meetup
The associated slide deck
A discussion on RecallGraph's development roadmap
The associated slide deck
TL;DR: RecallGraph is a potential fit for scenarios where data is best represented as a network of vertices and edges (i.e., a graph) having the following characteristics:
1. Both vertices and edges can hold properties in the form of attribute/value pairs (equivalent to JSON objects).
2. Documents (vertices/edges) mutate within their lifespan (both in their individual attributes/values and in their relations with each other).
3. Past states of documents are as important as their present, necessitating retention and queryability of their change history.
RecallGraph's API is split into 3 top-level categories:
- Create - Create single/multiple documents (vertices/edges).
- Replace - Replace entire single/multiple documents with new content.
- Delete - Delete single/multiple documents.
- Update - Add/Update specific fields in single/multiple documents.
- Restore - Restore deleted nodes back to their last known undeleted state.
- (Planned) Materialization - Point-in-time checkouts.
- (Planned) CQRS/ES Operation Mode - Async implicit commits.
- Log - Fetch a log of events (commits) for a given path pattern (path determines scope of documents to pick). The log can be optionally grouped/sorted/sliced within a specified time interval.
- Diff - Fetch a list of forward or reverse commands (diffs) between commits for specified documents.
- Explicit Commits - Commit a document's changes separately, after it has been written to DB via other means (AQL / Core REST API / Client).
- (Planned) Branch/Tag - Create parallel versions of history, branching off from a specific event point of the main timeline. Also, tag specific points in branch+time for convenient future reference.
- Show - Fetch a set of documents, optionally grouped/sorted/sliced, that match a given path pattern, at a given point in time.
- Filter - In addition to a path pattern like in 'Show', apply an expression-based, simple/compound post-filter on the retrieved documents.
- Traverse - A point-in-time traversal (walk) of a past version of the graph, with the option to apply additional post-filters to the result.
- k Shortest Paths - Point-in-time, weighted, shortest paths between two endpoints.
- Purge - Delete all history for specified nodes.
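The notion of forward and reverse commands mentioned under Diff can be sketched as follows. This is an illustrative Python example only: the `diff` and `apply` helpers are hypothetical, handle flat documents only, and use a JSON-Patch-like command shape as an assumption. The wire format that RecallGraph actually returns is not described here.

```python
def diff(old, new):
    """Return JSON-Patch-like commands that turn the flat dict `old` into `new`."""
    ops = []
    for key in sorted(old.keys() - new.keys()):
        ops.append({"op": "remove", "path": "/" + key})
    for key in sorted(new.keys() - old.keys()):
        ops.append({"op": "add", "path": "/" + key, "value": new[key]})
    for key in sorted(old.keys() & new.keys()):
        if old[key] != new[key]:
            ops.append({"op": "replace", "path": "/" + key, "value": new[key]})
    return ops


def apply(doc, ops):
    """Apply a list of commands to a copy of `doc` and return the result."""
    result = dict(doc)
    for op in ops:
        key = op["path"][1:]  # strip the leading "/"
        if op["op"] == "remove":
            del result[key]
        else:  # "add" and "replace" both set the value
            result[key] = op["value"]
    return result


# Two hypothetical versions of the same document.
v1 = {"name": "alice", "role": "dev"}
v2 = {"name": "alice", "role": "lead", "team": "core"}

forward = diff(v1, v2)  # commands that move v1 forward to v2
reverse = diff(v2, v1)  # commands that roll v2 back to v1
```

Applying `forward` to the older version reproduces the newer one, and applying `reverse` to the newer version restores the older one. Storing such commands per commit is one way a past state can be rebuilt without keeping a full copy of every version.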
1. Although the test cases are quite extensive and have good coverage, this service has only been tested on single-instance DB deployments, and not on clusters.
2. As of version 3.6, ArangoDB does not support ACID transactions for multi-document/collection writes in cluster mode. Transactional ACIDity is not guaranteed for such deployments.
1. Support for absolute/relative revision-based queries on individual documents (in addition to the timestamp-based queries supported currently).
2. Branching/tag support.
3. Support for the valid time dimension in addition to the currently implemented transaction time dimension (https://www.researchgate.net/publication/221212735_A_Taxonomy_of_Time_in_Databases).
4. Support for ArangoDB v3.7.
5. Multiple, simultaneous materialized checkouts (a la git) of selectable sections of the database (entire DB, named graph, named collection, document list, document pattern), with eventual branch-level specificity.
6. CQRS/ES operation mode (async implicit commits).
7. Support for ArangoDB clusters (limited at present by lack of support for multi-document ACID transactions in clusters).
8. Multiple authentication and authorization mechanisms.