Motivations and Goals

Numerous motivations shaped the design of versioned nodes, the subscription-based data network, and especially the core data model. I wanted a data synchronization tool, and there were a number of different things I wanted to be able to do with it.

Small Core

The core requirements to run a node should be small enough to fit in a moderately capable embedded device. Such a node could be used to log sensor data or distribute configuration (or anything short of real-time control).

General Purpose

A lot of existing tools are dedicated to being a particular application - the most general being a file system or database. This should be very general purpose, such that distributed file systems and databases can be created with it, as well as other more specialized applications.

Capability Access Control with Delegation

If I can read an object, I should be able to give you access to that object without creating a duplicate object or sharing my credentials.

End-to-End Encryption

Data stored in or put into a data network should be encrypted. Neither intermediate servers nor storage servers should be able to read it. It should only be readable by the endpoints that have the appropriate capabilities.

Concurrent Modification and Disconnected Operation

Two disconnected applications should be able to modify the same object at the same time, and be able to reconcile any differences in their changes later.

Public Versioning

The encrypted form of mutable objects should have sufficient information to determine the latest versions of that object.
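One way to picture this goal is a small cleartext header that travels with each encrypted version and names the versions it supersedes. The sketch below is only an illustration under that assumption: `PublicHeader`, `version_id`, `parents`, and `latest_versions` are hypothetical names, not part of the actual design. It shows that a node holding only headers, and no keys, could still work out which versions are current.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class PublicHeader:
    """Hypothetical cleartext header stored next to an encrypted version."""
    version_id: str
    parents: Tuple[str, ...] = ()  # ids of the versions this one supersedes


def latest_versions(headers: Iterable[PublicHeader]) -> List[str]:
    """Return ids of versions that no other known version lists as a parent."""
    headers = list(headers)
    superseded = {p for h in headers for p in h.parents}
    return sorted(h.version_id for h in headers if h.version_id not in superseded)


# Two disconnected writers both updated v1, so there are two latest versions.
headers = [
    PublicHeader("v1"),
    PublicHeader("v2a", ("v1",)),
    PublicHeader("v2b", ("v1",)),
]
concurrent_heads = latest_versions(headers)

# A later update that names both as parents reconciles them into one head.
merged_heads = latest_versions(headers + [PublicHeader("v3", ("v2a", "v2b"))])
```

More than one latest version is expected whenever concurrent modification has happened and has not yet been reconciled.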

Public Dependencies

The encrypted form of the objects should have sufficient information to optimistically deliver the objects that an object depends on, and to remove unused objects from storage.
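Removing unused objects amounts to a reachability walk over the publicly visible dependency lists. The following is a minimal sketch, assuming a storage node has a mapping from object id to the ids it depends on and some set of root objects it wants to keep. The names `deps`, `roots`, `reachable`, and `unused` are made up for this example.

```python
from typing import Dict, Iterable, List, Set


def reachable(roots: Iterable[str], deps: Dict[str, List[str]]) -> Set[str]:
    """Return every object id reachable from roots via dependency lists."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        # Dependencies are read from public metadata; no decryption is needed.
        stack.extend(deps.get(oid, ()))
    return seen


def unused(roots: Iterable[str], deps: Dict[str, List[str]]) -> Set[str]:
    """Return stored object ids that nothing reachable from roots depends on."""
    return set(deps) - reachable(roots, deps)


# Hypothetical store: a directory object, two files, and one orphaned chunk.
deps = {
    "dir": ["file-a", "file-b"],
    "file-a": ["chunk-1"],
    "file-b": [],
    "chunk-1": [],
    "old-chunk": [],
}
to_deliver = reachable({"dir"}, deps)
to_remove = unused({"dir"}, deps)
```

The same walk serves both purposes in the goal: the reachable set is what could be delivered optimistically alongside `dir`, and everything outside it is a candidate for removal.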

Eventual Consistency and Determinism

Two endpoints that have the same information and make the same update should, by default, create identical updates with identical ciphertext. This is important for avoiding synchronization loops.

If the two identical updates result in different ciphertext, then both endpoints may create a new pair of updates when those updated versions propagate, which again results in a new pair of different ciphertexts…
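The loop goes away if encryption is a pure function of the update's content, with no random nonce. Below is a toy sketch of that idea, assuming both endpoints share a key and agree on the parent versions. It derives a synthetic nonce from the key, parents, and plaintext, then uses it to seed a keystream. The function names and the construction itself are illustrative assumptions, not the scheme this design uses, and the cipher is not vetted. A real implementation would use an established deterministic AEAD such as AES-SIV (RFC 5297).

```python
import hashlib
import hmac
from typing import List


def encrypt_deterministic(key: bytes, parents: List[bytes], plaintext: bytes) -> bytes:
    """Toy deterministic encryption: same inputs always give the same bytes."""
    # Synthetic nonce: a MAC over everything that defines the update.
    mac = hmac.new(key, digestmod=hashlib.sha256)
    mac.update(len(parents).to_bytes(4, "big"))
    for parent in sorted(parents):  # sorted so parent order cannot matter
        mac.update(len(parent).to_bytes(4, "big") + parent)
    mac.update(plaintext)
    nonce = mac.digest()  # 32 bytes
    # Illustrative keystream only; not a reviewed cipher.
    keystream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, keystream))
    return nonce + body


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Invert encrypt_deterministic (without verifying the nonce as a tag)."""
    nonce, body = ciphertext[:32], ciphertext[32:]
    keystream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, keystream))


key = b"shared-object-key"
parents = [b"version-a", b"version-b"]

# Two endpoints making the same merge independently, listing parents in
# different orders, still produce byte-identical ciphertext.
from_alice = encrypt_deterministic(key, parents, b"merged contents")
from_bob = encrypt_deterministic(key, list(reversed(parents)), b"merged contents")
```

Because `from_alice` and `from_bob` are the same bytes, a node that receives both sees a single update and has nothing further to reconcile.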

Transport Agnostic

There shouldn't be any special transport requirements for synchronization. Links may be formed over any sort of serial port, packet switched network, or storage media.

Indistinguishable from Random Transport-Adapter

The bits sent across a link should be indistinguishable from random to anyone who does not know a shared key.

Forward and Post-Compromise Secure Transport-Adapter

If someone records the bits sent across a link and later compromises the state of one or both parties, they should not be able to recover past messages, and should shortly become unable to recover future messages.

Topology Agnostic

No particular topology should be required. The network might be a highly connected DHT/BitTorrent-style swarm, a set of prearranged trusted connections only, a hierarchical content distribution network, or an extremely sparsely connected opportunistic network.

Access Authorized

It should be possible to restrict fetching objects from a node to authorized users, even though the objects are encrypted. In particular, it should be possible to limit which objects can be fetched in swarm mode.

Liveness

Updates to objects should be propagated quickly through the network to interested parties.

Retention

Operators should be able to mark objects (and trees of objects) to be kept longer term. That is to say, pinning, backups, et cetera.