Blobs

Blobs are minimal immutable nodes, containing nothing more than some encrypted data and a set of references.

struct Blob {
  data: Ciphertext,
  references: ReferenceSet,
}

Serialization

Blobs are serialized using atlv as an array of those two fields, with the tag 0. The references must be listed in lexicographic order. As an example, here is a 128-byte ciphertext with a single reference to another blob:

offset   bytes   description
00       80      the tag (0) of the blob
01       42      the array header (2 elements)
02       c1 00   the binary header for the ciphertext (128 bytes)
03…82    ...     the ciphertext
83       41      the array header (1 element)
84       80      the tag (0) of a Blob Hash Reference
85       81      the tag (1) of a Blake3 hash
86       20      the binary header (32 bytes)
87…a6    ...     the content of the Blake3 hash
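
The ordering requirement can be illustrated with a minimal sketch. It assumes that "lexicographic order" means byte-wise ordering of the serialized references, and that duplicates are dropped because the references form a set. The names `SerializedReference`, `canonical_order`, and `is_canonical` are hypothetical and not part of the specification.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for one serialized reference.
type SerializedReference = Vec<u8>;

/// Sorts references byte-wise and removes duplicates.
/// `Vec<u8>` compares lexicographically, so a `BTreeSet` iterates in the
/// order assumed here.
fn canonical_order(refs: Vec<SerializedReference>) -> Vec<SerializedReference> {
    let set: BTreeSet<SerializedReference> = refs.into_iter().collect();
    set.into_iter().collect()
}

/// Returns true if the references are strictly increasing, that is,
/// sorted lexicographically with no duplicates.
fn is_canonical(refs: &[SerializedReference]) -> bool {
    refs.windows(2).all(|pair| pair[0] < pair[1])
}

fn main() {
    // Shortened placeholder references; real ones carry a full hash.
    let refs = vec![
        vec![0x80, 0x81, 0x20, 0xff],
        vec![0x80, 0x81, 0x20, 0x01],
        vec![0x80, 0x81, 0x20, 0x01],
    ];
    let ordered = canonical_order(refs);
    assert!(is_canonical(&ordered));
    println!("{:02x?}", ordered);
}
```

A decoder could apply a check like `is_canonical` to reject blobs whose references are out of order, although whether rejection is the intended behavior is not stated here.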

References

There is only one type of reference to blobs, "Blob Hash References". They are produced by initializing a stateful hash object with the domain "Versioned Node Subscriptions: Reference: Blob: Hash", feeding in the ciphertext, demarcating, feeding the serialized list of references, and finalizing.

Blob Hash References are serialized with the tag 0 followed by the serialization of the contained hash.
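
As a rough illustration, the following sketch assembles a serialized Blob Hash Reference using the byte values that appear at offsets 84 to a6 of the serialization example. It hard-codes those three header bytes instead of implementing atlv in general, and the function and constant names are hypothetical.

```rust
/// Tag byte for a Blob Hash Reference (tag 0), copied from the example.
const TAG_BLOB_HASH_REFERENCE: u8 = 0x80;
/// Tag byte for a Blake3 hash (tag 1), copied from the example.
const TAG_BLAKE3_HASH: u8 = 0x81;
/// Binary header for a 32-byte value, copied from the example.
const BINARY_HEADER_32_BYTES: u8 = 0x20;

/// Serializes a reference as: reference tag, hash tag, binary header, hash.
fn serialize_blob_hash_reference(hash: &[u8; 32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(3 + hash.len());
    out.push(TAG_BLOB_HASH_REFERENCE);
    out.push(TAG_BLAKE3_HASH);
    out.push(BINARY_HEADER_32_BYTES);
    out.extend_from_slice(hash);
    out
}

fn main() {
    // Placeholder hash bytes; a real reference holds the Blake3 output.
    let hash = [0xab_u8; 32];
    let bytes = serialize_blob_hash_reference(&hash);
    // Three header bytes plus the 32-byte hash.
    assert_eq!(bytes.len(), 35);
    println!("{:02x?}", bytes);
}
```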

Summaries

Blob Summaries are records of blobs that elide the contained data.

struct BlobSummary {
  state: HashState,
  references: ReferenceSet,
}

The hash state for a Blob Summary is produced by initializing a stateful hash object (SHO) with the domain "Versioned Node Subscriptions: Reference: Blob: Hash", feeding in the ciphertext, and then extracting the state.

The reference for a Blob may be generated from its Summary, by injecting the preserved hash state, feeding the serialized list of references, and finalizing.
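
The relationship between the two derivations can be sketched as follows. `ToySho` is a hypothetical stand-in built on 64-bit FNV-1a, which is not cryptographically secure and is not the hash the specification uses; it only mirrors the sequence of operations. The sketch also assumes that the demarcation step is applied after the preserved state is injected, since the text does not say where it falls on the summary path.

```rust
/// Toy stateful hash object. NOT secure; for illustrating data flow only.
struct ToySho {
    state: u64,
}

impl ToySho {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    /// Initializes the object with a domain string.
    fn new(domain: &str) -> Self {
        let mut sho = ToySho { state: Self::OFFSET_BASIS };
        sho.feed(domain.as_bytes());
        sho.demarcate();
        sho
    }

    /// Absorbs bytes using the FNV-1a update step.
    fn feed(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state ^= u64::from(b);
            self.state = self.state.wrapping_mul(Self::PRIME);
        }
    }

    /// Marks a boundary between inputs (toy version: mixes in a marker
    /// value that no single byte can produce).
    fn demarcate(&mut self) {
        self.state ^= 0x100;
        self.state = self.state.wrapping_mul(Self::PRIME);
    }

    fn extract_state(&self) -> u64 {
        self.state
    }

    fn from_state(state: u64) -> Self {
        ToySho { state }
    }

    fn finalize(self) -> u64 {
        self.state
    }
}

const DOMAIN: &str = "Versioned Node Subscriptions: Reference: Blob: Hash";

/// Full derivation: ciphertext, demarcation, serialized references.
fn reference_from_blob(ciphertext: &[u8], serialized_refs: &[u8]) -> u64 {
    let mut sho = ToySho::new(DOMAIN);
    sho.feed(ciphertext);
    sho.demarcate();
    sho.feed(serialized_refs);
    sho.finalize()
}

/// Summary state: stop after the ciphertext and keep the state.
fn summary_state(ciphertext: &[u8]) -> u64 {
    let mut sho = ToySho::new(DOMAIN);
    sho.feed(ciphertext);
    sho.extract_state()
}

/// Resume from a preserved state without access to the ciphertext.
fn reference_from_summary(state: u64, serialized_refs: &[u8]) -> u64 {
    let mut sho = ToySho::from_state(state);
    sho.demarcate(); // assumed placement, see the note above
    sho.feed(serialized_refs);
    sho.finalize()
}

fn main() {
    let ciphertext = [0x42_u8; 128];
    // Shortened placeholder for the serialized reference list.
    let serialized_refs = [0x41_u8, 0x80, 0x81, 0x20];
    let direct = reference_from_blob(&ciphertext, &serialized_refs);
    let resumed = reference_from_summary(summary_state(&ciphertext), &serialized_refs);
    assert_eq!(direct, resumed);
}
```

The point of the design is that a holder of only the summary can still compute the blob's reference, because the preserved state already accounts for the ciphertext.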

Cryptography

Encryption

The ciphertext held within a Blob is produced using deterministic authenticated encryption with associated data (DAEAD), with a domain of "Versioned Node Subscriptions: Blob Encryption". The associated data is the serialized set of references, including the length. (In the above example, this is bytes 0x83 to the end.) The encryption key is derived using the DAEAD's FromPlaintext method, with an application-provided convergence domain, which may be empty.

Analysis

Blobs provide read-only, content-addressed, encrypted storage. As such, they are vulnerable to chosen-plaintext attacks, including the "confirmation of a file" and "learn the remaining information" attacks described by Tahoe-LAFS.

There are two mechanisms to mitigate this. One is that the key and nonce derivation both take the transitive content of the Blob into account, so two blobs with the same data but different references will have different ciphertexts and keys. The other is the one Tahoe-LAFS uses: the application can provide a convergence domain, and the attacks only apply within that convergence domain.