Blobs
Blobs are minimal immutable nodes, containing nothing more than some encrypted data and a set of references.
struct Blob {
    data: Ciphertext,
    references: ReferenceSet,
}
Serialization
Blobs are serialized using atlv, as an array of those two fields, using the tag 0. The references must be listed in lexicographic order. As an example, a 128 byte ciphertext with a single reference to another blob:
offset | bytes | description |
---|---|---|
00 | 80 | the tag (0) of the blob |
01 | 42 | the array header (2 elements) |
02…03 | c1 00 | the binary header for the ciphertext (128 bytes) |
04…83 | ... | the ciphertext |
84 | 41 | the array header (1 element) |
85 | 80 | the tag (0) of a Blob Hash Reference |
86 | 81 | the tag (1) of a Blake3 hash |
87 | 20 | the binary header (32 bytes) |
88…a7 | ... | the content of the Blake3 hash |
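The layout above can be sketched in code. This is a minimal illustration only: the header bytes are copied literally from the table rather than derived from atlv's encoding rules, and the names `serialize_example_blob` and `sort_references` are hypothetical. It also assumes that "lexicographic order" means byte-wise order of the serialized references, which the text does not spell out.

```rust
/// Assemble the bytes of the worked example: one 128-byte ciphertext and a
/// single Blob Hash Reference holding a 32-byte Blake3 hash.
/// Header bytes are taken verbatim from the table, not computed.
fn serialize_example_blob(ciphertext: &[u8; 128], hash: &[u8; 32]) -> Vec<u8> {
    let mut out = Vec::new();
    out.push(0x80); // tag 0: blob
    out.push(0x42); // array header, 2 elements
    out.extend_from_slice(&[0xc1, 0x00]); // binary header, 128 bytes
    out.extend_from_slice(ciphertext);
    out.push(0x41); // array header, 1 element (the reference list)
    out.push(0x80); // tag 0: Blob Hash Reference
    out.push(0x81); // tag 1: Blake3 hash
    out.push(0x20); // binary header, 32 bytes
    out.extend_from_slice(hash);
    out
}

/// Order already-serialized references.
/// Assumption: lexicographic order is byte-wise order of the serialized form.
fn sort_references(refs: &mut [Vec<u8>]) {
    refs.sort(); // Vec<u8> compares byte by byte
}

fn main() {
    let bytes = serialize_example_blob(&[0xaa; 128], &[0xbb; 32]);
    println!("first four bytes: {:02x?}", &bytes[..4]);

    let mut refs = vec![vec![0x80, 0x81, 0x20, 0x02], vec![0x80, 0x81, 0x20, 0x01]];
    sort_references(&mut refs);
    println!("sorted references: {:02x?}", refs);
}
```

A general serializer would compute the headers from the element counts and lengths; that part is left out here because it depends on the atlv specification.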
References
There is only one type of reference to blobs, "Blob Hash References". They are produced by initializing a stateful hash object with the domain "Versioned Node Subscriptions: Reference: Blob: Hash", feeding in the ciphertext, demarcating, feeding the serialized list of references, and finalizing.
Blob Hash References are serialized with the tag 0 followed by the serialization of the contained hash.
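The order of operations matters for the resulting hash, so it may help to see it spelled out. The sketch below does not hash anything: `RecordingSho` is a hypothetical stand-in that only records the calls it receives, and its method names are assumptions rather than the API of any real stateful hash object.

```rust
/// A stand-in for a stateful hash object that records the operations it
/// receives instead of hashing them. Hypothetical API, illustration only.
struct RecordingSho {
    transcript: Vec<String>,
}

impl RecordingSho {
    fn init(domain: &str) -> Self {
        RecordingSho { transcript: vec![format!("init({})", domain)] }
    }
    fn feed(&mut self, what: &str, bytes: &[u8]) {
        self.transcript.push(format!("feed({}, {} bytes)", what, bytes.len()));
    }
    fn demarcate(&mut self) {
        self.transcript.push("demarcate".to_string());
    }
    fn finalize(mut self) -> Vec<String> {
        self.transcript.push("finalize".to_string());
        self.transcript
    }
}

const BLOB_HASH_DOMAIN: &str = "Versioned Node Subscriptions: Reference: Blob: Hash";

/// The sequence described in the text: initialize with the domain, feed the
/// ciphertext, demarcate, feed the serialized reference list, finalize.
fn blob_hash_reference_steps(ciphertext: &[u8], serialized_refs: &[u8]) -> Vec<String> {
    let mut sho = RecordingSho::init(BLOB_HASH_DOMAIN);
    sho.feed("ciphertext", ciphertext);
    sho.demarcate();
    sho.feed("references", serialized_refs);
    sho.finalize()
}

fn main() {
    // Sizes match the serialization example: a 128-byte ciphertext and a
    // one-element reference list (4 header bytes plus a 32-byte hash).
    for step in blob_hash_reference_steps(&[0u8; 128], &[0u8; 36]) {
        println!("{}", step);
    }
}
```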
Summaries
Blob Summaries are records of blobs that elide the contained data.
struct BlobSummary {
    state: HashState,
    references: ReferenceSet,
}
The hash state for a Blob Summary is produced by initializing a stateful hash object (SHO) with the domain "Versioned Node Subscriptions: Reference: Blob: Hash", feeding in the ciphertext, and then extracting the state.
The reference for a Blob may be generated from its Summary, by injecting the preserved hash state, feeding the serialized list of references, and finalizing.
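The point of preserving the hash state is that the ciphertext never needs to be seen again. The following sketch shows that a reference computed from a summary matches one computed from the full blob. It makes several loud assumptions: `std`'s `DefaultHasher` stands in for the SHO (it is not cryptographic and yields only 64 bits), cloning the hasher stands in for extracting and injecting state, demarcation is modeled as a single marker byte, and the demarcation is placed after the state is injected, which the text does not specify. All function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

const DOMAIN: &[u8] = b"Versioned Node Subscriptions: Reference: Blob: Hash";

/// Stand-in for SHO initialization. DefaultHasher is NOT cryptographic; it is
/// used only because std ships it and its state can be cloned.
fn init_sho() -> DefaultHasher {
    let mut sho = DefaultHasher::new();
    sho.write(DOMAIN);
    sho
}

/// Hypothetical demarcation: a single marker byte.
fn demarcate(sho: &mut DefaultHasher) {
    sho.write_u8(0xff);
}

/// Simplified summary: the preserved state plus the serialized references.
struct BlobSummary {
    state: DefaultHasher,
    serialized_references: Vec<u8>,
}

/// Reference computed with the ciphertext in hand.
fn reference_from_blob(ciphertext: &[u8], serialized_refs: &[u8]) -> u64 {
    let mut sho = init_sho();
    sho.write(ciphertext);
    demarcate(&mut sho);
    sho.write(serialized_refs);
    sho.finish()
}

/// Feed the ciphertext once, then keep only the hash state.
fn summarize(ciphertext: &[u8], serialized_refs: &[u8]) -> BlobSummary {
    let mut sho = init_sho();
    sho.write(ciphertext);
    BlobSummary { state: sho, serialized_references: serialized_refs.to_vec() }
}

/// Reference computed from the summary alone, without the ciphertext.
fn reference_from_summary(summary: &BlobSummary) -> u64 {
    let mut sho = summary.state.clone(); // "inject" the preserved state
    demarcate(&mut sho); // assumption: demarcation happens after injection
    sho.write(&summary.serialized_references);
    sho.finish()
}

fn main() {
    let summary = summarize(b"ciphertext bytes", b"serialized references");
    let direct = reference_from_blob(b"ciphertext bytes", b"serialized references");
    println!("references match: {}", reference_from_summary(&summary) == direct);
}
```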
Cryptography
Encryption
The ciphertext held within a Blob is produced using deterministic authenticated encryption with associated data (DAEAD), with a domain of "Versioned Node Subscriptions: Blob Encryption". The associated data is the serialized set of references, including the length. (In the above example, this is everything from the second array header, the byte 41 that opens the reference list, to the end.) The encryption key is derived using the DAEAD's FromPlaintext method, with an application-provided convergence domain, which may be empty.
Analysis
Blobs provide read-only, content-addressed, encrypted storage. As such, they are vulnerable to chosen-plaintext attacks, including the confirmation-of-a-file and learn-the-remaining-information attacks described by Tahoe-LAFS.
There are two mechanisms to mitigate this. One is that the key and nonce derivation both take the transitive content of the Blob into account, so two blobs with the same data but different references will have different ciphertexts and keys. The other is the same one Tahoe-LAFS uses: the application can provide a convergence domain, and the attacks only apply within that convergence domain.
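Both mitigations come down to which inputs feed the key derivation. The sketch below illustrates that property only. `toy_derive_key` is a hypothetical stand-in, not the DAEAD's FromPlaintext method; it uses `std`'s non-cryptographic `DefaultHasher`, and the choice and order of inputs are assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Feed one length-prefixed field so that field boundaries are unambiguous.
fn feed_field(h: &mut DefaultHasher, field: &[u8]) {
    h.write_u64(field.len() as u64);
    h.write(field);
}

/// Toy content-derived key. NOT cryptographic and NOT the real derivation:
/// it only shows that the key depends on the convergence domain, the
/// serialized references, and the plaintext.
fn toy_derive_key(convergence_domain: &[u8], serialized_refs: &[u8], plaintext: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    feed_field(&mut h, b"Versioned Node Subscriptions: Blob Encryption");
    feed_field(&mut h, convergence_domain);
    feed_field(&mut h, serialized_refs);
    feed_field(&mut h, plaintext);
    h.finish()
}

fn main() {
    let data = b"same plaintext";
    let base = toy_derive_key(b"", b"refs-a", data);

    // Deterministic: identical inputs converge on the same key.
    println!("same inputs, same key: {}", base == toy_derive_key(b"", b"refs-a", data));
    // Same data under different references yields a different key.
    println!("different refs, same key: {}", base == toy_derive_key(b"", b"refs-b", data));
    // A different convergence domain isolates the blob from guesses made outside it.
    println!("different domain, same key: {}", base == toy_derive_key(b"app-secret", b"refs-a", data));
}
```

An attacker who wants to confirm a guessed plaintext therefore has to guess the references and the convergence domain as well, which is the scope limit the paragraph above describes.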