Content-Addressed Storage
At its core, ContextSubstrate operates as a Content-Addressed Storage (CAS) system. Unlike traditional file systems that locate data by path or name, ctx locates data by its cryptographic hash. This architecture ensures that every agent execution is immutable, verifiable, and storage-efficient through automatic deduplication.
The Object Store
When you run ctx pack, the substrate decomposes an execution log into discrete "blobs." Every piece of data—including system prompts, input files, tool outputs, and the final results—is hashed and stored in the .ctx/objects/ directory.
How Blobs are Managed
- Identity is Content: If two different agent runs use the exact same 10MB PDF as input, the substrate stores that file only once.
- Granular Deduplication: Because prompts and tool outputs are stored as individual blobs, common instructions or repetitive tool responses do not consume additional space across multiple context packs.
- Integrity: Because the filename is the hash of the content, the substrate can detect data corruption or unauthorized modification by re-hashing a blob and comparing the result to its filename.
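The three properties above can be illustrated with a minimal Go sketch. It is not the ctx implementation: the `Store` type, its methods, and the use of an in-memory map in place of the .ctx/objects/ directory are all hypothetical, and SHA-256 is assumed as the blob hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Store is a hypothetical in-memory stand-in for the .ctx/objects/ directory.
type Store struct {
	blobs map[string][]byte // hex SHA-256 digest -> content
}

// NewStore returns an empty store.
func NewStore() *Store { return &Store{blobs: make(map[string][]byte)} }

// hashOf returns the hex-encoded SHA-256 digest of data.
func hashOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// Put stores data under its own hash. Storing identical content again
// is a no-op, which is where deduplication comes from.
func (s *Store) Put(data []byte) string {
	h := hashOf(data)
	if _, ok := s.blobs[h]; !ok {
		s.blobs[h] = append([]byte(nil), data...) // keep a private copy
	}
	return h
}

// Check re-hashes a stored blob and reports whether it still matches its key.
func (s *Store) Check(hash string) bool {
	data, ok := s.blobs[hash]
	return ok && hashOf(data) == hash
}

func main() {
	s := NewStore()
	a := s.Put([]byte("same input file"))
	b := s.Put([]byte("same input file"))
	fmt.Println(a == b, len(s.blobs)) // true 1: one blob for two identical inputs

	s.blobs[a][0] ^= 0xff   // simulate corruption of the stored bytes
	fmt.Println(s.Check(a)) // false: content no longer matches its name
}
```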
Anatomy of a Context Pack
A Context Pack is a content-addressed manifest that references these blobs and orders them into a coherent execution history.
The Manifest Structure
The Pack type defines the structure of this manifest. When a pack is created, it includes references (hashes) to the underlying blobs:
type Pack struct {
    Version      string    `json:"version"`
    Hash         string    `json:"hash"`          // Identity of the manifest itself
    Created      time.Time `json:"created"`
    SystemPrompt string    `json:"system_prompt"` // Reference to a blob hash
    Prompts      []Prompt  `json:"prompts"`       // List of references
    Inputs       []Input   `json:"inputs"`        // Name + Blob Hash
    Steps        []Step    `json:"steps"`         // Execution trace with blob references
    Outputs      []Output  `json:"outputs"`       // Final artifacts with blob references
}
Deterministic Hashing
To ensure that identical execution logs always produce the same Pack Hash, ctx uses Canonical JSON serialization. This process recursively sorts map keys and removes insignificant whitespace before computing the final SHA-256 hash of the manifest.
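The effect of canonicalization can be sketched in Go. The `canonicalHash` helper below is hypothetical and is not the serializer ctx uses. It relies on the fact that `encoding/json` writes map keys in sorted order and emits no insignificant whitespace. It is not a full canonicalization scheme such as RFC 8785: for example, it does not normalize number formatting.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// canonicalHash (hypothetical helper) decodes a JSON document, re-encodes it
// with sorted map keys and no extra whitespace, and returns the hex SHA-256
// digest of the result.
func canonicalHash(raw []byte) (string, error) {
	var v interface{}
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.UseNumber() // keep numbers as written instead of converting to float64
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	// Objects decode to map[string]interface{}; json.Marshal sorts map keys
	// at every nesting level and produces compact output.
	canon, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(canon)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	a, _ := canonicalHash([]byte(`{"version":"1","steps":[{"b":2,"a":1}]}`))
	b, _ := canonicalHash([]byte(`{ "steps": [ { "a": 1, "b": 2 } ], "version": "1" }`))
	fmt.Println(a == b) // true: key order and whitespace do not affect the hash
}
```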
Working with Hashes
The substrate uses a standard URI scheme (ctx://<hash>) to reference objects. You can interact with these hashes directly via the CLI.
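As an illustration of the reference format, the following Go sketch extracts the hash from a ctx://<hash> reference. The `parseRef` function is hypothetical, and it assumes the hash is a full hex-encoded SHA-256 digest (64 hex characters); the real CLI may accept other forms.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

const scheme = "ctx://"

// parseRef (hypothetical helper) returns the hash part of a ctx://<hash>
// reference. Assumption: the hash is a hex-encoded SHA-256 digest.
func parseRef(ref string) (string, error) {
	if !strings.HasPrefix(ref, scheme) {
		return "", errors.New("not a ctx:// reference")
	}
	h := strings.TrimPrefix(ref, scheme)
	raw, err := hex.DecodeString(h)
	if err != nil || len(raw) != 32 {
		return "", fmt.Errorf("invalid hash %q", h)
	}
	return strings.ToLower(h), nil
}

func main() {
	ref := scheme + strings.Repeat("ab", 32) // placeholder digest, not a real object
	h, err := parseRef(ref)
	fmt.Println(len(h), err) // 64 <nil>

	_, err = parseRef("file:///tmp/report.md")
	fmt.Println(err != nil) // true: wrong scheme is rejected
}
```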
Inspecting Content
To see the human-readable summary of a pack's manifest:
ctx show <hash>
Verifying Provenance
The ctx verify command uses the CAS layer to prove that a local artifact (like a generated code file or report) exactly matches the content captured during a specific agent run.
# Checks the artifact's hash against the store's records
ctx verify ./generated_report.md
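Conceptually, this check reduces to hashing the artifact and looking the digest up among the recorded outputs. The Go sketch below shows that idea only. The `Output` fields and the `verifyArtifact` function are hypothetical, SHA-256 is assumed as the blob hash, and the actual behavior of ctx verify may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Output is a hypothetical shape for a recorded artifact: a name plus blob hash.
type Output struct {
	Name string
	Hash string
}

// hashOf returns the hex-encoded SHA-256 digest of data.
func hashOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// verifyArtifact (hypothetical) reports which recorded output, if any,
// has exactly the same content as the given artifact bytes.
func verifyArtifact(content []byte, outputs []Output) (Output, bool) {
	h := hashOf(content)
	for _, o := range outputs {
		if o.Hash == h {
			return o, true
		}
	}
	return Output{}, false
}

func main() {
	report := []byte("# Report\nAll checks passed.\n")
	outputs := []Output{{Name: "generated_report.md", Hash: hashOf(report)}}

	_, ok := verifyArtifact(report, outputs)
	fmt.Println(ok) // true: content matches the recorded output

	_, ok = verifyArtifact(append(report, '!'), outputs)
	fmt.Println(ok) // false: a single extra byte changes the hash
}
```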
Storage Benefits
- Zero-Cost Versioning: Forking a pack (ctx fork) doesn't duplicate the underlying data. It creates a new mutable draft that points to the same immutable blobs.
- Efficient Synchronization: When sharing context packs between environments, only the missing blobs need to be transferred.
- Traceability: Every step in a ctx log or ctx diff is backed by the CAS, so the "Drift Reports" they generate compare the actual bit-for-bit content of the model's reasoning and tool usage.
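The synchronization benefit above amounts to a set difference over hashes. As a rough Go sketch (the `missingBlobs` function and the short example hashes are hypothetical, and no actual ctx transfer protocol is implied), a sender needs to transmit only the hashes the receiver does not already hold:

```go
package main

import "fmt"

// missingBlobs (hypothetical) returns the hashes a pack needs that the
// destination store does not already hold, in first-seen order and
// without duplicates.
func missingBlobs(needed []string, have map[string]bool) []string {
	var missing []string
	seen := make(map[string]bool)
	for _, h := range needed {
		if have[h] || seen[h] {
			continue
		}
		seen[h] = true
		missing = append(missing, h)
	}
	return missing
}

func main() {
	// Shortened placeholder hashes for readability.
	needed := []string{"aa11", "bb22", "cc33", "aa11"}
	have := map[string]bool{"bb22": true}

	fmt.Println(missingBlobs(needed, have)) // [aa11 cc33]
}
```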