Content-Addressable Storage (CAS)¶
What is Content-Addressable Storage?¶
Content-Addressable Storage (CAS) is a storage system where content is identified by its cryptographic hash (SHA-256) instead of its location (path). In Nexus, every file's content is stored once based on its hash, enabling automatic deduplication and immutable version history.
Traditional Storage vs CAS¶
| Traditional Storage | Content-Addressable Storage (CAS) |
|---|---|
| File identified by path (/workspace/file.txt) | Content identified by hash (abc123...) |
| Same content at 10 paths = 10 copies | Same content at 10 paths = 1 copy |
| Modify file = overwrite content | Modify file = new hash, old content preserved |
| No built-in versioning | Automatic immutable version history |
| Manual deduplication (complex) | Automatic deduplication (transparent) |
Key Innovation: Store once, reference many times.
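The idea is easy to see in miniature. The following toy sketch (an illustration, not the Nexus implementation) keeps one blob table keyed by hash and a path table that references it:

```python
import hashlib

blobs: dict[str, bytes] = {}  # CAS layer: hash -> content, stored once
paths: dict[str, str] = {}    # metadata layer: path -> content hash

def put(path: str, content: bytes) -> str:
    content_hash = hashlib.sha256(content).hexdigest()
    blobs.setdefault(content_hash, content)  # no-op if already stored
    paths[path] = content_hash
    return content_hash

put("/workspace/alice/doc.txt", b"Hello World")
put("/workspace/bob/doc.txt", b"Hello World")
assert len(paths) == 2 and len(blobs) == 1  # two paths, one stored copy
```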
How CAS Works in Nexus¶
Architecture: Two-Layer System¶
```mermaid
graph TB
    subgraph "Virtual Layer (What Users See)"
        P1["/workspace/alice/doc.txt"]
        P2["/workspace/bob/doc.txt"]
        P3["/backup/v1/doc.txt"]
    end
    subgraph "Metadata Layer"
        M1["Metadata: path=/workspace/alice/doc.txt<br>content_hash=abc123..."]
        M2["Metadata: path=/workspace/bob/doc.txt<br>content_hash=abc123..."]
        M3["Metadata: path=/backup/v1/doc.txt<br>content_hash=abc123..."]
    end
    subgraph "CAS Layer (Physical Storage)"
        C["Content Hash: abc123...<br>Content: 'Hello World'<br>ref_count: 3"]
    end
    P1 --> M1
    P2 --> M2
    P3 --> M3
    M1 & M2 & M3 --> C
    style C fill:#e1f5e1
```

Key Components:

- Virtual Layer: Paths users see (/workspace/file.txt)
- Metadata Layer: Maps paths to content hashes
- CAS Layer: Physical storage indexed by SHA-256 hash
Content Hashing¶
SHA-256 Hash Computation¶
```python
import hashlib

def compute_hash(content: bytes) -> str:
    """Compute SHA-256 hash of content."""
    return hashlib.sha256(content).hexdigest()

# Example
content = b"Hello World"
content_hash = compute_hash(content)
# Result: "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e"
```
Hash Properties:

- Algorithm: SHA-256 (256-bit hash)
- Output: 64 hexadecimal characters
- Deterministic: Same content always produces the same hash
- Collision-resistant: Practically impossible for different content to produce the same hash
- One-way: Cannot reverse the hash to recover the content
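These properties are easy to check directly:

```python
import hashlib

a = hashlib.sha256(b"Hello World").hexdigest()
b = hashlib.sha256(b"Hello World").hexdigest()
c = hashlib.sha256(b"Hello World!").hexdigest()

assert a == b        # deterministic: same content, same hash
assert a != c        # one extra byte gives a completely different hash
assert len(a) == 64  # 256 bits = 64 hex characters
```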
Storage Layout¶
Content is organized in a two-level directory structure to prevent filesystem bottlenecks:
```
backend_root/
├── cas/                              # Content-Addressable Storage
│   ├── ab/                           # First 2 chars of hash
│   │   ├── cd/                       # Next 2 chars of hash
│   │   │   ├── abcd1234...ef56       # Actual content file
│   │   │   └── abcd1234...ef56.meta  # Metadata (ref count, size)
│   │   └── de/
│   │       └── abde5678...9a12
│   └── ef/
│       └── ...
```
Example:

```
Hash: "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e"

Storage paths:
cas/a5/91/a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
cas/a5/91/a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e.meta
```
Why two levels?

- First 2 chars: Distributes content across 256 directories (00-ff)
- Next 2 chars: Further distribution (65,536 leaf directories max)
- Prevents: A single directory with millions of files (a filesystem bottleneck)
- Used by: Git, Docker, and many production storage systems
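The snippets below rely on a `_hash_to_path` helper that maps a hash to this layout. A plausible sketch, assuming a configurable `CAS_ROOT` directory (the names are illustrative, not the actual Nexus internals):

```python
from pathlib import Path

CAS_ROOT = Path("/var/nexus/backend/cas")  # assumed, configurable root

def _hash_to_path(content_hash: str) -> Path:
    """Map a SHA-256 hex digest to its two-level shard path."""
    # "a591a6d4..." -> cas/a5/91/a591a6d4...
    return CAS_ROOT / content_hash[:2] / content_hash[2:4] / content_hash
```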
Write Operations and Deduplication¶
Write Flow¶
```python
import hashlib
import os

def write_content(content: bytes) -> str:
    """
    Write content to CAS storage. Returns the content hash (SHA-256).
    If the content already exists, increments its reference count.
    """
    # 1. Compute hash
    content_hash = hashlib.sha256(content).hexdigest()

    # 2. Check if content already exists
    content_path = _hash_to_path(content_hash)
    if content_path.exists():
        # Content exists! Increment ref count (deduplication)
        metadata = _read_metadata(content_hash)
        metadata["ref_count"] += 1
        _write_metadata(content_hash, metadata)
        return content_hash  # Return immediately (content is not rewritten)

    # 3. Content doesn't exist - write atomically:
    # write to a temp file first, then atomically move into place
    tmp_path = write_to_temp_file(content)
    os.replace(tmp_path, content_path)

    # 4. Create metadata
    metadata = {"ref_count": 1, "size": len(content)}
    _write_metadata(content_hash, metadata)
    return content_hash
```
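`write_to_temp_file` is not shown above; a plausible sketch, reusing the assumed `CAS_ROOT` from earlier. The temp file is created inside the CAS tree so it lives on the same filesystem, which is what keeps the later `os.replace()` atomic:

```python
import os
import tempfile

def write_to_temp_file(content: bytes) -> str:
    """Write content to a temp file next to the CAS tree; return its path."""
    fd, tmp_path = tempfile.mkstemp(dir=str(CAS_ROOT))
    with os.fdopen(fd, "wb") as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())  # ensure bytes hit disk before the rename
    return tmp_path
```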
Automatic Deduplication¶
Example Scenario:
```python
# 100 users upload the same file
content = b"Python is great!"
hashes = []
for user in range(100):
    content_hash = backend.write_content(content)
    hashes.append(content_hash)

# All hashes are identical
assert len(set(hashes)) == 1  # Only 1 unique hash

# Disk usage: 16 bytes (not 1,600 bytes)
# Reference count: 100
```
Real-World Benefits:
| Scenario | Without CAS | With CAS | Savings |
|---|---|---|---|
| 100 copies of 10MB file | 1 GB | 10 MB | 99% saved |
| Document versions (80% unchanged) | 500 MB | 100 MB | 80% saved |
| AI agent logs (repetitive patterns) | 2 GB | 400 MB | 80% saved |
Reference Counting¶
Metadata Format¶
Each content item has a metadata file (`.meta` suffix) with these fields:

- ref_count: Number of files/paths referencing this content
- size: Size of the content in bytes
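The `_read_metadata` / `_write_metadata` helpers used on this page are not shown; a minimal sketch, assuming the `.meta` file is JSON (the actual on-disk encoding may differ):

```python
import json
from pathlib import Path

def _get_meta_path(content_hash: str) -> Path:
    return _hash_to_path(content_hash).with_suffix(".meta")

def _read_metadata(content_hash: str) -> dict:
    # e.g. {"ref_count": 3, "size": 11}
    return json.loads(_get_meta_path(content_hash).read_text())

def _write_metadata(content_hash: str, metadata: dict) -> None:
    _get_meta_path(content_hash).write_text(json.dumps(metadata))
```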
Safe Deletion with Reference Counting¶
```python
def delete_content(content_hash: str) -> None:
    """Delete content with reference counting."""
    metadata = _read_metadata(content_hash)
    ref_count = metadata["ref_count"]

    if ref_count <= 1:
        # Last reference - delete content and metadata
        content_path = _hash_to_path(content_hash)
        content_path.unlink()  # Delete content
        meta_path = _get_meta_path(content_hash)
        meta_path.unlink()  # Delete metadata
        # Clean up empty directories
        _cleanup_empty_dirs(content_path.parent)
    else:
        # Decrement reference count
        metadata["ref_count"] = ref_count - 1
        _write_metadata(content_hash, metadata)
```
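`_cleanup_empty_dirs` might look like the following sketch (again assuming `CAS_ROOT`): walk upward from the deleted file's shard directory, removing directories until a non-empty one or the root is reached:

```python
from pathlib import Path

def _cleanup_empty_dirs(directory: Path) -> None:
    """Remove now-empty shard directories below the CAS root."""
    while directory != CAS_ROOT and not any(directory.iterdir()):
        directory.rmdir()
        directory = directory.parent
```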
Deletion Example¶
Initial State:

- content_hash="abc123..." has ref_count=3
- Paths: /file1.txt, /file2.txt, /file3.txt (all the same hash)
Step 1: Delete /file1.txt
- ref_count decremented to 2
- Content remains on disk
- Other files unaffected
Step 2: Delete /file2.txt
- ref_count decremented to 1
- Content remains on disk
- /file3.txt still accessible
Step 3: Delete /file3.txt
- ref_count is 1, this is the last reference
- Content file deleted from CAS
- Metadata file deleted
- Empty directories cleaned up
Result: Zero disk usage for this content
Version History and Immutability¶
Immutable Content¶
Once content is written to CAS, it cannot be modified:
```python
# Write content
hash1 = backend.write_content(b"Version 1")
# hash1 = "a1b2c3..."

# "Modify" content = write a new version
hash2 = backend.write_content(b"Version 2")
# hash2 = "d4e5f6..." (different hash)

# Original content is IMMUTABLE
content = backend.read_content(hash1)
# Still returns b"Version 1"
```
Key Properties:

1. Content hash is permanent: Once created, a hash never changes
2. Content is read-only: Different content = different hash
3. No destructive updates: Old versions are preserved
4. Guaranteed integrity: Hash mismatch = corrupted content
Version History¶
Nexus tracks all versions of a file in the `version_history` table:
```python
class VersionHistoryModel:
    version_id: str        # Unique version ID
    resource_id: str       # Path ID
    version_number: int    # 1, 2, 3, ...
    content_hash: str      # SHA-256 hash (CAS key)
    size_bytes: int        # File size at this version
    created_at: datetime   # When the version was created
    created_by: str        # User/agent who created it
```
Example History:

```
File: /workspace/doc.txt

Version 3 (current):
    content_hash: "abc123..."
    size: 1024 bytes
    created_at: 2025-01-15 10:30:00
    created_by: alice

Version 2:
    content_hash: "def456..."
    size: 512 bytes
    created_at: 2025-01-15 10:00:00
    created_by: alice

Version 1:
    content_hash: "789fed..."
    size: 256 bytes
    created_at: 2025-01-15 09:00:00
    created_by: alice
```
Time Travel¶
Access any previous version using version notation:
```bash
# Read current version
nexus cat /workspace/doc.txt

# Read version 2
nexus cat /workspace/doc.txt@2

# Read version 1
nexus cat /workspace/doc.txt@1
```
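The `@` suffix can be resolved with a simple split. A sketch (not necessarily how Nexus parses it):

```python
def parse_version_path(path: str) -> tuple[str, int | None]:
    """Split '/workspace/doc.txt@2' into ('/workspace/doc.txt', 2)."""
    base, sep, suffix = path.rpartition("@")
    if sep and suffix.isdigit():
        return base, int(suffix)
    return path, None  # no version suffix -> current version

assert parse_version_path("/workspace/doc.txt@2") == ("/workspace/doc.txt", 2)
assert parse_version_path("/workspace/doc.txt") == ("/workspace/doc.txt", None)
```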
Python API:
```python
# Get version history
versions = nx.stat("/workspace/doc.txt").versions
for v in versions:
    print(f"v{v['version']}: {v['size']} bytes, {v['created_at']}")

# Read a specific version
meta = nx.metadata.get("/workspace/doc.txt@2")
content = nx.backend.read_content(meta.etag)  # Read from CAS
```
Read Operations¶
Read Flow¶
```python
def read(path: str) -> bytes:
    """
    Read file content by path.

    Process:
    1. Get metadata for the path
    2. Extract content_hash from the metadata
    3. Read from CAS using content_hash
    """
    # 1. Get metadata
    meta = metadata_store.get(path)
    if meta is None:
        raise FileNotFoundError(path)

    # 2. Extract content hash
    content_hash = meta.etag  # SHA-256 hash

    # 3. Read from CAS
    return backend.read_content(content_hash)
```
Key Insight:

- Virtual path → metadata lookup → content_hash → CAS read
- Multiple paths can share the same content (deduplication)
Content Cache¶
Nexus includes an LRU content cache to speed up repeated reads:
```python
import threading
from collections import OrderedDict

class ContentCache:
    """LRU cache for file content indexed by content hash."""

    def __init__(self, max_size_mb: int = 256):
        """Default: 256MB cache."""
        self._max_size_bytes = max_size_mb * 1024 * 1024
        self._current_size = 0
        self._cache: OrderedDict[str, bytes] = OrderedDict()
        self._lock = threading.Lock()  # guards cache and size counter

    def get(self, content_hash: str) -> bytes | None:
        """Get from cache (thread-safe LRU)."""
        with self._lock:
            if content_hash in self._cache:
                # Move to end (most recently used)
                self._cache.move_to_end(content_hash)
                return self._cache[content_hash]
            return None

    def put(self, content_hash: str, content: bytes) -> None:
        """Add to cache with size-based eviction."""
        if len(content) > self._max_size_bytes:
            return  # Never cache an item larger than the whole cache
        with self._lock:
            # Evict LRU items until there is space
            while self._cache and self._current_size + len(content) > self._max_size_bytes:
                # Remove least recently used item
                _, evicted = self._cache.popitem(last=False)
                self._current_size -= len(evicted)
            self._cache[content_hash] = content
            self._current_size += len(content)
```
Cache Benefits:

- Fast reads: Avoid disk I/O for popular content
- Size-based eviction: Default 256MB (configurable)
- Thread-safe: Supports multiple concurrent readers
- Transparent: Automatic caching on read/write
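The cache sits in front of CAS reads as a standard read-through layer. A sketch of the wiring (the real backend does this automatically, per the "Transparent" bullet above):

```python
cache = ContentCache(max_size_mb=256)

def cached_read_content(content_hash: str) -> bytes:
    content = cache.get(content_hash)
    if content is None:
        content = read_content(content_hash)  # disk read on cache miss
        cache.put(content_hash, content)      # populate for next time
    return content
```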
Batch Read Optimization¶
Read multiple content items efficiently:
```python
def batch_read_content(content_hashes: list[str]) -> dict[str, bytes | None]:
    """Batch read multiple content items."""
    result: dict[str, bytes | None] = {}

    # First pass: check the cache
    uncached = []
    for content_hash in content_hashes:
        cached = content_cache.get(content_hash)
        if cached is not None:  # empty content is still a cache hit
            result[content_hash] = cached
        else:
            uncached.append(content_hash)

    # Second pass: read uncached items from disk
    for content_hash in uncached:
        try:
            result[content_hash] = read_content(content_hash)
        except FileNotFoundError:
            result[content_hash] = None  # Missing
    return result
```
Streaming for Large Files¶
For very large files (GB+), use streaming to avoid loading entire file into memory:
```python
def stream_content(content_hash: str, chunk_size: int = 8192):
    """Stream content in chunks (memory-efficient)."""
    content_path = _hash_to_path(content_hash)
    with open(content_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Example: process a 10GB file with ~1MB of memory
for chunk in stream_content(content_hash, chunk_size=1024 * 1024):
    process_chunk(chunk)  # Memory = 1MB, not 10GB
```
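The same chunked pattern works for hashing, since hashlib supports incremental updates. A sketch for hashing a large file with constant memory:

```python
import hashlib

def compute_hash_streaming(path: str, chunk_size: int = 1024 * 1024) -> str:
    """SHA-256 of a file, reading chunk_size bytes at a time."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()
```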
Backend Integration¶
Unified Backend Interface¶
All backends implement the same CAS interface:
```python
from abc import ABC, abstractmethod

class Backend(ABC):
    """Unified backend interface."""

    @abstractmethod
    def write_content(self, content: bytes) -> str:
        """Write content, return its hash."""

    @abstractmethod
    def read_content(self, content_hash: str) -> bytes:
        """Read content by hash."""

    @abstractmethod
    def delete_content(self, content_hash: str) -> None:
        """Delete content (reference counting)."""

    @abstractmethod
    def content_exists(self, content_hash: str) -> bool:
        """Check if content exists."""
```
LocalBackend (Filesystem)¶
Storage Structure:

```
/var/nexus/backend/
├── cas/
│   ├── a5/
│   │   └── 91/
│   │       ├── a591a6d40bf420...       # Content file
│   │       └── a591a6d40bf420....meta  # Metadata
```
Features:

- Thread-safe file locking
- Windows/macOS/Linux compatible
- Atomic writes (temp file + rename)
- Automatic directory cleanup
GCSBackend (Google Cloud Storage)¶
Storage Structure:
```
gs://my-bucket/
├── cas/ab/cd/abcd1234...ef56       # Content blob
└── cas/ab/cd/abcd1234...ef56.meta  # Metadata blob
```
Features:

- Unlimited scalability
- Global distribution
- Automatic redundancy
- Pay-per-byte pricing
- Same CAS interface as LocalBackend
Switching Backends (Zero Code Changes):
```python
# Local development
nx = NexusFS(backend=LocalBackend("/tmp/nexus"))

# Production (cloud)
nx = NexusFS(backend=GCSBackend("my-bucket", "my-project"))

# All file operations work identically!
nx.write("/workspace/data.txt", b"content")
content = nx.read("/workspace/data.txt")
```
Real-World Examples¶
Example 1: Document Backups¶
```python
# User saves a document 5 times
v1 = b"Draft 1"
v2 = b"Draft 2"
v3 = b"Draft 3"
v4 = b"Draft 3"  # Accidentally same as v3
v5 = b"Final"

# Write all versions
hash1 = nx.backend.write_content(v1)  # ref_count=1
hash2 = nx.backend.write_content(v2)  # ref_count=1
hash3 = nx.backend.write_content(v3)  # ref_count=1
hash4 = nx.backend.write_content(v4)  # hash3 again! ref_count=2 (deduped)
hash5 = nx.backend.write_content(v5)  # ref_count=1

# Disk usage
# v1: 7 bytes
# v2: 7 bytes
# v3: 7 bytes
# v4: 0 bytes (deduped with v3)
# v5: 5 bytes
# Total: 26 bytes (not 33 bytes)
```
Example 2: AI Agent Logs¶
```python
# 100 agents log the same error message
error_msg = b"Connection timeout to database"
hashes = []
for agent_id in range(100):
    content_hash = nx.backend.write_content(error_msg)
    hashes.append(content_hash)
    # Store metadata pointing to the hash
    nx.metadata.put({
        "path": f"/logs/agent_{agent_id}/error.log",
        "content_hash": content_hash,
    })

# All hashes are identical
assert len(set(hashes)) == 1

# Disk usage
# 100 files pointing to 1 content blob
# Storage: 30 bytes (not 3,000 bytes)
# 99% deduplication!
```
Example 3: Shared Team Files¶
```python
# Alice creates a file
nx_alice.write("/workspace/alice/doc.txt", b"Team document")

# Bob copies it
content = nx_alice.read("/workspace/alice/doc.txt")
nx_bob.write("/workspace/bob/doc.txt", content)

# Charlie copies it
nx_charlie.write("/workspace/charlie/doc.txt", content)

# Behind the scenes:
# - 3 metadata entries (different paths)
# - 1 content blob in CAS
# - ref_count = 3
# - Storage: 1× size, not 3× size
```
Performance Characteristics¶
Write Performance¶
| Operation | Cost | Notes |
|---|---|---|
| Hash computation | O(n) | n = file size, ~200 MB/s on modern CPU |
| Duplicate check | O(1) | Hash lookup in filesystem |
| New content write | O(n) | Write to disk |
| Metadata write | O(1) | Small JSON file |
Benchmark (Local SSD):

- 1KB file: ~0.5ms (hash + write)
- 1MB file: ~5ms (hash + write)
- 100MB file: ~500ms (hash + write)
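To reproduce the hash-throughput figure on your own hardware, here is a rough check (results vary widely with CPU and SHA hardware extensions):

```python
import hashlib
import time

data = b"x" * (100 * 1024 * 1024)  # 100 MB of dummy data
start = time.perf_counter()
hashlib.sha256(data).hexdigest()
elapsed = time.perf_counter() - start
print(f"SHA-256 throughput: {100 / elapsed:.0f} MB/s")
```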
Read Performance¶
| Operation | Cost | Notes |
|---|---|---|
| Metadata lookup | O(1) | Database index |
| Cache hit | O(1) | ~0.1ms |
| Cache miss (disk) | O(n) | n = file size |
Benchmark (with cache):

- Cached read (1MB): ~0.1ms
- Uncached read (1MB): ~5ms
- Batch read (100 files): ~50ms (cache) or ~500ms (disk)
Best Practices¶
1. Use Batch Reads for Multiple Files¶
```python
# ✅ Good: Batch read
hashes = [meta.etag for meta in file_metas]
contents = backend.batch_read_content(hashes)

# ❌ Bad: Individual reads
for meta in file_metas:
    content = backend.read_content(meta.etag)  # N round trips!
```
2. Stream Large Files¶
```python
# ✅ Good: Stream a large file
for chunk in backend.stream_content(content_hash, chunk_size=1024 * 1024):
    process_chunk(chunk)  # Memory = 1MB

# ❌ Bad: Load the entire file
content = backend.read_content(content_hash)  # Memory = file_size
process(content)
```
3. Enable Content Cache¶
```python
# ✅ Good: Enable the cache for read-heavy workloads
backend = LocalBackend(
    root_dir="/tmp/nexus",
    content_cache=ContentCache(max_size_mb=512),  # 512MB cache
)

# ❌ Bad: No cache (repeated disk reads)
backend = LocalBackend(root_dir="/tmp/nexus")
```
4. Monitor Reference Counts¶
```bash
# Check the ref count for content
nexus backend ref-count abc123...

# Find content with zero refs (garbage)
nexus backend list-unreferenced

# Cleanup (manual GC)
nexus backend gc --dry-run
```
Troubleshooting¶
Content Not Found¶
Problem: `backend.read_content(hash)` raises `FileNotFoundError`

Check:

1. Verify the hash exists in CAS (see the sketch below).
2. Check the reference count; content is removed once its ref count reaches 0.
3. Check metadata consistency: does any path's metadata still point at this hash?
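A quick diagnostic using the documented `content_exists` method and the hypothetical `_read_metadata` helper sketched earlier:

```python
def diagnose_missing_content(backend, content_hash: str) -> None:
    if not backend.content_exists(content_hash):
        print(f"{content_hash}: not in CAS (deleted or never written)")
        return
    meta = _read_metadata(content_hash)
    print(f"{content_hash}: ref_count={meta['ref_count']}, size={meta['size']}")
```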
Disk Usage Growing¶
Problem: CAS directory growing too large
Solutions:
-
Check deduplication rate:
-
Run garbage collection:
-
Check for large files:
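A hypothetical deduplication-rate estimate that scans the `.meta` files (assuming the JSON layout sketched earlier). Logical bytes count every reference; physical bytes count each blob once:

```python
import json
from pathlib import Path

def dedup_ratio(cas_root: Path) -> float:
    """Ratio of logical (referenced) bytes to physical (stored) bytes."""
    physical = logical = 0
    for meta_path in cas_root.rglob("*.meta"):
        meta = json.loads(meta_path.read_text())
        physical += meta["size"]
        logical += meta["size"] * meta["ref_count"]
    return logical / physical if physical else 1.0

# e.g. dedup_ratio(Path("/var/nexus/backend/cas")) -> 3.0 means 3x savings
```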
Hash Mismatch¶
Problem: Content corruption detected (hash doesn't match)
Causes:

- Disk corruption
- Incomplete write (power failure)
- Backend storage issue
Recovery:
```python
import hashlib

# Verify content integrity
content = backend.read_content(content_hash)
computed_hash = hashlib.sha256(content).hexdigest()
if computed_hash != content_hash:
    # CORRUPTED! Restore from backup or re-upload
    ...
```
FAQ¶
Q: Can I modify content in-place?¶
A: No. CAS content is immutable. Modifying a file creates a new version with a new hash. Old versions are preserved in version history.
Q: What happens if I delete a file but other paths reference it?¶
A: Reference count is decremented. Content remains until ref_count reaches 0.
Q: How much storage overhead does CAS add?¶
A: Minimal. Metadata is ~100 bytes per file. For large files, overhead is <0.1%.
Q: Can I use CAS with S3 or Azure?¶
A: Yes! Implement the Backend interface for your storage:

- S3Backend: Use an S3 bucket with hash-based keys
- AzureBackend: Use Azure Blob Storage
- Same CAS benefits (deduplication, versioning)
Q: How does CAS compare to Git?¶
A: Git uses CAS for commits and trees. Nexus uses CAS for file content. Similar benefits (deduplication, immutability), different scope (filesystems vs version control).
Next Steps¶
- Mounts & Backends - Multi-backend routing
- Memory System - CAS for agent memory
- Version History - Time-travel with CAS
- API Reference: Backend API - Complete API docs
Related Files¶
- Interface: src/nexus/backends/backend.py:1
- Local: src/nexus/backends/local.py:81
- GCS: src/nexus/backends/gcs.py:1
- Models: src/nexus/storage/models.py:346
- Cache: src/nexus/storage/content_cache.py:1
- Tests: tests/unit/backend/test_local_backend.py:172