Mounts & Backends¶
What are Mounts and Backends?¶
Backends are storage systems (local filesystem, S3, GCS, databases) that store your actual data. Mounts route virtual filesystem paths to specific backends, allowing you to use multiple storage systems through a unified API.
Traditional Approach vs Mounts¶
| Traditional | With Mounts |
|---|---|
| ❌ Hardcode storage location | ✅ Configure at runtime |
| ❌ Rewrite code to switch storage | ✅ Change mount, keep code |
| ❌ One storage system per app | ✅ Multiple backends simultaneously |
| ❌ Manual routing logic | ✅ Automatic path-based routing |
Key Innovation: One filesystem API, multiple storage backends.
Core Concepts¶
Backends (Physical Storage)¶
A backend is where data is actually stored:
```python
from nexus.backends import LocalBackend, GCSBackend

# Local filesystem backend
local = LocalBackend(root_dir="/var/nexus/data")

# Google Cloud Storage backend
gcs = GCSBackend(bucket="my-bucket", project="my-project")
```
Available backends:

- `LocalBackend` - Local filesystem
- `GCSBackend` - Google Cloud Storage
- Custom backends (S3, Azure, PostgreSQL, Redis, MongoDB)
Mounts (Routing Rules)¶
A mount maps a virtual path to a backend:
```python
# Mount local backend at /workspace
nx.mount("/workspace", local_backend)

# Mount GCS backend at /cloud
nx.mount("/cloud", gcs_backend)

# Now use both through the same API:
nx.write("/workspace/local.txt", b"Local file")
nx.write("/cloud/remote.txt", b"Cloud file")
```
Architecture:
```mermaid
graph LR
    A[Agent writes to<br/>/workspace/file.txt] --> R[PathRouter]
    B[Agent writes to<br/>/cloud/data.json] --> R
    R -->|matches /workspace| L[LocalBackend<br/>/var/nexus/data]
    R -->|matches /cloud| G[GCSBackend<br/>gs://my-bucket]
    L -->|writes| D1[/var/nexus/data/cas/ab/cd/...]
    G -->|uploads| D2[gs://my-bucket/cas/ab/cd/...]
```
Path Routing¶
How Routing Works¶
The PathRouter uses longest-prefix matching (like IP routing):
```python
# Mount configuration
nx.mount("/workspace", local_backend, priority=0)
nx.mount("/workspace/shared", gcs_backend, priority=10)
nx.mount("/cloud", gcs_backend, priority=0)

# Routing examples
nx.write("/workspace/file.txt", ...)        # → local_backend
nx.write("/workspace/shared/doc.txt", ...)  # → gcs_backend (longer prefix)
nx.write("/cloud/data.json", ...)           # → gcs_backend
```
Routing algorithm:

1. Collect all mounts whose prefix matches the path
2. Sort by priority (DESC), then prefix length (DESC)
3. Return the first match (longest prefix, highest priority)
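To make the algorithm concrete, here is a minimal pure-Python sketch of those three steps (illustrative only, not the actual `PathRouter`; `resolve` is a hypothetical name, and backends are represented as strings):

```python
def resolve(mounts, path):
    """mounts: list of (prefix, backend, priority). Returns backend or None."""
    # 1. Keep mounts whose prefix matches the path
    candidates = [
        (prefix, backend, priority)
        for prefix, backend, priority in mounts
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    if not candidates:
        return None
    # 2. Sort by priority (DESC), then prefix length (DESC)
    candidates.sort(key=lambda m: (m[2], len(m[0])), reverse=True)
    # 3. First match wins
    return candidates[0][1]

mounts = [
    ("/workspace", "local", 0),
    ("/workspace/shared", "gcs", 10),
    ("/cloud", "gcs", 0),
]
print(resolve(mounts, "/workspace/file.txt"))        # local
print(resolve(mounts, "/workspace/shared/doc.txt"))  # gcs
print(resolve(mounts, "/cloud/data.json"))           # gcs
```

Note the `rstrip("/") + "/"` guard: it prevents `/workspace` from matching a sibling path like `/workspace-old/file.txt`.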
Mount Priority¶
When multiple mounts match, priority determines winner:
```python
# Both match /workspace/file.txt
nx.mount("/workspace", backend_a, priority=0)
nx.mount("/workspace", backend_b, priority=10)  # Higher priority wins

nx.write("/workspace/file.txt", ...)  # → backend_b (priority=10)
```
Use case: Override specific paths with higher-priority mounts.
Backend Types¶
LocalBackend¶
Stores files on local filesystem with Content-Addressable Storage (CAS):
```python
from nexus.backends import LocalBackend

backend = LocalBackend(
    root_dir="/var/nexus/data",
    content_cache_size_mb=256,  # Optional LRU cache
)

nx.mount("/workspace", backend)
```
Directory structure:
```
/var/nexus/data/
├── cas/                          # Content-addressable storage
│   ├── ab/
│   │   └── cd/
│   │       ├── abcd1234...       # Content file
│   │       └── abcd1234....meta  # Metadata (ref_count, size)
└── metadata.db                   # SQLite metadata store
```
Features:

- SHA-256 content hashing
- Automatic deduplication
- Reference counting
- Thread-safe file locking
- Cross-platform (Windows, macOS, Linux)
GCSBackend¶
Stores files in Google Cloud Storage:
```python
from nexus.backends import GCSBackend

backend = GCSBackend(
    bucket="my-bucket",
    project="my-project",  # Optional
    credentials_path="/path/to/service-account.json",  # Optional
)

nx.mount("/cloud", backend)
```
Authentication methods (priority order):

1. `credentials_path` parameter
2. `GOOGLE_APPLICATION_CREDENTIALS` env var
3. Application Default Credentials (`gcloud auth`)
4. GCE/Cloud Run service account
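The first two steps of that fallback chain can be sketched in plain Python (a hypothetical helper for illustration; the real `GCSBackend` delegates credential discovery to the google-auth library, which also handles steps 3 and 4):

```python
import os

def resolve_credentials(credentials_path=None):
    """Return (source, path) describing which credential source would be used."""
    if credentials_path:
        # 1. Explicit parameter always wins
        return ("explicit", credentials_path)
    env = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if env:
        # 2. Environment variable is checked next
        return ("env", env)
    # 3./4. Fall through to Application Default Credentials
    #       (gcloud login or the GCE/Cloud Run metadata server)
    return ("adc", None)
```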
Storage structure:
```
gs://my-bucket/
├── cas/ab/cd/abcd1234...       # Content blob
└── cas/ab/cd/abcd1234....meta  # Metadata blob
```
Features:

- Unlimited scalability
- Global distribution
- Automatic redundancy
- Same CAS layout as LocalBackend
Custom Backends¶
Implement the Backend interface:
```python
import hashlib

from nexus.backends import Backend
from nexus.contracts.types import OperationContext

class S3Backend(Backend):
    @property
    def name(self) -> str:
        return "s3"

    def write_content(self, content: bytes, context: OperationContext) -> str:
        """Write content, return its SHA-256 hash."""
        content_hash = hashlib.sha256(content).hexdigest()
        ...  # Upload to S3 under the CAS key for content_hash
        return content_hash

    def read_content(self, content_hash: str, context: OperationContext) -> bytes:
        """Read content by hash."""
        ...  # Download from S3
        raise NotImplementedError

    def delete_content(self, content_hash: str, context: OperationContext) -> None:
        """Delete content (called when the ref count reaches zero)."""
        ...  # Delete from S3

    def content_exists(self, content_hash: str, context: OperationContext) -> bool:
        """Check whether content exists."""
        ...  # HEAD request against S3
        raise NotImplementedError
```
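For reference, the same four-method contract can be exercised with a runnable in-memory stand-in (duck-typed so it runs without the `nexus` package; the real interface also threads an `OperationContext` through each call, elided here):

```python
import hashlib

class MemoryBackend:
    """In-memory sketch of the Backend contract: hash-addressed blobs in a dict."""

    name = "memory"

    def __init__(self):
        self._blobs = {}  # content_hash -> bytes

    def write_content(self, content: bytes) -> str:
        content_hash = hashlib.sha256(content).hexdigest()
        self._blobs[content_hash] = content  # idempotent: same hash, same bytes
        return content_hash

    def read_content(self, content_hash: str) -> bytes:
        return self._blobs[content_hash]

    def delete_content(self, content_hash: str) -> None:
        self._blobs.pop(content_hash, None)

    def content_exists(self, content_hash: str) -> bool:
        return content_hash in self._blobs
```

Writing the same bytes twice returns the same hash and stores one blob, which is exactly the property the CAS layer relies on.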
Mount Management¶
Creating Mounts¶
```python
# Python API
nx.mount(
    path="/workspace",
    backend=local_backend,
    priority=0,    # Optional, default=0
    enabled=True,  # Optional, default=True
    metadata={},   # Optional metadata
)
```

```bash
# CLI (future)
nexus mounts add /workspace local --root /var/nexus/data
nexus mounts add /cloud gcs --bucket my-bucket
```
Listing Mounts¶
```python
# Get all mounts
mounts = nx.list_mounts()

for mount in mounts:
    print(f"{mount['path']} → {mount['backend_name']} (priority={mount['priority']})")

# Output:
# /workspace/shared → gcs (priority=10)
# /workspace → local (priority=0)
# /cloud → gcs (priority=0)
```
Removing Mounts¶
```python
# Unmount by path
nx.unmount("/workspace/shared")

# Unmount all
for mount in nx.list_mounts():
    nx.unmount(mount['path'])
```
Built-in Namespaces¶
Nexus provides 5 built-in namespaces for common use cases:
| Namespace | Path Format | Read-Only | Purpose |
|---|---|---|---|
| workspace | /workspace/{path} | No | User workspaces with ReBAC |
| shared | /shared/{tenant}/{path} | No | Tenant shared data |
| external | /external/{path} | No | Pass-through backends |
| system | /system/{path} | Yes | System metadata (admin-only) |
| archives | /archives/{tenant}/{path} | Yes | Read-only cold storage |
workspace Namespace¶
User-specific workspaces with fine-grained permissions:
```python
# Alice's workspace
nx.write("/workspace/alice/notes.txt", b"My notes")

# Bob's workspace
nx.write("/workspace/bob/project.md", b"Project docs")

# ReBAC controls who can access what
nx.rebac.grant("user", "alice", "owner", "file", "/workspace/alice/")
```
shared Namespace¶
Tenant-wide shared storage:
```python
# Shared team files
nx.write("/shared/acme/team-docs/policy.pdf", pdf_data)

# All acme tenant users can access (based on ReBAC)
nx.rebac.grant("group", "acme-team", "viewer", "file", "/shared/acme/")
```
external Namespace¶
Direct backend pass-through (no ReBAC, minimal overhead):
```python
# Mount external data source
nx.mount("/external/raw-data", s3_backend)

# Fast read/write without permission checks
data = nx.read("/external/raw-data/sensor-logs.csv")
```
system Namespace¶
System metadata (read-only, admin access only):
```python
# System config (read-only)
config = nx.read("/system/config/settings.json")

# Write forbidden (even for admins; use the dedicated admin API)
nx.write("/system/config/settings.json", ...)  # ❌ Raises error
```
archives Namespace¶
Historical/read-only data:
```python
# Mount archive backend
nx.mount("/archives/acme", glacier_backend)

# Read archived data
old_data = nx.read("/archives/acme/2023/reports/q1.pdf")

# Write forbidden
nx.write("/archives/acme/new.pdf", ...)  # ❌ Read-only namespace
```
Multi-Backend Scenarios¶
Scenario 1: Local + Cloud Hybrid¶
```python
# Local for hot data
local = LocalBackend("/var/nexus/hot")
nx.mount("/workspace", local)

# Cloud for cold storage
gcs = GCSBackend("my-archive-bucket")
nx.mount("/archives", gcs)

# Use both
nx.write("/workspace/active.txt", b"Working on this")
old_data = nx.read("/archives/2023/report.pdf")
```
Scenario 2: Multi-Region¶
```python
# US region
us_gcs = GCSBackend("us-bucket", region="us-east1")
nx.mount("/us", us_gcs)

# EU region
eu_gcs = GCSBackend("eu-bucket", region="europe-west1")
nx.mount("/eu", eu_gcs)

# Route by geography
nx.write("/us/user-data.json", us_user_data)
nx.write("/eu/user-data.json", eu_user_data)
```
Scenario 3: Development → Production¶
```python
# Development: local backend
if env == "dev":
    nx.mount("/workspace", LocalBackend("/tmp/nexus-dev"))
# Production: cloud backend
elif env == "prod":
    nx.mount("/workspace", GCSBackend("prod-bucket"))

# Code stays the same!
nx.write("/workspace/file.txt", data)
```
Content-Addressable Storage (CAS)¶
How CAS Works¶
All backends use SHA-256 content hashing for storage:
```python
import hashlib

# Write file
content = b"Hello World"
content_hash = hashlib.sha256(content).hexdigest()
# → "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e"

# Storage path:
# cas/a5/91/a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
```
Benefits:

- ✅ Automatic deduplication - same content stored once
- ✅ Immutable history - content never changes (hash = identity)
- ✅ Efficient storage - no duplicate data
- ✅ Integrity verification - hash mismatch = corruption
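The shard layout (`cas/<first 2 hex chars>/<next 2 hex chars>/<full hash>`) can be computed directly from the content:

```python
import hashlib

def cas_path(content: bytes) -> str:
    """Compute the CAS storage path for a blob of content."""
    h = hashlib.sha256(content).hexdigest()
    # Two 2-character directory levels keep any single directory small
    return f"cas/{h[:2]}/{h[2:4]}/{h}"

print(cas_path(b"Hello World"))
# cas/a5/91/a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
```

Because the path is a pure function of the bytes, two writes of identical content always land on the same blob, which is what makes deduplication automatic.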
Reference Counting¶
Each content blob has a reference count:
```python
# First write
nx.write("/file1.txt", b"content")  # ref_count=1

# Same content, different path
nx.write("/file2.txt", b"content")  # ref_count=2 (deduped!)

# Delete first file
nx.delete("/file1.txt")  # ref_count=1 (content remains)

# Delete second file
nx.delete("/file2.txt")  # ref_count=0 (content deleted)
```
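The lifecycle traced above can be reproduced with a toy in-memory store (illustrative only; the real backends persist blobs and their `.meta` files on disk or in object storage):

```python
import hashlib

class RefCountedStore:
    """Toy CAS store: dedupes by hash, reclaims blobs at ref_count == 0."""

    def __init__(self):
        self.blobs = {}  # content_hash -> bytes
        self.refs = {}   # content_hash -> ref_count
        self.paths = {}  # virtual path -> content_hash

    def write(self, path, content):
        h = hashlib.sha256(content).hexdigest()
        if h not in self.blobs:
            self.blobs[h] = content  # store the bytes only once
            self.refs[h] = 0
        self.refs[h] += 1
        self.paths[path] = h

    def delete(self, path):
        h = self.paths.pop(path)
        self.refs[h] -= 1
        if self.refs[h] == 0:
            # Last reference gone: reclaim the blob
            del self.blobs[h], self.refs[h]

store = RefCountedStore()
store.write("/file1.txt", b"content")  # ref_count=1
store.write("/file2.txt", b"content")  # ref_count=2, blob deduped
store.delete("/file1.txt")             # ref_count=1, content remains
store.delete("/file2.txt")             # ref_count=0, content deleted
```

(The sketch ignores overwrites of an existing path; the real metadata store handles versioning as well.)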
Each blob's companion `.meta` file records its bookkeeping data (`ref_count`, `size`).
Virtual Paths vs Physical Storage¶
Virtual Filesystem¶
Users interact with human-readable virtual paths (e.g. `/workspace/alice/document.pdf`); where the bytes physically live is an implementation detail.
Physical Storage¶
Data is stored by content hash:
```
# LocalBackend physical layout
/var/nexus/data/cas/ab/cd/abcd1234...  # document.pdf content
/var/nexus/data/cas/ef/gh/efgh5678...  # backup.zip content

# GCSBackend physical layout
gs://my-bucket/cas/ab/cd/abcd1234...
gs://my-bucket/cas/ef/gh/efgh5678...
```
Mapping¶
Metadata store maps virtual → physical:
```json
{
  "virtual_path": "/workspace/alice/document.pdf",
  "content_hash": "abcd1234...",
  "backend_id": "local",
  "size": 102400,
  "version": 1
}
```
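Putting the pieces together, resolving a virtual path is a metadata lookup followed by the CAS shard-path computation. A hypothetical sketch (`physical_location`, the in-memory `metadata_store`, and the truncated hash are illustrative, not real API):

```python
# Toy metadata store: virtual path -> record (fields follow the JSON above)
metadata_store = {
    "/workspace/alice/document.pdf": {
        "content_hash": "abcd1234",  # truncated, illustrative
        "backend_id": "local",
    }
}

def physical_location(virtual_path: str) -> str:
    """Map a virtual path to '<backend>:<cas path>' via the metadata record."""
    record = metadata_store[virtual_path]
    h = record["content_hash"]
    return f"{record['backend_id']}:cas/{h[:2]}/{h[2:4]}/{h}"

print(physical_location("/workspace/alice/document.pdf"))
# local:cas/ab/cd/abcd1234
```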
Configuration Examples¶
Basic Setup (Local Only)¶
```python
from nexus import NexusFS
from nexus.backends import LocalBackend

# Create backend
backend = LocalBackend("/var/nexus/data")

# Create filesystem
nx = NexusFS(backend=backend, is_admin=True)

# Use it
nx.write("/file.txt", b"content")
```
Multi-Backend Setup¶
```python
from nexus import NexusFS
from nexus.backends import LocalBackend, GCSBackend

# Create filesystem with a default backend
backend = LocalBackend(root_dir="/tmp/nexus-data")
nx = NexusFS(backend=backend, is_admin=True)

# Mount local for workspace
local = LocalBackend("/var/nexus/local")
nx.mount("/workspace", local)

# Mount GCS for cloud storage
gcs = GCSBackend("my-bucket", project="my-project")
nx.mount("/cloud", gcs)

# Mount GCS for archives (same backend, different mount)
nx.mount("/archives", gcs, priority=5)

# Use all mounts
nx.write("/workspace/local.txt", b"local")
nx.write("/cloud/remote.json", b'{"cloud": true}')
nx.write("/archives/old-data.csv", b"archived")
```
Priority Override¶
```python
# Default: all workspace files go to local
nx.mount("/workspace", local_backend, priority=0)

# Override: /workspace/shared goes to cloud
nx.mount("/workspace/shared", gcs_backend, priority=10)

# Routing
nx.write("/workspace/file.txt", ...)        # → local (priority=0)
nx.write("/workspace/shared/doc.txt", ...)  # → gcs (priority=10, longer prefix)
```
CLI Commands¶
List Mounts¶
```bash
nexus mounts list

# Output:
# PATH               BACKEND  PRIORITY  ENABLED
# /workspace/shared  gcs      10        yes
# /workspace         local    0         yes
# /cloud             gcs      0         yes
```
Add Mount¶
```bash
# Add local mount
nexus mounts add /workspace local --root /var/nexus/data

# Add GCS mount
nexus mounts add /cloud gcs --bucket my-bucket --project my-project

# Add with priority
nexus mounts add /workspace/shared gcs --bucket shared-bucket --priority 10
```
Remove Mount¶
Use the Python API shown under Removing Mounts: `nx.unmount("/workspace/shared")`.
FUSE Mount (Linux/macOS)¶
```bash
# Mount Nexus as a local filesystem
nexus fuse mount /mnt/nexus

# Now use normal tools
ls /mnt/nexus/workspace/
cat /mnt/nexus/workspace/file.txt
cp myfile.txt /mnt/nexus/workspace/

# Unmount
fusermount -u /mnt/nexus  # Linux
umount /mnt/nexus         # macOS
```
Best Practices¶
1. Use Namespaces¶
```python
# ✅ Good: Organized by namespace
nx.mount("/workspace", local_backend)
nx.mount("/archives", gcs_backend)

# ❌ Bad: Flat structure
nx.mount("/", mixed_backend)  # Everything in one backend
```
2. Separate Hot and Cold Data¶
```python
# ✅ Good: Hot data local, cold data cloud
nx.mount("/workspace", LocalBackend("/var/nexus/hot"))
nx.mount("/archives", GCSBackend("cold-storage"))

# ❌ Bad: All data in expensive cloud storage
nx.mount("/", GCSBackend("expensive-bucket"))
```
3. Use Priority for Overrides¶
```python
# ✅ Good: Override specific paths
nx.mount("/workspace", local, priority=0)
nx.mount("/workspace/shared", gcs, priority=10)  # Override

# ❌ Bad: Create new top-level path
nx.mount("/workspace-shared", gcs)  # Inconsistent naming
```
4. Test Before Production¶
```python
# ✅ Good: Environment-based config
backend = (
    LocalBackend("/tmp/nexus-dev") if env == "dev"
    else GCSBackend("prod-bucket")
)
nx.mount("/workspace", backend)

# ❌ Bad: Hardcode production
nx.mount("/workspace", GCSBackend("prod-bucket"))  # Always prod
```
5. Monitor Backend Health¶
```python
# Check backend connectivity
try:
    backend.content_exists("test-hash")
except Exception as e:
    logger.error(f"Backend unhealthy: {e}")
    # Fall back or alert
```
Performance Considerations¶
LocalBackend¶
- Read: ~5ms (uncached), ~0.1ms (cached)
- Write: ~10ms (includes hash computation)
- Cache: 256MB LRU cache by default
GCSBackend¶
- Read: ~50-200ms (network latency)
- Write: ~100-500ms (upload time)
- Batch: Use `batch_read_content()` for multiple files
Optimization Tips¶
- Cache hot data - Enable content cache for frequently accessed files
- Batch operations - Read multiple files in one call
- Use local for hot, cloud for cold - Hybrid approach
- Monitor latency - Alert on slow backend responses
Troubleshooting¶
Mount Not Found¶
Problem: nx.write("/path/file.txt", ...) raises "No mount found"
Fix: Verify that a mount covers the path with `nx.list_mounts()`, and create one with `nx.mount(...)` if it is missing.
Wrong Backend¶
Problem: File goes to wrong backend
Fix: Check mount priority:
```python
# List mounts (sorted by priority)
for mount in nx.list_mounts():
    print(f"{mount['path']} (priority={mount['priority']})")

# Higher priority wins
nx.mount("/workspace", backend_a, priority=0)
nx.mount("/workspace", backend_b, priority=10)  # This wins
```
Backend Connection Failed¶
Problem: GCS authentication fails
Fix: Check credentials:
```bash
# Set credentials
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Or use gcloud
gcloud auth application-default login

# Test connection
python -c "from google.cloud import storage; storage.Client().list_buckets()"
```
Slow Reads¶
Problem: Reads are slow
Solutions:

1. Enable the content cache (e.g. `content_cache_size_mb`) for frequently accessed files
2. Use batch reads (`batch_read_content()`) instead of one call per file
3. Check backend latency and alert on slow responses
FAQ¶
Q: Can I use multiple backends simultaneously?¶
A: Yes! That's the whole point of mounts. Mount different backends at different paths.
Q: What happens if I unmount while files are open?¶
A: Existing file operations complete, but new operations on that path will fail until remounted.
Q: Can I change backend without losing data?¶
A: Yes, but you need to migrate data:
```python
# Read from old
content = nx.read("/workspace/file.txt")

# Unmount old, mount new
nx.unmount("/workspace")
nx.mount("/workspace", new_backend)

# Write to new
nx.write("/workspace/file.txt", content)
```
Q: How do I backup data across backends?¶
A: Mount source and destination, then copy:
```python
nx.mount("/local", local_backend)
nx.mount("/backup", gcs_backend)

# Copy
for file in nx.readdir("/local"):
    content = nx.read(f"/local/{file.name}")
    nx.write(f"/backup/{file.name}", content)
```
Q: Are mounts persisted across restarts?¶
A: If using MountManager with database, yes. Otherwise, you need to recreate mounts on startup.
Next Steps¶
- Content-Addressable Storage - Deep dive into CAS
- Plugin System - Create custom backends
- Multi-Tenancy - Tenant isolation with mounts
- API Reference: Mount API - Complete API docs
Related Files¶
- Router: `src/nexus/core/router.py`
- Mount Manager: `src/nexus/core/mount_manager.py`
- Mounts Mixin: `src/nexus/core/nexus_fs_mounts.py`
- Backend Interface: `src/nexus/backends/backend.py`
- LocalBackend: `src/nexus/backends/local.py`
- GCSBackend: `src/nexus/backends/gcs.py`
- CLI: `src/nexus/cli/commands/mounts.py`
- FUSE: `src/nexus/fuse/mount.py`
- Tests: `tests/unit/core/test_router.py`