Nexus Kernel Architecture¶
Status: Active — kernel architecture SSOT
Rule: Keep this file small and precise. Prefer in-place edits over additions. Delegate details to federation-memo.md and data-storage-matrix.md.
1. Design Philosophy¶
NexusFS follows an OS-inspired layered architecture.
┌──────────────────────────────────────────────────────────────┐
│ SERVICES (user space) │
│ Installable/removable. ReBAC, Auth, Agents, Scheduler, etc. │
└──────────────────────────────────────────────────────────────┘
↓ protocol interface
┌──────────────────────────────────────────────────────────────┐
│ KERNEL │
│ Minimal compilable unit. VFS, MetastoreABC, │
│ ObjectStoreABC interface definitions. │
└──────────────────────────────────────────────────────────────┘
↓ dependency injection
┌──────────────────────────────────────────────────────────────┐
│ DRIVERS │
│ Pluggable at startup. redb, S3, LocalDisk, gRPC, etc. │
└──────────────────────────────────────────────────────────────┘
Interface Taxonomy¶
Every kernel interface belongs to exactly one of four categories:
| Category | Direction | Audience | Kernel relationship | API tier |
|---|---|---|---|---|
| User Contract (§2) | ↑ upward | Users, AI, agents, services | Kernel implements | Tier 1: Syscalls (sys_*) |
| HAL — Driver Contract (§3) | ↓ downward | Driver implementors | Kernel requires | Tier 2: 3 pillar ABCs |
| Kernel Primitive (§4) | internal | Kernel-internal only | Kernel owns | Tier 3: Kernel Module API (create_from_backend, register_resolver) |
| Kernel-Authored Standard (§5) | sideways | Services | Kernel defines but doesn't own | — (service standards, not kernel API) |
Tier 1 is the only user-facing interface. Tier 3 is for trusted kernel modules (federation resolvers, ACP) — analogous to Linux EXPORT_SYMBOL.
Swap Tiers¶
Follows Linux's monolithic kernel model, not microkernel:
| Tier | Swap time | Nexus | Syscall | Linux analogue |
|---|---|---|---|---|
| Static kernel | Never | MetastoreABC, VFS route(), syscall dispatch | — | vmlinuz core (scheduler, mm, VFS) |
| Drivers | Runtime mount/unmount | redb, S3, PostgreSQL, Dragonfly, SearchBrick | sys_setattr(DT_MOUNT) / rmdir | mount/umount |
| Services | Runtime register/swap/unregister | 40+ protocols (ReBAC, Mount, Auth, Agents, Search, Skills, ...) | sys_setattr("/__sys__/services/X") / sys_unlink | insmod/rmmod |
Invariant: Services depend on kernel interfaces, never the reverse. The kernel operates with zero services loaded. Kernel code (core/nexus_fs.py) has zero reads of service containers — all service wiring flows through ServiceRegistry (nx.service("name")), factory-injected closures (functools.partial), or KernelDispatch hooks. Services flow through sys_setattr("/__sys__/services/X") — factory uses the same syscall API as runtime callers (factory = first user).
Drivers are mounted at runtime via sys_setattr(entry_type=DT_MOUNT, backend=...), unmounted via rmdir. MetastoreABC is the only startup-time driver (sole kernel init param). Other drivers are mounted post-init by factory or at runtime.
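A minimal sketch of this mount/unmount contract. `DT_MOUNT` and the syscall names come from this document; the `MiniKernel` class, its dict-backed metastore, and the constant values are illustrative stand-ins, not the real NexusFS implementation.

```python
DT_DIR, DT_MOUNT = 1, 4  # illustrative entry-type constants (values assumed)

class MiniKernel:
    def __init__(self, metastore):
        # MetastoreABC is the only startup-time driver (sole init param).
        self.metastore = metastore
        self.mounts = {}  # in-memory mount table: path -> backend

    def sys_setattr(self, path, entry_type=None, backend=None, **attrs):
        if entry_type == DT_MOUNT:
            # Mounting is just a setattr with a backend attached.
            self.mounts[path] = backend
        self.metastore[path] = {"entry_type": entry_type, **attrs}

    def rmdir(self, path):
        # rmdir on a DT_MOUNT path doubles as umount.
        if self.metastore.get(path, {}).get("entry_type") == DT_MOUNT:
            self.mounts.pop(path, None)
        self.metastore.pop(path, None)

kernel = MiniKernel(metastore={})
kernel.sys_setattr("/mnt/s3", entry_type=DT_MOUNT, backend="s3-driver")
assert "/mnt/s3" in kernel.mounts
kernel.rmdir("/mnt/s3")          # unmount via rmdir
assert "/mnt/s3" not in kernel.mounts
```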
Service Lifecycle¶
factory/ acts as the init system (like systemd): creates selected services and injects them via DI. DeploymentProfile gates which bricks are constructed (see §7).
Factory boot sequence:
1. create_nexus_services() — _boot_pre_kernel_services() + _boot_independent_bricks() + _boot_dependent_bricks()
2. NexusFS() constructor — instantiate kernel primitives (no I/O, router passed directly)
3. _wire_services() — wire topology, boot post-kernel services, enlist into ServiceRegistry
4. _initialize_services() — register VFS hooks, IPC adapter bind
See factory/orchestrator.py for implementation.
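The phase ordering can be sketched as follows. The function names mirror the boot sequence above; the bodies are stand-ins that merely record call order, not the real orchestrator logic.

```python
calls = []

def _boot_pre_kernel_services():  calls.append("pre_kernel")
def _boot_independent_bricks():   calls.append("independent_bricks")
def _boot_dependent_bricks():     calls.append("dependent_bricks")
def make_kernel():                calls.append("NexusFS()")  # primitives only, no I/O
def _wire_services():             calls.append("wire")       # topology + ServiceRegistry enlist
def _initialize_services():       calls.append("init")       # VFS hooks + IPC adapter bind

def create_nexus_services():
    _boot_pre_kernel_services()
    _boot_independent_bricks()
    _boot_dependent_bricks()
    make_kernel()
    _wire_services()
    _initialize_services()

create_nexus_services()
assert calls == ["pre_kernel", "independent_bricks", "dependent_bricks",
                 "NexusFS()", "wire", "init"]
```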
Service Lifecycle Protocols¶
One-dimension model: the only user-facing lifecycle dimension is daemon vs on-demand (PersistentService protocol). Hook management uses duck-typed hook_spec() — the kernel auto-captures hooks via hasattr(instance, 'hook_spec') at enlist() time.
| Mechanism | Methods | Kernel auto-manages |
|---|---|---|
| PersistentService protocol | start(), stop() | start() on bootstrap (dependency order); stop() on shutdown (reverse order) |
| Duck-typed hook_spec() | hook_spec() → HookSpec | Hook registration into KernelDispatch at enlist() time; unregister at shutdown |
One-click contract: implement protocol / hook_spec() → ServiceRegistry.enlist() → kernel handles the rest. ServiceRegistry (kernel-owned, lifecycle integrated) scans the registry and auto-calls the appropriate methods during NexusFS.bootstrap() / NexusFS.close().
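A minimal sketch of the one-click contract, assuming simplified shapes for the registry and `HookSpec`; the real ServiceRegistry orders services by dependency and the real kernel classifies hooks further.

```python
class HookSpec:
    def __init__(self, ops): self.ops = ops

class MiniRegistry:
    def __init__(self):
        self.services, self.hooks = {}, {}

    def enlist(self, name, instance):
        self.services[name] = instance
        # Duck-typed: auto-capture hooks if the service exposes hook_spec().
        if hasattr(instance, "hook_spec"):
            self.hooks[name] = instance.hook_spec()

    def bootstrap(self):
        for svc in self.services.values():                  # dependency order (simplified)
            if hasattr(svc, "start"): svc.start()

    def close(self):
        for svc in reversed(list(self.services.values())):  # reverse order
            if hasattr(svc, "stop"): svc.stop()

class AuditService:                      # implements both mechanisms
    def __init__(self): self.running = False
    def start(self): self.running = True
    def stop(self):  self.running = False
    def hook_spec(self): return HookSpec(ops=["write", "delete"])

reg = MiniRegistry()
reg.enlist("audit", AuditService())
reg.bootstrap()
assert reg.services["audit"].running
assert reg.hooks["audit"].ops == ["write", "delete"]
reg.close()
assert not reg.services["audit"].running
```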
swap_service() supports all services (#1452). Unified path: refcount drain → unhook old → replace → rehook new.
AgentRegistry (core.agent_registry): In-memory agent process table (task_struct analogue). Not a kernel primitive — accessed via nx.service("agent_registry"). See core/agent_registry.py.
Kernel DI patterns (two mechanisms, never reads service containers directly):
| Pattern | Kernel __init__ | Factory _do_link() | Example |
|---|---|---|---|
| Kernel owns | Creates instance | — | VFSLockManager, LockManager (advisory), KernelDispatch, PipeManager, StreamManager, FileWatcher, ServiceRegistry, DriverLifecycleCoordinator |
| Kernel knows (sentinel) | self._x = None | Injects real value; None = graceful degrade | _token_manager, _sandbox_manager, _coordination_client, _event_client |
"Kernel knows" follows the Linux LSM pattern: kernel declares a default (None), factory overrides at link-time. The kernel never imports service-layer modules. AgentRegistry is accessed via ServiceRegistry (register_factory), not as a kernel sentinel — no-agent profiles (REMOTE) never construct it.
Permission enforcement is fully delegated to KernelDispatch INTERCEPT hooks (PermissionCheckHook). No hook registered = no check = zero overhead.
Zone identity: self._zone_id = ROOT_ZONE_ID — kernel namespace partition (analogous to Linux sb->s_dev). PathRouter canonicalizes all paths to /{zone_id}/{path} for zone-aware LPM routing. Standalone: always "root". Federation: set at link time. All primitives (LockManager, FileEvent) receive canonical paths — zone handling is PathRouter's responsibility, not theirs.
Source of truth: contracts/protocols/service_lifecycle.py
Entry Point: connect()¶
connect(config=...) is the mode-dispatcher factory function — the single entry point for all Nexus users. It auto-detects deployment mode (standalone/remote/federation), bootstraps the appropriate stack, and returns NexusFilesystem.
from nexus.sdk import connect
nx = connect() # auto-detect from env/config
nx = connect(config={"profile": "remote", "url": "http://..."})
Linux analogue: the boot sequence that selects rootfs and mounts it (mount_root() in init/do_mounts.c). After connect() returns, you have a usable filesystem. All three modes return the same NexusFilesystem contract — clients never need to know which mode is running.
Not DI — it's the user-facing entry point. The factory/DI machinery is internal.
2. User Contract — Syscall Interface¶
Category: User Contract (↑) | Audience: Users, AI, agents | Package: contracts.filesystem, core.nexus_fs
2.1 NexusFilesystem — Published Contract¶
The published user-facing contract is NexusFilesystem (Protocol, in contracts/filesystem/):
| Tier | Content | Caller responsibility |
|---|---|---|
| Tier 1 (abstract) | sys_* kernel syscalls | Implementors MUST override |
| Tier 2 (concrete) | Convenience methods composing Tier 1 (mkdir, rmdir, read, write, …) | Inherit — no override needed |
Relationship: POSIX spec (contract) vs Linux kernel (implementation) — clients program against the contract, kernel implements it.
2.2 Kernel Syscalls — POSIX-Aligned, Path-Addressed¶
NexusFS is the kernel implementation of NexusFilesystem. It wires primitives (§4) into user-facing operations. NexusFS contains no service business logic.
Kernel syscalls, all POSIX-aligned, all path-addressed:
| Plane | Syscalls |
|---|---|
| Metadata | sys_stat, sys_setattr, sys_rename, sys_unlink, sys_readdir |
| Content | sys_read (pread), sys_write (pwrite), sys_copy |
| Locking | sys_lock (acquire + extend), sys_unlock (release + force) |
| Watch | sys_watch (inotify) |
sys_setattr is the universal creation/management syscall: mkdir = sys_setattr(entry_type=DT_DIR), mount = sys_setattr(entry_type=DT_MOUNT, backend=...), umount = rmdir on DT_MOUNT path.
Lock operations are consolidated into two syscalls (POSIX fcntl(F_SETLK) pattern):
- sys_lock(path, lock_id=None) — acquire (lock_id=None) or extend TTL (lock_id=existing)
- sys_unlock(path, lock_id=None, force=False) — release by lock_id or force-release all holders
- Lock state: sys_stat(path, include_lock=True) — zero cost when False (default)
- Lock listing: sys_readdir("/__sys__/locks/") — virtual namespace (like /proc/locks)

/__sys__/ paths are kernel management operations (not filesystem metadata): sys_setattr("/__sys__/services/X", service=inst) registers, sys_unlink("/__sys__/services/X") unregisters.
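A toy model of the two-syscall lock surface. It assumes TTL extension is a timestamp bump and uses an in-memory dict; the real locks live in the sm_locks redb table with holder tracking.

```python
import uuid, time

class LockTable:
    def __init__(self):
        self.locks = {}                          # path -> {lock_id: expiry}

    def sys_lock(self, path, lock_id=None, ttl=30.0):
        holders = self.locks.setdefault(path, {})
        if lock_id is None:                      # acquire: mint a new id
            lock_id = uuid.uuid4().hex
        holders[lock_id] = time.monotonic() + ttl  # acquire or extend TTL
        return lock_id

    def sys_unlock(self, path, lock_id=None, force=False):
        if force:                                # force-release all holders
            self.locks.pop(path, None)
        elif lock_id is not None:
            self.locks.get(path, {}).pop(lock_id, None)

t = LockTable()
lid = t.sys_lock("/a.txt")                       # acquire
assert lid in t.locks["/a.txt"]
assert t.sys_lock("/a.txt", lock_id=lid) == lid  # extend: same id, later expiry
t.sys_unlock("/a.txt", lock_id=lid)              # release
assert t.locks["/a.txt"] == {}
```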
Primitive usage pattern:
- Mutating syscalls (write, unlink, rename, rmdir): full pipeline — VFSRouter → VFSLock → KernelDispatch (3-phase) → Metastore → FileEvent
- Read: same pipeline minus FileEvent (reads are not mutations)
- Read-only metadata (stat, access, readdir, is_directory): direct Metastore lookup only — no routing, locking, or dispatch
- setattr: Metastore-only (Tier 2 mkdir adds routing + hooks)
See syscall-design.md for the full per-syscall primitive matrix.
2.3 Tier 2 Convenience Methods¶
Tier 2 methods compose Tier 1 syscalls — concrete implementations in NexusFilesystem:
| Half | Examples | Addressing |
|---|---|---|
| VFS half (POSIX-aligned) | mkdir(), rmdir(), read(), write(consistency=), append(), edit(), write_batch(), access(), is_directory(), lock(), locked(), glob(), grep(), service() | Path-addressed, delegates to sys_* |
| HDFS half (driver-level) | read_content(), write_content(), stream(), stream_range(), write_stream() | Hash-addressed (etag/CAS), direct to ObjectStoreABC |
The HDFS half bypasses path resolution and metadata lookup — CAS is a driver detail. Like HDFS separates ClientProtocol (NameNode, path-based) from DataTransferProtocol (DataNode, block-based). The metadata layer above ensures etag ownership and zone isolation.
Kernel-managed metadata side effects (POSIX generic_write_end pattern): kernel updates mtime, size, version, etag in VFS lock after backend.write_content(). Drivers only manage content. "sc" (strong, default) or "ec" (eventual, local-first) consistency.
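A sketch of this generic_write_end-style split: the driver only stores bytes, while the kernel (inside the VFS lock) updates mtime, size, version, and etag itself. The classes and dict-backed metastore are illustrative, not the real HAL.

```python
import hashlib, time

class DumbBackend:                     # drivers only manage content
    def __init__(self): self.blobs = {}
    def write_content(self, data):
        etag = hashlib.sha256(data).hexdigest()
        self.blobs[etag] = data
        return etag

def kernel_write(meta, backend, path, data):
    etag = backend.write_content(data)           # HAL I/O
    entry = meta.setdefault(path, {"version": 0})
    entry.update(size=len(data), etag=etag,      # kernel-managed side effects
                 mtime=time.time(), version=entry["version"] + 1)
    return entry

meta, backend = {}, DumbBackend()
entry = kernel_write(meta, backend, "/doc.txt", b"hello")
assert entry["size"] == 5 and entry["version"] == 1
assert backend.blobs[entry["etag"]] == b"hello"
kernel_write(meta, backend, "/doc.txt", b"hello world")
assert meta["/doc.txt"]["version"] == 2
```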
2.4 VFS Dispatch (KernelDispatch)¶
The kernel provides callback-based dispatch at 6 VFS operation points (read, write, delete, rename, mkdir, rmdir) plus driver lifecycle events (mount, unmount). These are kernel-owned callback lists (implemented by KernelDispatch, §4) that any authorized caller populates.
Three-phase dispatch per VFS operation:
| Phase | Semantics | Short-circuit? | Linux Analogue |
|---|---|---|---|
| PRE-DISPATCH | First-match short-circuit | Yes (skips pipeline) | VFS file->f_op dispatch (procfs, sysfs) |
| INTERCEPT | Synchronous, ordered (pre + post) | Yes (abort/policy) | LSM security hooks |
| OBSERVE | Fire-and-forget | No | fsnotify() / notifier_call_chain() |
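The three phases can be sketched as plain callable lists; the real KernelDispatch uses a Rust HookRegistry and typed context dataclasses, so this is only a behavioral illustration of the short-circuit rules.

```python
class Dispatch:
    def __init__(self):
        self.pre, self.intercept, self.observe = [], [], []

    def run(self, ctx):
        for resolver in self.pre:                 # PRE-DISPATCH: first match handles all
            result = resolver(ctx)
            if result is not None:
                return result                     # skips the rest of the pipeline
        for hook in self.intercept:               # INTERCEPT: synchronous, may abort
            if hook(ctx) is False:
                raise PermissionError(ctx["path"])
        result = f"wrote {ctx['path']}"           # the pipeline itself (stub)
        for obs in self.observe:                  # OBSERVE: failures never abort
            try:
                obs(ctx)
            except Exception:
                pass
        return result

d = Dispatch()
d.pre.append(lambda c: "procfs" if c["path"].startswith("/__sys__/") else None)
d.intercept.append(lambda c: not c["path"].startswith("/secret/"))
seen = []
d.observe.append(lambda c: seen.append(c["path"]))
d.observe.append(lambda c: 1 / 0)                 # failing observer: ignored

assert d.run({"path": "/__sys__/locks/"}) == "procfs"   # PRE short-circuit
assert d.run({"path": "/a"}) == "wrote /a" and seen == ["/a"]
```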
Driver lifecycle hooks (Issue #1811):
| Phase | Semantics | Short-circuit? | Linux Analogue |
|---|---|---|---|
| MOUNT | Fire-and-forget on backend mount | No | file_system_type.mount() |
| UNMOUNT | Fire-and-forget on backend unmount | No | kill_sb() |
Mount/unmount hooks are dispatched by DriverLifecycleCoordinator (§4) via KernelDispatch. Backends declare mount hooks via hook_spec() (same pattern as VFS hooks). CASAddressingEngine uses on_mount for mount-time logging.
PRE-DISPATCH: VFSPathResolver instances checked in order; first match handles entire operation. Each resolver owns its own permission semantics.
INTERCEPT: Per-operation VFS*Hook protocols. Hooks receive a typed context dataclass, can modify context or abort. POST hooks support sync and async (classified by Rust HookRegistry). Audit is a factory-registered interceptor, not a kernel built-in.
OBSERVE: VFSObserver instances receive frozen FileEvent (§4.3) on all mutations. Strictly fire-and-forget — failures never abort the syscall. Observers needing causal ordering belong in INTERCEPT post-hooks, not OBSERVE.
Hook protocols and context dataclasses are defined in contracts/vfs_hooks.py (tier-neutral). Concrete implementations live in services/hooks/.
Registration API: Each phase has a symmetric register_*() / unregister_*() pair — runtime-callable by any authorized caller.
2.5 Mediation Principle¶
Users access HAL only through syscalls. For mutating syscalls the pipeline is: PRE-DISPATCH → route → INTERCEPT pre → lock → HAL I/O → unlock → INTERCEPT post → OBSERVE. See syscall-design.md for the full per-syscall flow.
Exception: Tier 2 hash-addressed operations (see §2.3 HDFS half) access ObjectStoreABC directly by etag, bypassing path resolution and metadata lookup.
3. HAL — Storage Driver Contracts¶
Category: HAL — Driver Contract (↓) | Audience: Driver implementors
NexusFS abstracts storage by Capability (access pattern + consistency guarantee), not by domain or implementation.
| Pillar | ABC | Capability | Kernel Role | Package |
|---|---|---|---|---|
| Metastore | MetastoreABC | Ordered KV, CAS, prefix scan, optional Raft SC | Required — sole kernel init param | core.metastore |
| ObjectStore | ObjectStoreABC (= Backend) | Streaming I/O, immutable blobs, petabyte scale | Interface only — instances mounted via nx.mount() | core.object_store |
| CacheStore | CacheStoreABC | Ephemeral KV, Pub/Sub, TTL | Optional — defaults to NullCacheStore | contracts.cache_store |
Orthogonality: Between pillars = different query patterns. Within pillars = interchangeable drivers (deployment-time config). See data-storage-matrix.md.
Kernel self-inclusiveness: Kernel boots with 1 pillar (Metastore). ObjectStore mounted post-init. Kernel does NOT need: JOINs, FK, vector search, TTL, pub/sub (all service-layer). Like Linux: kernel defines VFS + block device interface but doesn't ship a filesystem.
3.1 MetastoreABC — Inode Layer¶
Linux analogue: struct inode_operations
The typed contract between VFS and storage. Without it, the kernel cannot describe files. Operations: O(1) KV (get/put/delete), ordered prefix scan (list), batch ops, implicit directory detection. System config stored under /__sys__/ prefix.
Data type: FileMetadata — path, backend_name, etag, size, version, zone_id, owner_id, timestamps, mime_type. Always tagged with zone_id (P0 invariant). zone_id is a kernel namespace partition identifier (analogous to Linux sb->s_dev). Federation extends zones with Raft consensus groups, but the kernel owns the concept. owner_id is the kernel's posix_uid — used by PermissionEnforcerProtocol.check_owner() for O(1) DAC before service-layer hooks run. Audit trail (who created a file) is a service concern tracked by VersionRecorder, not a kernel inode field.
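An illustrative `FileMetadata` shape built from the field list above, frozen like a kernel inode record. The field defaults and ordering are assumptions for the sketch, not the real contract.

```python
from dataclasses import dataclass, field
import time

ROOT_ZONE_ID = "root"

@dataclass(frozen=True)
class FileMetadata:
    path: str
    backend_name: str
    etag: str
    size: int
    version: int = 1
    zone_id: str = ROOT_ZONE_ID       # P0 invariant: always zone-tagged
    owner_id: str = ""                # kernel posix_uid for O(1) DAC
    mtime: float = field(default_factory=time.time)
    mime_type: str = "application/octet-stream"

m = FileMetadata(path="/a.txt", backend_name="localdisk", etag="abc", size=3)
assert m.zone_id == "root"
try:
    object.__getattribute__(m, "size")            # reads are fine
    m.size = 4                                    # writes are not: record is frozen
    assert False
except Exception:
    pass
```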
3.2 ObjectStoreABC (= Backend) — Blob I/O¶
Linux analogue: struct file_operations
CAS-addressed blob storage: read/write/delete by etag (content hash), plus streaming variants. Directory ops (mkdir/rmdir/list_dir) for backends that support them. Rename is optional (capability-dependent).
3.3 CacheStoreABC — Ephemeral KV + Pub/Sub (Optional)¶
Linux analogue: /dev/shm + message bus
The only optional HAL pillar. Kernel defines the ABC (ephemeral KV + pub/sub); services consume it for caching, event fan-out, and session storage. Drivers: Dragonfly/Redis (production), InMemoryCacheStore (dev).
Graceful degradation: NullCacheStore (no-op) is the default. Without a real CacheStore, EventBus disables, permission/tiger caches fall back to RecordStore, and sessions stay in RecordStore. No kernel functionality is lost.
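A sketch of the no-op default, assuming a minimal CacheStoreABC surface (get/set/publish); the real ABC also covers TTL and subscriptions. The point is that a cache miss on every call degrades performance, never correctness.

```python
class NullCacheStore:
    """No-op default: every read misses, every write and publish is dropped."""
    def get(self, key): return None
    def set(self, key, value, ttl=None): pass
    def publish(self, channel, message): return 0   # zero subscribers reached

def cached_lookup(cache, key, compute):
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = compute()
    cache.set(key, value)             # silently dropped by NullCacheStore
    return value

cache = NullCacheStore()
calls = []
result = cached_lookup(cache, "perm:/a", lambda: calls.append(1) or "allow")
assert result == "allow" and len(calls) == 1
# Without a real CacheStore every lookup recomputes, but nothing breaks.
cached_lookup(cache, "perm:/a", lambda: calls.append(1) or "allow")
assert len(calls) == 2
```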
3.4 Dual-Axis ABC Architecture¶
Two independent ABC axes, composed via DI:
- Data ABCs (this section): WHERE is data stored? → 3 kernel pillars by storage capability
- Ops ABCs (§5.3): WHAT can users/agents DO? → 40+ scenario domains by ops affinity
A concrete class sits at the intersection: e.g. ReBACManager implements PermissionProtocol (Ops) and internally uses RecordStoreABC (Data). See ops-scenario-matrix.md for full proof.
3.5 Transport × Addressing Composition¶
Linux analogue: Block device driver (Transport) × filesystem (Addressing)
ObjectStoreABC backends decompose into two orthogonal axes: Transport (WHERE — raw key→bytes I/O) and Addressing Engine (HOW — CAS or Path). Every backend, including external API connectors, is a Transport composed with an addressing engine. REST APIs are filesystems: GET = fetch, PUT = store, DELETE = remove.
DT_EXTERNAL_STORAGE (entry_type=5): Mount-time detection via ConnectorRegistry.category for OAuth APIs and CLI tools.
See backend-architecture.md §2 for the full composition matrix and Transport protocol. See connector-transport-matrix.md for per-connector details.
4. Kernel Primitives¶
Category: Kernel Primitive (internal) | Audience: Kernel-internal | Package: core.*
Primitives mediate between user-facing syscalls and HAL drivers. Users interact with them indirectly through syscalls. See §2.2 for per-syscall usage.
| Primitive | Package | Linux Analogue | Role |
|---|---|---|---|
| VFSRouter | core.router | VFS lookup_slow() | route(path, zone_id) → RouteResult. Zone-canonical LPM (~30ns Rust / ~300ns Python). In-memory mount table keyed by /{zone_id}/{mount_point} |
| VFSLockManager | core.lock_fast | per-inode i_rwsem | Per-path RW lock with hierarchy-aware conflict detection. Details in §4.1 |
| LockManager (advisory) | lib.distributed_lock | flock(2) | Advisory locks via sys_lock/sys_unlock (acquire+extend / release+force). Zone-agnostic (receives canonical paths from router). Local: VFSSemaphore. Federation: RaftLockManager. Details in §4.4 |
| Dispatch (Rust Kernel + DispatchMixin) | core.nexus_fs_dispatch + rust/nexus_kernel/src/dispatch.rs | security_hook_heads + fsnotify | Three-phase VFS dispatch (§2.4) + driver lifecycle hooks (MOUNT/UNMOUNT). Rust Kernel owns PathTrie + HookRegistry + ObserverRegistry. DispatchMixin provides Python-side registration API. Empty = zero overhead |
| PipeManager + StreamManager | core.pipe_manager + core.stream_manager | pipe(2) + append-only log | VFS named IPC. DT_PIPE: destructive FIFO (RingBuffer). DT_STREAM: non-destructive offset reads (pluggable StreamBackend). Details in §4.2 |
| FileWatcher + FileEvent | core.file_watcher + core.file_events | inotify(7) + fsnotify_event | File change notification + immutable mutation records. Local OBSERVE waiters + optional RemoteWatchProtocol. Details in §4.3 |
| ServiceRegistry | core.service_registry | init/main.c + module.c | Kernel-owned symbol table + lifecycle orchestration (enlist/swap/shutdown). PersistentService + duck-typed hook_spec() |
| DriverLifecycleCoordinator | core.driver_lifecycle_coordinator | register_filesystem + kern_mount | Driver mount lifecycle: routing table + VFS hook registration + mount/unmount KernelDispatch notification |
4.1 VFSLockManager — Per-Path RW Lock¶
| Property | Value |
|---|---|
| Modes | "read" (shared) / "write" (exclusive) |
| Hierarchy awareness | Ancestor/descendant conflict detection |
| Latency | ~200ns (Rust PyO3) / ~500ns–1μs (Python fallback) |
| Scope | In-memory, process-scoped (crash → released), metadata-invisible |
| Lock release timing | Released BEFORE observers (like Linux inotify after i_rwsem) |
Advisory locks are a separate concern — see lock-architecture.md §4.
4.2 IPC Primitives — Named Pipes & Streams¶
Two-layer architecture for both: VFS metadata (inode) in MetastoreABC, data (bytes) in process heap buffer (like Linux kmalloc'd pipe buffer).
| Primitive | Linux Analogue | Buffer | Read |
|---|---|---|---|
| DT_PIPE | kfifo ring | RingBuffer | Destructive |
| DT_STREAM | append-only log | StreamBuffer | Non-destructive (offset-based) |
DT_PIPE (PipeManager + RingBuffer):
- PipeManager (mkpipe) — VFS named pipe lifecycle (created via sys_setattr upsert, read/write via sys_read/sys_write, destroyed via sys_unlink), per-pipe lock for MPMC safety. Reads are destructive (consumed on read).
- RingBuffer (kpipe) — Lock-free SPSC kernel primitive (kfifo analogue), no internal synchronization. PipeManager wraps with per-pipe asyncio.Lock for MPMC safety. Direct RingBuffer access is kernel-internal only.
DT_STREAM (StreamManager + pluggable StreamBackend):
- StreamManager (mkstream) — VFS named stream lifecycle (same syscall surface as mkpipe). Per-stream lock for concurrent writers. Reads are non-destructive — multiple readers maintain independent byte offsets (fan-out).
- StreamBackend protocol — pluggable backing store for DT_STREAM data. Mount configuration determines which backend is used when creating a stream under that mount (like Linux filesystem type determines pipe implementation). Implementations: StreamBuffer (in-memory, default), RemoteStreamBackend (federation gRPC proxy), WALStreamBackend (durable, EC WAL-backed for cross-node at-least-once), SHMStreamBackend (mmap shared memory, cross-process, ~1–5μs), StdioStream (OS subprocess pipe adapter for agent I/O).
- Mount-determined backend — _MountEntry.stream_backend_factory is baked at mount time. sys_setattr(entry_type=DT_STREAM) checks the enclosing mount's factory; if set, creates a custom backend instead of the default in-memory buffer.
See federation-memo.md §7j for design rationale.
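A toy contrast of the two read semantics. `RingBuffer` and `StreamBuffer` here are simplified single-threaded stand-ins for the kernel primitives (no locking, no capacity limits).

```python
from collections import deque

class RingBuffer:                     # DT_PIPE: destructive FIFO
    def __init__(self): self.q = deque()
    def write(self, chunk): self.q.append(chunk)
    def read(self): return self.q.popleft() if self.q else b""

class StreamBuffer:                   # DT_STREAM: append-only, offset reads
    def __init__(self): self.log = b""
    def write(self, chunk): self.log += chunk
    def read(self, offset): return self.log[offset:]

pipe = RingBuffer()
pipe.write(b"msg")
assert pipe.read() == b"msg" and pipe.read() == b""   # consumed on read

stream = StreamBuffer()
stream.write(b"msg")
# Two readers with independent offsets both see the data (fan-out).
assert stream.read(0) == b"msg" and stream.read(0) == b"msg"
assert stream.read(3) == b""                          # caught-up reader
```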
4.3 FileWatcher + FileEvent — File Change Notification¶
| Property | Value |
|---|---|
| Event types | FILE_WRITE, FILE_DELETE, FILE_RENAME, METADATA_CHANGE, DIR_CREATE, DIR_DELETE, SYNC_*, CONFLICT_* |
| FileEvent | Frozen dataclass: path, etag, size, version, zone_id, agent_id, user_id, vector_clock |
| FileWatcher (kernel-owned) | Local OBSERVE waiters — on_mutation() resolves in-memory futures (~0µs) |
| FileWatcher (kernel-knows) | Optional RemoteWatchProtocol for distributed watch, set via set_remote_watcher() |
| Emission point | Always AFTER lock release |
4.4 LockManager — Kernel Advisory Lock¶
| Property | Value |
|---|---|
| Linux analogue | flock(2) / fcntl(F_SETLK) |
| Package | lib.distributed_lock (LocalLockManager, RaftLockManager) |
| Storage | sm_locks redb table (separate from FileMetadata) |
| Lifecycle | Kernel-owned: LocalLockManager constructed in __init__; federation upgrades to RaftLockManager via _upgrade_lock_manager() |
Same pattern as FileWatcher: kernel-owned local + kernel-knows remote.
- Local: LocalLockManager wraps VFSSemaphore — exclusive (mutex), shared (RW), counting (semaphore)
- Remote: RaftLockManager wraps RaftMetadataStore.acquire_lock() — strong consistency via Raft consensus
- Syscalls: sys_lock (try-acquire, Tier 1), sys_unlock (release, Tier 1), lock() (blocking wait, Tier 2)
- Upgrade: _upgrade_lock_manager() called by factory at link time when federation is available
5. Kernel-Authored Standards¶
Category: Kernel-Authored Standard (≠ kernel interface) | Audience: Services
5.1 The "Standard Plug" Principle¶
The kernel defines contracts it doesn't own — so kernel infrastructure works automatically with any service that conforms.
Linux analogies:
| Linux pattern | What kernel defines | What modules provide | Kernel benefit |
|---|---|---|---|
| file_operations | Struct with read/write/ioctl pointers | Each filesystem fills the struct | VFS calls any filesystem uniformly |
| security_operations | Struct with 200+ LSM hook pointers | SELinux, AppArmor fill hooks | Security framework calls any LSM |
Nexus equivalent:
| Nexus pattern | What kernel defines | What services provide | Infrastructure benefit |
|---|---|---|---|
| RecordStoreABC | Session factory + read replica interface | PostgreSQL, SQLite drivers | Services get pooling, error translation, replica routing |
| VFS*Hook protocols | Hook shapes (context dataclasses) | Service-layer hook implementations | KernelDispatch calls any conforming hook uniformly |
| VFSSemaphoreProtocol | Named counting semaphore interface | lib.semaphore implementation | Advisory locks + CAS coordination use uniform semaphore API |
| Service Protocols | @runtime_checkable typed interfaces | Concrete service implementations | Typed contracts for service implementors |
Integration mechanisms: Factory auto-discovers bricks via brick_factory.py convention (RESULT_KEY + PROTOCOL + create()), validates protocol conformance at registration, and resolves kernel dependencies via EXPORT_SYMBOL() pattern (see §1 Service Lifecycle).
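A hypothetical brick-factory module following the convention named above (RESULT_KEY + PROTOCOL + create()). The `SearchProtocol`, `InMemorySearch`, and `discover` names are illustrative, not the real factory API.

```python
from typing import Protocol, runtime_checkable
import types

@runtime_checkable
class SearchProtocol(Protocol):
    def search(self, query: str) -> list: ...

# A brick_factory.py module would export these three names:
brick_factory = types.ModuleType("search_brick_factory")
brick_factory.RESULT_KEY = "search"
brick_factory.PROTOCOL = SearchProtocol

class InMemorySearch:
    def search(self, query): return [query]

brick_factory.create = lambda deps: InMemorySearch()

def discover(module, deps):
    instance = module.create(deps)
    # Factory validates protocol conformance at registration time.
    assert isinstance(instance, module.PROTOCOL), "non-conforming brick"
    return {module.RESULT_KEY: instance}

services = discover(brick_factory, deps={})
assert services["search"].search("q") == ["q"]
```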
5.2 RecordStoreABC — Relational Storage Standard¶
Package: storage.record_store | NOT a kernel interface — service-only
| Property | Value |
|---|---|
| Kernel role | Kernel defines the ABC; kernel does NOT consume it |
| Consumers | Services only (ReBAC, Auth, Agents, Scheduler, etc.) |
| Interface | session_factory + read_session_factory (SQLAlchemy ORM) |
| Drivers | PostgreSQL, SQLite (interchangeable without code changes) |
| Rule | Direct SQL or raw driver access is an abstraction break |
The kernel is the standards body — it defines the interface shape that forces driver implementors to provide pooling, error translation, read replica routing, WAL mode, async lazy init. Both sides (drivers and services) conform to the same interface; neither needs to know the other. The value comes from bilateral interface conformance, not from kernel providing these features directly.
5.3 Service Protocols — 40+ Scenario Domains¶
Package: contracts.protocols | NOT kernel interfaces — service standards
40+ typing.Protocol classes with @runtime_checkable, organized by domain (Permission, Search, Mount, Agent, Events, Memory, Domain, Audit, Cross-Cutting).
See ops-scenario-matrix.md §2–§3 for full enumeration and affinity matching.
5.4 VFSSemaphore — Named Counting Semaphore¶
Package: lib.semaphore | Protocol: contracts.protocols.semaphore.VFSSemaphoreProtocol
| Property | Value |
|---|---|
| POSIX analogue | sem_t (named semaphore, extended with TTL + holder tracking) |
| Kernel role | Kernel defines the protocol and provides the implementation in lib/; kernel does NOT own it as a primitive |
| Modes | Counting (N holders), mutex (max_holders=1) |
| Latency | ~200ns (Rust PyO3) / ~500ns–1μs (Python fallback) |
| Scope | In-memory, process-scoped, TTL-based lazy expiry |
| Consumers | Advisory lock layer (LocalLockManager), CAS metadata RMW |
Advisory lock layer uses two semaphores per path for RW gate pattern (shared/exclusive). See lock-architecture.md §3.
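A sketch of that two-semaphore RW gate, deliberately synchronous and single-threaded for illustration; the real implementation adds TTL and holder tracking.

```python
class Semaphore:                       # named counting semaphore stand-in
    def __init__(self, max_holders):
        self.max, self.held = max_holders, 0
    def try_acquire(self):
        if self.held < self.max:
            self.held += 1
            return True
        return False
    def release(self):
        self.held -= 1

class RWGate:
    def __init__(self, max_readers=64):
        self.readers = Semaphore(max_readers)  # shared side
        self.writer = Semaphore(1)             # exclusive side (mutex)

    def acquire_read(self):
        # Readers are admitted only while no writer holds the gate.
        return self.writer.held == 0 and self.readers.try_acquire()

    def acquire_write(self):
        # Writer needs the mutex AND zero active readers.
        return self.readers.held == 0 and self.writer.try_acquire()

g = RWGate()
assert g.acquire_read() and g.acquire_read()   # shared: many readers
assert not g.acquire_write()                   # blocked by readers
g.readers.release(); g.readers.release()
assert g.acquire_write()                       # exclusive once drained
assert not g.acquire_read()                    # readers blocked by writer
```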
6. Tier-Neutral Infrastructure (contracts/, lib/)¶
Two packages sit outside the Kernel → Services → Drivers stack. Any layer may import from them; they must not import from nexus.core, nexus.services, nexus.fuse, nexus.bricks, or any other tier-specific package.
| Package | Contains | Linux Analogue | Rule |
|---|---|---|---|
| contracts/ | Types, enums, exceptions, constants | include/linux/ (header files) | Declarations only — no implementation logic, no I/O |
| lib/ | Reusable helper functions, pure utilities | lib/ (libc, libm) | Implementation allowed, but zero kernel deps |
Core distinction: contracts/ = what (shapes of data). lib/ = how (behavior).
Placement Decision Tree¶
Is it used by a SINGLE layer?
→ Yes: stays in that layer (e.g. fuse/filters.py)
→ No (multi-layer):
Is it a type / ABC / exception / enum / constant?
→ Yes: contracts/
→ No (function / helper / I/O logic): lib/
Import Rules¶
contracts/ and lib/ may import from: each other, stdlib, third-party packages. They must never import from: nexus.core, nexus.services, nexus.server, nexus.cli, nexus.fuse, nexus.bricks, nexus.rebac.
7. Deployment Profiles¶
The kernel's layered design (§1) and DI contracts (§3) enable a range of deployment profiles. Not kernel-owned, but kernel-enabled.
Like Linux distros select packages from the same kernel, Nexus profiles select which bricks to enable and which drivers to inject.
| Profile | Target | Metastore | Linux Analogue |
|---|---|---|---|
| slim | Bare minimum runnable | redb (embedded) | initramfs |
| cluster | Minimal multi-node (IPC + federation, no auth) | redb (Raft) | CoreOS |
| embedded | MCU, WASM (<1 MB) | redb (embedded) | BusyBox |
| lite | Pi, Jetson, mobile | redb (embedded) | Alpine |
| full | Desktop, laptop | redb (embedded) | Ubuntu Desktop |
| cloud | k8s, serverless | redb (Raft) | Ubuntu Server |
| innovation | Experimental tier | redb (Raft) | Ubuntu + PPAs |
| remote | Client-side proxy (zero local bricks) | RemoteMetastore | NFS client |
Profile hierarchy: slim ⊂ cluster ⊂ embedded ⊂ lite ⊂ full ⊆ cloud ⊆ innovation. REMOTE is orthogonal — stateless proxy, all operations via gRPC to server.
Same kernel binary, different driver injection. See §1 connect(). Source of truth: src/nexus/contracts/deployment_profile.py.
8. Communication¶
Kernel-adjacent services built on kernel primitives (§4.2 IPC, §4.3 FileEvent). Not kernel-owned, but bottom-layer infrastructure.
| Tier | Nexus | Built on | Topology |
|---|---|---|---|
| Kernel | DT_PIPE (§4.2) | RingBuffer — destructive FIFO | Local or distributed (transparent) |
| Kernel | DT_STREAM (§4.2) | StreamBuffer — append-only log | Local or distributed (transparent) |
| System | gRPC + IPC | PipeManager/StreamManager, consensus proto | Point-to-point |
| User Space | EventBus | CacheStoreABC pub/sub + FileEvent (§4.3) | Fan-out (1:N) |
See federation-memo.md §2–§5 for gRPC/consensus details.
9. Cross-References¶
| Topic | Document |
|---|---|
| Data type → pillar mapping | data-storage-matrix.md |
| Ops ABC × scenario affinity | ops-scenario-matrix.md |
| Syscall table and design rationale | syscall-design.md |
| VFS lock design + advisory locks | lock-architecture.md §4 |
| Zone model, DT_MOUNT, federation | federation-memo.md §5–§6 |
| Raft, gRPC, write flows | federation-memo.md §2–§5 |
| Pipe + Stream design rationale | federation-memo.md §7j |
| Backend storage composition (CAS × Backend) | backend-architecture.md |
| CLI nexus/nexusd split | cli-design.md |