# Nexus Kernel Architecture

Status: Active — kernel architecture SSOT. Rule: Keep this file small and precise. Prefer in-place edits over additions. Delegate details to federation-memo.md and data-storage-matrix.md.

## 1. Design Philosophy
NexusFS follows an OS-inspired layered architecture.
```
┌──────────────────────────────────────────────────────────────┐
│ SERVICES (user space)                                        │
│ Installable/removable. ReBAC, Auth, Agents, Scheduler, etc.  │
└──────────────────────────────────────────────────────────────┘
                      ↓ protocol interface
┌──────────────────────────────────────────────────────────────┐
│ KERNEL                                                       │
│ Minimal compilable unit. VFS, MetastoreABC,                  │
│ ObjectStoreABC interface definitions.                        │
└──────────────────────────────────────────────────────────────┘
                      ↓ dependency injection
┌──────────────────────────────────────────────────────────────┐
│ DRIVERS                                                      │
│ Pluggable at startup. redb, S3, LocalDisk, gRPC, etc.        │
└──────────────────────────────────────────────────────────────┘
```
Kernel minimality: The kernel is the minimal compilable unit — it cannot run alone (like Linux's vmlinuz needs bootloader + init). It defines interfaces; drivers provide implementations via DI at startup.
Three swap tiers (follows Linux's monolithic kernel model, not microkernel):
| Tier | Swap time | Nexus | Linux analogue |
|---|---|---|---|
| Static kernel | Never | MetastoreABC, VFS route(), syscall dispatch | vmlinuz core (scheduler, mm, VFS) |
| Drivers | Config-time (DI at startup) | redb, S3, PostgreSQL, Dragonfly, SearchBrick | compiled-in drivers (=y) |
| Services | Runtime (load/unload) | 23 protocols (ReBAC, Mount, Auth, Agents, Search, Skills, ...) | loadable kernel modules (insmod/rmmod) |
Invariant: Services depend on kernel interfaces, never the reverse. The kernel operates with zero services loaded.
Drivers use constructor DI at startup — same binary, different config (NEXUS_METASTORE=redb, NEXUS_RECORD_STORE=postgresql). Immutable after init.
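The driver swap can be sketched as constructor DI keyed off those env vars. The driver classes and registry dicts below are hypothetical stand-ins, not factory.py's real wiring:

```python
# Hypothetical driver classes standing in for the real redb/PostgreSQL drivers.
class RedbMetastore:
    name = "redb"

class PostgresRecordStore:
    name = "postgresql"

# Registries keyed by the env-var values mentioned above.
METASTORE_DRIVERS = {"redb": RedbMetastore}
RECORD_STORE_DRIVERS = {"postgresql": PostgresRecordStore}

def boot_drivers(env):
    """Resolve drivers once at startup; the returned objects are treated as immutable."""
    metastore = METASTORE_DRIVERS[env.get("NEXUS_METASTORE", "redb")]()
    records = RECORD_STORE_DRIVERS[env.get("NEXUS_RECORD_STORE", "postgresql")]()
    return metastore, records

# Same binary, different config:
metastore, records = boot_drivers({"NEXUS_METASTORE": "redb"})
```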
Services have two maturity phases, both preserving the invariant above:
Phase 1 — Init-time DI (distro composition). factory.py acts as the init system (like systemd): creates selected services and injects them via the KernelServices dataclass. Different distros select different service sets at startup — nexus-server loads all 22+, nexus-embedded loads zero.
Resolved (Issue #643): factory.py gates all services via DeploymentProfile + enabled_bricks frozenset (see §5.1). _wire_services() migrated to factory._boot_wired_services() — the NexusFS constructor no longer imports or creates services. Two-phase init: NexusFS(...) → _boot_wired_services(nx, ...) → nx._bind_wired_services(dict).
Phase 2 — Runtime hot-swap (Linux LKM model). A ServiceRegistry manages in-process service modules following the Loadable Kernel Module pattern:
- Lifecycle protocol: service_init() → service_start() → service_stop() → service_cleanup(), plus service_name and service_dependencies declarations
- Capability registration: services register the Protocols they implement (like LKMs call register_filesystem() or register_chrdev())
- Dependency graph: load_service() rejects when dependencies are missing; unload_service() rejects when dependents are still loaded
- Reference counting: prevents unloading while callers hold references
Why LKM, not systemd? Nexus services are in-process components (shared memory, zero IPC overhead), not separate daemon processes. LKMs have the same property — in-kernel modules that register capabilities with subsystems.
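Since the registry does not exist yet (see the Gap note below), here is a hedged sketch of what an LKM-style ServiceRegistry could look like; everything besides the lifecycle method names listed above is an assumption:

```python
class ServiceModule:
    """Hypothetical base for the LKM-style lifecycle — a sketch, not the real contract."""
    service_name = ""
    service_dependencies: tuple = ()

    def service_init(self): pass
    def service_start(self): pass
    def service_stop(self): pass
    def service_cleanup(self): pass

class ServiceRegistry:
    def __init__(self):
        self._loaded = {}
        self._refcount = {}

    def load_service(self, svc):
        missing = [d for d in svc.service_dependencies if d not in self._loaded]
        if missing:                              # like insmod failing on unresolved symbols
            raise RuntimeError(f"missing dependencies: {missing}")
        svc.service_init()
        svc.service_start()
        self._loaded[svc.service_name] = svc
        self._refcount.setdefault(svc.service_name, 0)
        for dep in svc.service_dependencies:     # dependents pin their dependencies
            self._refcount[dep] += 1

    def unload_service(self, name):
        if self._refcount.get(name, 0) > 0:      # like rmmod refusing while refcount > 0
            raise RuntimeError(f"{name}: dependents still loaded")
        svc = self._loaded.pop(name)
        for dep in svc.service_dependencies:
            self._refcount[dep] -= 1
        svc.service_stop()
        svc.service_cleanup()
```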
Gap: No ServiceRegistry, no lifecycle protocol, no load_service()/unload_service(). Path: extract remaining mixins → standalone service classes (in progress) → introduce ServiceRegistry with LKM lifecycle.
## 2. The Four Storage Pillars
NexusFS abstracts storage by Capability (access pattern + consistency guarantee), not by domain or implementation.
| Pillar | ABC | Capability | Kernel Role |
|---|---|---|---|
| Metastore | MetastoreABC | Ordered KV, CAS, prefix scan, optional Raft SC | Required — sole kernel init param |
| ObjectStore | ObjectStoreABC (= Backend) | Streaming I/O, immutable blobs, petabyte scale | Interface only — instances mounted dynamically |
| RecordStore | RecordStoreABC | Relational ACID, JOINs, FK, vector search | Services only — optional, injected for ReBAC/Auth/etc. |
| CacheStore | CacheStoreABC | Ephemeral KV, Pub/Sub, TTL | Optional — ABC in contracts/ (like include/linux/fscache.h), kernel accepts via DI, services consume; defaults to NullCacheStore |
Orthogonality: Between pillars = different query patterns. Within pillars = interchangeable drivers (deployment-time config). See data-storage-matrix.md for full proof.
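As an illustration of abstraction by capability, here is a toy Metastore sketch. The method names (get/put/compare_and_swap/scan_prefix) are assumptions mirroring the capability column, not the real MetastoreABC signatures:

```python
from abc import ABC, abstractmethod

class MetastoreABC(ABC):
    """Illustrative capability surface: ordered KV, CAS, prefix scan."""
    @abstractmethod
    def get(self, key: str): ...
    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...
    @abstractmethod
    def compare_and_swap(self, key: str, expected, new: bytes) -> bool: ...
    @abstractmethod
    def scan_prefix(self, prefix: str): ...

class InMemoryMetastore(MetastoreABC):
    """Toy driver showing the contract is satisfiable by a plain dict."""
    def __init__(self):
        self._kv = {}
    def get(self, key):
        return self._kv.get(key)
    def put(self, key, value):
        self._kv[key] = value
    def compare_and_swap(self, key, expected, new):
        if self._kv.get(key) != expected:
            return False                      # lost the race: caller retries
        self._kv[key] = new
        return True
    def scan_prefix(self, prefix):
        # Ordered prefix scan — what the directory index (dentry) lookup needs.
        return sorted((k, v) for k, v in self._kv.items() if k.startswith(prefix))

ms = InMemoryMetastore()
ms.put("/docs/a", b"inode-a")
ms.put("/docs/b", b"inode-b")
ms.put("/tmp/x", b"inode-x")
```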
### Kernel Self-Inclusiveness
Kernel compiles and inits with 1 pillar (Metastore). ObjectStore is mounted post-init. Like Linux: kernel defines VFS + block device interface but doesn't ship a filesystem.
| Kernel need | Source |
|---|---|
| File metadata (inode) | MetastoreABC → KV by path |
| Directory index (dentry) | MetastoreABC → ordered prefix scan |
| System settings, zone tracking | MetastoreABC → /__sys__/ KV entries |
| File content (bytes) | ObjectStoreABC → mounted via nx.mount(), not init param |
Kernel does NOT need: JOINs, FK, vector search, TTL, pub/sub (all service-layer).
### CacheStore Graceful Degradation

No CacheStore → EventBus disabled, PermissionCache falls back to RecordStore, TigerCache degrades to O(n), UserSession stays in RecordStore. NullCacheStore provides the no-op implementation.
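A minimal sketch of the no-op store; the method names (get/set/publish/subscribe) are assumptions, not the real CacheStoreABC surface:

```python
class NullCacheStore:
    """No-op CacheStore sketch: callers degrade gracefully instead of crashing."""
    def get(self, key):
        return None                # every lookup misses → caller falls back to RecordStore
    def set(self, key, value, ttl=None):
        pass                       # writes are silently dropped
    def publish(self, channel, message):
        return 0                   # pub/sub reaches zero subscribers → EventBus disabled
    def subscribe(self, channel):
        return iter(())            # empty event stream

cache = NullCacheStore()
```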
### RecordStoreABC Usage Pattern

Services consume RecordStoreABC.session_factory + SQLAlchemy ORM. Direct SQL or raw driver access is an abstraction break. This ensures driver interchangeability (PostgreSQL ↔ SQLite) without code changes.
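The session_factory pattern can be sketched with stdlib sqlite3 standing in for SQLAlchemy. The class and URL handling below are assumptions; the raw SQL appears only because the sketch skips the ORM layer that real services would go through:

```python
import sqlite3
from contextlib import contextmanager

class SqliteRecordStore:
    """Toy RecordStore driver; session_factory is the only surface services touch."""
    def __init__(self, url=":memory:"):
        self._conn = sqlite3.connect(url)

    @contextmanager
    def session_factory(self):
        # Commit on success, roll back on error — the unit-of-work shape
        # services get from the SQLAlchemy session in the real pattern.
        try:
            yield self._conn
            self._conn.commit()
        except Exception:
            self._conn.rollback()
            raise

store = SqliteRecordStore()   # swapping in a PostgreSQL-backed store is config, not code
with store.session_factory() as session:
    session.execute("CREATE TABLE rebac (subject TEXT, relation TEXT, object TEXT)")
    session.execute("INSERT INTO rebac VALUES ('alice', 'owner', '/docs')")
```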
### Dual-Axis ABC Architecture
Two independent ABC axes, composed via DI:
- Data ABCs (this section): WHERE is data stored? → 4 pillars by storage capability
- Ops ABCs (§3): WHAT can users/agents DO? → 29 scenario domains by ops affinity
A concrete class sits at the intersection: e.g. ReBACManager implements PermissionProtocol (Ops) and internally uses RecordStoreABC (Data). The Protocol itself has no storage opinion. See ops-scenario-matrix.md for full Ops-Scenario affinity proof.
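The intersection can be sketched as follows; all method names are illustrative, not the real PermissionProtocol/RecordStoreABC surfaces:

```python
from typing import Protocol

class PermissionProtocol(Protocol):      # Ops axis: WHAT can be done
    def check(self, subject: str, relation: str, obj: str) -> bool: ...

class RecordStoreLike(Protocol):         # Data axis: WHERE tuples are stored
    def relations(self, subject: str, obj: str) -> set: ...

class ReBACManager:
    """Intersection class: implements the Ops Protocol, consumes the Data ABC."""
    def __init__(self, records):
        self._records = records

    def check(self, subject, relation, obj):
        return relation in self._records.relations(subject, obj)

class DictRecords:
    """Toy Data-axis driver — the Protocol itself has no storage opinion."""
    def __init__(self, tuples):
        self._tuples = tuples
    def relations(self, subject, obj):
        return {r for s, r, o in self._tuples if (s, o) == (subject, obj)}

rebac = ReBACManager(DictRecords([("alice", "owner", "/docs")]))
```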
## 3. Kernel vs Services Boundary

### Kernel Interfaces (nexus.core)
| Interface | Linux Analogue | Purpose |
|---|---|---|
| MetastoreABC | struct inode_operations | Typed FileMetadata CRUD (the inode layer) |
| VFSRouterProtocol | VFS lookup_slow() | Path resolution only — mount CRUD lives in Service MountProtocol |
| ObjectStoreABC (= Backend) | struct file_operations | Blob I/O interface (read/write/delete/list) |
| CacheStoreABC | include/linux/fscache.h | Ephemeral KV + Pub/Sub primitives — ABC in contracts/, kernel accepts optionally, services/bricks consume |
| VFSLockManagerProtocol | per-inode i_rwsem | Path-level RW locking with hierarchy awareness |
| PipeManagerProtocol | pipe(2) + fs/pipe.c | Named pipe lifecycle + MPMC data path (see §6 Kernel Tier) |

MetastoreABC is kernel because it IS the inode layer — the typed contract between VFS and storage. Without it, the kernel cannot describe files.
### NexusFS — Syscall Dispatch Layer
NexusFS is the kernel entry point, analogous to Linux's syscall layer (sys_open, sys_read). It wires VFSRouter + MetastoreABC + ObjectStoreABC into user-facing operations (read, write, list, mkdir, mount). NexusFS contains no service business logic β services are accessed through ServiceRegistry (Phase 2) or thin delegation stubs (Phase 1).
factory.py is the init system (analogous to systemd): constructs kernel + drivers + services and wires them together. NexusFS receives pre-built dependencies via its constructor and never auto-creates services.
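The two-phase init can be sketched as follows; class bodies and service names are hypothetical, only the NexusFS(...) → _boot_wired_services(...) → _bind_wired_services(...) call shape mirrors the text:

```python
class NexusFS:
    def __init__(self, metastore):
        self.metastore = metastore      # kernel dependency only — no service imports here
        self._services = {}

    def _bind_wired_services(self, services):
        self._services.update(services)

def _boot_wired_services(nx, enabled):
    # Hypothetical service constructors; the real factory gates on enabled_bricks.
    available = {"rebac": object, "search": object}
    nx._bind_wired_services(
        {name: cls() for name, cls in available.items() if name in enabled}
    )

# Phase one: kernel constructed alone. Phase two: init system wires services in.
nx = NexusFS(metastore=object())
_boot_wired_services(nx, enabled=frozenset({"rebac"}))
```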
Resolved: Event mixins fully extracted — NexusFSEventsMixin removed (#573), FileWatcher moved to services/watch/ (#706), orphaned kernel attrs cleaned (#656). _wire_services() deleted — all service creation moved to factory._boot_wired_services() (#643). Remaining: replace the KernelServices dataclass with ServiceRegistry.
### Service Protocols (nexus.services.protocols)
29 scenario domains mapped to Ops ABCs. 23 Protocols exist, 9 gaps remain.
| Category | Protocols | Count |
|---|---|---|
| Permission & Visibility | PermissionProtocol, NamespaceManagerProtocol | 2 |
| Search & Content | SearchProtocol, SearchBrickProtocol (driver), LLMProtocol | 3 |
| Mount & Storage | MountProtocol, ShareLinkProtocol, OAuthProtocol | 3 |
| Agent Infra | AgentRegistryProtocol, SchedulerProtocol | 2 |
| Events & Hooks | EventLogProtocol, HookEngineProtocol, WatchProtocol, LockProtocol | 4 |
| Domain Services | SkillsProtocol, PaymentProtocol | 2 |
| Missing (9 gaps) | Version, Memory, Trajectory, Delegation, Governance, Reputation, OperationLog, Plugin, Workflow | 9 |
All use typing.Protocol with @runtime_checkable. See ops-scenario-matrix.md §2–§3 for full enumeration and affinity matching.
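The structural-typing style can be illustrated with a toy Protocol; the search method below is assumed, not SearchProtocol's real surface:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class SearchProtocol(Protocol):
    def search(self, query: str) -> list: ...

class GrepSearch:
    """Satisfies the Protocol structurally — no inheritance required."""
    def __init__(self, docs):
        self.docs = docs
    def search(self, query):
        return [d for d in self.docs if query in d]

svc = GrepSearch(["kernel notes", "driver notes"])
```

@runtime_checkable only verifies method presence at isinstance() time, which is exactly the loose coupling the service layer wants.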
### 3.1. Tier-Neutral Layers (contracts/, lib/)

Two packages sit outside the Kernel–Services–Drivers stack. Any layer may import from them; they must not import from nexus.core, nexus.services, nexus.fuse, nexus.bricks, or any other tier-specific package.
| Package | Contains | Linux Analogue | Rule |
|---|---|---|---|
| contracts/ | Types, enums, exceptions, constants | include/linux/ (header files) | Declarations only — no implementation logic, no I/O |
| lib/ | Reusable helper functions, pure utilities | lib/ (libc, libm) | Implementation allowed, but zero kernel deps |
Core distinction: contracts/ = what (shapes of data). lib/ = how (behavior). When you see from nexus.contracts import X you know X is a lightweight type/exception with near-zero deps. from nexus.lib import Y means Y is a function that does something.
#### Placement Decision Tree

```
Is it used by a SINGLE layer?
├─ Yes: stays in that layer (e.g. fuse/filters.py)
└─ No (multi-layer):
   Is it a type / ABC / exception / enum / constant?
   ├─ Yes: contracts/
   └─ No (function / helper / I/O logic): lib/
```
#### Import Rules
contracts/ and lib/ may import from: each other, stdlib, third-party packages. They must never import from: nexus.core, nexus.services, nexus.server, nexus.cli, nexus.fuse, nexus.bricks, nexus.rebac.
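If mechanical enforcement is wanted, a tool like import-linter could encode the rule; the contract below is a sketch under the assumption that all packages live under a nexus root:

```ini
[importlinter]
root_package = nexus

[importlinter:contract:tier-neutral]
name = contracts/ and lib/ never import tier-specific packages
type = forbidden
source_modules =
    nexus.contracts
    nexus.lib
forbidden_modules =
    nexus.core
    nexus.services
    nexus.server
    nexus.cli
    nexus.fuse
    nexus.bricks
    nexus.rebac
```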
#### What Goes Where — Examples
| Module | Destination | Reason |
|---|---|---|
| OperationContext, Permission (type defs) | contracts/types.py | Type declarations |
| NexusError, BackendError (exceptions) | contracts/exceptions.py | Exception hierarchy |
| Base, TimestampMixin (ORM base/mixins) | lib/db_base.py | Schema helpers with implementation (uuid gen, server_default) |
| EmailList, ISODateTimeStr (Pydantic Annotated) | lib/validators.py | Annotated types with validation logic |
| get_database_url() (env var resolution) | lib/env.py | Implementation helper |
| path_matches_pattern() (glob matching) | lib/path_utils.py | Pure utility function |
| PathInterner, SegmentedPathInterner (string interning) | lib/path_interner.py | Generic utility (like lib/string.c in Linux) |
| is_os_metadata_file() (OS file filter) | fuse/filters.py | Single-layer (FUSE only) |
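The lib/validators.py style can be approximated with stdlib typing.Annotated; the validator below is a hypothetical stand-in for ISODateTimeStr, applied by hand where Pydantic would run the annotation metadata automatically:

```python
from datetime import datetime
from typing import Annotated

def _validate_iso(value: str) -> str:
    datetime.fromisoformat(value)      # raises ValueError on malformed timestamps
    return value

# Annotated type carrying its validator as metadata, per the lib/validators.py pattern.
ISODateTimeStr = Annotated[str, _validate_iso]

def parse(value: str) -> str:
    for meta in ISODateTimeStr.__metadata__:   # Pydantic would do this automatically
        value = meta(value)
    return value
```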
## 4. Zone
A Zone is the fundamental isolation and consensus unit in NexusFS.
What a Zone determines:
- Data isolation: each zone has its own independent redb database (no shared metadata)
- Consensus boundary: 1 Zone = 1 Raft group (the scope of consistency guarantees)
- Visibility: only nodes participating in a zone can see its metadata
- Scalability unit: zones scale horizontally; adding zones adds capacity without coordination

What a Zone does NOT determine:
- Permissions: read/write access is controlled by ReBAC (service layer), not zone membership
- User identity: authentication and user management are services, not zone concerns
- File content location: ObjectStore (S3, local disk) is independent of zone topology

Operations:
- Mount = create a new zone; all participants are equal Voters (no Learner asymmetry)
- DT_MOUNT entries in the Metastore compose zones into a namespace tree (NFS-style)
- DNS-style hierarchical discovery — each zone only knows its direct children, no global registry
See federation-memo.md §5–§6 for implementation details.
## 5. Deployment Modes

### 5.1 Deployment Profiles (Distro)
Just as Linux distros (Ubuntu, Alpine, BusyBox) select which packages to ship on top of the same kernel, Nexus deployment profiles select which bricks to enable from the same codebase. Two orthogonal axes:
- Mode = network topology (standalone, client-server, federation)
- Profile = feature set (which bricks are enabled)
| Profile | Target | Bricks | Linux Analogue |
|---|---|---|---|
| embedded | MCU, WASM (<1 MB) | 2 (storage + eventlog) | BusyBox |
| lite | Pi, Jetson, mobile (512 MB–4 GB) | 8 (+namespace, agent, permissions, cache, ipc, scheduler) | Alpine |
| full | Desktop, laptop (4–32 GB) | 21 (all except federation) | Ubuntu Desktop |
| cloud | k8s, serverless (unlimited) | 22 (all) | Ubuntu Server |
Profile hierarchy: embedded ⊂ lite ⊂ full ⊂ cloud
Mechanism: factory.py (the init system) resolves the active profile via the NEXUS_PROFILE env var → DeploymentProfile enum → resolve_enabled_bricks() → frozenset[str]. Each service in the 3-tier boot (_boot_kernel_services, _boot_system_services, _boot_brick_services) checks brick membership before construction. Individual brick overrides via FeaturesConfig YAML always win over profile defaults.
Source of truth: src/nexus/core/deployment_profile.py (22 canonical brick names, 4 profile-to-brick mappings, resolve_enabled_bricks() merge function).
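The resolution flow can be sketched as follows; the brick sets echo §5.1's table, but the merge logic and internal names are assumptions about deployment_profile.py:

```python
from enum import Enum

class DeploymentProfile(Enum):
    EMBEDDED = "embedded"
    LITE = "lite"

# Per-profile defaults (subset shown; full/cloud extend lite the same way).
_PROFILE_BRICKS = {
    DeploymentProfile.EMBEDDED: frozenset({"storage", "eventlog"}),
    DeploymentProfile.LITE: frozenset({"storage", "eventlog", "namespace", "agent",
                                       "permissions", "cache", "ipc", "scheduler"}),
}

def resolve_enabled_bricks(env, overrides=None):
    profile = DeploymentProfile(env.get("NEXUS_PROFILE", "lite"))
    bricks = set(_PROFILE_BRICKS[profile])
    for brick, on in (overrides or {}).items():   # FeaturesConfig overrides win
        if on:
            bricks.add(brick)
        else:
            bricks.discard(brick)
    return frozenset(bricks)
```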
### 5.2 Network Modes
| Mode | Description | Metastore | Services |
|---|---|---|---|
| Standalone | Single process, local storage | redb (local) | Optional |
| Client-Server | RemoteNexusFS connects to a NexusFS server | redb (local) on server | On server |
| Federation | Multiple nodes sharing zones via Raft | redb (Raft) | Per-node |
| Embedded | Minimal kernel on constrained devices | redb (local) | None (profile: embedded) |
Driver selection is config-time: same binary, different NEXUS_METASTORE, NEXUS_RECORD_STORE, etc.
## 6. Communication

### Messaging Tiers

Three tiers, mirroring Linux's kernel → system → user space communication:
| Tier | Linux Analogue | Nexus | Latency | Topology |
|---|---|---|---|---|
| Kernel | kfifo ring buffer | Nexus Native Pipe (DT_PIPE, MetastoreABC) | ~5 μs | Intra-process |
| System | sendmsg() / Unix sockets / POSIX MQ | gRPC (consensus) + IPC (agent messaging) | ~0.5–1 ms | Point-to-point (1:1) |
| User Space | dbus-daemon / Netlink | EventBus (CacheStoreABC pub/sub) | ~1–5 ms | Fan-out (1:N) |
Selection rule: Consensus write path → System (gRPC, 1:1). Agent-to-agent messaging → System (IPC, 1:1 queue). Notification read path → User Space (EventBus, 1:N fan-out to 100s of observers). Internal signaling → Kernel (Pipe, zero-copy).
See federation-memo.md §7j for Pipe design.
### System Tier

gRPC for consensus (Raft node-to-node, zone API) and Exchange (agent-to-agent value exchange). IPC for agent messaging — 1:1 queue semantics using VFS as transport.
SSOT: Proto files in proto/ define all RPC services. See federation-memo.md §2–§5. IPC details in ops-scenario-matrix.md S29.
### User Space Tier: EventBus
EventBusProtocol (service protocol in nexus.services.event_bus.protocol) provides pub/sub for file system change notifications. Kernel defines only the event data types (FileEvent, FileEventType in nexus.core.event_bus).
Linux analogue: dbus-daemon (1:N broadcast). Consumed by WatchProtocol (S8) and EventLogProtocol (S17). Backends: Redis/Dragonfly (current default), NATS JetStream (preferred long-term). All should route through CacheStoreABC pub/sub.
Federation gap: EventBus is currently zone-local. Cross-zone event propagation not yet designed.
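The 1:N fan-out semantics can be sketched with an in-memory stand-in for the CacheStoreABC pub/sub backend; the FileEvent fields and method names below are assumptions:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class FileEvent:
    path: str
    event_type: str          # stand-in for the real FileEventType enum

class InMemoryPubSub:
    """In-memory stand-in for CacheStoreABC pub/sub (Redis/Dragonfly in practice)."""
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subs[channel].append(callback)

    def publish(self, channel, event):
        for cb in self._subs[channel]:     # 1:N fan-out, unlike gRPC's 1:1 path
            cb(event)
        return len(self._subs[channel])

bus = InMemoryPubSub()
seen = []
bus.subscribe("fs.events", seen.append)          # e.g. a WatchProtocol observer
bus.subscribe("fs.events", lambda e: None)       # e.g. an EventLogProtocol observer
delivered = bus.publish("fs.events", FileEvent("/docs/a.txt", "MODIFIED"))
```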
## Cross-References
| Topic | Document |
|---|---|
| Data type β pillar mapping (50+ types) | data-storage-matrix.md |
| Storage orthogonality proof | data-storage-matrix.md §ORTHOGONALITY |
| Ops ABC × scenario affinity (29 domains, 23 protocols) | ops-scenario-matrix.md |
| Ops ABC orthogonality + gap analysis | ops-scenario-matrix.md §2–§3 |
| Raft, gRPC, write flows | federation-memo.md §2–§5 |
| Zone model, DT_MOUNT | federation-memo.md §5–§6 |
| SC vs EC consistency | federation-memo.md §4.1 |
| API privilege levels (agents vs ops vs admin) | federation-memo.md §6.10 |