fsspec Internals¶
This page explains how the NexusFileSystem adapter bridges fsspec to nexus-fs.
Registration¶
nexus-fs registers the nexus protocol through an entry point in the fsspec.specs group, declared in pyproject.toml.
After installing nexus-fs[fsspec], any call to fsspec.filesystem("nexus") returns a NexusFileSystem instance.
Architecture¶
graph LR
A[pandas / dask / HF] -->|nexus:// URL| B[fsspec]
B -->|filesystem protocol| C[NexusFileSystem]
C -->|sync bridge| D[SlimNexusFS]
D -->|mount routing| E[Storage Backend]
NexusFileSystem is a thin synchronous adapter that:
- Receives fsspec method calls (cat, pipe, ls, info, open)
- Strips the nexus:// protocol prefix
- Delegates to SlimNexusFS methods via anyio.from_thread.run()
- Converts results to fsspec's expected format
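The prefix-stripping step can be sketched as follows. The function name strip_nexus_prefix is hypothetical, and the sketch assumes that nexus-fs paths are absolute and begin with a slash; the real adapter may normalise paths differently.

```python
def strip_nexus_prefix(url: str) -> str:
    """Turn a nexus:// URL into an absolute path for the backend.

    Assumption (not confirmed by the docs): backend paths start with "/".
    """
    prefix = "nexus://"
    if url.startswith(prefix):
        url = url[len(prefix):]
    # Normalise so "nexus://s3/b/f" and "nexus:///s3/b/f" give the same path.
    return "/" + url.lstrip("/")


print(strip_nexus_prefix("nexus://s3/my-bucket/data.csv"))  # /s3/my-bucket/data.csv
```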
Auto-discovery¶
When NexusFileSystem is created without an explicit SlimNexusFS instance, it auto-discovers mounts from the local state directory:
- Reads $TMPDIR/nexus-fs/mounts.json (or $NEXUS_FS_STATE_DIR/mounts.json if set)
- Reconstructs mount URIs from the saved state
- Calls mount() to create a SlimNexusFS instance
This means you can mount backends via the CLI or Python, and then use nexus:// URLs in pandas without any manual setup.
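The state-file lookup described in this section might look roughly like the sketch below. The function names are hypothetical, the layout of mounts.json is not documented here (so the JSON is returned as parsed, without interpretation), and tempfile.gettempdir() is used as a stand-in for resolving $TMPDIR.

```python
import json
import os
import tempfile
from pathlib import Path


def mounts_state_path() -> Path:
    """Locate mounts.json, preferring NEXUS_FS_STATE_DIR when it is set."""
    state_dir = os.environ.get("NEXUS_FS_STATE_DIR")
    if state_dir:
        return Path(state_dir) / "mounts.json"
    # gettempdir() honours TMPDIR on POSIX systems.
    return Path(tempfile.gettempdir()) / "nexus-fs" / "mounts.json"


def load_saved_state():
    """Return the parsed state file, or None if nothing has been mounted yet."""
    path = mounts_state_path()
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)  # schema is an unknown here; passed through as-is


state = load_saved_state()
print(mounts_state_path(), "->", "no saved mounts" if state is None else state)
```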
File-like objects¶
NexusFileSystem.open() returns one of two file-like objects:
NexusBufferedFile (read mode)¶
For rb and r modes. Supports:
- read(length): read up to length bytes
- readline(): read one line
- readlines(): read all lines
- seek(offset, whence): seek to position
- tell(): current position
- Iteration via for line in f
- Context manager (with fs.open(...) as f)
Internally, NexusBufferedFile fetches the full file content on first access and buffers it in memory. For large files (>1 GB), use SlimNexusFS.read_range() instead.
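The fetch-on-first-access behaviour can be illustrated with a small stand-in class. LazyBufferedReader and fake_fetch are hypothetical names; this is a sketch of the buffering idea, not the NexusBufferedFile source.

```python
import io
from typing import Callable, Iterator, Optional


class LazyBufferedReader:
    """Hypothetical stand-in: fetch the whole file once, then serve from memory."""

    def __init__(self, fetch: Callable[[], bytes]) -> None:
        self._fetch = fetch  # called at most once, on first access
        self._buf: Optional[io.BytesIO] = None

    def _buffer(self) -> io.BytesIO:
        if self._buf is None:
            self._buf = io.BytesIO(self._fetch())
        return self._buf

    def read(self, length: int = -1) -> bytes:
        return self._buffer().read(length)

    def readline(self) -> bytes:
        return self._buffer().readline()

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        return self._buffer().seek(offset, whence)

    def tell(self) -> int:
        return self._buffer().tell()

    def __iter__(self) -> Iterator[bytes]:
        return iter(self._buffer())

    def __enter__(self) -> "LazyBufferedReader":
        return self

    def __exit__(self, *exc_info) -> None:
        self._buf = None  # release the in-memory copy


fetch_calls = []


def fake_fetch() -> bytes:
    fetch_calls.append(1)  # record each backend round trip
    return b"alpha\nbeta\ngamma\n"


with LazyBufferedReader(fake_fetch) as f:
    first = f.readline()
    f.seek(0)
    lines = list(f)

print(first, len(lines), len(fetch_calls))
```

Because every method goes through the same in-memory buffer, seeking and re-reading never trigger a second fetch, which is also why memory use grows with file size.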
NexusWriteFile (write mode)¶
For wb and w modes. Supports:
- write(data): append data to buffer
- flush(): no-op (data is buffered)
- close(): flushes buffer to backend via SlimNexusFS.write()
- Context manager (with fs.open(...) as f)
The write buffer has a 1 GB limit. Writing more than 1 GB raises ValueError.
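A bounded write buffer of this kind could be sketched as below. BoundedWriteBuffer is a hypothetical class; whether the real limit is 10^9 or 2^30 bytes is not stated, so the default here (2^30) is an assumption, and the demo uses a tiny limit to stay cheap.

```python
from typing import Callable, List


class BoundedWriteBuffer:
    """Hypothetical stand-in: collect writes in memory, upload once on close()."""

    def __init__(self, upload: Callable[[bytes], None], limit: int = 1024**3) -> None:
        self._upload = upload      # plays the role of SlimNexusFS.write()
        self._limit = limit        # assumed to mean 2**30 bytes
        self._chunks: List[bytes] = []
        self._size = 0
        self.closed = False

    def write(self, data: bytes) -> int:
        if self._size + len(data) > self._limit:
            raise ValueError(f"write buffer limit of {self._limit} bytes exceeded")
        self._chunks.append(bytes(data))
        self._size += len(data)
        return len(data)

    def flush(self) -> None:
        pass  # no-op: nothing is sent until close()

    def close(self) -> None:
        if not self.closed:
            self._upload(b"".join(self._chunks))
            self.closed = True

    def __enter__(self) -> "BoundedWriteBuffer":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


uploaded: List[bytes] = []
with BoundedWriteBuffer(uploaded.append, limit=16) as f:
    f.write(b"hello ")
    f.flush()              # still nothing uploaded
    f.write(b"world")

error = None
small = BoundedWriteBuffer(uploaded.append, limit=4)
try:
    small.write(b"too long")
except ValueError as exc:
    error = str(exc)

print(uploaded, error)
```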
Method mapping¶
| fsspec method | nexus-fs call |
|---|---|
| cat(path) | read(path) |
| cat(path, start, end) | read_range(path, start, end) |
| pipe(path, data) | write(path, data) |
| ls(path, detail) | ls(path, detail) |
| info(path) | stat(path) |
| rm(path) | delete(path) or rmdir(path, recursive=True) |
| cp(src, dst) | copy(src, dst) |
| mkdir(path) | mkdir(path, parents=True) |
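To make the first three rows of the table concrete, here is a small dispatch sketch against an in-memory stub. StubBackend and MappingSketch are hypothetical, and the stub is synchronous for brevity even though SlimNexusFS is async.

```python
from typing import Dict, Optional


class StubBackend:
    """Hypothetical in-memory stand-in for the nexus-fs side of the table."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self.files[path]

    def read_range(self, path: str, start: Optional[int], end: Optional[int]) -> bytes:
        return self.files[path][start:end]

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data


class MappingSketch:
    """Routes fsspec-style calls to the backend calls listed in the table."""

    def __init__(self, backend: StubBackend) -> None:
        self.backend = backend

    def pipe(self, path: str, data: bytes) -> None:
        self.backend.write(path, data)

    def cat(self, path: str, start: Optional[int] = None, end: Optional[int] = None) -> bytes:
        if start is None and end is None:
            return self.backend.read(path)                # whole-file read
        return self.backend.read_range(path, start, end)  # byte-range read


fs = MappingSketch(StubBackend())
fs.pipe("/local/greeting.txt", b"hello world")
whole = fs.cat("/local/greeting.txt")
part = fs.cat("/local/greeting.txt", start=0, end=5)
print(whole, part)
```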
Sync bridge¶
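NexusFileSystem is synchronous, as the standard fsspec AbstractFileSystem interface expects, but SlimNexusFS is async. The bridge uses anyio.from_thread.run() to call async methods from synchronous fsspec code: the coroutine runs on the event loop thread while the calling thread blocks until the result is available. Wrapping an async core in a blocking facade is a common pattern for async-first libraries that also offer a sync API.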
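The general shape of such a bridge can be shown with the standard library alone. Note that this sketch swaps in asyncio.run_coroutine_threadsafe on a background event loop rather than anyio.from_thread.run(), so it illustrates the pattern and not the nexus-fs code; SyncBridge and fake_read are hypothetical names.

```python
import asyncio
import threading


class SyncBridge:
    """Runs coroutines on a background event loop and blocks for the result."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro):
        # Schedule the coroutine on the loop thread, then wait in the caller's thread.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result()

    def close(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_read(path: str) -> bytes:
    """Stand-in for an async backend call such as a read."""
    await asyncio.sleep(0)
    return f"contents of {path}".encode()


bridge = SyncBridge()
data = bridge.run(fake_read("/local/data.txt"))  # plain blocking call
bridge.close()
print(data)
```

Exceptions raised inside the coroutine propagate out of future.result(), so synchronous callers see them as ordinary exceptions.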