NBD Block Devices
ZeroFS runs a Network Block Device (NBD) server that exposes S3 storage as raw block devices. The devices hold ext4 filesystems, ZFS pools, databases, or VM boot disks. TRIM/discard is supported.
NBD Features
- Raw Block Access - S3 storage appears as standard block devices (/dev/nbd*)
- Dynamic Device Management - Create and delete devices through the filesystem; new devices are picked up at runtime without a server restart
- Multiple Devices - One NBD server exposes every device in the .nbd/ directory
- Unix Socket Support - Connect over TCP or a Unix socket
- TRIM Support - Discard deletes the corresponding chunks from the LSM tree; compaction then reclaims the space in S3
- Shared Caches - NBD reads and writes go through the same memory and disk caches as NFS and 9P
- Any Filesystem - Format with ext4, XFS, ZFS, or any other filesystem
Configuration
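Configure NBD support in your ZeroFS configuration file. The three [servers.nbd] snippets below are alternatives; a configuration file should contain only one of them: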
# TCP mode (default port 10809)
[servers.nbd]
addresses = ["127.0.0.1:10809"]
# Unix socket mode (better performance for local access)
[servers.nbd]
unix_socket = "/tmp/zerofs-nbd.sock"
# Both TCP and Unix socket
[servers.nbd]
addresses = ["127.0.0.1:10809"]
unix_socket = "/tmp/zerofs-nbd.sock"
Then start ZeroFS:
zerofs run --config zerofs.toml
Configuration options:
- addresses - Array of bind addresses for the NBD TCP server (default: ["127.0.0.1:10809"])
- unix_socket - Unix socket path for NBD (optional)
Device Management
NBD devices are managed as regular files in the .nbd/ directory. First, mount ZeroFS via NFS or 9P:
# Mount via NFS
mount -t nfs 127.0.0.1:/ /mnt/zerofs
# Or mount via 9P
mount -t 9p -o trans=tcp,port=5564 127.0.0.1 /mnt/zerofs
# Create devices dynamically
mkdir -p /mnt/zerofs/.nbd
truncate -s 1G /mnt/zerofs/.nbd/database
truncate -s 5G /mnt/zerofs/.nbd/storage
truncate -s 10G /mnt/zerofs/.nbd/backup
# List devices
ls -lh /mnt/zerofs/.nbd/
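A provisioning script may want to avoid touching a device file that already exists, since running truncate on it again would change its size. The following is a minimal sketch, not part of ZeroFS: `create_nbd_device` is a hypothetical helper name, and the mount point in the usage comment is the one assumed throughout this page.

```shell
# Hypothetical helper (not part of ZeroFS): create a device file only if it
# does not exist yet, so an existing device is never resized by accident.
# Usage: create_nbd_device <nbd-dir> <name> <size>
create_nbd_device() {
    nbd_dir=$1
    dev_name=$2
    dev_size=$3
    mkdir -p "$nbd_dir" || return 1
    if [ -e "$nbd_dir/$dev_name" ]; then
        echo "device '$dev_name' already exists; leaving it unchanged" >&2
        return 1
    fi
    # truncate sets the file length without writing any data
    truncate -s "$dev_size" "$nbd_dir/$dev_name"
}

# Example (assumes ZeroFS is mounted at /mnt/zerofs):
# create_nbd_device /mnt/zerofs/.nbd database 1G
```

The helper returns a non-zero status when the file is already present, which lets a calling script distinguish "created" from "left alone".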
Connecting to NBD Devices
Connect to devices using nbd-client with the device name:
# Connect via TCP (recommended settings for optimal performance)
nbd-client 127.0.0.1 10809 /dev/nbd0 -N database -persist -timeout 600 -connections 4
nbd-client 127.0.0.1 10809 /dev/nbd1 -N storage -persist -timeout 600 -connections 4
nbd-client 127.0.0.1 10809 /dev/nbd2 -N backup -persist -timeout 600 -connections 4
# Or connect via Unix socket (better performance for local access)
nbd-client -unix /tmp/zerofs-nbd.sock /dev/nbd0 -N database -persist -timeout 600 -connections 4
nbd-client -unix /tmp/zerofs-nbd.sock /dev/nbd1 -N storage -persist -timeout 600 -connections 4
# Verify devices are connected
nbd-client -check /dev/nbd0
lsblk | grep nbd
Important Parameters
- -N <name> - Device name from the .nbd/ directory (required)
- -unix <path> - Use Unix socket instead of TCP
- -persist - Automatically reconnect if connection drops (recommended)
- -timeout 600 - Use high timeout for S3 latency (recommended: 600 seconds)
- -connections 4 - Number of parallel connections (recommended: 4-8)
- -readonly - Mount device as read-only
- -block-size <size> - Block size (512, 1024, 2048, or 4096)
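The recommended flags can be kept in one place so that every device is connected the same way. The sketch below only prints the command instead of running it. `nbd_connect_cmd` and the environment variables `NBD_HOST`, `NBD_PORT`, `NBD_TIMEOUT`, and `NBD_CONNECTIONS` are hypothetical names chosen for illustration; the defaults mirror the values recommended above.

```shell
# Hypothetical helper: print the nbd-client command for one device using the
# recommended flags. Nothing is executed; pipe the output to sh to run it.
# Usage: nbd_connect_cmd <export-name> <nbd-device>
nbd_connect_cmd() {
    export_name=$1
    nbd_device=$2
    printf 'nbd-client %s %s %s -N %s -persist -timeout %s -connections %s\n' \
        "${NBD_HOST:-127.0.0.1}" "${NBD_PORT:-10809}" "$nbd_device" \
        "$export_name" "${NBD_TIMEOUT:-600}" "${NBD_CONNECTIONS:-4}"
}

# Example:
# nbd_connect_cmd database /dev/nbd0
# NBD_CONNECTIONS=8 nbd_connect_cmd storage /dev/nbd1
```

Printing rather than executing keeps the helper safe to run on a machine without the nbd module loaded, and makes the resulting command easy to review.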
Durability Semantics
The handshake advertises NBD_FLAG_SEND_FLUSH, NBD_FLAG_SEND_FUA, and NBD_FLAG_CAN_MULTI_CONN. Command behavior:
- FLUSH - Flushes the entire filesystem database: memtable to object storage, or a WAL append when the WAL is enabled. The reply is sent only after the flush completes. Concurrent FLUSH and FUA requests coalesce into a single database flush.
- FUA - A WRITE, TRIM, or WRITE_ZEROES with the FUA flag set blocks until the data is durable, through the same flush path as FLUSH.
- CACHE - Accepted as a no-op.
- Structured replies - Not negotiated; clients fall back to simple replies.
See Durability for what durable means under each configuration.
Multiple Connections
Every connection to the NBD server shares one filesystem instance and one flush path, so a FLUSH on any connection covers writes completed on every connection. This satisfies the NBD_FLAG_CAN_MULTI_CONN contract, so running nbd-client with -connections set to 4 through 8 is safe for ZFS pools and databases that rely on write barriers.
Write Barriers
Write barriers hold under the default configuration. With sync_writes = false (the default), individual writes are buffered, but FLUSH and FUA still force durability at the barrier. ZFS transaction syncs and database WAL fsyncs issue exactly these commands.
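Whether the kernel has registered flush and FUA support for a connected device can be checked from sysfs. This is a sketch under assumptions: it relies on a Linux kernel that exposes `queue/write_cache` and `queue/fua` for block devices, `nbd_barrier_ok` is a hypothetical helper name, and the optional second argument exists only so the check can be pointed at a different sysfs root.

```shell
# Hypothetical check: succeed only if the kernel treats the device as having
# a volatile write cache (so it sends FLUSH) and as supporting FUA writes.
# Returns 2 if the device's queue attributes cannot be read.
# Usage: nbd_barrier_ok <device-name> [sysfs-root]
nbd_barrier_ok() {
    queue_dir="${2:-/sys}/block/$1/queue"
    [ -r "$queue_dir/write_cache" ] || return 2
    [ -r "$queue_dir/fua" ] || return 2
    wc_mode=$(cat "$queue_dir/write_cache")
    fua_flag=$(cat "$queue_dir/fua")
    [ "$wc_mode" = "write back" ] && [ "$fua_flag" = "1" ]
}

# Example:
# nbd_barrier_ok nbd0 && echo "flush and FUA are active on nbd0"
```

A "write through" value in `write_cache` would mean the kernel does not send FLUSH to the device, which is worth investigating before placing a database or ZFS pool on it.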
Using Block Devices
Creating Filesystems
# Format with ext4
mkfs.ext4 /dev/nbd0
mount /dev/nbd0 /mnt/block
# Format with XFS
mkfs.xfs /dev/nbd1
mount /dev/nbd1 /mnt/xfs
ZFS on S3
Create ZFS pools backed by S3 storage:
# Create a ZFS pool
zpool create mypool /dev/nbd0 /dev/nbd1 /dev/nbd2
# Create datasets
zfs create mypool/data
zfs create mypool/backups
# Enable compression
zfs set compression=lz4 mypool
TRIM/Discard Support
ZeroFS NBD devices accept TRIM from any filesystem or zpool:
# Manual TRIM
fstrim /mnt/block
# Enable automatic discard for filesystems
mount -o discard /dev/nbd0 /mnt/block
# ZFS automatic TRIM
zpool set autotrim=on mypool
zpool trim mypool
When blocks are trimmed:
- ZeroFS deletes the corresponding chunks from the LSM tree
- Compaction reclaims the space in S3
- Reclaimed space no longer counts toward the S3 storage bill
See Garbage Collection for the full reclamation pipeline.
WRITE_ZEROES vs TRIM
WRITE_ZEROES writes physical zeros in 1 MiB chunks. It overwrites chunks and never deletes them, so blkdiscard --zeroout does not free object-store space. TRIM is the only command that deletes chunks and lets compaction reclaim the space. To free space, use plain blkdiscard, fstrim, or zpool trim / autotrim.
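Before relying on fstrim or blkdiscard to free space, it can help to confirm that the kernel will pass discard requests through to the device at all. The sketch below assumes a Linux kernel that exposes `queue/discard_max_bytes`, where a value of 0 means discard is not supported; `nbd_discard_ok` is a hypothetical helper name, and the second argument is an optional alternate sysfs root.

```shell
# Hypothetical check: succeed if the kernel will send discard (TRIM) requests
# to the device. A discard_max_bytes of 0 means discard is not supported.
# Returns 2 if the attribute cannot be read.
# Usage: nbd_discard_ok <device-name> [sysfs-root]
nbd_discard_ok() {
    limit_file="${2:-/sys}/block/$1/queue/discard_max_bytes"
    [ -r "$limit_file" ] || return 2
    [ "$(cat "$limit_file")" -gt 0 ]
}

# Example:
# nbd_discard_ok nbd0 && fstrim /mnt/block
```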
Managing Device Files
NBD devices are regular files in the .nbd directory:
# View NBD device files
ls -lh /mnt/zerofs/.nbd/
# -rw-r--r-- 1 root root 1.0G database
# -rw-r--r-- 1 root root 5.0G storage
# -rw-r--r-- 1 root root 10G backup
# Add a new device (picked up at runtime, no restart)
truncate -s 20G /mnt/zerofs/.nbd/new-device
# Remove a device (disconnect NBD client first)
nbd-client -d /dev/nbd3
rm /mnt/zerofs/.nbd/old-device
Important: Device sizes cannot be changed after creation. To resize, delete the device file and recreate it, which discards the data it held:
# Disconnect the NBD client
nbd-client -d /dev/nbd0
# Delete and recreate with new size
rm /mnt/zerofs/.nbd/database
truncate -s 2G /mnt/zerofs/.nbd/database
# Reconnect NBD client with optimal settings
nbd-client 127.0.0.1 10809 /dev/nbd0 -N database -persist -timeout 600 -connections 4
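Recreating a device file leaves an empty device of the new size, so a script that automates the steps above may want an explicit acknowledgement first. This is a minimal sketch: `recreate_nbd_device` is a hypothetical helper, it does not check whether an NBD client is still connected, and the caller is expected to disconnect beforehand.

```shell
# Hypothetical helper: recreate a device file at a new size. The old contents
# are lost, so the caller must pass "yes" as the fourth argument.
# Disconnect the NBD client (nbd-client -d) before calling this.
# Usage: recreate_nbd_device <nbd-dir> <name> <new-size> yes
recreate_nbd_device() {
    if [ "${4:-}" != "yes" ]; then
        echo "refusing: recreating '$2' discards its data (pass 'yes')" >&2
        return 1
    fi
    rm -f "$1/$2" && truncate -s "$3" "$1/$2"
}

# Example (assumes ZeroFS is mounted at /mnt/zerofs):
# nbd-client -d /dev/nbd0
# recreate_nbd_device /mnt/zerofs/.nbd database 2G yes
```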
Advanced Use Cases
Geo-Distributed ZFS
Mirror a ZFS pool across regions:
# Machine 1 - US East (10.0.1.5)
# zerofs-us-east.toml
[storage]
url = "s3://my-bucket/us-east-db"
encryption_password = "shared-key"
[aws]
region = "us-east-1"
[servers.nbd]
addresses = ["0.0.0.0:10809"]
# Start: zerofs run -c zerofs-us-east.toml
# Machine 2 - EU West (10.0.2.5)
# zerofs-eu-west.toml
[storage]
url = "s3://my-bucket/eu-west-db"
encryption_password = "shared-key"
[aws]
region = "eu-west-1"
[servers.nbd]
addresses = ["0.0.0.0:10809"]
# Start: zerofs run -c zerofs-eu-west.toml
# Create devices on both machines
mount -t nfs 10.0.1.5:/ /mnt/zerofs
truncate -s 100G /mnt/zerofs/.nbd/storage
umount /mnt/zerofs
mount -t nfs 10.0.2.5:/ /mnt/zerofs
truncate -s 100G /mnt/zerofs/.nbd/storage
umount /mnt/zerofs
# From a client machine, connect to both regions with optimal settings
nbd-client 10.0.1.5 10809 /dev/nbd0 -N storage -persist -timeout 600 -connections 8
nbd-client 10.0.2.5 10809 /dev/nbd1 -N storage -persist -timeout 600 -connections 8
# Create mirrored pool across continents
zpool create global-pool mirror /dev/nbd0 /dev/nbd1
ZFS mirrors every block across both regions. The pool stays available if either region fails.
ZFS L2ARC Tiering
Use local NVMe as cache for S3-backed storage:
# Create S3-backed pool
zpool create mypool /dev/nbd0 /dev/nbd1
# Add local NVMe as L2ARC cache
zpool add mypool cache /dev/nvme0n1
# Monitor cache performance
zpool iostat -v mypool 1
Storage tiers:
- NVMe L2ARC - Hot data on local flash
- ZeroFS Caches - Recently used blocks in the configured memory and disk caches
- S3 Storage - Cold data in the bucket
Database Storage
Run databases on NBD devices:
# Connect the database device created in .nbd/
nbd-client 127.0.0.1 10809 /dev/nbd0 \
-N database \
-persist \
-timeout 600 \
-connections 4 \
-block-size 4096
mkfs.ext4 /dev/nbd0
mount /dev/nbd0 /var/lib/postgresql
# Initialize database
sudo -u postgres initdb -D /var/lib/postgresql/16/main
Virtual Machine Storage
Boot VMs from NBD devices:
# Create VM disk
# (the device file in .nbd/ must be at least 50G)
qemu-img create -f raw /dev/nbd0 50G
# Boot VM using NBD device
qemu-system-x86_64 \
-drive file=/dev/nbd0,format=raw,cache=writeback \
-m 4G -enable-kvm
Performance Considerations
Network Optimization
# Multiple parallel connections
nbd-client 127.0.0.1 10809 /dev/nbd0 \
-N database \
-persist \
-timeout 600 \
-connections 4 \
-block-size 4096
# For high-latency connections (e.g., a server in another region)
nbd-client 10.0.2.5 10809 /dev/nbd0 \
-N storage \
-persist \
-timeout 600 \
-connections 8
Monitoring
Device Status
# Check if device is connected
nbd-client -check /dev/nbd0
# List all NBD exports from server
nbd-client -list 127.0.0.1
# View device statistics
cat /sys/block/nbd0/stat
# Monitor I/O performance
iostat -x 1 /dev/nbd0
# Disconnect device safely
nbd-client -disconnect /dev/nbd0
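The raw counters in the stat file are easier to read once converted to bytes. The sketch below assumes the standard Linux block-layer stat layout, in which field 3 is sectors read and field 7 is sectors written, both counted in 512-byte units; `nbd_io_bytes` is a hypothetical helper name, and the second argument is an optional alternate sysfs root.

```shell
# Hypothetical helper: print bytes read and written for a block device,
# taken from fields 3 and 7 of its stat file (counts of 512-byte sectors).
# Usage: nbd_io_bytes <device-name> [sysfs-root]
nbd_io_bytes() {
    awk '{ printf "read_bytes=%.0f write_bytes=%.0f\n", $3 * 512, $7 * 512 }' \
        "${2:-/sys}/block/$1/stat"
}

# Example:
# nbd_io_bytes nbd0
```

Sampling the output twice and subtracting gives a rough throughput figure without installing iostat.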
ZFS Monitoring
# Pool status
zpool status
# I/O statistics
zpool iostat -v 1
Troubleshooting
Connection Issues
# If connection fails, check:
# 1. ZeroFS is running and NBD ports are configured
ps aux | grep zerofs
# 2. NBD module is loaded
sudo modprobe nbd
# 3. Export name matches a file in the .nbd/ directory
nbd-client -list 127.0.0.1
# 4. Try with explicit parameters
nbd-client 127.0.0.1 10809 /dev/nbd0 \
-N database \
-nofork # Stay in foreground for debugging
Performance Issues
# Use multiple connections for better throughput
nbd-client 127.0.0.1 10809 /dev/nbd0 \
-N database \
-connections 8 \
-persist \
-timeout 600
# For large sequential workloads, increase block size
nbd-client 127.0.0.1 10809 /dev/nbd0 \
-N database \
-block-size 4096 \
-persist
Persistent Mount Configuration
# Add to /etc/rc.local or systemd service
cat > /etc/systemd/system/zerofs-nbd.service << EOF
[Unit]
Description=ZeroFS NBD Client
After=network.target
[Service]
Type=forking
ExecStart=/usr/sbin/nbd-client 127.0.0.1 10809 /dev/nbd0 -N database -persist -timeout 600 -block-size 4096 -connections 4
ExecStop=/usr/sbin/nbd-client -d /dev/nbd0
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
systemctl enable zerofs-nbd
systemctl start zerofs-nbd
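A script that mounts a filesystem from the device after this service starts may need to wait until the kernel reports the device as connected. The following is a sketch under assumptions: it relies on the Linux nbd driver exposing /sys/block/<dev>/pid while a client is attached, `wait_for_nbd` is a hypothetical helper name, and the third argument is an optional alternate sysfs root.

```shell
# Hypothetical helper: wait until the kernel reports an NBD device as
# connected, indicated by the presence of /sys/block/<dev>/pid.
# Usage: wait_for_nbd <device-name> [timeout-seconds] [sysfs-root]
wait_for_nbd() {
    pid_file="${3:-/sys}/block/$1/pid"
    remaining=${2:-30}
    while [ "$remaining" -gt 0 ]; do
        [ -e "$pid_file" ] && return 0
        sleep 1
        remaining=$((remaining - 1))
    done
    # Final check after the timeout has elapsed
    [ -e "$pid_file" ]
}

# Example:
# wait_for_nbd nbd0 60 && mount /dev/nbd0 /mnt/block
```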