Troubleshooting

Diagnostic steps for common ZeroFS problems: mount failures, startup errors, S3 connectivity errors, cache and configuration issues, encryption passwords, NBD devices, runtime storage errors, and debug logging.

Common Issues

Mount Failures

# Error: mount.nfs: access denied by server
# Solution: Check NFS port and permissions

# Verify ZeroFS is running (the [z] keeps grep from matching itself)
ps aux | grep '[z]erofs'

# Check if port is listening
netstat -tlnp | grep 2049
# or
ss -tlnp | grep 2049

# Try mounting with verbose output
sudo mount -v -t nfs -o vers=3,tcp,port=2049 127.0.0.1:/ /mnt/zerofs

S3 Connectivity Issues

# Error: InvalidAccessKeyId or SignatureDoesNotMatch
# Solution: Verify AWS credentials in config

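# Test credentials directly (the AWS CLI must be configured with the same keys)
aws s3 ls s3://your-bucket/ --debug
# For an S3-compatible store, add: --endpoint-url https://s3.wasabisys.com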

# Check configuration file
grep -A 5 '\[aws\]' zerofs.toml

# Verify values are correct:
# [aws]
# access_key_id = "your-key"
# secret_access_key = "your-secret"
# region = "us-east-1"
# endpoint = "https://s3.wasabisys.com"  # For S3-compatible

Cache Problems

# Error: Failed to create cache directory
# Solution: Check permissions and disk space

# Verify cache directory from config
grep -A 3 '\[cache\]' zerofs.toml

# Check if directory exists
ls -la /var/cache/zerofs

# Check disk space
df -h /var/cache/zerofs

# Fix permissions (owner and group are taken from the current user)
sudo mkdir -p /var/cache/zerofs
sudo chown "$(id -un):$(id -gn)" /var/cache/zerofs
chmod 755 /var/cache/zerofs

# Verify write access
touch /var/cache/zerofs/test && rm /var/cache/zerofs/test

Startup Errors

Illegal Instruction (SIGILL)

zerofs crashes at startup with Illegal instruction (SIGILL) before producing any log output.

The prebuilt Linux amd64 binary is compiled with -C target-cpu=x86-64-v3 and requires AVX2 (Intel Haswell 2013 and later, AMD Excavator 2015 and later). The published amd64 Docker image contains the same binary. On a CPU without AVX2, the process exits at the first unsupported instruction.

# Check for AVX2 — no output means the CPU cannot run the prebuilt binary
grep -o avx2 /proc/cpuinfo | head -1

# Fix: build from source (targets baseline x86-64)
git clone https://github.com/Barre/ZeroFS
cd ZeroFS/zerofs
cargo build --release

See Installation in the quickstart for all install methods.
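
A start script can run the same AVX2 check before launching the prebuilt binary. The sketch below is illustrative only: has_avx2 is a hypothetical helper, not part of ZeroFS, and it assumes a Linux-style /proc/cpuinfo.

```shell
# Hypothetical helper (not part of ZeroFS): succeed if the cpuinfo file
# lists the avx2 flag. Defaults to /proc/cpuinfo; a path can be passed in.
has_avx2() {
  grep -qw avx2 "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_avx2; then
  echo "AVX2 present: the prebuilt binary should start"
else
  echo "AVX2 not detected: build from source"
fi
```

The -w flag matches avx2 as a whole word, so flags such as avx or avx512f do not count as a match.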

Conditional Write Check Fails

Startup fails with:

Storage provider does not support conditional writes. PutMode::Create succeeded when it should have failed. This feature is required for fencing in ZeroFS.

Before every read-write open, ZeroFS writes a probe object named .zerofs_compatibility_test_<uuid> to the database path and attempts a second, create-only write to the same key. The store must reject the second write with an already-exists error. A store that accepts it lacks put-if-not-exists support, which fencing requires, so ZeroFS refuses to start. A message starting with Storage provider precondition check failed means the probe itself returned an unexpected error; the underlying store error follows the colon.
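
The create-only semantics the probe relies on can be illustrated on a local filesystem. This is an analogy, not ZeroFS code: create_only is a hypothetical helper that uses the shell's noclobber option in place of an object store's put-if-not-exists, and the probe file name is made up.

```shell
# Hypothetical helper: write $2 to path $1 only if $1 does not exist yet.
# Local-filesystem stand-in for an object store's create-only put.
create_only() {
  # set -C (noclobber) makes ">" fail when the target already exists.
  ( set -C; printf '%s\n' "$2" > "$1" ) 2>/dev/null
}

probe_dir=$(mktemp -d)
probe="$probe_dir/.zerofs_compatibility_test_example"   # made-up name

# The first write creates the key; the second must be rejected.
if create_only "$probe" first && ! create_only "$probe" second; then
  echo "second create-only write rejected: conditional writes work"
else
  echo "second create-only write accepted: no put-if-not-exists"
fi
rm -rf "$probe_dir"
```

A store that behaves like the else branch, silently accepting the second write, is the case ZeroFS refuses to start on.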

For S3-compatible stores without conditional put support, set conditional_put to a Redis URL:

[aws]
conditional_put = "redis://localhost:6379"

See the Conditional Put Support section under Storage Backends for configuration and Conditional Writes and Fencing for the fencing model.

Configuration Issues

Missing Configuration File

# Error: Failed to load config
# Solution: Create or specify config file

# Initialize a new config
zerofs init

# Run with specific config file
zerofs run --config /path/to/zerofs.toml

# Or use short form
zerofs run -c zerofs.toml

Encryption Issues

Password Problems

# Wrong password or corrupted data

# Verify password in config file:
[storage]
url = "s3://bucket/path"
encryption_password = "correct-password"

# Values undergo environment variable expansion.
# Write a literal $ as $$ — this sets the password to p@$w0rd!
encryption_password = 'p@$$w0rd!'

# Or use environment variable substitution:
encryption_password = "${ZEROFS_PASSWORD}"
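
When a password containing $ has to be written into the config, the doubling rule can be applied mechanically. A minimal sketch, assuming $$ is the only escape required, as the comment above states; escape_dollars is a hypothetical helper, not a ZeroFS command.

```shell
# Hypothetical helper: double every "$" so the value survives
# environment variable expansion when ZeroFS loads the config.
escape_dollars() {
  printf '%s\n' "$1" | sed 's/\$/$$/g'
}

# Single quotes stop the shell itself from expanding the "$".
escape_dollars 'p@$w0rd!'
```

The printed value is what goes between the quotes of encryption_password.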

Changing Passwords

# Change password using the CLI
# The new password is read from stdin
echo "new-password" | zerofs change-password --config zerofs.toml

# After changing, update your config file:
# [storage]
# encryption_password = "new-password"

NBD Issues

Device Connection

# Error: nbd-client connection failed
# Solution: Check NBD configuration and availability

# Verify NBD is configured in zerofs.toml:
# [servers.nbd]
# addresses = ["127.0.0.1:10809"]

# Check if port is listening
ss -tlnp | grep 10809

# Ensure NBD module is loaded (Linux)
sudo modprobe nbd
lsmod | grep nbd

# Create NBD device file (assumes the ZeroFS filesystem is mounted at /mnt/zerofs)
sudo mkdir -p /mnt/zerofs/.nbd
sudo truncate -s 10G /mnt/zerofs/.nbd/my-device

# Connect to the device
sudo nbd-client 127.0.0.1 10809 /dev/nbd0 -N my-device

Runtime Errors

Reads Stall with Repeated Short-Body Warnings

Reads hang while the log repeats:

object store returned a short body; re-fetching

ZeroFS checks the body of every object store read against the length the store reported. When the body comes back shorter, ZeroFS discards it and fetches again rather than return truncated data. There is no retry limit: the read blocks until a full body arrives, with backoff starting at 100 ms, doubling per attempt, and capping at 800 ms. The warning fields report the object location, the expected and received byte counts, and the attempt number.

Repeated warnings for the same object mean something between ZeroFS and the storage backend truncates responses — a proxy or the provider itself. Reads recover without intervention once full responses arrive.
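
The retry delays described above can be worked out with a short sketch. It only reproduces the arithmetic (start at 100 ms, double, cap at 800 ms); backoff_ms is a hypothetical helper, and numbering retries from 1 is an assumption, not something the log format is known to guarantee.

```shell
# Hypothetical helper: delay in milliseconds before retry number $1.
# Assumes retries are numbered from 1 and the first delay is 100 ms.
backoff_ms() {
  attempt=$1
  delay=100
  i=1
  # Double once per earlier retry, stopping at the 800 ms cap.
  while [ "$i" -lt "$attempt" ] && [ "$delay" -lt 800 ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  echo "$delay"
}

for n in 1 2 3 4 5 6; do
  echo "retry $n: $(backoff_ms "$n") ms"
done
```

Under these assumptions the delays run 100, 200, 400, 800, 800, 800 ms, so a read that keeps receiving short bodies settles at roughly one re-fetch every 800 ms plus the time of the fetch itself.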

Process Exits with a Fatal Write Error

The process exits with code 1 after logging:

Fatal write error, exiting: <underlying error>

Any failed write to the database terminates the process: after a write failure, the database state is unknown, and a restart rebuilds from the last committed state. Run ZeroFS under a supervisor that restarts it — the systemd unit under System Integration in the configuration guide sets Restart=always.

The text after the colon is the underlying storage error. Diagnose that error; the exit is the recovery mechanism, not the fault.
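
When output is captured to a file, the underlying errors can be pulled out of the log in one pass. A minimal sketch: fatal_causes is a hypothetical helper, zerofs.log is only an example file name, and whatever prefix the logger puts before the message is assumed to be safe to discard.

```shell
# Hypothetical helper: print the text after "Fatal write error, exiting: "
# for every matching line in the log file $1.
fatal_causes() {
  sed -n 's/.*Fatal write error, exiting: //p' "$1"
}

# Example: scan a captured log if one exists in the current directory.
if [ -f zerofs.log ]; then
  fatal_causes zerofs.log
fi
```

Piping the result through sort | uniq -c shows whether repeated restarts share one cause.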

Debugging Techniques

Enable Detailed Logging

# Maximum verbosity
export RUST_LOG='zerofs=trace,slatedb=debug'

# Log to file for analysis
export RUST_LOG='zerofs=debug'
zerofs run -c zerofs.toml 2>&1 | tee zerofs.log

# Filter specific components
export RUST_LOG='zerofs::nfs=trace,zerofs::nbd=debug'

debug list-keys Fences a Running Instance

zerofs debug list-keys --config zerofs.toml prints every key in the database with a per-type summary. It opens the database in read-write mode, and fencing allows one writer per database. When it opens a database that a running ZeroFS instance is serving, it takes over as the writer: the instance's next write fails and the instance exits with Fatal write error, exiting.

Stop the instance before running debug list-keys against its database, or accept that a supervisor with Restart=always restarts the instance after the exit.
