Standalone Compactor
ZeroFS supports running compaction as a separate process, allowing you to offload this CPU- and I/O-intensive work from your main server instance.
Overview
Compaction is an essential background process in ZeroFS's LSM tree storage engine. It merges sorted runs to:
- Reduce storage space by removing old versions of updated/deleted data
- Improve read performance by reducing the number of files to search
By default, compaction runs within the main ZeroFS server process. For high-throughput workloads, you can run a standalone compactor on a separate machine.
Architecture
In a separated compactor setup:
- Writer instance: Runs with the --no-compactor flag, handling all read/write operations
- Compactor instance: Runs zerofs compactor, exclusively handling compaction
Both instances access the same object storage backend. The compactor coordinates with the writer through the shared manifest.
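As a sketch, the split looks like this; the host labels are illustrative, and both commands point at the same zerofs.toml (and therefore the same bucket):

```shell
# Host A (writer): serves clients, built-in compaction disabled
zerofs run -c zerofs.toml --no-compactor

# Host B (compactor): same config file, no network servers needed
zerofs compactor -c zerofs.toml
```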
Usage
Starting the Writer Without Compaction
Start your main ZeroFS server with the --no-compactor flag:
zerofs run -c zerofs.toml --no-compactor
This disables the built-in compactor; SST files will accumulate in level 0 until an external compactor processes them.
Starting the Standalone Compactor
On a separate instance (or the same machine), run the compactor:
zerofs compactor -c zerofs.toml
The compactor reads the same configuration file to access the storage backend and uses the [lsm] settings for compaction parameters.
The compactor only needs access to the object storage backend. It doesn't need to run any network servers (NFS, 9P, NBD).
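Because the compactor holds no client connections, one simple deployment pattern is a plain restart loop. This is a hypothetical wrapper, not a ZeroFS-provided script; the 10-second retry interval is an arbitrary choice:

```shell
#!/bin/sh
# Hypothetical supervisor sketch: restart the standalone compactor if it exits.
# Assumes `zerofs` is on PATH and zerofs.toml grants object storage access.
while :; do
  zerofs compactor -c zerofs.toml
  echo "compactor exited (status $?); restarting in 10s" >&2
  sleep 10
done
```

In production you would more likely let a process manager such as systemd handle the restarts.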
Configuration
The standalone compactor uses the same configuration file as the main server. The relevant sections are:
[storage]
url = "s3://my-bucket/zerofs-data"
encryption_password = "your-password"
[aws]
access_key_id = "..."
secret_access_key = "..."
[lsm]
max_concurrent_compactions = 8 # Number of parallel compaction jobs
When to Use a Standalone Compactor
Consider separating the compactor when:
- Egress cost reduction: Run a small compactor instance in the same region/zone as your S3 bucket. Compaction reads and rewrites large amounts of data during merges; keeping it co-located with the bucket avoids expensive cross-region data transfer fees, while your main ZeroFS server can run anywhere.
- High write throughput: Compaction competes with reads and writes for CPU and I/O
- Latency-sensitive workloads: Isolate compaction spikes from user requests
- Multi-tenant environments: Prevent compaction from affecting other services
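To see why co-location matters, here is an illustrative back-of-envelope calculation. The 500 GiB/day rewrite volume and $0.02/GiB cross-region rate are made-up example figures, not ZeroFS measurements or provider quotes:

```shell
# Cross-region transfer avoided per month by co-locating the compactor:
# 500 GiB/day rewritten by compaction * $0.02/GiB * 30 days
awk 'BEGIN { printf "$%d/month\n", 500 * 0.02 * 30 }'
# prints $300/month
```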
For most workloads, the built-in compactor is sufficient and simpler to operate.