Standalone Compactor

ZeroFS supports running compaction as a separate process, allowing you to offload this CPU- and I/O-intensive work from your main server instance.

Overview

Compaction is an essential background process in ZeroFS's LSM tree storage engine. It merges sorted runs to:

  • Reduce storage space by removing old versions of updated/deleted data
  • Improve read performance by reducing the number of files to search

By default, compaction runs within the main ZeroFS server process. For high-throughput workloads, you can run a standalone compactor on a separate machine.

Architecture

In a separated compactor setup:

  1. Writer instance: Runs with the --no-compactor flag, handling all read/write operations
  2. Compactor instance: Runs zerofs compactor, exclusively handling compaction

Both instances access the same object storage backend. The compactor coordinates with the writer through the shared manifest.
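
For example, a two-host deployment might look like the sketch below (the split across hosts is illustrative; the individual commands are covered under Usage):

# On the writer host: serve all reads and writes, with compaction disabled
zerofs run -c zerofs.toml --no-compactor

# On the compactor host: run only compaction against the same bucket
zerofs compactor -c zerofs.toml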

Usage

Starting the Writer Without Compaction

Start your main ZeroFS server with the --no-compactor flag:

zerofs run -c zerofs.toml --no-compactor

This disables the built-in compactor; SST files will accumulate in level 0 until an external compactor processes them.

Starting the Standalone Compactor

On a separate instance (or the same machine), run the compactor:

zerofs compactor -c zerofs.toml

The compactor reads the same configuration file to access the storage backend and uses the [lsm] settings for compaction parameters.
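
One simple way to reuse the writer's configuration is to copy the file to the compactor host before starting the process; the hostname below is illustrative:

# Copy the shared configuration to the compactor host
scp zerofs.toml compactor-host:zerofs.toml

# Start the standalone compactor with the same settings
ssh compactor-host 'zerofs compactor -c zerofs.toml'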

Configuration

The standalone compactor uses the same configuration file as the main server. The relevant sections are:

[storage]
url = "s3://my-bucket/zerofs-data"
encryption_password = "your-password"

[aws]
access_key_id = "..."
secret_access_key = "..."

[lsm]
max_concurrent_compactions = 8  # Number of parallel compaction jobs
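
If the compactor runs on a dedicated machine, it can also use its own copy of the configuration with a higher parallelism setting than the writer; the value below is only an illustration and should be tuned to the instance's CPU and bandwidth:

[lsm]
max_concurrent_compactions = 16  # Illustrative value for a dedicated compactor host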

When to Use a Standalone Compactor

Consider separating the compactor in the following situations:

  • Reducing egress costs: Run a small compactor instance in the same region/zone as your S3 bucket. Compaction reads and rewrites large amounts of data during merges; keeping it co-located with the bucket avoids expensive cross-region data transfer fees, while your main ZeroFS server can run anywhere.
  • High write throughput: Compaction competes with reads and writes for CPU and I/O
  • Latency-sensitive workloads: Isolate compaction spikes from user requests
  • Multi-tenant environments: Prevent compaction from affecting other services

For most workloads, the built-in compactor is sufficient and simpler to operate.
