Standalone Compactor

ZeroFS can run compaction as a separate process, typically on its own machine. This moves compaction's CPU and I/O load off the main server.

Overview

Compaction is a background process in ZeroFS's LSM tree storage engine. It merges sorted runs to:

  • Reduce storage space by removing old versions of updated or deleted data
  • Improve read performance by reducing the number of files to search
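
To make the two effects above concrete, here is a minimal Python sketch of merging sorted runs. It is an illustration only, not ZeroFS's actual implementation: the entry layout (key, sequence number, value), the use of None as a deletion marker, and the function name compact are all assumptions made for this example. It also assumes every run is included in the merge, which is what makes dropping deletion markers safe.

```python
import heapq
from typing import Optional

# Hypothetical entry layout: (key, sequence_number, value).
# A value of None marks a deletion (a "tombstone").
Entry = tuple[str, int, Optional[str]]


def compact(runs: list[list[Entry]]) -> list[Entry]:
    """Merge sorted runs into one run, keeping only the newest live version of each key.

    Each run must be sorted by key and contain each key at most once.
    """
    # Order the merged stream by key, newest sequence number first.
    merged = heapq.merge(*runs, key=lambda entry: (entry[0], -entry[1]))

    output: list[Entry] = []
    last_key = None
    for key, seq, value in merged:
        if key == last_key:
            continue  # an older version of a key we have already handled
        last_key = key
        if value is not None:
            output.append((key, seq, value))  # keep the newest live version
        # Tombstones are dropped; this is only safe because all runs are merged.
    return output


older_run = [("a", 1, "apple"), ("b", 2, "banana"), ("c", 3, "cherry")]
newer_run = [("b", 4, "blueberry"), ("c", 5, None), ("d", 6, "date")]
compacted = compact([older_run, newer_run])
```

In this example, two runs holding six entries collapse into one run holding three: the stale version of "b" and the deleted key "c" are gone, and a later read has one run to search instead of two.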

By default, compaction runs within the main ZeroFS server process. For high-throughput workloads, you can run a standalone compactor on a separate machine.

Architecture

In a separated compactor setup:

  1. Writer instance: Runs with the --no-compactor flag and handles all read and write operations
  2. Compactor instance: Runs zerofs compactor and handles compaction only

Both instances access the same object storage backend. The compactor coordinates with the writer through the shared manifest.

Usage

Starting the Writer Without Compaction

Start your main ZeroFS server with the --no-compactor flag:

zerofs run -c zerofs.toml --no-compactor

This disables the built-in compactor. SST files accumulate in level 0 until an external compactor processes them.

Starting the Standalone Compactor

On a separate machine (or on the same machine as the writer), run the compactor:

zerofs compactor -c zerofs.toml

The compactor reads the same configuration file to access the storage backend and uses the [lsm] settings for compaction parameters. It also honors the [storage] storage_class.

Configuration

The standalone compactor uses the same configuration file as the main server. The relevant sections are:

[storage]
url = "s3://my-bucket/zerofs-data"
encryption_password = "your-password"

[aws]
access_key_id = "..."
secret_access_key = "..."

[lsm]
max_concurrent_compactions = 8  # Number of parallel compaction jobs

When to Use a Standalone Compactor

Consider separating the compactor when:

  • Egress costs: Compaction reads and rewrites large amounts of data during merges. A compactor instance in the same region/zone as the S3 bucket avoids cross-region data transfer fees; the main ZeroFS server can run anywhere.
  • High write throughput: Compaction competes with reads and writes for CPU and I/O.
  • Latency-sensitive workloads: Isolate compaction spikes from user requests.
  • Multi-tenant environments: Prevent compaction from affecting other services.

For most workloads, the built-in compactor is sufficient and simpler to operate.
