Monitoring & Tracing

A running ZeroFS server exposes a gRPC admin API. Three CLI commands consume it: zerofs monitor renders a live stats dashboard in the terminal, zerofs fatrace prints one line per filesystem operation, and zerofs flush persists buffered writes to durable storage on demand. Any gRPC client can call the same API.

The RPC Server

All three commands connect to a zerofs run instance through the admin server configured under [servers.rpc]:

zerofs.toml

[servers.rpc]
addresses = ["127.0.0.1:7000"]
unix_socket = "/tmp/zerofs.rpc.sock"

addresses binds TCP listeners; unix_socket binds a Unix domain socket. Either or both can be set. The configuration generated by zerofs init enables both, with TCP on 127.0.0.1:7000 and the socket at /tmp/zerofs.rpc.sock.

Each command takes -c with the same configuration file and reads the connection details from it. The client tries the Unix socket first if the socket file exists, then each TCP address in turn. Without a [servers.rpc] section, the commands exit with the error "RPC server not configured in config file".
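The lookup order described above can be sketched in Python. This illustrates the documented behavior and is not the client's actual code: the function name candidate_endpoints and the dictionary layout (mirroring the [servers.rpc] keys) are assumptions made for the example.

```python
import os


def candidate_endpoints(rpc_config):
    """Return connection targets in the order the text describes.

    rpc_config is a hypothetical dict mirroring the [servers.rpc] table,
    or None when the section is missing from the configuration file.
    """
    if rpc_config is None:
        # Mirrors the documented error for a missing [servers.rpc] section.
        raise ValueError("RPC server not configured in config file")

    endpoints = []
    socket_path = rpc_config.get("unix_socket")
    # The Unix socket comes first, but only if the socket file exists.
    if socket_path and os.path.exists(socket_path):
        endpoints.append(("unix", socket_path))
    # Then each TCP address, in the order listed.
    for address in rpc_config.get("addresses", []):
        endpoints.append(("tcp", address))
    return endpoints
```

With the configuration generated by zerofs init and a running server, this would yield the socket at /tmp/zerofs.rpc.sock followed by 127.0.0.1:7000.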

zerofs monitor

zerofs monitor opens a full-screen terminal dashboard fed by a stats stream from the server:

  • I/O throughput and IOPS charts: separate read and write series; rates are computed over windows of one second or one refresh interval, whichever is longer, and the latest 120 samples are kept
  • Total operations sparkline: operations per second across all operation types
  • Storage gauge: used bytes against the configured capacity, plus the inode count
  • Operation counters: files, directories, and links created, deleted, and renamed since server start
  • Garbage collection counters: tombstones created and processed, chunks deleted, and GC runs since server start
  • jemalloc memory: allocated, resident, retained, and metadata bytes of the server process, plus a fragmentation percentage (resident minus allocated, relative to allocated)
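The arithmetic behind two of these panels can be written out as a minimal Python sketch. The function names are hypothetical, and returning 0 when nothing is allocated is an assumption of the example, not documented behavior.

```python
from collections import deque

HISTORY_LEN = 120  # the latest 120 samples are kept per chart series


def fragmentation_percent(allocated, resident):
    """Resident minus allocated, relative to allocated, as a percentage."""
    if allocated == 0:
        return 0.0  # assumption: avoid dividing by zero on an idle process
    return (resident - allocated) / allocated * 100.0


def rate_window_seconds(interval_ms):
    """One second or one refresh interval, whichever is longer."""
    return max(1.0, interval_ms / 1000.0)


def rate_per_second(delta, interval_ms):
    """Convert a counter increase over one window into a per-second rate."""
    return delta / rate_window_seconds(interval_ms)


def new_history():
    """A bounded series: appending a 121st sample discards the oldest one."""
    return deque(maxlen=HISTORY_LEN)
```

For example, 125 MiB resident against 100 MiB allocated gives a fragmentation of 25 percent.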

Press q or Ctrl+C to quit.

Run the Dashboard

zerofs monitor -c zerofs.toml

# Refresh every second instead of the default 250 ms
zerofs monitor -c zerofs.toml --interval 1000

--interval sets the snapshot interval in milliseconds and defaults to 250. The server clamps intervals below 250 ms to 250 ms.
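The clamping rule amounts to taking the larger of the requested interval and the 250 ms floor. A one-function Python sketch, with a hypothetical function name:

```python
MIN_INTERVAL_MS = 250  # documented server-side minimum


def effective_interval_ms(requested_ms):
    """Interval the server would actually use for a requested value."""
    return max(requested_ms, MIN_INTERVAL_MS)
```

Under this rule, --interval 100 behaves the same as --interval 250, while --interval 1000 is used as given.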

zerofs fatrace

zerofs fatrace subscribes to the server's file access stream and prints one line per filesystem operation: the operation type, the resolved path, and the operation's parameters. Fourteen operation types are traced: read, write, create, remove, rename, mkdir, readdir, lookup, setattr, link, symlink, mknod, trim, and fsync.

Parameters depend on the operation:

Operations                       Extra fields
read, write, trim                offset= and len= in bytes
create, mkdir, setattr, mknod    mode= in octal
rename, link                     -> destination path
symlink                          -> link target
lookup                           name= the looked-up filename

Trace a Log Rotation

$ zerofs fatrace -c zerofs.toml
Tracing file access (Ctrl+C to stop)...
write   | /logs/app.log offset=53248 len=4096
fsync   | /logs/app.log
rename  | /logs/app.log -> /logs/app.log.1
create  | /logs/app.log mode=0644
lookup  | /logs/app.log.1 name=app.log.1
read    | /logs/app.log.1 offset=0 len=65536
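For scripting against this output, the lines can be split into fields. The following Python sketch infers the layout from the sample above, not from a format specification, and parse_trace_line is a hypothetical helper. It assumes paths contain neither " -> " nor a trailing token with "=", so treat it as a heuristic.

```python
def parse_trace_line(line):
    """Split one fatrace line into operation, path, and extra fields."""
    op, _, rest = line.partition("|")
    event = {"op": op.strip(), "path": rest.strip(), "fields": {}}

    # rename, link, and symlink lines carry "-> destination" after the path.
    if " -> " in event["path"]:
        path, _, target = event["path"].partition(" -> ")
        event["path"] = path
        event["target"] = target
        return event

    # Other lines may end in key=value tokens (offset=, len=, mode=, name=).
    tokens = event["path"].split(" ")
    while len(tokens) > 1 and "=" in tokens[-1]:
        key, _, value = tokens.pop().partition("=")
        event["fields"][key] = value
    event["path"] = " ".join(tokens)
    return event
```

Field values stay as strings; an octal mode such as 0644 can be converted with int(value, 8).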

Overhead

When no client is subscribed, the server checks the subscriber count and returns immediately, so path resolution and event dispatch are skipped entirely. With a subscriber attached, the server resolves paths and broadcasts events in background tasks, so the filesystem operation itself never waits on tracing.

Events pass through a broadcast channel that buffers 1,024 events. When a subscriber falls behind during a burst, the oldest buffered events are dropped from that subscriber's stream rather than slowing the filesystem. The trace is an observation tool, not a complete audit log.
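The two behaviors described in this section, the early return without subscribers and the dropping of the oldest events for a lagging subscriber, can be modeled in a few lines of Python. This is a simplified, synchronous simulation under stated assumptions, not the server's implementation: the class names are hypothetical, and the real server does this work in background tasks.

```python
from collections import deque

CAPACITY = 1024  # events buffered by the broadcast channel


class Subscriber:
    """A slow consumer: events queue up until the capacity is reached."""

    def __init__(self, capacity=CAPACITY):
        self.buffer = deque()
        self.capacity = capacity
        self.dropped = 0

    def push(self, event):
        if len(self.buffer) == self.capacity:
            self.buffer.popleft()  # the oldest buffered event is lost
            self.dropped += 1
        self.buffer.append(event)


class Tracer:
    def __init__(self):
        self.subscribers = []
        self.resolved = 0  # how many times path resolution actually ran

    def emit(self, op, resolve_path):
        if not self.subscribers:
            return  # no subscriber: skip path resolution and dispatch
        path = resolve_path()
        self.resolved += 1
        for subscriber in self.subscribers:
            subscriber.push((op, path))
```

A subscriber that reads nothing while 1,030 events arrive ends up holding the newest 1,024 and has lost the first 6, which is why the trace cannot serve as a complete audit log.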

zerofs flush

zerofs flush asks the running server to persist all buffered writes to durable storage. The command returns after the flush completes. The server coalesces concurrent flush requests into a single flush.
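One common way to coalesce concurrent requests is to let callers that arrive during an in-flight flush wait on that same flush. The Python sketch below shows this pattern with asyncio. It is an assumption about the general technique, not a description of how the server decides which requests share a flush, and FlushCoalescer is a hypothetical name.

```python
import asyncio


class FlushCoalescer:
    """Share one in-flight flush among all concurrent callers."""

    def __init__(self, do_flush):
        self._do_flush = do_flush  # coroutine function doing the real flush
        self._inflight = None      # task for the flush in progress, if any

    async def flush(self):
        if self._inflight is None:
            # First caller starts the flush; later callers join it.
            self._inflight = asyncio.ensure_future(self._run())
        await self._inflight

    async def _run(self):
        try:
            await self._do_flush()
        finally:
            # Allow the next request to start a fresh flush.
            self._inflight = None
```

Five concurrent calls to flush() trigger a single underlying flush here, and every caller returns only after it completes.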

By default, writes between fsync calls live in the server's memtable (see Durability & Consistency). zerofs flush creates a durability point on demand without requiring clients to call fsync, for example before stopping the server or before creating a checkpoint.

Flush

$ zerofs flush -c zerofs.toml
Flush completed successfully

The gRPC AdminService

The three commands, along with the checkpoint commands, are clients of a single gRPC service: zerofs.admin.AdminService, defined in zerofs/proto/admin.proto. It exposes seven RPCs:

  • CreateCheckpoint (unary): creates a named checkpoint; returns its name, UUID, and creation timestamp
  • ListCheckpoints (unary): returns all checkpoints
  • DeleteCheckpoint (unary): deletes a checkpoint by name
  • GetCheckpointInfo (unary): returns one checkpoint by name; NOT_FOUND if it does not exist
  • Flush (unary): persists buffered writes to durable storage; returns on completion
  • WatchFileAccess (server streaming): streams one FileAccessEvent per filesystem operation (the zerofs fatrace feed)
  • StreamStats (server streaming): streams StatsSnapshot messages at the requested interval_ms, clamped to a 250 ms minimum (the zerofs monitor feed)

The service answers on the configured TCP addresses and Unix socket. It does not register gRPC server reflection, so generic clients such as grpcurl need the proto file:

Stream Stats with grpcurl

# admin.proto lives at zerofs/proto/admin.proto in the ZeroFS repository
grpcurl -plaintext \
  -import-path zerofs/proto -proto admin.proto \
  -d '{"interval_ms": 1000}' \
  127.0.0.1:7000 zerofs.admin.AdminService/StreamStats

This prints one JSON StatsSnapshot per second: operation counters, bytes read and written, GC activity, used bytes and inodes against the configured maximum, and jemalloc memory statistics.
