Notice

This document is for a development version of Ceph.

Metrics

CephFS tracks metrics using perf counters. The counters can be labeled (see Labeled Perf Counters).

Client Metrics

CephFS exports client metrics as Labeled Perf Counters, which can be used to monitor client performance. CephFS exports the following client metrics.

Client Metrics

Name                   Type   Description
--------------------   -----  ------------------------------------------------------------------------------
num_clients            Gauge  Number of client sessions
cap_hits               Gauge  Percentage of file capability hits over total number of caps
cap_miss               Gauge  Percentage of file capability misses over total number of caps
avg_read_latency       Gauge  Mean value of the read latencies
avg_write_latency      Gauge  Mean value of the write latencies
avg_metadata_latency   Gauge  Mean value of the metadata latencies
dentry_lease_hits      Gauge  Percentage of dentry lease hits handed out over the total dentry lease requests
dentry_lease_miss      Gauge  Percentage of dentry lease misses handed out over the total dentry lease requests
opened_files           Gauge  Number of opened files
opened_inodes          Gauge  Number of opened inodes
pinned_icaps           Gauge  Number of pinned inode caps
total_inodes           Gauge  Total number of inodes
total_read_ops         Gauge  Total number of read operations generated by all processes
total_read_size        Gauge  Number of bytes read in input/output operations generated by all processes
total_write_ops        Gauge  Total number of write operations generated by all processes
total_write_size       Gauge  Number of bytes written in input/output operations generated by all processes

MDS Rank Metrics

Per-MDS-rank metrics are also exported as Labeled Perf Counters with a rank label. These describe the MDS daemon process itself, not any particular client or subvolume.

MDS Rank Metrics

Name            Type   Description
-------------   -----  ------------------------------------------------------------------------------
cpu_usage       Gauge  Sum of per-core CPU utilization for the MDS process (100 == one fully saturated core; values can exceed 100 on multi-core systems).
open_requests   Gauge  Number of metadata requests currently in flight for this MDS rank.

Subvolume Metrics

CephFS exports subvolume metrics as Labeled Perf Counters, which can be used to monitor subvolume performance and utilization.

I/O Performance Metrics

I/O performance metrics (IOPS, throughput, latency) are aggregated within a sliding window of 30 seconds by default. This interval is configurable via the subv_metrics_window_interval parameter (see MDS Config Reference). In large clusters with tens of thousands of subvolumes, this parameter also controls when stale metrics are evicted: once the sliding window becomes empty (no I/O activity), the metrics entry is removed rather than reporting zeros, reducing memory usage and computational overhead.
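The window interval can be adjusted with the standard Ceph configuration machinery. A minimal sketch, assuming the option takes a value in seconds, as the 30-second default described above suggests:

```shell
# Widen the sliding window from the default 30s to 60s for all MDS daemons.
# Assumption: subv_metrics_window_interval is specified in seconds.
ceph config set mds subv_metrics_window_interval 60
```

A longer window smooths the reported averages and keeps idle subvolume entries alive longer before eviction, at the cost of slower reaction to load changes.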

Important

Metadata operations do NOT trigger metric updates. Only actual data I/O (reads and writes to file contents) updates the sliding window and keeps the subvolume metrics entry active. Metadata-only operations such as mkdir, rmdir, unlink, rename, chmod, chown, setxattr, stat, and ls do not generate I/O metrics.

This means:

  • If a subvolume has only metadata activity (e.g., creating/deleting files without writing data), its I/O metrics will show zeros or the entry may be evicted after the window expires.

  • After deleting files, the used_bytes value will not immediately reflect the freed space until either new data I/O occurs or the MDS broadcasts updated quota information.

Utilization Metrics

In addition to I/O performance, subvolume metrics include utilization counters:

  • quota_bytes: The configured quota limit for the subvolume (0 if unlimited).

  • used_bytes: Current space usage based on the inode’s recursive statistics (rstat.rbytes).

These values are updated when the MDS broadcasts quota information to clients. The used_bytes reflects the recursive byte count of the subvolume root inode, which is maintained by the MDS as files are created, modified, or deleted. However, since metric reporting depends on I/O activity to keep entries alive, the utilization values are only reported while the subvolume has active I/O within the sliding window.

Subvolume Metrics

Name                 Type   Description
------------------   -----  ------------------------------------------------------------------------------
avg_read_iops        Gauge  Average read IOPS (input/output operations per second) over the sliding window.
avg_read_tp_Bps      Gauge  Average read throughput in bytes per second.
avg_read_lat_msec    Gauge  Average read latency in milliseconds.
avg_write_iops       Gauge  Average write IOPS over the sliding window.
avg_write_tp_Bps     Gauge  Average write throughput in bytes per second.
avg_write_lat_msec   Gauge  Average write latency in milliseconds.
quota_bytes          Gauge  Configured quota limit in bytes (0 if no quota/unlimited).
used_bytes           Gauge  Current space usage in bytes (recursive byte count of the subvolume root).

Getting Metrics

The metrics can be scraped from the MDS admin socket as well as through the tell interface. The mds_client_metrics-<fsname> section in the output of the counter dump command displays the metrics for each client, as shown below:
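Either of the following invocations produces the counter dump shown below; mds.<id> is a placeholder for a concrete daemon name such as mds.a, and the admin socket form must be run on the host where that MDS is running:

```shell
# Through the MON-routed tell interface, from any node with an admin keyring:
ceph tell mds.<id> counter dump
# Or directly over the daemon's admin socket on the MDS host:
ceph daemon mds.<id> counter dump
```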

"mds_client_metrics": [
    {
        "labels": {
            "fs_name": "<fsname>",
            "id": "14213"
        },
        "counters": {
            "num_clients": 2
        }
    }
],
"mds_client_metrics-<fsname>": [
    {
        "labels": {
            "client": "client.0",
            "rank": "0"
        },
        "counters": {
            "cap_hits": 5149,
            "cap_miss": 1,
            "avg_read_latency": 0.000000000,
            "avg_write_latency": 0.000000000,
            "avg_metadata_latency": 0.000000000,
            "dentry_lease_hits": 0,
            "dentry_lease_miss": 0,
            "opened_files": 1,
            "opened_inodes": 2,
            "pinned_icaps": 2,
            "total_inodes": 2,
            "total_read_ops": 0,
            "total_read_size": 0,
            "total_write_ops": 4836,
            "total_write_size": 633864192
        }
    },
    {
        "labels": {
            "client": "client.1",
            "rank": "0"
        },
        "counters": {
            "cap_hits": 3375,
            "cap_miss": 8,
            "avg_read_latency": 0.000000000,
            "avg_write_latency": 0.000000000,
            "avg_metadata_latency": 0.000000000,
            "dentry_lease_hits": 0,
            "dentry_lease_miss": 0,
            "opened_files": 1,
            "opened_inodes": 2,
            "pinned_icaps": 2,
            "total_inodes": 2,
            "total_read_ops": 0,
            "total_read_size": 0,
            "total_write_ops": 3169,
            "total_write_size": 415367168
        }
    }
]
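The dumped counters are plain JSON and can be post-processed with a short script, for example to turn the raw cap hit/miss counts into a hit rate and the write totals into an average write size. A minimal sketch using the client.0 values from the dump above; the section key "mds_client_metrics-a" and the reduced set of counters are illustrative:

```python
import json

# Trimmed-down sample, taken from the "client.0" entry in the dump above.
dump = json.loads("""
{
  "mds_client_metrics-a": [
    {
      "labels": {"client": "client.0", "rank": "0"},
      "counters": {
        "cap_hits": 5149,
        "cap_miss": 1,
        "total_write_ops": 4836,
        "total_write_size": 633864192
      }
    }
  ]
}
""")

for entry in dump["mds_client_metrics-a"]:
    c = entry["counters"]
    # Hit rate as a percentage of all capability lookups.
    hit_pct = 100.0 * c["cap_hits"] / (c["cap_hits"] + c["cap_miss"])
    # Average bytes per write operation.
    avg_write = c["total_write_size"] / c["total_write_ops"]
    print(entry["labels"]["client"], round(hit_pct, 2), int(avg_write))
# → client.0 99.98 131072   (i.e. an average write size of 128 KiB)
```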

The subvolume metrics are dumped as part of the same command. The mds_subvolume_metrics section in the output of the counter dump command displays the metrics for each subvolume, as shown below:

"mds_subvolume_metrics": [
    {
        "labels": {
            "fs_name": "a",
            "subvolume_path": "/volumes/_nogroup/test_subvolume"
        },
        "counters": {
            "avg_read_iops": 0,
            "avg_read_tp_Bps": 11,
            "avg_read_lat_msec": 0,
            "avg_write_iops": 1564,
            "avg_write_tp_Bps": 6408316,
            "avg_write_lat_msec": 338,
            "quota_bytes": 10737418240,
            "used_bytes": 1073741824
        }
    }
]
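Because quota_bytes and used_bytes arrive together, quota utilization can be derived directly from a subvolume's counters. A minimal sketch using the sample values above, remembering that quota_bytes == 0 means no quota is set:

```python
# Values from the sample subvolume entry above: 10 GiB quota, 1 GiB used.
counters = {"quota_bytes": 10737418240, "used_bytes": 1073741824}

if counters["quota_bytes"] > 0:  # 0 means unlimited, so avoid dividing by it
    pct = 100.0 * counters["used_bytes"] / counters["quota_bytes"]
    print(f"{pct:.1f}% of quota used")  # → 10.0% of quota used
else:
    print("no quota set")
```

Note that, as explained in the Utilization Metrics section, these values are only reported while the subvolume has active data I/O within the sliding window.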

Brought to you by the Ceph Foundation

The Ceph Documentation is a community resource funded and hosted by the non-profit Ceph Foundation.