Notice
This document is for a development version of Ceph.
Metrics
CephFS uses perf counters to track metrics. The counters can be labeled (see Labeled Perf Counters).
Client Metrics
CephFS exports client metrics as Labeled Perf Counters, which can be used to monitor client performance. The following client metrics are exported:
| Name | Type | Description |
|---|---|---|
| num_clients | Gauge | Number of client sessions |
| cap_hits | Gauge | Percentage of file capability hits over total number of caps |
| cap_miss | Gauge | Percentage of file capability misses over total number of caps |
| avg_read_latency | Gauge | Mean value of the read latencies |
| avg_write_latency | Gauge | Mean value of the write latencies |
| avg_metadata_latency | Gauge | Mean value of the metadata latencies |
| dentry_lease_hits | Gauge | Percentage of dentry lease hits over the total dentry lease requests |
| dentry_lease_miss | Gauge | Percentage of dentry lease misses over the total dentry lease requests |
| opened_files | Gauge | Number of opened files |
| opened_inodes | Gauge | Number of opened inodes |
| pinned_icaps | Gauge | Number of pinned inode caps |
| total_inodes | Gauge | Total number of inodes |
| total_read_ops | Gauge | Total number of read operations generated by all processes |
| total_read_size | Gauge | Number of bytes read in input/output operations generated by all processes |
| total_write_ops | Gauge | Total number of write operations generated by all processes |
| total_write_size | Gauge | Number of bytes written in input/output operations generated by all processes |
MDS Rank Metrics
Per-MDS-rank metrics are also exported as Labeled Perf Counters with
a rank label. These describe the MDS daemon process itself, not any
particular client or subvolume.
| Name | Type | Description |
|---|---|---|
|  | Gauge | Sum of per-core CPU utilization for the MDS process |
|  | Gauge | Number of metadata requests currently in flight for this MDS rank |
Subvolume Metrics
CephFS exports subvolume metrics as Labeled Perf Counters, which can be used to monitor subvolume performance and utilization.
I/O Performance Metrics
I/O performance metrics (IOPS, throughput, latency) are aggregated within a
sliding window of 30 seconds by default. This interval is configurable via
the subv_metrics_window_interval parameter (see MDS Config Reference).
In large clusters with tens of thousands of subvolumes, this parameter also
controls when stale metrics are evicted: once the sliding window becomes empty
(no I/O activity), the metrics entry is removed rather than reporting zeros,
reducing memory usage and computational overhead.
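For example, the sliding window could be widened cluster-wide with the standard `ceph config` interface. This is a sketch: the option name `subv_metrics_window_interval` comes from this document, while the 60-second value is purely illustrative.

```shell
# Widen the subvolume metrics sliding window from the default 30s to 60s.
# Applies to all MDS daemons in the cluster.
ceph config set mds subv_metrics_window_interval 60

# Confirm the value currently in effect.
ceph config get mds subv_metrics_window_interval
```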
Important
Metadata operations do NOT trigger metric updates. Only actual data
I/O (reads and writes to file contents) updates the sliding window and
keeps the subvolume metrics entry active. Metadata-only operations such
as mkdir, rmdir, unlink, rename, chmod, chown,
setxattr, stat, and ls do not generate I/O metrics.
This means:
- If a subvolume has only metadata activity (e.g., creating or deleting files without writing data), its I/O metrics will show zeros, or the entry may be evicted after the window expires.
- After deleting files, the used_bytes value will not immediately reflect the freed space until either new data I/O occurs or the MDS broadcasts updated quota information.
Utilization Metrics
In addition to I/O performance, subvolume metrics include utilization counters:
- quota_bytes: The configured quota limit for the subvolume (0 if unlimited).
- used_bytes: Current space usage based on the inode's recursive statistics (rstat.rbytes).
These values are updated when the MDS broadcasts quota information to
clients. The used_bytes reflects the recursive byte count of the
subvolume root inode, which is maintained by the MDS as files are created,
modified, or deleted. However, since metric reporting depends on I/O
activity to keep entries alive, the utilization values are only reported
while the subvolume has active I/O within the sliding window.
| Name | Type | Description |
|---|---|---|
| avg_read_iops | Gauge | Average read IOPS (input/output operations per second) over the sliding window |
| avg_read_tp_Bps | Gauge | Average read throughput in bytes per second |
| avg_read_lat_msec | Gauge | Average read latency in milliseconds |
| avg_write_iops | Gauge | Average write IOPS over the sliding window |
| avg_write_tp_Bps | Gauge | Average write throughput in bytes per second |
| avg_write_lat_msec | Gauge | Average write latency in milliseconds |
| quota_bytes | Gauge | Configured quota limit in bytes (0 if no quota/unlimited) |
| used_bytes | Gauge | Current space usage in bytes (recursive byte count of the subvolume root) |
Getting Metrics
The metrics can be scraped from the MDS admin socket as well as through the tell interface. The mds_client_metrics-<fsname> section in the output of the counter dump command displays the metrics for each client, as shown below:
"mds_client_metrics": [
{
"labels": {
"fs_name": "<fsname>",
"id": "14213"
},
"counters": {
"num_clients": 2
}
}
],
"mds_client_metrics-<fsname>": [
{
"labels": {
"client": "client.0",
"rank": "0"
},
"counters": {
"cap_hits": 5149,
"cap_miss": 1,
"avg_read_latency": 0.000000000,
"avg_write_latency": 0.000000000,
"avg_metadata_latency": 0.000000000,
"dentry_lease_hits": 0,
"dentry_lease_miss": 0,
"opened_files": 1,
"opened_inodes": 2,
"pinned_icaps": 2,
"total_inodes": 2,
"total_read_ops": 0,
"total_read_size": 0,
"total_write_ops": 4836,
"total_write_size": 633864192
}
},
{
"labels": {
"client": "client.1",
"rank": "0"
},
"counters": {
"cap_hits": 3375,
"cap_miss": 8,
"avg_read_latency": 0.000000000,
"avg_write_latency": 0.000000000,
"avg_metadata_latency": 0.000000000,
"dentry_lease_hits": 0,
"dentry_lease_miss": 0,
"opened_files": 1,
"opened_inodes": 2,
"pinned_icaps": 2,
"total_inodes": 2,
"total_read_ops": 0,
"total_read_size": 0,
"total_write_ops": 3169,
"total_write_size": 415367168
}
}
]
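Because the dump is plain JSON, per-client counters can be aggregated programmatically. A minimal sketch, using the structure and counter values from the sample output above (the helper function name is hypothetical, not part of any Ceph API):

```python
def total_write_bytes(dump: dict, fs_name: str) -> int:
    """Sum total_write_size over all client entries for one filesystem.

    `dump` is the parsed JSON from `counter dump`; the per-client entries
    live under the "mds_client_metrics-<fsname>" key, as shown above.
    """
    entries = dump.get(f"mds_client_metrics-{fs_name}", [])
    return sum(e["counters"]["total_write_size"] for e in entries)

# Trimmed-down sample mirroring the dump above (two clients on rank 0).
sample = {
    "mds_client_metrics-a": [
        {"labels": {"client": "client.0", "rank": "0"},
         "counters": {"total_write_size": 633864192}},
        {"labels": {"client": "client.1", "rank": "0"},
         "counters": {"total_write_size": 415367168}},
    ]
}

print(total_write_bytes(sample, "a"))  # 1049231360
```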
The subvolume metrics are dumped as part of the same command. The mds_subvolume_metrics section in the output of the counter dump command displays the metrics for each subvolume, as shown below:
"mds_subvolume_metrics": [
{
"labels": {
"fs_name": "a",
"subvolume_path": "/volumes/_nogroup/test_subvolume"
},
"counters": {
"avg_read_iops": 0,
"avg_read_tp_Bps": 11,
"avg_read_lat_msec": 0,
"avg_write_iops": 1564,
"avg_write_tp_Bps": 6408316,
"avg_write_lat_msec": 338,
"quota_bytes": 10737418240,
"used_bytes": 1073741824
}
    }
]
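Given quota_bytes and used_bytes from such a dump, quota utilization can be derived on the consumer side. A minimal sketch (the helper is illustrative, not a Ceph API; the numbers are the values from the sample dump above, i.e. 1 GiB used of a 10 GiB quota):

```python
from typing import Optional

def utilization_pct(used_bytes: int, quota_bytes: int) -> Optional[float]:
    """Percentage of the quota consumed.

    Returns None when quota_bytes is 0, which per the table above
    means no quota is configured (unlimited).
    """
    if quota_bytes == 0:
        return None
    return 100.0 * used_bytes / quota_bytes

print(utilization_pct(1073741824, 10737418240))  # 10.0
print(utilization_pct(1073741824, 0))            # None (unlimited)
```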