x86,fs/resctrl: Update documentation for telemetry events

Update resctrl filesystem documentation with the details about the resctrl
files that support telemetry events.

  [ bp: Drop the debugfs hunk of the documentation until a better debugging
    solution is found. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
This commit is contained in:
Tony Luck
2025-12-17 09:21:19 -08:00
committed by Borislav Petkov (AMD)
parent 4bbfc90122
commit a8848c4b43


@@ -252,13 +252,12 @@ with respect to allocation:
bandwidth percentages are directly applied to
the threads running on the core
If RDT monitoring is available there will be an "L3_MON" directory
If L3 monitoring is available there will be an "L3_MON" directory
with the following files:
"num_rmids":
The number of RMIDs available. This is the
upper bound for how many "CTRL_MON" + "MON"
groups can be created.
The number of RMIDs supported by hardware for
L3 monitoring events.
"mon_features":
Lists the monitoring events if
@@ -484,6 +483,24 @@ with the following files:
bytes) at which a previously used LLC_occupancy
counter can be considered for re-use.
If telemetry monitoring is available there will be a "PERF_PKG_MON" directory
with the following files:
"num_rmids":
The number of RMIDs for telemetry monitoring events.
On Intel, resctrl will not enable telemetry events if the number of
RMIDs that can be tracked concurrently is lower than the total number
of RMIDs supported. Telemetry events can be force-enabled with the
"rdt=" kernel parameter, but this may reduce the number of
monitoring groups that can be created.
"mon_features":
Lists the telemetry monitoring events that are enabled on this system.
The upper bound for how many "CTRL_MON" + "MON" groups can be created
is the smaller of the L3_MON and PERF_PKG_MON "num_rmids" values.
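The group limit described above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the patch: it assumes resctrl is mounted at /sys/fs/resctrl, and the helper names `read_num_rmids` and `max_monitor_groups` are hypothetical. Falling back to the single available value when only one of the two directories exists is also an assumption.

```python
from pathlib import Path


def read_num_rmids(info_dir, resource):
    """Return the integer in <info_dir>/<resource>/num_rmids, or None if absent."""
    path = Path(info_dir) / resource / "num_rmids"
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None


def max_monitor_groups(info_dir="/sys/fs/resctrl/info"):
    """Smaller of the L3_MON and PERF_PKG_MON num_rmids values.

    Assumption: if only one directory is present, its value is the bound.
    Returns None when neither directory exists.
    """
    values = [read_num_rmids(info_dir, r) for r in ("L3_MON", "PERF_PKG_MON")]
    present = [v for v in values if v is not None]
    return min(present) if present else None
```

Calling `max_monitor_groups()` on a system that exposes both directories returns the smaller of the two counts.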
Finally, in the top level of the "info" directory there is a file
named "last_cmd_status". This is reset with every "command" issued
via the file system (making new directories or writing to any of the
@@ -589,15 +606,40 @@ When control is enabled all CTRL_MON groups will also contain:
When monitoring is enabled all MON groups will also contain:
"mon_data":
This contains a set of files organized by L3 domain and by
RDT event. E.g. on a system with two L3 domains there will
be subdirectories "mon_L3_00" and "mon_L3_01". Each of these
directories have one file per event (e.g. "llc_occupancy",
"mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
files provide a read out of the current value of the event for
all tasks in the group. In CTRL_MON groups these files provide
the sum for all tasks in the CTRL_MON group and all tasks in
This contains directories for each monitor domain.
If L3 monitoring is enabled, there will be a "mon_L3_XX" directory for
each instance of an L3 cache. Each directory contains files for the enabled
L3 events (e.g. "llc_occupancy", "mbm_total_bytes", and "mbm_local_bytes").
If telemetry monitoring is enabled, there will be a "mon_PERF_PKG_YY"
directory for each physical processor package. Each directory contains
files for the enabled telemetry events (e.g. "core_energy", "activity",
"uops_retired", etc.).
The info/`*`/mon_features files provide the full list of enabled
event/file names.
"core_energy" reports a floating point number for the energy (in Joules)
consumed by cores (registers, arithmetic units, TLB and L1/L2 caches)
during execution of instructions summed across all logical CPUs on a
package for the current monitoring group.
"activity" also reports a floating point value (in Farads). This provides
an estimate of work done independent of the frequency that the CPUs used
for execution.
Note that "core_energy" and "activity" only measure energy/activity in the
"core" of the CPU (arithmetic units, TLB, L1 and L2 caches, etc.). They
do not include L3 cache, memory, I/O devices etc.
All other events report decimal integer values.
In a MON group these files provide a read out of the current value of
the event for all tasks in the group. In CTRL_MON groups these files
provide the sum for all tasks in the CTRL_MON group and all tasks in
MON groups. Please see example section for more details on usage.
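As a rough illustration of the layout described above, the following Python sketch walks a group's "mon_data" directory and parses each event file. It is an assumption-laden sketch, not part of the patch: the helper names are hypothetical, the set of floating point events is taken from the text ("core_energy" and "activity"), and returning None for any value that does not parse as a number is a defensive choice rather than documented behaviour.

```python
from pathlib import Path

# Per the text above, these two events report floating point values;
# all other events report decimal integers.
FLOAT_EVENTS = {"core_energy", "activity"}


def parse_event(name, text):
    """Parse one event file's contents; return None if it is not numeric."""
    text = text.strip()
    try:
        return float(text) if name in FLOAT_EVENTS else int(text)
    except ValueError:
        return None


def read_mon_data(group_dir):
    """Map each monitor domain directory to a dict of {event name: value}."""
    result = {}
    for domain in sorted(Path(group_dir, "mon_data").glob("mon_*")):
        if not domain.is_dir():
            continue
        # Only regular files are events; nested directories are skipped.
        result[domain.name] = {
            f.name: parse_event(f.name, f.read_text())
            for f in sorted(domain.iterdir())
            if f.is_file()
        }
    return result
```

For example, `read_mon_data("/sys/fs/resctrl")` would read the default group, assuming resctrl is mounted at that path.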
On systems with Sub-NUMA Cluster (SNC) enabled there are extra
directories for each node (located within the "mon_L3_XX" directory
for the L3 cache they occupy). These are named "mon_sub_L3_YY"