mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
linux/tools/perf/arch
Ian Rogers 0ffca606e9 perf pmu intel: Adjust cpumaks for sub-NUMA clusters on graniterapids
On graniterapids the cache home agent (CHA) and memory controller
(IMC) PMUs all have their cpumask set to per-socket information. For
per-NUMA-node aggregation to work correctly, each PMU's cpumask needs
to be set to the CPUs of the relevant sub-NUMA grouping.

For example, on a 2-socket graniterapids machine with sub-NUMA
clustering of 3, the cpumask of the uncore_cha and uncore_imc PMUs is
"0,120", leading to aggregation only on NUMA nodes 0 and 3:
```
$ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1

 Performance counter stats for 'system wide':

N0        1    277,835,681,344      UNC_CHA_CLOCKTICKS
N0        1     19,242,894,228      UNC_M_CLOCKTICKS
N3        1    277,803,448,124      UNC_CHA_CLOCKTICKS
N3        1     19,240,741,498      UNC_M_CLOCKTICKS

       1.002113847 seconds time elapsed
```

By updating the PMUs' cpumasks to "0,120", "40,160" and "80,200", the
correct 6 NUMA node aggregations are achieved:
```
$ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1

 Performance counter stats for 'system wide':

N0        1     92,748,667,796      UNC_CHA_CLOCKTICKS
N0        0      6,424,021,142      UNC_M_CLOCKTICKS
N1        0     92,753,504,424      UNC_CHA_CLOCKTICKS
N1        1      6,424,308,338      UNC_M_CLOCKTICKS
N2        0     92,751,170,084      UNC_CHA_CLOCKTICKS
N2        0      6,424,227,402      UNC_M_CLOCKTICKS
N3        1     92,745,944,144      UNC_CHA_CLOCKTICKS
N3        0      6,423,752,086      UNC_M_CLOCKTICKS
N4        0     92,725,793,788      UNC_CHA_CLOCKTICKS
N4        1      6,422,393,266      UNC_M_CLOCKTICKS
N5        0     92,717,504,388      UNC_CHA_CLOCKTICKS
N5        0      6,421,842,618      UNC_M_CLOCKTICKS

       1.003406645 seconds time elapsed
```
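
The arithmetic behind the three cpumasks above can be sketched as follows.
This is only an illustration, not the code from this patch: the helper
name snc_cpumask() and its parameters are hypothetical, and it assumes
CPUs are numbered contiguously per socket (120 per socket, as inferred
from the "0,120" mask) and split evenly across the sub-NUMA clusters. A
real implementation would more likely derive the CPUs from the topology
the kernel exposes rather than from arithmetic.
```c
#include <stdio.h>

/*
 * Hypothetical helper (not taken from the patch): write the cpumask for
 * sub-NUMA cluster `cluster` into `buf`, using the first CPU of that
 * cluster on every socket.
 *
 * Assumptions: CPUs are numbered contiguously per socket and each socket
 * is divided evenly into `snc_nodes` clusters.
 *
 * Returns the number of characters written, or -1 on bad arguments or
 * if `buf` is too small.
 */
static int snc_cpumask(char *buf, size_t len, int sockets,
		       int cpus_per_socket, int snc_nodes, int cluster)
{
	size_t off = 0;
	int cpus_per_cluster;

	if (sockets <= 0 || cpus_per_socket <= 0 || snc_nodes <= 0 ||
	    cluster < 0 || cluster >= snc_nodes ||
	    cpus_per_socket % snc_nodes != 0)
		return -1;

	cpus_per_cluster = cpus_per_socket / snc_nodes;

	for (int s = 0; s < sockets; s++) {
		/* First CPU of this cluster on socket s. */
		int cpu = s * cpus_per_socket + cluster * cpus_per_cluster;
		int n = snprintf(buf + off, len - off, "%s%d",
				 s ? "," : "", cpu);

		/* Fail on encoding errors or truncation. */
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```
Under those assumptions, calling snc_cpumask() with sockets=2,
cpus_per_socket=120 and snc_nodes=3 for clusters 0, 1 and 2 yields
"0,120", "40,160" and "80,200" respectively.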

In general, having the perf tool adjust cpumasks isn't desirable as
ideally the PMU driver would be advertising the correct cpumask.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250515181417.491401-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 23:15:48 -03:00
alpha perf build: Remove Makefile.syscalls 2025-03-20 22:58:20 -07:00
arc/annotate perf build: Remove Makefile.syscalls 2025-03-20 22:58:20 -07:00
arm tools headers: Update the syscall table with the kernel sources 2025-04-10 09:28:24 -07:00
arm64 perf build: Remove Makefile.syscalls 2025-03-20 22:58:20 -07:00
csky perf build: Remove Makefile.syscalls 2025-03-20 22:58:20 -07:00
loongarch perf build: Remove Makefile.syscalls 2025-03-20 22:58:20 -07:00
mips tools headers: Update the syscall table with the kernel sources 2025-04-10 09:28:24 -07:00
parisc perf build: Remove Makefile.syscalls 2025-03-20 22:58:20 -07:00
powerpc tools headers: Update the syscall table with the kernel sources 2025-04-10 09:28:24 -07:00
riscv perf build: Remove Makefile.syscalls 2025-03-20 22:58:20 -07:00
riscv64/annotate perf disasm: Add e_machine/e_flags to struct arch 2024-11-09 08:39:13 -08:00
s390 tools headers: Update the syscall table with the kernel sources 2025-04-10 09:28:24 -07:00
sh tools headers: Update the syscall table with the kernel sources 2025-04-10 09:28:24 -07:00
sparc tools headers: Update the syscall table with the kernel sources 2025-04-10 09:28:24 -07:00
x86 perf pmu intel: Adjust cpumaks for sub-NUMA clusters on graniterapids 2025-05-22 23:15:48 -03:00
xtensa tools headers: Update the syscall table with the kernel sources 2025-04-10 09:28:24 -07:00
Build perf util: Make util its own library 2024-06-26 11:07:42 -07:00
common.c perf tools riscv: Add support for riscv lookup_binutils_path 2023-05-12 15:21:48 -03:00
common.h perf annotate: Own objdump_path and disassembler_style strings 2023-04-04 09:39:56 -03:00