2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

47501 Commits

Author SHA1 Message Date
Zhang Rui
2a535d6cc3 tools/power turbostat: Dump RAPL sysfs info
for example:

intel-rapl:1: psys 28.0s:100W 976.0us:100W
intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W
intel-rapl:0/intel-rapl:0:0: core disabled
intel-rapl:0/intel-rapl:0:1: uncore disabled
intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W

[lenb: simplified format]

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

squish me

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
69078520fd tools/power turbostat: Avoid probing the same perf counters
For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.

Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are
introduced to describe this counter for different Vendors.

As a result, the perf counter is probed twice, and causes a failure in
in get_rapl_counters() because expected_read_size and actual_read_size
don't match.

Fix the problem by skipping the already probed counter.

Note, this is not a perfect fix. For example, if different
vendors/platforms use the same MSR value for different purpose, the code
can be fooled when it probes a rapl_counter_arch_infos[] entry that does
not belong to the running Vendor/Platform.

In a long run, better to put rapl_counter_arch_infos[] into the
platform_features so that this becomes Vendor/Platform specific.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
ff3d019e98 tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
platform_features->rapl_msrs describes the RAPL MSRs supported. While
RAPL Perf counters can be exposed from different kernel backend drivers,
e.g. RAPL MSR I/F driver, or RAPL TPMI I/F driver.

Thus, turbostat should first blindly probe all the available RAPL Perf
counters, and falls back to the RAPL MSR counters if they are listed in
platform_features->rapl_msrs.

With this, platforms that don't have RAPL MSRs can clear the
platform_features->rapl_msrs bits and use RAPL Perf counters only.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
0362337968 tools/power turbostat: Clean up add perf/msr counter logic
Increase the code readability by moving the no_perf/no_msr flag and the
cai->perf_name/cai->msr sanity checks into the counter probe functions.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
1ab2e19b4c tools/power turbostat: Introduce add_msr_counter()
probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR
counters and MPERF/APERF/SMI MSR counters, thus its name is misleading.

Similar to add_perf_counter(), introduce add_msr_counter() to probe a
counter via MSR. Introduce wrapper function add_rapl_msr_counter() at
the same time to add extra check for Zero return value for specified
RAPL counters.

No functional change intended.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
3403e89f97 tools/power turbostat: Remove add_msr_perf_counter_()
As the only caller of add_msr_perf_counter_(), add_msr_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_msr_perf_counter_() and move all the logic to
add_msr_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
4d6ced7bef tools/power turbostat: Remove add_cstate_perf_counter_()
As the only caller of add_cstate_perf_counter_(),
add_cstate_perf_counter() just gives extra debug output on top. There is
no need to keep both functions.

Remove add_cstate_perf_counter_() and move all the logic to
add_cstate_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
c8bca955da tools/power turbostat: Remove add_rapl_perf_counter_()
As the only caller of add_rapl_perf_counter_(), add_rapl_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_rapl_perf_counter_() and move all the logic to
add_rapl_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
57b53787f0 tools/power turbostat: Quit early for unsupported RAPL counters
Quit early for unsupported RAPL counters.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
fdea6b883b tools/power turbostat: Always check rapl_joules flag
rapl_joules bit should always be checked even if
platform_features->rapl_msrs is not set or no_msr flag is used.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Gautham R. Shenoy
adb49732c8 tools/power turbostat: Fix AMD package-energy reporting
commit 05a2f07db8 ("tools/power turbostat: read RAPL counters via
perf") that adds support to read RAPL counters via perf defines the
notion of a RAPL domain_id which is set to physical_core_id on
platforms which support per_core_rapl counters (Eg: AMD processors
Family 17h onwards) and is set to the physical_package_id on all the
other platforms.

However, the physical_core_id is only unique within a package and on
platforms with multiple packages more than one core can have the same
physical_core_id and thus the same domain_id. (For eg, the first cores
of each package have the physical_core_id = 0). This results in all
these cores with the same physical_core_id using the same entry in the
rapl_counter_info_perdomain[]. Since rapl_perf_init() skips the
perf-initialization for cores whose domain_ids have already been
visited, cores that have the same physical_core_id always read the
perf file corresponding to the physical_core_id of the first package
and thus the package-energy is incorrectly reported to be the same
value for different packages.

Note: This issue only arises when RAPL counters are read via perf and
not when they are read via MSRs since in the latter case the MSRs are
read separately on each core.

Fix this issue by associating each CPU with rapl_core_id which is
unique across all the packages in the system.

Fixes: 05a2f07db8 ("tools/power turbostat: read RAPL counters via perf")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Kaushlendra Kumar
b4a734d383 tools/power turbostat: Fix RAPL_GFX_ALL typo
Fix typo in the currently unused RAPL_GFX_ALL macro definition.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Kaushlendra Kumar
5663785ae0 tools/power turbostat: Add Android support for MSR device handling
It uses /dev/msrN device paths on Android instead of /dev/cpu/N/msr,
updates error messages and permission checks to reflect the Android
device path, and wraps platform-specific code with #if defined(ANDROID)
to ensure correct behavior on both Android and non-Android systems.
These changes improve compatibility and usability of turbostat on
Android devices.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Len Brown
c967900fcb tools/power turbostat.8: pm_domain wording fix
turbostat.8: clarify that uncore "domains" are Power Management domains,
aka pm_domains.

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Len Brown
394c1127ab tools/power turbostat.8: fix typo: idle_pct should be pct_idle
idle_pct should be pct_idle

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:01 -04:00
Linus Torvalds
35b574a6c2 mount-related bugfixes
this cycle regression (well, bugfix for this cycle bugfix for v6.15-rc1 regression)
 	do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases
 	selftests/mount_setattr: adapt detached mount propagation test
 v6.15	fs: allow clone_private_mount() for a path on real rootfs
 v6.11	fs/fhandle.c: fix a race in call of has_locked_children()
 v5.15	fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
 v5.15	clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns
 v5.7	path_overmount(): avoid false negatives
 v3.12	finish_automount(): don't leak MNT_LOCKED from parent to child
 v2.6.15	do_change_type(): refuse to operate on unmounted/not ours mounts
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaEXGDwAKCRBZ7Krx/gZQ
 61WQAPwNBpcwum3F5fqT8rcKymqAUFpc0+rluJoBi+qfCQA9ywEAwn+Kh5qqtz++
 cdVnUYQxBrh0u5IOzMEFITlgfYFJZA4=
 =BIeU
 -----END PGP SIGNATURE-----

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull mount fixes from Al Viro:
 "Various mount-related bugfixes:

   - split the do_move_mount() checks in subtree-of-our-ns and
     entire-anon cases and adapt detached mount propagation selftest for
     mount_setattr

   - allow clone_private_mount() for a path on real rootfs

   - fix a race in call of has_locked_children()

   - fix move_mount propagation graph breakage by MOVE_MOUNT_SET_GROUP

   - make sure clone_private_mnt() caller has CAP_SYS_ADMIN in the right
     userns

   - avoid false negatives in path_overmount()

   - don't leak MNT_LOCKED from parent to child in finish_automount()

   - do_change_type(): refuse to operate on unmounted/not ours mounts"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  do_change_type(): refuse to operate on unmounted/not ours mounts
  clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns
  selftests/mount_setattr: adapt detached mount propagation test
  do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases
  fs: allow clone_private_mount() for a path on real rootfs
  fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
  finish_automount(): don't leak MNT_LOCKED from parent to child
  path_overmount(): avoid false negatives
  fs/fhandle.c: fix a race in call of has_locked_children()
2025-06-08 10:35:12 -07:00
Linus Torvalds
bdc7f8c5ad - The 4 patch series "Fix uprobe pte be overwritten when expanding vma"
fixes a longstanding and quite obscure bug related to the vma merging of
   the uprobe mmap page.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaEN1LAAKCRDdBJ7gKXxA
 jsfLAQCC+C8397X6lNKPI3bHGLGEHubP2uzb6bOFMAU6fIRobQEAqUnoUhfP+xsu
 tDbcQEBZ+vfyeT9Zr9vA+uBDcA3OGw0=
 =9oaG
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-06-06-16-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:
 "The series 'Fix uprobe pte be overwritten when expanding vma' fixes a
  longstanding and quite obscure bug related to the vma merging of the
  uprobe mmap page"

* tag 'mm-stable-2025-06-06-16-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  selftests/mm: add test about uprobe pte be orphan during vma merge
  selftests/mm: extract read_sysfs and write_sysfs into vm_util
  mm: expose abnormal new_pte during move_ptes
  mm: fix uprobe pte be overwritten when expanding vma
  mm/damon: s/primitives/code/ on comments
2025-06-06 22:06:57 -07:00
Linus Torvalds
d3c82f618a 13 hotfixes. 6 are cc:stable and the remainder address post-6.15 issues
or aren't considered necessary for -stable kernels.  11 are for MM.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaENzlAAKCRDdBJ7gKXxA
 joNYAP9n38QNDUoRR6ChFikzzY77q4alD2NL0aqXBZdcSRXoUgEAlQ8Ea+t6xnzp
 GnH+cnsA6FDp4F6lIoZBdENJyBYrkQE=
 =ud9O
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "13 hotfixes.

  6 are cc:stable and the remainder address post-6.15 issues or aren't
  considered necessary for -stable kernels. 11 are for MM"

* tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
  MAINTAINERS: add mm swap section
  kmsan: test: add module description
  MAINTAINERS: add tlb trace events to MMU GATHER AND TLB INVALIDATION
  mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race
  mm/hugetlb: unshare page tables during VMA split, not before
  MAINTAINERS: add Alistair as reviewer of mm memory policy
  iov_iter: use iov_offset for length calculation in iov_iter_aligned_bvec
  mm/mempolicy: fix incorrect freeing of wi_kobj
  alloc_tag: handle module codetag load errors as module load failures
  mm/madvise: handle madvise_lock() failure during race unwinding
  mm: fix vmstat after removing NR_BOUNCE
  KVM: s390: rename PROT_NONE to PROT_TYPE_DUMMY
2025-06-06 21:45:45 -07:00
Christian Brauner
7054674ee9 selftests/mount_setattr: adapt detached mount propagation test
Make sure that detached trees don't receive mount propagation.

Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-07 00:41:29 -04:00
Linus Torvalds
119b1e61a7 RISC-V Patches for the 6.16 Merge Window, Part 1
* Support for the FWFT SBI extension, which is part of SBI 3.0 and a
   dependency for many new SBI and ISA extensions.
 * Support for getrandom() in the VDSO.
 * Support for mseal.
 * Optimized routines for raid6 syndrome and recovery calculations.
 * kexec_file() supports loading Image-formatted kernel binaries.
 * Improvements to the instruction patching framework to allow for atomic
   instruction patching, along with rules as to how systems need to
   behave in order to function correctly.
 * Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
   some SiFive vendor extensions.
 * Various fixes and cleanups, including: misaligned access handling, perf
   symbol mangling, module loading, PUD THPs, and improved uaccess
   routines.
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmhDLP8ZHHBhbG1lcmRh
 YmJlbHRAZ29vZ2xlLmNvbQAKCRAuExnzX7sYiZhFD/4+Zikkld812VjFb9dTF+Wj
 n/x9h86zDwAEFgf2BMIpUQhHru6vtdkO2l/Ky6mQblTPMWLafF4eK85yCsf84sQ0
 +RX4sOMLZ0+qvqxKX+aOFe9JXOWB0QIQuPvgBfDDOV4UTm60sglIxwqOpKcsBEHs
 2nplXXjiv0ckaMFLos8xlwu1uy4A/jMfT3Y9FDcABxYCqBoKOZ1frcL9ezJZbHbv
 BoOKLDH8ZypFxIG/eQ511lIXXtrnLas0l4jHWjrfsWu6pmXTgJasKtbGuH3LoLnM
 G/4qvHufR6lpVUOIL5L0V6PpsmYwDi/ciFIFlc8NH2oOZil3qiVaGSEbJIkWGFu9
 8lWTXQWnbinZbfg2oYbWp8GlwI70vKomtDyYNyB9q9Cq9jyiTChMklRNODr4764j
 ZiEnzc/l4KyvaxUg8RLKCT595lKECiUDnMytbIbunJu05HBqRCoGpBtMVzlQsyUd
 ybkRt3BA7eOR8/xFA7ZZQeJofmiu2yxkBs5ggMo8UnSragw27hmv/OA0mWMXEuaD
 aaWc4ZKpKqf7qLchLHOvEl5ORUhsisyIJgZwOqdme5rQoWorVtr51faA4AKwFAN4
 vcKgc5qJjK8vnpW+rl3LNJF9LtH+h4TgmUI853vUlukPoH2oqRkeKVGSkxG0iAze
 eQy2VjP1fJz6ciRtJZn9aw==
 =cZGy
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the FWFT SBI extension, which is part of SBI 3.0 and a
   dependency for many new SBI and ISA extensions

 - Support for getrandom() in the VDSO

 - Support for mseal

 - Optimized routines for raid6 syndrome and recovery calculations

 - kexec_file() supports loading Image-formatted kernel binaries

 - Improvements to the instruction patching framework to allow for
   atomic instruction patching, along with rules as to how systems need
   to behave in order to function correctly

 - Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
   some SiFive vendor extensions

 - Various fixes and cleanups, including: misaligned access handling,
   perf symbol mangling, module loading, PUD THPs, and improved uaccess
   routines

* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
  riscv: uaccess: Only restore the CSR_STATUS SUM bit
  RISC-V: vDSO: Wire up getrandom() vDSO implementation
  riscv: enable mseal sysmap for RV64
  raid6: Add RISC-V SIMD syndrome and recovery calculations
  riscv: mm: Add support for Svinval extension
  RISC-V: Documentation: Add enough title underlines to CMODX
  riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
  MAINTAINERS: Update Atish's email address
  riscv: uaccess: do not do misaligned accesses in get/put_user()
  riscv: process: use unsigned int instead of unsigned long for put_user()
  riscv: make unsafe user copy routines use existing assembly routines
  riscv: hwprobe: export Zabha extension
  riscv: Make regs_irqs_disabled() more clear
  perf symbols: Ignore mapping symbols on riscv
  RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
  riscv: module: Optimize PLT/GOT entry counting
  riscv: Add support for PUD THP
  riscv: xchg: Prefetch the destination word for sc.w
  riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
  riscv: Add support for Zicbop
  ...
2025-06-06 18:05:18 -07:00
Andrii Nakryiko
02670deede libbpf: Handle unsupported mmap-based /sys/kernel/btf/vmlinux correctly
libbpf_err_ptr() helpers are meant to return NULL and set errno, if
there is an error. But btf_parse_raw_mmap() is meant to be used
internally and is expected to return ERR_PTR() values. Because of this
mismatch, when libbpf tries to mmap /sys/kernel/btf/vmlinux, we don't
detect the error correctly with IS_ERR() check, and never fallback to
old non-mmap-based way of loading vmlinux BTF.

Fix this by using proper ERR_PTR() returns internally.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 3c0421c93c ("libbpf: Use mmap to parse vmlinux BTF from sysfs")
Cc: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250606202134.2738910-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-06 14:07:07 -07:00
Linus Torvalds
6d8854216e block-6.16-20250606
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmhC7/UQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgps+6D/9BOhkMyMkUF9LAev4PBNE+x3aftjl7Y1AY
 EHv2vozb4nDwXIaalG4qGUhprz2+z+hqxYjmnlOAsqbixhcSzKK5z9rjxyDka776
 x03vfvKXaXZUG7XN7ENY8sJnLx4QJ0nh4+0gzT9yDyq2vKvPFLEKweNOxKDKCSbE
 31vGoLFwjltp74hX+Qrnj1KMaTLgvAaV0eXKWlbX7Iiw6GFVm200zb27gth6U8bV
 WQAmjSkFQ0daHtAWmXIVy7hrXiCqe8D6YPKvBXnQ4cfKVbgG0HHDuTmQLpKGzfMi
 rr24MU5vZjt6OsYalceiTtifSUcf/I2+iFV7HswOk9kpOY5A2ylsWawRP2mm4PDI
 nJE3LaSTRpEvs5kzPJ2kr8Zp4/uvF6ehSq8Y9w52JekmOzxusLcRcswezaO00EI0
 32uuK+P505EGTcCBTrEdtaI6k7zzQEeVoIpxqvMhNRG/s5vzvIV3eVrALu2HSDma
 P3paEdx7PwJla3ndmdChfh1vUR3TW3gWoZvoNCVmJzNCnLEAScTS2NsiQeEjy8zs
 20IGsrRgIqt9KR8GZ2zj1ZOM47Cg0dIU3pbbA2Ja71wx4TYXJCSFFRK7mzDtXYlY
 BWOix/Dks8tk118cwuxnT+IiwmWDMbDZKnygh+4tiSyrs0IszeekRADLUu03C0Ve
 Dhpljqf3zA==
 =gs32
 -----END PGP SIGNATURE-----

Merge tag 'block-6.16-20250606' of git://git.kernel.dk/linux

Pull more block updates from Jens Axboe:

 - NVMe pull request via Christoph:
      - TCP error handling fix (Shin'ichiro Kawasaki)
      - TCP I/O stall handling fixes (Hannes Reinecke)
      - fix command limits status code (Keith Busch)
      - support vectored buffers also for passthrough (Pavel Begunkov)
      - spelling fixes (Yi Zhang)

 - MD pull request via Yu:
      - fix REQ_RAHEAD and REQ_NOWAIT IO err handling for raid1/10
      - fix max_write_behind setting for dm-raid
      - some minor cleanups

 - Integrity data direction fix and cleanup

 - bcache NULL pointer fix

 - Fix for loop missing write start/end handling

 - Decouple hardware queues and IO threads in ublk

 - Slew of ublk selftests additions and updates

* tag 'block-6.16-20250606' of git://git.kernel.dk/linux: (29 commits)
  nvme: spelling fixes
  nvme-tcp: fix I/O stalls on congested sockets
  nvme-tcp: sanitize request list handling
  nvme-tcp: remove tag set when second admin queue config fails
  nvme: enable vectored registered bufs for passthrough cmds
  nvme: fix implicit bool to flags conversion
  nvme: fix command limits status code
  selftests: ublk: kublk: improve behavior on init failure
  block: flip iter directions in blk_rq_integrity_map_user()
  block: drop direction param from bio_integrity_copy_user()
  selftests: ublk: cover PER_IO_DAEMON in more stress tests
  Documentation: ublk: document UBLK_F_PER_IO_DAEMON
  selftests: ublk: add stress test for per io daemons
  selftests: ublk: add functional test for per io daemons
  selftests: ublk: kublk: decouple ublk_queues from ublk server threads
  selftests: ublk: kublk: move per-thread data out of ublk_queue
  selftests: ublk: kublk: lift queue initialization out of thread
  selftests: ublk: kublk: tie sqe allocation to io instead of queue
  selftests: ublk: kublk: plumb q_id in io_uring user_data
  ublk: have a per-io daemon instead of a per-queue daemon
  ...
2025-06-06 13:12:50 -07:00
Linus Torvalds
c26f4fbd58 Char/Misc/IIO pull request for 6.16-rc1
Here is the big char/misc/iio and other small driver subsystem pull
 request for 6.16-rc1.
 
 Overall, a lot of individual changes, but nothing major, just the normal
 constant forward progress of new device support and cleanups to existing
 subsystems.  Highlights in here are:
   - Large IIO driver updates and additions and device tree changes
   - Android binder bugfixes and logfile fixes
   - mhi driver updates
   - comedi driver updates
   - counter driver updates and additions
   - coresight driver updates and additions
   - echo driver removal as there are no in-kernel users of it
   - nvmem driver updates
   - spmi driver updates
   - new amd-sbi driver "subsystem" and drivers added
   - rust miscdriver binding documentation fix
   - other small driver fixes and updates (uio, w1, acrn, hpet, xillybus,
     cardreader drivers, fastrpc and others.)
 
 All of these have been in linux-next for quite a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCaEKg5Q8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykUyACgmAzrzKMoQUwwhQ6ed2l7tHdrlOcAoIORI1/x
 pNqQdrE1EbmAAyl47IN4
 =ts6J
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc / iio driver updates from Greg KH:
 "Here is the big char/misc/iio and other small driver subsystem pull
  request for 6.16-rc1.

  Overall, a lot of individual changes, but nothing major, just the
  normal constant forward progress of new device support and cleanups to
  existing subsystems. Highlights in here are:

   - Large IIO driver updates and additions and device tree changes

   - Android binder bugfixes and logfile fixes

   - mhi driver updates

   - comedi driver updates

   - counter driver updates and additions

   - coresight driver updates and additions

   - echo driver removal as there are no in-kernel users of it

   - nvmem driver updates

   - spmi driver updates

   - new amd-sbi driver "subsystem" and drivers added

   - rust miscdriver binding documentation fix

   - other small driver fixes and updates (uio, w1, acrn, hpet,
     xillybus, cardreader drivers, fastrpc and others)

  All of these have been in linux-next for quite a while with no
  reported problems"

* tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (390 commits)
  binder: fix yet another UAF in binder_devices
  counter: microchip-tcb-capture: Add watch validation support
  dt-bindings: iio: adc: Add ROHM BD79100G
  iio: adc: add support for Nuvoton NCT7201
  dt-bindings: iio: adc: add NCT7201 ADCs
  iio: chemical: Add driver for SEN0322
  dt-bindings: trivial-devices: Document SEN0322
  iio: adc: ad7768-1: reorganize driver headers
  iio: bmp280: zero-init buffer
  iio: ssp_sensors: optimalize -> optimize
  HID: sensor-hub: Fix typo and improve documentation
  iio: admv1013: replace redundant ternary operator with just len
  iio: chemical: mhz19b: Fix error code in probe()
  iio: adc: at91-sama5d2: use IIO_DECLARE_BUFFER_WITH_TS
  iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS
  iio: adc: ad7380: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: adc: ad4695: rename AD4695_MAX_VIN_CHANNELS
  iio: adc: ad4695: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: introduce IIO_DECLARE_BUFFER_WITH_TS macros
  iio: make IIO_DMA_MINALIGN minimum of 8 bytes
  ...
2025-06-06 11:50:47 -07:00
Jann Horn
081056dc00 mm/hugetlb: unshare page tables during VMA split, not before
Currently, __split_vma() triggers hugetlb page table unsharing through
vm_ops->may_split().  This happens before the VMA lock and rmap locks are
taken - which is too early, it allows racing VMA-locked page faults in our
process and racing rmap walks from other processes to cause page tables to
be shared again before we actually perform the split.

Fix it by explicitly calling into the hugetlb unshare logic from
__split_vma() in the same place where THP splitting also happens.  At that
point, both the VMA and the rmap(s) are write-locked.

An annoying detail is that we can now call into the helper
hugetlb_unshare_pmds() from two different locking contexts:

1. from hugetlb_split(), holding:
    - mmap lock (exclusively)
    - VMA lock
    - file rmap lock (exclusively)
2. hugetlb_unshare_all_pmds(), which I think is designed to be able to
   call us with only the mmap lock held (in shared mode), but currently
   only runs while holding mmap lock (exclusively) and VMA lock

Backporting note:
This commit fixes a racy protection that was introduced in commit
b30c14cd61 ("hugetlb: unshare some PMDs when splitting VMAs"); that
commit claimed to fix an issue introduced in 5.13, but it should actually
also go all the way back.

[jannh@google.com: v2]
  Link: https://lkml.kernel.org/r/20250528-hugetlb-fixes-splitrace-v2-1-1329349bad1a@google.com
Link: https://lkml.kernel.org/r/20250528-hugetlb-fixes-splitrace-v2-0-1329349bad1a@google.com
Link: https://lkml.kernel.org/r/20250527-hugetlb-fixes-splitrace-v1-1-f4136f5ec58a@google.com
Fixes: 39dde65c99 ("[PATCH] shared page table for hugetlb page")
Signed-off-by: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>	[b30c14cd61: hugetlb: unshare some PMDs when splitting VMAs]
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-05 22:02:24 -07:00
Pu Lehui
efe99fabeb selftests/mm: add test about uprobe pte be orphan during vma merge
Add test about uprobe pte be orphan during vma merge.

[akpm@linux-foundation.org: include sys/syscall.h, per Lorenzo]
Link: https://lkml.kernel.org/r/20250529155650.4017699-5-pulehui@huaweicloud.com
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-05 21:55:42 -07:00
Pu Lehui
6fb6223347 selftests/mm: extract read_sysfs and write_sysfs into vm_util
Extract read_sysfs and write_sysfs into vm_util.  Meanwhile, rename the
function in thuge-gen that has the same name as read_sysfs.

Link: https://lkml.kernel.org/r/20250529155650.4017699-4-pulehui@huaweicloud.com
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-05 21:55:42 -07:00
Palmer Dabbelt
2670a39b1e
Merge tag 'riscv-mw2-6.16-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next
riscv patches for 6.16-rc1, part 2

* Performance improvements
  - Add support for vdso getrandom
  - Implement raid6 calculations using vectors
  - Introduce svinval tlb invalidation

* Cleanup
  - A bunch of deduplication of the macros we use for manipulating instructions

* Misc
  - Introduce a kunit test for kprobes
  - Add support for mseal as riscv fits the requirements (thanks to Lorenzo for making sure of that :))

[Palmer: There was a rebase between part 1 and part 2, so I've had to do
some more git surgery here... at least two rounds of surgery...]

* alex-pr-2: (866 commits)
  RISC-V: vDSO: Wire up getrandom() vDSO implementation
  riscv: enable mseal sysmap for RV64
  raid6: Add RISC-V SIMD syndrome and recovery calculations
  riscv: mm: Add support for Svinval extension
  riscv: Add kprobes KUnit test
  riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM
  riscv: kprobes: Remove duplication of RV_EXTRACT_UTYPE_IMM
  riscv: kprobes: Remove duplication of RV_EXTRACT_RD_REG
  riscv: kprobes: Remove duplication of RVC_EXTRACT_BTYPE_IMM
  riscv: kprobes: Remove duplication of RVC_EXTRACT_C2_RS1_REG
  riscv: kproves: Remove duplication of RVC_EXTRACT_JTYPE_IMM
  riscv: kprobes: Remove duplication of RV_EXTRACT_BTYPE_IMM
  riscv: kprobes: Remove duplication of RV_EXTRACT_RS1_REG
  riscv: kprobes: Remove duplication of RV_EXTRACT_JTYPE_IMM
  riscv: kprobes: Move branch_funct3 to insn.h
  riscv: kprobes: Move branch_rs2_idx to insn.h
  Linux 6.15-rc6
  Input: xpad - fix xpad_device sorting
  Input: xpad - add support for several more controllers
  Input: xpad - fix Share button on Xbox One controllers
  ...
2025-06-05 14:03:16 -07:00
Xi Ruoyao
ee0d03053e
RISC-V: vDSO: Wire up getrandom() vDSO implementation
Hook up the generic vDSO implementation to the generic vDSO getrandom
implementation by providing the required __arch_chacha20_blocks_nostack
and getrandom_syscall implementations. Also wire up the selftests.

The benchmark result:

	vdso: 25000000 times in 2.466341333 seconds
	libc: 25000000 times in 41.447720005 seconds
	syscall: 25000000 times in 41.043926672 seconds

	vdso: 25000000 x 256 times in 162.286219353 seconds
	libc: 25000000 x 256 times in 2953.855018685 seconds
	syscall: 25000000 x 256 times in 2796.268546000 seconds

[ alex: - Fix dynamic relocation
        - Squash Nathan's fix https://lore.kernel.org/all/20250423-riscv-fix-compat_vdso-lld-v2-1-b7bbbc244501@kernel.org/
	- Add comment from Loongarch ]

Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Link: https://lore.kernel.org/r/20250411024600.16045-1-xry111@xry111.site
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
2025-06-05 14:03:09 -07:00
Linus Torvalds
2c7e4a2663 Including fixes from CAN, wireless, Bluetooth, and Netfilter.
Current release - regressions:
 
  - Revert "kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN
    in all_tests", makes kunit error out if compiler is old
 
  - wifi: iwlwifi: mvm: fix assert on suspend
 
  - rxrpc: fix return from none_validate_challenge()
 
 Current release - new code bugs:
 
  - ovpn: couple of fixes for socket cleanup and UDP-tunnel teardown
 
  - can: kvaser_pciefd: refine error prone echo_skb_max handling logic
 
  - fix net_devmem_bind_dmabuf() stub when DEVMEM not compiled
 
  - eth: airoha: fixes for config / accel in bridge mode
 
 Previous releases - regressions:
 
  - Bluetooth: hci_qca: move the SoC type check to the right place,
    fix GPIO integration
 
  - prevent a NULL deref in rtnl_create_link() after locking changes
 
  - fix udp gso skb_segment after pull from frag_list
 
  - hv_netvsc: fix potential deadlock in netvsc_vf_setxdp()
 
 Previous releases - always broken:
 
  - netfilter:
    - nf_nat: also check reverse tuple to obtain clashing entry
    - nf_set_pipapo_avx2: fix initial map fill (zeroing)
 
  - fix the helper for incremental update of packet checksums after
    modifying the IP address, used by ILA and BPF
 
  - eth: stmmac: prevent div by 0 when clock rate is misconfigured
 
  - eth: ice: fix Tx scheduler handling of XDP and changing queue count
 
  - eth: b53: fix support for the RGMII interface when delays configured
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmhBv5kACgkQMUZtbf5S
 Irs/DA/+PIh7a33iVcsGIcmWtpnGp+18id1tSLnYGUGx1cW6zxutPD8rb6BsAN84
 KR+XVsbMDUehIa10xPoF2L5mX5YujEiPSkjP8eE2KJKDLGpDtYNOyOWKT21yudnd
 4EVF5JQoEbWHrkHMKF97tla84QLd5fFtgsvejVeZtQYSIDOteNGfra4Jly8iiR+J
 i9k+HdB0CNEKVvvibQZjZ5CrkpmdNPmB9UoJ59bG15q2+vXdzOPm/CCNo//9ZQJB
 I8O40nu16msRRVA9nc2V/Tp98fTk9dnDpTSyWiBlNCut9g9ftx456Ew+tjobMRIT
 yeh+q9+1z3YHjGJB8P1FGmMZWK3tbrwyqjFGqpSjr7juucFok9kxAaRPqrQxga7H
 Yxq3RegeNqukEAV39ZE14TL765Jy+XXF1uTHhNBkUADlNJVKnZygSk78/Ut2nDvQ
 vkfoto+CfKny5qkSbTk8KKv1rZu3xwewoOjlcdkHlOBoouCjPOxTC7yxTZgUZB5c
 yap0jQsedJct4OAA+O7IGLCmf3KrJ0H32HbWEY68mpTEd+4Df5vAWiIi7vmVJmk3
 DX9JWmu5A5yjNMhOEsBQU98gkNw366aA/E8dr+lEfp3AoqDrmdbG3l8+qqhqYnb+
 nnL1sNiQH1griZwQBUROAhrtXnYlYsAsZi+cv23Q0hQiGIvIC2Q=
 =sRQt
 -----END PGP SIGNATURE-----

Merge tag 'net-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from CAN, wireless, Bluetooth, and Netfilter.

  Current release - regressions:

   - Revert "kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in
     all_tests", makes kunit error out if compiler is old

   - wifi: iwlwifi: mvm: fix assert on suspend

   - rxrpc: fix return from none_validate_challenge()

  Current release - new code bugs:

   - ovpn: couple of fixes for socket cleanup and UDP-tunnel teardown

   - can: kvaser_pciefd: refine error prone echo_skb_max handling logic

   - fix net_devmem_bind_dmabuf() stub when DEVMEM not compiled

   - eth: airoha: fixes for config / accel in bridge mode

  Previous releases - regressions:

   - Bluetooth: hci_qca: move the SoC type check to the right place, fix
     GPIO integration

   - prevent a NULL deref in rtnl_create_link() after locking changes

   - fix udp gso skb_segment after pull from frag_list

   - hv_netvsc: fix potential deadlock in netvsc_vf_setxdp()

  Previous releases - always broken:

   - netfilter:
       - nf_nat: also check reverse tuple to obtain clashing entry
       - nf_set_pipapo_avx2: fix initial map fill (zeroing)

   - fix the helper for incremental update of packet checksums after
     modifying the IP address, used by ILA and BPF

   - eth:
       - stmmac: prevent div by 0 when clock rate is misconfigured
       - ice: fix Tx scheduler handling of XDP and changing queue count
       - eth: fix support for the RGMII interface when delays configured"

* tag 'net-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
  calipso: unlock rcu before returning -EAFNOSUPPORT
  seg6: Fix validation of nexthop addresses
  net: prevent a NULL deref in rtnl_create_link()
  net: annotate data-races around cleanup_net_task
  selftests: drv-net: tso: make bkg() wait for socat to quit
  selftests: drv-net: tso: fix the GRE device name
  selftests: drv-net: add configs for the TSO test
  wireguard: device: enable threaded NAPI
  netlink: specs: rt-link: decode ip6gre
  netlink: specs: rt-link: add missing byte-order properties
  net: wwan: mhi_wwan_mbim: use correct mux_id for multiplexing
  wifi: cfg80211/mac80211: correctly parse S1G beacon optional elements
  net: dsa: b53: do not touch DLL_IQQD on bcm53115
  net: dsa: b53: allow RGMII for bcm63xx RGMII ports
  net: dsa: b53: do not configure bcm63xx's IMP port interface
  net: dsa: b53: do not enable RGMII delay on bcm63xx
  net: dsa: b53: do not enable EEE on bcm63xx
  net: ti: icssg-prueth: Fix swapped TX stats for MII interfaces.
  selftests: netfilter: nft_nat.sh: add test for reverse clash with nat
  netfilter: nf_nat: also check reverse tuple to obtain clashing entry
  ...
2025-06-05 12:34:55 -07:00
Haibo Xu
4d6319289e
perf symbols: Ignore mapping symbols on riscv
RISCV ELF use mapping symbols with special names $x, $d to
identify regions of RISCV code or code with different ISAs[1].
These symbols don't identify functions, so will confuse the
perf output.

The patch filters out these symbols at load time, similar to
"4886f2ca perf symbols: Ignore mapping symbols on aarch64".

[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/
    master/riscv-elf.adoc#mapping-symbol

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250409025202.201046-1-haibo1.xu@intel.com
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
2025-06-05 11:10:16 -07:00
Jakub Kicinski
e6854be4d8 selftests: drv-net: tso: make bkg() wait for socat to quit
Commit 846742f7e3 ("selftests: drv-net: add a warning for
bkg + shell + terminate") added a warning for bkg() used
with terminate=True. The tso test was missed as we didn't
have it running anywhere in NIPA. Add exit_wait=True, to avoid:

  # Warning: combining shell and terminate is risky!
  #          SIGTERM may not reach the child on zsh/ksh!

getting printed twice for every variant.

Fixes: 0d0f4174f6 ("selftests: drv-net: add a simple TSO test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250604012055.891431-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-05 08:01:00 -07:00
Jakub Kicinski
c68804c934 selftests: drv-net: tso: fix the GRE device name
The device type for IPv4 GRE is "gre" not "ipgre",
unlike for IPv6 which uses "ip6gre".

Not sure how I missed this when writing the test, perhaps
because all HW I have access to is on an IPv6-only network.

Fixes: 0d0f4174f6 ("selftests: drv-net: add a simple TSO test")
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250604012031.891242-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-05 08:00:55 -07:00
Jakub Kicinski
7eb6b63aa3 selftests: drv-net: add configs for the TSO test
Add missing config options for the tso.py test, specifically
to make sure the kernel is built with vxlan and gre tunnels.

I noticed this while adding a TSO-capable device QEMU to the CI.
Previously we only run virtio tests and it doesn't report LSO
stats on the QEMU we have.

Fixes: 0d0f4174f6 ("selftests: drv-net: add a simple TSO test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250604001653.853008-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-05 08:00:50 -07:00
Sebastian Ott
fad4cf9448 KVM: arm64: selftests: Determine effective counter width in arch_timer_edge_cases
arch_timer_edge_cases uses ~0 as the maximum counter value, however there's
no architectural guarantee that this is valid.

Figure out the effective counter width based on the effective frequency
like it's done by the kernel.

This also serves as a workaround for AC03_CPU_14 that led to the
following assertion failure on ampere-one machines:

==== Test Assertion Failure ====
  arm64/arch_timer_edge_cases.c:169: timer_condition == istatus
  pid=11236 tid=11236 errno=4 - Interrupted system call
     1  0x0000000000404ce7: test_run at arch_timer_edge_cases.c:938
     2  0x0000000000401ebb: main at arch_timer_edge_cases.c:1053
     3  0x0000ffff9fa8625b: ?? ??:0
     4  0x0000ffff9fa8633b: ?? ??:0
     5  0x0000000000401fef: _start at ??:?
  0x1 != 0x0 (timer_condition != istatus)

Note that the following subtest only worked since the counter initialized
with CVAL_MAX would instantly overflow (which is no longer the case):

	test_set_cnt_after_cval_no_irq(timer, 0, DEF_CNT, CVAL_MAX, sm);

To fix this we could swap CVAL_MAX for 0 here but since that is already
done by test_move_counters_behind_timers() let's remove that subtest.

Link: https://lore.kernel.org/kvmarm/ac1de1d2-ef2b-d439-dc48-8615e121b07b@redhat.com
Link: https://amperecomputing.com/assets/AmpereOne_Developer_ER_v0_80_20240823_28945022f4.pdf
Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250605103613.14544-5-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:28:44 +01:00
Sebastian Ott
05ce38d489 KVM: arm64: selftests: Fix xVAL init in arch_timer_edge_cases
arch_timer_edge_cases hits the following assertion in < 10% of the test runs:

==== Test Assertion Failure ====
  arm64/arch_timer_edge_cases.c:490: timer_get_cntct(timer) >= DEF_CNT + (timer_get_cntfrq() * (uint64_t)(delta_2_ms) / 1000)
  pid=17110 tid=17110 errno=4 - Interrupted system call
     1  0x0000000000404ec7: test_run at arch_timer_edge_cases.c:945
     2  0x0000000000401fa3: main at arch_timer_edge_cases.c:1074
     3  0x0000ffffa774b587: ?? ??:0
     4  0x0000ffffa774b65f: ?? ??:0
     5  0x000000000040206f: _start at ??:?
  timer_get_cntct(timer) >= DEF_CNT + msec_to_cycles(delta_2_ms)

Enabling the timer without proper xval initialization in set_tval_irq()
resulted in an early interrupt during timer reprogramming. Make sure
to set the xval before setting the enable bit.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250605103613.14544-4-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:28:01 +01:00
Sebastian Ott
050632ae65 KVM: arm64: selftests: Fix thread migration in arch_timer_edge_cases
arch_timer_edge_cases tries to migrate itself across host cpus. Before
the first test, it migrates to cpu 0 by setting up an affinity mask with
only bit 0 set. After that it looks for the next possible cpu in the
current affinity mask which still has only bit 0 set. So there is no
migration at all.

Fix this by reading the default mask at start and use this to find
the next cpu in each iteration.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250605103613.14544-3-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:27:21 +01:00
Sebastian Ott
9a9864fd09 KVM: arm64: selftests: Fix help text for arch_timer_edge_cases
Fix the help text for arch_timer_edge_cases to show the correct
option for setting the wait time.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20250605103613.14544-2-sebott@redhat.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-06-05 14:27:00 +01:00
Sebastian Andrzej Siewior
0ecb4232fc selftests/futex: Set the home_node in futex_numa_mpol
The test fails at the MPOL step if multiple nodes are available. The
reason is that mbind() sets the policy but the home_node, which is
retrieved by the futex code, is not set. This causes to retrieve the
current node and with multiple nodes it fails on one of the iterations.

Use numa_set_mempolicy_home_node() to set the expected node.
Use ksft_exit_fail_msg() to fail and exit in order not to confuse ktap.

Fixes: 3163369407 ("selftests/futex: Add futex_numa_mpol")
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250528085521.1938355-3-bigeasy@linutronix.de
2025-06-05 14:37:58 +02:00
Sebastian Andrzej Siewior
1a9dcf69c7 selftests/futex: getopt() requires int as return value.
Mark reported that futex_priv_hash fails on ARM64.
It turns out that the command line parsing does not terminate properly
and ends in the default case assuming an invalid option was passed.

Use an int as the return type for getopt().

Closes: https://lore.kernel.org/all/31869a69-063f-44a3-a079-ba71b2506cce@sirena.org.uk/
Fixes: 3163369407 ("selftests/futex: Add futex_numa_mpol")
Fixes: cda95faef7 ("selftests/futex: Add futex_priv_hash")
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250528085521.1938355-2-bigeasy@linutronix.de
2025-06-05 14:37:58 +02:00
Paolo Abeni
edafd348a0 netfilter pull request 25-06-05
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmhBWmkACgkQ1w0aZmrP
 KyFWpxAAqMGJCK2pp09/3lUJNaPfGR0HJGTk9LAhWpMWoTvwJfnJYZ5PQzMcOpbZ
 d4lZoYiJph3eo0FofJXR1wzNvj2WeBStJCpiInd618QoxZkEVG2UN+5K4h64UVKo
 Hq/Zc8wjGfQ48KMP5AlliM0W/ES9c0R+5E34mn6Arid9Yoj1cnFTVEl1M1bvQ+lK
 wb91JtYbeXbUxMogRQ05fpSSK+lqthwHx4BlzX39eRrywHWIvVHaJaJZrbfjnR6K
 9uSW6ff1t5ONuHDfv+jHJyOMgfSMoy2z219sMxnu4JA8JEJhqDLL6coCHLmG1tRH
 we5cUtK8g6vkC0k2w13N93s2B9RVZkBC8LnK0Hqijznhgqwn+2iHGBLnsYoyT0kp
 YtZw6uZXlPZBFM2rtdefznV4KhGGvWzURUQPd+XHKYMasnl4SWrVG8HXPlrbSJKH
 jpAY1ED3d7ehezzRvIIJq14CixrqMN+72AwmsXkQm/H4qO58/5RpV/keGDbIOMjK
 YtyITxDqLxS8iRKn7yqmhP9XOl8ys91oH4qt0Ro5yqMnQgjZ7/102vkGnF4BUA/i
 VpEVxtQ24w0DiKe4Slviw97JopDtlgb4EsfZ70VbbJ3HUndM2eXYbjO13byngvCv
 yRimI3KzCXnLkPbUKyi6uy4ooTClYN4bvizfQQGFjKb16hj3dkY=
 =CFYz
 -----END PGP SIGNATURE-----

Merge tag 'nf-25-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Zero out the remainder in nft_pipapo AVX2 implementation, otherwise
   next lookup could bogusly report a mismatch. This is followed by two
   patches to update nft_pipapo selftests to cover for the previous bug.
   From Florian Westphal.

2) Check for reverse tuple too in case of esoteric NAT collisions for
   UDP traffic and extend selftest coverage. Also from Florian.

netfilter pull request 25-06-05

* tag 'nf-25-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: netfilter: nft_nat.sh: add test for reverse clash with nat
  netfilter: nf_nat: also check reverse tuple to obtain clashing entry
  selftests: netfilter: nft_concat_range.sh: add datapath check for map fill bug
  selftests: netfilter: nft_concat_range.sh: prefer per element counters for testing
  netfilter: nf_set_pipapo_avx2: fix initial map fill
====================

Link: https://patch.msgid.link/20250605085735.52205-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-06-05 13:37:03 +02:00
Paolo Abeni
ec6a328b2e This bugfix batch includes the following changes:
* dropped bogus call to setup_udp_tunnel_sock() during
   cleanup, substituted by proper state unwind
 * fixed race condition between peer removal (by kernel
   space) and socket closing (by user space)
 * fixed sleep in atomic context along TCP RX error path
 * fixes for ovpn kselftests
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEmavcbPjUEuTeX8D8C3DlOqA41YcFAmg+17IACgkQC3DlOqA4
 1YfGnQf7BCqweJAqPZs45DrMO/zu0mygbvzxI5x1VnBSQvtUfEEwgY3Mk9Y/8OxW
 +QxD9yYl8sf30ZzGTFFYtBq0XwvcP0Nlo4WFD7+2ufvFgojIAwCsF25of+hODGLZ
 5KF83VYDAMvUmK7Rq4gGyi/HC7+rWNhShfoXBh6nfQjNa75NWwIwp8wEsTMovq2Y
 DiH6VzCjkUgnRjJYG5g/D6n0pHnbUkrAPi3AT+aB8PX9sikrTZhBWX4mmY8n6Xlo
 KT83ByzJqc5svnbsQEGRuhXbyuKipBhcVmOohmulBB/uhXUZHDcem9L0wp1hR2SB
 y/lZQxt9bNdEXfh0J7r5QxI2vkZgZQ==
 =Y6+P
 -----END PGP SIGNATURE-----

Merge tag 'ovpn-net-20250603' of https://github.com/OpenVPN/ovpn-net-next

Antonio Quartulli says:

====================
In this batch you can find the following bug fixes:

Patch 1: when releasing a UDP socket we were wrongly invoking
setup_udp_tunnel_sock() with an empty config. This was not
properly shutting down the UDP encap state.
With this patch we simply undo what was done during setup.

Patch 2: ovpn was holding a reference to a 'struct socket'
without increasing its reference counter. This was intended
and worked as expected until we hit a race condition where
user space tries to close the socket while kernel space is
also releasing it. In this case the (struct socket *)->sk
member would disappear under our feet leading to a null-ptr-deref.
This patch fixes this issue by having struct ovpn_socket hold
a reference directly to the sk member while also increasing
its reference counter.

Patch 3: in case of errors along the TCP RX path (softirq)
we want to immediately delete the peer, but this operation may
sleep. With this patch we move the peer deletion to a scheduled
worker.

Patch 4 and 5 are instead fixing minor issues in the ovpn
kselftests.

* tag 'ovpn-net-20250603' of https://github.com/OpenVPN/ovpn-net-next:
  selftest/net/ovpn: fix missing file
  selftest/net/ovpn: fix TCP socket creation
  ovpn: avoid sleep in atomic context in TCP RX error path
  ovpn: ensure sk is still valid during cleanup
  ovpn: properly deconfigure UDP-tunnel
====================

Link: https://patch.msgid.link/20250603111110.4575-1-antonio@openvpn.net/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-06-05 12:41:28 +02:00
Florian Westphal
3c3c324849 selftests: netfilter: nft_nat.sh: add test for reverse clash with nat
This will fail without the previous bug fix because we erronously
believe that the clashing entry went way.

However, the clash exists in the opposite direction due to an
existing nat mapping:
 PASS: IP statless for ns2-LgTIuS
 ERROR: failed to test udp ns1-x4iyOW to ns2-LgTIuS with dnat rule step 2, result: ""

This is partially adapted from test instructions from the below
ubuntu tracker.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2109889
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Shaun Brady <brady.1345@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-06-05 10:50:05 +02:00
Florian Westphal
38399f2b0f selftests: netfilter: nft_concat_range.sh: add datapath check for map fill bug
commit 0935ee6032 ("selftests: netfilter: add test case for recent mismatch bug")
added a regression check for incorrect initial fill of the result map
that was fixed with 791a615b7a ("netfilter: nf_set_pipapo: fix initial map fill").

The test used 'nft get element', i.e., control plane checks for
match/nomatch results.

The control plane however doesn't use avx2 version, so we need to
send+match packets.

As the additional packet match/nomatch is slow, don't do this for
every element added/removed: add and use maybe_send_(no)match
helpers and use them.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-06-05 10:50:05 +02:00
Florian Westphal
febe7eda74 selftests: netfilter: nft_concat_range.sh: prefer per element counters for testing
The selftest uses following rule:
  ... @test counter name "test"

Then sends a packet, then checks if the named counter did increment or
not.

This is fine for the 'no-match' test case: If anything matches the
counter increments and the test fails as expected.

But for the 'should match' test cases this isn't optimal.
Consider buggy matching, where the packet matches entry x, but it
should have matched entry y.

In that case the test would erronously pass.

Rework the selftest to use per-element counters to avoid this.

After sending packet that should have matched entry x, query the
relevant element via 'nft reset element' and check that its counter
had incremented.

The 'nomatch' case isn't altered, no entry should match so the named
counter must be 0, changing it to the per-element counter would then
pass if another entry matches.

The downside of this change is a slight increase in test run-time by
a few seconds.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-06-05 10:50:04 +02:00
Linus Torvalds
ec7714e494 Rust changes for v6.16
Toolchain and infrastructure:
 
  - KUnit '#[test]'s:
 
    - Support KUnit-mapped 'assert!' macros.
 
      The support that landed last cycle was very basic, and the
      'assert!' macros panicked since they were the standard library
      ones. Now, they are mapped to the KUnit ones in a similar way to
      how is done for doctests, reusing the infrastructure there.
 
      With this, a failing test like:
 
          #[test]
          fn my_first_test() {
              assert_eq!(42, 43);
          }
 
      will report:
 
          # my_first_test: ASSERTION FAILED at rust/kernel/lib.rs:251
          Expected 42 == 43 to be true, but is false
          # my_first_test.speed: normal
          not ok 1 my_first_test
 
    - Support tests with checked 'Result' return types.
 
      The return value of test functions that return a 'Result' will be
      checked, thus one can now easily catch errors when e.g. using the
      '?' operator in tests.
 
      With this, a failing test like:
 
          #[test]
          fn my_test() -> Result {
              f()?;
              Ok(())
          }
 
      will report:
 
          # my_test: ASSERTION FAILED at rust/kernel/lib.rs:321
          Expected is_test_result_ok(my_test()) to be true, but is false
          # my_test.speed: normal
          not ok 1 my_test
 
    - Add 'kunit_tests' to the prelude.
 
  - Clarify the remaining language unstable features in use.
 
  - Compile 'core' with edition 2024 for Rust >= 1.87.
 
  - Workaround 'bindgen' issue with forward references to 'enum' types.
 
  - objtool: relax slice condition to cover more 'noreturn' functions.
 
  - Use absolute paths in macros referencing 'core' and 'kernel' crates.
 
  - Skip '-mno-fdpic' flag for bindgen in GCC 32-bit arm builds.
 
  - Clean some 'doc_markdown' lint hits -- we may enable it later on.
 
 'kernel' crate:
 
  - 'alloc' module:
 
    - 'Box': support for type coercion, e.g. 'Box<T>' to 'Box<dyn U>' if
      'T' implements 'U'.
 
    - 'Vec': implement new methods (prerequisites for nova-core and
      binder): 'truncate', 'resize', 'clear', 'pop',
      'push_within_capacity' (with new error type 'PushError'),
      'drain_all', 'retain', 'remove' (with new error type
      'RemoveError'), insert_within_capacity' (with new error type
      'InsertError').
 
      In addition, simplify 'push' using 'spare_capacity_mut', split
      'set_len' into 'inc_len' and 'dec_len', add type invariant
      'len <= capacity' and simplify 'truncate' using 'dec_len'.
 
  - 'time' module:
 
    - Morph the Rust hrtimer subsystem into the Rust timekeeping
      subsystem, covering delay, sleep, timekeeping, timers. This new
      subsystem has all the relevant timekeeping C maintainers listed in
      the entry.
 
    - Replace 'Ktime' with 'Delta' and 'Instant' types to represent a
      duration of time and a point in time.
 
    - Temporarily add 'Ktime' to 'hrtimer' module to allow 'hrtimer' to
      delay converting to 'Instant' and 'Delta'.
 
  - 'xarray' module:
 
    - Add a Rust abstraction for the 'xarray' data structure. This
      abstraction allows Rust code to leverage the 'xarray' to store
      types that implement 'ForeignOwnable'. This support is a dependency
      for memory backing feature of the Rust null block driver, which is
      waiting to be merged.
 
    - Set up an entry in 'MAINTAINERS' for the XArray Rust support.
      Patches will go to the new Rust XArray tree and then via the Rust
      subsystem tree for now.
 
    - Allow 'ForeignOwnable' to carry information about the pointed-to
      type. This helps asserting alignment requirements for the pointer
      passed to the foreign language.
 
  - 'container_of!': retain pointer mut-ness and add a compile-time check
    of the type of the first parameter ('$field_ptr').
 
  - Support optional message in 'static_assert!'.
 
  - Add C FFI types (e.g. 'c_int') to the prelude.
 
  - 'str' module: simplify KUnit tests 'format!' macro, convert
    'rusttest' tests into KUnit, take advantage of the '-> Result'
    support in KUnit '#[test]'s.
 
  - 'list' module: add examples for 'List', fix path of 'assert_pinned!'
    (so far unused macro rule).
 
  - 'workqueue' module: remove 'HasWork::OFFSET'.
 
  - 'page' module: add 'inline' attribute.
 
 'macros' crate:
 
  - 'module' macro: place 'cleanup_module()' in '.exit.text' section.
 
 'pin-init' crate:
 
  - Add 'Wrapper<T>' trait for creating pin-initializers for wrapper
    structs with a structurally pinned value such as 'UnsafeCell<T>' or
    'MaybeUninit<T>'.
 
  - Add 'MaybeZeroable' derive macro to try to derive 'Zeroable', but
    not error if not all fields implement it. This is needed to derive
    'Zeroable' for all bindgen-generated structs.
 
  - Add 'unsafe fn cast_[pin_]init()' functions to unsafely change the
    initialized type of an initializer. These are utilized by the
    'Wrapper<T>' implementations.
 
  - Add support for visibility in 'Zeroable' derive macro.
 
  - Add support for 'union's in 'Zeroable' derive macro.
 
  - Upstream dev news: streamline CI, fix some bugs. Add new workflows
    to check if the user-space version and the one in the kernel tree
    have diverged. Use the issues tab [1] to track them, which should
    help folks report and diagnose issues w.r.t. 'pin-init' better.
 
      [1] https://github.com/rust-for-linux/pin-init/issues
 
 Documentation:
 
  - Testing: add docs on the new KUnit '#[test]' tests.
 
  - Coding guidelines: explain that '///' vs. '//' applies to private
    items too. Add section on C FFI types.
 
  - Quick Start guide: update Ubuntu instructions and split them into
    "25.04" and "24.04 LTS and older".
 
 And a few other cleanups and improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAmhBAvYACgkQGXyLc2ht
 IW3qvA/+KRTCYKcI6JyUT9TdhRmaaMsQ0/5j6Kx4CfRQPZTSWsXyBEU75yEIZUQD
 SUGQFwmMAYeAKQD1SumFCRy973VzUO45DyKM+7vuVhKN1ZjnAtv63+31C3UFATlA
 8Tm3GCqQEGKl4IER7xI3D/vpZA5FOv+GotjNieF3O9FpHDCvV/JQScq9I2oXQPCt
 17kRLww/DTfpf4qiLmxmxHn6nCsbecdfEce1kfjk3nNuE6B2tPf+ddYOwunLEvkB
 LA4Cr6T1Cy1ovRQgxg9Pdkl/0Rta0tFcsKt1LqPgjR+n95stsHgAzbyMGuUKoeZx
 u2R2pwlrJt6Xe4CEZgTIRfYWgF81qUzdcPuflcSMDCpH0nTep74A2lIiWUHWZSh4
 LbPh7r90Q8YwGKVJiWqLfHUmQBnmTEm3D2gydSExPKJXSzB4Rbv4w4fPF3dhzMtC
 4+KvmHKIojFkAdTLt+5rkKipJGo/rghvQvaQr9JOu+QO4vfhkesB4pUWC4sZd9A9
 GJBP97ynWAsXGGaeaaSli0b851X+VE/WIDOmPMselbA3rVADChE6HsJnY/wVVeWK
 jupvAhUExSczDPCluGv8T9EVXvv6+fg3bB5pD6R01NNJe6iE/LIDQ5Gj5rg4qahM
 EFzMgPj6hMt5McvWI8q1/ym0bzdeC2/cmaV6E14hvphAZoORUKI=
 =JRqL
 -----END PGP SIGNATURE-----

Merge tag 'rust-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull Rust updates from Miguel Ojeda:
 "Toolchain and infrastructure:

   - KUnit '#[test]'s:

      - Support KUnit-mapped 'assert!' macros.

        The support that landed last cycle was very basic, and the
        'assert!' macros panicked since they were the standard library
        ones. Now, they are mapped to the KUnit ones in a similar way to
        how is done for doctests, reusing the infrastructure there.

        With this, a failing test like:

            #[test]
            fn my_first_test() {
                assert_eq!(42, 43);
            }

        will report:

            # my_first_test: ASSERTION FAILED at rust/kernel/lib.rs:251
            Expected 42 == 43 to be true, but is false
            # my_first_test.speed: normal
            not ok 1 my_first_test

      - Support tests with checked 'Result' return types.

        The return value of test functions that return a 'Result' will
        be checked, thus one can now easily catch errors when e.g. using
        the '?' operator in tests.

        With this, a failing test like:

            #[test]
            fn my_test() -> Result {
                f()?;
                Ok(())
            }

        will report:

            # my_test: ASSERTION FAILED at rust/kernel/lib.rs:321
            Expected is_test_result_ok(my_test()) to be true, but is false
            # my_test.speed: normal
            not ok 1 my_test

      - Add 'kunit_tests' to the prelude.

   - Clarify the remaining language unstable features in use.

   - Compile 'core' with edition 2024 for Rust >= 1.87.

   - Workaround 'bindgen' issue with forward references to 'enum' types.

   - objtool: relax slice condition to cover more 'noreturn' functions.

   - Use absolute paths in macros referencing 'core' and 'kernel'
     crates.

   - Skip '-mno-fdpic' flag for bindgen in GCC 32-bit arm builds.

   - Clean some 'doc_markdown' lint hits -- we may enable it later on.

  'kernel' crate:

   - 'alloc' module:

      - 'Box': support for type coercion, e.g. 'Box<T>' to 'Box<dyn U>'
        if 'T' implements 'U'.

      - 'Vec': implement new methods (prerequisites for nova-core and
        binder): 'truncate', 'resize', 'clear', 'pop',
        'push_within_capacity' (with new error type 'PushError'),
        'drain_all', 'retain', 'remove' (with new error type
        'RemoveError'), insert_within_capacity' (with new error type
        'InsertError').

        In addition, simplify 'push' using 'spare_capacity_mut', split
        'set_len' into 'inc_len' and 'dec_len', add type invariant 'len
        <= capacity' and simplify 'truncate' using 'dec_len'.

   - 'time' module:

      - Morph the Rust hrtimer subsystem into the Rust timekeeping
        subsystem, covering delay, sleep, timekeeping, timers. This new
        subsystem has all the relevant timekeeping C maintainers listed
        in the entry.

      - Replace 'Ktime' with 'Delta' and 'Instant' types to represent a
        duration of time and a point in time.

      - Temporarily add 'Ktime' to 'hrtimer' module to allow 'hrtimer'
        to delay converting to 'Instant' and 'Delta'.

   - 'xarray' module:

      - Add a Rust abstraction for the 'xarray' data structure. This
        abstraction allows Rust code to leverage the 'xarray' to store
        types that implement 'ForeignOwnable'. This support is a
        dependency for memory backing feature of the Rust null block
        driver, which is waiting to be merged.

      - Set up an entry in 'MAINTAINERS' for the XArray Rust support.
        Patches will go to the new Rust XArray tree and then via the
        Rust subsystem tree for now.

      - Allow 'ForeignOwnable' to carry information about the pointed-to
        type. This helps asserting alignment requirements for the
        pointer passed to the foreign language.

   - 'container_of!': retain pointer mut-ness and add a compile-time
     check of the type of the first parameter ('$field_ptr').

   - Support optional message in 'static_assert!'.

   - Add C FFI types (e.g. 'c_int') to the prelude.

   - 'str' module: simplify KUnit tests 'format!' macro, convert
     'rusttest' tests into KUnit, take advantage of the '-> Result'
     support in KUnit '#[test]'s.

   - 'list' module: add examples for 'List', fix path of
     'assert_pinned!' (so far unused macro rule).

   - 'workqueue' module: remove 'HasWork::OFFSET'.

   - 'page' module: add 'inline' attribute.

  'macros' crate:

   - 'module' macro: place 'cleanup_module()' in '.exit.text' section.

  'pin-init' crate:

   - Add 'Wrapper<T>' trait for creating pin-initializers for wrapper
     structs with a structurally pinned value such as 'UnsafeCell<T>' or
     'MaybeUninit<T>'.

   - Add 'MaybeZeroable' derive macro to try to derive 'Zeroable', but
     not error if not all fields implement it. This is needed to derive
     'Zeroable' for all bindgen-generated structs.

   - Add 'unsafe fn cast_[pin_]init()' functions to unsafely change the
     initialized type of an initializer. These are utilized by the
     'Wrapper<T>' implementations.

   - Add support for visibility in 'Zeroable' derive macro.

   - Add support for 'union's in 'Zeroable' derive macro.

   - Upstream dev news: streamline CI, fix some bugs. Add new workflows
     to check if the user-space version and the one in the kernel tree
     have diverged. Use the issues tab [1] to track them, which should
     help folks report and diagnose issues w.r.t. 'pin-init' better.

       [1] https://github.com/rust-for-linux/pin-init/issues

  Documentation:

   - Testing: add docs on the new KUnit '#[test]' tests.

   - Coding guidelines: explain that '///' vs. '//' applies to private
     items too. Add section on C FFI types.

   - Quick Start guide: update Ubuntu instructions and split them into
     "25.04" and "24.04 LTS and older".

  And a few other cleanups and improvements"

* tag 'rust-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (78 commits)
  rust: list: Fix typo `much` in arc.rs
  rust: check type of `$ptr` in `container_of!`
  rust: workqueue: remove HasWork::OFFSET
  rust: retain pointer mut-ness in `container_of!`
  Documentation: rust: testing: add docs on the new KUnit `#[test]` tests
  Documentation: rust: rename `#[test]`s to "`rusttest` host tests"
  rust: str: take advantage of the `-> Result` support in KUnit `#[test]`'s
  rust: str: simplify KUnit tests `format!` macro
  rust: str: convert `rusttest` tests into KUnit
  rust: add `kunit_tests` to the prelude
  rust: kunit: support checked `-> Result`s in KUnit `#[test]`s
  rust: kunit: support KUnit-mapped `assert!` macros in `#[test]`s
  rust: make section names plural
  rust: list: fix path of `assert_pinned!`
  rust: compile libcore with edition 2024 for 1.87+
  rust: dma: add missing Markdown code span
  rust: task: add missing Markdown code spans and intra-doc links
  rust: pci: fix docs related to missing Markdown code spans
  rust: alloc: add missing Markdown code span
  rust: alloc: add missing Markdown code spans
  ...
2025-06-04 21:18:37 -07:00
Linus Torvalds
64980441d2 bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmhA1fUACgkQ6rmadz2v
 bTraiRAAlhNFnR0gBLORq91Vgg0UbzTTV07m4mp7EveurgFAyrY/svRKlJIEC+Jh
 TqUHeRRHq2Sa08Rhn8QCF/g6ExINqwp9SbdZEKe6OC9AzuSE00a2PzJL9dqw207K
 6Br4naVaL6ckEEkfDlAstnDLuGRdwPQQ+7MMCJY0jrmQEJgAr50nkzGbokCefgbg
 GGjPjag+ulIumSlmYM/VeoNmcORlF278IsDMipBlgDl1xiJhEUXKPvkmW4E6i3Ac
 j8snRg+HVRSTACEnSAh8LT8+o4W0yYNlKO9D2sQTeBMzCovmYITabfVhY1B55Mg2
 Yyh2wKvmpVbpRs/lne7DZGzT0Zku0iReSkVBVHJe+1tz7qee/rvMZW87whSVEqNY
 4dRlkmFe7HTDDL5olH8CuhoihR/jYEElxN9Q5gA6I8x9ye+DrzoOg1PO126E8y1+
 B6gzUlupfWClLVRzA84Y/kWLRvVK6iYkqFPnrNlrtLKdnHo6unXdI8QGbtp5NKG2
 y1srAiHeqRvYCMMFxPe6+lY03TVyfUtvK/qLKNSEtYbBJibyP8dv+zsMpxEz8WZh
 XBwqOI+CWPYE5/7kdJN2vDSo7F2TQ04NK58miv1YFYFl6Gc6GZl+9zcn2PFTje+w
 FIECx6igNerDI0N4H1ev4+LQSELqqGGDJnACOOx7d0AwjPJaWSE=
 =IsJU
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:
 "Two small fixes to selftests"

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Fix selftest btf_tag/btf_type_tag_percpu_vmlinux_helper failure
  selftests/bpf: Fix bpf selftest build error
2025-06-04 19:46:22 -07:00
Uday Shankar
a2f4c1ae16 selftests: ublk: kublk: improve behavior on init failure
Some failure modes are handled poorly by kublk. For example, if ublk_drv
is built as a module but not currently loaded into the kernel, ./kublk
add ... just hangs forever. This happens because in this case (and a few
others), the worker process does not notify its parent (via a write to
the shared eventfd) that it has tried and failed to initialize, so the
parent hangs forever. Fix this by ensuring that we always notify the
parent process of any initialization failure, and have the parent print
a (not very descriptive) log line when this happens.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250603-ublk_init_fail-v1-1-87c91486230e@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-03 20:19:44 -06:00
Linus Torvalds
0939bd2fcf perf tools improvements and fixes for Linux v6.16:
perf report/top/annotate TUI:
 
 - Accept the left arrow key as a Zoom out if done on the first column.
 
 - Show if source code toggle status in title, to help spotting bugs with
   the various disassemblers (capstone, llvm, objdump).
 
 - Provide feedback on unhandled hotkeys.
 
 Build:
 
 - Better inform when certain features are not available with warnings in the
   build process and in 'perf version --build-options' or 'perf -vv'.
 
 perf record:
 
 - Improve the --off-cpu code by synthesizing events for switch-out -> switch-in
   intervals using a BPF program. This can be fine tuned using a --off-cpu-thresh
   knob.
 
 perf report:
 
 - Add 'tgid' sort key.
 
 perf mem/c2c:
 
 - Add 'op', 'cache', 'snoop', 'dtlb' output fields.
 
 - Add support for 'ldlat' on AMD IBS (Instruction Based Sampling).
 
 perf ftrace:
 
 - Use process/session specific trace settings instead of messing with
   the global ftrace knobs.
 
 perf trace:
 
 - Implement syscall summary in BPF.
 
 - Support --summary-mode=cgroup.
 
 - Always print return value for syscalls returning a pid.
 
 - The rseq and set_robust_list don't return a pid, just -errno.
 
 perf lock contention:
 
 -  Symbolize zone->lock using BTF.
 
 - Add -J/--inject-delay option to estimate impact on application performance by
   optimization of kernel locking behavior.
 
 perf stat:
 
 - Improve hybrid support for the NMI watchdog warning.
 
 Symbol resolution:
 
 - Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust symbols.
 
 - Improve Rust demangler.
 
 Hardware tracing:
 
 Intel PT:
 
 - Fix PEBS-via-PT data_src.
 
 - Do not default to recording all switch events.
 
 - Fix pattern matching with python3 on the SQL viewer script.
 
 arm64:
 
 - Fixups for the hip08 hha PMU.
 
 Vendor events:
 
 - Update Intel events/metrics files for alderlake, alderlaken, arrowlake,
   bonnell, broadwell, broadwellde, broadwellx, cascadelakex, clearwaterforest,
   elkhartlake, emeraldrapids, grandridge, graniterapids, haswell, haswellx,
   icelake, icelakex, ivybridge, ivytown, jaketown, lunarlake, meteorlake,
   nehalemep, nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest,
   skylake, skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp,
   westmereep-sx.
 
 python support:
 
 - Add support for event counts in the python binding, add a counting.py example.
 
 perf list:
 
 - Display the PMU name associated with a perf metric in JSON.
 
 perf test:
 
 - Hybrid improvements for metric value validation test.
 
 - Fix LBR test by ignoring idle task.
 
 - Add AMD IBS sw filter ana d'ldlat' tests.
 
 - Add 'perf trace --summary-mode=cgroup' test.
 
 - Add tests for the various language symbol demanglers.
 
 Miscellaneous.
 
 - Allow specifying the cpu an event will be tied using '-e event/cpu=N/'.
 
 - Sync various headers with the kernel sources.
 
 - Add annotations to use clang's -Wthread-safety and fix some problems
   it detected.
 
 - Make dump_stack() use perf's symbol resolution to provide better backtraces.
 
 - Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS
   (Precision Event-Based Sampling), that adds timing info, the retirement
   latency of instructions.
 
 - Various memory allocation (some detected by ASAN) and reference counting
   fixes.
 
 - Add a 8-byte aligned PERF_RECORD_COMPRESSED2 to replace PERF_RECORD_COMPRESSED.
 
 - Skip unsupported event types in perf.data files, don't stop when finding one.
 
 - Improve lookups using hashmaps and binary searches.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCaD9ViwAKCRCyPKLppCJ+
 JzOfAQDXlukhPQyuJ4j1ie0x1QO4jalloMbG1Bkp3hn6yjxafAD9Ha5wr+dwnAj4
 FfxOVqua29r8Htn4aGahXZ0nnlVp9Ac=
 =bwgD
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf report/top/annotate TUI:

   - Accept the left arrow key as a Zoom out if done on the first column

   - Show if source code toggle status in title, to help spotting bugs
     with the various disassemblers (capstone, llvm, objdump)

   - Provide feedback on unhandled hotkeys

  Build:

   - Better inform when certain features are not available with warnings
     in the build process and in 'perf version --build-options' or 'perf -vv'

  perf record:

   - Improve the --off-cpu code by synthesizing events for switch-out ->
     switch-in intervals using a BPF program. This can be fine tuned
     using a --off-cpu-thresh knob

  perf report:

   - Add 'tgid' sort key

  perf mem/c2c:

   - Add 'op', 'cache', 'snoop', 'dtlb' output fields

   - Add support for 'ldlat' on AMD IBS (Instruction Based Sampling)

  perf ftrace:

   - Use process/session specific trace settings instead of messing with
     the global ftrace knobs

  perf trace:

   - Implement syscall summary in BPF

   - Support --summary-mode=cgroup

   - Always print return value for syscalls returning a pid

   - The rseq and set_robust_list don't return a pid, just -errno

  perf lock contention:

   - Symbolize zone->lock using BTF

   - Add -J/--inject-delay option to estimate impact on application
     performance by optimization of kernel locking behavior

  perf stat:

   - Improve hybrid support for the NMI watchdog warning

  Symbol resolution:

   - Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust
     symbols

   - Improve Rust demangler

  Hardware tracing:

  Intel PT:

   - Fix PEBS-via-PT data_src

   - Do not default to recording all switch events

   - Fix pattern matching with python3 on the SQL viewer script

  arm64:

   - Fixups for the hip08 hha PMU

  Vendor events:

   - Update Intel events/metrics files for alderlake, alderlaken,
     arrowlake, bonnell, broadwell, broadwellde, broadwellx,
     cascadelakex, clearwaterforest, elkhartlake, emeraldrapids,
     grandridge, graniterapids, haswell, haswellx, icelake, icelakex,
     ivybridge, ivytown, jaketown, lunarlake, meteorlake, nehalemep,
     nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest,
     skylake, skylakex, snowridgex, tigerlake, westmereep-dp,
     westmereep-sp, westmereep-sx

  python support:

   - Add support for event counts in the python binding, add a
     counting.py example

  perf list:

   - Display the PMU name associated with a perf metric in JSON

  perf test:

   - Hybrid improvements for metric value validation test

   - Fix LBR test by ignoring idle task

   - Add AMD IBS sw filter ana d'ldlat' tests

   - Add 'perf trace --summary-mode=cgroup' test

   - Add tests for the various language symbol demanglers

  Miscellaneous:

   - Allow specifying the cpu an event will be tied using '-e
     event/cpu=N/'

   - Sync various headers with the kernel sources

   - Add annotations to use clang's -Wthread-safety and fix some
     problems it detected

   - Make dump_stack() use perf's symbol resolution to provide better
     backtraces

   - Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS
     (Precision Event-Based Sampling), that adds timing info, the
     retirement latency of instructions

   - Various memory allocation (some detected by ASAN) and reference
     counting fixes

   - Add a 8-byte aligned PERF_RECORD_COMPRESSED2 to replace
     PERF_RECORD_COMPRESSED

   - Skip unsupported event types in perf.data files, don't stop when
     finding one

   - Improve lookups using hashmaps and binary searches"

* tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (206 commits)
  perf callchain: Always populate the addr_location map when adding IP
  perf lock contention: Reject more than 10ms delays for safety
  perf trace: Set errpid to false for rseq and set_robust_list
  perf symbol: Move demangling code out of symbol-elf.c
  perf trace: Always print return value for syscalls returning a pid
  perf script: Print PERF_AUX_FLAG_COLLISION flag
  perf mem: Show absolute percent in mem_stat output
  perf mem: Display sort order only if it's available
  perf mem: Describe overhead calculation in brief
  perf record: Fix incorrect --user-regs comments
  Revert "perf thread: Ensure comm_lock held for comm_list"
  perf test trace_summary: Skip --bpf-summary tests if no libbpf
  perf test intel-pt: Skip jitdump test if no libelf
  perf intel-tpebs: Avoid race when evlist is being deleted
  perf test demangle-java: Don't segv if demangling fails
  perf symbol: Fix use-after-free in filename__read_build_id
  perf pmu: Avoid segv for missing name/alias_name in wildcarding
  perf machine: Factor creating a "live" machine out of dwarf-unwind
  perf test: Add AMD IBS sw filter test
  perf mem: Count L2 HITM for c2c statistic
  ...
2025-06-03 15:11:44 -07:00
Linus Torvalds
29e9359005 CXL changes for v6.16
- Remove always true condition in cxl features code.
 - Add verification of CHBS length for CXL 2.0
 - Ignore interleave granularity when interleave ways is 1
 - Add update addressing mising MODULE_DESCRIPTION for cxl_test
 - A series of cleanups/refactor to prep for AMD Zen5 translate code
 - Clean %pa debug printk in core/hdm.c
 - Documentation updates
   - Update to CXL Maturity Map
   - Fixes to source linking in CXL documentation
   - CXL documentation fixes, spelling corrections
   - A large collection of CXL documentation for the entire CXL subsystem, including
     documentation on CXL related platform and firmware notes
 - Remove redundant code of cxlctl_get_supported_features()
 - Series to support CXL RAS Features
   - Including "Patrol Scrub Control", "Error Check Scrub", "Performance Maitenance"
     and "Memory Sparing". The series connects CXL to EDAC.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmg/EX0ACgkQYGjFFmlT
 OErkxA/+MuvYH6PjdSwbJiJUcLCrq0gdNczX7t+CAo3v4rNs5CrOJSKuXMxGVIpf
 lCjwS/J7j4XOa9DigDO1Dxl4PWG/R0K3HjbHQJLVjy3jmsVr5GgJVt4s5EqS78Or
 QoM9d/2dq8Q6dqk89Z4rFY2JlAmXcibe+lz2m9k5vy8KPQvrZI1KruZMG0qN1rWC
 SBa+eUWW49MP3Ab6pBDRRCI7EPcJ44QF+49SWXrkkiJjll/OTtYu3V1JymaPV4zT
 /UM/CwHLnmb5odUfOx5EZJcZIZzqasBD28Xu6Y6Vjs2pgTNPr9VNGCs+8lLwQCg7
 1O8cyPjPa1p5HwKo2INJfM1Xdpo1Nqar1qGcSPVJKk0+a538i07YuXR3uWqnJ+mO
 uplJwvtL1Rvg9h0C0fHwfB86Tl5poFIDn0zeZQ4tqMWH6y2wged5PcS5RSVuVYdX
 CHEzAvp1RreQvXd9KcSo6ITKbIvv5PplfcyTiftG0R71ewXYOpB386D7ihh8ZvvO
 Y6TJN1nXUv8w3ve7rbs3T2ncX+BbhHRKSnXTp3rbkXTQLF5t+2gjzVuwSSEH71Ps
 4bD+7EeE0JGSz76qLbfQPNt6l3HnG6ctycpAydsUs/YmIOnciKpT5R9OGwncKHrT
 /Ccx+9uN5+CizqsKCi+rxadoOw7Pk4bKmo0a4wCZ5sDPGix0g18=
 =evKj
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull Compute Express Link (CXL) updates from Dave Jiang:

 - Remove always true condition in cxl features code

 - Add verification of CHBS length for CXL 2.0

 - Ignore interleave granularity when interleave ways is 1

 - Add update addressing mising MODULE_DESCRIPTION for cxl_test

 - A series of cleanups/refactor to prep for AMD Zen5 translate code

 - Clean %pa debug printk in core/hdm.c

 - Documentation updates:
     - Update to CXL Maturity Map
     - Fixes to source linking in CXL documentation
     - CXL documentation fixes, spelling corrections
     - A large collection of CXL documentation for the entire CXL
       subsystem, including documentation on CXL related platform and
       firmware notes

 - Remove redundant code of cxlctl_get_supported_features()

 - Series to support CXL RAS Features
     - Including "Patrol Scrub Control", "Error Check Scrub",
       "Performance Maitenance" and "Memory Sparing". The series
       connects CXL to EDAC.

* tag 'cxl-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (53 commits)
  cxl/edac: Add CXL memory device soft PPR control feature
  cxl/edac: Add CXL memory device memory sparing control feature
  cxl/edac: Support for finding memory operation attributes from the current boot
  cxl/edac: Add support for PERFORM_MAINTENANCE command
  cxl/edac: Add CXL memory device ECS control feature
  cxl/edac: Add CXL memory device patrol scrub control feature
  cxl: Update prototype of function get_support_feature_info()
  EDAC: Update documentation for the CXL memory patrol scrub control feature
  cxl/features: Remove the inline specifier from to_cxlfs()
  cxl/feature: Remove redundant code of get supported features
  docs: ABI: Fix "firwmare" to "firmware"
  cxl/Documentation: Fix typo in sysfs write_bandwidth attribute path
  cxl: doc/linux/access-coordinates Update access coordinates calculation methods
  cxl: docs/platform/acpi/srat Add generic target documentation
  cxl: docs/platform/cdat reference documentation
  Documentation: Update the CXL Maturity Map
  cxl: Sync up the driver-api/cxl documentation
  cxl: docs - add self-referencing cross-links
  cxl: docs/allocation/hugepages
  cxl: docs/allocation/reclaim
  ...
2025-06-03 13:24:14 -07:00
Linus Torvalds
c00b285024 hyperv-next for v6.16
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmg+jmETHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXiWaCACjYSQcCXW2nnZuWUnGMJq8HD5XGBAH
 tNYzOyp2Y4bXEJzfmbHv8UpJynGr3IFKybCnhm0uAQZCmiR5k4CfMvjPQXcJu9LK
 7yUI/dTGrRGG7f3NClWK2vXg7ATqzRGiPuPDk2lDcP04aQQWaUMDYe5SXIgcqKyZ
 cm2OVHapHGbQ7wA+xXGQcUBb6VJ5+BrQUVOqaEQyl4LURvjaQcn7rVDS0SmEi8gq
 42+KDVd8uWYos5dT57HIq9UI5og3PeTvAvHsx26eX8JWNqwXLgvxRH83kstK+GWY
 uG3sOm5yRbJvErLpJHnyBOlXDFNw2EBeLC1VyhdJXBR8RabgI+H/mrY3
 =4bTC
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - Support for Virtual Trust Level (VTL) on arm64 (Roman Kisel)

 - Fixes for Hyper-V UIO driver (Long Li)

 - Fixes for Hyper-V PCI driver (Michael Kelley)

 - Select CONFIG_SYSFB for Hyper-V guests (Michael Kelley)

 - Documentation updates for Hyper-V VMBus (Michael Kelley)

 - Enhance logging for hv_kvp_daemon (Shradha Gupta)

* tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (23 commits)
  Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests
  Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir
  Documentation: hyperv: Update VMBus doc with new features and info
  PCI: hv: Remove unnecessary flex array in struct pci_packet
  Drivers: hv: Remove hv_alloc/free_* helpers
  Drivers: hv: Use kzalloc for panic page allocation
  uio_hv_generic: Align ring size to system page
  uio_hv_generic: Use correct size for interrupt and monitor pages
  Drivers: hv: Allocate interrupt and monitor pages aligned to system page boundary
  arch/x86: Provide the CPU number in the wakeup AP callback
  x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap()
  PCI: hv: Get vPCI MSI IRQ domain from DeviceTree
  ACPI: irq: Introduce acpi_get_gsi_dispatcher()
  Drivers: hv: vmbus: Introduce hv_get_vmbus_root_device()
  Drivers: hv: vmbus: Get the IRQ number from DeviceTree
  dt-bindings: microsoft,vmbus: Add interrupt and DMA coherence properties
  arm64, x86: hyperv: Report the VTL the system boots in
  arm64: hyperv: Initialize the Virtual Trust Level field
  Drivers: hv: Provide arch-neutral implementation of get_vtl()
  Drivers: hv: Enable VTL mode for arm64
  ...
2025-06-03 08:39:20 -07:00
Antonio Quartulli
9c7e8b31da selftest/net/ovpn: fix missing file
test-large-mtu.sh is referenced by the Makefile
but does not exist.

Add it along the other scripts.

Fixes: 944f8b6aba ("selftest/net/ovpn: extend coverage with more test cases")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
2025-06-03 13:08:15 +02:00
Antonio Quartulli
fdf4064aae selftest/net/ovpn: fix TCP socket creation
TCP sockets cannot be created with AF_UNSPEC, but
one among the supported family must be used.

Since commit 944f8b6aba ("selftest/net/ovpn: extend
coverage with more test cases") the default address
family for all tests was changed from AF_INET to AF_UNSPEC,
thus breaking all TCP cases.

Restore AF_INET as default address family for TCP listeners.

Fixes: 944f8b6aba ("selftest/net/ovpn: extend coverage with more test cases")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
2025-06-03 13:08:15 +02:00
Bui Quang Minh
d3f2a9587e selftests: net: build net/lib dependency in all target
We have the logic to include net/lib automatically for net related
selftests. However, currently, this logic is only in install target
which means only `make install` will have net/lib included. This commit
adds the logic to all target so that all `make`, `make run_tests` and
`make install` will have net/lib included in net related selftests.

Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Link: https://patch.msgid.link/20250601142914.13379-1-minhquangbui99@gmail.com
Fixes: b86761ff63 ("selftests: net: add scaffolding for Netlink tests in Python")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-06-03 12:21:04 +02:00
Jakub Kicinski
f6695269dc Revert "kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests"
This reverts commit a571a9a1b1.

The commit in question breaks kunit for older compilers:

$ gcc --version
 gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5)

$ ./tools/testing/kunit/kunit.py run  --alltests --json --arch=x86_64
 Configuring KUnit Kernel ...
 Regenerating .config ...
 Populating config with:
 $ make ARCH=x86_64 O=.kunit olddefconfig
 ERROR:root:Not all Kconfig options selected in kunitconfig were in the generated .config.
 This is probably due to unsatisfied dependencies.
 Missing: CONFIG_INIT_STACK_ALL_PATTERN=y

Link: https://lore.kernel.org/20250529083811.778bc31b@kernel.org
Fixes: a571a9a1b1 ("kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://patch.msgid.link/20250530135800.13437-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-06-03 11:20:21 +02:00
Linus Torvalds
546b1c9e93 Bootconfig updates for v6.15:
- Allow overriding CFLAGS for tools/bootconfig
  - Allow overriding LDFLAGS for tools/bootconfig
    Both allow overriding build flags for tools/bootconfig, for
    example making it a static binary.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmg9xsYbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8boEAIAMRZWpackA9KXxdhEuIg
 RzQ0POBKlv1CpUtPTEW6SX0x8hwDVn+bq27GrSfrwM+Vr8bs4twpwvWCjrnkf7kn
 vUeRyhrI8ZoGQ/ov7ZILtn3kNnfCjCr1AlSxKkjDbqQhxPJbccy5CjLOoodeFdYh
 JC+Pd7MiFDrXyYvhe1sxvfFsBrjJ876RdusXrBQduSp5XlnaNv4pSSKG1+VncGFV
 tGeSZMLbLCP3XVZIe2pkmWqW5BpNulkxu+f2zuQPH/PcgTva5kaYJufqFXHsVNU5
 m4vTCpvyG9m+xr8If1NPf12n/BO7MkN/HlP/Zmv5MkInQFnvVDL5VZESOH0BByGT
 dkg=
 =xZKT
 -----END PGP SIGNATURE-----

Merge tag 'bootconfig-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig updates from Masami Hiramatsu:

 - Allow overriding CFLAGS and LDFLAGS for tools/bootconfig, for example
   making it a static binary.

* tag 'bootconfig-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tools/bootconfig: specify LDFLAGS as an argument to CC
  tools/bootconfig: allow overriding CFLAGS assignment
2025-06-02 17:39:24 -07:00
Linus Torvalds
fd1f847350 - The 2 patch series "zram: support algorithm-specific parameters" from
Sergey Senozhatsky adds infrastructure for passing algorithm-specific
   parameters into zram.  A single parameter `winbits' is implemented at
   this time.
 
 - The 5 patch series "memcg: nmi-safe kmem charging" from Shakeel Butt
   makes memcg charging nmi-safe, which is required by BFP, which can
   operate in NMI context.
 
 - The 5 patch series "Some random fixes and cleanup to shmem" from
   Kemeng Shi implements small fixes and cleanups in the shmem code.
 
 - The 2 patch series "Skip mm selftests instead when kernel features are
   not present" from Zi Yan fixes some issues in the MM selftest code.
 
 - The 2 patch series "mm/damon: build-enable essential DAMON components
   by default" from SeongJae Park reworks DAMON Kconfig to make it easier
   to enable CONFIG_DAMON.
 
 - The 2 patch series "sched/numa: add statistics of numa balance task
   migration" from Libo Chen adds more info into sysfs and procfs files to
   improve visibility into the NUMA balancer's task migration activity.
 
 - The 4 patch series "selftests/mm: cow and gup_longterm cleanups" from
   Mark Brown provides various updates to some of the MM selftests to make
   them play better with the overall containing framework.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaDzA9wAKCRDdBJ7gKXxA
 js8sAP9V3COg+vzTmimzP3ocTkkbbIJzDfM6nXpE2EQ4BR3ejwD+NsIT2ZLtTF6O
 LqAZpgO7ju6wMjR/lM30ebCq5qFbZAw=
 =oruw
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-06-01-14-06' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - "zram: support algorithm-specific parameters" from Sergey Senozhatsky
   adds infrastructure for passing algorithm-specific parameters into
   zram. A single parameter `winbits' is implemented at this time.

 - "memcg: nmi-safe kmem charging" from Shakeel Butt makes memcg
   charging nmi-safe, which is required by BFP, which can operate in NMI
   context.

 - "Some random fixes and cleanup to shmem" from Kemeng Shi implements
   small fixes and cleanups in the shmem code.

 - "Skip mm selftests instead when kernel features are not present" from
   Zi Yan fixes some issues in the MM selftest code.

 - "mm/damon: build-enable essential DAMON components by default" from
   SeongJae Park reworks DAMON Kconfig to make it easier to enable
   CONFIG_DAMON.

 - "sched/numa: add statistics of numa balance task migration" from Libo
   Chen adds more info into sysfs and procfs files to improve visibility
   into the NUMA balancer's task migration activity.

 - "selftests/mm: cow and gup_longterm cleanups" from Mark Brown
   provides various updates to some of the MM selftests to make them
   play better with the overall containing framework.

* tag 'mm-stable-2025-06-01-14-06' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (43 commits)
  mm/khugepaged: clean up refcount check using folio_expected_ref_count()
  selftests/mm: fix test result reporting in gup_longterm
  selftests/mm: report unique test names for each cow test
  selftests/mm: add helper for logging test start and results
  selftests/mm: use standard ksft_finished() in cow and gup_longterm
  selftests/damon/_damon_sysfs: skip testcases if CONFIG_DAMON_SYSFS is disabled
  sched/numa: add statistics of numa balance task
  sched/numa: fix task swap by skipping kernel threads
  tools/testing: check correct variable in open_procmap()
  tools/testing/vma: add missing function stub
  mm/gup: update comment explaining why gup_fast() disables IRQs
  selftests/mm: two fixes for the pfnmap test
  mm/khugepaged: fix race with folio split/free using temporary reference
  mm: add CONFIG_PAGE_BLOCK_ORDER to select page block order
  mmu_notifiers: remove leftover stub macros
  selftests/mm: deduplicate test names in madv_populate
  kcov: rust: add flags for KCOV with Rust
  mm: rust: make CONFIG_MMU ifdefs more narrow
  mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()
  mm/damon/Kconfig: enable CONFIG_DAMON by default
  ...
2025-06-02 16:00:26 -07:00
Linus Torvalds
7f9039c524 Generic:
* Clean up locking of all vCPUs for a VM by using the *_nest_lock()
   family of functions, and move duplicated code to virt/kvm/.
   kernel/ patches acked by Peter Zijlstra.
 
 * Add MGLRU support to the access tracking perf test.
 
 ARM fixes:
 
 * Make the irqbypass hooks resilient to changes in the GSI<->MSI
   routing, avoiding behind stale vLPI mappings being left behind. The
   fix is to resolve the VGIC IRQ using the host IRQ (which is stable)
   and nuking the vLPI mapping upon a routing change.
 
 * Close another VGIC race where vCPU creation races with VGIC
   creation, leading to in-flight vCPUs entering the kernel w/o private
   IRQs allocated.
 
 * Fix a build issue triggered by the recently added workaround for
   Ampere's AC04_CPU_23 erratum.
 
 * Correctly sign-extend the VA when emulating a TLBI instruction
   potentially targeting a VNCR mapping.
 
 * Avoid dereferencing a NULL pointer in the VGIC debug code, which can
   happen if the device doesn't have any mapping yet.
 
 s390:
 
 * Fix interaction between some filesystems and Secure Execution
 
 * Some cleanups and refactorings, preparing for an upcoming big series
 
 x86:
 
 * Wait for target vCPU to acknowledge KVM_REQ_UPDATE_PROTECTED_GUEST_STATE to
   fix a race between AP destroy and VMRUN.
 
 * Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for the VM.
 
 * Refine and harden handling of spurious faults.
 
 * Add support for ALLOWED_SEV_FEATURES.
 
 * Add #VMGEXIT to the set of handlers special cased for CONFIG_RETPOLINE=y.
 
 * Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing features
   that utilize those bits.
 
 * Don't account temporary allocations in sev_send_update_data().
 
 * Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock Threshold.
 
 * Unify virtualization of IBRS on nested VM-Exit, and cross-vCPU IBPB, between
   SVM and VMX.
 
 * Advertise support to userspace for WRMSRNS and PREFETCHI.
 
 * Rescan I/O APIC routes after handling EOI that needed to be intercepted due
   to the old/previous routing, but not the new/current routing.
 
 * Add a module param to control and enumerate support for device posted
   interrupts.
 
 * Fix a potential overflow with nested virt on Intel systems running 32-bit kernels.
 
 * Flush shadow VMCSes on emergency reboot.
 
 * Add support for SNP to the various SEV selftests.
 
 * Add a selftest to verify fastops instructions via forced emulation.
 
 * Refine and optimize KVM's software processing of the posted interrupt bitmap, and share
   the harvesting code between KVM and the kernel's Posted MSI handler
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmg9TjwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOUxQf7B7nnWqIKd7jSkGzSD6YsSX9TXktr
 2tJIOfWM3zNYg5GRCidg+m4Y5+DqQWd3Hi5hH2P9wUw7RNuOjOFsDe+y0VBr8ysE
 ve39t/yp+mYalNmHVFl8s3dBDgrIeGKiz+Wgw3zCQIBZ18rJE1dREhv37RlYZ3a2
 wSvuObe8sVpCTyKIowDs1xUi7qJUBoopMSuqfleSHawRrcgCpV99U8/KNFF5plLH
 7fXOBAHHniVCVc+mqQN2wxtVJDhST+U3TaU4GwlKy9Yevr+iibdOXffveeIgNEU4
 D6q1F2zKp6UdV3+p8hxyaTTbiCVDqsp9WOgY/0I/f+CddYn0WVZgOlR+ow==
 =mYFL
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
  Generic:

   - Clean up locking of all vCPUs for a VM by using the *_nest_lock()
     family of functions, and move duplicated code to virt/kvm/. kernel/
     patches acked by Peter Zijlstra

   - Add MGLRU support to the access tracking perf test

  ARM fixes:

   - Make the irqbypass hooks resilient to changes in the GSI<->MSI
     routing, avoiding behind stale vLPI mappings being left behind. The
     fix is to resolve the VGIC IRQ using the host IRQ (which is stable)
     and nuking the vLPI mapping upon a routing change

   - Close another VGIC race where vCPU creation races with VGIC
     creation, leading to in-flight vCPUs entering the kernel w/o
     private IRQs allocated

   - Fix a build issue triggered by the recently added workaround for
     Ampere's AC04_CPU_23 erratum

   - Correctly sign-extend the VA when emulating a TLBI instruction
     potentially targeting a VNCR mapping

   - Avoid dereferencing a NULL pointer in the VGIC debug code, which
     can happen if the device doesn't have any mapping yet

  s390:

   - Fix interaction between some filesystems and Secure Execution

   - Some cleanups and refactorings, preparing for an upcoming big
     series

  x86:

   - Wait for target vCPU to ack KVM_REQ_UPDATE_PROTECTED_GUEST_STATE
     to fix a race between AP destroy and VMRUN

   - Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for
     the VM

   - Refine and harden handling of spurious faults

   - Add support for ALLOWED_SEV_FEATURES

   - Add #VMGEXIT to the set of handlers special cased for
     CONFIG_RETPOLINE=y

   - Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing
     features that utilize those bits

   - Don't account temporary allocations in sev_send_update_data()

   - Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock
     Threshold

   - Unify virtualization of IBRS on nested VM-Exit, and cross-vCPU
     IBPB, between SVM and VMX

   - Advertise support to userspace for WRMSRNS and PREFETCHI

   - Rescan I/O APIC routes after handling EOI that needed to be
     intercepted due to the old/previous routing, but not the
     new/current routing

   - Add a module param to control and enumerate support for device
     posted interrupts

   - Fix a potential overflow with nested virt on Intel systems running
     32-bit kernels

   - Flush shadow VMCSes on emergency reboot

   - Add support for SNP to the various SEV selftests

   - Add a selftest to verify fastops instructions via forced emulation

   - Refine and optimize KVM's software processing of the posted
     interrupt bitmap, and share the harvesting code between KVM and the
     kernel's Posted MSI handler"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits)
  rtmutex_api: provide correct extern functions
  KVM: arm64: vgic-debug: Avoid dereferencing NULL ITE pointer
  KVM: arm64: vgic-init: Plug vCPU vs. VGIC creation race
  KVM: arm64: Unmap vLPIs affected by changes to GSI routing information
  KVM: arm64: Resolve vLPI by host IRQ in vgic_v4_unset_forwarding()
  KVM: arm64: Protect vLPI translation with vgic_irq::irq_lock
  KVM: arm64: Use lock guard in vgic_v4_set_forwarding()
  KVM: arm64: Mask out non-VA bits from TLBI VA* on VNCR invalidation
  arm64: sysreg: Drag linux/kconfig.h to work around vdso build issue
  KVM: s390: Simplify and move pv code
  KVM: s390: Refactor and split some gmap helpers
  KVM: s390: Remove unneeded srcu lock
  s390: Remove unneeded includes
  s390/uv: Improve splitting of large folios that cannot be split while dirty
  s390/uv: Always return 0 from s390_wiggle_split_folio() if successful
  s390/uv: Don't return 0 from make_hva_secure() if the operation was not successful
  rust: add helper for mutex_trylock
  RISC-V: KVM: use kvm_trylock_all_vcpus when locking all vCPUs
  KVM: arm64: use kvm_trylock_all_vcpus when locking all vCPUs
  x86: KVM: SVM: use kvm_lock_all_vcpus instead of a custom implementation
  ...
2025-06-02 12:24:58 -07:00
Ming Lei
da12597a1d selftests: ublk: cover PER_IO_DAEMON in more stress tests
We have stress_03, stress_04 and stress_05 for checking new feature vs.
stress IO & device removal & ublk server crash & recovery, so let the
three existing stress tests cover PER_IO_DAEMON.

Then stress_06 can be removed, since the same test function is included in
stress_03.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250602132113.1398645-1-ming.lei@redhat.com
Reviewed-by: Uday Shankar <ushankar@purestorage.com>
[axboe: remove test_stress_06.sh from Makefile too]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-02 12:00:31 -06:00
Yonghong Song
baa39c169d selftests/bpf: Fix selftest btf_tag/btf_type_tag_percpu_vmlinux_helper failure
Ihor Solodrai reported selftest 'btf_tag/btf_type_tag_percpu_vmlinux_helper'
failure ([1]) during 6.16 merge window. The failure log:

  ...
  7: (15) if r0 == 0x0 goto pc+1        ; R0=ptr_css_rstat_cpu()
  ; *(volatile int *)rstat; @ btf_type_tag_percpu.c:68
  8: (61) r1 = *(u32 *)(r0 +0)
  cannot access ptr member updated_children with moff 0 in struct css_rstat_cpu with off 0 size 4

Two changes are needed. First, 'struct cgroup_rstat_cpu' needs to be
replaced with 'struct css_rstat_cpu' to be consistent with new data
structure. Second, layout of 'css_rstat_cpu' is changed compared
to 'cgroup_rstat_cpu'. The first member becomes a pointer so
the bpf prog needs to do 8-byte load instead of 4-byte load.

  [1] https://lore.kernel.org/bpf/6f688f2e-7d26-423a-9029-d1b1ef1c938a@linux.dev/

Cc: Ihor Solodrai <ihor.solodrai@linux.dev>
Cc: JP Kobryn <inwardvessel@gmail.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: JP Kobryn <inwardvessel@gmail.com>
Link: https://lore.kernel.org/r/20250529201151.1787575-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-01 13:07:47 -07:00
Saket Kumar Bhaskar
4b65d5ae97 selftests/bpf: Fix bpf selftest build error
On linux-next, build for bpf selftest displays an error due to
mismatch in the expected function signature of bpf_testmod_test_read
and bpf_testmod_test_write.

Commit 97d06802d1 ("sysfs: constify bin_attribute argument of bin_attribute::read/write()")
changed the required type for struct bin_attribute to const struct bin_attribute.

To resolve the error, update corresponding signature for the callback.

Fixes: 97d06802d1 ("sysfs: constify bin_attribute argument of bin_attribute::read/write()")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/all/e915da49-2b9a-4c4c-a34f-877f378129f6@linux.ibm.com/
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Link: https://lore.kernel.org/r/20250512091108.2015615-1-skb99@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-01 12:57:41 -07:00
Mark Brown
66bce7afba selftests/mm: fix test result reporting in gup_longterm
The kselftest framework uses the string logged when a test result is
reported as the unique identifier for a test, using it to track test
results between runs.  The gup_longterm test fails to follow this pattern,
it runs a single test function repeatedly with various parameters but each
result report is a string logging an error message which is fixed between
runs.

Since the code already logs each test uniquely before it starts refactor
to also print this to a buffer, then use that name as the test result. 
This isn't especially pretty but is relatively straightforward and is a
great help to tooling.

Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-4-ff198df8e38e@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:16 -07:00
Mark Brown
3f2d9a9ac5 selftests/mm: report unique test names for each cow test
The kselftest framework uses the string logged when a test result is
reported as the unique identifier for a test, using it to track test
results between runs.  The cow test completely fails to follow this
pattern, it runs test functions repeatedly with various parameters with
each result report from those functions being a string logging an error
message which is fixed between runs.

Since the code already logs each test uniquely before it starts refactor
to also print this to a buffer, then use that name as the test result. 
This isn't especially pretty but is relatively straightforward and is a
great help to tooling.

Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-3-ff198df8e38e@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:16 -07:00
Mark Brown
3f192afbed selftests/mm: add helper for logging test start and results
Several of the MM tests have a pattern of printing a description of the
test to be run then reporting the actual TAP result using a generic string
not connected to the specific test, often in a shared function used by
many tests.  The name reported typically varies depending on the specific
result rather than the test too.  This causes problems for tooling that
works with test results, the names reported with the results are used to
deduplicate tests and track them between runs so both duplicated names and
changing names cause trouble for things like UIs and automated bisection.

As a first step towards matching these tests better with the expectations
of kselftest provide helpers which record the test name as part of the
initial print and then use that as part of reporting a result.

This is not added as a generic kselftest helper partly because the use of
a variable to store the test name doesn't fit well with the header only
implementation of kselftest.h and partly because it's not really an
intended pattern.  Ideally at some point the mm tests that use it will be
updated to not need it.

Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-2-ff198df8e38e@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:16 -07:00
Mark Brown
109364fce5 selftests/mm: use standard ksft_finished() in cow and gup_longterm
Patch series "selftests/mm: cow and gup_longterm cleanups", v2.

The bulk of these changes modify the cow and gup_longterm tests to report
unique and stable names for each test, bringing them into line with the
expectations of tooling that works with kselftest.  The string reported as
a test result is used by tooling to both deduplicate tests and track tests
between test runs, using the same string for multiple tests or changing
the string depending on test result causes problems for user interfaces
and automation such as bisection.

It was suggested that converting to use kselftest_harness.h would be a
good way of addressing this, however that really wants the set of tests to
run to be known at compile time but both test programs dynamically
enumarate the set of huge page sizes the system supports and test each. 
Refactoring to handle this would be even more invasive than these changes
which are large but straightforward and repetitive.

A version of the main gup_longterm cleanup was previously sent separately,
this version factors out the helpers for logging the start of the test
since the cow test looks very similar.


This patch (of 4):

The cow and gup_longterm test programs open code something that looks a
lot like the standard ksft_finished() helper to summarise the test results
and provide an exit code, convert to use ksft_finished().

Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-0-ff198df8e38e@kernel.org
Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-1-ff198df8e38e@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:15 -07:00
Enze Li
79509ec1d2 selftests/damon/_damon_sysfs: skip testcases if CONFIG_DAMON_SYSFS is disabled
When CONFIG_DAMON_SYSFS is disabled, the selftests fail with the following
outputs,

not ok 2 selftests: damon: sysfs_update_schemes_tried_regions_wss_estimation.py # exit=1
not ok 3 selftests: damon: damos_quota.py # exit=1
not ok 4 selftests: damon: damos_quota_goal.py # exit=1
not ok 5 selftests: damon: damos_apply_interval.py # exit=1
not ok 6 selftests: damon: damos_tried_regions.py # exit=1
not ok 7 selftests: damon: damon_nr_regions.py # exit=1
not ok 11 selftests: damon: sysfs_update_schemes_tried_regions_hang.py # exit=1

The root cause of this issue is that all the testcases above do not check
the sysfs interface of DAMON whether it exists or not.  With this patch
applied, all the testcases above now pass successfully.

Link: https://lkml.kernel.org/r/20250531093937.1555159-1-lienze@kylinos.cn
Signed-off-by: Enze Li <lienze@kylinos.cn>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:15 -07:00
Dan Carpenter
83da212b7f tools/testing: check correct variable in open_procmap()
Check if "procmap_out->fd" is negative instead of "procmap_out" (which is
a pointer).

Link: https://lkml.kernel.org/r/aDbFuUTlJTBqziVd@stanley.mountain
Fixes: bd23f293a0 ("tools/testing: add PROCMAP_QUERY helper functions in mm self tests")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: levi.yun <yeoreum.yun@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:14 -07:00
Lorenzo Stoakes
918850c136 tools/testing/vma: add missing function stub
The hugetlb fix introduced in commit ee40c9920a ("mm: fix copy_vma()
error handling for hugetlb mappings") mistakenly did not provide a stub
for the VMA userland testing, which results in a compile error when trying
to build this.

Provide this stub to resolve the issue.

Link: https://lkml.kernel.org/r/20250528-fix-vma-test-v1-1-c8a5f533b38f@oracle.com
Fixes: ee40c9920a ("mm: fix copy_vma() error handling for hugetlb mappings")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by:  Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:14 -07:00
David Hildenbrand
bb084994d3 selftests/mm: two fixes for the pfnmap test
When unregistering the signal handler, we have to pass SIG_DFL, and
blindly reading from PFN 0 and PFN 1 seems to be problematic on !x86
systems.  In particularly, on arm64 tx2 machines where noting resides at
these physical memory locations, we can generate RAS errors.

Let's fix it by scanning /proc/iomem for actual "System RAM".

Link: https://lkml.kernel.org/r/20250528195244.1182810-1-david@redhat.com
Fixes: 2616b37032 ("selftests/mm: add simple VM_PFNMAP tests based on mmap'ing /dev/mem")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Closes: https://lore.kernel.org/all/232960c2-81db-47ca-a337-38c4bce5f997@arm.com/T/#u
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:14 -07:00
Mark Brown
cfc695109a selftests/mm: deduplicate test names in madv_populate
The madv_populate selftest has some repetitive code for several different
cases that it covers, included repeated test names used in
ksft_test_result() reports.  This causes problems for automation, the test
name is used to both track the test between runs and distinguish between
multiple tests within the same run.  Fix this by tweaking the messages
with duplication to be more specific about the contexts they're in.

Link: https://lkml.kernel.org/r/20250522-selftests-mm-madv-populate-dedupe-v1-1-fd1dedd79b4b@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:13 -07:00
Zi Yan
115155901d selftests/mm: skip hugevm test if kernel config file is not present
When running hugevm tests in a machine without kernel config present,
e.g., a VM running a kernel without CONFIG_IKCONFIG_PROC nor
/boot/config-*, skip hugevm tests, which reads kernel config to get page
table level information.

Link: https://lkml.kernel.org/r/20250516132938.356627-3-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Adam Sindelar <adam@wowsignal.io>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:11 -07:00
Zi Yan
6d21130312 selftests/mm: skip guard_regions.uffd tests when uffd is not present
Patch series "Skip mm selftests instead when kernel features are not
present", v2.

Two guard_regions tests on userfaultfd fail when userfaultfd is not
present.  Skip them instead.

hugevm test reads kernel config to get page table level information and
fails when neither /proc/config.gz nor /boot/config-* is present.  Skip it
instead.


This patch (of 2):

When userfaultfd is not compiled into kernel, userfaultfd() returns -1,
causing guard_regions.uffd tests to fail.  Skip the tests instead.

Link: https://lkml.kernel.org/r/20250516132938.356627-1-ziy@nvidia.com
Link: https://lkml.kernel.org/r/20250516132938.356627-2-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Adam Sindelar <adam@wowsignal.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:11 -07:00
Mark Brown
9abb8c208f selftests/mm: deduplicate default page size test results in thuge-gen
The thuge-gen test program runs mmap() and shmget() tests for both every
available page size and the default page size, resulting in two tests for
the default size.  These tests are distinct since the flags in the default
case do not specify an explicit size, add the flags to the test name that
is logged to deduplicate.

Link: https://lkml.kernel.org/r/20250515-selfests-mm-thuge-gen-dup-v1-1-057d2836553f@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Dev Jain <dev.jain@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:08 -07:00
Mark Brown
62973e3867 selftests/mm: deduplicate test logging in test_mlock_lock()
The mlock2-tests test_mlock_lock() test reports two test results with an
identical string, one reporitng if it successfully locked a block of
memory and another reporting if the lock is still present after doing an
unlock (following a similar pattern to other tests in the same program). 
This confuses test automation since the test string is used to deduplicate
tests, change the post unlock test to report "Unlocked" instead like the
other tests to fix this.

Link: https://lkml.kernel.org/r/20250515-selftest-mm-mlock2-dup-v1-1-963d5d7d243a@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Dev Jain <dev.jain@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31 22:46:08 -07:00
Linus Torvalds
7d4e49a77d - The 3 patch series "hung_task: extend blocking task stacktrace dump to
semaphore" from Lance Yang enhances the hung task detector.  The
   detector presently dumps the blocking tasks's stack when it is blocked
   on a mutex.  Lance's series extends this to semaphores.
 
 - The 2 patch series "nilfs2: improve sanity checks in dirty state
   propagation" from Wentao Liang addresses a couple of minor flaws in
   nilfs2.
 
 - The 2 patch series "scripts/gdb: Fixes related to lx_per_cpu()" from
   Illia Ostapyshyn fixes a couple of issues in the gdb scripts.
 
 - The 9 patch series "Support kdump with LUKS encryption by reusing LUKS
   volume keys" from Coiby Xu addresses a usability problem with kdump.
   When the dump device is LUKS-encrypted, the kdump kernel may not have
   the keys to the encrypted filesystem.  A full writeup of this is in the
   series [0/N] cover letter.
 
 - The 2 patch series "sysfs: add counters for lockups and stalls" from
   Max Kellermann adds /sys/kernel/hardlockup_count and
   /sys/kernel/hardlockup_count and /sys/kernel/rcu_stall_count.
 
 - The 3 patch series "fork: Page operation cleanups in the fork code"
   from Pasha Tatashin implements a number of code cleanups in fork.c.
 
 - The 3 patch series "scripts/gdb/symbols: determine KASLR offset on
   s390 during early boot" from Ilya Leoshkevich fixes some s390 issues in
   the gdb scripts.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaDuCvQAKCRDdBJ7gKXxA
 jrkxAQCnFAp/uK9ckkbN4nfpJ0+OMY36C+A+dawSDtuRsIkXBAEAq3e6MNAUdg5W
 Ca0cXdgSIq1Op7ZKEA+66Km6Rfvfow8=
 =g45L
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "hung_task: extend blocking task stacktrace dump to semaphore" from
   Lance Yang enhances the hung task detector.

   The detector presently dumps the blocking tasks's stack when it is
   blocked on a mutex. Lance's series extends this to semaphores

 - "nilfs2: improve sanity checks in dirty state propagation" from
   Wentao Liang addresses a couple of minor flaws in nilfs2

 - "scripts/gdb: Fixes related to lx_per_cpu()" from Illia Ostapyshyn
   fixes a couple of issues in the gdb scripts

 - "Support kdump with LUKS encryption by reusing LUKS volume keys" from
   Coiby Xu addresses a usability problem with kdump.

   When the dump device is LUKS-encrypted, the kdump kernel may not have
   the keys to the encrypted filesystem. A full writeup of this is in
   the series [0/N] cover letter

 - "sysfs: add counters for lockups and stalls" from Max Kellermann adds
   /sys/kernel/hardlockup_count and /sys/kernel/hardlockup_count and
   /sys/kernel/rcu_stall_count

 - "fork: Page operation cleanups in the fork code" from Pasha Tatashin
   implements a number of code cleanups in fork.c

 - "scripts/gdb/symbols: determine KASLR offset on s390 during early
   boot" from Ilya Leoshkevich fixes some s390 issues in the gdb
   scripts

* tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (67 commits)
  llist: make llist_add_batch() a static inline
  delayacct: remove redundant code and adjust indentation
  squashfs: add optional full compressed block caching
  crash_dump, nvme: select CONFIGFS_FS as built-in
  scripts/gdb/symbols: determine KASLR offset on s390 during early boot
  scripts/gdb/symbols: factor out pagination_off()
  scripts/gdb/symbols: factor out get_vmlinux()
  kernel/panic.c: format kernel-doc comments
  mailmap: update and consolidate Casey Connolly's name and email
  nilfs2: remove wbc->for_reclaim handling
  fork: define a local GFP_VMAP_STACK
  fork: check charging success before zeroing stack
  fork: clean-up naming of vm_stack/vm_struct variables in vmap stacks code
  fork: clean-up ifdef logic around stack allocation
  kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
  kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count
  x86/crash: make the page that stores the dm crypt keys inaccessible
  x86/crash: pass dm crypt keys to kdump kernel
  Revert "x86/mm: Remove unused __set_memory_prot()"
  crash_dump: retrieve dm crypt keys in kdump kernel
  ...
2025-05-31 19:12:53 -07:00
Linus Torvalds
00c010e130 - The 11 patch series "Add folio_mk_pte()" from Matthew Wilcox
simplifies the act of creating a pte which addresses the first page in a
   folio and reduces the amount of plumbing which architecture must
   implement to provide this.
 
 - The 8 patch series "Misc folio patches for 6.16" from Matthew Wilcox
   is a shower of largely unrelated folio infrastructure changes which
   clean things up and better prepare us for future work.
 
 - The 3 patch series "memory,x86,acpi: hotplug memory alignment
   advisement" from Gregory Price adds early-init code to prevent x86 from
   leaving physical memory unused when physical address regions are not
   aligned to memory block size.
 
 - The 2 patch series "mm/compaction: allow more aggressive proactive
   compaction" from Michal Clapinski provides some tuning of the (sadly,
   hard-coded (more sadly, not auto-tuned)) thresholds for our invokation
   of proactive compaction.  In a simple test case, the reduction of a guest
   VM's memory consumption was dramatic.
 
 - The 8 patch series "Minor cleanups and improvements to swap freeing
   code" from Kemeng Shi provides some code cleaups and a small efficiency
   improvement to this part of our swap handling code.
 
 - The 6 patch series "ptrace: introduce PTRACE_SET_SYSCALL_INFO API"
   from Dmitry Levin adds the ability for a ptracer to modify syscalls
   arguments.  At this time we can alter only "system call information that
   are used by strace system call tampering, namely, syscall number,
   syscall arguments, and syscall return value.
 
   This series should have been incorporated into mm.git's "non-MM"
   branch, but I goofed.
 
 - The 3 patch series "fs/proc: extend the PAGEMAP_SCAN ioctl to report
   guard regions" from Andrei Vagin extends the info returned by the
   PAGEMAP_SCAN ioctl against /proc/pid/pagemap.  This permits CRIU to more
   efficiently get at the info about guard regions.
 
 - The 2 patch series "Fix parameter passed to page_mapcount_is_type()"
   from Gavin Shan implements that fix.  No runtime effect is expected
   because validate_page_before_insert() happens to fix up this error.
 
 - The 3 patch series "kernel/events/uprobes: uprobe_write_opcode()
   rewrite" from David Hildenbrand basically brings uprobe text poking into
   the current decade.  Remove a bunch of hand-rolled implementation in
   favor of using more current facilities.
 
 - The 3 patch series "mm/ptdump: Drop assumption that pxd_val() is u64"
   from Anshuman Khandual provides enhancements and generalizations to the
   pte dumping code.  This might be needed when 128-bit Page Table
   Descriptors are enabled for ARM.
 
 - The 12 patch series "Always call constructor for kernel page tables"
   from Kevin Brodsky "ensures that the ctor/dtor is always called for
   kernel pgtables, as it already is for user pgtables".  This permits the
   addition of more functionality such as "insert hooks to protect page
   tables".  This change does result in various architectures performing
   unnecesary work, but this is fixed up where it is anticipated to occur.
 
 - The 9 patch series "Rust support for mm_struct, vm_area_struct, and
   mmap" from Alice Ryhl adds plumbing to permit Rust access to core MM
   structures.
 
 - The 3 patch series "fix incorrectly disallowed anonymous VMA merges"
   from Lorenzo Stoakes takes advantage of some VMA merging opportunities
   which we've been missing for 15 years.
 
 - The 4 patch series "mm/madvise: batch tlb flushes for MADV_DONTNEED
   and MADV_FREE" from SeongJae Park optimizes process_madvise()'s TLB
   flushing.  Instead of flushing each address range in the provided iovec,
   we batch the flushing across all the iovec entries.  The syscall's cost
   was approximately halved with a microbenchmark which was designed to
   load this particular operation.
 
 - The 6 patch series "Track node vacancy to reduce worst case allocation
   counts" from Sidhartha Kumar makes the maple tree smarter about its node
   preallocation.  stress-ng mmap performance increased by single-digit
   percentages and the amount of unnecessarily preallocated memory was
   dramaticelly reduced.
 
 - The 3 patch series "mm/gup: Minor fix, cleanup and improvements" from
   Baoquan He removes a few unnecessary things which Baoquan noted when
   reading the code.
 
 - The 3 patch series ""Enhance sysfs handling for memory hotplug in
   weighted interleave" from Rakie Kim "enhances the weighted interleave
   policy in the memory management subsystem by improving sysfs handling,
   fixing memory leaks, and introducing dynamic sysfs updates for memory
   hotplug support".  Fixes things on error paths which we are unlikely to
   hit.
 
 - The 7 patch series "mm/damon: auto-tune DAMOS for NUMA setups
   including tiered memory" from SeongJae Park introduces new DAMOS quota
   goal metrics which eliminate the manual tuning which is required when
   utilizing DAMON for memory tiering.
 
 - The 5 patch series "mm/vmalloc.c: code cleanup and improvements" from
   Baoquan He provides cleanups and small efficiency improvements which
   Baoquan found via code inspection.
 
 - The 2 patch series "vmscan: enforce mems_effective during demotion"
   from Gregory Price "changes reclaim to respect cpuset.mems_effective
   during demotion when possible".  because "presently, reclaim explicitly
   ignores cpuset.mems_effective when demoting, which may cause the cpuset
   settings to violated." "This is useful for isolating workloads on a
   multi-tenant system from certain classes of memory more consistently."
 
 - The 2 patch series ""Clean up split_huge_pmd_locked() and remove
   unnecessary folio pointers" from Gavin Guo provides minor cleanups and
   efficiency gains in in the huge page splitting and migrating code.
 
 - The 3 patch series "Use kmem_cache for memcg alloc" from Huan Yang
   creates a slab cache for `struct mem_cgroup', yielding improved memory
   utilization.
 
 - The 4 patch series "add max arg to swappiness in memory.reclaim and
   lru_gen" from Zhongkun He adds a new "max" argument to the "swappiness="
   argument for memory.reclaim MGLRU's lru_gen.  This directs proactive
   reclaim to reclaim from only anon folios rather than file-backed folios.
 
 - The 17 patch series "kexec: introduce Kexec HandOver (KHO)" from Mike
   Rapoport is the first step on the path to permitting the kernel to
   maintain existing VMs while replacing the host kernel via file-based
   kexec.  At this time only memblock's reserve_mem is preserved.
 
 - The 7 patch series "mm: Introduce for_each_valid_pfn()" from David
   Woodhouse provides and uses a smarter way of looping over a pfn range.
   By skipping ranges of invalid pfns.
 
 - The 2 patch series "sched/numa: Skip VMA scanning on memory pinned to
   one NUMA node via cpuset.mems" from Libo Chen removes a lot of pointless
   VMA scanning when a task is pinned a single NUMA mode.  Dramatic
   performance benefits were seen in some real world cases.
 
 - The 2 patch series "JFS: Implement migrate_folio for
   jfs_metapage_aops" from Shivank Garg addresses a warning which occurs
   during memory compaction when using JFS.
 
 - The 4 patch series "move all VMA allocation, freeing and duplication
   logic to mm" from Lorenzo Stoakes moves some VMA code from kernel/fork.c
   into the more appropriate mm/vma.c.
 
 - The 6 patch series "mm, swap: clean up swap cache mapping helper" from
   Kairui Song provides code consolidation and cleanups related to the
   folio_index() function.
 
 - The 2 patch series "mm/gup: Cleanup memfd_pin_folios()" from Vishal
   Moola does that.
 
 - The 8 patch series "memcg: Fix test_memcg_min/low test failures" from
   Waiman Long addresses some bogus failures which are being reported by
   the test_memcontrol selftest.
 
 - The 3 patch series "eliminate mmap() retry merge, add .mmap_prepare
   hook" from Lorenzo Stoakes commences the deprecation of
   file_operations.mmap() in favor of the new
   file_operations.mmap_prepare().  The latter is more restrictive and
   prevents drivers from messing with things in ways which, amongst other
   problems, may defeat VMA merging.
 
 - The 4 patch series "memcg: decouple memcg and objcg stocks"" from
   Shakeel Butt decouples the per-cpu memcg charge cache from the objcg's
   one.  This is a step along the way to making memcg and objcg charging
   NMI-safe, which is a BPF requirement.
 
 - The 6 patch series "mm/damon: minor fixups and improvements for code,
   tests, and documents" from SeongJae Park is "yet another batch of
   miscellaneous DAMON changes.  Fix and improve minor problems in code,
   tests and documents."
 
 - The 7 patch series "memcg: make memcg stats irq safe" from Shakeel
   Butt converts memcg stats to be irq safe.  Another step along the way to
   making memcg charging and stats updates NMI-safe, a BPF requirement.
 
 - The 4 patch series "Let unmap_hugepage_range() and several related
   functions take folio instead of page" from Fan Ni provides folio
   conversions in the hugetlb code.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaDt5qgAKCRDdBJ7gKXxA
 ju6XAP9nTiSfRz8Cz1n5LJZpFKEGzLpSihCYyR6P3o1L9oe3mwEAlZ5+XAwk2I5x
 Qqb/UGMEpilyre1PayQqOnct3aSL9Ao=
 =tYYm
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "Add folio_mk_pte()" from Matthew Wilcox simplifies the act of
   creating a pte which addresses the first page in a folio and reduces
   the amount of plumbing which architecture must implement to provide
   this.

 - "Misc folio patches for 6.16" from Matthew Wilcox is a shower of
   largely unrelated folio infrastructure changes which clean things up
   and better prepare us for future work.

 - "memory,x86,acpi: hotplug memory alignment advisement" from Gregory
   Price adds early-init code to prevent x86 from leaving physical
   memory unused when physical address regions are not aligned to memory
   block size.

 - "mm/compaction: allow more aggressive proactive compaction" from
   Michal Clapinski provides some tuning of the (sadly, hard-coded (more
   sadly, not auto-tuned)) thresholds for our invokation of proactive
   compaction. In a simple test case, the reduction of a guest VM's
   memory consumption was dramatic.

 - "Minor cleanups and improvements to swap freeing code" from Kemeng
   Shi provides some code cleaups and a small efficiency improvement to
   this part of our swap handling code.

 - "ptrace: introduce PTRACE_SET_SYSCALL_INFO API" from Dmitry Levin
   adds the ability for a ptracer to modify syscalls arguments. At this
   time we can alter only "system call information that are used by
   strace system call tampering, namely, syscall number, syscall
   arguments, and syscall return value.

   This series should have been incorporated into mm.git's "non-MM"
   branch, but I goofed.

 - "fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions" from
   Andrei Vagin extends the info returned by the PAGEMAP_SCAN ioctl
   against /proc/pid/pagemap. This permits CRIU to more efficiently get
   at the info about guard regions.

 - "Fix parameter passed to page_mapcount_is_type()" from Gavin Shan
   implements that fix. No runtime effect is expected because
   validate_page_before_insert() happens to fix up this error.

 - "kernel/events/uprobes: uprobe_write_opcode() rewrite" from David
   Hildenbrand basically brings uprobe text poking into the current
   decade. Remove a bunch of hand-rolled implementation in favor of
   using more current facilities.

 - "mm/ptdump: Drop assumption that pxd_val() is u64" from Anshuman
   Khandual provides enhancements and generalizations to the pte dumping
   code. This might be needed when 128-bit Page Table Descriptors are
   enabled for ARM.

 - "Always call constructor for kernel page tables" from Kevin Brodsky
   ensures that the ctor/dtor is always called for kernel pgtables, as
   it already is for user pgtables.

   This permits the addition of more functionality such as "insert hooks
   to protect page tables". This change does result in various
   architectures performing unnecesary work, but this is fixed up where
   it is anticipated to occur.

 - "Rust support for mm_struct, vm_area_struct, and mmap" from Alice
   Ryhl adds plumbing to permit Rust access to core MM structures.

 - "fix incorrectly disallowed anonymous VMA merges" from Lorenzo
   Stoakes takes advantage of some VMA merging opportunities which we've
   been missing for 15 years.

 - "mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE" from
   SeongJae Park optimizes process_madvise()'s TLB flushing.

   Instead of flushing each address range in the provided iovec, we
   batch the flushing across all the iovec entries. The syscall's cost
   was approximately halved with a microbenchmark which was designed to
   load this particular operation.

 - "Track node vacancy to reduce worst case allocation counts" from
   Sidhartha Kumar makes the maple tree smarter about its node
   preallocation.

   stress-ng mmap performance increased by single-digit percentages and
   the amount of unnecessarily preallocated memory was dramaticelly
   reduced.

 - "mm/gup: Minor fix, cleanup and improvements" from Baoquan He removes
   a few unnecessary things which Baoquan noted when reading the code.

 - ""Enhance sysfs handling for memory hotplug in weighted interleave"
   from Rakie Kim "enhances the weighted interleave policy in the memory
   management subsystem by improving sysfs handling, fixing memory
   leaks, and introducing dynamic sysfs updates for memory hotplug
   support". Fixes things on error paths which we are unlikely to hit.

 - "mm/damon: auto-tune DAMOS for NUMA setups including tiered memory"
   from SeongJae Park introduces new DAMOS quota goal metrics which
   eliminate the manual tuning which is required when utilizing DAMON
   for memory tiering.

 - "mm/vmalloc.c: code cleanup and improvements" from Baoquan He
   provides cleanups and small efficiency improvements which Baoquan
   found via code inspection.

 - "vmscan: enforce mems_effective during demotion" from Gregory Price
   changes reclaim to respect cpuset.mems_effective during demotion when
   possible. because presently, reclaim explicitly ignores
   cpuset.mems_effective when demoting, which may cause the cpuset
   settings to violated.

   This is useful for isolating workloads on a multi-tenant system from
   certain classes of memory more consistently.

 - "Clean up split_huge_pmd_locked() and remove unnecessary folio
   pointers" from Gavin Guo provides minor cleanups and efficiency gains
   in in the huge page splitting and migrating code.

 - "Use kmem_cache for memcg alloc" from Huan Yang creates a slab cache
   for `struct mem_cgroup', yielding improved memory utilization.

 - "add max arg to swappiness in memory.reclaim and lru_gen" from
   Zhongkun He adds a new "max" argument to the "swappiness=" argument
   for memory.reclaim MGLRU's lru_gen.

   This directs proactive reclaim to reclaim from only anon folios
   rather than file-backed folios.

 - "kexec: introduce Kexec HandOver (KHO)" from Mike Rapoport is the
   first step on the path to permitting the kernel to maintain existing
   VMs while replacing the host kernel via file-based kexec. At this
   time only memblock's reserve_mem is preserved.

 - "mm: Introduce for_each_valid_pfn()" from David Woodhouse provides
   and uses a smarter way of looping over a pfn range. By skipping
   ranges of invalid pfns.

 - "sched/numa: Skip VMA scanning on memory pinned to one NUMA node via
   cpuset.mems" from Libo Chen removes a lot of pointless VMA scanning
   when a task is pinned a single NUMA mode.

   Dramatic performance benefits were seen in some real world cases.

 - "JFS: Implement migrate_folio for jfs_metapage_aops" from Shivank
   Garg addresses a warning which occurs during memory compaction when
   using JFS.

 - "move all VMA allocation, freeing and duplication logic to mm" from
   Lorenzo Stoakes moves some VMA code from kernel/fork.c into the more
   appropriate mm/vma.c.

 - "mm, swap: clean up swap cache mapping helper" from Kairui Song
   provides code consolidation and cleanups related to the folio_index()
   function.

 - "mm/gup: Cleanup memfd_pin_folios()" from Vishal Moola does that.

 - "memcg: Fix test_memcg_min/low test failures" from Waiman Long
   addresses some bogus failures which are being reported by the
   test_memcontrol selftest.

 - "eliminate mmap() retry merge, add .mmap_prepare hook" from Lorenzo
   Stoakes commences the deprecation of file_operations.mmap() in favor
   of the new file_operations.mmap_prepare().

   The latter is more restrictive and prevents drivers from messing with
   things in ways which, amongst other problems, may defeat VMA merging.

 - "memcg: decouple memcg and objcg stocks"" from Shakeel Butt decouples
   the per-cpu memcg charge cache from the objcg's one.

   This is a step along the way to making memcg and objcg charging
   NMI-safe, which is a BPF requirement.

 - "mm/damon: minor fixups and improvements for code, tests, and
   documents" from SeongJae Park is yet another batch of miscellaneous
   DAMON changes. Fix and improve minor problems in code, tests and
   documents.

 - "memcg: make memcg stats irq safe" from Shakeel Butt converts memcg
   stats to be irq safe. Another step along the way to making memcg
   charging and stats updates NMI-safe, a BPF requirement.

 - "Let unmap_hugepage_range() and several related functions take folio
   instead of page" from Fan Ni provides folio conversions in the
   hugetlb code.

* tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (285 commits)
  mm: pcp: increase pcp->free_count threshold to trigger free_high
  mm/hugetlb: convert use of struct page to folio in __unmap_hugepage_range()
  mm/hugetlb: refactor __unmap_hugepage_range() to take folio instead of page
  mm/hugetlb: refactor unmap_hugepage_range() to take folio instead of page
  mm/hugetlb: pass folio instead of page to unmap_ref_private()
  memcg: objcg stock trylock without irq disabling
  memcg: no stock lock for cpu hot-unplug
  memcg: make __mod_memcg_lruvec_state re-entrant safe against irqs
  memcg: make count_memcg_events re-entrant safe against irqs
  memcg: make mod_memcg_state re-entrant safe against irqs
  memcg: move preempt disable to callers of memcg_rstat_updated
  memcg: memcg_rstat_updated re-entrant safe against irqs
  mm: khugepaged: decouple SHMEM and file folios' collapse
  selftests/eventfd: correct test name and improve messages
  alloc_tag: check mem_profiling_support in alloc_tag_init
  Docs/damon: update titles and brief introductions to explain DAMOS
  selftests/damon/_damon_sysfs: read tried regions directories in order
  mm/damon/tests/core-kunit: add a test for damos_set_filters_default_reject()
  mm/damon/paddr: remove unused variable, folio_list, in damon_pa_stat()
  mm/damon/sysfs-schemes: fix wrong comment on damons_sysfs_quota_goal_metric_strs
  ...
2025-05-31 15:44:16 -07:00
Uday Shankar
17574aa2a0 selftests: ublk: add stress test for per io daemons
Add a new test_stress_06 for the per io daemons feature. This is just a
copy of test_stress_01 with the per_io_tasks flag added, with varying
amounts of nthreads. This test is able to reproduce a panic which was
caught manually during development [1]; in the current version of this
patch set, it passes.

Note that this commit also makes all stress tests using the
run_io_and_remove helper more stressful by additionally exercising the
batch submit (queue_rqs) path.

[1] https://lore.kernel.org/linux-block/aDgwGoGCEpwd1mFY@fedora/

Suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-8-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-31 14:38:53 -06:00
Uday Shankar
236918d3e9 selftests: ublk: add functional test for per io daemons
Add a new test test_generic_12 which:

- sets up a ublk server with per_io_tasks and a different number of ublk
  server threads and ublk_queues. This is possible now that these
  objects are decoupled
- runs some I/O load from a single CPU
- verifies that all the ublk server threads handle some I/O

Before this changeset, this test fails, since I/O issued from one CPU is
always handled by the one ublk server thread. After this changeset, the
test passes.

In the future, the last check above may be strengthened to "verify that
all ublk server threads handle the same amount of I/O." However, this
requires some adjustments/bugfixes to tag allocation, so this work is
postponed to a followup.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-7-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-31 14:38:48 -06:00
Uday Shankar
abe54c1603 selftests: ublk: kublk: decouple ublk_queues from ublk server threads
Add support in kublk for decoupled ublk_queues and ublk server threads.
kublk now has two modes of operation:

- (preexisting mode) threads and queues are paired 1:1, and each thread
  services all the I/Os of one queue
- (new mode) thread and queue counts are independently configurable.
  threads service I/Os in a way that balances load across threads even
  if load is not balanced over queues.

The default is the preexisting mode. The new mode is activated by
passing the --per_io_tasks flag.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-6-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-31 14:38:43 -06:00
Uday Shankar
b9848ca7a7 selftests: ublk: kublk: move per-thread data out of ublk_queue
Towards the goal of decoupling ublk_queues from ublk server threads,
move resources/data that should be per-thread rather than per-queue out
of ublk_queue and into a new struct ublk_thread.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-5-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-31 14:38:39 -06:00
Uday Shankar
8f75ba28b8 selftests: ublk: kublk: lift queue initialization out of thread
Currently, each ublk server I/O handler thread initializes its own
queue. However, as we move towards decoupled ublk_queues and ublk server
threads, this model does not make sense anymore, as there will no longer
be a concept of a thread having "its own" queue. So lift queue
initialization out of the per-thread ublk_io_handler_fn and into a loop
in ublk_start_daemon (which runs once for each device).

There is a part of ublk_queue_init (ring initialization) which does
actually need to happen on the thread that will use the ring; that is
separated into a separate ublk_thread_init which is still called by each
I/O handler thread.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-4-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-31 14:38:35 -06:00
Uday Shankar
9773709752 selftests: ublk: kublk: tie sqe allocation to io instead of queue
We currently have a helper ublk_queue_alloc_sqes which the ublk targets
use to allocate SQEs for their own operations. However, as we move
towards decoupled ublk_queues and ublk server threads, this helper does
not make sense anymore. SQEs are allocated from rings, and we will have
one ring per thread to avoid locking. Change the SQE allocation helper
to ublk_io_alloc_sqes. Currently this still allocates SQEs from the io's
queue's ring, but when we fully decouple threads and queues, it will
allocate from the io's thread's ring instead.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-3-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-31 14:38:30 -06:00
Uday Shankar
bf098d7269 selftests: ublk: kublk: plumb q_id in io_uring user_data
Currently, when we process CQEs, we know which ublk_queue we are working
on because we know which ring we are working on, and ublk_queues and
rings are in 1:1 correspondence. However, as we decouple ublk_queues
from ublk server threads, ublk_queues and rings will no longer be in 1:1
correspondence - each ublk server thread will have a ring, and each
thread may issue commands against more than one ublk_queue. So in order
to know which ublk_queue a CQE refers to, plumb that information in the
associated SQE's user_data.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250529-ublk_task_per_io-v8-2-e9d3b119336a@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-31 14:38:26 -06:00
Mark Brown
4cb6c8af85 selftests/filesystems: Fix build of anon_inode_test
The newly added anon_inode_test test fails to build due to attempting to
include a nonexisting overlayfs/wrapper.h:

  anon_inode_test.c:10:10: fatal error: overlayfs/wrappers.h: No such file or directory
     10 | #include "overlayfs/wrappers.h"
        |          ^~~~~~~~~~~~~~~~~~~~~~

This is due to 0bd92b9fe5 ("selftests/filesystems: move wrapper.h out
of overlayfs subdir") which was added in the vfs-6.16.selftests branch
which was based on -rc5 and did not contain the newly added test so once
things were merged into mainline the build started failing - both parent
commits are fine.

Fixes: 3e406741b1 ("Merge tag 'vfs-6.16-rc1.selftests' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-05-31 08:43:53 -07:00
Ian Rogers
a913ef6fd8 perf callchain: Always populate the addr_location map when adding IP
Dropping symbols also meant the callchain maps wasn't populated, but
the callchain map is needed to find the DSO.

Plumb the symbols option better, falling back to thread__find_map()
rather than thread__find_symbol() when symbols are disabled.

Fixes: 02b2705017 ("perf callchain: Allow symbols to be optional when resolving a callchain")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <martin.liska@hey.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Matt Fleming <matt@readmodwrite.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steve Clevenger <scclevenger@os.amperecomputing.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Yujie Liu <yujie.liu@intel.com>
Cc: Zhongqiu Han <quic_zhonhan@quicinc.com>
Cc: Zixian Cai <fzczx123@gmail.com>
Link: https://lore.kernel.org/r/20250529044000.759937-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-31 08:58:30 -03:00
Namhyung Kim
0df14c1f1e perf lock contention: Reject more than 10ms delays for safety
Delaying kernel operations can be dangerous and the kernel may kill
(non-sleepable) BPF programs running for long in the future.

Limit the max delay to 10ms and update the document about it.

  $ sudo ./perf lock con -abl -J 100000us@cgroup_mutex true
  lock delay is too long: 100000us (> 10ms)

   Usage: perf lock contention [<options>]

      -J, --inject-delay <TIME@FUNC>
                            Inject delays to specific locks

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250515181042.555189-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-31 08:45:24 -03:00
Paul Chaignon
ead7f9b8de bpf: Fix L4 csum update on IPv6 in CHECKSUM_COMPLETE
In Cilium, we use bpf_csum_diff + bpf_l4_csum_replace to, among other
things, update the L4 checksum after reverse SNATing IPv6 packets. That
use case is however not currently supported and leads to invalid
skb->csum values in some cases. This patch adds support for IPv6 address
changes in bpf_l4_csum_update via a new flag.

When calling bpf_l4_csum_replace in Cilium, it ends up calling
inet_proto_csum_replace_by_diff:

    1:  void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb,
    2:                                       __wsum diff, bool pseudohdr)
    3:  {
    4:      if (skb->ip_summed != CHECKSUM_PARTIAL) {
    5:          csum_replace_by_diff(sum, diff);
    6:          if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr)
    7:              skb->csum = ~csum_sub(diff, skb->csum);
    8:      } else if (pseudohdr) {
    9:          *sum = ~csum_fold(csum_add(diff, csum_unfold(*sum)));
    10:     }
    11: }

The bug happens when we're in the CHECKSUM_COMPLETE state. We've just
updated one of the IPv6 addresses. The helper now updates the L4 header
checksum on line 5. Next, it updates skb->csum on line 7. It shouldn't.

For an IPv6 packet, the updates of the IPv6 address and of the L4
checksum will cancel each other. The checksums are set such that
computing a checksum over the packet including its checksum will result
in a sum of 0. So the same is true here when we update the L4 checksum
on line 5. We'll update it as to cancel the previous IPv6 address
update. Hence skb->csum should remain untouched in this case.

The same bug doesn't affect IPv4 packets because, in that case, three
fields are updated: the IPv4 address, the IP checksum, and the L4
checksum. The change to the IPv4 address and one of the checksums still
cancel each other in skb->csum, but we're left with one checksum update
and should therefore update skb->csum accordingly. That's exactly what
inet_proto_csum_replace_by_diff does.

This special case for IPv6 L4 checksums is also described atop
inet_proto_csum_replace16, the function we should be using in this case.

This patch introduces a new bpf_l4_csum_replace flag, BPF_F_IPV6,
to indicate that we're updating the L4 checksum of an IPv6 packet. When
the flag is set, inet_proto_csum_replace_by_diff will skip the
skb->csum update.

Fixes: 7d672345ed ("bpf: add generic bpf_csum_diff helper")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/96a6bc3a443e6f0b21ff7b7834000e17fb549e05.1748509484.git.paul.chaignon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-30 19:53:51 -07:00
Linus Torvalds
b78f1293f9 tracing updates for v6.16:
- Have module addresses get updated in the persistent ring buffer
 
   The addresses of the modules from the previous boot are saved in the
   persistent ring buffer. If the same modules are loaded and an address is
   in the old buffer points to an address that was both saved in the
   persistent ring buffer and is loaded in memory, shift the address to point
   to the address that is loaded in memory in the trace event.
 
 - Print function names for irqs off and preempt off callsites
 
   When ignoring the print fmt of a trace event and just printing the fields
   directly, have the fields for preempt off and irqs off events still show
   the function name (via kallsyms) instead of just showing the raw address.
 
 - Clean ups of the histogram code
 
   The histogram functions saved over 800 bytes on the stack to process
   events as they come in. Instead, create per-cpu buffers that can hold this
   information and have a separate location for each context level (thread,
   softirq, IRQ and NMI).
 
   Also add some more comments to the code.
 
 - Add "common_comm" field for histograms
 
   Add "common_comm" that uses the current->comm as a field in an event
   histogram and acts like any of the other fields of the event.
 
 - Show "subops" in the enabled_functions file
 
   When the function graph infrastructure is used, a subsystem has a "subops"
   that it attaches its callback function to. Instead of the
   enabled_functions just showing a function calling the function that calls
   the subops functions, also show the subops functions that will get called
   for that function too.
 
 - Add "copy_trace_marker" option to instances
 
   There are cases where an instance is created for tooling to write into,
   but the old tooling has the top level instance hardcoded into the
   application. New tools want to consume the data from an instance and not
   the top level buffer. By adding a copy_trace_marker option, whenever the
   top instance trace_marker is written into, a copy of it is also written
   into the instance with this option set. This allows new tools to read what
   old tools are writing into the top buffer.
 
   If this option is cleared by the top instance, then what is written into
   the trace_marker is not written into the top instance. This is a way to
   redirect the trace_marker writes into another instance.
 
 - Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp()
 
   If a tracepoint is created by DECLARE_TRACE() instead of TRACE_EVENT(),
   then it will not be exposed via tracefs. Currently there's no way to
   differentiate in the kernel the tracepoint functions between those that
   are exposed via tracefs or not. A calling convention has been made
   manually to append a "_tp" prefix for events created by DECLARE_TRACE().
   Instead of doing this manually, force it so that all DECLARE_TRACE()
   events have this notation.
 
 - Use __string() for task->comm in some sched events
 
   Instead of hardcoding the comm to be TASK_COMM_LEN in some of the
   scheduler events use __string() which makes it dynamic. Note, if these
   events are parsed by user space it they may break, and the event may have
   to be converted back to the hardcoded size.
 
 - Have function graph "depth" be unsigned to the user
 
   Internally to the kernel, the "depth" field of the function graph event is
   signed due to -1 being used for end of boundary. What actually gets
   recorded in the event itself is zero or positive. Reflect this to user
   space by showing "depth" as unsigned int and be consistent across all
   events.
 
 - Allow an arbitrary long CPU string to osnoise_cpus_write()
 
   The filtering of which CPUs to write to can exceed 256 bytes. If a machine
   has 256 CPUs, and the filter is to filter every other CPU, the write would
   take a string larger than 256 bytes. Instead of using a fixed size buffer
   on the stack that is 256 bytes, allocate it to handle what is passed in.
 
 - Stop having ftrace check the per-cpu data "disabled" flag
 
   The "disabled" flag in the data structure passed to most ftrace functions
   is checked to know if tracing has been disabled or not. This flag was
   added back in 2008 before the ring buffer had its own way to disable
   tracing. The "disable" flag is now not always set when needed, and the
   ring buffer flag should be used in all locations where the disabled is
   needed. Since the "disable" flag is redundant and incorrect, stop using it.
   Fix up some locations that use the "disable" flag to use the ring buffer
   info.
 
 - Use a new tracer_tracing_disable/enable() instead of data->disable flag
 
   There's a few cases that set the data->disable flag to stop tracing, but
   this flag is not consistently used. It is also an on/off switch where if a
   function set it and calls another function that sets it, the called
   function may incorrectly enable it.
 
   Use a new trace_tracing_disable() and tracer_tracing_enable() that uses a
   counter and can be nested. These use the ring buffer flags which are
   always checked making the disabling more consistent.
 
 - Save the trace clock in the persistent ring buffer
 
   Save what clock was used for tracing in the persistent ring buffer and set
   it back to that clock after a reboot.
 
 - Remove unused reference to a per CPU data pointer in mmiotrace functions
 
 - Remove unused buffer_page field from trace_array_cpu structure
 
 - Remove more strncpy() instances
 
 - Other minor clean ups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaDhiqRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qkheAQDpyRHoXF1AIoEqyahDax8f3vpZQeCH
 B/mn+YJmU1wuVgEA7AFALov5SHKv4IzoARz68GXtR0jGhP5D8uebUhUqDAQ=
 =WmFG
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Have module addresses get updated in the persistent ring buffer

   The addresses of the modules from the previous boot are saved in the
   persistent ring buffer. If the same modules are loaded and an address
   is in the old buffer points to an address that was both saved in the
   persistent ring buffer and is loaded in memory, shift the address to
   point to the address that is loaded in memory in the trace event.

 - Print function names for irqs off and preempt off callsites

   When ignoring the print fmt of a trace event and just printing the
   fields directly, have the fields for preempt off and irqs off events
   still show the function name (via kallsyms) instead of just showing
   the raw address.

 - Clean ups of the histogram code

   The histogram functions saved over 800 bytes on the stack to process
   events as they come in. Instead, create per-cpu buffers that can hold
   this information and have a separate location for each context level
   (thread, softirq, IRQ and NMI).

   Also add some more comments to the code.

 - Add "common_comm" field for histograms

   Add "common_comm" that uses the current->comm as a field in an event
   histogram and acts like any of the other fields of the event.

 - Show "subops" in the enabled_functions file

   When the function graph infrastructure is used, a subsystem has a
   "subops" that it attaches its callback function to. Instead of the
   enabled_functions just showing a function calling the function that
   calls the subops functions, also show the subops functions that will
   get called for that function too.

 - Add "copy_trace_marker" option to instances

   There are cases where an instance is created for tooling to write
   into, but the old tooling has the top level instance hardcoded into
   the application. New tools want to consume the data from an instance
   and not the top level buffer. By adding a copy_trace_marker option,
   whenever the top instance trace_marker is written into, a copy of it
   is also written into the instance with this option set. This allows
   new tools to read what old tools are writing into the top buffer.

   If this option is cleared by the top instance, then what is written
   into the trace_marker is not written into the top instance. This is a
   way to redirect the trace_marker writes into another instance.

 - Have tracepoints created by DECLARE_TRACE() use trace_<name>_tp()

   If a tracepoint is created by DECLARE_TRACE() instead of
   TRACE_EVENT(), then it will not be exposed via tracefs. Currently
   there's no way to differentiate in the kernel the tracepoint
   functions between those that are exposed via tracefs or not. A
   calling convention has been made manually to append a "_tp" prefix
   for events created by DECLARE_TRACE(). Instead of doing this
   manually, force it so that all DECLARE_TRACE() events have this
   notation.

 - Use __string() for task->comm in some sched events

   Instead of hardcoding the comm to be TASK_COMM_LEN in some of the
   scheduler events use __string() which makes it dynamic. Note, if
   these events are parsed by user space it they may break, and the
   event may have to be converted back to the hardcoded size.

 - Have function graph "depth" be unsigned to the user

   Internally to the kernel, the "depth" field of the function graph
   event is signed due to -1 being used for end of boundary. What
   actually gets recorded in the event itself is zero or positive.
   Reflect this to user space by showing "depth" as unsigned int and be
   consistent across all events.

 - Allow an arbitrary long CPU string to osnoise_cpus_write()

   The filtering of which CPUs to write to can exceed 256 bytes. If a
   machine has 256 CPUs, and the filter is to filter every other CPU,
   the write would take a string larger than 256 bytes. Instead of using
   a fixed size buffer on the stack that is 256 bytes, allocate it to
   handle what is passed in.

 - Stop having ftrace check the per-cpu data "disabled" flag

   The "disabled" flag in the data structure passed to most ftrace
   functions is checked to know if tracing has been disabled or not.
   This flag was added back in 2008 before the ring buffer had its own
   way to disable tracing. The "disable" flag is now not always set when
   needed, and the ring buffer flag should be used in all locations
   where the disabled is needed. Since the "disable" flag is redundant
   and incorrect, stop using it. Fix up some locations that use the
   "disable" flag to use the ring buffer info.

 - Use a new tracer_tracing_disable/enable() instead of data->disable
   flag

   There's a few cases that set the data->disable flag to stop tracing,
   but this flag is not consistently used. It is also an on/off switch
   where if a function set it and calls another function that sets it,
   the called function may incorrectly enable it.

   Use a new trace_tracing_disable() and tracer_tracing_enable() that
   uses a counter and can be nested. These use the ring buffer flags
   which are always checked making the disabling more consistent.

 - Save the trace clock in the persistent ring buffer

   Save what clock was used for tracing in the persistent ring buffer
   and set it back to that clock after a reboot.

 - Remove unused reference to a per CPU data pointer in mmiotrace
   functions

 - Remove unused buffer_page field from trace_array_cpu structure

 - Remove more strncpy() instances

 - Other minor clean ups and fixes

* tag 'trace-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (36 commits)
  tracing: Fix compilation warning on arm32
  tracing: Record trace_clock and recover when reboot
  tracing/sched: Use __string() instead of fixed lengths for task->comm
  tracepoint: Have tracepoints created with DECLARE_TRACE() have _tp suffix
  tracing: Cleanup upper_empty() in pid_list
  tracing: Allow the top level trace_marker to write into another instances
  tracing: Add a helper function to handle the dereference arg in verifier
  tracing: Remove unnecessary "goto out" that simply returns ret is trigger code
  tracing: Fix error handling in event_trigger_parse()
  tracing: Rename event_trigger_alloc() to trigger_data_alloc()
  tracing: Replace deprecated strncpy() with strscpy() for stack_trace_filter_buf
  tracing: Remove unused buffer_page field from trace_array_cpu structure
  tracing: Use atomic_inc_return() for updating "disabled" counter in irqsoff tracer
  tracing: Convert the per CPU "disabled" counter to local from atomic
  tracing: branch: Use trace_tracing_is_on_cpu() instead of "disabled" field
  ring-buffer: Add ring_buffer_record_is_on_cpu()
  tracing: Do not use per CPU array_buffer.data->disabled for cpumask
  ftrace: Do not disabled function graph based on "disabled" field
  tracing: kdb: Use tracer_tracing_on/off() instead of setting per CPU disabled
  tracing: Use tracer_tracing_disable() instead of "disabled" field for ftrace_dump_one()
  ...
2025-05-29 21:04:36 -07:00
Linus Torvalds
472c5f736b tracing tools updates for v6.16:
- Set distinctive value for failed tests
 
   When running "make check" that performs tests on rtla the failure is
   checked by examining the output. Instead have the tool return an error
   status if it exceeds the threadhold.
 
 - Define __NR_sched_setattr for LoongArch
 
   Define __NR_sched_setattr to allow this to build for LoongArch.
 
 - Define _GNU_SOURCE for timerlat_bpf.c
 
   Due to modifications of struct sched_attr in utils.h when _GNU_SOURCE is
   not defined, this can cause errors for timerlat_bpf_init() and breakage in
   BPF sample collection mode.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaDeTzBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qokRAP0XJzos+uvQtkGrqiX5SB/rn1s3/tiD
 nZagARyiV06BAwEA+NNzqFyx/BLUwMnpx/HFTnIMGXbRVWCVAEeL3t77zgk=
 =dx7t
 -----END PGP SIGNATURE-----

Merge tag 'trace-tools-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing tools updates from Steven Rostedt:

 - Set distinctive value for failed tests

   When running "make check" that performs tests on rtla the failure is
   checked by examining the output. Instead have the tool return an
   error status if it exceeds the threadhold.

 - Define __NR_sched_setattr for LoongArch

   Define __NR_sched_setattr to allow this to build for LoongArch.

 - Define _GNU_SOURCE for timerlat_bpf.c

   Due to modifications of struct sched_attr in utils.h when _GNU_SOURCE
   is not defined, this can cause errors for timerlat_bpf_init() and
   breakage in BPF sample collection mode.

* tag 'trace-tools-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rtla: Define _GNU_SOURCE in timerlat_bpf.c
  rtla: Define __NR_sched_setattr for LoongArch
  rtla: Set distinctive exit value for failed tests
2025-05-29 20:59:52 -07:00
Anubhav Shelat
8c56bfe53b perf trace: Set errpid to false for rseq and set_robust_list
The 'rseq' and 'set_robust_list' syscalls don't return a pid, so set
errpid for both to false.

Fixes: 0c1019e346 ("perf trace: Mark the 'rseq' arg in the rseq syscall as coming from user space")
Fixes: 1de5b5dcb8 ("perf trace: Mark the 'head' arg in the set_robust_list syscall as coming from user space")
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250529143334.1469669-2-ashelat@redhat.com
[ Remove explicit .errpid = false, omitting its initialization zeroes it, as noted by Namhyung ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-29 22:09:37 -03:00
Linus Torvalds
1193e205db platform-drivers-x86 for v6.16-1
Highlights:
 
  - alienware-wmi-wmax:
 
    - Add HWMON support
 
    - Add ABI and admin-guide documentation
 
    - Expose GPIO debug methods through debug FS
 
    - Support manual fan control and "custom" thermal profile
 
  - amd/hsmp:
 
    - Add sysfs files to show HSMP telemetry
 
    - Report power readings and limits via hwmon
 
  - amd/isp4: Add AMD ISP platform config for OV05C10
 
  - asus-wmi:
 
    - Refactor Ally suspend/resume to work better with older FW
 
    - hid-asus: check ROG Ally MCU version and warn about old FW versions
 
  - dasharo-acpi: Add driver for Dasharo devices supporting fans and
                  temperatures monitoring
 
  - dell-ddv:
 
    - Expose the battery health and manufacture date to userspace using
      power supply extensions
 
    - Implement the battery matching algorithm
 
  - dell-pc:
 
   - Improve error propagation
 
   - Use faux device
 
  - int3472:
 
    - Add delays to avoid GPIO regulator spikes
 
    - Add handshake pin support
 
    - Make regulator supply name configurable and allow registering more
      than 1 GPIO regulator
 
    - Map mt9m114 powerdown pin to powerenable
 
  - intel/pmc: Add separate SSRAM Telemetry driver
 
  - intel-uncore-freq: Add attributes to show agent types and die ID
 
  - ISST:
 
    - Support SST-TF revision 2 (allows more cores per bucket)
 
    - Support SST-PP revision 2 (fabric 1 frequencies)
 
    - Remove unnecessary SST MSRs restore (the package retains MSRs
      despite CPU offlining)
 
  - mellanox: Add support for SN2201, SN4280, SN5610, and SN5640
 
  - mellanox: mlxbf-pmc: Support additional PMC blocks
 
  - oxpec:
 
    - Add OneXFly variants
 
    - Add support for charge limit, charge thresholds, and turbo LED
 
    - Distinguish current X1 variants to avoid unwanted matching to new
      variants
 
    - Follow hwmon conventions
 
    - Move from hwmon/oxp-sensors to platform/x86 to match the enlarged
      scope
 
  - power: supply:
 
    - Add inhibit-charge-awake (needed by oxpec)
 
    - Add additional battery health status values ("blown fuse" and "cell
      imbalance") (needed by dell-ddv)
 
  - powerwell-ec: Add driver for Portwell EC supporting GPIO and watchdog
 
  - thinkpad-acpi: Support camera shutter switch hotkey
 
  - tuxedo: Add virtual LampArray for TUXEDO NB04 devices
 
  - tools/power/x86/intel-speed-select:
 
    - Support displaying SST-PP revision 2 fields
 
    - Skip uncore frequency update on newer generations of CPUs
 
  - Miscellaneous cleanups / refactoring / improvements
 
 The following is an automated shortlog grouped by driver:
 
 ABI: testing: sysfs-class-oxp:
  -  add missing documentation
  -  add tt_led attribute documentation
 
 Add AMD ISP platform config for OV05C10:
  - Add AMD ISP platform config for OV05C10
 
 alienware-wmi-wmax:
  -  Add a DebugFS interface
  -  Add HWMON support
  -  Add support for manual fan control
  -  Add support for the "custom" thermal profile
  -  Expose GPIO debug methods
  -  Fix awcc_hwmon_fans_init() label logic
  -  Fix uninitialized bitmap in awcc_hwmon_fans_init()
  -  Improve ID processing
  -  Improve internal AWCC API
  -  Improve platform profile probe
  -  Modify supported_thermal_profiles[]
  -  Rename thermal related symbols
 
 amd/hsmp: acpi:
  -  Add sysfs files to display HSMP telemetry
 
 amd/hsmp:
  -  fix building with CONFIG_HWMON=m
  -  Report power via hwmon sensors
  -  Use a single DRIVER_VERSION for all hsmp modules
 
 arm64: huawei-gaokun-ec:
  -  Remove unneeded semicolon
 
 asus-wmi:
  -  fix build without CONFIG_SUSPEND
  -  Refactor Ally suspend/resume
 
 Avoid -Wflex-array-member-not-at-end warning:
  - Avoid -Wflex-array-member-not-at-end warning
 
 barco-p50:
  -  use new GPIO line value setter callbacks
 
 dell-ddv:
  -  Expose the battery health to userspace
  -  Expose the battery manufacture date to userspace
  -  Implement the battery matching algorithm
 
 dell-pc:
  -  Propagate errors when detecting feature support
  -  Transition to faux device
  -  Use non-atomic bitmap operations
 
 docs: ABI:
  -  Fix "aassociated" to "associated"
 
 Documentation/ABI:
  -  Add new attribute for mlxreg-io sysfs interfaces
 
 Documentation: ABI:
  -  Add sysfs platform and debugfs ABI documentation for alienware-wmi
 
 Documentation: admin-guide: laptops:
  -  Add documentation for alienware-wmi
 
 Documentation: admin-guide: pm:
  -  Add documentation for agent_types
  -  Add documentation for die_id
 
 Documentation: wmi: alienware-wmi:
  -  Add GPIO control documentation
 
 Documentation: wmi:
  -  Improve and update alienware-wmi documentation
 
 Do not enable by default during compile testing:
  - Do not enable by default during compile testing
 
 hid-asus:
  -  check ROG Ally MCU version and warn
 
 hwmon:
  -  (oxp-sensors) Add all OneXFly variants
  -  (oxp-sensors) Distinguish the X1 variants
 
 int0002:
  -  use new GPIO line value setter callbacks
 
 int3472:
  -  Add handshake pin support
  -  Add skl_int3472_register_clock() helper
  -  Avoid GPIO regulator spikes
  -  Debug log when remapping pins
  -  Drop unused gpio field from struct int3472_gpio_regulator
  -  Export int3472_discrete_parse_crs()
  -  For mt9m114 sensors map powerdown to powerenable
  -  Make regulator supply name configurable
  -  Move common.h to public includes, symbols to INTEL_INT3472
  -  Prepare for registering more than 1 GPIO regulator
  -  Remove unused sensor_config struct member
  -  Rework AVDD second sensor quirk handling
  -  Stop setting a supply-name for GPIO regulators
  -  Stop using devm_gpiod_get()
 
 intel/pmc:
  -  Convert index variables to be unsigned
  -  Create Intel PMC SSRAM Telemetry driver
  -  Improve pmc_core_get_lpm_req()
  -  Move error handling to init function
  -  Move PMC Core related functions
  -  Move PMC devid to core.h
  -  Remove unneeded header file inclusion
  -  Remove unneeded io operations
  -  Rename core_ssram to ssram_telemetry
  -  Use devm for mutex_init
 
 intel: power-domains:
  -  Add interface to get Linux die ID
 
 intel-uncore-freq:
  -  Add attributes to show agent types
  -  Add attributes to show die_id
 
 intel/vsec:
  -  Change return type of intel_vsec_register
 
 Introduce dasharo-acpi platform driver:
  - Introduce dasharo-acpi platform driver
 
 ISST:
  -  Do Not Restore SST MSRs on CPU Online Operation
  -  Support SST-PP revision 2
  -  Support SST-TF revision 2
  -  Update minor version
 
 mellanox:
  -  Cosmetic changes to improve code style
  -  Introduce support of Nvidia smart switch
  -  Rename field to improve code readability
 
 mlxbf-pmc:
  -  Support additional PMC blocks
 
 mlx-platform:
  -  Add support for new Nvidia system
 
 mlxreg-dpu:
  -  Add initial support for Nvidia DPU
  -  Fix smatch warnings
 
 nvsw-sn2200:
  -  Add support for new system flavour
  -  Fix .items in nvsw_sn2201_busbar_hotplug
 
 oxpec:
  -  Add a lower bounds check in oxp_psy_ext_set_prop()
  -  Add charge threshold and behaviour to OneXPlayer
  -  Add support for the OneXPlayer G1
  -  Add turbo led support to X1 devices
  -  Adhere to sysfs-class-hwmon and enable pwm on 2
  -  Convert defines to using tabs
  -  Follow reverse xmas convention for tt_toggle
  -  Make turbo val apply a bitmask
  -  Move fan speed read to separate function
  -  Move hwmon/oxp-sensors to platform/x86
  -  Move pwm_enable read to its own function
  -  Move pwm value read/write to separate functions
  -  Rename ec group to tt_toggle
  -  Rename rval to ret in tt_toggle
 
 portwell-ec:
  -  Add GPIO and WDT driver for Portwell EC
 
 power: supply:
  -  add inhibit-charge-awake to charge_behaviour
 
 power: supply: core:
  -  Add additional health status values
 
 silicom:
  -  use new GPIO line value setter callbacks
 
 sony-laptop:
  -  Remove unused sony laptop camera code
 
 thermal/drivers/acerhdf:
  -  Constify struct thermal_zone_device_ops
 
 thinkpad-acpi:
  -  Add support for new hotkey for camera shutter switch
 
 tools/power/x86/intel-speed-select:
  -  Skip uncore frequency update
  -  Support SST PP revision 2 fields
  -  v1.23 release
 
 tuxedo:
  -  Add virtual LampArray for TUXEDO NB04 devices
  -  Prevent invalid Kconfig state
 
 Use strscpy()/scnprintf() with acpi_device_name/class():
  - Use strscpy()/scnprintf() with acpi_device_name/class()
 
 Merges:
  -  Merge branch 'fixes' into for-next
  -  Merge branch 'intel-sst' of https://github.com/spandruvada/linux-kernel into for-next
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSCSUwRdwTNL2MhaBlZrE9hU+XOMQUCaDWJ7wAKCRBZrE9hU+XO
 MT8JAQDWW6qBoXuqpd6Yx1oOyROc6gJMQAsS9sNc7I60mGooEAEAnTLhOHDGkKb5
 av1fz/SmXGl7joeRYkZV9FRzJ/26AAk=
 =ytxa
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform drivers updates from Ilpo Järvinen:
 "The changes are mostly business as usual. Besides pdx86 changes, there
  are a few power supply changes needed for related pdx86 features, move
  of oxpec driver from hwmon (oxp-sensors) to pdx86, and one FW version
  warning to hid-asus.

  Highlights:

   - alienware-wmi-wmax:
       - Add HWMON support
       - Add ABI and admin-guide documentation
       - Expose GPIO debug methods through debug FS
       - Support manual fan control and "custom" thermal profile

   - amd/hsmp:
       - Add sysfs files to show HSMP telemetry
       - Report power readings and limits via hwmon

   - amd/isp4: Add AMD ISP platform config for OV05C10

   - asus-wmi:
       - Refactor Ally suspend/resume to work better with older FW
       - hid-asus: check ROG Ally MCU version and warn about old FW versions

   - dasharo-acpi:
       - Add driver for Dasharo devices supporting fans and temperatures
         monitoring

   - dell-ddv:
       - Expose the battery health and manufacture date to userspace
         using power supply extensions
       - Implement the battery matching algorithm

   - dell-pc:
       - Improve error propagation
       - Use faux device

   - int3472:
       - Add delays to avoid GPIO regulator spikes
       - Add handshake pin support
       - Make regulator supply name configurable and allow registering
         more than 1 GPIO regulator
       - Map mt9m114 powerdown pin to powerenable

   - intel/pmc: Add separate SSRAM Telemetry driver

   - intel-uncore-freq: Add attributes to show agent types and die ID

   - ISST:
       - Support SST-TF revision 2 (allows more cores per bucket)
       - Support SST-PP revision 2 (fabric 1 frequencies)
       - Remove unnecessary SST MSRs restore (the package retains MSRs
         despite CPU offlining)

   - mellanox: Add support for SN2201, SN4280, SN5610, and SN5640

   - mellanox: mlxbf-pmc: Support additional PMC blocks

   - oxpec:
       - Add OneXFly variants
       - Add support for charge limit, charge thresholds, and turbo LED
       - Distinguish current X1 variants to avoid unwanted matching to
         new variants
       - Follow hwmon conventions
       - Move from hwmon/oxp-sensors to platform/x86 to match the
         enlarged scope

   - power supply:
       - Add inhibit-charge-awake (needed by oxpec)
       - Add additional battery health status values ("blown fuse" and
         "cell imbalance") (needed by dell-ddv)

   - powerwell-ec: Add driver for Portwell EC supporting GPIO and watchdog

   - thinkpad-acpi: Support camera shutter switch hotkey

   - tuxedo: Add virtual LampArray for TUXEDO NB04 devices

   - tools/power/x86/intel-speed-select:
       - Support displaying SST-PP revision 2 fields
       - Skip uncore frequency update on newer generations of CPUs

   - Miscellaneous cleanups / refactoring / improvements"

* tag 'platform-drivers-x86-v6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (112 commits)
  thermal/drivers/acerhdf: Constify struct thermal_zone_device_ops
  platform/x86/amd/hsmp: fix building with CONFIG_HWMON=m
  platform/x86: asus-wmi: fix build without CONFIG_SUSPEND
  docs: ABI: Fix "aassociated" to "associated"
  platform/x86: Add AMD ISP platform config for OV05C10
  Documentation: admin-guide: pm: Add documentation for die_id
  platform/x86/intel-uncore-freq: Add attributes to show die_id
  platform/x86/intel: power-domains: Add interface to get Linux die ID
  Documentation: admin-guide: pm: Add documentation for agent_types
  platform/x86/intel-uncore-freq: Add attributes to show agent types
  platform/x86/tuxedo: Prevent invalid Kconfig state
  platform/x86: dell-ddv: Expose the battery health to userspace
  platform/x86: dell-ddv: Expose the battery manufacture date to userspace
  platform/x86: dell-ddv: Implement the battery matching algorithm
  power: supply: core: Add additional health status values
  platform/x86/amd/hsmp: acpi: Add sysfs files to display HSMP telemetry
  platform/x86/amd/hsmp: Report power via hwmon sensors
  platform/x86/amd/hsmp: Use a single DRIVER_VERSION for all hsmp modules
  platform/mellanox: mlxreg-dpu: Fix smatch warnings
  platform: mellanox: nvsw-sn2200: Fix .items in nvsw_sn2201_busbar_hotplug
  ...
2025-05-29 10:19:22 -07:00
Linus Torvalds
43db111107 ARM:
* Add large stage-2 mapping (THP) support for non-protected guests when
   pKVM is enabled, clawing back some performance.
 
 * Enable nested virtualisation support on systems that support it,
   though it is disabled by default.
 
 * Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
   protected modes.
 
 * Large rework of the way KVM tracks architecture features and links
   them with the effects of control bits. While this has no functional
   impact, it ensures correctness of emulation (the data is automatically
   extracted from the published JSON files), and helps dealing with the
   evolution of the architecture.
 
 * Significant changes to the way pKVM tracks ownership of pages,
   avoiding page table walks by storing the state in the hypervisor's
   vmemmap. This in turn enables the THP support described above.
 
 * New selftest checking the pKVM ownership transition rules
 
 * Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
   even if the host didn't have it.
 
 * Fixes for the address translation emulation, which happened to be
   rather buggy in some specific contexts.
 
 * Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
   from the number of counters exposed to a guest and addressing a
   number of issues in the process.
 
 * Add a new selftest for the SVE host state being corrupted by a
   guest.
 
 * Keep HCR_EL2.xMO set at all times for systems running with the
   kernel at EL2, ensuring that the window for interrupts is slightly
   bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
 
 * Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
   from a pretty bad case of TLB corruption unless accesses to HCR_EL2
   are heavily synchronised.
 
 * Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
   tables in a human-friendly fashion.
 
 * and the usual random cleanups.
 
 LoongArch:
 
 * Don't flush tlb if the host supports hardware page table walks.
 
 * Add KVM selftests support.
 
 RISC-V:
 
 * Add vector registers to get-reg-list selftest
 
 * VCPU reset related improvements
 
 * Remove scounteren initialization from VCPU reset
 
 * Support VCPU reset from userspace using set_mpstate() ioctl
 
 x86:
 
 * Initial support for TDX in KVM.  This finally makes it possible to use the
   TDX module to run confidential guests on Intel processors.  This is quite a
   large series, including support for private page tables (managed by the
   TDX module and mirrored in KVM for efficiency), forwarding some TDVMCALLs
   to userspace, and handling several special VM exits from the TDX module.
 
   This has been in the works for literally years and it's not really possible
   to describe everything here, so I'll defer to the various merge commits
   up to and including commit 7bcf7246c4 ("Merge branch 'kvm-tdx-finish-initial'
   into HEAD").
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmg02hwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNnkwf/db4xeWKSMseCIvBVR+ObDn3LXhwT
 hAgmTkDkP1zq9RfbfJSbUA1DXRwfP+f1sWySLMWECkFEQW9fGIJF9fOQRDSXKmhX
 158U3+FEt+3jxLRCGFd4zyXAqyY3C8JSkPUyJZxCpUbXtB5tdDNac4rZAXKDULwe
 sUi0OW/kFDM2yt369pBGQAGdN+75/oOrYISGOSvMXHxjccNqvveX8MUhpBjYIuuj
 73iBWmsfv3vCtam56Racz3C3v44ie498PmWFtnB0R+CVfWfrnUAaRiGWx+egLiBW
 dBPDiZywMn++prmphEUFgaStDTQy23JBLJ8+RvHkp+o5GaTISKJB3nedZQ==
 =adZU
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "As far as x86 goes this pull request "only" includes TDX host support.

  Quotes are appropriate because (at 6k lines and 100+ commits) it is
  much bigger than the rest, which will come later this week and
  consists mostly of bugfixes and selftests. s390 changes will also come
  in the second batch.

  ARM:

   - Add large stage-2 mapping (THP) support for non-protected guests
     when pKVM is enabled, clawing back some performance.

   - Enable nested virtualisation support on systems that support it,
     though it is disabled by default.

   - Add UBSAN support to the standalone EL2 object used in nVHE/hVHE
     and protected modes.

   - Large rework of the way KVM tracks architecture features and links
     them with the effects of control bits. While this has no functional
     impact, it ensures correctness of emulation (the data is
     automatically extracted from the published JSON files), and helps
     dealing with the evolution of the architecture.

   - Significant changes to the way pKVM tracks ownership of pages,
     avoiding page table walks by storing the state in the hypervisor's
     vmemmap. This in turn enables the THP support described above.

   - New selftest checking the pKVM ownership transition rules

   - Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
     even if the host didn't have it.

   - Fixes for the address translation emulation, which happened to be
     rather buggy in some specific contexts.

   - Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
     from the number of counters exposed to a guest and addressing a
     number of issues in the process.

   - Add a new selftest for the SVE host state being corrupted by a
     guest.

   - Keep HCR_EL2.xMO set at all times for systems running with the
     kernel at EL2, ensuring that the window for interrupts is slightly
     bigger, and avoiding a pretty bad erratum on the AmpereOne HW.

   - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
     from a pretty bad case of TLB corruption unless accesses to HCR_EL2
     are heavily synchronised.

   - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
     tables in a human-friendly fashion.

   - and the usual random cleanups.

  LoongArch:

   - Don't flush tlb if the host supports hardware page table walks.

   - Add KVM selftests support.

  RISC-V:

   - Add vector registers to get-reg-list selftest

   - VCPU reset related improvements

   - Remove scounteren initialization from VCPU reset

   - Support VCPU reset from userspace using set_mpstate() ioctl

  x86:

   - Initial support for TDX in KVM.

     This finally makes it possible to use the TDX module to run
     confidential guests on Intel processors. This is quite a large
     series, including support for private page tables (managed by the
     TDX module and mirrored in KVM for efficiency), forwarding some
     TDVMCALLs to userspace, and handling several special VM exits from
     the TDX module.

     This has been in the works for literally years and it's not really
     possible to describe everything here, so I'll defer to the various
     merge commits up to and including commit 7bcf7246c4 ('Merge
     branch 'kvm-tdx-finish-initial' into HEAD')"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (248 commits)
  x86/tdx: mark tdh_vp_enter() as __flatten
  Documentation: virt/kvm: remove unreferenced footnote
  RISC-V: KVM: lock the correct mp_state during reset
  KVM: arm64: Fix documentation for vgic_its_iter_next()
  KVM: arm64: np-guest CMOs with PMD_SIZE fixmap
  KVM: arm64: Stage-2 huge mappings for np-guests
  KVM: arm64: Add a range to pkvm_mappings
  KVM: arm64: Convert pkvm_mappings to interval tree
  KVM: arm64: Add a range to __pkvm_host_test_clear_young_guest()
  KVM: arm64: Add a range to __pkvm_host_wrprotect_guest()
  KVM: arm64: Add a range to __pkvm_host_unshare_guest()
  KVM: arm64: Add a range to __pkvm_host_share_guest()
  KVM: arm64: Introduce for_each_hyp_page
  KVM: arm64: Handle huge mappings for np-guest CMOs
  KVM: arm64: nv: Release faulted-in VNCR page from mmu_lock critical section
  KVM: arm64: nv: Handle TLBI S1E2 for VNCR invalidation with mmu_lock held
  KVM: arm64: nv: Hold mmu_lock when invalidating VNCR SW-TLB before translating
  RISC-V: KVM: add KVM_CAP_RISCV_MP_STATE_RESET
  RISC-V: KVM: Remove scounteren initialization
  KVM: RISC-V: remove unnecessary SBI reset state
  ...
2025-05-29 08:10:01 -07:00
Linus Torvalds
90b83efa67 bpf-next-6.16
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmg3NqgACgkQ6rmadz2v
 bTpNUQ/8DPeYtn3nskpsP2OwFy6O3hhfCe6gjOAmUVSk000xbG+AcI/h1DnGZWgk
 xlVcEs93ekzUzHd7k1+RJ2c5yDLXieLJAtb66rbFU1enkxs2cWlcWSKE6K/gaoh3
 G1BCARVlKwtrJhrVrsXtYP/eGZxKRSUZFK7xhtCk7lp7sRI3xkTLE+FJBcDkTJ6W
 HwF14i3zO+BkqNGdFwwlASCCqRItSNBBiM3KjW1DbETOTfAKlvCTrcgdUiODqxhF
 PNnULW+xmICABDFlKfDMlUAGNlSHKjiI3+g31LdblA5eyEhIqiCRgBGFYoCnsluk
 qUauRSie61KqC7fxN3qVpC3bXJfD1td7uIvoqSkDLtTv8a5+HAoiohzi1qBzCayl
 LAGkBYewAfDtdDDjNY38JLH2RCdyY6zG9DhqghPHdPlM7zj7L5zZgj34igEwesMM
 mfj9TuFFF99yfX5UUeSxKpDGR1eO4Ew0p7tg8CRs8Fqh6AIQSmboREZrsncVRCTS
 4SDHSI4KcO4LO2pEKzy+X4dewganN7aESnQG34iG0liyvDDwJOgUnDWLRwPLas7k
 3b/zIfBLxOJpA5R+0hhAMtjMA4NgyKJf4yFZwEieuasQjvzwTApi24YhZ/b3HSEB
 2Dp8kHEEbwezv0OFFz/fJ88dNQnrDmtJ+QByN/liA8kj4Yuh2+Q=
 =j3t8
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Fix and improve BTF deduplication of identical BTF types (Alan
   Maguire and Andrii Nakryiko)

 - Support up to 12 arguments in BPF trampoline on arm64 (Xu Kuohai and
   Alexis Lothoré)

 - Support load-acquire and store-release instructions in BPF JIT on
   riscv64 (Andrea Parri)

 - Fix uninitialized values in BPF_{CORE,PROBE}_READ macros (Anton
   Protopopov)

 - Streamline allowed helpers across program types (Feng Yang)

 - Support atomic update for hashtab of BPF maps (Hou Tao)

 - Implement json output for BPF helpers (Ihor Solodrai)

 - Several s390 JIT fixes (Ilya Leoshkevich)

 - Various sockmap fixes (Jiayuan Chen)

 - Support mmap of vmlinux BTF data (Lorenz Bauer)

 - Support BPF rbtree traversal and list peeking (Martin KaFai Lau)

 - Tests for sockmap/sockhash redirection (Michal Luczaj)

 - Introduce kfuncs for memory reads into dynptrs (Mykyta Yatsenko)

 - Add support for dma-buf iterators in BPF (T.J. Mercier)

 - The verifier support for __bpf_trap() (Yonghong Song)

* tag 'bpf-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (135 commits)
  bpf, arm64: Remove unused-but-set function and variable.
  selftests/bpf: Add tests with stack ptr register in conditional jmp
  bpf: Do not include stack ptr register in precision backtracking bookkeeping
  selftests/bpf: enable many-args tests for arm64
  bpf, arm64: Support up to 12 function arguments
  bpf: Check rcu_read_lock_trace_held() in bpf_map_lookup_percpu_elem()
  bpf: Avoid __bpf_prog_ret0_warn when jit fails
  bpftool: Add support for custom BTF path in prog load/loadall
  selftests/bpf: Add unit tests with __bpf_trap() kfunc
  bpf: Warn with __bpf_trap() kfunc maybe due to uninitialized variable
  bpf: Remove special_kfunc_set from verifier
  selftests/bpf: Add test for open coded dmabuf_iter
  selftests/bpf: Add test for dmabuf_iter
  bpf: Add open coded dmabuf iterator
  bpf: Add dmabuf iterator
  dma-buf: Rename debugfs symbols
  bpf: Fix error return value in bpf_copy_from_user_dynptr
  libbpf: Use mmap to parse vmlinux BTF from sysfs
  selftests: bpf: Add a test for mmapable vmlinux BTF
  btf: Allow mmap of vmlinux btf
  ...
2025-05-28 15:52:42 -07:00
Linus Torvalds
1b98f357da Networking changes for 6.16.
Core
 ----
 
  - Implement the Device Memory TCP transmit path, allowing zero-copy
    data transmission on top of TCP from e.g. GPU memory to the wire.
 
  - Move all the IPv6 routing tables management outside the RTNL scope,
    under its own lock and RCU. The route control path is now 3x times
    faster.
 
  - Convert queue related netlink ops to instance lock, reducing
    again the scope of the RTNL lock. This improves the control plane
    scalability.
 
  - Refactor the software crc32c implementation, removing unneeded
    abstraction layers and improving significantly the related
    micro-benchmarks.
 
  - Optimize the GRO engine for UDP-tunneled traffic, for a 10%
    performance improvement in related stream tests.
 
  - Cover more per-CPU storage with local nested BH locking; this is a
    prep work to remove the current per-CPU lock in local_bh_disable()
    on PREMPT_RT.
 
  - Introduce and use nlmsg_payload helper, combining buffer bounds
    verification with accessing payload carried by netlink messages.
 
 Netfilter
 ---------
 
  - Rewrite the procfs conntrack table implementation, improving
    considerably the dump performance. A lot of user-space tools
    still use this interface.
 
  - Implement support for wildcard netdevice in netdev basechain
    and flowtables.
 
  - Integrate conntrack information into nft trace infrastructure.
 
  - Export set count and backend name to userspace, for better
    introspection.
 
 BPF
 ---
 
  - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
    programs and can be controlled in similar way to traditional qdiscs
    using the "tc qdisc" command.
 
  - Refactor the UDP socket iterator, addressing long standing issues
    WRT duplicate hits or missed sockets.
 
 Protocols
 ---------
 
  - Improve TCP receive buffer auto-tuning and increase the default
    upper bound for the receive buffer; overall this improves the single
    flow maximum thoughput on 200Gbs link by over 60%.
 
  - Add AFS GSSAPI security class to AF_RXRPC; it provides transport
    security for connections to the AFS fileserver and VL server.
 
  - Improve TCP multipath routing, so that the sources address always
    matches the nexthop device.
 
  - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
    and thus preventing DoS caused by passing around problematic FDs.
 
  - Retire DCCP socket. DCCP only receives updates for bugs, and major
    distros disable it by default. Its removal allows for better
    organisation of TCP fields to reduce the number of cache lines hit
    in the fast path.
 
  - Extend TCP drop-reason support to cover PAWS checks.
 
 Driver API
 ----------
 
  - Reorganize PTP ioctl flag support to require an explicit opt-in for
    the drivers, avoiding the problem of drivers not rejecting new
    unsupported flags.
 
  - Converted several device drivers to timestamping APIs.
 
  - Introduce per-PHY ethtool dump helpers, improving the support for
    dump operations targeting PHYs.
 
 Tests and tooling
 -----------------
 
  - Add support for classic netlink in user space C codegen, so that
    ynl-c can now read, create and modify links, routes addresses and
    qdisc layer configuration.
 
  - Add ynl sub-types for binary attributes, allowing ynl-c to output
    known struct instead of raw binary data, clarifying the classic
    netlink output.
 
  - Extend MPTCP selftests to improve the code-coverage.
 
  - Add tests for XDP tail adjustment in AF_XDP.
 
 New hardware / drivers
 ----------------------
 
  - OpenVPN virtual driver: offload OpenVPN data channels processing
    to the kernel-space, increasing the data transfer throughput WRT
    the user-space implementation.
 
  - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
 
  - Broadcom asp-v3.0 ethernet driver.
 
  - AMD Renoir ethernet device.
 
  - ReakTek MT9888 2.5G ethernet PHY driver.
 
  - Aeonsemi 10G C45 PHYs driver.
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - nVidia/Mellanox (mlx5):
      - refactor the stearing table handling to reduce significantly
        the amount of memory used
      - add support for complex matches in H/W flow steering
      - improve flow streeing error handling
      - convert to netdev instance locking
    - Intel (100G, ice, igb, ixgbe, idpf):
      - ice: add switchdev support for LLDP traffic over VF
      - ixgbe: add firmware manipulation and regions devlink support
      - igb: introduce support for frame transmission premption
      - igb: adds persistent NAPI configuration
      - idpf: introduce RDMA support
      - idpf: add initial PTP support
    - Meta (fbnic):
      - extend hardware stats coverage
      - add devlink dev flash support
    - Broadcom (bnxt):
      - add support for RX-side device memory TCP
    - Wangxun (txgbe):
      - implement support for udp tunnel offload
      - complete PTP and SRIOV support for AML 25G/10G devices
 
  - Ethernet NICs embedded and virtual:
    - Google (gve):
      - add device memory TCP TX support
    - Amazon (ena):
      - support persistent per-NAPI config
    - Airoha:
      - add H/W support for L2 traffic offload
      - add per flow stats for flow offloading
    - RealTek (rtl8211): add support for WoL magic packet
    - Synopsys (stmmac):
      - dwmac-socfpga 1000BaseX support
      - add Loongson-2K3000 support
      - introduce support for hardware-accelerated VLAN stripping
    - Broadcom (bcmgenet):
      - expose more H/W stats
    - Freescale (enetc, dpaa2-eth):
      - enetc: add MAC filter, VLAN filter RSS and loopback support
      - dpaa2-eth: convert to H/W timestamping APIs
    - vxlan: convert FDB table to rhashtable, for better scalabilty
    - veth: apply qdisc backpressure on full ring to reduce TX drops
 
  - Ethernet switches:
    - Microchip (kzZ88x3): add ETS scheduler support
 
  - Ethernet PHYs:
    - RealTek (rtl8211):
      - add support for WoL magic packet
      - add support for PHY LEDs
 
  - CAN:
    - Adds RZ/G3E CANFD support to the rcar_canfd driver.
    - Preparatory work for CAN-XL support.
    - Add self-tests framework with support for CAN physical interfaces.
 
  - WiFi:
    - mac80211:
      - scan improvements with multi-link operation (MLO)
    - Qualcomm (ath12k):
      - enable AHB support for IPQ5332
      - add monitor interface support to QCN9274
      - add multi-link operation support to WCN7850
      - add 802.11d scan offload support to WCN7850
      - monitor mode for WCN7850, better 6 GHz regulatory
    - Qualcomm (ath11k):
      - restore hibernation support
    - MediaTek (mt76):
      - WiFi-7 improvements
      - implement support for mt7990
    - Intel (iwlwifi):
      - enhanced multi-link single-radio (EMLSR) support on 5 GHz links
      - rework device configuration
    - RealTek (rtw88):
      - improve throughput for RTL8814AU
    - RealTek (rtw89):
      - add multi-link operation support
      - STA/P2P concurrency improvements
      - support different SAR configs by antenna
 
  - Bluetooth:
    - introduce HCI Driver protocol
    - btintel_pcie: do not generate coredump for diagnostic events
    - btusb: add HCI Drv commands for configuring altsetting
    - btusb: add RTL8851BE device 0x0bda:0xb850
    - btusb: add new VID/PID 13d3/3584 for MT7922
    - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
    - btnxpuart: implement host-wakeup feature
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmg3D64SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkcIsQAK2eEc+BxQer975wzvtMg6gF9eoex4a+
 rZ7jxfDzDtNvTauoQsrpehDZp0FnySaVGCU36lHGB2OvDnhCpPc5hXzKDWQpOuqQ
 SHrGG3/6FTbdTG/HfHUcbNyrUzIf53SADSObiQ3qg4gyEQ3sCpcOKtVtMcU8rvsY
 /HqMnsJWFaROUMjMtCcnUSgjmeY9kBvha3sTXUqgeRugEOCvZD7z4rpqFIcQqHw7
 e2Fi8dwIXEYNxqPp6MRq2qdyUTewCRruE8ZIMAFuhtfYeMElUZMPlqlMENX3AzTQ
 cr0EgwcFOUxRA7oZRxhoBNBsVXavtSpQr4ZDoWplxP4aQ37n5tc1E9Q72axpB/Og
 FbJRl6GvWYnCd8071BczgmfHlKaTAigPvt2Z4r6JjM5I/Bij/IZ3k+On1OTuOAj/
 EqfFkdZ0a5cfKrwUMP+oSGtSAywkMVUtnIKJlZeRbjSj2432sCfe2jVAlS8ELM43
 3LUgXYrAKtA87g171LlsRu5EEpI5QmqPb+i5LpPlEXe2TJEgPisyfecJ3NafF/2+
 j575lm+TFNm9NTNhGGjDPEvw0djI5wSGGMe9J4gC74eWi6s5t6C4cuUf84TKWdwR
 x+9H0IB7rfFncAwXHJuUUtzd+fPHaYzs5dDGbSgMQOXr1cr1wlubCK8mQ1r/Wt/a
 3GjFIOQKW2Q5
 =t/Tz
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core:

   - Implement the Device Memory TCP transmit path, allowing zero-copy
     data transmission on top of TCP from e.g. GPU memory to the wire.

   - Move all the IPv6 routing tables management outside the RTNL scope,
     under its own lock and RCU. The route control path is now 3x times
     faster.

   - Convert queue related netlink ops to instance lock, reducing again
     the scope of the RTNL lock. This improves the control plane
     scalability.

   - Refactor the software crc32c implementation, removing unneeded
     abstraction layers and improving significantly the related
     micro-benchmarks.

   - Optimize the GRO engine for UDP-tunneled traffic, for a 10%
     performance improvement in related stream tests.

   - Cover more per-CPU storage with local nested BH locking; this is a
     prep work to remove the current per-CPU lock in local_bh_disable()
     on PREMPT_RT.

   - Introduce and use nlmsg_payload helper, combining buffer bounds
     verification with accessing payload carried by netlink messages.

  Netfilter:

   - Rewrite the procfs conntrack table implementation, improving
     considerably the dump performance. A lot of user-space tools still
     use this interface.

   - Implement support for wildcard netdevice in netdev basechain and
     flowtables.

   - Integrate conntrack information into nft trace infrastructure.

   - Export set count and backend name to userspace, for better
     introspection.

  BPF:

   - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
     programs and can be controlled in similar way to traditional qdiscs
     using the "tc qdisc" command.

   - Refactor the UDP socket iterator, addressing long standing issues
     WRT duplicate hits or missed sockets.

  Protocols:

   - Improve TCP receive buffer auto-tuning and increase the default
     upper bound for the receive buffer; overall this improves the
     single flow maximum thoughput on 200Gbs link by over 60%.

   - Add AFS GSSAPI security class to AF_RXRPC; it provides transport
     security for connections to the AFS fileserver and VL server.

   - Improve TCP multipath routing, so that the sources address always
     matches the nexthop device.

   - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
     and thus preventing DoS caused by passing around problematic FDs.

   - Retire DCCP socket. DCCP only receives updates for bugs, and major
     distros disable it by default. Its removal allows for better
     organisation of TCP fields to reduce the number of cache lines hit
     in the fast path.

   - Extend TCP drop-reason support to cover PAWS checks.

  Driver API:

   - Reorganize PTP ioctl flag support to require an explicit opt-in for
     the drivers, avoiding the problem of drivers not rejecting new
     unsupported flags.

   - Converted several device drivers to timestamping APIs.

   - Introduce per-PHY ethtool dump helpers, improving the support for
     dump operations targeting PHYs.

  Tests and tooling:

   - Add support for classic netlink in user space C codegen, so that
     ynl-c can now read, create and modify links, routes addresses and
     qdisc layer configuration.

   - Add ynl sub-types for binary attributes, allowing ynl-c to output
     known struct instead of raw binary data, clarifying the classic
     netlink output.

   - Extend MPTCP selftests to improve the code-coverage.

   - Add tests for XDP tail adjustment in AF_XDP.

  New hardware / drivers:

   - OpenVPN virtual driver: offload OpenVPN data channels processing to
     the kernel-space, increasing the data transfer throughput WRT the
     user-space implementation.

   - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.

   - Broadcom asp-v3.0 ethernet driver.

   - AMD Renoir ethernet device.

   - ReakTek MT9888 2.5G ethernet PHY driver.

   - Aeonsemi 10G C45 PHYs driver.

  Drivers:

   - Ethernet high-speed NICs:
       - nVidia/Mellanox (mlx5):
           - refactor the steering table handling to significantly
             reduce the amount of memory used
           - add support for complex matches in H/W flow steering
           - improve flow streeing error handling
           - convert to netdev instance locking
       - Intel (100G, ice, igb, ixgbe, idpf):
           - ice: add switchdev support for LLDP traffic over VF
           - ixgbe: add firmware manipulation and regions devlink support
           - igb: introduce support for frame transmission premption
           - igb: adds persistent NAPI configuration
           - idpf: introduce RDMA support
           - idpf: add initial PTP support
       - Meta (fbnic):
           - extend hardware stats coverage
           - add devlink dev flash support
       - Broadcom (bnxt):
           - add support for RX-side device memory TCP
       - Wangxun (txgbe):
           - implement support for udp tunnel offload
           - complete PTP and SRIOV support for AML 25G/10G devices

   - Ethernet NICs embedded and virtual:
       - Google (gve):
           - add device memory TCP TX support
       - Amazon (ena):
           - support persistent per-NAPI config
       - Airoha:
           - add H/W support for L2 traffic offload
           - add per flow stats for flow offloading
       - RealTek (rtl8211): add support for WoL magic packet
       - Synopsys (stmmac):
           - dwmac-socfpga 1000BaseX support
           - add Loongson-2K3000 support
           - introduce support for hardware-accelerated VLAN stripping
       - Broadcom (bcmgenet):
           - expose more H/W stats
       - Freescale (enetc, dpaa2-eth):
           - enetc: add MAC filter, VLAN filter RSS and loopback support
           - dpaa2-eth: convert to H/W timestamping APIs
       - vxlan: convert FDB table to rhashtable, for better scalabilty
       - veth: apply qdisc backpressure on full ring to reduce TX drops

   - Ethernet switches:
       - Microchip (kzZ88x3): add ETS scheduler support

   - Ethernet PHYs:
       - RealTek (rtl8211):
           - add support for WoL magic packet
           - add support for PHY LEDs

   - CAN:
       - Adds RZ/G3E CANFD support to the rcar_canfd driver.
       - Preparatory work for CAN-XL support.
       - Add self-tests framework with support for CAN physical interfaces.

   - WiFi:
       - mac80211:
           - scan improvements with multi-link operation (MLO)
       - Qualcomm (ath12k):
           - enable AHB support for IPQ5332
           - add monitor interface support to QCN9274
           - add multi-link operation support to WCN7850
           - add 802.11d scan offload support to WCN7850
           - monitor mode for WCN7850, better 6 GHz regulatory
       - Qualcomm (ath11k):
           - restore hibernation support
       - MediaTek (mt76):
           - WiFi-7 improvements
           - implement support for mt7990
       - Intel (iwlwifi):
           - enhanced multi-link single-radio (EMLSR) support on 5 GHz links
           - rework device configuration
       - RealTek (rtw88):
           - improve throughput for RTL8814AU
       - RealTek (rtw89):
           - add multi-link operation support
           - STA/P2P concurrency improvements
           - support different SAR configs by antenna

   - Bluetooth:
       - introduce HCI Driver protocol
       - btintel_pcie: do not generate coredump for diagnostic events
       - btusb: add HCI Drv commands for configuring altsetting
       - btusb: add RTL8851BE device 0x0bda:0xb850
       - btusb: add new VID/PID 13d3/3584 for MT7922
       - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
       - btnxpuart: implement host-wakeup feature"

* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits)
  selftests/bpf: Fix bpf selftest build warning
  selftests: netfilter: Fix skip of wildcard interface test
  net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames
  net: openvswitch: Fix the dead loop of MPLS parse
  calipso: Don't call calipso functions for AF_INET sk.
  selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
  net_sched: hfsc: Address reentrant enqueue adding class to eltree twice
  octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback
  octeontx2-pf: QOS: Perform cache sync on send queue teardown
  net: mana: Add support for Multi Vports on Bare metal
  net: devmem: ncdevmem: remove unused variable
  net: devmem: ksft: upgrade rx test to send 1K data
  net: devmem: ksft: add 5 tuple FS support
  net: devmem: ksft: add exit_wait to make rx test pass
  net: devmem: ksft: add ipv4 support
  net: devmem: preserve sockc_err
  page_pool: fix ugly page_pool formatting
  net: devmem: move list_add to net_devmem_bind_dmabuf.
  selftests: netfilter: nft_queue.sh: include file transfer duration in log message
  net: phy: mscc: Fix memory leak when using one step timestamping
  ...
2025-05-28 15:24:36 -07:00
Ian Rogers
4d9b5146f0 perf symbol: Move demangling code out of symbol-elf.c
symbol-elf.c is used when building with libelf, symbol-minimal is used
otherwise.

There is no reason the demangling code with no dependencies on libelf is
part of symbol-elf.c so move to symbol.c.

This allows demangling tests to pass with NO_LIBELF=1.

Structurally, while moving the functions rename demangle_sym() to
dso__demangle_sym() which is already a function exposed in symbol.h and
the only purpose of which in symbol-elf.c was to call demangle_sym().

Change the calls to demangle_sym() in symbol-elf.c to calls to
dso__demangle_sym().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250528210858.499898-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 19:02:58 -03:00
Linus Torvalds
47cf96fbe3 arm64 updates for 6.16
ACPI, EFI and PSCI:
  - Decouple Arm's "Software Delegated Exception Interface" (SDEI)
    support from the ACPI GHES code so that it can be used by platforms
    booted with device-tree.
 
  - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
    runtime calls.
 
  - Fix a node refcount imbalance in the PSCI device-tree code.
 
 CPU Features:
  - Ensure register sanitisation is applied to fields in ID_AA64MMFR4.
 
  - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM guests
    can reliably query the underlying CPU types from the VMM.
 
  - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
    to our context-switching, signal handling and ptrace code.
 
 Entry code:
  - Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be
    selected.
 
 Memory management:
  - Prevent BSS exports from being used by the early PI code.
 
  - Propagate level and stride information to the low-level TLB
    invalidation routines when operating on hugetlb entries.
 
  - Use the page-table contiguous hint for vmap() mappings with
    VM_ALLOW_HUGE_VMAP where possible.
 
  - Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode"
    and hook this up on arm64 so that the trailing DSB (used to publish
    the updates to the hardware walker) can be deferred until the end of
    the mapping operation.
 
  - Extend mmap() randomisation for 52-bit virtual addresses (on par with
    48-bit addressing) and remove limited support for randomisation of
    the linear map.
 
 Perf and PMUs:
  - Add support for probing the CMN-S3 driver using ACPI.
 
  - Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers.
 
 Selftests:
  - Fix FPSIMD and SME tests to align with the freshly re-enabled SME
    support.
 
  - Fix default setting of the OUTPUT variable so that tests are
    installed in the right location.
 
 vDSO:
  - Replace raw counter access from inline assembly code with a call to
    the the __arch_counter_get_cntvct() helper function.
 
 Miscellaneous:
  - Add some missing header inclusions to the CCA headers.
 
  - Rework rendering of /proc/cpuinfo to follow the x86-approach and
    avoid repeated buffer expansion (the user-visible format remains
    identical).
 
  - Remove redundant selection of CONFIG_CRC32
 
  - Extend early error message when failing to map the device-tree blob.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmg1uTgQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNFv2CAC9S5OW0btOAo7V/LFBpLhJM3hdIV6Sn6N1
 d/K5znuqPBG6VPfBrshaZltEl/C3U8KG4H8xrlX5cSo7CRuf3DgVBw3kiZ6ERZj6
 1gnKR54juA1oWhcroPl0s76ETWj3N4gO036u2qOhWNAYflDunh1+bCIGJkG4H/yP
 wqtWn974YUbad/zQJSbG3IMO1yvxZ/PsNpVF8HjyQ0/ZPWsYTscrhNQ0hWro17sR
 CTcUaGxH4GrXW24EGNgkLB9aq67X2rtGGtaIlp5oFl8FuLklc7TYbPwJp8cPCTNm
 0Sp0mpuR9M675pYIKoCI9m5twc46znRIKmbXi5LvPd77418y3jTf
 =03N4
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The headline feature is the re-enablement of support for Arm's
  Scalable Matrix Extension (SME) thanks to a bumper crop of fixes
  from Mark Rutland.

  If matrices aren't your thing, then Ryan's page-table optimisation
  work is much more interesting.

  Summary:

  ACPI, EFI and PSCI:

   - Decouple Arm's "Software Delegated Exception Interface" (SDEI)
     support from the ACPI GHES code so that it can be used by platforms
     booted with device-tree

   - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
     runtime calls

   - Fix a node refcount imbalance in the PSCI device-tree code

  CPU Features:

   - Ensure register sanitisation is applied to fields in ID_AA64MMFR4

   - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
     guests can reliably query the underlying CPU types from the VMM

   - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
     to our context-switching, signal handling and ptrace code

  Entry code:

   - Hook up TIF_NEED_RESCHED_LAZY so that CONFIG_PREEMPT_LAZY can be
     selected

  Memory management:

   - Prevent BSS exports from being used by the early PI code

   - Propagate level and stride information to the low-level TLB
     invalidation routines when operating on hugetlb entries

   - Use the page-table contiguous hint for vmap() mappings with
     VM_ALLOW_HUGE_VMAP where possible

   - Optimise vmalloc()/vmap() page-table updates to use "lazy MMU mode"
     and hook this up on arm64 so that the trailing DSB (used to publish
     the updates to the hardware walker) can be deferred until the end
     of the mapping operation

   - Extend mmap() randomisation for 52-bit virtual addresses (on par
     with 48-bit addressing) and remove limited support for
     randomisation of the linear map

  Perf and PMUs:

   - Add support for probing the CMN-S3 driver using ACPI

   - Minor driver fixes to the CMN, Arm-NI and amlogic PMU drivers

  Selftests:

   - Fix FPSIMD and SME tests to align with the freshly re-enabled SME
     support

   - Fix default setting of the OUTPUT variable so that tests are
     installed in the right location

  vDSO:

   - Replace raw counter access from inline assembly code with a call to
     the the __arch_counter_get_cntvct() helper function

  Miscellaneous:

   - Add some missing header inclusions to the CCA headers

   - Rework rendering of /proc/cpuinfo to follow the x86-approach and
     avoid repeated buffer expansion (the user-visible format remains
     identical)

   - Remove redundant selection of CONFIG_CRC32

   - Extend early error message when failing to map the device-tree
     blob"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (83 commits)
  arm64: cputype: Add cputype definition for HIP12
  arm64: el2_setup.h: Make __init_el2_fgt labels consistent, again
  perf/arm-cmn: Add CMN S3 ACPI binding
  arm64/boot: Disallow BSS exports to startup code
  arm64/boot: Move global CPU override variables out of BSS
  arm64/boot: Move init_pgdir[] and init_idmap_pgdir[] into __pi_ namespace
  perf/arm-cmn: Initialise cmn->cpu earlier
  kselftest/arm64: Set default OUTPUT path when undefined
  arm64: Update comment regarding values in __boot_cpu_mode
  arm64: mm: Drop redundant check in pmd_trans_huge()
  arm64/mm: Re-organise setting up FEAT_S1PIE registers PIRE0_EL1 and PIR_EL1
  arm64/mm: Permit lazy_mmu_mode to be nested
  arm64/mm: Disable barrier batching in interrupt contexts
  arm64/cpuinfo: only show one cpu's info in c_show()
  arm64/mm: Batch barriers when updating kernel mappings
  mm/vmalloc: Enter lazy mmu mode while manipulating vmalloc ptes
  arm64/mm: Support huge pte-mapped pages in vmap
  mm/vmalloc: Gracefully unmap huge ptes
  mm/vmalloc: Warn on improper use of vunmap_range()
  arm64/mm: Hoist barriers out of set_ptes_anysz() loop
  ...
2025-05-28 14:55:35 -07:00
Linus Torvalds
2c26b68cd5 NFSD 6.16 Release Notes
The marquee feature for this release is that the limit on the
 maximum rsize and wsize has been raised to 4MB. The default remains
 at 1MB, but risk-seeking administrators now have the ability to try
 larger I/O sizes with NFS clients that support them. Eventually the
 default setting will be increased when we have confidence that this
 change will not have negative impact.
 
 With v6.16, NFSD now has its own debugfs file system where we can
 add experimental features and make them available outside of our
 development community without impacting production deployments. The
 first experimental setting added is one that makes all NFS READ
 operations use vfs_iter_read() instead of the NFSD splice actor. The
 plan is to eventually retire the splice actor, as that will enable a
 number of new capabilities such as the use of struct bio_vec from the
 top to the bottom of the NFSD stack.
 
 Jeff Layton contributed a number of observability improvements. The
 use of dprintk() in a number of high-traffic code paths has been
 replaced with static trace points.
 
 This release sees the continuation of efforts to harden the NFSv4.2
 COPY operation. Soon, the restriction on async COPY operations can
 be lifted.
 
 Many thanks to the contributors, reviewers, testers, and bug
 reporters who participated during the v6.16 development cycle.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmg1yGUACgkQM2qzM29m
 f5frMA//TJTbSWiM7qBX1GhVMNr1lxQcjU4BPKo0qZfEtwV06F2BB9mWgDU+BIQh
 AcGfMZUmNWAnhOTOYvwqyW6dnX+1yt8sBsCZ/1ctY30A4JH4AgG5sdZS7BUrlEEr
 bGDMUCaPnvQ3maeDjMlefe7Xv/rUhj9TVhXmDkt4vf/jCde2JODTB/z8n7WeAxYJ
 eOvmr/n5z6VI5Q67M7b5/xqofBEaEoq9P5UEgn61ThfeR0bMlrklm/avDCbbNIH8
 6n7Z3tjzllK1CAjEmwHalq4LRbMX5FHWzNkyJw+wtviXS18J5vCAvRe+JDoykusu
 L2bgXT8bBUqy46eO4WKEOJtEqVQhIsRFx/8ku1iTLrpDWlwrR4mHVyObEDkkdlMX
 EyBQ4svg2OxCXSyy5O8oggzU0TWVJStIjbIEHbJYusWLU7HxxFveBwqwzYHXLtip
 WKm6N2ANqQi1du+Pc6xmgXo9svA5Vk+DQjljm1Y5up9dhi2K9cvCIHjwFsZ+E0VL
 XqXJ2YgIQb3oXK7FttzLOiDrpX1OX82sTIbgdcPcfT7lP+ej7uiHMBPmdPwgaZIU
 EbIp0ThoTkh8/VRMDcWIt+B6SEhmb5vY3Zgz9Lcf2J0PM1fuYJ67L7xGTviFX7Ci
 DpohiCgceb6PHYeIuarayF86tPJGF8Vb7XvQZej2Ybv8QdxLFg8=
 =FbeG
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "The marquee feature for this release is that the limit on the maximum
  rsize and wsize has been raised to 4MB. The default remains at 1MB,
  but risk-seeking administrators now have the ability to try larger I/O
  sizes with NFS clients that support them. Eventually the default
  setting will be increased when we have confidence that this change
  will not have negative impact.

  With v6.16, NFSD now has its own debugfs file system where we can add
  experimental features and make them available outside of our
  development community without impacting production deployments. The
  first experimental setting added is one that makes all NFS READ
  operations use vfs_iter_read() instead of the NFSD splice actor. The
  plan is to eventually retire the splice actor, as that will enable a
  number of new capabilities such as the use of struct bio_vec from the
  top to the bottom of the NFSD stack.

  Jeff Layton contributed a number of observability improvements. The
  use of dprintk() in a number of high-traffic code paths has been
  replaced with static trace points.

  This release sees the continuation of efforts to harden the NFSv4.2
  COPY operation. Soon, the restriction on async COPY operations can be
  lifted.

  Many thanks to the contributors, reviewers, testers, and bug reporters
  who participated during the v6.16 development cycle"

* tag 'nfsd-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (60 commits)
  xdrgen: Fix code generated for counted arrays
  SUNRPC: Bump the maximum payload size for the server
  NFSD: Add a "default" block size
  NFSD: Remove NFSSVC_MAXBLKSIZE_V2 macro
  NFSD: Remove NFSD_BUFSIZE
  sunrpc: Remove the RPCSVC_MAXPAGES macro
  svcrdma: Adjust the number of entries in svc_rdma_send_ctxt::sc_pages
  svcrdma: Adjust the number of entries in svc_rdma_recv_ctxt::rc_pages
  sunrpc: Adjust size of socket's receive page array dynamically
  SUNRPC: Remove svc_rqst :: rq_vec
  SUNRPC: Remove svc_fill_write_vector()
  NFSD: Use rqstp->rq_bvec in nfsd_iter_write()
  SUNRPC: Export xdr_buf_to_bvec()
  NFSD: De-duplicate the svc_fill_write_vector() call sites
  NFSD: Use rqstp->rq_bvec in nfsd_iter_read()
  sunrpc: Replace the rq_bvec array with dynamically-allocated memory
  sunrpc: Replace the rq_pages array with dynamically-allocated memory
  sunrpc: Remove backchannel check in svc_init_buffer()
  sunrpc: Add a helper to derive maxpages from sv_max_mesg
  svcrdma: Reduce the number of rdma_rw contexts per-QP
  ...
2025-05-28 12:21:12 -07:00
Anubhav Shelat
c7a48ea9b9 perf trace: Always print return value for syscalls returning a pid
The syscalls that were consistently observed were set_robust_list and
rseq. This is because perf cannot find their child process.

This change ensures that the return value is always printed.

Before:
     0.256 ( 0.001 ms): set_robust_list(head: 0x7f09c77dba20, len: 24)                        =
     0.259 ( 0.001 ms): rseq(rseq: 0x7f09c77dc0e0, rseq_len: 32, sig: 1392848979)             =
After:
     0.270 ( 0.002 ms): set_robust_list(head: 0x7f0bb14a6a20, len: 24)                        = 0
     0.273 ( 0.002 ms): rseq(rseq: 0x7f0bb14a70e0, rseq_len: 32, sig: 1392848979)             = 0

Committer notes:

As discussed in the thread in the Link: tag below, these two don't
return a pid, but for syscalls returning one, we need to print the
result and if we manage to find the children in 'perf trace' data
structures, then print its name as well.

Fixes: 11c8e39f51 ("perf trace: Infrastructure to show COMM strings for syscalls returning PIDs")
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250403160411.159238-2-ashelat@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 15:39:29 -03:00
Leo Yan
e8718f9866 perf script: Print PERF_AUX_FLAG_COLLISION flag
Print out the collision flag for AUX trace data. This is helpful for
inspecting sample collisions.

After:

  0x217b60@/data_nvme1n1/niayan01/upstream/perf.data [0x40]: event: 11
  .
  . ... raw event: size 64 bytes
  .  0000:  0b 00 00 00 00 00 40 00 d2 ef 3f 00 00 00 00 00  ......@...?.....
  .  0010:  ff 0f 00 00 00 00 00 00 08 00 00 00 00 00 00 00  ................
  .  0020:  1c 01 00 00 1c 01 00 00 10 bf 38 d6 11 01 00 00  ..........8.....
  .  0030:  03 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00  ................

  3 1176120114960 0x217b60 [0x40]: PERF_RECORD_AUX offset: 0x3fefd2 size: 0xfff flags: 0x8 [C]

The added character '[C]' indicates the collision.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250528153519.188644-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 15:08:25 -03:00
Namhyung Kim
0dad79cf81 perf mem: Show absolute percent in mem_stat output
Currently the output sums up to 100% for each entry.  But it can be
confusing when it's displayed with 'overhead'.

Before:

  $ perf mem report -F overhead,sample,cache,comm
  ...
  #                         -------------- Cache --------------
  # Overhead       Samples       L1     L2     L3 L1-buf  Other  Command
  # ........  ............  ...................................  ...............
  #
      25.38%           517    34.6%   0.0%  15.8%  23.3%  26.2%  swapper
       9.03%           239    35.4%   0.8%   9.1%  22.1%  32.6%  chrome
       8.61%           233    45.3%   1.2%   8.9%  22.7%  21.9%  Chrome_ChildIOT
       7.81%           189    33.6%   0.4%   5.5%  35.9%  24.6%  Isolated Web Co
       3.73%           103    40.4%   0.3%   2.7%  39.4%  17.2%  gnome-shell

Let's convert it to use absolute percent value so that it can add up to
the overhead for that entry.

After:
  #                         -------------- Cache --------------
  # Overhead       Samples       L1     L2     L3 L1-buf  Other  Command
  # ........  ............  ...................................  ...............
  #
      25.38%           517     8.8%   0.0%   4.0%   5.9%   6.7%  swapper
       9.03%           239     3.2%   0.1%   0.8%   2.0%   2.9%  chrome
       8.61%           233     3.9%   0.1%   0.8%   2.0%   1.9%  Chrome_ChildIOT
       7.81%           189     2.6%   0.0%   0.4%   2.8%   1.9%  Isolated Web Co
       3.73%           103     1.5%   0.0%   0.1%   1.5%   0.6%  gnome-shell

This aligns well with the existing 'mem' sort key.

  $ perf mem report -s comm,mem -H
  ...
  #
  #    Overhead       Samples  Command / Memory access
  # .........................  ..........................................
  #
      25.38%           517     swapper
          8.78%           150     L1 hit
          6.66%            72     RAM hit
          5.92%           137     LFB/MAB hit
          4.02%           157     L3 hit
          0.00%             1     L3 miss
       9.03%           239     chrome
          3.19%           117     L1 hit
          2.94%            35     RAM hit
          1.99%            48     LFB/MAB hit
          0.82%            32     L3 hit
          0.08%             5     L2 hit
          0.00%             2     L3 miss

We can add an option or a config to change the setting later.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250523222157.1259998-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 14:42:20 -03:00
Namhyung Kim
7a6710d015 perf mem: Display sort order only if it's available
IOW it's not used when -F option is used alone.  Let's make it
conditional to skip printing incorrect information.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250523222157.1259998-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 14:41:42 -03:00
Ravi Bangoria
00a23c000e perf mem: Describe overhead calculation in brief
Unlike perf-report which uses sample period for overhead calculation,
perf-mem overhead is calculated using sample weight. Describe perf-mem
overhead calculation method in it's man page.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250523222157.1259998-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 14:41:10 -03:00
Dapeng Mi
a4a859eb67 perf record: Fix incorrect --user-regs comments
The comment of "--user-regs" option is not correct, fix it.

"on interrupt," -> "in user space,"

Fixes: 84c4174227 ("perf record: Support direct --user-regs arguments")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250403060810.196028-1-dapeng1.mi@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 14:10:56 -03:00
Arnaldo Carvalho de Melo
24bcc31fc7 Revert "perf thread: Ensure comm_lock held for comm_list"
This reverts commit 8f454c9581.

'perf top' is freezing on exit sometimes, bisected to this one, revert.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Fei Lang <langfei@huawei.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/aDcyvvOKZkRYbjul@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 12:59:17 -03:00
Linus Torvalds
96d40793ab seccomp updates for v6.16-rc1
- selftest fixes for arm32 (Neill Kapron, Terry Tritton)
 
 - documentation typo fix (Sumanth Gavini)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaDUlIgAKCRA2KwveOeQk
 uxbDAQCIUTc4Dy5SDihfbUaaxmScMZJmz+3Rvnz2665djkl82QD/aKEpRdxe8gCm
 mTRF6pcfeurw5W6ms+QxpayGf3hnmgU=
 =XJBX
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:

 - selftest fixes for arm32 (Neill Kapron, Terry Tritton)

 - documentation typo fix (Sumanth Gavini)

* tag 'seccomp-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  selftests: seccomp: Fix "performace" to "performance"
  selftests/seccomp: fix negative_ENOSYS tracer tests on arm32
  selftests/seccomp: fix syscall_restart test for arm compat
2025-05-28 07:45:39 -07:00
Ian Rogers
6dd7a0fde9 perf test trace_summary: Skip --bpf-summary tests if no libbpf
If perf is built without libbpf (e.g. NO_LIBBPF=1) then the
--bpf-summary perf trace tests will fail.

Skip the tests as this is expected behavior.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Howard Chu <howardchu95@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250528032637.198960-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 10:12:47 -03:00
Ian Rogers
8755f940a0 perf test intel-pt: Skip jitdump test if no libelf
jitdump support is only present if building with libelf.

Skip the intel-pt jitdump test if perf isn't compiled with libelf
support.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250528032637.198960-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 10:12:47 -03:00
Ian Rogers
040a008d0e perf intel-tpebs: Avoid race when evlist is being deleted
Reading through the evsel->evlist may seg fault if a sample arrives
when the evlist is being deleted.

Detect this case and ignore samples arriving when the evlist is being
deleted.

Fixes: bcfab08db7 ("perf intel-tpebs: Filter non-workload samples")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250528032637.198960-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 10:12:47 -03:00
Ian Rogers
07f2b1287c perf test demangle-java: Don't segv if demangling fails
The buffer returned by dso__demangle_sym() may be NULL, don't segv in
strcmp if this happens.

Currently this happens for NO_LIBELF=1 builds.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250528032637.198960-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 10:12:47 -03:00
Ian Rogers
fef8f648bb perf symbol: Fix use-after-free in filename__read_build_id
The same buf is used for the program headers and reading notes. As the
notes memory may be reallocated then this corrupts the memory pointed
to by the phdr. Using the same buffer is in any case a logic
error. Rather than deal with the duplicated code, introduce an elf32
boolean and a union for either the elf32 or elf64 headers that are in
use. Let the program headers have their own memory and grow the buffer
for notes as necessary.

Before `perf list -j` compiled with asan would crash with:
```
==4176189==ERROR: AddressSanitizer: heap-use-after-free on address 0x5160000070b8 at pc 0x555d3b15075b bp 0x7ffebb5a8090 sp 0x7ffebb5a8088
READ of size 8 at 0x5160000070b8 thread T0
    #0 0x555d3b15075a in filename__read_build_id tools/perf/util/symbol-minimal.c:212:25
    #1 0x555d3ae43aff in filename__sprintf_build_id tools/perf/util/build-id.c:110:8
...

0x5160000070b8 is located 312 bytes inside of 560-byte region [0x516000006f80,0x5160000071b0)
freed by thread T0 here:
    #0 0x555d3ab21840 in realloc (perf+0x264840) (BuildId: 12dff2f6629f738e5012abdf0e90055518e70b5e)
    #1 0x555d3b1506e7 in filename__read_build_id tools/perf/util/symbol-minimal.c:206:11
...

previously allocated by thread T0 here:
    #0 0x555d3ab21423 in malloc (perf+0x264423) (BuildId: 12dff2f6629f738e5012abdf0e90055518e70b5e)
    #1 0x555d3b1503a2 in filename__read_build_id tools/perf/util/symbol-minimal.c:182:9
...
```

Note: this bug is long standing and not introduced by the other asan
fix in commit fa9c4977fb ("perf symbol-minimal: Fix double free in
filename__read_build_id").

Fixes: b691f64360 ("perf symbols: Implement poor man's ELF parser")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250528032637.198960-2-irogers@google.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Trevor Gross <tmgross@umich.edu>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 10:12:47 -03:00
Ian Rogers
2a2a7f5e7d perf pmu: Avoid segv for missing name/alias_name in wildcarding
The pmu name or alias_name fields may be NULL and should be skipped if
so. This is done in all loops of perf_pmu___name_match except the
final wildcard loop which was an oversight.

Fixes: 63e287131c ("perf pmu: Rename name matching for no suffix or wildcard variants")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250527215035.187992-1-irogers@google.com
[ Fixup the Fixes: tag to the right commit ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 10:12:47 -03:00
Ian Rogers
4c04654455 perf machine: Factor creating a "live" machine out of dwarf-unwind
Factor out for use in places other than the dwarf unwinding tests for
libunwind.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250313052952.871958-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28 09:24:59 -03:00
Paolo Abeni
f6bd8faeb1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in late fixes to prepare for the 6.16 net-next PR.

No conflicts nor adjacent changes.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28 10:11:15 +02:00
Saket Kumar Bhaskar
acea6b132d selftests/bpf: Fix bpf selftest build warning
On linux-next, build for bpf selftest displays a warning:

Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h'
differs from latest version at 'include/uapi/linux/if_xdp.h'.

Commit 8066e388be ("net: add UAPI to the header guard in various network headers")
changed the header guard from _LINUX_IF_XDP_H to _UAPI_LINUX_IF_XDP_H
in include/uapi/linux/if_xdp.h.

To resolve the warning, update tools/include/uapi/linux/if_xdp.h
to align with the changes in include/uapi/linux/if_xdp.h

Fixes: 8066e388be ("net: add UAPI to the header guard in various network headers")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/all/c2bc466d-dff2-4d0d-a797-9af7f676c065@linux.ibm.com/
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20250527054138.1086006-1-skb99@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28 10:00:14 +02:00
Phil Sutter
6da5f1b4b4 selftests: netfilter: Fix skip of wildcard interface test
The script is supposed to skip wildcard interface testing if unsupported
by the host's nft tool. The failing check caused script abort due to
'set -e' though. Fix this by running the potentially failing nft command
inside the if-conditional pipe.

Fixes: 73db1b5dab ("selftests: netfilter: Torture nftables netdev hooks")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20250527094117.18589-1-phil@nwl.cc
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28 09:48:41 +02:00
Pedro Tammela
2945ff733d selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
Reproduce the UAF scenario where netem is a child of HFSC and HFSC
is configured to use the eltree. In such case, this TDC test would
cause the HFSC class to be added to the eltree twice resulting
in a UAF.

Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://patch.msgid.link/20250522181448.1439717-3-pctammela@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28 08:55:35 +02:00
Linus Torvalds
feacb1774b sched_ext: Changes for v6.16
- More in-kernel idle CPU selection improvements. Expand topology awareness
   coverage add scx_bpf_select_cpu_and() to allow more flexibility. The idle
   CPU selection kfuncs can now be called from unlocked contexts too.
 
 - A bunch of reorganization changes to lay the foundation for multiple
   hierarchical scheduler support. This isn't ready yet and the included
   changes don't make meaningful behavior differences. One notable change is
   replacing some static_key tests with dynamic tests as the test results may
   differ depending on the scheduler instance. This isn't expected to cause
   meaningful performance difference.
 
 - Other minor and doc updates.
 
 - There were multiple patches in for-6.15-fixes which conflicted with
   changes in for-6.16. for-6.15-fixes were pulled three times into for-6.16
   to resolve the conflicts.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaDYZMw4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGfbcAQDRloVb/d5RfC6VYlue9EV1jHuoJefTYHvR3jmO
 ju70EQEAjLBXw58XAePQ9La/570JELgsC5FzJp3tLTilGx2JyQA=
 =7cDG
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext updates from Tejun Heo:

 - More in-kernel idle CPU selection improvements. Expand topology
   awareness coverage add scx_bpf_select_cpu_and() to allow more
   flexibility. The idle CPU selection kfuncs can now be called from
   unlocked contexts too.

 - A bunch of reorganization changes to lay the foundation for multiple
   hierarchical scheduler support. This isn't ready yet and the included
   changes don't make meaningful behavior differences. One notable
   change is replacing some static_key tests with dynamic tests as the
   test results may differ depending on the scheduler instance. This
   isn't expected to cause meaningful performance difference.

 - Other minor and doc updates.

 - There were multiple patches in for-6.15-fixes which conflicted with
   changes in for-6.16. for-6.15-fixes were pulled three times into
   for-6.16 to resolve the conflicts.

* tag 'sched_ext-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (49 commits)
  sched_ext: Call ops.update_idle() after updating builtin idle bits
  sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler"
  selftests/sched_ext: Update test enq_select_cpu_fails
  sched_ext: idle: Consolidate default idle CPU selection kfuncs
  selftests/sched_ext: Add test for scx_bpf_select_cpu_and() via test_run
  sched_ext: idle: Allow scx_bpf_select_cpu_and() from unlocked context
  sched_ext: idle: Validate locking correctness in scx_bpf_select_cpu_and()
  sched_ext: Make scx_kf_allowed_if_unlocked() available outside ext.c
  sched_ext, docs: add label
  sched_ext: Explain the temporary situation around scx_root dereferences
  sched_ext: Add @sch to SCX_CALL_OP*()
  sched_ext: Cleanup [__]scx_exit/error*()
  sched_ext: Add @sch to SCX_CALL_OP*()
  sched_ext: Clean up scx_root usages
  Documentation: scheduler: Changed lowercase acronyms to uppercase
  sched_ext: Avoid NULL scx_root deref in __scx_exit()
  sched_ext: Add RCU protection to scx_root in DSQ iterator
  sched_ext: Clean up SCX_EXIT_NONE handling in scx_disable_workfn()
  sched_ext: Move disable machinery into scx_sched
  sched_ext: Move event_stats_cpu into scx_sched
  ...
2025-05-27 21:12:50 -07:00
Linus Torvalds
3b66e6b3c0 cgroup: Changes for v6.16
- cgroup rstat shared the tracking tree across all controlers with the
   rationale being that a cgroup which is using one resource is likely to be
   using other resources at the same time (ie. if something is allocating
   memory, it's probably consuming CPU cycles). However, this turned out to
   not scale very well especially with memcg using rstat for internal
   operations which made memcg stat read and flush patterns substantially
   different from other controllers. JP Kobryn split the rstat tree per
   controller.
 
 - cgroup BPF support was hooking into cgroup init/exit paths directly.
   Convert them to use a notifier chain instead so that other usages can be
   added easily. The two of the patches which implement this are mislabeled
   as belonging to sched_ext instead of cgroup. Sorry.
 
 - Relatively minor cpuset updates.
 
 - Documentation updates.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaDYUmA4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGRhbAP90v8QwUkWEKGQSam8JY3by7PvrW6pV5ot+BGuM
 4xu3BAEAjsJ9FdiwYLwKYqG7y59xhhBFOo6GpcP52kPp3znl+QQ=
 =6MIT
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - cgroup rstat shared the tracking tree across all controllers with the
   rationale being that a cgroup which is using one resource is likely
   to be using other resources at the same time (ie. if something is
   allocating memory, it's probably consuming CPU cycles).

   However, this turned out to not scale very well especially with memcg
   using rstat for internal operations which made memcg stat read and
   flush patterns substantially different from other controllers. JP
   Kobryn split the rstat tree per controller.

 - cgroup BPF support was hooking into cgroup init/exit paths directly.

   Convert them to use a notifier chain instead so that other usages can
   be added easily. The two of the patches which implement this are
   mislabeled as belonging to sched_ext instead of cgroup. Sorry.

 - Relatively minor cpuset updates

 - Documentation updates

* tag 'cgroup-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (23 commits)
  sched_ext: Convert cgroup BPF support to use cgroup_lifetime_notifier
  sched_ext: Introduce cgroup_lifetime_notifier
  cgroup: Minor reorganization of cgroup_create()
  cgroup, docs: cpu controller's interaction with various scheduling policies
  cgroup, docs: convert space indentation to tab indentation
  cgroup: avoid per-cpu allocation of size zero rstat cpu locks
  cgroup, docs: be specific about bandwidth control of rt processes
  cgroup: document the rstat per-cpu initialization
  cgroup: helper for checking rstat participation of css
  cgroup: use subsystem-specific rstat locks to avoid contention
  cgroup: use separate rstat trees for each subsystem
  cgroup: compare css to cgroup::self in helper for distingushing css
  cgroup: warn on rstat usage by early init subsystems
  cgroup/cpuset: drop useless cpumask_empty() in compute_effective_exclusive_cpumask()
  cgroup/rstat: Improve cgroup_rstat_push_children() documentation
  cgroup: fix goto ordering in cgroup_init()
  cgroup: fix pointer check in css_rstat_init()
  cgroup/cpuset: Add warnings to catch inconsistency in exclusive CPUs
  cgroup/cpuset: Fix obsolete comment in cpuset_css_offline()
  cgroup/cpuset: Always use cpu_active_mask
  ...
2025-05-27 20:59:53 -07:00
Linus Torvalds
f1975e4765 Summary
* Move kern_table members out of kernel/sysctl.c
 
   Moved a subset (tracing, panic, signal, stack_tracer and sparc) out of the
   kern_table array. The goal is for kern_table to only have sysctl elements. All
   this increases modularity by placing the ctl_tables closer to where they are
   used while reducing the chances of merge conflicts in kernel/sysctl.c.
 
 * Fixed sysctl unit test panic by relocating it to selftests
 
 * Testing
 
   These have been in linux-next from rc2, so they have had more than a month
   worth of testing.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmgwLsAACgkQupfNUreW
 QU9ghwv/VKZW+IXEvSjc8OiwntWkL7e5ddHY6O2Vf44MzhBefLTXmfx2HfkEA0Xw
 RaOQ28Hf/zQL83RqHHnXqI7JdGWQJUm8bCPwk4H3DCaF8qOfPVvblVYmfNL2auSY
 oyRRpRzZuY5EtKcrNjiHFHL2WIC8KvPVwS748oHY1eZY7kn1fcs8DDnNO4iuWop+
 uJeDxu87wkRCFXF3DIM+MAHRvxSa8GHtZvb9EjAl/EHMbAyVSz3uTb7FdQDdnE09
 s7P30EC03RHtgi3sd2Ku04dJsHLz7VErvpToxSH2KFlcdpJuWuCSCTT8XaD8kII8
 kYYCxNpmPOf4LzEy/J2vVZB0PSHrHvuQCH7iGy+8wOPk9GHTOMkKMMXVmeGnAsef
 AiosPYroxXp/nBFcuNs6/1LKpsdpFr2F6u6oMgbzLaW1Xe/oc+6oynuOgeVj9LuM
 FrSxSwaVvpdwHYHujYPQAAWIgKRzITiEXnCgtSyohFquKb+7E8ZspwjOqYH2xWMQ
 WwABNRqY
 =45X2
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl updates from Joel Granados:

 - Move kern_table members out of kernel/sysctl.c

   Moved a subset (tracing, panic, signal, stack_tracer and sparc) out
   of the kern_table array. The goal is for kern_table to only have
   sysctl elements. All this increases modularity by placing the
   ctl_tables closer to where they are used while reducing the chances
   of merge conflicts in kernel/sysctl.c.

 - Fixed sysctl unit test panic by relocating it to selftests

* tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: Close test ctl_headers with a for loop
  sysctl: call sysctl tests with a for loop
  sysctl: Add 0012 to test the u8 range check
  sysctl: move u8 register test to lib/test_sysctl.c
  sparc: mv sparc sysctls into their own file under arch/sparc/kernel
  stack_tracer: move sysctl registration to kernel/trace/trace_stack.c
  tracing: Move trace sysctls into trace.c
  signal: Move signal ctl tables into signal.c
  panic: Move panic ctl tables into panic.c
2025-05-27 20:43:35 -07:00
Mina Almasry
affffcbb87 net: devmem: ncdevmem: remove unused variable
This variable is unused and can be removed.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250523230524.1107879-9-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-27 19:19:36 -07:00
Mina Almasry
baa18bc535 net: devmem: ksft: upgrade rx test to send 1K data
The current test just sends "hello\nworld" and verifies that is the
string received on the RX side. That is fine, but improve the test a bit
by sending 1K data. The test should be improved further to send more
data, but for now this should be a welcome improvement.

The test will send a repeating pattern of 0x01, 0x02, ... 0x06. The
ncdevmem `-v 7` flag will verify this pattern. ncdevmem will provide
useful debugging info when the test fails, such as the frags received
and verified fine, and which frag exactly failed, what was the expected
byte pattern, and what is the actual byte pattern received. All this
debug information will be useful when the test fails.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250523230524.1107879-8-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-27 19:19:36 -07:00
Mina Almasry
243d47a5e1 net: devmem: ksft: add 5 tuple FS support
ncdevmem supports drivers that are limited to either 3-tuple or 5-tuple
FS support, but the ksft is currently 3-tuple only. Support drivers that
have 5-tuple FS supported by adding a ksft arg.

Signed-off-by: Mina Almasry <almasrymina@google.com>

fix 5-tuple

fix 5-tuple
Acked-by: Stanislav Fomichev <sdf@fomichev.me>

Link: https://patch.msgid.link/20250523230524.1107879-7-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-27 19:19:36 -07:00
Mina Almasry
57605ae8e1 net: devmem: ksft: add exit_wait to make rx test pass
This exit_wait seems necessary to make the rx side test pass for me.
I think this is just missed from the original test add patch. Add it now.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250523230524.1107879-6-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-27 19:19:35 -07:00
Mina Almasry
12d31142e6 net: devmem: ksft: add ipv4 support
ncdevmem supports both ipv4 and ipv6, but the ksft is currently
ipv6-only. Propagate the ipv4 support to the ksft, so that folks that
are limited to these networks can also test.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250523230524.1107879-5-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-27 19:19:35 -07:00
Florian Westphal
429d410bf9 selftests: netfilter: nft_queue.sh: include file transfer duration in log message
Paolo Abeni says:
 Recently the nipa CI infra went through some tuning, and the mentioned
 self-test now often fails.

The failing test is the sctp+nfqueue one, where the file transfer takes
too long and hits the timeout (1 minute).

Because SCTP nfqueue tests had timeout related issues before (esp. on debug
kernels) print the file transfer duration in the PASS/FAIL message.
This would aallow us to see if there is/was an unexpected slowdown
(CI keeps logs around) or 'creeping slowdown' where things got slower
over time until 'fail point' was reached.

Output of altered lines looks like this:
  PASS: tcp and nfqueue in forward chan (duration: 2s)
  PASS: tcp via loopback (duration: 2s)
  PASS: sctp and nfqueue in forward chain (duration: 42s)
  PASS: sctp and nfqueue in output chain with GSO (duration: 21s)

Reported-by: Paolo Abeni <pabeni@redhat.com
Closes: https://lore.kernel.org/netdev/584524ef-9fd7-4326-9f1b-693ca62c5692@redhat.com/
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20250523121700.20011-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-27 19:13:21 -07:00
Hangbin Liu
d9d836bfa5 selftests: net: move wait_local_port_listen to lib.sh
The function wait_local_port_listen() is the only function defined in
net_helper.sh. Since some tests source both lib.sh and net_helper.sh,
we can simplify the setup by moving wait_local_port_listen() to lib.sh.

With this change, net_helper.sh becomes redundant and can be removed.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250526014600.9128-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-27 17:31:42 -07:00
Donald Hunter
09d7ff0694 tools: ynl: parse extack for sub-messages
Extend the Python YNL extack decoding to handle sub-messages in the same
way that YNL C does. This involves retaining the input values so that
they are available during extack decoding.

./tools/net/ynl/pyynl/cli.py --family rt-link --do newlink --create \
    --json '{
        "linkinfo": {"kind": "netkit", "data": {"policy": 10} }
    }'
Netlink error: Invalid argument
nl_len = 92 (76) nl_flags = 0x300 nl_type = 2
	error: -22
	extack: {'msg': 'Provided default xmit policy not supported', 'bad-attr': '.linkinfo.data(netkit).policy'}

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250523103031.80236-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-27 17:31:42 -07:00
Linus Torvalds
c89756bcf4 Power management updates for 6.16-rc1
- Fix potential division-by-zero error in em_compute_costs() (Yaxiong
    Tian).
 
  - Fix typos in energy model documentation and example driver code (Moon
    Hee Lee, Atul Kumar Pant).
 
  - Rearrange the energy model management code and add a new function for
    adjusting a CPU energy model after adjusting the capacity of the
    given CPU to it (Rafael Wysocki).
 
  - Refactor cpufreq_online(), add and use cpufreq policy locking guards,
    use __free() in policy reference counting, and clean up core cpufreq
    code on top of that (Rafael Wysocki).
 
  - Fix boost handling on CPU suspend/resume and sysfs updates (Viresh
    Kumar).
 
  - Fix des_perf clamping with max_perf in amd_pstate_update() (Dhananjay
    Ugwekar).
 
  - Add offline, online and suspend callbacks to the amd-pstate driver,
    rename and use the existing amd_pstate_epp callbacks in it (Dhananjay
    Ugwekar).
 
  - Add support for the "Requested CPU Min frequency" BIOS option to the
    amd-pstate driver (Dhananjay Ugwekar).
 
  - Reset amd-pstate driver mode after running selftests (Swapnil
    Sapkal).
 
  - Avoid shadowing ret in amd_pstate_ut_check_driver() (Nathan
    Chancellor).
 
  - Add helper for governor checks to the schedutil cpufreq governor and
    move cpufreq-specific EAS checks to cpufreq (Rafael Wysocki).
 
  - Populate the cpu_capacity sysfs entries from the intel_pstate driver
    after registering asym capacity support (Ricardo Neri).
 
  - Add support for enabling Energy-aware scheduling (EAS) to the
    intel_pstate driver when operating in the passive mode on a hybrid
    platform (Rafael Wysocki).
 
  - Drop redundant cpus_read_lock() from store_local_boost() in the
    cpufreq core (Seyediman Seyedarab).
 
  - Replace sscanf() with kstrtouint() in the cpufreq code and use a
    symbol instead of a raw number in it (Bowen Yu).
 
  - Add support for autonomous CPU performance state selection to the
    CPPC cpufreq driver (Lifeng Zheng).
 
  - OPP: Add dev_pm_opp_set_level() (Praveen Talari).
 
  - Introduce scope-based cleanup headers and mutex locking guards in OPP
    core (Viresh Kumar).
 
  - Switch OPP to use kmemdup_array() (Zhang Enpei).
 
  - Optimize bucket assignment when next_timer_ns equals KTIME_MAX in the
    menu cpuidle governor (Zhongqiu Han).
 
  - Convert the cpuidle PSCI driver to a faux device one (Sudeep Holla).
 
  - Add C1 demotion on/off sysfs knob to the intel_idle driver (Artem
    Bityutskiy).
 
  - Fix typos in two comments in the teo cpuidle governor (Atul Kumar
    Pant).
 
  - Fix denying of auto suspend in pm_suspend_timer_fn() (Charan Teja
    Kalla).
 
  - Move debug runtime PM attributes to runtime_attrs[] (Rafael Wysocki).
 
  - Add new devm_ functions for enabling runtime PM and runtime PM
    reference counting (Bence Csókás).
 
  - Remove size arguments from strscpy() calls in the hibernation core
    code (Thorsten Blum).
 
  - Adjust the handling of devices with asynchronous suspend enabled
    during system suspend and resume to start resuming them immediately
    after resuming their parents and to start suspending such a device
    immediately after suspending its first child (Rafael Wysocki).
 
  - Adjust messages printed during tasks freezing to avoid using
    pr_cont() (Andrew Sayers, Paul Menzel).
 
  - Clean up unnecessary usage of !! in pm_print_times_init() (Zihuan
    Zhang).
 
  - Add missing wakeup source attribute relax_count to sysfs and
    remove the space character at the end ofi the string produced by
    pm_show_wakelocks() (Zijun Hu).
 
  - Add configurable pm_test delay for hibernation (Zihuan Zhang).
 
  - Disable asynchronous suspend in ucsi_ccg_probe() to prevent the
    cypd4226 device on Tegra boards from suspending prematurely (Jon
    Hunter).
 
  - Unbreak printing PM debug messages during hibernation and clean up
    some related code (Rafael Wysocki).
 
  - Add a systemd service to run cpupower and change cpupower binding's
    Makefile to use -lcpupower (John B. Wyatt IV, Francesco Poli).
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmg0xS0SHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO1AwwH/Rvgza5YBPb9JZqWJT/ZiBw7HcEWHhP1
 fNfcVU1gXPZiF0yoPfjfJua6BcLj6lyQ3d/+zWqqAcWfmRSD6HPe8yYz8qALUAqj
 RWhDa04aGj6B9bQuOjejatznYlQlkwCRT7zec+75D+dAHVMqR/Vt2LFAetCadgHe
 MQibAQmVFXu3RFkBjReTAdGzVoTXkwoZDrzdfA2aFAfMJNtJpOW4atUZvnucuctv
 VK3ZratrctCIw7yXEoB1nWSmlY7R5JlslplBfndjmmOnky3YxNr7C6paqwtbTWoF
 MiX48qkmLOGeO6gS8s/lVCDQ4oZ+UNFQvXRsM5NGjycBikhHX/dp/w4=
 =dIqJ
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "Once again, the changes are dominated by cpufreq updates, but this
  time the majority of them are cpufreq core changes, mostly related to
  the introduction of policy locking guards and __free() usage, and
  fixes related to boost handling.

  Still, there is also a significant update of the intel_pstate driver
  making it register an energy model when running on a hybrid platform
  which is used for enabling energy-aware scheduling (EAS) if the driver
  operates in the passive mode (and schedutil is used as the cpufreq
  governor for all CPUs which is the passive mode default).

  There are some amd-pstate driver updates too, for a good measure,
  including the "Requested CPU Min frequency" BIOS option support and
  new online/offline callbacks.

  In the cpuidle space, the most significant change is the addition of a
  C1 demotion on/off sysfs knob to intel_idle which should help some
  users to configure their systems more precisely. There is also the
  conversion of the PSCI cpuidle driver to a faux device one and there
  are two small updates of cpuidle governors.

  Device power management is also modified quite a bit, especially the
  handling of devices with asynchronous suspend and resume enabled
  during system transitions. They are now going to be handled more
  asynchronously during suspend transitions and somewhat less
  aggressively during resume transitions.

  Apart from the above, the operating performance points (OPP) library
  is now going to use mutex locking guards and scope-based cleanup
  helpers and there is the usual bunch of assorted fixes and code
  cleanups.

  Specifics:

   - Fix potential division-by-zero error in em_compute_costs() (Yaxiong
     Tian)

   - Fix typos in energy model documentation and example driver code
     (Moon Hee Lee, Atul Kumar Pant)

   - Rearrange the energy model management code and add a new function
     for adjusting a CPU energy model after adjusting the capacity of
     the given CPU to it (Rafael Wysocki)

   - Refactor cpufreq_online(), add and use cpufreq policy locking
     guards, use __free() in policy reference counting, and clean up
     core cpufreq code on top of that (Rafael Wysocki)

   - Fix boost handling on CPU suspend/resume and sysfs updates (Viresh
     Kumar)

   - Fix des_perf clamping with max_perf in amd_pstate_update()
     (Dhananjay Ugwekar)

   - Add offline, online and suspend callbacks to the amd-pstate driver,
     rename and use the existing amd_pstate_epp callbacks in it
     (Dhananjay Ugwekar)

   - Add support for the "Requested CPU Min frequency" BIOS option to
     the amd-pstate driver (Dhananjay Ugwekar)

   - Reset amd-pstate driver mode after running selftests (Swapnil
     Sapkal)

   - Avoid shadowing ret in amd_pstate_ut_check_driver() (Nathan
     Chancellor)

   - Add helper for governor checks to the schedutil cpufreq governor
     and move cpufreq-specific EAS checks to cpufreq (Rafael Wysocki)

   - Populate the cpu_capacity sysfs entries from the intel_pstate
     driver after registering asym capacity support (Ricardo Neri)

   - Add support for enabling Energy-aware scheduling (EAS) to the
     intel_pstate driver when operating in the passive mode on a hybrid
     platform (Rafael Wysocki)

   - Drop redundant cpus_read_lock() from store_local_boost() in the
     cpufreq core (Seyediman Seyedarab)

   - Replace sscanf() with kstrtouint() in the cpufreq code and use a
     symbol instead of a raw number in it (Bowen Yu)

   - Add support for autonomous CPU performance state selection to the
     CPPC cpufreq driver (Lifeng Zheng)

   - OPP: Add dev_pm_opp_set_level() (Praveen Talari)

   - Introduce scope-based cleanup headers and mutex locking guards in
     OPP core (Viresh Kumar)

   - Switch OPP to use kmemdup_array() (Zhang Enpei)

   - Optimize bucket assignment when next_timer_ns equals KTIME_MAX in
     the menu cpuidle governor (Zhongqiu Han)

   - Convert the cpuidle PSCI driver to a faux device one (Sudeep Holla)

   - Add C1 demotion on/off sysfs knob to the intel_idle driver (Artem
     Bityutskiy)

   - Fix typos in two comments in the teo cpuidle governor (Atul Kumar
     Pant)

   - Fix denying of auto suspend in pm_suspend_timer_fn() (Charan Teja
     Kalla)

   - Move debug runtime PM attributes to runtime_attrs[] (Rafael
     Wysocki)

   - Add new devm_ functions for enabling runtime PM and runtime PM
     reference counting (Bence Csókás)

   - Remove size arguments from strscpy() calls in the hibernation core
     code (Thorsten Blum)

   - Adjust the handling of devices with asynchronous suspend enabled
     during system suspend and resume to start resuming them immediately
     after resuming their parents and to start suspending such a device
     immediately after suspending its first child (Rafael Wysocki)

   - Adjust messages printed during tasks freezing to avoid using
     pr_cont() (Andrew Sayers, Paul Menzel)

   - Clean up unnecessary usage of !! in pm_print_times_init() (Zihuan
     Zhang)

   - Add missing wakeup source attribute relax_count to sysfs and remove
     the space character at the end ofi the string produced by
     pm_show_wakelocks() (Zijun Hu)

   - Add configurable pm_test delay for hibernation (Zihuan Zhang)

   - Disable asynchronous suspend in ucsi_ccg_probe() to prevent the
     cypd4226 device on Tegra boards from suspending prematurely (Jon
     Hunter)

   - Unbreak printing PM debug messages during hibernation and clean up
     some related code (Rafael Wysocki)

   - Add a systemd service to run cpupower and change cpupower binding's
     Makefile to use -lcpupower (John B. Wyatt IV, Francesco Poli)"

* tag 'pm-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits)
  cpufreq: CPPC: Add support for autonomous selection
  cpufreq: Update sscanf() to kstrtouint()
  cpufreq: Replace magic number
  OPP: switch to use kmemdup_array()
  PM: freezer: Rewrite restarting tasks log to remove stray *done.*
  PM: runtime: fix denying of auto suspend in pm_suspend_timer_fn()
  cpufreq: drop redundant cpus_read_lock() from store_local_boost()
  cpupower: do not install files to /etc/default/
  cpupower: do not call systemctl at install time
  cpupower: do not write DESTDIR to cpupower.service
  PM: sleep: Introduce pm_sleep_transition_in_progress()
  cpufreq/amd-pstate: Avoid shadowing ret in amd_pstate_ut_check_driver()
  cpufreq: intel_pstate: Document hybrid processor support
  cpufreq: intel_pstate: EAS: Increase cost for CPUs using L3 cache
  cpufreq: intel_pstate: EAS support for hybrid platforms
  PM: EM: Introduce em_adjust_cpu_capacity()
  PM: EM: Move CPU capacity check to em_adjust_new_capacity()
  PM: EM: Documentation: Fix typos in example driver code
  cpufreq: Drop policy locking from cpufreq_policy_is_good_for_eas()
  PM: sleep: Introduce pm_suspend_in_progress()
  ...
2025-05-27 16:48:47 -07:00
Linus Torvalds
3702a515ed ACPI updates for 6.16-rc1
- Fix two ACPICA SLAB cache leaks (Seunghun Han).
 
  - Add EINJv2 get error type action and define Error Injection Actions
    in hex values to avoid inconsistencies between the specification and
    the code (Zaid Alali).
 
  - Fix typo in comments for SRAT structures (Adam Lackorzynski).
 
  - Prevent possible loss of data in ACPICA because of u32 to u8
    conversions (Saket Dumbre).
 
  - Fix reading FFixedHW operation regions in ACPICA (Daniil Tatianin).
 
  - Add support for printing AML arguments when the ACPICA debug level is
    ACPI_LV_TRACE_POINT (Mario Limonciello).
 
  - Drop a stale comment about the file content from actbl2.h (Sudeep
    Holla).
 
  - Apply pack(1) to union aml_resource (Tamir Duberstein).
 
  - Fix overflow check in the ACPICA version of vsnprintf() (gldrk).
 
  - Interpret SIDP structures in DMAR added revision 3.4 of the VT-d
    specification (Alexey Neyman).
 
  - Add typedef and other definitions related to MRRM to ACPICA (Tony
    Luck).
 
  - Add definitions for RIMT to ACPICA (Sunil V L).
 
  - Fix spelling mistake "Incremement" -> "Increment" in the ACPICA
    utilities code (Colin Ian King).
 
  - Add typedef and other definitions for ERDT to ACPICA (Tony Luck).
 
  - Introduce ACPI_NONSTRING and use it (Kees Cook, Ahmed Salem).
 
  - Rename structure and field names of the RAS2 table in actbl2.h (Shiju
    Jose).
 
  - Fix up whitespace in acpica/utcache.c (Zhe Qiao).
 
  - Avoid sequence overread in a call to strncmp() in ap_get_table_length()
    and replace strncpy() with memcpy() in ACPICA in some places (Ahmed
    Salem).
 
  - Update copyright year in all ACPICA files (Saket Dumbre).
 
  - Add __nonstring annotations for unterminated strings in the static
    ACPI tables parsing code (Kees Cook).
 
  - Add support for parsing the MRRM ACPI table and sysfs files to
    describe memory regions listed in it (Tony Luck, Anil Keshavamurthy).
 
  - Remove an (explicitly) unused header file include from the VIOT ACPI
    table parser file (Andy Shevchenko).
 
  - Improve logging around acpi_initialize_tables() (Bartosz Szczepanek).
 
  - Clean up the initialization of CPU data structures in the ACPI
    processor driver (Zhang Rui).
 
  - Remove an obsolete comment regarding the C-states handling in the
    ACPI processor driver (Giovanni Gherdovich).
 
  - Simplify PCC shared memory region handling (Sudeep Holla).
 
  - Rework and extend functions for reading CPPC register values and for
    updating CPPC registers (Lifeng Zheng).
 
  - Add three functions related to autonomous CPU performance state
    selection to the CPPC library (Lifeng Zheng).
 
  - Turn the acpi_pci_root_remap_iospace() fwnode_handle parameter into a
    const pointer (Pei Xiao).
 
  - Round battery capacity percengate in the ACPI battery driver to the
    closest integer to avoid user confusion (shitao).
 
  - Make the ACPI battery driver report the current as a negative number
    to the power supply framework when the battery is discharging as
    documented (Peter Marheine).
 
  - Add TUXEDO InfinityBook Pro AMD Gen9 to the acpi_ec_no_wakeup[] list
    to prevent spurious wakeups from suspend-to-idle (Werner Sembach).
 
  - Convert the APEI EINJ driver to a faux device one (Sudeep Holla, Jon
    Hunter).
 
  - Remove redundant calls to einj_get_available_error_type() from the
    APEI EINJ driver (Zaid Alali).
 
  - Fix a typo for MECHREVO in irq1_edge_low_force_override[] (Mingcong
    Bai).
 
  - Add an LPS0 check() callback to the AMD pinctrl driver and fix up
    config symbol dependencies in it (Mario Limonciello, Rafael Wysocki).
 
  - Avoid initializing the ACPI platform profile driver on non-ACPI
    platforms (Alexandre Ghiti).
 
  - Document that references to ACPI data (non-device) nodes should use
    string-only references in hierarchical data node packages (Sakari
    Ailus).
 
  - Fail the ACPI bus registration if acpi_kobj registration fails (Armin
    Wolf).
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmg0rlUSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO16GUH/jtsD33u0tSopncGA+MYQO/rd3MA1sW0
 2W9NACvdPDtoOpDiKLa3ptw6FaicJMrPvghZ6LMZ1Wjt3JKqjGZLtj5zzZcCaCxt
 34UOwr1V4FUDNQ7mfXRMRdwDnGj5UCaBLV80fekwr8gYhfEWM5F9MpeRuFR4QOAf
 HGvmtI8NCAaYn2utHSvP/XkH2+C257k5lbdt1PMnQg7E7iQBoz3Nn39EuQ2MrkFN
 1YEZzBR2wZkfQRTf1ZeK6SFKixNKCtDAJj7YcnKhwafj5YKvynr8EnuMDNc9B4Xh
 dFGj9P/Jkxyol7sJqC+CjdyhDCY24cSBbk/+naCsInD0taSdK/FJW+w=
 =/Ek0
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "The most significant part of these changes is an ACPICA update
  covering two upstream ACPICA releases, 20241212 and 20250404, that
  have not been included into the kernel code base yet.

  Among other things, it adds definitions needed to address GCC 15's
  -Wunterminated-string-initialization warnings, adds support for three
  new tables (MRRM, ERDT, RIMT), extends support for two tables (RAS2,
  DMAR), and fixes some issues.

  On top of the above, there is a new parser for the MRRM table, more
  changes related to GCC 15's -Wunterminated-string-initialization
  warnings, a CPPC library update including functions related to
  autonomous CPU performance state selection, a couple of new quirks,
  some assorted fixes and some code cleanups.

  Specifics:

   - Fix two ACPICA SLAB cache leaks (Seunghun Han)

   - Add EINJv2 get error type action and define Error Injection Actions
     in hex values to avoid inconsistencies between the specification
     and the code (Zaid Alali)

   - Fix typo in comments for SRAT structures (Adam Lackorzynski)

   - Prevent possible loss of data in ACPICA because of u32 to u8
     conversions (Saket Dumbre)

   - Fix reading FFixedHW operation regions in ACPICA (Daniil Tatianin)

   - Add support for printing AML arguments when the ACPICA debug level
     is ACPI_LV_TRACE_POINT (Mario Limonciello)

   - Drop a stale comment about the file content from actbl2.h (Sudeep
     Holla)

   - Apply pack(1) to union aml_resource (Tamir Duberstein)

   - Fix overflow check in the ACPICA version of vsnprintf() (gldrk)

   - Interpret SIDP structures in DMAR added revision 3.4 of the VT-d
     specification (Alexey Neyman)

   - Add typedef and other definitions related to MRRM to ACPICA (Tony
     Luck)

   - Add definitions for RIMT to ACPICA (Sunil V L)

   - Fix spelling mistake "Incremement" -> "Increment" in the ACPICA
     utilities code (Colin Ian King)

   - Add typedef and other definitions for ERDT to ACPICA (Tony Luck)

   - Introduce ACPI_NONSTRING and use it (Kees Cook, Ahmed Salem)

   - Rename structure and field names of the RAS2 table in actbl2.h
     (Shiju Jose)

   - Fix up whitespace in acpica/utcache.c (Zhe Qiao)

   - Avoid sequence overread in a call to strncmp() in
     ap_get_table_length() and replace strncpy() with memcpy() in ACPICA
     in some places (Ahmed Salem)

   - Update copyright year in all ACPICA files (Saket Dumbre)

   - Add __nonstring annotations for unterminated strings in the static
     ACPI tables parsing code (Kees Cook)

   - Add support for parsing the MRRM ACPI table and sysfs files to
     describe memory regions listed in it (Tony Luck, Anil
     Keshavamurthy)

   - Remove an (explicitly) unused header file include from the VIOT
     ACPI table parser file (Andy Shevchenko)

   - Improve logging around acpi_initialize_tables() (Bartosz
     Szczepanek)

   - Clean up the initialization of CPU data structures in the ACPI
     processor driver (Zhang Rui)

   - Remove an obsolete comment regarding the C-states handling in the
     ACPI processor driver (Giovanni Gherdovich)

   - Simplify PCC shared memory region handling (Sudeep Holla)

   - Rework and extend functions for reading CPPC register values and
     for updating CPPC registers (Lifeng Zheng)

   - Add three functions related to autonomous CPU performance state
     selection to the CPPC library (Lifeng Zheng)

   - Turn the acpi_pci_root_remap_iospace() fwnode_handle parameter into
     a const pointer (Pei Xiao)

   - Round battery capacity percengate in the ACPI battery driver to the
     closest integer to avoid user confusion (shitao)

   - Make the ACPI battery driver report the current as a negative
     number to the power supply framework when the battery is
     discharging as documented (Peter Marheine)

   - Add TUXEDO InfinityBook Pro AMD Gen9 to the acpi_ec_no_wakeup[]
     list to prevent spurious wakeups from suspend-to-idle (Werner
     Sembach)

   - Convert the APEI EINJ driver to a faux device one (Sudeep Holla,
     Jon Hunter)

   - Remove redundant calls to einj_get_available_error_type() from the
     APEI EINJ driver (Zaid Alali)

   - Fix a typo for MECHREVO in irq1_edge_low_force_override[] (Mingcong
     Bai)

   - Add an LPS0 check() callback to the AMD pinctrl driver and fix up
     config symbol dependencies in it (Mario Limonciello, Rafael
     Wysocki)

   - Avoid initializing the ACPI platform profile driver on non-ACPI
     platforms (Alexandre Ghiti)

   - Document that references to ACPI data (non-device) nodes should use
     string-only references in hierarchical data node packages (Sakari
     Ailus)

   - Fail the ACPI bus registration if acpi_kobj registration fails
     (Armin Wolf)"

* tag 'acpi-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
  ACPI: MRRM: Fix default max memory region
  ACPI: bus: Bail out if acpi_kobj registration fails
  ACPI: platform_profile: Avoid initializing on non-ACPI platforms
  pinctrl: amd: Fix hibernation support with CONFIG_SUSPEND unset
  ACPI: tables: Improve logging around acpi_initialize_tables()
  ACPI: VIOT: Remove (explicitly) unused header
  ACPI: Add documentation for exposing MRRM data
  ACPI: MRRM: Add /sys files to describe memory ranges
  ACPI: MRRM: Minimal parse of ACPI MRRM table
  ACPICA: Update copyright year
  ACPICA: Logfile: Changes for version 20250404
  ACPICA: Replace strncpy() with memcpy()
  ACPICA: Apply ACPI_NONSTRING in more places
  ACPICA: Avoid sequence overread in call to strncmp()
  ACPICA: Adjust the position of code lines
  ACPICA: actbl2.h: ACPI 6.5: RAS2: Rename structure and field names of the RAS2 table
  ACPICA: Apply ACPI_NONSTRING
  ACPICA: Introduce ACPI_NONSTRING
  ACPICA: actbl2.h: ERDT: Add typedef and other definitions
  ACPICA: infrastructure: Add new DMT_BUF types and shorten a long name
  ...
2025-05-27 16:32:30 -07:00
Linus Torvalds
aacc73ceeb gpio updates for v6.16-rc1
GPIO core:
 - use more lock guards where applicable
 - refactor GPIO ACPI code and shrink it in the process by 8%
 - move GPIO ACPI quirks into a separate file
 - remove unneeded #ifdef
 - convert GPIO devres helpers to using devm_add_action() where applicable
   which shrinks and simplifies the code
 - refactor GPIO descriptor validation in GPIO consumer interfaces
 - don't allow setting values on input lines in the GPIO core which will
   take off the burden from GPIO drivers of checking this down the line
 - provide gpiod_is_equal() as a way of safely comparing two GPIO
   descriptors (the only current user is in regulator core)
 
 New drivers:
 - add the GPIO module for the max77759 multifunction device
 - add the GPIO driver for the VeriSilicon BLZP1600 GPIO controller
 - add the GPIO driver for the Spacemit K1 SoC
 
 Driver improvements:
 - convert more drivers to using the new GPIO line value setter callbacks
 - convert more drivers to making the irq_chip immutable as is recommended
   by the interrupt subsystem
 - extend build testing coverage by enabling more modules to be built with
   COMPILE_TEST=y
 - extend the gpio-aggregator module with a configfs interface that makes
   the setup easier for user-space than the existing driver-level sysfs
   attributes and also adds more advanced configuration features (such as
   referring to aggregated lines by their original names or modifying
   their names as exposed by the aggregated chip)
 - add a missing mutex_destroy() in gpio-imx-scu
 - add an OF polarity quirk for s5m8767
 - allow building gpio-vf610 as a loadable module
 - make gpio-mxc not hardcode its GPIO base number with GPIO SYSFS
   interface disabled (another small step towards getting rid of the global
   GPIO numberspace)
 - add support for level-triggered interrupts to gpio-pca953x
 - don't double-check the ngpios property in gpio-ds4520 as GPIO core
   already does it
 - don't double-check the number of GPIOs in gpio-imx-scu as GPIO core
   already does it
 - remove unused callbacks from gpio-max3191x
 
 DT bindings:
 - add device-tree bindings for max77759, spacemit,k1 and blzp1600 (new
   drivers added this cycle)
 - document more properties for gpio-vf610 and gpio-tegra186
 - document a new pca95xx variant
 - fix style of examples in several GPIO DT-binding documents
 
 Misc:
 - TODO list updates
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmg0NtQACgkQEacuoBRx
 13Iolg/+P8fe1hTek+UgdKm/EAQ1Mn3oijNE1Ix15VD8Iqacu+URyB2SJMFcg27n
 S/tsuwogQeQmdgXPfYDJkQmiZEyln/ytWf5W2lNwYhGfGujVa8h1FueB7Wb8Zs7G
 PNMnobyAIGivodJfvikDEyczMuxhkOH04ZOT7UpTSPI47BSGsujX/1vgmRQLid1Z
 3wFDJ0yDhVcuxit/VC+LzFpHIV0MiRzGpvHzYid5jjEaGSiRMpHixf27VJGc0gG1
 IJLkhNkwZ3InisWVGvqdRg/FUNErRYKYQSARb4AjCU+/y1H0SWdB0R6sZDTZpP+e
 YqAc8FW31Lw1L7PWBLRTaVS3KT868tdXDCsArNzfBbb3u/WikO2GY/AXuzveZatp
 pHwyPA0JS9QvxaTXU9yjCpGqdNfjbrmU5OkZxTTe+Nyz84fUfiURiE8g4Rl6riy4
 fNzaywRBmVZlEECWSWGzyNw9ZEYDRPZ1ZHmOA+8FWE+/XKJIsVf8w3x2QIC5b/HO
 hYKH4mar8oiEYJFZqoko3iQURJq+AD9wILCNpws5bSsi//VyyNT0mZV/q5hj7+Xx
 pqeEGDInvycN5fDWWJlkN1lj5dDyHZi4uus05mYI9Ec+eX3XNWRUHXUskbpzdgCs
 XepjP9kFQmMSL7y4z2d7tLd7gFup/uGny7o/KyMsIPDw7qVL5rY=
 =PQqp
 -----END PGP SIGNATURE-----

Merge tag 'gpio-updates-for-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio updates from Bartosz Golaszewski:
 "We have three new drivers, some refactoring in the GPIO core, lots of
  various changes across many drivers, new configfs interface for the
  virtual gpio-aggregator module and DT-bindings updates.

  The treewide conversion of GPIO drivers to using the new value setter
  callbacks is ongoing with another round of GPIO drivers updated. You
  will also see these commits coming in from other subsystems as with
  the relevant changes merged into mainline last cycle, I've started
  converting GPIO providers located elsewhere than drivers/gpio/.

  GPIO core:
   - use more lock guards where applicable
   - refactor GPIO ACPI code and shrink it in the process by 8%
   - move GPIO ACPI quirks into a separate file
   - remove unneeded #ifdef
   - convert GPIO devres helpers to using devm_add_action() where
     applicable which shrinks and simplifies the code
   - refactor GPIO descriptor validation in GPIO consumer interfaces
   - don't allow setting values on input lines in the GPIO core which
     will take off the burden from GPIO drivers of checking this down
     the line
   - provide gpiod_is_equal() as a way of safely comparing two GPIO
     descriptors (the only current user is in regulator core)

  New drivers:
   - add the GPIO module for the max77759 multifunction device
   - add the GPIO driver for the VeriSilicon BLZP1600 GPIO controller
   - add the GPIO driver for the Spacemit K1 SoC

  Driver improvements:
   - convert more drivers to using the new GPIO line value setter
     callbacks
   - convert more drivers to making the irq_chip immutable as is
     recommended by the interrupt subsystem
   - extend build testing coverage by enabling more modules to be built
     with COMPILE_TEST=y
   - extend the gpio-aggregator module with a configfs interface that
     makes the setup easier for user-space than the existing
     driver-level sysfs attributes and also adds more advanced
     configuration features (such as referring to aggregated lines by
     their original names or modifying their names as exposed by the
     aggregated chip)
   - add a missing mutex_destroy() in gpio-imx-scu
   - add an OF polarity quirk for s5m8767
   - allow building gpio-vf610 as a loadable module
   - make gpio-mxc not hardcode its GPIO base number with GPIO SYSFS
     interface disabled (another small step towards getting rid of the
     global GPIO numberspace)
   - add support for level-triggered interrupts to gpio-pca953x
   - don't double-check the ngpios property in gpio-ds4520 as GPIO core
     already does it
   - don't double-check the number of GPIOs in gpio-imx-scu as GPIO core
     already does it
   - remove unused callbacks from gpio-max3191x

  DT bindings:
   - add device-tree bindings for max77759, spacemit,k1 and blzp1600
     (new drivers added this cycle)
   - document more properties for gpio-vf610 and gpio-tegra186
   - document a new pca95xx variant
   - fix style of examples in several GPIO DT-binding documents

  Misc:
   - TODO list updates"

* tag 'gpio-updates-for-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (123 commits)
  gpio: timberdale: select GPIOLIB_IRQCHIP
  gpio: lpc18xx: select GPIOLIB_IRQCHIP
  gpio: grgpio: select GPIOLIB_IRQCHIP
  gpio: bcm-kona: select GPIOLIB_IRQCHIP
  dt-bindings: gpio: vf610: add ngpios and gpio-reserved-ranges
  gpio: davinci: select GPIOLIB_IRQCHIP
  gpiolib-acpi: Update file references in the Documentation and MAINTAINERS
  gpiolib: acpi: Move quirks to a separate file
  gpiolib: acpi: Add acpi_gpio_need_run_edge_events_on_boot() getter
  gpiolib: acpi: Handle deferred list via new API
  gpiolib: acpi: Make sure we fill struct acpi_gpio_info
  gpiolib: acpi: Switch to use enum in acpi_gpio_in_ignore_list()
  gpiolib: acpi: Use temporary variable for struct acpi_gpio_info
  gpiolib: remove unneeded #ifdef
  gpio: mpc8xxx: select GPIOLIB_IRQCHIP
  gpio: pxa: select GPIOLIB_IRQCHIP
  gpio: pxa: Make irq_chip immutable
  gpio: timberdale: Make irq_chip immutable
  gpio: xgene-sb: Make irq_chip immutable
  gpio: davinci: Make irq_chip immutable
  ...
2025-05-27 15:22:01 -07:00
Yonghong Song
5ffb537e41 selftests/bpf: Add tests with stack ptr register in conditional jmp
Add two tests:
  - one test has 'rX <op> r10' where rX is not r10, and
  - another test has 'rX <op> rY' where rX and rY are not r10
    but there is an early insn 'rX = r10'.

Without previous verifier change, both tests will fail.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250524041340.4046304-1-yonghong.song@linux.dev
2025-05-27 14:09:12 -07:00
Namhyung Kim
0e71bcdcf1 perf test: Add AMD IBS sw filter test
The kernel v6.14 added 'swfilt' to support privilege filtering in
software so that IBS can be used by regular users.  Add a test case in
x86 to verify the behavior.

  $ sudo perf test -vv 'IBS software filter'
  113: AMD IBS software filtering:
  --- start ---
  test child forked, pid 178826
  check availability of IBS swfilt
  run perf record with modifier and swfilt
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 0.000 MB /dev/null ]
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 0.000 MB /dev/null ]
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 0.000 MB /dev/null ]
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 0.000 MB /dev/null ]
  check number of samples with swfilt
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 0.037 MB - ]
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 0.041 MB - ]
  ---- end(0) ----
  113: AMD IBS software filtering                                      : Ok

Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # On a 9950x3d
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250524002754.1266681-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-27 18:07:02 -03:00
Yicong Yang
fa9b3578ed perf mem: Count L2 HITM for c2c statistic
L2 HITM is not counted in c2c statistic decoding. Count it for lcl_hitm
like how we handle L2 Peer snoop.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: CaiJingtao <caijingtao@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yushan Wang <wangyushan12@huawei.com>
Cc: Zeng Tao <prime.zeng@hisilicon.com>
Cc: xueshan2@huawei.com
Link: https://lore.kernel.org/r/20250425033845.57671-4-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-27 18:05:28 -03:00
Yicong Yang
846b62b343 perf arm-spe: Add support for SPE Data Source packet on HiSilicon HIP12
Add data source encoding for HiSilicon HIP12 and coresponding mapping
to the perf's memory data source. This will help to synthesize the data
and support upper layer tools like perf-mem and perf-c2c.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: CaiJingtao <caijingtao@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yushan Wang <wangyushan12@huawei.com>
Cc: Zeng Tao <prime.zeng@hisilicon.com>
Cc: xueshan2@huawei.com
Link: https://lore.kernel.org/r/20250425033845.57671-3-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-27 17:57:58 -03:00
Linus Torvalds
015a99fa76 nolibc changes for v6.16
Highlights:
 
 * New supported architectures: m68k, SPARC (32 and 64 bit)
 * Compatibility with kselftest_harness.h
 * A more robust mechanism to include all of nolibc from each header
 * Split existing features into new headers to simplify adoption
 * Compatibility with UBSAN and it is used in the testsuite
 * Many small new features focussing on usage in kselftests
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTg4lxklFHAidmUs57B+h1jyw5bOAUCaDTMAgAKCRDB+h1jyw5b
 ONycAQD35X+ca33v05Zup3oEne2zZtKqeRlStcFTp2AY03MpngEA46IsX5yBf1Hj
 GgsmC/RFGicF9RjaymgdAB5WOg0CeAc=
 =tjwV
 -----END PGP SIGNATURE-----

Merge tag 'nolibc-20250526-for-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc

Pull nolibc updates from Thomas Weißschuh:

 - New supported architectures: m68k, SPARC (32 and 64 bit)

 - Compatibility with kselftest_harness.h

 - A more robust mechanism to include all of nolibc from each header

 - Split existing features into new headers to simplify adoption

 - Compatibility with UBSAN and it is used in the testsuite

 - Many small new features focussing on usage in kselftests

* tag 'nolibc-20250526-for-6.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc: (83 commits)
  selftests: harness: Stop using setjmp()/longjmp()
  selftests: harness: Add "variant" and "self" to test metadata
  selftests: harness: Add teardown callback to test metadata
  selftests: harness: Move teardown conditional into test metadata
  selftests: harness: Don't set setup_completed for fixtureless tests
  selftests: harness: Implement test timeouts through pidfd
  selftests: harness: Remove dependency on libatomic
  selftests: harness: Remove inline qualifier for wrappers
  selftests: harness: Mark functions without prototypes static
  selftests: harness: Ignore unused variant argument warning
  selftests: harness: Use C89 comment style
  selftests: harness: Add kselftest harness selftest
  selftests/nolibc: drop include guards around standard headers
  tools/nolibc: move NULL and offsetof() to sys/stddef.h
  tools/nolibc: move uname() and friends to sys/utsname.h
  tools/nolibc: move makedev() and friends to sys/sysmacros.h
  tools/nolibc: move getrlimit() and friends to sys/resource.h
  tools/nolibc: move reboot() to sys/reboot.h
  tools/nolibc: move prctl() to sys/prctl.h
  tools/nolibc: move mount() to sys/mount.h
  ...
2025-05-27 11:27:09 -07:00
Linus Torvalds
3e443d1673 A moderately busy cycle for documentation this time around:
- The most significant change is the replacement of the old kernel-doc
   script (a monstrous collection of Perl regexes that predates the Git era)
   with a Python reimplementation.  That, too, is a horrifying collection of
   regexes, but in a much cleaner and more maintainable structure that
   integrates far better with the Sphinx build system.
 
   This change has been in linux-next for the full 6.15 cycle; the small
   number of problems that turned up have been addressed, seemingly to
   everybody's satisfaction.  The Perl kernel-doc script remains in tree (as
   scripts/kernel-doc.pl) and can be used with a command-line option if need
   be.  Unless some reason to keep it around materializes, it will probably
   go away in 6.17.
 
   Credit goes to Mauro Carvalho Chehab for doing all this work.
 
 - Some RTLA documentation updates
 
 - A handful of Chinese translations
 
 - The usual collection of typo fixes, general updates, etc.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmg0j/IPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Yu7sH/1w2LtO8XB/KTRNmuz3tV6KzGtDvQVwqgxB2
 X8bbeJlBtYenvuak66RjCfucOh7Y8Ni3UN0G2BGa67KBAxmZEYc6u+IF4SrJUg5g
 DuS6+ZXgqV4TrjWMRof5LtPS8KbNJLGnqgxSVdEPSBV0jJ13r3gb3/e7X06iNAKR
 X4Nq+h5aa1tCwZTkPOSHHQn4qm3Tb1LQreDSn8gnBn6e8nVJIakNlwaVYkClhI9B
 byvItInv32LPAXPDkcEWITvLNUTiMobTyfBYHOD6i3nImQ+j4ZiMMmOUjiB+0jDO
 UQDvoUa46ipXkLBsBOrYEkM/iKXBawMwTa3CcudxR4scvVgATJs=
 =BQ9X
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.16' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "A moderately busy cycle for documentation this time around:

   - The most significant change is the replacement of the old
     kernel-doc script (a monstrous collection of Perl regexes that
     predates the Git era) with a Python reimplementation. That, too, is
     a horrifying collection of regexes, but in a much cleaner and more
     maintainable structure that integrates far better with the Sphinx
     build system.

     This change has been in linux-next for the full 6.15 cycle; the
     small number of problems that turned up have been addressed,
     seemingly to everybody's satisfaction. The Perl kernel-doc script
     remains in tree (as scripts/kernel-doc.pl) and can be used with a
     command-line option if need be. Unless some reason to keep it
     around materializes, it will probably go away in 6.17.

     Credit goes to Mauro Carvalho Chehab for doing all this work.

   - Some RTLA documentation updates

   - A handful of Chinese translations

   - The usual collection of typo fixes, general updates, etc"

* tag 'docs-6.16' of git://git.lwn.net/linux: (85 commits)
  Docs: doc-guide: update sphinx.rst Sphinx version number
  docs: doc-guide: clarify latest theme usage
  Documentation/scheduler: Fix typo in sched-stats domain field description
  scripts: kernel-doc: prevent a KeyError when checking output
  docs: kerneldoc.py: simplify exception handling logic
  MAINTAINERS: update linux-doc entry to cover new Python scripts
  docs: align with scripts/syscall.tbl migration
  Documentation: NTB: Fix typo
  Documentation: ioctl-number: Update table intro
  docs: conf.py: drop backward support for old Sphinx versions
  Docs: driver-api/basics: add kobject_event interfaces
  Docs: relay: editing cleanups
  docs: fix "incase" typo in coresight/panic.rst
  Fix spelling error for 'parallel'
  docs: admin-guide: fix typos in reporting-issues.rst
  docs: dmaengine: add explanation for DMA_ASYNC_TX capability
  Documentation: leds: improve readibility of multicolor doc
  docs: fix typo in firmware-related section
  docs: Makefile: Inherit PYTHONPYCACHEPREFIX setting as env variable
  Documentation: ioctl-number: Update outdated submission info
  ...
2025-05-27 11:22:19 -07:00
Linus Torvalds
95bf3760eb lkmm: Update documentation
Changes
 -------
 
 * Cross-references, typos, broken URLs (Akira Yokosawa)
 * Clarify SRCU explanation (Uladzislau Rezki)
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmgzV8QTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jMZkD/9aspJ+8yAQPVSGPSZNN6Z0524A/n4A
 FIFKbD1Rm0QGH4QxfK6qhh5L6IouMAJL1GT0Y7Wg5tq/WOUU+ALGuPEmoD9424RL
 BzDO8wSKlswLTzPQ1rkcDdGgbqMxP9CMU7AAxbWsH8qHG95VVwODpHjoidNTODFo
 jfCmsB4vNb+LkEP7Cw2BseRX3FHOoRoWmfKF20aHyXEkCTS6SaLHZ+JydLuqdVzG
 XflasW/5kCmfFjPVyT0ferrvc7adGQ4yGdwufIapM/LVurGmttnYL4w9eJSVAs1n
 tZ/6AEN18Lrnb5x78ZEZjm/CKHO03GZgJTSw8rs6voWRXbN1E89Rd4fyhZIgACPj
 tQkF34hNmNMQKyyab+tpCmLjrJTiWaYMp5TpQWhr2agvLZqJZf6i9oDsXTq/TArT
 uKgrrzjAVlUfX6fcJHASxyCjFoh4KYuAJVTHszzCvG5crp7Fg0Wz7q5D+oM7UDwZ
 jPdMgFMJXJTvFFxBzQTcoeDLI74cIpCzmj0dCZONUoliEISuSciz+Iyv3HzFtF9M
 vRp6PZBr9fT3YWqs6qwYkqT351+12uCigMei5kK/GkQkCDDvUk+dzv/BG9YUIeBb
 Xq3UDLeJ+ACtYJAn/MCQw01TXEe7uc54xGL+hgoxC642myM1Kvu1wd7le1kPt+cK
 djMqbbrrPAdpFg==
 =DRxd
 -----END PGP SIGNATURE-----

Merge tag 'lkmm.2025.05.25a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull lkmm updates from Paul McKenney:
 "Update LKMM documentation:

   - Cross-references, typos, broken URLs (Akira Yokosawa)

   - Clarify SRCU explanation (Uladzislau Rezki)"

* tag 'lkmm.2025.05.25a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  tools/memory-model/Documentation: Fix SRCU section in explanation.txt
  tools/memory-model: docs/references: Remove broken link to imgtec.com
  tools/memory-model: docs/ordering: Fix trivial typos
  tools/memory-model: docs/simple.txt: Fix trivial typos
  tools/memory-model: docs/README: Update introduction of locking.txt
2025-05-27 11:17:45 -07:00
Alexis Lothoré (eBPF Foundation)
149ead9d7e selftests/bpf: enable many-args tests for arm64
Now that support for up to 12 args is enabled for tracing programs on
ARM64, enable the existing tests for this feature on this architecture.

Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Link: https://lore.kernel.org/r/20250527-many_args_arm64-v3-2-3faf7bb8e4a2@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-27 10:50:31 -07:00
Jiayuan Chen
1ae7a84ed8 bpftool: Add support for custom BTF path in prog load/loadall
This patch exposes the btf_custom_path feature to bpftool, allowing users
to specify a custom BTF file when loading BPF programs using prog load or
prog loadall commands.

The argument 'btf_custom_path' in libbpf is used for those kernels that
don't have CONFIG_DEBUG_INFO_BTF enabled but still want to perform CO-RE
relocations.

Suggested-by: Quentin Monnet <qmo@kernel.org>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://lore.kernel.org/r/20250516144708.298652-1-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-27 10:31:20 -07:00
Yonghong Song
92de53d247 selftests/bpf: Add unit tests with __bpf_trap() kfunc
Add some inline-asm tests and C tests where __bpf_trap() or
__builtin_trap() is used in the code. The __builtin_trap()
test is guarded with llvm21 ([1]) since otherwise the compilation
failure will happen.

  [1] https://github.com/llvm/llvm-project/pull/131731

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250523205331.1291734-1-yonghong.song@linux.dev
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-27 10:30:07 -07:00
T.J. Mercier
7594dcb71f selftests/bpf: Add test for open coded dmabuf_iter
Use the same test buffers as the traditional iterator and a new BPF map
to verify the test buffers can be found with the open coded dmabuf
iterator.

Signed-off-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250522230429.941193-6-tjmercier@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-27 09:51:26 -07:00
T.J. Mercier
ae5d2c59ec selftests/bpf: Add test for dmabuf_iter
This test creates a udmabuf, and a dmabuf from the system dmabuf heap,
and uses a BPF program that prints dmabuf metadata with the new
dmabuf_iter to verify they can be found.

Signed-off-by: T.J. Mercier <tjmercier@google.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250522230429.941193-5-tjmercier@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-27 09:51:26 -07:00
Paolo Bonzini
4e02d4f973 KVM SVM changes for 6.16:
- Wait for target vCPU to acknowledge KVM_REQ_UPDATE_PROTECTED_GUEST_STATE to
    fix a race between AP destroy and VMRUN.
 
  - Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for the VM.
 
  - Add support for ALLOWED_SEV_FEATURES.
 
  - Add #VMGEXIT to the set of handlers special cased for CONFIG_RETPOLINE=y.
 
  - Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing features
    that utilize those bits.
 
  - Don't account temporary allocations in sev_send_update_data().
 
  - Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock Threshold.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmgwmwAACgkQOlYIJqCj
 N/1pHw//edW/x838POMeeCN8j39NBKErW9yZoQLhMbzogttRvfoba+xYY9zXyRFx
 8AXB8+2iLtb7pXUohc0eYN0mNqgD0SnoMLqGfn7nrkJafJSUAJHAoZn1Mdom1M1y
 jHvBPbHCMMsgdLV8wpDRqCNWTH+d5W0kcN5WjKwOswVLj1rybVfK7bSLMhvkk1e5
 RrOR4Ewf95/Ag2b36L4SvS1yG9fTClmKeGArMXhEXjy2INVSpBYyZMjVtjHiNzU9
 TjtB2RSM45O+Zl0T2fZdVW8LFhA6kVeL1v+Oo433CjOQE0LQff3Vl14GCANIlPJU
 PiWN/RIKdWkuxStIP3vw02eHzONCcg2GnNHzEyKQ1xW8lmrwzVRdXZzVsc2Dmowb
 7qGykBQ+wzoE0sMeZPA0k/QOSqg2vGxUQHjR7720loLV9m9Tu/mJnS9e179GJKgI
 e1ArSLwKmHpjwKZqU44IQVTZaxSC4Sg2kI670i21ChPgx8+oVkA6I0LFQXymx7uS
 2lbH+ovTlJSlP9fbaJhMwAU2wpSHAyXif/HPjdw2LTH3NdgXzfEnZfTlAWiP65LQ
 hnz5HvmUalW3x9kmzRmeDIAkDnAXhyt3ZQMvbNzqlO5AfS+Tqh4Ed5EFP3IrQAzK
 HQ+Gi0ip+B84t9Tbi6rfQwzTZEbSSOfYksC7TXqRGhNo/DvHumE=
 =k6rK
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-svm-6.16' of https://github.com/kvm-x86/linux into HEAD

KVM SVM changes for 6.16:

 - Wait for target vCPU to acknowledge KVM_REQ_UPDATE_PROTECTED_GUEST_STATE to
   fix a race between AP destroy and VMRUN.

 - Decrypt and dump the VMSA in dump_vmcb() if debugging enabled for the VM.

 - Add support for ALLOWED_SEV_FEATURES.

 - Add #VMGEXIT to the set of handlers special cased for CONFIG_RETPOLINE=y.

 - Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing features
   that utilize those bits.

 - Don't account temporary allocations in sev_send_update_data().

 - Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock Threshold.
2025-05-27 12:15:49 -04:00
Paolo Bonzini
3e0797f6dd KVM selftests changes for 6.16:
- Add support for SNP to the various SEV selftests.
 
  - Add a selftest to verify fastops instructions via forced emulation.
 
  - Add MGLRU support to the access tracking perf test.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmgwmd0ACgkQOlYIJqCj
 N/3WCBAAsQ5zS8e+B1+7xopSz41eCou8L7KBDZZSe4B9TAuT+hMslXBEculyOJqh
 tIlBFlvrQA/hC2tNYla58jIeA6/f08Jq4/sV1URMNvORFKMcIvgnKpxmMJfKujve
 L2iHvJigJs4hoBCXYHCZHTkd5VAtB6j++7y9rqZS+RznM6z6/NI9SalX7pHr7Sri
 DQeaMc71UYJfllvLyLmI+MbQccdLfQ1v4dmkt6pz29K5s0pX9PQYp54+Hu1Z73Te
 aFdrG+CuDchra1jxLFoell5P9bD6nq9SsNBfdf+6VjYk/1MMHP4yX/dAFEtEqMbm
 RJNX95bewY4mms3fj6e9j8jVXDLBiXR2an8yJI8k6CP6VPsIXQn+RG2pUQMcOUj0
 zcWikbfXvfn+ReIoaeReWPyZ7tPMW33mhnHDPy/saWHdZ9sycI4w2DstKgc2pe9E
 e6jI9H5JiH49CoMnue38kwnACNUIIvolJDpWeU6K0vQz4p5k6eUNTMSTEEVZbwiV
 Y8MVqMIf+Cu+y6UY1co5OhH387kFuLgYMC/LIFz/4nOrlopRCAzMvYcFEqo9gIOO
 0+Ls/lkPc/hU5D2f3/20UjAGKVY/GfTwKJDRFptzaYMfmiMWW0pl2zlHagYp1huM
 7k8p0vVh5rFOLUJxiftXC8+jBVyJKXLGgwPxBdLVapFMs9DU9gI=
 =v9bp
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-selftests-6.16' of https://github.com/kvm-x86/linux into HEAD

KVM selftests changes for 6.16:

 - Add support for SNP to the various SEV selftests.

 - Add a selftest to verify fastops instructions via forced emulation.

 - Add MGLRU support to the access tracking perf test.
2025-05-27 12:15:26 -04:00
Paolo Bonzini
ebd38b26ec KVM x86 misc changes for 6.16:
- Unify virtualization of IBRS on nested VM-Exit, and cross-vCPU IBPB, between
    SVM and VMX.
 
  - Advertise support to userspace for WRMSRNS and PREFETCHI.
 
  - Rescan I/O APIC routes after handling EOI that needed to be intercepted due
    to the old/previous routing, but not the new/current routing.
 
  - Add a module param to control and enumerate support for device posted
    interrupts.
 
  - Misc cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmgwmHsACgkQOlYIJqCj
 N/2fYhAAiwKkqQpOWLcGjjezDTnpMqDDbHCffroq0Ttmqfg/cuul1oyZax+9fxBO
 203HUi5VKcG7uAGSpLcMFkPUs9hKnaln2lsDaQD+AnGucdj+JKF5p3INCsSYCo9N
 LVRjRZWtZocxJwHSHX9gU8om0pJ5fBCBG2+7+7XhWRaIqCpJe5k944JotiiOkgZ4
 5sXeITkN2kouFVMI8eD4wQGNXxRxs837SYUlwCnoD3VuuBesOZuEhz/CEL9l8vNY
 keXBLPg7bSW53clKfquNKwXDQRephnZaYoexDebUd+OlZphGhTIPh+C75xPQLWSi
 aYg6W9XDu3TChf4LPxHnJLwLg/rjeKNQARcxrnb3XLpPAtx3i2cKU8pDPhnd4qn0
 +YV5H0dato8bbe+oClGv+oIolM01qfI9SJVoaEhTPu3Rdw9cCQSVFn5t32vG3Vab
 FVxX+seV3+XTmVveD4cjiiMbqtNADwZ/PmHNAi9QCl46DgHR++MLfRtjYuGo1koL
 QOmCg2fWOFYtQT6XPJqZxp1SHYxuawrB4qcO9FNyxTuMChoslYoLAr3mBUj0DvwL
 fdXNof74Ccj8OK9o3uCPOXS92pZz92rz/edy/XmYiCmO+VEwBJFR6IdginZMPX11
 UAc4mAC+KkDTvOcKPPIWEXArOsfQMFKefi+bPOeWUx9/nqpcVws=
 =Vs9P
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-misc-6.16' of https://github.com/kvm-x86/linux into HEAD

KVM x86 misc changes for 6.16:

 - Unify virtualization of IBRS on nested VM-Exit, and cross-vCPU IBPB, between
   SVM and VMX.

 - Advertise support to userspace for WRMSRNS and PREFETCHI.

 - Rescan I/O APIC routes after handling EOI that needed to be intercepted due
   to the old/previous routing, but not the new/current routing.

 - Add a module param to control and enumerate support for device posted
   interrupts.

 - Misc cleanups.
2025-05-27 12:14:36 -04:00
Michal Luczaj
393d070135 vsock/test: Add test for an unexpectedly lingering close()
There was an issue with SO_LINGER: instead of blocking until all queued
messages for the socket have been successfully sent (or the linger timeout
has been reached), close() would block until packets were handled by the
peer.

Add a test to alert on close() lingering when it should not.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250522-vsock-linger-v6-5-2ad00b0e447e@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-27 11:05:22 +02:00
Michal Luczaj
8b07b7e5c2 vsock/test: Introduce enable_so_linger() helper
Add a helper function that sets SO_LINGER. Adapt the caller.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250522-vsock-linger-v6-4-2ad00b0e447e@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-27 11:05:21 +02:00
Michal Luczaj
e78e0596c7 vsock/test: Introduce vsock_wait_sent() helper
Distill the virtio_vsock_sock::bytes_unsent checking loop (ioctl SIOCOUTQ)
and move it to utils. Tweak the comment.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250522-vsock-linger-v6-3-2ad00b0e447e@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-27 11:05:21 +02:00
Jason A. Donenfeld
ca8bf8f383 wireguard: selftests: specify -std=gnu17 for bash
GCC 15 defaults to C23, which bash can't compile under, so specify gnu17
explicitly.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20250521212707.1767879-6-Jason@zx2c4.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-27 09:06:19 +02:00
Jordan Rife
ba3d7b93db wireguard: allowedips: add WGALLOWEDIP_F_REMOVE_ME flag
The current netlink API for WireGuard does not directly support removal
of allowed ips from a peer. A user can remove an allowed ip from a peer
in one of two ways:

1. By using the WGPEER_F_REPLACE_ALLOWEDIPS flag and providing a new
   list of allowed ips which omits the allowed ip that is to be removed.
2. By reassigning an allowed ip to a "dummy" peer then removing that
   peer with WGPEER_F_REMOVE_ME.

With the first approach, the driver completely rebuilds the allowed ip
list for a peer. If my current configuration is such that a peer has
allowed ips 192.168.0.2 and 192.168.0.3 and I want to remove 192.168.0.2
the actual transition looks like this.

[192.168.0.2, 192.168.0.3] <-- Initial state
[]                         <-- Step 1: Allowed ips removed for peer
[192.168.0.3]              <-- Step 2: Allowed ips added back for peer

This is true even if the allowed ip list is small and the update does
not need to be batched into multiple WG_CMD_SET_DEVICE requests, as the
removal and subsequent addition of ips is non-atomic within a single
request. Consequently, wg_allowedips_lookup_dst and
wg_allowedips_lookup_src may return NULL while reconfiguring a peer even
for packets bound for ips a user did not intend to remove leading to
unintended interruptions in connectivity. This presents in userspace as
failed calls to sendto and sendmsg for UDP sockets. In my case, I ran
netperf while repeatedly reconfiguring the allowed ips for a peer with
wg.

/usr/local/bin/netperf -H 10.102.73.72 -l 10m -t UDP_STREAM -- -R 1 -m 1024
send_data: data send error: No route to host (errno 113)
netperf: send_omni: send_data failed: No route to host

While this may not be of particular concern for environments where peers
and allowed ips are mostly static, systems like Cilium manage peers and
allowed ips in a dynamic environment where peers (i.e. Kubernetes nodes)
and allowed ips (i.e. pods running on those nodes) can frequently
change making WGPEER_F_REPLACE_ALLOWEDIPS problematic.

The second approach avoids any possible connectivity interruptions
but is hacky and less direct, requiring the creation of a temporary
peer just to dispose of an allowed ip.

Introduce a new flag called WGALLOWEDIP_F_REMOVE_ME which in the same
way that WGPEER_F_REMOVE_ME allows a user to remove a single peer from
a WireGuard device's configuration allows a user to remove an ip from a
peer's set of allowed ips. This enables incremental updates to a
device's configuration without any connectivity blips or messy
workarounds.

A corresponding patch for wg extends the existing `wg set` interface to
leverage this feature.

$ wg set wg0 peer <PUBKEY> allowed-ips +192.168.88.0/24,-192.168.0.1/32

When '+' or '-' is prepended to any ip in the list, wg clears
WGPEER_F_REPLACE_ALLOWEDIPS and sets the WGALLOWEDIP_F_REMOVE_ME flag on
any ip prefixed with '-'.

Signed-off-by: Jordan Rife <jordan@jrife.io>
[Jason: minor style nits, fixes to selftest, bump of wireguard-tools version]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20250521212707.1767879-5-Jason@zx2c4.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-27 09:06:19 +02:00
WangYuli
e74e9ee2c8 wireguard: selftests: cleanup CONFIG_UBSAN_SANITIZE_ALL
Commit 918327e9b7 ("ubsan: Remove CONFIG_UBSAN_SANITIZE_ALL")
removed the CONFIG_UBSAN_SANITIZE_ALL configuration option.
Eliminate invalid configurations to improve code readability.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20250521212707.1767879-2-Jason@zx2c4.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-27 09:06:19 +02:00
Linus Torvalds
785cdec46e Core x86 updates for v6.16:
Boot code changes:
 
  - A large series of changes to reorganize the x86 boot code into a better isolated
    and easier to maintain base of PIC early startup code in arch/x86/boot/startup/,
    by Ard Biesheuvel.
 
    Motivation & background:
 
 	| Since commit
 	|
 	|    c88d71508e ("x86/boot/64: Rewrite startup_64() in C")
 	|
 	| dated Jun 6 2017, we have been using C code on the boot path in a way
 	| that is not supported by the toolchain, i.e., to execute non-PIC C
 	| code from a mapping of memory that is different from the one provided
 	| to the linker. It should have been obvious at the time that this was a
 	| bad idea, given the need to sprinkle fixup_pointer() calls left and
 	| right to manipulate global variables (including non-pointer variables)
 	| without crashing.
 	|
 	| This C startup code has been expanding, and in particular, the SEV-SNP
 	| startup code has been expanding over the past couple of years, and
 	| grown many of these warts, where the C code needs to use special
 	| annotations or helpers to access global objects.
 
    This tree includes the first phase of this work-in-progress x86 boot code
    reorganization.
 
 Scalability enhancements and micro-optimizations:
 
  - Improve code-patching scalability (Eric Dumazet)
  - Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR (Andrew Cooper)
 
 CPU features enumeration updates:
 
  - Thorough reorganization and cleanup of CPUID parsing APIs (Ahmed S. Darwish)
  - Fix, refactor and clean up the cacheinfo code (Ahmed S. Darwish, Thomas Gleixner)
  - Update CPUID bitfields to x86-cpuid-db v2.3 (Ahmed S. Darwish)
 
 Memory management changes:
 
  - Allow temporary MMs when IRQs are on (Andy Lutomirski)
  - Opt-in to IRQs-off activate_mm() (Andy Lutomirski)
  - Simplify choose_new_asid() and generate better code (Borislav Petkov)
  - Simplify 32-bit PAE page table handling (Dave Hansen)
  - Always use dynamic memory layout (Kirill A. Shutemov)
  - Make SPARSEMEM_VMEMMAP the only memory model (Kirill A. Shutemov)
  - Make 5-level paging support unconditional (Kirill A. Shutemov)
  - Stop prefetching current->mm->mmap_lock on page faults (Mateusz Guzik)
  - Predict valid_user_address() returning true (Mateusz Guzik)
  - Consolidate initmem_init() (Mike Rapoport)
 
 FPU support and vector computing:
 
  - Enable Intel APX support (Chang S. Bae)
  - Reorgnize and clean up the xstate code (Chang S. Bae)
  - Make task_struct::thread constant size (Ingo Molnar)
  - Restore fpu_thread_struct_whitelist() to fix CONFIG_HARDENED_USERCOPY=y
    (Kees Cook)
  - Simplify the switch_fpu_prepare() + switch_fpu_finish() logic (Oleg Nesterov)
  - Always preserve non-user xfeatures/flags in __state_perm (Sean Christopherson)
 
 Microcode loader changes:
 
  - Help users notice when running old Intel microcode (Dave Hansen)
  - AMD: Do not return error when microcode update is not necessary (Annie Li)
  - AMD: Clean the cache if update did not load microcode (Boris Ostrovsky)
 
 Code patching (alternatives) changes:
 
  - Simplify, reorganize and clean up the x86 text-patching code (Ingo Molnar)
  - Make smp_text_poke_batch_process() subsume smp_text_poke_batch_finish()
    (Nikolay Borisov)
  - Refactor the {,un}use_temporary_mm() code (Peter Zijlstra)
 
 Debugging support:
 
  - Add early IDT and GDT loading to debug relocate_kernel() bugs (David Woodhouse)
  - Print the reason for the last reset on modern AMD CPUs (Yazen Ghannam)
  - Add AMD Zen debugging document (Mario Limonciello)
  - Fix opcode map (!REX2) superscript tags (Masami Hiramatsu)
  - Stop decoding i64 instructions in x86-64 mode at opcode (Masami Hiramatsu)
 
 CPU bugs and bug mitigations:
 
  - Remove X86_BUG_MMIO_UNKNOWN (Borislav Petkov)
  - Fix SRSO reporting on Zen1/2 with SMT disabled (Borislav Petkov)
  - Restructure and harmonize the various CPU bug mitigation methods
    (David Kaplan)
  - Fix spectre_v2 mitigation default on Intel (Pawan Gupta)
 
 MSR API:
 
  - Large MSR code and API cleanup (Xin Li)
  - In-kernel MSR API type cleanups and renames (Ingo Molnar)
 
 PKEYS:
 
  - Simplify PKRU update in signal frame (Chang S. Bae)
 
 NMI handling code:
 
  - Clean up, refactor and simplify the NMI handling code (Sohil Mehta)
  - Improve NMI duration console printouts (Sohil Mehta)
 
 Paravirt guests interface:
 
  - Restrict PARAVIRT_XXL to 64-bit only (Kirill A. Shutemov)
 
 SEV support:
 
  - Share the sev_secrets_pa value again (Tom Lendacky)
 
 x86 platform changes:
 
  - Introduce the <asm/amd/> header namespace (Ingo Molnar)
  - i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to <asm/amd/fch.h>
    (Mario Limonciello)
 
 Fixes and cleanups:
 
  - x86 assembly code cleanups and fixes (Uros Bizjak)
 
  - Misc fixes and cleanups (Andi Kleen, Andy Lutomirski, Andy Shevchenko,
    Ard Biesheuvel, Bagas Sanjaya, Baoquan He, Borislav Petkov, Chang S. Bae,
    Chao Gao, Dan Williams, Dave Hansen, David Kaplan, David Woodhouse,
    Eric Biggers, Ingo Molnar, Josh Poimboeuf, Juergen Gross, Malaya Kumar Rout,
    Mario Limonciello, Nathan Chancellor, Oleg Nesterov, Pawan Gupta,
    Peter Zijlstra, Shivank Garg, Sohil Mehta, Thomas Gleixner, Uros Bizjak,
    Xin Li)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmgy9WARHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jJSw/+OW2zvAx602doujBIE17vFLU7R10Xwj5H
 lVgomkWCoTNscUZPhdT/iI+/kQF1fG8PtN9oZKUsTAUswldKJsqu7KevobviesiW
 qI+FqH/fhHaIk7GVh9VP65Dgrdki8zsgd7BFxD8pLRBlbZTxTxXNNkuNJrs6LxJh
 SxWp/FVtKo6Wd57qlUcsdo0tilAfcuhlEweFUarX55X2ouhdeHjcGNpxj9dHKOh8
 M7R5yMYFrpfdpSms+WaCnKKahWHaIQtQTsPAyKwoVdtfl1kK+7NgaCF55Gbo3ogp
 r59JwC/CGruDa5QnnDizCwFIwpZw9M52Q1NhP/eLEZbDGB4Yya3b5NW+Ya+6rPvO
 ZZC3e1uUmlxW3lrYflUHurnwrVb2GjkQZOdf0gfnly/7LljIicIS2dk4qIQF9NBd
 sQPpW5hjmIz9CsfeL8QaJW38pQyMsQWznFuz4YVuHcLHvleb3hR+n4fNfV5Lx9bw
 oirVETSIT5hy/msAgShPqTqFUEiVCgp16ow20YstxxzFu/FQ+VG987tkeUyFkPMe
 q1v5yF1hty+TkM4naKendIZ/MJnsrv0AxaegFz9YQrKGL1UPiOajQbSyKbzbto7+
 ozmtN0W80E8n4oQq008j8htpgIhDV91UjF5m33qB82uSqKihHPPTsVcbeg5nZwh2
 ti5g/a1jk94=
 =JgQo
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core x86 updates from Ingo Molnar:
 "Boot code changes:

   - A large series of changes to reorganize the x86 boot code into a
     better isolated and easier to maintain base of PIC early startup
     code in arch/x86/boot/startup/, by Ard Biesheuvel.

     Motivation & background:

  	| Since commit
  	|
  	|    c88d71508e ("x86/boot/64: Rewrite startup_64() in C")
  	|
  	| dated Jun 6 2017, we have been using C code on the boot path in a way
  	| that is not supported by the toolchain, i.e., to execute non-PIC C
  	| code from a mapping of memory that is different from the one provided
  	| to the linker. It should have been obvious at the time that this was a
  	| bad idea, given the need to sprinkle fixup_pointer() calls left and
  	| right to manipulate global variables (including non-pointer variables)
  	| without crashing.
  	|
  	| This C startup code has been expanding, and in particular, the SEV-SNP
  	| startup code has been expanding over the past couple of years, and
  	| grown many of these warts, where the C code needs to use special
  	| annotations or helpers to access global objects.

     This tree includes the first phase of this work-in-progress x86
     boot code reorganization.

  Scalability enhancements and micro-optimizations:

   - Improve code-patching scalability (Eric Dumazet)

   - Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR (Andrew Cooper)

  CPU features enumeration updates:

   - Thorough reorganization and cleanup of CPUID parsing APIs (Ahmed S.
     Darwish)

   - Fix, refactor and clean up the cacheinfo code (Ahmed S. Darwish,
     Thomas Gleixner)

   - Update CPUID bitfields to x86-cpuid-db v2.3 (Ahmed S. Darwish)

  Memory management changes:

   - Allow temporary MMs when IRQs are on (Andy Lutomirski)

   - Opt-in to IRQs-off activate_mm() (Andy Lutomirski)

   - Simplify choose_new_asid() and generate better code (Borislav
     Petkov)

   - Simplify 32-bit PAE page table handling (Dave Hansen)

   - Always use dynamic memory layout (Kirill A. Shutemov)

   - Make SPARSEMEM_VMEMMAP the only memory model (Kirill A. Shutemov)

   - Make 5-level paging support unconditional (Kirill A. Shutemov)

   - Stop prefetching current->mm->mmap_lock on page faults (Mateusz
     Guzik)

   - Predict valid_user_address() returning true (Mateusz Guzik)

   - Consolidate initmem_init() (Mike Rapoport)

  FPU support and vector computing:

   - Enable Intel APX support (Chang S. Bae)

   - Reorgnize and clean up the xstate code (Chang S. Bae)

   - Make task_struct::thread constant size (Ingo Molnar)

   - Restore fpu_thread_struct_whitelist() to fix
     CONFIG_HARDENED_USERCOPY=y (Kees Cook)

   - Simplify the switch_fpu_prepare() + switch_fpu_finish() logic (Oleg
     Nesterov)

   - Always preserve non-user xfeatures/flags in __state_perm (Sean
     Christopherson)

  Microcode loader changes:

   - Help users notice when running old Intel microcode (Dave Hansen)

   - AMD: Do not return error when microcode update is not necessary
     (Annie Li)

   - AMD: Clean the cache if update did not load microcode (Boris
     Ostrovsky)

  Code patching (alternatives) changes:

   - Simplify, reorganize and clean up the x86 text-patching code (Ingo
     Molnar)

   - Make smp_text_poke_batch_process() subsume
     smp_text_poke_batch_finish() (Nikolay Borisov)

   - Refactor the {,un}use_temporary_mm() code (Peter Zijlstra)

  Debugging support:

   - Add early IDT and GDT loading to debug relocate_kernel() bugs
     (David Woodhouse)

   - Print the reason for the last reset on modern AMD CPUs (Yazen
     Ghannam)

   - Add AMD Zen debugging document (Mario Limonciello)

   - Fix opcode map (!REX2) superscript tags (Masami Hiramatsu)

   - Stop decoding i64 instructions in x86-64 mode at opcode (Masami
     Hiramatsu)

  CPU bugs and bug mitigations:

   - Remove X86_BUG_MMIO_UNKNOWN (Borislav Petkov)

   - Fix SRSO reporting on Zen1/2 with SMT disabled (Borislav Petkov)

   - Restructure and harmonize the various CPU bug mitigation methods
     (David Kaplan)

   - Fix spectre_v2 mitigation default on Intel (Pawan Gupta)

  MSR API:

   - Large MSR code and API cleanup (Xin Li)

   - In-kernel MSR API type cleanups and renames (Ingo Molnar)

  PKEYS:

   - Simplify PKRU update in signal frame (Chang S. Bae)

  NMI handling code:

   - Clean up, refactor and simplify the NMI handling code (Sohil Mehta)

   - Improve NMI duration console printouts (Sohil Mehta)

  Paravirt guests interface:

   - Restrict PARAVIRT_XXL to 64-bit only (Kirill A. Shutemov)

  SEV support:

   - Share the sev_secrets_pa value again (Tom Lendacky)

  x86 platform changes:

   - Introduce the <asm/amd/> header namespace (Ingo Molnar)

   - i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to
     <asm/amd/fch.h> (Mario Limonciello)

  Fixes and cleanups:

   - x86 assembly code cleanups and fixes (Uros Bizjak)

   - Misc fixes and cleanups (Andi Kleen, Andy Lutomirski, Andy
     Shevchenko, Ard Biesheuvel, Bagas Sanjaya, Baoquan He, Borislav
     Petkov, Chang S. Bae, Chao Gao, Dan Williams, Dave Hansen, David
     Kaplan, David Woodhouse, Eric Biggers, Ingo Molnar, Josh Poimboeuf,
     Juergen Gross, Malaya Kumar Rout, Mario Limonciello, Nathan
     Chancellor, Oleg Nesterov, Pawan Gupta, Peter Zijlstra, Shivank
     Garg, Sohil Mehta, Thomas Gleixner, Uros Bizjak, Xin Li)"

* tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (331 commits)
  x86/bugs: Fix spectre_v2 mitigation default on Intel
  x86/bugs: Restructure ITS mitigation
  x86/xen/msr: Fix uninitialized variable 'err'
  x86/msr: Remove a superfluous inclusion of <asm/asm.h>
  x86/paravirt: Restrict PARAVIRT_XXL to 64-bit only
  x86/mm/64: Make 5-level paging support unconditional
  x86/mm/64: Make SPARSEMEM_VMEMMAP the only memory model
  x86/mm/64: Always use dynamic memory layout
  x86/bugs: Fix indentation due to ITS merge
  x86/cpuid: Rename hypervisor_cpuid_base()/for_each_possible_hypervisor_cpuid_base() to cpuid_base_hypervisor()/for_each_possible_cpuid_base_hypervisor()
  x86/cpu/intel: Rename CPUID(0x2) descriptors iterator parameter
  x86/cacheinfo: Rename CPUID(0x2) descriptors iterator parameter
  x86/cpuid: Rename cpuid_get_leaf_0x2_regs() to cpuid_leaf_0x2()
  x86/cpuid: Rename have_cpuid_p() to cpuid_feature()
  x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID header
  x86/cpuid: Move CPUID(0x2) APIs into <cpuid/api.h>
  x86/msr: Add rdmsrl_on_cpu() compatibility wrapper
  x86/mm: Fix kernel-doc descriptions of various pgtable methods
  x86/asm-offsets: Export certain 'struct cpuinfo_x86' fields for 64-bit asm use too
  x86/boot: Defer initialization of VM space related global variables
  ...
2025-05-26 16:04:17 -07:00
Linus Torvalds
ddddf9d64f Performance events updates for v6.16:
Core & generic-arch updates:
 
  - Add support for dynamic constraints and propagate it to
    the Intel driver (Kan Liang)
 
  - Fix & enhance driver-specific throttling support (Kan Liang)
 
  - Record sample last_period before updating on the
    x86 and PowerPC platforms (Mark Barnett)
 
  - Make perf_pmu_unregister() usable (Peter Zijlstra)
 
  - Unify perf_event_free_task() / perf_event_exit_task_context()
    (Peter Zijlstra)
 
  - Simplify perf_event_release_kernel() and perf_event_free_task()
    (Peter Zijlstra)
 
  - Allocate non-contiguous AUX pages by default (Yabin Cui)
 
 Uprobes updates:
 
  - Add support to emulate NOP instructions (Jiri Olsa)
 
  - selftests/bpf: Add 5-byte NOP uprobe trigger benchmark (Jiri Olsa)
 
 x86 Intel PMU enhancements:
 
  - Support Intel Auto Counter Reload [ACR] (Kan Liang)
 
  - Add PMU support for Clearwater Forest (Dapeng Mi)
 
  - Arch-PEBS preparatory changes: (Dapeng Mi)
 
    - Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
    - Decouple BTS initialization from PEBS initialization
    - Introduce pairs of PEBS static calls
 
 x86 AMD PMU enhancements:
 
  - Use hrtimer for handling overflows in the AMD uncore driver
    (Sandipan Das)
 
  - Prevent UMC counters from saturating (Sandipan Das)
 
 Fixes and cleanups:
 
  - Fix put_ctx() ordering (Frederic Weisbecker)
 
  - Fix irq work dereferencing garbage (Frederic Weisbecker)
 
  - Misc fixes and cleanups (Changbin Du, Frederic Weisbecker,
    Ian Rogers, Ingo Molnar, Kan Liang, Peter Zijlstra, Qing Wang,
    Sandipan Das, Thorsten Blum)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmgy4zoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j6QRAAvQ4GBPrdJLb8oXkLjCmWSp9PfM1h2IW0
 reUrcV0BPRAwz4T60QEU2KyiEjvKxNghR6bNw4i3slAZ8EFwP9eWE/0ZYOo5+W/N
 wv8vsopv/oZd2L2G5TgxDJf+tLPkqnTvp651LmGAbquPFONN1lsya9UHVPnt2qtv
 fvFhjW6D828VoevRcUCsdoEUNlFDkUYQ2c3M1y5H2AI6ILDVxLsp5uYtuVUP+2lQ
 7UI/elqRIIblTGT7G9LvTGiXZMm8T58fe1OOLekT6NdweJ3XEt1kMdFo/SCRYfzU
 eDVVVLSextZfzBXNPtAEAlM3aSgd8+4m5sACiD1EeOUNjo5J9Sj1OOCa+bZGF/Rl
 XNv5Kcp6Kh1T4N5lio8DE/NabmHDqDMbUGfud+VTS8uLLku4kuOWNMxJTD1nQ2Zz
 BMfJhP89G9Vk07F9fOGuG1N6mKhIKNOgXh0S92tB7XDHcdJegueu2xh4ZszBL1QK
 JVXa4DbnDj+y0LvnV+A5Z6VILr5RiCAipDb9ascByPja6BbN10Nf9Aj4nWwRTwbO
 ut5OK/fDKmSjEHn1+a42d4iRxdIXIWhXCyxEhH+hJXEFx9htbQ3oAbXAEedeJTlT
 g9QYGAjL96QEd0CqviorV8KyU59nVkEPoLVCumXBZ0WWhNwU6GdAmsW1hLfxQdLN
 sp+XHhfxf8M=
 =tPRs
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:
 "Core & generic-arch updates:

   - Add support for dynamic constraints and propagate it to the Intel
     driver (Kan Liang)

   - Fix & enhance driver-specific throttling support (Kan Liang)

   - Record sample last_period before updating on the x86 and PowerPC
     platforms (Mark Barnett)

   - Make perf_pmu_unregister() usable (Peter Zijlstra)

   - Unify perf_event_free_task() / perf_event_exit_task_context()
     (Peter Zijlstra)

   - Simplify perf_event_release_kernel() and perf_event_free_task()
     (Peter Zijlstra)

   - Allocate non-contiguous AUX pages by default (Yabin Cui)

  Uprobes updates:

   - Add support to emulate NOP instructions (Jiri Olsa)

   - selftests/bpf: Add 5-byte NOP uprobe trigger benchmark (Jiri Olsa)

  x86 Intel PMU enhancements:

   - Support Intel Auto Counter Reload [ACR] (Kan Liang)

   - Add PMU support for Clearwater Forest (Dapeng Mi)

   - Arch-PEBS preparatory changes: (Dapeng Mi)
       - Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
       - Decouple BTS initialization from PEBS initialization
       - Introduce pairs of PEBS static calls

  x86 AMD PMU enhancements:

   - Use hrtimer for handling overflows in the AMD uncore driver
     (Sandipan Das)

   - Prevent UMC counters from saturating (Sandipan Das)

  Fixes and cleanups:

   - Fix put_ctx() ordering (Frederic Weisbecker)

   - Fix irq work dereferencing garbage (Frederic Weisbecker)

   - Misc fixes and cleanups (Changbin Du, Frederic Weisbecker, Ian
     Rogers, Ingo Molnar, Kan Liang, Peter Zijlstra, Qing Wang, Sandipan
     Das, Thorsten Blum)"

* tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  perf/headers: Clean up <linux/perf_event.h> a bit
  perf/uapi: Clean up <uapi/linux/perf_event.h> a bit
  perf/uapi: Fix PERF_RECORD_SAMPLE comments in <uapi/linux/perf_event.h>
  mips/perf: Remove driver-specific throttle support
  xtensa/perf: Remove driver-specific throttle support
  sparc/perf: Remove driver-specific throttle support
  loongarch/perf: Remove driver-specific throttle support
  csky/perf: Remove driver-specific throttle support
  arc/perf: Remove driver-specific throttle support
  alpha/perf: Remove driver-specific throttle support
  perf/apple_m1: Remove driver-specific throttle support
  perf/arm: Remove driver-specific throttle support
  s390/perf: Remove driver-specific throttle support
  powerpc/perf: Remove driver-specific throttle support
  perf/x86/zhaoxin: Remove driver-specific throttle support
  perf/x86/amd: Remove driver-specific throttle support
  perf/x86/intel: Remove driver-specific throttle support
  perf: Only dump the throttle log for the leader
  perf: Fix the throttle logic for a group
  perf/core: Add the is_event_in_freq_mode() helper to simplify the code
  ...
2025-05-26 15:40:23 -07:00
Linus Torvalds
3ba121c9f3 objtool changes for v6.15:
- Speed up SHT_GROUP reindexing (Josh Poimboeuf)
 
  - Fix up st_info in COMDAT group section (Rong Xu)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmgy3XURHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIow/7BgUA+vah2719bPl8ET2qkpE51yolc56Z
 8prJT1z5s9eFRakFpVQoj036Iy4O+damzTJxiEfDCXn9bMqCRAZtUNC5xsA0yn4N
 pROlioyPMFaRLjULdA5pO8HLBVh5vOG712q0Pp1+E5JPiy1qHmWq15MgxBR5QY/N
 8AZ9jFnE4xr7kLPnfC1AkCP9CsCcGFJThkkFEiVatBTrn5uDZO0IDdn6rRfoof2T
 BY24eMEXiwBvvtMJ/iMBMm1ZM1cfWZgDal1M8vsrfVsL4b8DERgghP3Pic1NboPB
 6r1syLUGtigP3K6SCtClfEG3sBpxip9rdI5wcivnRUjOwfZXPF4F8VDivyK8XAdz
 Y1E7o6/ZZp49Iku2L6sA8oyDMZ6PaKDH1JOByZ+Oz0/3v5bpXUjLnDSFqLSXldIg
 hU0U1shTc8h3pi1ewEjgsTpKsCfRoeGoihhgguWY8Z5eHQ52T8SdSgHuSI+KJ27m
 TKeqcMdjJmQwUvY6pSUuYlI7Co2PjlGUhDPW93OsGa7gi5qc78yPfYFkn9cvRd/0
 7SR5KFkNd/6KMq+dhZCbwulB7GxFh/8VctTXwzH8ghNa1dGfOjXRDYUOdpZ3ZiEA
 csEB2XKgXvKSTId09SCwS0VE+sq1swVjFGFbkhlD4tSosq+AuF+CwxWD6V/MjuE/
 Dpp1JUBbpJ4=
 =eXgV
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - Speed up SHT_GROUP reindexing (Josh Poimboeuf)

 - Fix up st_info in COMDAT group section (Rong Xu)

* tag 'objtool-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Speed up SHT_GROUP reindexing
  objtool: Fix up st_info in COMDAT group section
2025-05-26 15:13:53 -07:00
Linus Torvalds
b3570b00dc Locking changes for v6.16:
Futexes:
 
    - Add support for task local hash maps (Sebastian Andrzej Siewior,
      Peter Zijlstra)
 
    - Implement the FUTEX2_NUMA ABI, which feature extends the futex
      interface to be NUMA-aware. On NUMA-aware futexes a second u32
      word containing the NUMA node is added to after the u32 futex value
      word. (Peter Zijlstra)
 
    - Implement the FUTEX2_MPOL ABI, which feature extends the futex
      interface to be mempolicy-aware as well, to further refine futex
      node mappings and lookups. (Peter Zijlstra)
 
   Locking primitives:
 
    - Misc cleanups (Andy Shevchenko, Borislav Petkov, Colin Ian King,
                     Ingo Molnar, Nam Cao, Peter Zijlstra)
 
   Lockdep:
 
    - Prevent abuse of lockdep subclasses (Waiman Long)
    - Add number of dynamic keys to /proc/lockdep_stats (Waiman Long)
 
 Plus misc cleanups and fixes.
 
 Note that the tree includes the following dependent out-of-subsystem
 changes as well:
 
  - rcuref: Provide rcuref_is_dead()
  - mm: Add vmalloc_huge_node()
  - mm: Add the mmap_read_lock guard to <linux/mmap_lock.h>
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmgy3E8RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1isNw/9FS6+ZReiV3NLHvhwIw8+6U2vV733wLY+
 mFzDk2CRwv2d6xg+QUrhLNI93i2fZnwNvK1f6LcRZMa1pNmwCcEghKgm0G+fRgbv
 skiGrlkUCoEqsDUxRW++/aTBcMo0vqG3NOObnUOrddG2W9tfrR8jq/EwlzB99dO7
 q8qaBNl9W1vLT3gh9/RPP5uKt0NKIf8ObvsyhWCGaywg81h2lC4AHf0Xlj3ZD95T
 TO5jhUhl/muhYtaqxeYPK0gDtCrgFz8NwZdjKx1nyP7Gbko6+L50AvOVXog0SIAU
 nncftvutGJg2ki7dbSYPDoHQrHO0JsF1vUfVZRjaKFebWpFo2yYdNMbITOeXVhSC
 QSpbH2qvyn21nT/YSj9dottHWBoNYBEgrcSf6DO4g0d8A0Jh7egXjQdA852RpeQ0
 LWGYx4rfiKhnjiXlKKQHrURZkcxxa40o+ls3RfFl2/kWA+7aUybvw6nAeDEkV0oL
 s2U0vZxsY37EPWDm40rTe9r4YpPqcB65i9YIesPzhtbcHJVmN0gts0o5l+x53GhR
 CeftFiiUi2nm6JaT+1wGvBDT3hQ8+NZ8GkPSeA6pEJWE3i4KquZlcBZLOSLZ3k/B
 df58zQi99Yun33is5f1kqDNspqvJOg/1nxUK68PgNSdCMKeuZkJYrcmh/rKNnXSC
 f7M1XHoWFb0=
 =La/x
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "Futexes:

   - Add support for task local hash maps (Sebastian Andrzej Siewior,
     Peter Zijlstra)

   - Implement the FUTEX2_NUMA ABI, which feature extends the futex
     interface to be NUMA-aware. On NUMA-aware futexes a second u32 word
     containing the NUMA node is added to after the u32 futex value word
     (Peter Zijlstra)

   - Implement the FUTEX2_MPOL ABI, which feature extends the futex
     interface to be mempolicy-aware as well, to further refine futex
     node mappings and lookups (Peter Zijlstra)

  Locking primitives:

   - Misc cleanups (Andy Shevchenko, Borislav Petkov, Colin Ian King,
     Ingo Molnar, Nam Cao, Peter Zijlstra)

  Lockdep:

   - Prevent abuse of lockdep subclasses (Waiman Long)

   - Add number of dynamic keys to /proc/lockdep_stats (Waiman Long)

  Plus misc cleanups and fixes"

* tag 'locking-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  selftests/futex: Fix spelling mistake "unitiliazed" -> "uninitialized"
  futex: Correct the kernedoc return value for futex_wait_setup().
  tools headers: Synchronize prctl.h ABI header
  futex: Use RCU_INIT_POINTER() in futex_mm_init().
  selftests/futex: Use TAP output in futex_numa_mpol
  selftests/futex: Use TAP output in futex_priv_hash
  futex: Fix kernel-doc comments
  futex: Relax the rcu_assign_pointer() assignment of mm->futex_phash in futex_mm_init()
  futex: Fix outdated comment in struct restart_block
  locking/lockdep: Add number of dynamic keys to /proc/lockdep_stats
  locking/lockdep: Prevent abuse of lockdep subclass
  locking/lockdep: Move hlock_equal() to the respective #ifdeffery
  futex,selftests: Add another FUTEX2_NUMA selftest
  selftests/futex: Add futex_numa_mpol
  selftests/futex: Add futex_priv_hash
  selftests/futex: Build without headers nonsense
  tools/perf: Allow to select the number of hash buckets
  tools headers: Synchronize prctl.h ABI header
  futex: Implement FUTEX2_MPOL
  futex: Implement FUTEX2_NUMA
  ...
2025-05-26 14:42:07 -07:00
Linus Torvalds
ba45037098 linux_kselftest-kunit-6.16-rc1
- Enables qemu_config for riscv32, sparc 64-bit, PowerPC 32-bit BE and
   64-bit LE.
 - Enables CONFIG_SPARC32 to clearly differentiate between sparc 32-bit
   and 64-bit configurations.
 - Enables CONFIG_CPU_BIG_ENDIAN to clearly differentiate between powerpc
   LE and BE configurations.
 - Add feature to list available architectures to kunit tool.
 - Fixes to bugs and changes to documentation.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmgwylUACgkQCwJExA0N
 Qxz1/w/+NVxyC80omepHOxoi8Y+j5vWSlQ3S4n+HThyeb4srJlvJuidZN+zrRIEU
 QLKo4edUDa25I+JkXnnw/8gLG1000PESM/VjC6NIzhCrEHc/cX/WS9OlwbR5JkpW
 wOlxju3vWYrfy5i+LSYiH+v822WlSZsznH9AM9w+ws4FWoRNN+7LiEitj2LElrWI
 XHpEYDsBhePu2cFQgAdHaQ5YcMmG64KqeX5Xj+cGjtURciijQDxVuX5k03n0qkbn
 YrH4U0it8ZqxJTImpExnqlP2G8uZS72hE91vK0hwW9oZ+/k9vVeBQWJ13JvNKveb
 +zHGUimJ0n6d5HB5ZFaUAazkbZHKq7mWPp86lBYNiRXsseDwwqDSva5tfUxzXGQP
 MNDK2wrHusH0beRdzdlPjynJABxiOnczmBOAj/Y6t+ZgC2D8BHvyvj+Pecdz3Tr0
 YPLgpY72LqxTKDCvOf91IeT8EFaAjyGfVcN6SvUJ/zjPc7UcopM2DEQvlSZcU/Ac
 EAzgWmFfkwowSWwr8Q4GOXuKzkMj65QhouJrSCoBjuHQuUxa2MZxAbv5snPmSLs0
 635MdPU9uPAEvZKJTbVlXKBMCM3FrS8Buy+dHFEuUiUy0i3PndV5aJUQpIxwM/Du
 +DjQkREy6wsIHXSTu4lm+zoZ0yVKQdEZJgsYjY+JfRQ25cLWgiQ=
 =32OH
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - Enable qemu_config for riscv32, sparc 64-bit, PowerPC 32-bit BE and
   64-bit LE

 - Enable CONFIG_SPARC32 to clearly differentiate between sparc 32-bit
   and 64-bit configurations

 - Enable CONFIG_CPU_BIG_ENDIAN to clearly differentiate between powerpc
   LE and BE configurations

 - Add feature to list available architectures to kunit tool

 - Fixes to bugs and changes to documentation

* tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Fix wrong parameter to kunit_deactivate_static_stub()
  kunit: tool: add test counts to JSON output
  Documentation: kunit: improve example on testing static functions
  kunit: executor: Remove const from kunit_filter_suites() allocation type
  kunit: qemu_configs: Disable faulting tests on 32-bit SPARC
  kunit: qemu_configs: Add 64-bit SPARC configuration
  kunit: qemu_configs: sparc: Explicitly enable CONFIG_SPARC32=y
  kunit: qemu_configs: Add PowerPC 32-bit BE and 64-bit LE
  kunit: qemu_configs: powerpc: Explicitly enable CONFIG_CPU_BIG_ENDIAN=y
  kunit: tool: Implement listing of available architectures
  kunit: qemu_configs: Add riscv32 config
  kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests
2025-05-26 14:29:44 -07:00
Linus Torvalds
2d2435e1c8 linux_kselftest-next-6.16-rc1
-- Fixes
    - cpufreq test to not double suspend in rtcwake case.
    - compile error in pid_namespace test.
    - run_kselftest.sh to use readlink if realpath is not available.
    - cpufreq basic read and update testcases.
    - ftrace to add poll to a gen_file so test can find it at run-time.
    - spelling errors in perf_events test.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmgw1J8ACgkQCwJExA0N
 QxyHxg//Um5AMX2oPocPfKTctrvxgwdu9dsk1Cm1DfXwtaH6BZ2R34+fqlkNpBBO
 nRKhF17ArTslRNjTlnDZ/nh+7Pg7AU/ikyahT607div9DEQM4bXhuIBjd4KDKANK
 KlHrlbfP5TArGx81+OzuiRvLhVBO8JN5O5G9EtY1/NU5p6EhMYAt03n89/OQLe1T
 Lr3NIHsLOeQSk+jNW4mxAXRSJ0siQBrYls4SeeNKd+B/7gRHBVD7IrGXc6ruHMoB
 8DJDXgSjYrRXy3Do6jYhFvIGHap0pELYN3FpluEU5uW+K+CJuLWQWI1O6/pjIpc9
 Zk/S2JI8yg+6PMzLMvy1YtcUIM1oSV8wx9uWqR5thyFzJdeMcP5k+yXP6ErbHWTB
 ncSlqzEe88pI9ywxmiaqiK0Jf+IBArW1agcCDkORpJ2KCz5GeZU7wGPYH/Ohn8Y4
 7Ooqxb4jH5ZBofvSAM+mEjmPypS7VhQRa8pU0Ied+O79c+wyiK2su5u0pGpwZeW/
 4X2Q393OhT9g8aI2ZFsQYOYLTQuPnrbLQhyg7XqcmkZb5vAwVK8GNKaGNjb+qhz4
 sF1xA5ySRObj+5MHcR+JFtYKoPoyJVzkOblwvugLuAE8EGTlRIxXF5hJv/f/U/RD
 OE6u3dJX8O0Qa9ZJKX3b5Rt+qEG228/VxUQ3qILv5ikWnqkVL3w=
 =9kTs
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-next-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest updates from Shuah Khan:
 "Fixes:

   - cpufreq test to not double suspend in rtcwake case

   - compile error in pid_namespace test

   - run_kselftest.sh to use readlink if realpath is not available

   - cpufreq basic read and update testcases

   - ftrace to add poll to a gen_file so test can find it at run-time

   - spelling errors in perf_events test"

* tag 'linux_kselftest-next-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/run_kselftest.sh: Use readlink if realpath is not available
  selftests/timens: timerfd: Use correct clockid type in tclock_gettime()
  selftests/timens: Make run_tests() functions static
  selftests/timens: Print TAP headers
  selftests: pid_namespace: add missing sys/mount.h include in pid_max.c
  kselftest: cpufreq: Get rid of double suspend in rtcwake case
  selftests/cpufreq: Fix cpufreq basic read and update testcases
  selftests/ftrace: Convert poll to a gen_file
  selftests/perf_events: Fix spelling mistake "sycnhronize" -> "synchronize"
2025-05-26 14:25:23 -07:00
Linus Torvalds
07046958f6 RCU pull request for v6.16
Summary of changes:
 - Removed swake_up_one_online() workaround
 - Reverted an incorrect rcuog wake-up fix from offline softirq
 - Rust RCU Guard methods marked as inline
 - Updated MAINTAINERS with Joel’s and Zqiang's new email address
 - Replaced magic constant in rcu_seq_done_exact() with named constant
 - Added warning mechanism to validate rcu_seq_done_exact()
 - Switched SRCU polling API to use rcu_seq_done_exact()
 - Commented on redundant delta check in rcu_seq_done_exact()
 - Made ->gpwrap tests in rcutorture more frequent
 - Fixed reuse of ARM64 images in rcutorture
 - rcutorture improved to check Kconfig and reader conflict handling
 - Extracted logic from rcu_torture_one_read() for clarity
 - Updated LWN RCU API documentation links
 - Enabled --do-rt in torture.sh for CONFIG_PREEMPT_RT
 - Added tests for SRCU up/down reader primitives
 - Added comments and delays checks in rcutorture
 - Deprecated srcu_read_lock_lite() and srcu_read_unlock_lite() via checkpatch
 - Added --do-normal and --do-no-normal to torture.sh
 - Added RCU Rust binding tests to torture.sh
 - Reduced CPU overcommit and removed MAXSMP/CPUMASK_OFFSTACK in TREE01
 - Replaced kmalloc() with kcalloc() in rcuscale
 - Refined listRCU example code for stale data elimination
 - Fixed hardirq count bug for x86 in cpu_stall_cputime
 - Added safety checks in rcu/nocb for offloaded rdp access
 - Other miscellaneous changes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEcoCIrlGe4gjE06JJqA4nf2o45hAFAmgoF5oACgkQqA4nf2o4
 5hDvVw//TNsJ/g0HTMu02uXMmtFIrgvpTnH7OEGJ+2p/KErrmWYsBJQw41ueLAQL
 Drtq3q9888UFF5LLA43HC88DFmT9uV8V8TmmURH+pZWdmJY1Ekn8UBSBhDPGGpC5
 sGIO2jJKjHN8G7fyJKoPtL9jxKSulHF/XQTIL2pP23jopAIwosoCHVAwGvnGVvBC
 smXfMSu+bd3IifNFroodsqjVXgnNQwWUNboOkz0KfkiiosgZsWWW8DaM3NGjdp+C
 tUHLs1zfC6sgJUjdpokTE3TcNudlMgVlB2Quj5jhh1YvsvedgIJXl4wpR6JVutyN
 F9awKt1AZkyZ+cTp+JpohaWaN9aKfNNG7jZ+rxQ0VcuRh35wmBJtiWNjEtJ38R82
 kTC1RI7MEus+6OZRt92jv5TNSa9t3wHbi5fBjNRiQ8PYq5cibZy7Lyrj2JOK7Zqs
 pgmdUnhQH2Uhf52b+clG5hWO55gEtACY8pin6kNewClcRtz04Jew7gkiYDGka4F4
 EXbuDHSWi25eSb3FzT2BqR72OZcJ0kv747OTp+2yTv2TaBA5p+OD8hvL/WbWC2Ok
 DK1YQ4RgEerTSZ4PbgPtWkNnlf6xjdWBaYNwmo+G/DgfjPoTOy1Jp73Z4b1AqSB5
 IPEQy1d/799QgGTYkbrvRtvWHg8yfOMz3ByZoHg31rafr0AsrXM=
 =6mun
 -----END PGP SIGNATURE-----

Merge tag 'next.2025.05.17a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU updates from Joel Fernandes:
 - Removed swake_up_one_online() workaround
 - Reverted an incorrect rcuog wake-up fix from offline softirq
 - Rust RCU Guard methods marked as inline
 - Updated MAINTAINERS with Joel’s and Zqiang's new email address
 - Replaced magic constant in rcu_seq_done_exact() with named constant
 - Added warning mechanism to validate rcu_seq_done_exact()
 - Switched SRCU polling API to use rcu_seq_done_exact()
 - Commented on redundant delta check in rcu_seq_done_exact()
 - Made ->gpwrap tests in rcutorture more frequent
 - Fixed reuse of ARM64 images in rcutorture
 - rcutorture improved to check Kconfig and reader conflict handling
 - Extracted logic from rcu_torture_one_read() for clarity
 - Updated LWN RCU API documentation links
 - Enabled --do-rt in torture.sh for CONFIG_PREEMPT_RT
 - Added tests for SRCU up/down reader primitives
 - Added comments and delays checks in rcutorture
 - Deprecated srcu_read_lock_lite() and srcu_read_unlock_lite() via checkpatch
 - Added --do-normal and --do-no-normal to torture.sh
 - Added RCU Rust binding tests to torture.sh
 - Reduced CPU overcommit and removed MAXSMP/CPUMASK_OFFSTACK in TREE01
 - Replaced kmalloc() with kcalloc() in rcuscale
 - Refined listRCU example code for stale data elimination
 - Fixed hardirq count bug for x86 in cpu_stall_cputime
 - Added safety checks in rcu/nocb for offloaded rdp access
 - Other miscellaneous changes

* tag 'next.2025.05.17a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (27 commits)
  rcutorture: Fix issue with re-using old images on ARM64
  rcutorture: Remove MAXSMP and CPUMASK_OFFSTACK from TREE01
  rcutorture: Reduce TREE01 CPU overcommit
  torture: Check for "Call trace:" as well as "Call Trace:"
  rcutorture: Perform more frequent testing of ->gpwrap
  torture: Add testing of RCU's Rust bindings to torture.sh
  torture: Add --do-{,no-}normal to torture.sh
  checkpatch: Deprecate srcu_read_lock_lite() and srcu_read_unlock_lite()
  rcutorture: Comment invocations of tick_dep_set_task()
  rcu/nocb: Add Safe checks for access offloaded rdp
  rcuscale: using kcalloc() to relpace kmalloc()
  doc/RCU/listRCU: refine example code for eliminating stale data
  doc: Update LWN RCU API links in whatisRCU.rst
  Revert "rcu/nocb: Fix rcuog wake-up from offline softirq"
  rust: sync: rcu: Mark Guard methods as inline
  rcu/cpu_stall_cputime: fix the hardirq count for x86 architecture
  rcu: Remove swake_up_one_online() bandaid
  MAINTAINERS: Update Zqiang's email address
  rcutorture: Make torture.sh --do-rt use CONFIG_PREEMPT_RT
  srcu: Use rcu_seq_done_exact() for polling API
  ...
2025-05-26 14:20:50 -07:00
Linus Torvalds
14418ddcc2 This update includes the following changes:
API:
 
 - Fix memcpy_sglist to handle partially overlapping SG lists.
 - Use memcpy_sglist to replace null skcipher.
 - Rename CRYPTO_TESTS to CRYPTO_BENCHMARK.
 - Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS.
 - Hide CRYPTO_MANAGER.
 - Add delayed freeing of driver crypto_alg structures.
 
 Compression:
 
 - Allocate large buffers on first use instead of initialisation in scomp.
 - Drop destination linearisation buffer in scomp.
 - Move scomp stream allocation into acomp.
 - Add acomp scatter-gather walker.
 - Remove request chaining.
 - Add optional async request allocation.
 
 Hashing:
 
 - Remove request chaining.
 - Add optional async request allocation.
 - Move partial block handling into API.
 - Add ahash support to hmac.
 - Fix shash documentation to disallow usage in hard IRQs.
 
 Algorithms:
 
 - Remove unnecessary SIMD fallback code on x86 and arm/arm64.
 - Drop avx10_256 xts(aes)/ctr(aes) on x86.
 - Improve avx-512 optimisations for xts(aes).
 - Move chacha arch implementations into lib/crypto.
 - Move poly1305 into lib/crypto and drop unused Crypto API algorithm.
 - Disable powerpc/poly1305 as it has no SIMD fallback.
 - Move sha256 arch implementations into lib/crypto.
 - Convert deflate to acomp.
 - Set block size correctly in cbcmac.
 
 Drivers:
 
 - Do not use sg_dma_len before mapping in sun8i-ss.
 - Fix warm-reboot failure by making shutdown do more work in qat.
 - Add locking in zynqmp-sha.
 - Remove cavium/zip.
 - Add support for PCI device 0x17D8 to ccp.
 - Add qat_6xxx support in qat.
 - Add support for RK3576 in rockchip-rng.
 - Add support for i.MX8QM in caam.
 
 Others:
 
 - Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up.
 - Add new SEV/SNP platform shutdown API in ccp.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmgz47AACgkQxycdCkmx
 i6fvKRAAr4Xa903L0r1Q1P1alQqoFFCqimUWeH72m68LiWynHWi0lUo0z/+tKweg
 mnPStz7/Ha9HRHJjdNCMPnlJqXQDkuH3bIOuBJCwduDuhHo9VGOd46XGzmGMv3gb
 HKuZhI0lk7pznK3CSyD/2nHmbDCHD+7feTZSBMoN9mm875+aSoM6fdxgak8uPFcq
 KbB1L+hObTn2kAPSqRrNOR8/xG2N7hdH8eax7Li+LAtqYNVT5HvWVECsB/CKRPfB
 sgAv3UTzcIFapSSHUHaONppSeoqPAIAeV7SdQhJvlT+EUUR/h/B6+D9OUQQqbphQ
 LBalgTnqMKl0ymDEQFQ6QyYCat9ZfNmDft2WcXEsxc8PxImkgJI1W3B8O51sOjbG
 78D8JqVQ96dleo4FsBhM2wfG0b41JM6zU4raC4vS7a3qsUS+Q1MpehvcS1iORicy
 SpGdE8e7DLlxKhzWyW1xJnbrtMZDC7Sa2hUnxrvP0/xOvRhChKscRVtWcf0a5q7X
 8JmuvwVSOJuSbQ3MeFbQvpo5lR9+0WsNjM6e9miiH6Y7vZUKmWcq2yDp377qVzeh
 7NK6+OwGIQZZExrmtPw2BXwssT9Eg+ks6Y7g2Ne7yzvrjVNfEPY7Cws/5w7p8mRS
 qhrcpbJNFlWgD7YYkmGZFTQ8DCN25ipP8lklO/hbcfchqLE/o1o=
 =O8L5
 -----END PGP SIGNATURE-----

Merge tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Fix memcpy_sglist to handle partially overlapping SG lists
   - Use memcpy_sglist to replace null skcipher
   - Rename CRYPTO_TESTS to CRYPTO_BENCHMARK
   - Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS
   - Hide CRYPTO_MANAGER
   - Add delayed freeing of driver crypto_alg structures

  Compression:
   - Allocate large buffers on first use instead of initialisation in scomp
   - Drop destination linearisation buffer in scomp
   - Move scomp stream allocation into acomp
   - Add acomp scatter-gather walker
   - Remove request chaining
   - Add optional async request allocation

  Hashing:
   - Remove request chaining
   - Add optional async request allocation
   - Move partial block handling into API
   - Add ahash support to hmac
   - Fix shash documentation to disallow usage in hard IRQs

  Algorithms:
   - Remove unnecessary SIMD fallback code on x86 and arm/arm64
   - Drop avx10_256 xts(aes)/ctr(aes) on x86
   - Improve avx-512 optimisations for xts(aes)
   - Move chacha arch implementations into lib/crypto
   - Move poly1305 into lib/crypto and drop unused Crypto API algorithm
   - Disable powerpc/poly1305 as it has no SIMD fallback
   - Move sha256 arch implementations into lib/crypto
   - Convert deflate to acomp
   - Set block size correctly in cbcmac

  Drivers:
   - Do not use sg_dma_len before mapping in sun8i-ss
   - Fix warm-reboot failure by making shutdown do more work in qat
   - Add locking in zynqmp-sha
   - Remove cavium/zip
   - Add support for PCI device 0x17D8 to ccp
   - Add qat_6xxx support in qat
   - Add support for RK3576 in rockchip-rng
   - Add support for i.MX8QM in caam

  Others:
   - Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up
   - Add new SEV/SNP platform shutdown API in ccp"

* tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (382 commits)
  x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  crypto: qat - add missing header inclusion
  crypto: api - Redo lookup on EEXIST
  Revert "crypto: testmgr - Add hash export format testing"
  crypto: marvell/cesa - Do not chain submitted requests
  crypto: powerpc/poly1305 - add depends on BROKEN for now
  Revert "crypto: powerpc/poly1305 - Add SIMD fallback"
  crypto: ccp - Add missing tee info reg for teev2
  crypto: ccp - Add missing bootloader info reg for pspv5
  crypto: sun8i-ce - move fallback ahash_request to the end of the struct
  crypto: octeontx2 - Use dynamic allocated memory region for lmtst
  crypto: octeontx2 - Initialize cptlfs device info once
  crypto: xts - Only add ecb if it is not already there
  crypto: lrw - Only add ecb if it is not already there
  crypto: testmgr - Add hash export format testing
  crypto: testmgr - Use ahash for generic tfm
  crypto: hmac - Add ahash support
  crypto: testmgr - Ignore EEXIST on shash allocation
  crypto: algapi - Add driver template support to crypto_inst_setname
  crypto: shash - Set reqsize in shash_alg
  ...
2025-05-26 13:47:28 -07:00
Paolo Bonzini
1f7c9d52b1 KVM/riscv changes for 6.16
- Add vector registers to get-reg-list selftest
 - VCPU reset related improvements
 - Remove scounteren initialization from VCPU reset
 - Support VCPU reset from userspace using set_mpstate() ioctl
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmgx8nkACgkQrUjsVaLH
 LAdmYw//aoxlXA8OKFMWFzEC4yZHfq502X1p95Z9h1i8IYKoRcqNxX8O5MJt0y8Z
 dyhgqKLT5q5bpUf/yxJjuJdZQJtCTC0o1bx89sSiM83M2J+AZymcIS5yFVIcxetv
 JRsaWh8GDc/XYUFBnjYCj5zmA5IObbd/QCVnkATjVRcRS56LnRv98P9YPxN7zhHF
 OhDRWELtUowk6FMgiHo75R9vYhOO1ywzzjLFnK5ZHLSYeQ7bhRX/jY5n1BShsWjp
 k+zVRTpJobw9AU76EPQaEKuqiynQz35ZPkxAiuWYac6SDKwmztvGJ1fCnSNAkJLk
 0kN/eAxv61F5nCRJxOkxK0Uy3U94zA6zX53VdnLoRN4rYpA8CYrE4mNAddN31IBC
 NHlBS59w2EsUxIRL1FiCUrEKKgeSWqJY1NuqsmB4ogeo4MKm8n1OhiYSU0l7NnQ6
 3h77ccnHN95cah2C9XDX3GeZ+on6z1t6FjZ4Enki1w3CCfGdKMsfphwUckTNzSvw
 hTeXhYHcP4VKsYkCcitdLR/VFwO4a3HlnjAtHdJLh0qfJ5SergifZ5/eQXhHu7f1
 1uyfk/6nKculr/8yzUVkOR7kerMjBuBx8jir89ceKD8qeA+4MUI1rdwEvzbzaEY9
 9aDwEVbQ1qMjpeONfKUvbyLJ7uN5Lae1/X4Kafmo2TWNAvtLPLo=
 =+S8x
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.16-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.16

- Add vector registers to get-reg-list selftest
- VCPU reset related improvements
- Remove scounteren initialization from VCPU reset
- Support VCPU reset from userspace using set_mpstate() ioctl
2025-05-26 16:27:00 -04:00
Paolo Bonzini
4d526b02df KVM/arm64 updates for 6.16
* New features:
 
   - Add large stage-2 mapping support for non-protected pKVM guests,
     clawing back some performance.
 
   - Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
     protected modes.
 
   - Enable nested virtualisation support on systems that support it
     (yes, it has been a long time coming), though it is disabled by
     default.
 
 * Improvements, fixes and cleanups:
 
   - Large rework of the way KVM tracks architecture features and links
     them with the effects of control bits. This ensures correctness of
     emulation (the data is automatically extracted from the published
     JSON files), and helps dealing with the evolution of the
     architecture.
 
   - Significant changes to the way pKVM tracks ownership of pages,
     avoiding page table walks by storing the state in the hypervisor's
     vmemmap. This in turn enables the THP support described above.
 
   - New selftest checking the pKVM ownership transition rules
 
   - Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
     even if the host didn't have it.
 
   - Fixes for the address translation emulation, which happened to be
     rather buggy in some specific contexts.
 
   - Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
     from the number of counters exposed to a guest and addressing a
     number of issues in the process.
 
   - Add a new selftest for the SVE host state being corrupted by a
     guest.
 
   - Keep HCR_EL2.xMO set at all times for systems running with the
     kernel at EL2, ensuring that the window for interrupts is slightly
     bigger, and avoiding a pretty bad erratum on the AmpereOne HW.
 
   - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
     from a pretty bad case of TLB corruption unless accesses to HCR_EL2
     are heavily synchronised.
 
   - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
     tables in a human-friendly fashion.
 
   - and the usual random cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmgwU7UACgkQI9DQutE9
 ekN93g//fNnejxf01dBFIbuylzYEyHZSEH0iTGLeM+ES9zvntCzciTYVzb27oqNG
 RDLShlQYp3w4rAe6ORzyePyHptOmKXCxfj/VXUFp3A7H9QYOxt1nacD3WxI9fCOo
 LzaSLquvgwFBaeTdDE0KdeTUKQHluId+w1Azh0lnHGeUP+lOHNZ8FqoP1/la0q04
 GvVL+l3wz/IhPP8r1YA0Q1bzJ5SLfSpjIw/0F5H/xgI4lyYdHzgFL8sKuSyFeCyM
 2STQi+ZnTCsAs4bkXkw2Pp9CFYrfQgZi+sf7Om+noAKhbJo3vb7/RHpgjv+QCjJy
 Kx4g9CbxHfaM03cH6uSLBoFzsACR1iAuUz8BCSRvvVNH4RVT6H+34nzjLZXLncrP
 gm1uYs9aMTLr91caeAx0aYIMWGYa1uqV0rum3WxyIHezN9Q/NuQoZyfprUufr8oX
 wCYE+ot4VT3DwG0UFZKKwj0BiCbYcbph9nBLVyZJsg8OKxpvspkCtPriFp1kb6BP
 dTTGSXd9JJqwSgP9qJLxijcv6Nfgp2gT42TWwh/dJRZXhnTCvr9IyclFIhoIIq3G
 Q2BkFCXOoEoNQhBA1tiWzJ9nDHf52P72Z2K1gPyyMZwF49HGa2BZBCJGkqX06wSs
 Riolf1/cjFhDno1ThiHKsHT0sG1D4oc9k/1NLq5dyNAEGcgATIA=
 =Jju3
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.16

* New features:

  - Add large stage-2 mapping support for non-protected pKVM guests,
    clawing back some performance.

  - Add UBSAN support to the standalone EL2 object used in nVHE/hVHE and
    protected modes.

  - Enable nested virtualisation support on systems that support it
    (yes, it has been a long time coming), though it is disabled by
    default.

* Improvements, fixes and cleanups:

  - Large rework of the way KVM tracks architecture features and links
    them with the effects of control bits. This ensures correctness of
    emulation (the data is automatically extracted from the published
    JSON files), and helps dealing with the evolution of the
    architecture.

  - Significant changes to the way pKVM tracks ownership of pages,
    avoiding page table walks by storing the state in the hypervisor's
    vmemmap. This in turn enables the THP support described above.

  - New selftest checking the pKVM ownership transition rules

  - Fixes for FEAT_MTE_ASYNC being accidentally advertised to guests
    even if the host didn't have it.

  - Fixes for the address translation emulation, which happened to be
    rather buggy in some specific contexts.

  - Fixes for the PMU emulation in NV contexts, decoupling PMCR_EL0.N
    from the number of counters exposed to a guest and addressing a
    number of issues in the process.

  - Add a new selftest for the SVE host state being corrupted by a
    guest.

  - Keep HCR_EL2.xMO set at all times for systems running with the
    kernel at EL2, ensuring that the window for interrupts is slightly
    bigger, and avoiding a pretty bad erratum on the AmpereOne HW.

  - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers
    from a pretty bad case of TLB corruption unless accesses to HCR_EL2
    are heavily synchronised.

  - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS
    tables in a human-friendly fashion.

  - and the usual random cleanups.
2025-05-26 16:19:46 -04:00
Rafael J. Wysocki
3e0c509fbd Merge branch 'pm-tools'
Merge a cpupower utility update for 6.16-rc1 that adds a systemd service
to run cpupower and changes binding's Makefile to use -lcpupower (John B.
Wyatt IV, Francesco Poli).

* pm-tools:
  cpupower: do not install files to /etc/default/
  cpupower: do not call systemctl at install time
  cpupower: do not write DESTDIR to cpupower.service
  cpupower: change binding's makefile to use -lcpupower
  cpupower: add a systemd service to run cpupower
2025-05-26 21:24:27 +02:00
Rafael J. Wysocki
76524ffd10 Merge branches 'pm-runtime' and 'pm-sleep'
Merge updates related to system sleep handling and runtime PM for 6.16-rc1:

 - Fix denying of auto suspend in pm_suspend_timer_fn() (Charan Teja
   Kalla).

 - Move debug runtime PM attributes to runtime_attrs[] (Rafael Wysocki).

 - Add new devm_ functions for enabling runtime PM and runtime PM
   reference counting (Bence Csókás).

 - Remove size arguments from strscpy() calls in the hibernation core
   code (Thorsten Blum).

 - Adjust the handling of devices with asynchronous suspend enabled
   during system suspend and resume to start resuming them immediately
   after resuming their parents and to start suspending such a device
   immediately after suspending its first child (Rafael Wysocki).

 - Adjust messages printed during tasks freezing to avoid using
   pr_cont() (Andrew Sayers, Paul Menzel).

 - Clean up unnecessary usage of !! in pm_print_times_init() (Zihuan
   Zhang).

 - Add missing wakeup source attribute relax_count to sysfs and
   remove the space character at the end ofi the string produced by
   pm_show_wakelocks() (Zijun Hu).

 - Add configurable pm_test delay for hibernation (Zihuan Zhang).

 - Disable asynchronous suspend in ucsi_ccg_probe() to prevent the
   cypd4226 device on Tegra boards from suspending prematurely (Jon
   Hunter).

 - Unbreak printing PM debug messages during hibernation and clean up
   some related code (Rafael Wysocki).

* pm-runtime:
  PM: runtime: fix denying of auto suspend in pm_suspend_timer_fn()
  PM: sysfs: Move debug runtime PM attributes to runtime_attrs[]
  PM: runtime: Add new devm functions

* pm-sleep:
  PM: freezer: Rewrite restarting tasks log to remove stray *done.*
  PM: sleep: Introduce pm_sleep_transition_in_progress()
  PM: sleep: Introduce pm_suspend_in_progress()
  PM: sleep: Print PM debug messages during hibernation
  ucsi_ccg: Disable async suspend in ucsi_ccg_probe()
  PM: hibernate: add configurable delay for pm_test
  PM: wakeup: Delete space in the end of string shown by pm_show_wakelocks()
  PM: wakeup: Add missing wakeup source attribute relax_count
  PM: sleep: Remove unnecessary !!
  PM: sleep: Use two lines for "Restarting..." / "done" messages
  PM: sleep: Make suspend of devices more asynchronous
  PM: sleep: Suspend async parents after suspending children
  PM: sleep: Resume children after resuming the parent
  PM: hibernate: Remove size arguments when calling strscpy()
2025-05-26 21:21:58 +02:00
Linus Torvalds
6f59de9bc0 for-6.16/block-20250523
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmgwnGYQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpq9aD/4iqOts77xhWWLrOJWkkhOcV5rREeyppq8X
 MKYul9S4cc4Uin9Xou9a+nab31QBQEk3nsN3kX9o3yAXvkh6yUm36HD8qYNW/46q
 IUkwRQQJ0COyTnexMZQNTbZPQDIYcenXmQxOcrEJ5jC1Jcz0sOKHsgekL+ab3kCy
 fLnuz2ozvjGDMala/NmE8fN5qSlj4qQABHgbamwlwfo4aWu07cwfqn5G/FCYJgDO
 xUvsnTVclom2g4G+7eSSvGQI1QyAxl5QpviPnj/TEgfFBFnhbCSoBTEY6ecqhlfW
 6u59MF/Uw8E+weiuGY4L87kDtBhjQs3UMSLxCuwH7MxXb25ff7qB4AIkcFD0kKFH
 3V5NtwqlU7aQT0xOjGxaHhfPwjLD+FVss4ARmuHS09/Kn8egOW9yROPyetnuH84R
 Oz0Ctnt1IPLFjvGeg3+rt9fjjS9jWOXLITb9Q6nX9gnCt7orCwIYke8YCpmnJyhn
 i+fV4CWYIQBBRKxIT0E/GhJxZOmL0JKpomnbpP2dH8npemnsTCuvtfdrK9gfhH2X
 chBVqCPY8MNU5zKfzdEiavPqcm9392lMzOoOXW2pSC1eAKqnAQ86ZT3r7rLntqE8
 75LxHcvaQIsnpyG+YuJVHvoiJ83TbqZNpyHwNaQTYhDmdYpp2d/wTtTQywX4DuXb
 Y6NDJw5+kQ==
 =1PNK
 -----END PGP SIGNATURE-----

Merge tag 'for-6.16/block-20250523' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - ublk updates:
      - Add support for updating the size of a ublk instance
      - Zero-copy improvements
      - Auto-registering of buffers for zero-copy
      - Series simplifying and improving GET_DATA and request lookup
      - Series adding quiesce support
      - Lots of selftests additions
      - Various cleanups

 - NVMe updates via Christoph:
      - add per-node DMA pools and use them for PRP/SGL allocations
        (Caleb Sander Mateos, Keith Busch)
      - nvme-fcloop refcounting fixes (Daniel Wagner)
      - support delayed removal of the multipath node and optionally
        support the multipath node for private namespaces (Nilay Shroff)
      - support shared CQs in the PCI endpoint target code (Wilfred
        Mallawa)
      - support admin-queue only authentication (Hannes Reinecke)
      - use the crc32c library instead of the crypto API (Eric Biggers)
      - misc cleanups (Christoph Hellwig, Marcelo Moreira, Hannes
        Reinecke, Leon Romanovsky, Gustavo A. R. Silva)

 - MD updates via Yu:
      - Fix that normal IO can be starved by sync IO, found by mkfs on
        newly created large raid5, with some clean up patches for bdev
        inflight counters

 - Clean up brd, getting rid of atomic kmaps and bvec poking

 - Add loop driver specifically for zoned IO testing

 - Eliminate blk-rq-qos calls with a static key, if not enabled

 - Improve hctx locking for when a plug has IO for multiple queues
   pending

 - Remove block layer bouncing support, which in turn means we can
   remove the per-node bounce stat as well

 - Improve blk-throttle support

 - Improve delay support for blk-throttle

 - Improve brd discard support

 - Unify IO scheduler switching. This should also fix a bunch of lockdep
   warnings we've been seeing, after enabling lockdep support for queue
   freezing/unfreezeing

 - Add support for block write streams via FDP (flexible data placement)
   on NVMe

 - Add a bunch of block helpers, facilitating the removal of a bunch of
   duplicated boilerplate code

 - Remove obsolete BLK_MQ pci and virtio Kconfig options

 - Add atomic/untorn write support to blktrace

 - Various little cleanups and fixes

* tag 'for-6.16/block-20250523' of git://git.kernel.dk/linux: (186 commits)
  selftests: ublk: add test for UBLK_F_QUIESCE
  ublk: add feature UBLK_F_QUIESCE
  selftests: ublk: add test case for UBLK_U_CMD_UPDATE_SIZE
  traceevent/block: Add REQ_ATOMIC flag to block trace events
  ublk: run auto buf unregisgering in same io_ring_ctx with registering
  io_uring: add helper io_uring_cmd_ctx_handle()
  ublk: remove io argument from ublk_auto_buf_reg_fallback()
  ublk: handle ublk_set_auto_buf_reg() failure correctly in ublk_fetch()
  selftests: ublk: add test for covering UBLK_AUTO_BUF_REG_FALLBACK
  selftests: ublk: support UBLK_F_AUTO_BUF_REG
  ublk: support UBLK_AUTO_BUF_REG_FALLBACK
  ublk: register buffer to local io_uring with provided buf index via UBLK_F_AUTO_BUF_REG
  ublk: prepare for supporting to register request buffer automatically
  ublk: convert to refcount_t
  selftests: ublk: make IO & device removal test more stressful
  nvme: rename nvme_mpath_shutdown_disk to nvme_mpath_remove_disk
  nvme: introduce multipath_always_on module param
  nvme-multipath: introduce delayed removal of the multipath head node
  nvme-pci: derive and better document max segments limits
  nvme-pci: use struct_size for allocation struct nvme_dev
  ...
2025-05-26 11:39:36 -07:00
Linus Torvalds
3e406741b1 vfs-6.16-rc1.selftests
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaDBPUAAKCRCRxhvAZXjc
 ooziAP9vZifOayDPF/do8fG8BQZ3RpOmAeNqkLdGliX6kMOEIAEApfa00Y+EVeaJ
 gyUY+Ui7k5xb5pzlDbHwrZxX4Q6wEwY=
 =MUXL
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.16-rc1.selftests' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs selftests updates from Christian Brauner:
 "This contains various cleanups, fixes, and extensions for out
  filesystem selftests"

* tag 'vfs-6.16-rc1.selftests' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests/fs/mount-notify: add a test variant running inside userns
  selftests/filesystems: create setup_userns() helper
  selftests/filesystems: create get_unique_mnt_id() helper
  selftests/fs/mount-notify: build with tools include dir
  selftests/mount_settattr: remove duplicate syscall definitions
  selftests/pidfd: move syscall definitions into wrappers.h
  selftests/fs/statmount: build with tools include dir
  selftests/filesystems: move wrapper.h out of overlayfs subdir
  selftests/mount_settattr: ensure that ext4 filesystem can be created
  selftests/mount_settattr: add missing STATX_MNT_ID_UNIQUE define
  selftests/mount_settattr: don't define sys_open_tree() twice
2025-05-26 11:32:28 -07:00
Linus Torvalds
c5bfc48d54 vfs-6.16-rc1.coredump
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaDBPTwAKCRCRxhvAZXjc
 oliqAQCVdrBn7D2+dB04hjefFq6W6LhyLGrtCCliflicN5SyxAD+PHHiB9nFKe6J
 xQkaNArCJjPd2QEx73aGjHzi3UQq6Qs=
 =Pk9c
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.16-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull coredump updates from Christian Brauner:
 "This adds support for sending coredumps over an AF_UNIX socket. It
  also makes (implicit) use of the new SO_PEERPIDFD ability to hand out
  pidfds for reaped peer tasks

  The new coredump socket will allow userspace to not have to rely on
  usermode helpers for processing coredumps and provides a saf way to
  handle them instead of relying on super privileged coredumping helpers

  This will also be significantly more lightweight since the kernel
  doens't have to do a fork()+exec() for each crashing process to spawn
  a usermodehelper. Instead the kernel just connects to the AF_UNIX
  socket and userspace can process it concurrently however it sees fit.
  Support for userspace is incoming starting with systemd-coredump

  There's more work coming in that direction next cycle. The rest below
  goes into some details and background

  Coredumping currently supports two modes:

   (1) Dumping directly into a file somewhere on the filesystem.

   (2) Dumping into a pipe connected to a usermode helper process
       spawned as a child of the system_unbound_wq or kthreadd

  For simplicity I'm mostly ignoring (1). There's probably still some
  users of (1) out there but processing coredumps in this way can be
  considered adventurous especially in the face of set*id binaries

  The most common option should be (2) by now. It works by allowing
  userspace to put a string into /proc/sys/kernel/core_pattern like:

          |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h

  The "|" at the beginning indicates to the kernel that a pipe must be
  used. The path following the pipe indicator is a path to a binary that
  will be spawned as a usermode helper process. Any additional
  parameters pass information about the task that is generating the
  coredump to the binary that processes the coredump

  In the example the core_pattern shown causes the kernel to spawn
  systemd-coredump as a usermode helper. There's various conceptual
  consequences of this (non-exhaustive list):

   - systemd-coredump is spawned with file descriptor number 0 (stdin)
     connected to the read-end of the pipe. All other file descriptors
     are closed. That specifically includes 1 (stdout) and 2 (stderr).

     This has already caused bugs because userspace assumed that this
     cannot happen (Whether or not this is a sane assumption is
     irrelevant)

   - systemd-coredump will be spawned as a child of system_unbound_wq.
     So it is not a child of any userspace process and specifically not
     a child of PID 1. It cannot be waited upon and is in a weird hybrid
     upcall which are difficult for userspace to control correctly

   - systemd-coredump is spawned with full kernel privileges. This
     necessitates all kinds of weird privilege dropping excercises in
     userspace to make this safe

   - A new usermode helper has to be spawned for each crashing process

  This adds a new mode:

   (3) Dumping into an AF_UNIX socket

  Userspace can set /proc/sys/kernel/core_pattern to:

          @/path/to/coredump.socket

  The "@" at the beginning indicates to the kernel that an AF_UNIX
  coredump socket will be used to process coredumps

  The coredump socket must be located in the initial mount namespace.
  When a task coredumps it opens a client socket in the initial network
  namespace and connects to the coredump socket:

   - The coredump server uses SO_PEERPIDFD to get a stable handle on the
     connected crashing task. The retrieved pidfd will provide a stable
     reference even if the crashing task gets SIGKILLed while generating
     the coredump. That is a huge attack vector right now

   - By setting core_pipe_limit non-zero userspace can guarantee that
     the crashing task cannot be reaped behind it's back and thus
     process all necessary information in /proc/<pid>. The SO_PEERPIDFD
     can be used to detect whether /proc/<pid> still refers to the same
     process

     The core_pipe_limit isn't used to rate-limit connections to the
     socket. This can simply be done via AF_UNIX socket directly

   - The pidfd for the crashing task will contain information how the
     task coredumps. The PIDFD_GET_INFO ioctl gained a new flag
     PIDFD_INFO_COREDUMP which can be used to retreive the coredump
     information

     If the coredump gets a new coredump client connection the kernel
     guarantees that PIDFD_INFO_COREDUMP information is available.

     Currently the following information is provided in the new
     @coredump_mask extension to struct pidfd_info:

      * PIDFD_COREDUMPED is raised if the task did actually coredump

      * PIDFD_COREDUMP_SKIP is raised if the task skipped coredumping
        (e.g., undumpable)

      * PIDFD_COREDUMP_USER is raised if this is a regular coredump and
        doesn't need special care by the coredump server

      * PIDFD_COREDUMP_ROOT is raised if the generated coredump should
        be treated as sensitive and the coredump server should restrict
        access to the generated coredump to sufficiently privileged
        users"

* tag 'vfs-6.16-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  mips, net: ensure that SOCK_COREDUMP is defined
  selftests/coredump: add tests for AF_UNIX coredumps
  selftests/pidfd: add PIDFD_INFO_COREDUMP infrastructure
  coredump: validate socket name as it is written
  coredump: show supported coredump modes
  pidfs, coredump: add PIDFD_INFO_COREDUMP
  coredump: add coredump socket
  coredump: reflow dump helpers a little
  coredump: massage do_coredump()
  coredump: massage format_corename()
2025-05-26 11:17:01 -07:00
Linus Torvalds
7d7a103d29 vfs-6.16-rc1.pidfs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaDBPTwAKCRCRxhvAZXjc
 ov4zAP4yfqKBAz6eMt9CzDgHCdVQJ9Nuur1EiRdot3maPzHTcQEA2hVkJrvVo1Y/
 jCVAf7gmGX1Uu6nCUF6Vjy35g8i20gA=
 =nzYS
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.16-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull pidfs updates from Christian Brauner:
 "Features:

   - Allow handing out pidfds for reaped tasks for AF_UNIX SO_PEERPIDFD
     socket option

     SO_PEERPIDFD is a socket option that allows to retrieve a pidfd for
     the process that called connect() or listen(). This is heavily used
     to safely authenticate clients in userspace avoiding security bugs
     due to pid recycling races (dbus, polkit, systemd, etc.)

     SO_PEERPIDFD currently doesn't support handing out pidfds if the
     sk->sk_peer_pid thread-group leader has already been reaped. In
     this case it currently returns EINVAL. Userspace still wants to get
     a pidfd for a reaped process to have a stable handle it can pass
     on. This is especially useful now that it is possible to retrieve
     exit information through a pidfd via the PIDFD_GET_INFO ioctl()'s
     PIDFD_INFO_EXIT flag

     Another summary has been provided by David Rheinsberg:

      > A pidfd can outlive the task it refers to, and thus user-space
      > must already be prepared that the task underlying a pidfd is
      > gone at the time they get their hands on the pidfd. For
      > instance, resolving the pidfd to a PID via the fdinfo must be
      > prepared to read `-1`.
      >
      > Despite user-space knowing that a pidfd might be stale, several
      > kernel APIs currently add another layer that checks for this. In
      > particular, SO_PEERPIDFD returns `EINVAL` if the peer-task was
      > already reaped, but returns a stale pidfd if the task is reaped
      > immediately after the respective alive-check.
      >
      > This has the unfortunate effect that user-space now has two ways
      > to check for the exact same scenario: A syscall might return
      > EINVAL/ESRCH/... *or* the pidfd might be stale, even though
      > there is no particular reason to distinguish both cases. This
      > also propagates through user-space APIs, which pass on pidfds.
      > They must be prepared to pass on `-1` *or* the pidfd, because
      > there is no guaranteed way to get a stale pidfd from the kernel.
      >
      > Userspace must already deal with a pidfd referring to a reaped
      > task as the task may exit and get reaped at any time will there
      > are still many pidfds referring to it

     In order to allow handing out reaped pidfd SO_PEERPIDFD needs to
     ensure that PIDFD_INFO_EXIT information is available whenever a
     pidfd for a reaped task is created by PIDFD_INFO_EXIT. The uapi
     promises that reaped pidfds are only handed out if it is guaranteed
     that the caller sees the exit information:

     TEST_F(pidfd_info, success_reaped)
     {
             struct pidfd_info info = {
                     .mask = PIDFD_INFO_CGROUPID | PIDFD_INFO_EXIT,
             };

             /*
              * Process has already been reaped and PIDFD_INFO_EXIT been set.
              * Verify that we can retrieve the exit status of the process.
              */
             ASSERT_EQ(ioctl(self->child_pidfd4, PIDFD_GET_INFO, &info), 0);
             ASSERT_FALSE(!!(info.mask & PIDFD_INFO_CREDS));
             ASSERT_TRUE(!!(info.mask & PIDFD_INFO_EXIT));
             ASSERT_TRUE(WIFEXITED(info.exit_code));
             ASSERT_EQ(WEXITSTATUS(info.exit_code), 0);
     }

     To hand out pidfds for reaped processes we thus allocate a pidfs
     entry for the relevant sk->sk_peer_pid at the time the
     sk->sk_peer_pid is stashed and drop it when the socket is
     destroyed. This guarantees that exit information will always be
     recorded for the sk->sk_peer_pid task and we can hand out pidfds
     for reaped processes

   - Hand a pidfd to the coredump usermode helper process

     Give userspace a way to instruct the kernel to install a pidfd for
     the crashing process into the process started as a usermode helper.
     There's still tricky race-windows that cannot be easily or
     sometimes not closed at all by userspace. There's various ways like
     looking at the start time of a process to make sure that the
     usermode helper process is started after the crashing process but
     it's all very very brittle and fraught with peril

     The crashed-but-not-reaped process can be killed by userspace
     before coredump processing programs like systemd-coredump have had
     time to manually open a PIDFD from the PID the kernel provides
     them, which means they can be tricked into reading from an
     arbitrary process, and they run with full privileges as they are
     usermode helper processes

     Even if that specific race-window wouldn't exist it's still the
     safest and cleanest way to let the kernel provide the pidfd
     directly instead of requiring userspace to do it manually. In
     parallel with this commit we already have systemd adding support
     for this in [1]

     When the usermode helper process is forked we install a pidfd file
     descriptor three into the usermode helper's file descriptor table
     so it's available to the exec'd program

     Since usermode helpers are either children of the system_unbound_wq
     workqueue or kthreadd we know that the file descriptor table is
     empty and can thus always use three as the file descriptor number

     Note, that we'll install a pidfd for the thread-group leader even
     if a subthread is calling do_coredump(). We know that task linkage
     hasn't been removed yet and even if this @current isn't the actual
     thread-group leader we know that the thread-group leader cannot be
     reaped until
     @current has exited

   - Allow telling when a task has not been found from finding the wrong
     task when creating a pidfd

     We currently report EINVAL whenever a struct pid has no tasked
     attached anymore thereby conflating two concepts:

      (1) The task has already been reaped

      (2) The caller requested a pidfd for a thread-group leader but the
          pid actually references a struct pid that isn't used as a
          thread-group leader

     This is causing issues for non-threaded workloads as in where they
     expect ESRCH to be reported, not EINVAL

     So allow userspace to reliably distinguish between (1) and (2)

   - Make it possible to detect when a pidfs entry would outlive the
     struct pid it pinned

   - Add a range of new selftests

  Cleanups:

   - Remove unneeded NULL check from pidfd_prepare() for passed struct
     pid

   - Avoid pointless reference count bump during release_task()

  Fixes:

   - Various fixes to the pidfd and coredump selftests

   - Fix error handling for replace_fd() when spawning coredump usermode
     helper"

* tag 'vfs-6.16-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  pidfs: detect refcount bugs
  coredump: hand a pidfd to the usermode coredump helper
  coredump: fix error handling for replace_fd()
  pidfs: move O_RDWR into pidfs_alloc_file()
  selftests: coredump: Raise timeout to 2 minutes
  selftests: coredump: Fix test failure for slow machines
  selftests: coredump: Properly initialize pointer
  net, pidfs: enable handing out pidfds for reaped sk->sk_peer_pid
  pidfs: get rid of __pidfd_prepare()
  net, pidfs: prepare for handing out pidfds for reaped sk->sk_peer_pid
  pidfs: register pid in pidfs
  net, pidfd: report EINVAL for ESRCH
  release_task: kill the no longer needed get/put_pid(thread_pid)
  pidfs: ensure consistent ENOENT/ESRCH reporting
  exit: move wake_up_all() pidfd waiters into __unhash_process()
  selftest/pidfd: add test for thread-group leader pidfd open for thread
  pidfd: improve uapi when task isn't found
  pidfd: remove unneeded NULL check from pidfd_prepare()
  selftests/pidfd: adapt to recent changes
2025-05-26 10:30:02 -07:00
Paolo Abeni
f5b60d6a57 netfilter pull request 25-05-23
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEjF9xRqF1emXiQiqU1w0aZmrPKyEFAmgwd00ACgkQ1w0aZmrP
 KyEfwA//RXQ3i8PCa7lKHxDRhVzG3rEvgXRmiXeNd+JjzsCnybBb7+wRf3dtBGWT
 +1s44Utx1JqosWxCVBulqYC5bqSC66789l5X2jhYJmUZxRrbcsqPngwnIrjb/XeK
 ZJM62wiRhkBQED7yZLGy+y4VHQiG8CEMt16AOQHk863aruWv1tT7up90CTtzA545
 4GF/grU3FC0PsoTLwzWyvqsWK+9uk3Y4Tifp5hU3w6uRD9EjX5tHCZlXXSqOF5gu
 KT26OYsePYXhJVZIwDf2oVLGi0EVTPB9IFxZSNgLqyXqu2ILAb9OwRNVTNfTP7Pg
 1RWJWmgqvRNs9OM2ecifYgQf/AfvCL0Cja1BJOjmvtICuGegrYH7G5YYQsMl9CoE
 7jBoTzpToSASat5+dwoz81Bvzh447dYxRE2VmbxmRTTWToQYS1KGBPc9e3u/n5Rr
 ruh8tRZ3/R0Fy+YLDkrJst3grh5RLITbuyu4ElJMArPU50mLTVYxKd6nA3BqwB5G
 1GmLfCzvQH3e6PKz6CNke1AytVDy/wLTXtcbLnze2Muaj4AqhtOe5Q8ypnOO0Vyk
 PsJ6U3rm2asd3GE9+AIx8gZBv8yCu1w9CiwLK8ybT2NETb2dEnqPgWeDyT7rpcaD
 sQOPsBE1q/TEp9gofbYCHBm5E2mX9UP7Q6EHCTekrI97xLq8Q2M=
 =fBhd
 -----END PGP SIGNATURE-----

Merge tag 'nf-next-25-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following batch contains Netfilter updates for net-next,
specifically 26 patches: 5 patches adding/updating selftests,
4 fixes, 3 PREEMPT_RT fixes, and 14 patches to enhance nf_tables):

1) Improve selftest coverage for pipapo 4 bit group format, from
   Florian Westphal.

2) Fix incorrect dependencies when compiling a kernel without
   legacy ip{6}tables support, also from Florian.

3) Two patches to fix nft_fib vrf issues, including selftest updates
   to improve coverage, also from Florian Westphal.

4) Fix incorrect nesting in nft_tunnel's GENEVE support, from
   Fernando F. Mancera.

5) Three patches to fix PREEMPT_RT issues with nf_dup infrastructure
   and nft_inner to match in inner headers, from Sebastian Andrzej Siewior.

6) Integrate conntrack information into nft trace infrastructure,
   from Florian Westphal.

7) A series of 13 patches to allow to specify wildcard netdevice in
   netdev basechain and flowtables, eg.

   table netdev filter {
       chain ingress {
           type filter hook ingress devices = { eth0, eth1, vlan* } priority 0; policy accept;
       }
   }

   This also allows for runtime hook registration on NETDEV_{UN}REGISTER
   event, from Phil Sutter.

netfilter pull request 25-05-23

* tag 'nf-next-25-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: (26 commits)
  selftests: netfilter: Torture nftables netdev hooks
  netfilter: nf_tables: Add notifications for hook changes
  netfilter: nf_tables: Support wildcard netdev hook specs
  netfilter: nf_tables: Sort labels in nft_netdev_hook_alloc()
  netfilter: nf_tables: Handle NETDEV_CHANGENAME events
  netfilter: nf_tables: Wrap netdev notifiers
  netfilter: nf_tables: Respect NETDEV_REGISTER events
  netfilter: nf_tables: Prepare for handling NETDEV_REGISTER events
  netfilter: nf_tables: Have a list of nf_hook_ops in nft_hook
  netfilter: nf_tables: Pass nf_hook_ops to nft_unregister_flowtable_hook()
  netfilter: nf_tables: Introduce nft_register_flowtable_ops()
  netfilter: nf_tables: Introduce nft_hook_find_ops{,_rcu}()
  netfilter: nf_tables: Introduce functions freeing nft_hook objects
  netfilter: nf_tables: add packets conntrack state to debug trace info
  netfilter: conntrack: make nf_conntrack_id callable without a module dependency
  netfilter: nf_dup_netdev: Move the recursion counter struct netdev_xmit
  netfilter: nft_inner: Use nested-BH locking for nft_pcpu_tun_ctx
  netfilter: nf_dup{4, 6}: Move duplication check to task_struct
  netfilter: nft_tunnel: fix geneve_opt dump
  selftests: netfilter: nft_fib.sh: add type and oif tests with and without VRFs
  ...
====================

Link: https://patch.msgid.link/20250523132712.458507-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-26 18:53:41 +02:00
Rafael J. Wysocki
57356d98c0 Merge branch 'acpica'
Merge ACPICA updates, including two upstream releases 20241212 and
20250404, for 6.16-rc1:

 - Fix two ACPICA SLAB cache leaks (Seunghun Han).

 - Add EINJv2 get error type action and define Error Injection Actions
   in hex values to avoid inconsistencies between the specification and
   the code (Zaid Alali).

 - Fix typo in comments for SRAT structures (Adam Lackorzynski).

 - Prevent possible loss of data in ACPICA because of u32 to u8
   conversions (Saket Dumbre).

 - Fix reading FFixedHW operation regions in ACPICA (Daniil Tatianin).

 - Add support for printing AML arguments when the ACPICA debug level is
   ACPI_LV_TRACE_POINT (Mario Limonciello).

 - Drop a stale comment about the file content from actbl2.h (Sudeep
   Holla).

 - Apply pack(1) to union aml_resource (Tamir Duberstein).

 - Fix overflow check in the ACPICA version of vsnprintf() (gldrk).

 - Interpret SIDP structures in DMAR added revision 3.4 of the VT-d
   specification (Alexey Neyman).

 - Add typedef and other definitions related to MRRM to ACPICA (Tony
   Luck).

 - Add definitions for RIMT to ACPICA (Sunil V L).

 - Fix spelling mistake "Incremement" -> "Increment" in the ACPICA
   utilities code (Colin Ian King).

 - Add typedef and other definitions for ERDT to ACPICA (Tony Luck).

 - Introduce ACPI_NONSTRING and use it (Kees Cook, Ahmed Salem).

 - Rename structure and field names of the RAS2 table in actbl2.h (Shiju
   Jose).

 - Fix up whitespace in acpica/utcache.c (Zhe Qiao).

 - Avoid sequence overread in a call to strncmp() in ap_get_table_length()
   and replace strncpy() with memcpy() in ACPICA in some places (Ahmed
   Salem).

 - Update copyright year in all ACPICA files (Saket Dumbre).

* acpica: (30 commits)
  ACPICA: Update copyright year
  ACPICA: Logfile: Changes for version 20250404
  ACPICA: Replace strncpy() with memcpy()
  ACPICA: Apply ACPI_NONSTRING in more places
  ACPICA: Avoid sequence overread in call to strncmp()
  ACPICA: Adjust the position of code lines
  ACPICA: actbl2.h: ACPI 6.5: RAS2: Rename structure and field names of the RAS2 table
  ACPICA: Apply ACPI_NONSTRING
  ACPICA: Introduce ACPI_NONSTRING
  ACPICA: actbl2.h: ERDT: Add typedef and other definitions
  ACPICA: infrastructure: Add new DMT_BUF types and shorten a long name
  ACPICA: Utilities: Fix spelling mistake "Incremement" -> "Increment"
  ACPICA: MRRM: Some cleanups
  ACPICA: actbl2: Add definitions for RIMT
  ACPICA: actbl2.h: MRRM: Add typedef and other definitions
  ACPICA: infrastructure: Add new header and ACPI_DMT_BUF26 types
  ACPICA: Interpret SIDP structures in DMAR
  ACPICA: utilities: Fix overflow check in vsnprintf()
  ACPICA: Apply pack(1) to union aml_resource
  ACPICA: Drop stale comment about the header file content
  ...
2025-05-26 18:35:01 +02:00
Paolo Abeni
34d26315db linux-can-next-for-6.16-20250522
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEn/sM2K9nqF/8FWzzDHRl3/mQkZwFAmgu4Z8THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAMdGXf+ZCRnBV7B/9i36vJXJuRMvLhP6vQNeEtixa5YkWe
 AZ/ALtAlYVzQXIKYJs7st+bxzQLmN7BfvcGdtXmtgIsNttNL1Kl1asvKFuRN3hqp
 CNjH83vqmoJMbKjcnPmxi/t3IfprfTU99g34gz5ayJN15rYptQAZRIFoX63Di6jC
 XvJbhM2ztJqHA5o5kMseCJ8kRq+RCAunI5Z1hltaAUWmypdc1RPWAJQed4x1ssSM
 8ctyWGy32ctyIoZ+B8tXzE3FoTay5UditN0lfdOe9pE+j6ZeYZRKdquKh4gLvK9c
 mFjxQm/TqbWWIWuts30iy2dp2PgNEyFrcJUZWu8/y5rhZSQQeqzQww8v
 =ILU2
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-6.16-20250522' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2025-05-22

this is a pull request of 22 patches for net-next/main.

The series by Biju Das contains 19 patches and adds RZ/G3E CANFD
support to the rcar_canfd driver.

The patch by Vincent Mailhol adds a struct data_bittiming_params to
group FD parameters as a preparation patch for CAN-XL support.

Felix Maurer's patch imports tst-filter from can-tests into the kernel
self tests and Vincent Mailhol adds support for physical CAN
interfaces.

linux-can-next-for-6.16-20250522

* tag 'linux-can-next-for-6.16-20250522' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (22 commits)
  selftests: can: test_raw_filter.sh: add support of physical interfaces
  selftests: can: Import tst-filter from can-tests
  can: dev: add struct data_bittiming_params to group FD parameters
  can: rcar_canfd: Add RZ/G3E support
  can: rcar_canfd: Enhance multi_channel_irqs handling
  can: rcar_canfd: Add external_clk variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add sh variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add struct rcanfd_regs variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add shared_can_regs variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add ch_interface_mode variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add {nom,data}_bittiming variables to struct rcar_canfd_hw_info
  can: rcar_canfd: Add max_cftml variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add max_aflpn variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add rnc_field_width variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Update RCANFD_GAFLCFG macro
  can: rcar_canfd: Add rcar_canfd_setrnc()
  can: rcar_canfd: Drop the mask operation in RCANFD_GAFLCFG_SETRNC macro
  can: rcar_canfd: Update RCANFD_GERFL_ERR macro
  can: rcar_canfd: Drop RCANFD_GAFLCFG_GETRNC macro
  can: rcar_canfd: Use of_get_available_child_by_name()
  ...
====================

Link: https://patch.msgid.link/20250522084128.501049-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-26 18:11:24 +02:00
Linus Torvalds
181d8e399f vfs-6.16-rc1.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaDBPTwAKCRCRxhvAZXjc
 om0+AQDMxKLweJXplqQQ7jxuvW2dEa60YpE2EalEKWGg9YA3KgEA3nI4kyKMKn7Y
 PRFXgIcKvhs62oJLKsq8SGQUqExqvAE=
 =atEw
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.16-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains the usual selections of misc updates for this cycle.

  Features:

   - Use folios for symlinks in the page cache

     FUSE already uses folios for its symlinks. Mirror that conversion
     in the generic code and the NFS code. That lets us get rid of a few
     folio->page->folio conversions in this path, and some of the few
     remaining users of read_cache_page() / read_mapping_page()

   - Try and make a few filesystem operations killable on the VFS
     inode->i_mutex level

   - Add sysctl vfs_cache_pressure_denom for bulk file operations

     Some workloads need to preserve more dentries than we currently
     allow through out sysctl interface

     A HDFS servers with 12 HDDs per server, on a HDFS datanode startup
     involves scanning all files and caching their metadata (including
     dentries and inodes) in memory. Each HDD contains approximately 2
     million files, resulting in a total of ~20 million cached dentries
     after initialization

     To minimize dentry reclamation, they set vfs_cache_pressure to 1.
     Despite this configuration, memory pressure conditions can still
     trigger reclamation of up to 50% of cached dentries, reducing the
     cache from 20 million to approximately 10 million entries. During
     the subsequent cache rebuild period, any HDFS datanode restart
     operation incurs substantial latency penalties until full cache
     recovery completes

     To maintain service stability, more dentries need to be preserved
     during memory reclamation. The current minimum reclaim ratio (1/100
     of total dentries) remains too aggressive for such workload. This
     patch introduces vfs_cache_pressure_denom for more granular cache
     pressure control

     The configuration [vfs_cache_pressure=1,
     vfs_cache_pressure_denom=10000] effectively maintains the full 20
     million dentry cache under memory pressure, preventing datanode
     restart performance degradation

   - Avoid some jumps in inode_permission() using likely()/unlikely()

   - Avid a memory access which is most likely a cache miss when
     descending into devcgroup_inode_permission()

   - Add fastpath predicts for stat() and fdput()

   - Anonymous inodes currently don't come with a proper mode causing
     issues in the kernel when we want to add useful VFS debug assert.
     Fix that by giving them a proper mode and masking it off when we
     report it to userspace which relies on them not having any mode

   - Anonymous inodes currently allow to change inode attributes because
     the VFS falls back to simple_setattr() if i_op->setattr isn't
     implemented. This means the ownership and mode for every single
     user of anon_inode_inode can be changed. Block that as it's either
     useless or actively harmful. If specific ownership is needed the
     respective subsystem should allocate anonymous inodes from their
     own private superblock

   - Raise SB_I_NODEV and SB_I_NOEXEC on the anonymous inode superblock

   - Add proper tests for anonymous inode behavior

   - Make it easy to detect proper anonymous inodes and to ensure that
     we can detect them in codepaths such as readahead()

  Cleanups:

   - Port pidfs to the new anon_inode_{g,s}etattr() helpers

   - Try to remove the uselib() system call

   - Add unlikely branch hint return path for poll

   - Add unlikely branch hint on return path for core_sys_select

   - Don't allow signals to interrupt getdents copying for fuse

   - Provide a size hint to dir_context for during readdir()

   - Use writeback_iter directly in mpage_writepages

   - Update compression and mtime descriptions in initramfs
     documentation

   - Update main netfs API document

   - Remove useless plus one in super_cache_scan()

   - Remove unnecessary NULL-check guards during setns()

   - Add separate separate {get,put}_cgroup_ns no-op cases

  Fixes:

   - Fix typo in root= kernel parameter description

   - Use KERN_INFO for infof()|info_plog()|infofc()

   - Correct comments of fs_validate_description()

   - Mark an unlikely if condition with unlikely() in
     vfs_parse_monolithic_sep()

   - Delete macro fsparam_u32hex()

   - Remove unused and problematic validate_constant_table()

   - Fix potential unsigned integer underflow in fs_name()

   - Make file-nr output the total allocated file handles"

* tag 'vfs-6.16-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (43 commits)
  fs: Pass a folio to page_put_link()
  nfs: Use a folio in nfs_get_link()
  fs: Convert __page_get_link() to use a folio
  fs/read_write: make default_llseek() killable
  fs/open: make do_truncate() killable
  fs/open: make chmod_common() and chown_common() killable
  include/linux/fs.h: add inode_lock_killable()
  readdir: supply dir_context.count as readdir buffer size hint
  vfs: Add sysctl vfs_cache_pressure_denom for bulk file operations
  fuse: don't allow signals to interrupt getdents copying
  Documentation: fix typo in root= kernel parameter description
  include/cgroup: separate {get,put}_cgroup_ns no-op case
  kernel/nsproxy: remove unnecessary guards
  fs: use writeback_iter directly in mpage_writepages
  fs: remove useless plus one in super_cache_scan()
  fs: add S_ANON_INODE
  fs: remove uselib() system call
  device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()
  fs/fs_parse: Remove unused and problematic validate_constant_table()
  fs: touch up predicts in inode_permission()
  ...
2025-05-26 09:02:39 -07:00
Stanislav Fomichev
8ceeef23a3 selftests: ncdevmem: add tx test with multiple IOVs
Use prime 3 for length to make offset slowly drift away.

Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Acked-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-26 10:00:48 +01:00
Stanislav Fomichev
61f24c6885 selftests: ncdevmem: make chunking optional
Add new -z argument to specify max IOV size. By default, use
single large IOV.

Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-26 10:00:48 +01:00
Ingo Molnar
94ec70880f Merge branch 'locking/futex' into locking/core, to pick up pending futex changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-25 10:10:08 +02:00
Leo Yan
628e124404 perf tests switch-tracking: Fix timestamp comparison
The test might fail on the Arm64 platform with the error:

  # perf test -vvv "Track with sched_switch"
  Missing sched_switch events
  #

The issue is caused by incorrect handling of timestamp comparisons. The
comparison result, a signed 64-bit value, was being directly cast to an
int, leading to incorrect sorting for sched events.

The case does not fail everytime, usually I can trigger the failure
after run 20 ~ 30 times:

  # while true; do perf test "Track with sched_switch"; done
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : FAILED!
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : FAILED!
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok

I used cross compiler to build Perf tool on my host machine and tested on
Debian / Juno board.  Generally, I think this issue is not very specific
to GCC versions.  As both internal CI and my local env can reproduce the
issue.

My Host Build compiler:

  # aarch64-linux-gnu-gcc --version
  aarch64-linux-gnu-gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0

Juno Board:

  # lsb_release -a
  No LSB modules are available.
  Distributor ID: Debian
  Description:    Debian GNU/Linux 12 (bookworm)
  Release:        12
  Codename:       bookworm

Fix this by explicitly returning 0, 1, or -1 based on whether the result
is zero, positive, or negative.

Fixes: d44bc55829 ("perf tests: Add a test for tracking with sched_switch")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250331172759.115604-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-24 11:40:47 -03:00
Dave Jiang
9f153b7fb5 Merge branch 'for-6.16/cxl-features-ras' into cxl-for-next
Add CXL RAS Features support. Features include "patrol scrub control",
"error check scrub", "perform maintenance", and "memory sparing". This
support connects the RAS Featurs to EDAC.
2025-05-23 13:26:24 -07:00
Shiju Jose
0c6e6f1357 cxl/edac: Add CXL memory device patrol scrub control feature
CXL spec 3.2 section 8.2.10.9.11.1 describes the device patrol scrub
control feature. The device patrol scrub proactively locates and makes
corrections to errors in regular cycle.

Allow specifying the number of hours within which the patrol scrub must be
completed, subject to minimum and maximum limits reported by the device.
Also allow disabling scrub allowing trade-off error rates against
performance.

Add support for patrol scrub control on CXL memory devices.
Register with the EDAC device driver, which retrieves the scrub attribute
descriptors from EDAC scrub and exposes the sysfs scrub control attributes
to userspace. For example, scrub control for the CXL memory device
"cxl_mem0" is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/.

Additionally, add support for region-based CXL memory patrol scrub control.
CXL memory regions may be interleaved across one or more CXL memory
devices. For example, region-based scrub control for "cxl_region1" is
exposed in /sys/bus/edac/devices/cxl_region1/scrubX/.

[dj: A few formatting fixes from Jonathan]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-4-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:09 -07:00
Lorenz Bauer
3c0421c93c libbpf: Use mmap to parse vmlinux BTF from sysfs
Teach libbpf to use mmap when parsing vmlinux BTF from /sys. We don't
apply this to fall-back paths on the regular file system because there
is no way to ensure that modifications underlying the MAP_PRIVATE
mapping are not visible to the process.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250520-vmlinux-mmap-v5-3-e8c941acc414@isovalent.com
2025-05-23 10:06:28 -07:00
Lorenz Bauer
828226b69f selftests: bpf: Add a test for mmapable vmlinux BTF
Add a basic test for the ability to mmap /sys/kernel/btf/vmlinux.
Ensure that the data is valid BTF and that it is padded with zero.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20250520-vmlinux-mmap-v5-2-e8c941acc414@isovalent.com
2025-05-23 10:06:28 -07:00
Shradha Gupta
a9c0b33ef2 tools: hv: Enable debug logs for hv_kvp_daemon
Allow the KVP daemon to log the KVP updates triggered in the VM
with a new debug flag(-d).
When the daemon is started with this flag, it logs updates and debug
information in syslog with loglevel LOG_DEBUG. This information comes
in handy for debugging issues where the key-value pairs for certain
pools show mismatch/incorrect values.
The distro-vendors can further consume these changes and modify the
respective service files to redirect the logs to specific files as
needed.

Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Naman Jain <namjain@linux.microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com>
2025-05-23 16:30:55 +00:00
Ming Lei
533c87e2ed selftests: ublk: add test for UBLK_F_QUIESCE
Add test generic_11 for covering new control command of
UBLK_U_CMD_QUIESCE_DEV.

Add 'quiesce -n dev_id' sub-command on ublk utility for transitioning
device state to quiesce states, then verify the feature via generic_10
by doing quiesce and recovery.

Cc: Yoav Cohen <yoav@nvidia.com>
Link: https://lore.kernel.org/linux-block/DM4PR12MB632807AB7CDCE77D1E5AB7D0A9B92@DM4PR12MB6328.namprd12.prod.outlook.com/
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250522163523.406289-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-23 09:42:12 -06:00
Ming Lei
f40b1f2670 selftests: ublk: add test case for UBLK_U_CMD_UPDATE_SIZE
Add test generic_10 for covering new control command of UBLK_U_CMD_UPDATE_SIZE.

Add 'update_size -s|--size size_in_bytes' sub-command on ublk utility for
supporting this feature, then verify the feature via generic_10.

Cc: Omri Mann <omri@nvidia.com>
Cc: Jared Holzman <jholzman@nvidia.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250522163523.406289-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-23 09:42:12 -06:00
Phil Sutter
73db1b5dab selftests: netfilter: Torture nftables netdev hooks
Add a ruleset which binds to various interface names via netdev-family
chains and flowtables and massage the notifiers by frequently renaming
interfaces to match these names. While doing so:
- Keep an 'nft monitor' running in background to receive the notifications
- Loop over 'nft list ruleset' to exercise ruleset dump codepath
- Have iperf running so the involved chains/flowtables see traffic

If supported, also test interface wildcard support separately by
creating a flowtable with 'wild*' interface spec and quickly add/remove
matching dummy interfaces.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-23 13:57:14 +02:00
Florian Westphal
996d62ece0 selftests: netfilter: nft_fib.sh: add type and oif tests with and without VRFs
Replace the existing VRF test with a more comprehensive one.

It tests following combinations:
 - fib type (returns address type, e.g. unicast)
 - fib oif (route output interface index
 - both with and without 'iif' keyword (changes result, e.g.
  'fib daddr type local' will be true when the destination address
  is configured on the local machine, but
  'fib daddr . iif type local' will only be true when the destination
  address is configured on the incoming interface.

Add all types of addresses to test with for both ipv4 and ipv6:
- local address on the incoming interface
- local address on another interface
- local address on another interface thats part of a vrf
- address on another host

The ruleset stores obtained results from 'fib' in nftables sets and
then queries the sets to check that it has the expected results.

Perform one pass while packets are coming in on interface NOT part of
a VRF and then again when it was added and make sure fib returns the
expected routes and address types for the various addresses in the
setup.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-23 13:57:12 +02:00
Marc Zyngier
1b85d923ba Merge branch kvm-arm64/misc-6.16 into kvmarm-master/next
* kvm-arm64/misc-6.16:
  : .
  : Misc changes and improvements for 6.16:
  :
  : - Add a new selftest for the SVE host state being corrupted by a guest
  :
  : - Keep HCR_EL2.xMO set at all times for systems running with the kernel at EL2,
  :   ensuring that the window for interrupts is slightly bigger, and avoiding
  :   a pretty bad erratum on the AmpereOne HW
  :
  : - Replace a couple of open-coded on/off strings with str_on_off()
  :
  : - Get rid of the pKVM memblock sorting, which now appears to be superflous
  :
  : - Drop superflous clearing of ICH_LR_EOI in the LR when nesting
  :
  : - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers from
  :   a pretty bad case of TLB corruption unless accesses to HCR_EL2 are
  :   heavily synchronised
  :
  : - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS tables
  :   in a human-friendly fashion
  : .
  KVM: arm64: Fix documentation for vgic_its_iter_next()
  KVM: arm64: vgic-its: Add debugfs interface to expose ITS tables
  arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
  KVM: arm64: nv: Remove clearing of ICH_LR<n>.EOI if ICH_LR<n>.HW == 1
  KVM: arm64: Drop sort_memblock_regions()
  KVM: arm64: selftests: Add test for SVE host corruption
  KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode
  KVM: arm64: Replace ternary flags with str_on_off() helper

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:59:43 +01:00
Marc Zyngier
fef3acf5ae Merge branch kvm-arm64/fgt-masks into kvmarm-master/next
* kvm-arm64/fgt-masks: (43 commits)
  : .
  : Large rework of the way KVM deals with trap bits in conjunction with
  : the CPU feature registers. It now draws a direct link between which
  : the feature set, the system registers that need to UNDEF to match
  : the configuration and bits that need to behave as RES0 or RES1 in
  : the trap registers that are visible to the guest.
  :
  : Best of all, these definitions are mostly automatically generated
  : from the JSON description published by ARM under a permissive
  : license.
  : .
  KVM: arm64: Handle TSB CSYNC traps
  KVM: arm64: Add FGT descriptors for FEAT_FGT2
  KVM: arm64: Allow sysreg ranges for FGT descriptors
  KVM: arm64: Add context-switch for FEAT_FGT2 registers
  KVM: arm64: Add trap routing for FEAT_FGT2 registers
  KVM: arm64: Add sanitisation for FEAT_FGT2 registers
  KVM: arm64: Add FEAT_FGT2 registers to the VNCR page
  KVM: arm64: Use HCR_EL2 feature map to drive fixed-value bits
  KVM: arm64: Use HCRX_EL2 feature map to drive fixed-value bits
  KVM: arm64: Allow kvm_has_feat() to take variable arguments
  KVM: arm64: Use FGT feature maps to drive RES0 bits
  KVM: arm64: Validate FGT register descriptions against RES0 masks
  KVM: arm64: Switch to table-driven FGU configuration
  KVM: arm64: Handle PSB CSYNC traps
  KVM: arm64: Use KVM-specific HCRX_EL2 RES0 mask
  KVM: arm64: Remove hand-crafted masks for FGT registers
  KVM: arm64: Use computed FGT masks to setup FGT registers
  KVM: arm64: Propagate FGT masks to the nVHE hypervisor
  KVM: arm64: Unconditionally configure fine-grain traps
  KVM: arm64: Use computed masks as sanitisers for FGT registers
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:58:15 +01:00
Marc Zyngier
6eb0ed9629 Merge branch kvm-arm64/mte-frac into kvmarm-master/next
* kvm-arm64/mte-frac:
  : .
  : Prevent FEAT_MTE_ASYNC from being accidently exposed to a guest,
  : courtesy of Ben Horgan. From the cover letter:
  :
  : "The ID_AA64PFR1_EL1.MTE_frac field is currently hidden from KVM.
  : However, when ID_AA64PFR1_EL1.MTE==2, ID_AA64PFR1_EL1.MTE_frac==0
  : indicates that MTE_ASYNC is supported. On a host with
  : ID_AA64PFR1_EL1.MTE==2 but without MTE_ASYNC support a guest with the
  : MTE capability enabled will incorrectly see MTE_ASYNC advertised as
  : supported. This series fixes that."
  : .
  KVM: selftests: Confirm exposing MTE_frac does not break migration
  KVM: arm64: Make MTE_frac masking conditional on MTE capability
  arm64/sysreg: Expose MTE_frac so that it is visible to KVM

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:57:44 +01:00
Kuniyuki Iwashima
431e2b874e selftest: af_unix: Test SO_PASSRIGHTS.
scm_rights.c has various patterns of tests to exercise GC.

Let's add cases where SO_PASSRIGHTS is disabled.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-23 10:24:19 +01:00
Kuniyuki Iwashima
77cbe1a6d8 af_unix: Introduce SO_PASSRIGHTS.
As long as recvmsg() or recvmmsg() is used with cmsg, it is not
possible to avoid receiving file descriptors via SCM_RIGHTS.

This behaviour has occasionally been flagged as problematic, as
it can be (ab)used to trigger DoS during close(), for example, by
passing a FUSE-controlled fd or a hung NFS fd.

For instance, as noted on the uAPI Group page [0], an untrusted peer
could send a file descriptor pointing to a hung NFS mount and then
close it.  Once the receiver calls recvmsg() with msg_control, the
descriptor is automatically installed, and then the responsibility
for the final close() now falls on the receiver, which may result
in blocking the process for a long time.

Regarding this, systemd calls cmsg_close_all() [1] after each
recvmsg() to close() unwanted file descriptors sent via SCM_RIGHTS.

However, this cannot work around the issue at all, because the final
fput() may still occur on the receiver's side once sendmsg() with
SCM_RIGHTS succeeds.  Also, even filtering by LSM at recvmsg() does
not work for the same reason.

Thus, we need a better way to refuse SCM_RIGHTS at sendmsg().

Let's introduce SO_PASSRIGHTS to disable SCM_RIGHTS.

Note that this option is enabled by default for backward
compatibility.

Link: https://uapi-group.org/kernel-features/#disabling-reception-of-scm_rights-for-af_unix-sockets #[0]
Link: https://github.com/systemd/systemd/blob/v257.5/src/basic/fd-util.c#L612-L628 #[1]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-23 10:24:18 +01:00
Kuniyuki Iwashima
ae4f2f59e1 tcp: Restrict SO_TXREHASH to TCP socket.
sk->sk_txrehash is only used for TCP.

Let's restrict SO_TXREHASH to TCP to reflect this.

Later, we will make sk_txrehash a part of the union for other
protocol families.

Note that we need to modify BPF selftest not to get/set
SO_TEREHASH for non-TCP sockets.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-23 10:24:18 +01:00
Ian Rogers
0ffca606e9 perf pmu intel: Adjust cpumaks for sub-NUMA clusters on graniterapids
On graniterapids the cache home agent (CHA) and memory controller
(IMC) PMUs all have their cpumask set to per-socket information. In
order for per NUMA node aggregation to work correctly the PMUs cpumask
needs to be set to CPUs for the relevant sub-NUMA grouping.

For example, on a 2 socket graniterapids machine with sub NUMA
clustering of 3, for uncore_cha and uncore_imc PMUs the cpumask is
"0,120" leading to aggregation only on NUMA nodes 0 and 3:
```
$ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1

 Performance counter stats for 'system wide':

N0        1    277,835,681,344      UNC_CHA_CLOCKTICKS
N0        1     19,242,894,228      UNC_M_CLOCKTICKS
N3        1    277,803,448,124      UNC_CHA_CLOCKTICKS
N3        1     19,240,741,498      UNC_M_CLOCKTICKS

       1.002113847 seconds time elapsed
```

By updating the PMUs cpumasks to "0,120", "40,160" and "80,200" then
the correctly 6 NUMA node aggregations are achieved:
```
$ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1

 Performance counter stats for 'system wide':

N0        1     92,748,667,796      UNC_CHA_CLOCKTICKS
N0        0      6,424,021,142      UNC_M_CLOCKTICKS
N1        0     92,753,504,424      UNC_CHA_CLOCKTICKS
N1        1      6,424,308,338      UNC_M_CLOCKTICKS
N2        0     92,751,170,084      UNC_CHA_CLOCKTICKS
N2        0      6,424,227,402      UNC_M_CLOCKTICKS
N3        1     92,745,944,144      UNC_CHA_CLOCKTICKS
N3        0      6,423,752,086      UNC_M_CLOCKTICKS
N4        0     92,725,793,788      UNC_CHA_CLOCKTICKS
N4        1      6,422,393,266      UNC_M_CLOCKTICKS
N5        0     92,717,504,388      UNC_CHA_CLOCKTICKS
N5        0      6,421,842,618      UNC_M_CLOCKTICKS

       1.003406645 seconds time elapsed
```

In general, having the perf tool adjust cpumasks isn't desirable as
ideally the PMU driver would be advertising the correct cpumask.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250515181417.491401-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 23:15:48 -03:00
Arnaldo Carvalho de Melo
9e893dab82 perf tests trace_summary.sh: Run in exclusive mode
And it is being successfull only when running alone, probably because
there are some tests that add the vfs_getname probe that gets used by
'perf trace' and alter how it does syscall arg pathname resolution.

This should be removed or made a fallback to the preferred BPF mode of
getting syscall parameters, but till then, run this in exclusive mode.

For reference, here are some of the tests that run close to this one:

  127: perf record offcpu profiling tests                              : Ok
  128: perf all PMU test                                               : Ok
  129: perf stat --bpf-counters test                                   : Ok
  130: Check Arm CoreSight trace data recording and synthesized samples: Skip
  131: Check Arm CoreSight disassembly script completes without errors : Skip
  132: Check Arm SPE trace data recording and synthesized samples      : Skip
  133: Test data symbol                                                : Ok
  134: Miscellaneous Intel PT testing                                  : Skip
  135: test Intel TPEBS counting mode                                  : Skip
  136: perf script task-analyzer tests                                 : Ok
  137: Check open filename arg using perf trace + vfs_getname          : Ok
  138: perf trace summary                                              : Ok

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aC-hHTgArwlF_zu9@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Namhyung Kim
dd8633bd09 perf test: Add cgroup summary test case for 'perf trace'
$ sudo ./perf test -vv 112
  112: perf trace summary:
  --- start ---
  test child forked, pid 1018940
  testing: perf trace -s -- true
  testing: perf trace -S -- true
  testing: perf trace -s --summary-mode=thread -- true
  testing: perf trace -S --summary-mode=total -- true
  testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true
  testing: perf trace -as --summary-mode=total --no-bpf-summary -- true
  testing: perf trace -as --summary-mode=thread --bpf-summary -- true
  testing: perf trace -as --summary-mode=total --bpf-summary -- true
  testing: perf trace -aS --summary-mode=total --bpf-summary -- true
  testing: perf trace -as --summary-mode=cgroup --bpf-summary -- true
  testing: perf trace -aS --summary-mode=cgroup --bpf-summary -- true
  ---- end(0) ----
  112: perf trace summary                                              : Ok

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250522142551.1062417-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Gautam Menghani
59df607bf8 perf python: Add counting.py as example for counting perf events
Add counting.py - a python version of counting.c to demonstrate
measuring and reading of counts for given perf events.

Committer testing:

Build perf and make the generated python binding somewhere you can point
to to avoid using the one in the distro python3-perf (fedora, may be
different in other distros):

  $ make -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin

Copy /tmp/build/perf-tools-next/python/perf.cpython-313-x86_64-linux-gnu.so to
somewhere outside this toolbox container and then use it with root:

  # export PYTHONPATH=/root/python/
  # ls -la /root/python/
  total 10640
  drwxr-xr-x. 1 root root       72 May 21 11:40 .
  dr-xr-x---. 1 root root      574 May 21 11:40 ..
  -rwxr-xr-x. 1 acme acme 10894360 May 21 11:40 perf.cpython-313-x86_64-linux-gnu.so
  # tools/perf/python/counting.py | head -5
  For evsel(software/cpu-clock/) val: 2930946 enable: 2932479 run: 2932479
  For evsel(software/cpu-clock/) val: 2924975 enable: 2926267 run: 2926267
  For evsel(software/cpu-clock/) val: 2921017 enable: 2922430 run: 2922430
  For evsel(software/cpu-clock/) val: 2914966 enable: 2916549 run: 2916549
  For evsel(software/cpu-clock/) val: 2910027 enable: 2911589 run: 2911589
  #

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
[ make the API take a CPU and thread then compute from these the appropriate indices. ]
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/linux-perf-users/CAP-5=fWb-=hCYmpg7U5N9C94EucQGTOS7YwR2-fo4ptOexzxyg@mail.gmail.com/
Link: https://lore.kernel.org/r/20250519195148.1708988-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Gautam Menghani
aa68483740 perf python: Add evlist close support
Add support for the evlist close function.

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250519195148.1708988-7-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Gautam Menghani
739621f657 perf python: Add evsel read method
Add the evsel read method to enable python to read counter data for the
given evsel.

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/linux-perf-users/20250512055748.479786-1-gautam@linux.ibm.com/
Link: https://lore.kernel.org/r/20250519195148.1708988-6-irogers@google.com
[ make the API take a CPU and thread then compute from these the appropriate indices. ]
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Gautam Menghani
3b4991dcb4 perf python: Add support for 'struct perf_counts_values' to return counter data
Add support for the perf_counts_values struct to enable the python
bindings to read and return the counter data.

Committer notes:

Use T_ULONG instead of Py_T_ULONG, as all the other PyMemberDef arrays,
fixing the build with older python3 versions.

Use { .name = NULL, } to finish the new PyMemberDef
pyrf_counts_values_members array, again as the other arrays to please
some clang versions, ditto for PyGetSetDef.

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250519195148.1708988-5-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:23:52 -03:00
Ryan Chung
19e0713bbe selftests/eventfd: correct test name and improve messages
- Rename test from eventfd_chek_flag_cloexec_and_nonblock to
eventfd_check_flag_cloexec_and_nonblock.

- Make the RDWR‐flag comment declarative:
  “The kernel automatically adds the O_RDWR flag.”
- Update semaphore‐flag failure message to:
  “eventfd semaphore flag check failed: …”

Link: https://lkml.kernel.org/r/20250513074411.6965-1-seokwoo.chung130@gmail.com
Signed-off-by: Ryan Chung <seokwoo.chung130@gmail.com>
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:38 -07:00
SeongJae Park
03f83209e8 selftests/damon/_damon_sysfs: read tried regions directories in order
Kdamond.update_schemes_tried_regions() reads and stores tried regions
information out of address order.  It makes debugging a test failure
difficult.  Change the behavior to do the reading and writing in the
address order.

Link: https://lkml.kernel.org/r/20250513002715.40126-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:38 -07:00
Mark Brown
5fc4b770fc selftests/mm: deduplicate second mmap() of 5*PAGE_SIZE at base
The map_fixed_noreplace test does two blocks of test starting from a
mapping of 5 pages at the base address, logging a test result for each
initial mapping.  These are logged with the same test name, causing test
automation software to see two reports for the same test in a single run. 
Tweak the log message for the second one to deduplicate.

Link: https://lkml.kernel.org/r/20250518-selftests-mm-map-fixed-noreplace-dup-v1-1-1a11a62c5e9f@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:37 -07:00
David Hildenbrand
2616b37032 selftests/mm: add simple VM_PFNMAP tests based on mmap'ing /dev/mem
Let's test some basic functionality using /dev/mem.  These tests will
implicitly cover some PAT (Page Attribute Handling) handling on x86.

These tests will only run when /dev/mem access to the first two pages in
physical address space is possible and allowed; otherwise, the tests are
skipped.

On current x86-64 with PAT inside a VM, all tests pass:

	TAP version 13
	1..6
	# Starting 6 tests from 1 test cases.
	#  RUN           pfnmap.madvise_disallowed ...
	#            OK  pfnmap.madvise_disallowed
	ok 1 pfnmap.madvise_disallowed
	#  RUN           pfnmap.munmap_split ...
	#            OK  pfnmap.munmap_split
	ok 2 pfnmap.munmap_split
	#  RUN           pfnmap.mremap_fixed ...
	#            OK  pfnmap.mremap_fixed
	ok 3 pfnmap.mremap_fixed
	#  RUN           pfnmap.mremap_shrink ...
	#            OK  pfnmap.mremap_shrink
	ok 4 pfnmap.mremap_shrink
	#  RUN           pfnmap.mremap_expand ...
	#            OK  pfnmap.mremap_expand
	ok 5 pfnmap.mremap_expand
	#  RUN           pfnmap.fork ...
	#            OK  pfnmap.fork
	ok 6 pfnmap.fork
	# PASSED: 6 / 6 tests passed.
	# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0

However, we are able to trigger:

[   27.888251] x86/PAT: pfnmap:1790 freeing invalid memtype [mem 0x00000000-0x00000fff]

There are probably more things worth testing in the future, such as
MAP_PRIVATE handling.  But this set of tests is sufficient to cover most
of the things we will rework regarding PAT handling.

Link: https://lkml.kernel.org/r/20250509153033.952746-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:36 -07:00
Michal Luczaj
c04eeeb2af selftests/bpf: sockmap_listen cleanup: Drop af_inet SOCK_DGRAM redir tests
Remove tests covered by sockmap_redir.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-8-a1ea723f7e7e@rbox.co
2025-05-22 14:26:59 -07:00
Michal Luczaj
f3de1cf621 selftests/bpf: sockmap_listen cleanup: Drop af_unix redir tests
Remove tests covered by sockmap_redir.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-7-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
9266e49d60 selftests/bpf: sockmap_listen cleanup: Drop af_vsock redir tests
Remove tests covered by sockmap_redir.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-6-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
f0709263a0 selftests/bpf: Add selftest for sockmap/hashmap redirection
Test redirection logic. All supported and unsupported redirect combinations
are tested for success and failure respectively.

BPF_MAP_TYPE_SOCKMAP
BPF_MAP_TYPE_SOCKHASH
	x
sk_msg-to-egress
sk_msg-to-ingress
sk_skb-to-egress
sk_skb-to-ingress
	x
AF_INET, SOCK_STREAM
AF_INET6, SOCK_STREAM
AF_INET, SOCK_DGRAM
AF_INET6, SOCK_DGRAM
AF_UNIX, SOCK_STREAM
AF_UNIX, SOCK_DGRAM
AF_VSOCK, SOCK_STREAM
AF_VSOCK, SOCK_SEQPACKET

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-5-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
f266905bb3 selftests/bpf: Introduce verdict programs for sockmap_redir
Instead of piggybacking on test_sockmap_listen, introduce
test_sockmap_redir especially for sockmap redirection tests.

Suggested-by: Jiayuan Chen <mrpre@163.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-4-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
b57482b0fe selftests/bpf: Add u32()/u64() to sockmap_helpers
Add integer wrappers for convenient sockmap usage.

While there, fix misaligned trailing slashes.

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-3-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
d87857946d selftests/bpf: Add socket_kind_to_str() to socket_helpers
Add function that returns string representation of socket's domain/type.

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-2-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
fb1131d5e1 selftests/bpf: Support af_unix SOCK_DGRAM socket pair creation
Handle af_unix in init_addr_loopback(). For pair creation, bind() the peer
socket to make SOCK_DGRAM connect() happy.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-1-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Jakub Kicinski
33e1b1b399 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.15-rc8).

Conflicts:
  80f2ab46c2 ("irdma: free iwdev->rf after removing MSI-X")
  4bcc063939 ("ice, irdma: fix an off by one in error handling code")
  c24a65b6a2 ("iidc/ice/irdma: Update IDC to support multiple consumers")
https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au

No extra adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22 09:42:41 -07:00
Mykyta Yatsenko
5ead949920 selftests/bpf: Add SKIP_LLVM makefile variable
Introduce SKIP_LLVM makefile variable that allows to avoid using llvm
dependencies when building BPF selftests. This is different from
existing feature-llvm, as the latter is a result of automatic detection
and should not be set by user explicitly.
Avoiding llvm dependencies could be useful for environments that do not
have them, given that as of now llvm dependencies are required only by
jit_disasm_helpers.c.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250522013813.125428-1-mykyta.yatsenko5@gmail.com
2025-05-22 09:31:03 -07:00
Florian Westphal
98287045c9 selftests: netfilter: move fib vrf test to nft_fib.sh
It was located in conntrack_vrf.sh because that already had the VRF bits.
Lets not add to this and move it to nft_fib.sh where this belongs.

No functional changes for the subtest intended.
The subtest is limited, it only covered 'fib oif'
(route output interface query) when the incoming interface is part
of a VRF.

Next we can extend it to cover 'fib type' for VRFs and also check fib
results when there is an unrelated VRF in same netns.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-22 17:16:02 +02:00
Florian Westphal
839340f7c7 selftests: netfilter: nft_fib.sh: add 'type' mode tests
fib can either lookup the interface id/name of the output interface that
would be used for the given address, or it can check for the type of the
address according to the fib, e.g. local, unicast, multicast and so on.

This can be used to e.g. make a locally configured address only reachable
through its interface.

Example: given eth0:10.1.1.1 and eth1:10.1.2.1 then 'fib daddr type' for
10.1.1.1 arriving on eth1 will be 'local', but 'fib daddr . iif type' is
expected to return 'unicast', whereas 'fib daddr' and 'fib daddr . iif'
are expected to indicate 'local' if such a packet arrives on eth0.

So far nft_fib.sh only covered oif/oifname, not type.

Repeat tests both with default and a policy (ip rule) based setup.

Also try to run all remaining tests even if a subtest has failed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-22 17:16:02 +02:00
Florian Westphal
d31c1cafc4 selftests: netfilter: nft_concat_range.sh: add coverage for 4bit group representation
Pipapo supports a more compact '4 bit group' format that is chosen when
the memory needed for the default exceeds a threshold (2mb).

Add coverage for those code paths, the existing tests use small sets that
are handled by the default representation.

This comes with a test script run-time increase, but I think its ok:

 normal: 2m35s -> 3m9s
 debug:  3m24s -> 5m29s (with KSFT_MACHINE_SLOW=yes).

Cc: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-22 17:16:01 +02:00
Greg Kroah-Hartman
0ca7cb7089 IIO: New device support, features and cleanup for 6.16 - take 2
Note - last minute rebase was to drop a typo patch that I'd accidentally
 picked up (in the microblaze arch Kconfig)
 Take 2 is due to that rebase messing up some fixes tags that were
 referring to patches after that point.
 
 There is a known merge conflict due to changes in neighbouring lines.
 
 Stephen's resolution in linux-next is:
 https://lore.kernel.org/linux-next/20250506155728.65605bae@canb.auug.org.au/
 
 Added 3 named IIO reviewers to MAINTAINERS. This is a reflection of those
 who have been doing much of this work for some time. Lars-Peter is
 removed from the entry having moved on to other topics.  Thanks
 Nuno, David and Andy for stepping up and Lars-Peter for all your
 hard work in the past!
 
 Includes the usual mix of new device support, features and general
 cleanup.
 
 This time we also have some tree wide changes.
 
 - Rip out the iio_device_claim_direct_scoped() as it proved hard to work
   with.  This series includes quite a few related cleanups such as use
   of guard or factoring code out to allow direct returns.
 - Switch from iio_device_claim/release_direct_mode() to new
   iio_device_claim/release_direct() which is structured so that sparse
   can warn on failed releases. There were a few false positives but
   those were mostly in code that benefited from being cleaned up as part
   of this process.
 - Introduce iio_push_to_buffers_with_ts() to replace the _timestamp()
   version over time. This version takes the size of the supplied buffer
   which the core checks is at least as big as expected by calculation
   from channel descriptions of those channels enabled. Use this in
   an initial set of drivers.
 - Add macros for IIO_DECLARE_BUFFER_WITH_TS() and
   IIO_DECLARE_DMA_BUFFER_WITH_TS() to avoid lots of fiddly code to ensure
   correctly aligned buffers for timestamps being added onto the end of
   channel data.
 
 New device support
 ------------------
 
 adi,ad3530r
 - New driver for AD3530, AD3530R, AD3531 and AD3531R DACs with
   programmable gain controls. R variants have internal references.
 adi,ad7476
 - Add support (dt compatible only) for the Rohm BU79100G ADC which is
   fully compatible with the ti,ads7866.
 adi,ad7606
 - Support ad7606c-16 and ad7606c-18 devices. Includes switch to dynamic
   channel information allocation.
 adi,ad7380
 - Add support for the AD7389-4
 dfrobot,sen0322
 - New driver for this oxygen sensor.
 mediatek,mt2701-auxadc
 - Add binding for MT6893 which is fully compatible with already supported
   MT8173.
 meson-saradc
 - Support the GXLX SoCs.  Mostly this is a workaround for some unrelated
   clock control bits found in the ADC register map.
 nuvoton,nct7201
 - New driver for NCT7201 and NCT7202 I2C ADCs.
 rohm,bd79124
 - New driver for this 12-bit, 8-channel SAR ADC.
 - Switch to new set_rv etc gpio callbacks that were added in 6.15.
 rohm,bd79703
 - Add support for BD79700, BD79701 and BD79702 DACs that have subsets of
   functionality of the already supported bd79703.  Included making this
   driver suitable for support device variants.
 st,stm32-lptimer
 - Add support for stm32pm25 to this trigger.
 
 Features
 --------
 
 Beyond IIO
 - Property iterator for named children.
 core
 - Enable writes for 64 bit integers used for standard IIO ABI elements.
   Previously these could be read only.
 - Helper library that should avoid code duplication for simpler ADC
   bindings that have a child node per channel.
 - Enforce that IIO_DMA_MINALIGN is always at least 8 (almost always true
   and simplifies code on all significant architectures)
 core/backend
 - Add support to control source of data - useful when the HDL includes
   things like generated ramps for testing purposes. Enable this for
   adi-axi-dac
 adi,ad3552-hs
 - Add debugfs related callbacks to allow debug access to register contents.
 adi,ad4000
 - Support SPI offload with appropriate FPGA firmware along with improving
   documentation.
 adi,ad7293
 - Add support for external reference voltage.
 adi,ad7606
 - Support SPI offload.
 adi,ad7768-1
 - Support reset GPIO.
 adi,admv8818
 - Support filter frequencies beyond 2^32.
 adi,adxl345
 - Add single and double tap events.
 hid-sensor-prox
 - Support 16-bit report sizes as seen on some Intel platforms.
 invensense,icm42600
 - Enable use of named interrupts to avoid problems with some wiring choices.
   Get the interrupt by name, but fallback to previous assumption on the first
   being INT1 if no names are supplied.
 microchip,mcp3911
 - Add reset gpio support.
 rohm,bh7150
 - Add reset gpio support.
 st,stm32
 - Add support to control oversampling.
 ti,adc128s052
 - Add support for ROHM BD79104 which is early compatible with the TI
   parts already supported by this driver. Includes some general driver
   cleanup and a separate dt binding.
 - Simplify reference voltage handling by assuming it is fixed after enabling
   the supply.
 winsen,mhz19b
 - New driver for this C02 sensor.
 
 Cleanup and minor fixes
 -----------------------
 
 dt-bindings
 - Correct indentation and style for DTS examples.
 - Use unevalutateProperties for SPI devices instead of additionalProperties
   to allow generic SPI properties from spi-peripheral-props.yaml
 ABI Docs
 - Add missing docs for sampling_frequency when it applies only to events.
 Treewide
 - Various minor tweaks, comment fixes and similar.
 - Sort TI ADCs in Kconfig that had gotten out of order.
 - Switch various drives that provide GPIO chip functionality to the new
   callbacks with return values.
 - Standardize on { } formatting for all array sentinels.
 - Make use of aligned_s64 in a few places to replace either wrong types
   or manually defined equivalents.
 - Drop places where spi bits_per_word is set to 8 because that is the
   default anyway.
 
 adi,ad_sigma_delta library
 - Avoid a potential use of uninitialized data if reg_size has a value
   that is not supported (no drivers hit this but it is reasonable hardening)
 adi,ad4030
 - Add error checking for scan types and no longer store it in state.
 - Rework code to reduce duplication.
 - Move setting the mode from buffer preenable() to update_scan_mode(),
   better matching expected semantics of the two different callbacks.
 - Improve data marshalling comments.
 adi,ad4695
 - Use u16 for buffer elements as oversampling is not yet supported except
   with SPI offload (which doesn't use this path).
 adi,ad5592r
 - Clean up destruction of mutexes.
 - Use lock guards to simplify code (later patch fixes a missed unlock)
 adi,ad5933
 - Correct some incorrect settling times.
 adi,ad7091
 - Deduplicate handling of writable vs volatile registers as they are the
   inverse of each other for this device.
 adi,ad7124
 - Fix 3db Filter frequency.
 - Remove ability to directly write the filter frequency (which was broken)
 - Register naming improvements.
 adi,ad7606
 - Add a missing return value check.
 - Fill in max sampling rates for all chips.
 - Use devm_mutex_init()
 - Fix up some kernel-doc formatting issues.
 - Remove some camel case that snuck in.
 - Drop setting address field in channels as easily established from other
   fields.
 - Drop unnecessary parameter to ad76060_scale_setup_cb_t.
 adi,ad7768-1
 - Convert to regmap.
 - Factor out buffer allocation.
 - Tidy up headers.
 adi,ad7944
 - Stop setting bits_per_word in SPI xfers with no data.
 adi,ad9832
 - Add of_device_id table rather than just relying on fallbacks.
 - Use FIELD_PREP() to set values of fields.
 adi,admv1013
 - Cleanup a pointless ternary.
 adi,admv8818
 - Fix up LPF Band 5 frequency which was slightly wrong.
 - Fix an integer overflow.
 - Fix range calculation
 adi,adt7316
 - Replace irqd_get_trigger_type(irq_get_irq_data()) with simpler
   irq_get_trigger_type()
 adi,adxl345
 - Use regmap cache instead of various state variables that were there to
   reduce bus accesses.
 - Make regmap return value checking consistent across all call sites.
 adi,axi-dac
 - Add a check on number of channels (0 to 15 valid)
 allwinner,sun20i
 - Use new adc-helpers to replace local parsing code for channel nodes.
 bosch,bmp290
 - Move to local variables for sensor data marshalling removing the need
   for a messy definition that has to work for all supported parts.
   Follow up fix adds a missing initialization.
 dynaimage,al3010 and dynaimage,al3320a
 - Various minor cleanup to bring these drivers inline with reviewed feedback
   given on a new driver.
 - Fix an error path in which power down is not called when it should be.
 - Switch to regmap.
 google,cros_ec
 - Fix up a flexible array in middle of structure warning.
 - Flush fifo when changing the timeout to avoid potential long wait
   for samples.
 hid-sensor-rotation
 - Remove an __aligned(16) marking that doesn't seem to be justified.
 kionix,kxcjk-1013
 - Deduplicate code for setting up interrupts.
 microchip,mcp3911
 - Fix handling of conversion results register which differs across supported
   devices.
 idt,zopt2201
 - Avoid duplicating register lists as all volatile registers are the
   inverse of writeable registers on this device.
 renesas,rzg2l
 - Use new adc-helpers to replace local parsing code for channel nodes.
 ti,ads1298
 - Fix a missing Kconfig dependency.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmgt1EgRHGppYzIzQGtl
 cm5lbC5vcmcACgkQVIU0mcT0Foii3g//bZJT2/rHGIkU/MKUsYQCZGh5ux7X59Kk
 DT8R7i8RYg2dshbtkkHUn/S6yC3OX84q3Legzy8TtHkPvGMK1NSwWCrn2cqkCZdG
 4kksfq3P41FmRrpSfm4p+WSAD9nYzA/LToSLdwGIOa44lKxlCElBgsIDz/HxwXLB
 N08H5jniST+cmSWAkkxCj15jcKXGtuIUS3XSCffwlzQZNPHp7guk4u+5EiUaPwvy
 X75iZ6SthUo9GfE6QyhE4rdOceix6S5wsQrKVpGKB7ZuQbXYQxlvNVt/l5p6MGdY
 teoqB7CyUADD1yHZp9L85+JSqjbodxm5a/Y9b0wQKHV33dxphEVZeli9t+VZYg6K
 fSM5xq0Cmt0msXLGRgB1uvZcEfj+vlkn7SN9nyyfTN2nXoac1E/MYCEiCLMhOj6D
 jxwc1aQDwaT+f6U6mRyRab2+TPea7AOv81x0MyRclOArqY4NCOPIG+XvC2nouYXk
 9MSyK4FIysbINjenFCZdpdibp3HKWGqc8rTv3NWrx7CejAYQoEkYW5WA9qyzy/I8
 8gDbcQskcYcehrI8ztc1ratfKl5k0QJPCgcLvuTpxK3PkQCgX+dGnL1EXW7D4BRx
 7Gl6O7gW7JR6kVx7LidJAxUSykoCdtPyiDxfEU8ypERiype2E6EJN/Bn11o1P8zr
 VZ/PdGti5gg=
 =vLKV
 -----END PGP SIGNATURE-----

Merge tag 'iio-for-6.16a-take2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next

Jonathan writes:

IIO: New device support, features and cleanup for 6.16 - take 2

Note - last minute rebase was to drop a typo patch that I'd accidentally
picked up (in the microblaze arch Kconfig)
Take 2 is due to that rebase messing up some fixes tags that were
referring to patches after that point.

There is a known merge conflict due to changes in neighbouring lines.

Stephen's resolution in linux-next is:
https://lore.kernel.org/linux-next/20250506155728.65605bae@canb.auug.org.au/

Added 3 named IIO reviewers to MAINTAINERS. This is a reflection of those
who have been doing much of this work for some time. Lars-Peter is
removed from the entry having moved on to other topics.  Thanks
Nuno, David and Andy for stepping up and Lars-Peter for all your
hard work in the past!

Includes the usual mix of new device support, features and general
cleanup.

This time we also have some tree wide changes.

- Rip out the iio_device_claim_direct_scoped() as it proved hard to work
  with.  This series includes quite a few related cleanups such as use
  of guard or factoring code out to allow direct returns.
- Switch from iio_device_claim/release_direct_mode() to new
  iio_device_claim/release_direct() which is structured so that sparse
  can warn on failed releases. There were a few false positives but
  those were mostly in code that benefited from being cleaned up as part
  of this process.
- Introduce iio_push_to_buffers_with_ts() to replace the _timestamp()
  version over time. This version takes the size of the supplied buffer
  which the core checks is at least as big as expected by calculation
  from channel descriptions of those channels enabled. Use this in
  an initial set of drivers.
- Add macros for IIO_DECLARE_BUFFER_WITH_TS() and
  IIO_DECLARE_DMA_BUFFER_WITH_TS() to avoid lots of fiddly code to ensure
  correctly aligned buffers for timestamps being added onto the end of
  channel data.

New device support
------------------

adi,ad3530r
- New driver for AD3530, AD3530R, AD3531 and AD3531R DACs with
  programmable gain controls. R variants have internal references.
adi,ad7476
- Add support (dt compatible only) for the Rohm BU79100G ADC which is
  fully compatible with the ti,ads7866.
adi,ad7606
- Support ad7606c-16 and ad7606c-18 devices. Includes switch to dynamic
  channel information allocation.
adi,ad7380
- Add support for the AD7389-4
dfrobot,sen0322
- New driver for this oxygen sensor.
mediatek,mt2701-auxadc
- Add binding for MT6893 which is fully compatible with already supported
  MT8173.
meson-saradc
- Support the GXLX SoCs.  Mostly this is a workaround for some unrelated
  clock control bits found in the ADC register map.
nuvoton,nct7201
- New driver for NCT7201 and NCT7202 I2C ADCs.
rohm,bd79124
- New driver for this 12-bit, 8-channel SAR ADC.
- Switch to new set_rv etc gpio callbacks that were added in 6.15.
rohm,bd79703
- Add support for BD79700, BD79701 and BD79702 DACs that have subsets of
  functionality of the already supported bd79703.  Included making this
  driver suitable for support device variants.
st,stm32-lptimer
- Add support for stm32pm25 to this trigger.

Features
--------

Beyond IIO
- Property iterator for named children.
core
- Enable writes for 64 bit integers used for standard IIO ABI elements.
  Previously these could be read only.
- Helper library that should avoid code duplication for simpler ADC
  bindings that have a child node per channel.
- Enforce that IIO_DMA_MINALIGN is always at least 8 (almost always true
  and simplifies code on all significant architectures)
core/backend
- Add support to control source of data - useful when the HDL includes
  things like generated ramps for testing purposes. Enable this for
  adi-axi-dac
adi,ad3552-hs
- Add debugfs related callbacks to allow debug access to register contents.
adi,ad4000
- Support SPI offload with appropriate FPGA firmware along with improving
  documentation.
adi,ad7293
- Add support for external reference voltage.
adi,ad7606
- Support SPI offload.
adi,ad7768-1
- Support reset GPIO.
adi,admv8818
- Support filter frequencies beyond 2^32.
adi,adxl345
- Add single and double tap events.
hid-sensor-prox
- Support 16-bit report sizes as seen on some Intel platforms.
invensense,icm42600
- Enable use of named interrupts to avoid problems with some wiring choices.
  Get the interrupt by name, but fallback to previous assumption on the first
  being INT1 if no names are supplied.
microchip,mcp3911
- Add reset gpio support.
rohm,bh7150
- Add reset gpio support.
st,stm32
- Add support to control oversampling.
ti,adc128s052
- Add support for ROHM BD79104 which is early compatible with the TI
  parts already supported by this driver. Includes some general driver
  cleanup and a separate dt binding.
- Simplify reference voltage handling by assuming it is fixed after enabling
  the supply.
winsen,mhz19b
- New driver for this C02 sensor.

Cleanup and minor fixes
-----------------------

dt-bindings
- Correct indentation and style for DTS examples.
- Use unevalutateProperties for SPI devices instead of additionalProperties
  to allow generic SPI properties from spi-peripheral-props.yaml
ABI Docs
- Add missing docs for sampling_frequency when it applies only to events.
Treewide
- Various minor tweaks, comment fixes and similar.
- Sort TI ADCs in Kconfig that had gotten out of order.
- Switch various drives that provide GPIO chip functionality to the new
  callbacks with return values.
- Standardize on { } formatting for all array sentinels.
- Make use of aligned_s64 in a few places to replace either wrong types
  or manually defined equivalents.
- Drop places where spi bits_per_word is set to 8 because that is the
  default anyway.

adi,ad_sigma_delta library
- Avoid a potential use of uninitialized data if reg_size has a value
  that is not supported (no drivers hit this but it is reasonable hardening)
adi,ad4030
- Add error checking for scan types and no longer store it in state.
- Rework code to reduce duplication.
- Move setting the mode from buffer preenable() to update_scan_mode(),
  better matching expected semantics of the two different callbacks.
- Improve data marshalling comments.
adi,ad4695
- Use u16 for buffer elements as oversampling is not yet supported except
  with SPI offload (which doesn't use this path).
adi,ad5592r
- Clean up destruction of mutexes.
- Use lock guards to simplify code (later patch fixes a missed unlock)
adi,ad5933
- Correct some incorrect settling times.
adi,ad7091
- Deduplicate handling of writable vs volatile registers as they are the
  inverse of each other for this device.
adi,ad7124
- Fix 3db Filter frequency.
- Remove ability to directly write the filter frequency (which was broken)
- Register naming improvements.
adi,ad7606
- Add a missing return value check.
- Fill in max sampling rates for all chips.
- Use devm_mutex_init()
- Fix up some kernel-doc formatting issues.
- Remove some camel case that snuck in.
- Drop setting address field in channels as easily established from other
  fields.
- Drop unnecessary parameter to ad76060_scale_setup_cb_t.
adi,ad7768-1
- Convert to regmap.
- Factor out buffer allocation.
- Tidy up headers.
adi,ad7944
- Stop setting bits_per_word in SPI xfers with no data.
adi,ad9832
- Add of_device_id table rather than just relying on fallbacks.
- Use FIELD_PREP() to set values of fields.
adi,admv1013
- Cleanup a pointless ternary.
adi,admv8818
- Fix up LPF Band 5 frequency which was slightly wrong.
- Fix an integer overflow.
- Fix range calculation
adi,adt7316
- Replace irqd_get_trigger_type(irq_get_irq_data()) with simpler
  irq_get_trigger_type()
adi,adxl345
- Use regmap cache instead of various state variables that were there to
  reduce bus accesses.
- Make regmap return value checking consistent across all call sites.
adi,axi-dac
- Add a check on number of channels (0 to 15 valid)
allwinner,sun20i
- Use new adc-helpers to replace local parsing code for channel nodes.
bosch,bmp290
- Move to local variables for sensor data marshalling removing the need
  for a messy definition that has to work for all supported parts.
  Follow up fix adds a missing initialization.
dynaimage,al3010 and dynaimage,al3320a
- Various minor cleanup to bring these drivers inline with reviewed feedback
  given on a new driver.
- Fix an error path in which power down is not called when it should be.
- Switch to regmap.
google,cros_ec
- Fix up a flexible array in middle of structure warning.
- Flush fifo when changing the timeout to avoid potential long wait
  for samples.
hid-sensor-rotation
- Remove an __aligned(16) marking that doesn't seem to be justified.
kionix,kxcjk-1013
- Deduplicate code for setting up interrupts.
microchip,mcp3911
- Fix handling of conversion results register which differs across supported
  devices.
idt,zopt2201
- Avoid duplicating register lists as all volatile registers are the
  inverse of writeable registers on this device.
renesas,rzg2l
- Use new adc-helpers to replace local parsing code for channel nodes.
ti,ads1298
- Fix a missing Kconfig dependency.

* tag 'iio-for-6.16a-take2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (260 commits)
  dt-bindings: iio: adc: Add ROHM BD79100G
  iio: adc: add support for Nuvoton NCT7201
  dt-bindings: iio: adc: add NCT7201 ADCs
  iio: chemical: Add driver for SEN0322
  dt-bindings: trivial-devices: Document SEN0322
  iio: adc: ad7768-1: reorganize driver headers
  iio: bmp280: zero-init buffer
  iio: ssp_sensors: optimalize -> optimize
  HID: sensor-hub: Fix typo and improve documentation
  iio: admv1013: replace redundant ternary operator with just len
  iio: chemical: mhz19b: Fix error code in probe()
  iio: adc: at91-sama5d2: use IIO_DECLARE_BUFFER_WITH_TS
  iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS
  iio: adc: ad7380: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: adc: ad4695: rename AD4695_MAX_VIN_CHANNELS
  iio: adc: ad4695: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: introduce IIO_DECLARE_BUFFER_WITH_TS macros
  iio: make IIO_DMA_MINALIGN minimum of 8 bytes
  iio: pressure: zpa2326_spi: remove bits_per_word = 8
  iio: pressure: ms5611_spi: remove bits_per_word = 8
  ...
2025-05-22 15:54:52 +02:00
Miguel Ojeda
cbeaa41dfe objtool/rust: relax slice condition to cover more noreturn Rust functions
Developers are indeed hitting other of the `noreturn` slice symbols in
Nova [1], thus relax the last check in the list so that we catch all of
them, i.e.

    *_4core5slice5index22slice_index_order_fail
    *_4core5slice5index24slice_end_index_len_fail
    *_4core5slice5index26slice_start_index_len_fail
    *_4core5slice5index29slice_end_index_overflow_fail
    *_4core5slice5index31slice_start_index_overflow_fail

These all exist since at least Rust 1.78.0, thus backport it too.

See commit 56d680dd23 ("objtool/rust: list `noreturn` Rust functions")
for more details.

Cc: stable@vger.kernel.org # Needed in 6.12.y and later.
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Timur Tabi <ttabi@nvidia.com>
Cc: Kane York <kanepyork@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Joel Fernandes <joelagnelf@nvidia.com>
Fixes: 56d680dd23 ("objtool/rust: list `noreturn` Rust functions")
Closes: https://lore.kernel.org/rust-for-linux/20250513180757.GA1295002@joelnvbox/ [1]
Tested-by: Joel Fernandes <joelagnelf@nvidia.com>
Link: https://lore.kernel.org/r/20250520185555.825242-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-05-22 12:00:58 +02:00
Cong Wang
c3572acffb selftests/tc-testing: Add an HFSC qlen accounting test
This test reproduces a scenario where HFSC queue length and backlog accounting
can become inconsistent when a peek operation triggers a dequeue and possible
drop before the parent qdisc updates its counters. The test sets up a DRR root
qdisc with an HFSC class, netem, and blackhole children, and uses Scapy to
inject a packet. It helps to verify that HFSC correctly tracks qlen and backlog
even when packets are dropped during peek-induced dequeue.

Cc: Mingi Cho <mincho@theori.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250518222038.58538-3-xiyou.wangcong@gmail.com
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 11:16:51 +02:00
Ingo Molnar
44889ff67c perf/uapi: Clean up <uapi/linux/perf_event.h> a bit
When applying a recent commit to the <uapi/linux/perf_event.h>
header I noticed that we have accumulated quite a bit of
historic noise in this header, so do a bit of spring cleaning:

 - Define bitfields in a vertically aligned fashion, like
   perf_event_mmap_page::capabilities already does. This
   makes it easier to see the distribution and sizing of
   bits within a word, at a glance. The following is much
   more readable:

			__u64	cap_bit0		: 1,
				cap_bit0_is_deprecated	: 1,
				cap_user_rdpmc		: 1,
				cap_user_time		: 1,
				cap_user_time_zero	: 1,
				cap_user_time_short	: 1,
				cap_____res		: 58;

   Than:

			__u64	cap_bit0:1,
				cap_bit0_is_deprecated:1,
				cap_user_rdpmc:1,
				cap_user_time:1,
				cap_user_time_zero:1,
				cap_user_time_short:1,
				cap_____res:58;

   So convert all bitfield definitions from the latter style to the
   former style.

 - Fix typos and grammar

 - Fix capitalization

 - Remove whitespace noise

 - Harmonize the definitions of various generations and groups of
   PERF_MEM_ ABI values.

 - Vertically align all definitions and assignments to the same
   column (48), as the first definition (enum perf_type_id),
   throughout the entire header.

 - And in general make the code and comments to be more in sync
   with each other and to be more readable overall.

No change in functionality.

Copy the changes over to tools/include/uapi/linux/perf_event.h.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250521221529.2547099-1-irogers@google.com
2025-05-22 11:03:41 +02:00
Ian Rogers
f4b18ff2c1 perf/uapi: Fix PERF_RECORD_SAMPLE comments in <uapi/linux/perf_event.h>
AAUX data for PERF_SAMPLE_AUX appears last. PERF_SAMPLE_CGROUP is
missing from the comment.

This makes the <uapi/linux/perf_event.h> comment match that in the
perf_event_open man page.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-perf-users@vger.kernel.org
Link: https://lore.kernel.org/r/20250521221529.2547099-1-irogers@google.com
2025-05-22 10:01:34 +02:00
Jakub Kicinski
4e4dc6db2b tools: ynl: add a sample for TC
Add a very simple TC dump sample with decoding of fq_codel attrs:

  # ./tools/net/ynl/samples/tc
        dummy0: fq_codel  limit: 10240p target: 5ms new_flow_cnt: 0

proving that selector passing (for stats) works.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250520161916.413298-13-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 12:38:23 -07:00
Jakub Kicinski
e06c9d2515 tools: ynl: enable codegen for TC
We are ready to support most of TC. Enable C code gen.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250520161916.413298-11-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 12:38:23 -07:00
Jakub Kicinski
4e9806a8f4 tools: ynl-gen: support weird sub-message formats
TC uses all possible sub-message formats:
 - nested attrs
 - fixed headers + nested attrs
 - fixed headers
 - empty

Nested attrs are already supported for rt-link. Add support
for remaining 3. The empty and fixed headers ones are fairly
trivial, we can fake a Binary or Flags type instead of a Nest.

For fixed headers + nest we need to teach nest parsing and
nest put to handle fixed headers.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250520161916.413298-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 12:38:23 -07:00
Jakub Kicinski
092b34b937 tools: ynl-gen: support local attrs in _multi_parse
The _multi_parse() helper calls the _attr_get() method of each attr,
but it only respects what code the helper wants to emit, not what
local variables it needs. Local variables will soon be needed,
support them.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250520161916.413298-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 12:38:22 -07:00
Jakub Kicinski
a66a170b68 tools: ynl-gen: move fixed header info from RenderInfo to Struct
RenderInfo describes a request-response exchange. Struct describes
a parsed attribute set. For ease of parsing sub-messages with
fixed headers move fixed header info from RenderInfo to Struct.
No functional changes.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250520161916.413298-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 12:38:22 -07:00
Jakub Kicinski
cb39645d9a tools: ynl-gen: support passing selector to a nest
In rtnetlink all submessages had the selector at the same level
of nesting as the submessage. We could refer to the relevant
attribute from the current struct. In TC, stats are one level
of nesting deeper than "kind". Teach the code-gen about structs
which need to be passed a selector by the caller for parsing.

Because structs are "topologically sorted" one pass of propagating
the selectors down is enough.

For generating netlink message we depend on the presence bits
so no selector passing needed there.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250520161916.413298-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 12:38:22 -07:00
Jakub Kicinski
8b8762eeec tools: ynl-gen: add makefile deps for neigh
Kory is reporting build issues after recent additions to YNL
if the system headers are old.

Link: https://lore.kernel.org/20250519164949.597d6e92@kmaincent-XPS-13-7390
Reported-by: Kory Maincent <kory.maincent@bootlin.com>
Fixes: 0939a418b3 ("tools: ynl: submsg: reverse parse / error reporting")
Tested-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20250520161916.413298-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 12:38:20 -07:00
Ian Rogers
0589aff473 perf python: Add evsel cpus and threads functions
Allow access to cpus and thread_map structs associated with an evsel.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250519195148.1708988-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21 15:07:13 -03:00
Ian Rogers
3ee2255c4f libperf threadmap: Add perf_thread_map__idx()
Allow computation of thread map index from a PID.

Note, with a 'struct perf_cpu_map' the sorted nature allows for a binary
search to compute the index which isn't currently possible with a
'struct perf_thread_map' as they aren't guaranteed sorted.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250519195148.1708988-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21 15:07:13 -03:00
Ian Rogers
eead8a0114 libperf threadmap: Don't segv for index 0 for the NULL 'struct perf_thread_map' pointer
perf_thread_map__nr() returns length 1 if the perf_thread_map is NULL,
meaning index 0 is valid.

When perf_thread_map__pid() of index 0 is read then return the expected
"any" -1 value.

Assert this is only done for index 0.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250519195148.1708988-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21 15:07:13 -03:00
Ravi Bangoria
21fb366b2f perf test amd: Skip amd-ibs-period test on kernel < v6.15
Bunch of IBS kernel fixes went in v6.15-rc1 [1].

The amd-ibs-period test will fail without those kernel patches.

Skip the test on system running kernel older than v6.15 to distinguish
genuine new failures vs known failure due to old kernel.

Since all the related IBS fixes went in -rc1 itself, the ">= 6.15" check
will work for any custom compiled v6.15-* kernel as well.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Closes: https://lore.kernel.org/r/aCfuGXUnNIbnYo_r@x1
Link: https://lore.kernel.org/r/20250115054438.1021-1-ravi.bangoria@amd.com [1]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21 15:07:13 -03:00
Ian Rogers
8f454c9581 perf thread: Ensure comm_lock held for comm_list
Add thread safety annotations for comm_list and add locking for two
instances where the list is accessed without the lock held (in
contradiction to ____thread__set_comm()).

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Fei Lang <langfei@huawei.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/20250519224645.1810891-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21 15:07:13 -03:00
Ian Rogers
6fe064491b perf rwsem: Add clang's -Wthread-safety annotations
Add annotations used by clang's -Wthread-safety.

Fix dsos compilation errors caused by a lock of annotations.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Fei Lang <langfei@huawei.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/20250519224645.1810891-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21 15:07:13 -03:00
Ian Rogers
ab2c742d75 perf dso: Minor refactor to allow clang's Wthread-safety analysis
The pattern:

```
if (x) {
   lock(...)
}
block1;
if (x) {
   unlock(...)
}
```

defeats clang's -Wthread-safety analysis where it complains of locks
held on one path and not another.

Add helper functions for "block1" then restructure as:

```
if (x) {
   lock(...);
   block1();
   unlock(...);
} else {
   block1();
}
```

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Fei Lang <langfei@huawei.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/20250519224645.1810891-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21 15:07:13 -03:00
Andrea Righi
36adf6fe6c selftests/sched_ext: Update test enq_select_cpu_fails
With commit 08699d20467b6 ("sched_ext: idle: Consolidate default idle
CPU selection kfuncs") allowing scx_bpf_select_cpu_dfl() to be invoked
from multiple contexts, update the test to validate that the kfunc
behaves correctly when used from ops.enqueue() and via BPF test_run.

Additionally, rename the test to enq_select_cpu, dropping "fails" from
the name, as the logic has now been inverted.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-21 07:35:58 -10:00
Vincent Mailhol
3e20585abf selftests: can: test_raw_filter.sh: add support of physical interfaces
Allow the user to specify a physical interface through the $CANIF
environment variable. Add a $BITRATE environment variable set with a
default value of 500000.

If $CANIF is omitted or if it starts with vcan (e.g. vcan1), the test
will use the virtual can interface type. Otherwise, it will assume
that the provided interface is a physical can interface.

For example:

  CANIF=can1 BITRATE=1000000 ./test_raw_filter.sh

will run set the can1 interface with a bitrate of one million and run
the tests on it.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-21 18:05:11 +02:00
Felix Maurer
77442ffa83 selftests: can: Import tst-filter from can-tests
Tests for the can subsystem have been in the can-tests repository[1] so
far. Start moving the tests to kernel selftests by importing the current
tst-filter test. The test is now named test_raw_filter and is substantially
updated to be more aligned with the kernel selftests, follow the coding
style, and simplify the validation of received CAN frames. We also include
documentation of the test design. The test verifies that the single filters
on raw CAN sockets work as expected.

We intend to import more tests from can-tests and add additional test cases
in the future. The goal of moving the CAN selftests into the tree is to
align the tests more closely with the kernel, improve testing of CAN in
general, and to simplify running the tests automatically in the various
kernel CI systems.

[1]: https://github.com/linux-can/can-tests

Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/87d289f333cba7bbcc9d69173ea1c320e4b5c3b8.1747833283.git.fmaurer@redhat.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-05-21 18:03:56 +02:00
Thomas Weißschuh
869c788909 selftests: harness: Stop using setjmp()/longjmp()
Usage of longjmp() was added to ensure that teardown is always run in
commit 63e6b2a423 ("selftests/harness: Run TEARDOWN for ASSERT failures")
However instead of calling longjmp() to the teardown handler it is easier to
just call the teardown handler directly from __bail().
Any potential duplicate teardown invocations are harmless as the actual
handler will only ever be executed once since
commit fff37bd32c ("selftests/harness: Fix fixture teardown").

Additionally this removes a incompatibility with nolibc,
which does not support setjmp()/longjmp().

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-12-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:37 +02:00
Thomas Weißschuh
f46ddc2cba selftests: harness: Add "variant" and "self" to test metadata
To get rid of setjmp()/longjmp(), the variant and self need to be usable
from __bail().

Make them available from the test metadata.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-11-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:36 +02:00
Thomas Weißschuh
5f036a2a8e selftests: harness: Add teardown callback to test metadata
To get rid of setjmp()/longjmp(), the teardown logic needs to be usable
from __bail(). Introduce a new callback for it.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-10-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:35 +02:00
Thomas Weißschuh
906dbc17d6 selftests: harness: Move teardown conditional into test metadata
To get rid of setjmp()/longjmp(), the teardown logic needs to be usable
from __bail(). To access the atomic teardown conditional from there,
move it into the test metadata.
This also allows the removal of "setup_completed".

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-9-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:35 +02:00
Thomas Weißschuh
fb25e99bce selftests: harness: Don't set setup_completed for fixtureless tests
This field is unused and has no meaning for tests without fixtures.
Don't set it for them.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-8-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:34 +02:00
Thomas Weißschuh
73a3cde976 selftests: harness: Implement test timeouts through pidfd
Make the kselftest harness compatible with nolibc which does not implement
signals by replacing the signal logic with pidfds.
The code also becomes simpler.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-7-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:33 +02:00
Thomas Weißschuh
67ee52611b selftests: harness: Remove dependency on libatomic
__sync_bool_compare_and_swap() is deprecated and requires libatomic on
GCC. Compiler toolchains don't necessarily have libatomic available, so
avoid this requirement by using atomics that don't need libatomic.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-6-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:33 +02:00
Thomas Weißschuh
5cccec7239 selftests: harness: Remove inline qualifier for wrappers
The pointers to the wrappers are stored in function pointers,
preventing them from actually being inlined.
Remove the inline qualifier, aligning these wrappers with the other
functions defined through macros.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-5-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:32 +02:00
Thomas Weißschuh
c2bcc8e957 selftests: harness: Mark functions without prototypes static
With -Wmissing-prototypes the compiler will warn about non-static
functions which don't have a prototype defined.
As they are not used from a different compilation unit they don't need to
be defined globally.

Avoid the issue by marking the functions static.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-4-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:31 +02:00
Thomas Weißschuh
6c409e0d87 selftests: harness: Ignore unused variant argument warning
For tests without fixtures the variant argument is unused.
This is intentional, prevent to compiler from complaining.

Example warning:

    harness-selftest.c: In function 'wrapper_standalone_pass':
    ../kselftest_harness.h:181:52: error: unused parameter 'variant' [-Werror=unused-parameter]
      181 |                 struct __fixture_variant_metadata *variant) \
          |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
    ../kselftest_harness.h:156:25: note: in expansion of macro '__TEST_IMPL'
      156 | #define TEST(test_name) __TEST_IMPL(test_name, -1)
          |                         ^~~~~~~~~~~
    harness-selftest.c:15:1: note: in expansion of macro 'TEST'
       15 | TEST(standalone_pass) {
          | ^~~~

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-3-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:31 +02:00
Thomas Weißschuh
575eca2c8c selftests: harness: Use C89 comment style
All comments in this file use C89 comment style.
Except for this one. Change it to get one step closer to C89
compatibility.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-2-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:30 +02:00
Thomas Weißschuh
df82ffc5a3 selftests: harness: Add kselftest harness selftest
Add a selftest for the kselftest harness itself so any changes can be
validated.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250505-nolibc-kselftest-harness-v4-1-ee4dd5257135@linutronix.de
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2025-05-21 15:32:27 +02:00
Thomas Weißschuh
2011097c17 selftests/nolibc: drop include guards around standard headers
Nolibc now provides all the headers required by nolibc-test.c.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250515-nolibc-sys-v1-9-74f82eea3b59@weissschuh.net
Acked-by: Willy Tarreau <w@1wt.eu>
2025-05-21 15:32:27 +02:00
Thomas Weißschuh
2217abe09c tools/nolibc: move NULL and offsetof() to sys/stddef.h
This is the location regular userspace expects these definitions.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20250515-nolibc-sys-v1-8-74f82eea3b59@weissschuh.net
2025-05-21 15:32:25 +02:00
Thomas Weißschuh
0f971358dc tools/nolibc: move uname() and friends to sys/utsname.h
This is the location regular userspace expects these definitions.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/r/20250515-nolibc-sys-v1-7-74f82eea3b59@weissschuh.net
2025-05-21 15:32:24 +02:00