Especially once we hit one of the assertions in
sanity_check_pinned_pages(), observing follow-up assertions failing in
other code can give good clues about what went wrong, so use
VM_WARN_ON_ONCE instead.
While at it, let's just convert all VM_BUG_ON to VM_WARN_ON_ONCE as well.
Add one comment for the pfn_valid() check.
We have to introduce VM_WARN_ON_ONCE_VMA() to make that fly.
Drop the BUG_ON after mmap_read_lock_killable(), if that ever returns
something > 0 we're in bigger trouble. Convert the other BUG_ON's into
VM_WARN_ON_ONCE as well, they are in a similar domain "should never
happen", but more reasonable to check for during early testing.
[david@redhat.com: use the _FOLIO variant where possible, per Lorenzo]
Link: https://lkml.kernel.org/r/844bd929-a551-48e3-a12e-285cd65ba580@redhat.com
Link: https://lkml.kernel.org/r/20250604140544.688711-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Knowing how much memory is how cold can be useful for understanding
coldness and utilization efficiency of memory. The raw form of DAMON's
monitoring results has the information. Convert the raw results into the
per-byte idle time distributions and expose it as percentiles metric to
users, as a read-only DAMON_STAT parameter.
In detail, the metrics are calculated as follows. First, DAMON's
per-region access frequency and age information is converted into per-byte
idle time. If access frequency of a region is higher than zero, every
byte of the region has zero idle time. If the access frequency of a
region is zero, every byte of the region has idle time as the age of the
region. Then the logic sorts the per-byte idle times and provides the
value at 0/100, 1/100, ..., 99/100 and 100/100 location of the sorted
array.
The metric can be easily aggregated and compared on large scale production
systems. For example, if an average of 75-th percentile idle time of
machines that collected on similar time is two minutes, it means the
system's 25 percent memory is not accessed at all for two minutes or more
on average. If a workload considers two minutes as unit work time, we can
conclude its working set size is only 75 percent of the memory. If the
system utilizes proactive reclamation and it supports coldness-based
thresholds like DAMON_RECLAIM, the idle time percentiles can be used to
find a more safe or aggressive coldness threshold for aimed memory saving.
Link: https://lkml.kernel.org/r/20250604183127.13968-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The raw form of DAMON's monitoring results captures many details of the
information. However, not every bit of the information is always required
for understanding practical access patterns. Especially on real world
production systems of high scale time and size, the raw form is difficult
to be aggregated and compared.
Convert the raw monitoring results into a single number metric, namely
estimated memory bandwidth and expose it to users as a read-only
DAMON_STAT parameter. The metric represents access intensiveness
(hotness) of the system. It can easily be aggregated and compared for
high level understanding of the access pattern on large systems.
Link: https://lkml.kernel.org/r/20250604183127.13968-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm/damon: introduce DAMON_STAT for simple and practical
access monitoring", v2.
DAMON-based access monitoring is not simple due to required DAMON control
and results visualizations. Introduce a static kernel module for making
it simple. The module can be enabled without manual setup and provides
access pattern metrics that easy to fetch and understand the practical
access pattern information, namely estimated memory bandwidth and memory
idle time percentiles.
Background and Problems
=======================
DAMON can be used for monitoring data access patterns of the system and
workloads. Specifically, users can start DAMON to monitor access events
on specific address space with fine controls including address ranges to
monitor and time intervals between samplings and aggregations. The
resulting access information snapshot contains access frequency
(nr_accesses) and how long the frequency was kept (age) for each byte.
The monitoring usage is not simple and practical enough for production
usage. Users should first start DAMON with a number of parameters, and
wait until DAMON's monitoring results capture a reasonable amount of the
time data (age). In production, such manual start and wait is impractical
to capture useful information from a high number of machines in a timely
manner.
The monitoring result is also too detailed to be used on production
environments. The raw results are hard to be aggregated and/or compared
for production environments having a large scale of time, space and
machines fleet.
Users have to implement and use their own automation of DAMON control and
results processing. It is repetitive and challenging since there is no
good reference or guideline for such automation.
Solution: DAMON_STAT
====================
Implement such automation in kernel space as a static kernel module,
namely DAMON_STAT. It can be enabled at build, boot, or run time via its
build configuration or module parameter. It monitors the entire physical
address space with monitoring intervals that auto-tuned for a reasonable
amount of access observations and minimum overhead. It converts the raw
monitoring results into simpler metrics that can easily be aggregated and
compared, namely estimated memory bandwidth and idle time percentiles.
Understanding of the metrics and the user interface of DAMON_STAT is
essential. Refer to the commit messages of the second and the third
patches of this patch series for more details about the metrics. For the
user interface, the standard module parameters system is used. Refer to
the fourth patch of this patch series for details of the user interface.
Discussions
===========
The module aims to be useful on production environments constructed with a
large number of machines that run a long time. The auto-tuned monitoring
intervals ensure a reasonable quality of the outputs. The auto-tuning
also ensures its overhead be reasonable and low enough to be enabled
always on the production. The simplified monitoring results metrics can
be useful for showing both coldness (idle time percentiles) and hotness
(memory bandwidth) of the system's access pattern. We expect the
information can be useful for assessing system memory utilization and
inspiring optimizations or investigations on both kernel and user space
memory management logics for large scale fleets.
We hence expect the module is good enough to be just used in most
environments. For special cases that require a custom access monitoring
automation, users will still benefit by using DAMON_STAT as a reference or
a guideline for their specialized automation.
This patch (of 4):
To use DAMON for monitoring access patterns of the system, users should
manually start DAMON via DAMON sysfs ABI with a number of parameters for
specifying the monitoring target address space, address ranges, and
monitoring intervals. After that, users should also wait until desired
amount of time data is captured into DAMON's monitoring results. It is
bothersome and take a long time to be practical for access monitoring on
large fleet level production environments.
For access-aware system operations use cases like proactive cold memory
reclamation, similar problems existed. We we solved those by introducing
dedicated static kernel modules such as DAMON_RECLAIM.
Implement such static kernel module for access monitoring, namely
DAMON_STAT. It monitors the entire physical address space with auto-tuned
monitoring intervals. The auto-tuning is set to capture 4 % of observable
access events in each snapshot while keeping the sampling intervals 5
milliseconds in minimum and 10 seconds in maximum. From a few production
environments, we confirmed this setup provides high quality monitoring
results with minimum overheads. The module therefore receives only one
user input, whether to enable or disable it. It can be set on build or
boot time via build configuration or kernel boot command line. It can
also be overridden at runtime.
Note that this commit only implements the DAMON control part of the
module. Users could get the monitoring results via damon:damon_aggregated
tracepoint, but that's of course not the recommended way. Following
commits will implement convenient and optimized ways for serving the
monitoring results to users.
[sj@kernel.org: use IS_ENABLED() for enabled initial value]
Link: https://lkml.kernel.org/r/20250604205619.18929-1-sj@kernel.org
[sj@kernel.org: reset enabled when DAMON start failed]
Link: https://lkml.kernel.org/r/20250706184750.36588-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250604183127.13968-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250604183127.13968-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If a user wishes to enable KSM mergeability for an entire process and all
fork/exec'd processes that come after it, they use the prctl()
PR_SET_MEMORY_MERGE operation.
This defaults all newly mapped VMAs to have the VM_MERGEABLE VMA flag set
(in order to indicate they are KSM mergeable), as well as setting this
flag for all existing VMAs and propagating this across fork/exec.
However it also breaks VMA merging for new VMAs, both in the process and
all forked (and fork/exec'd) child processes.
This is because when a new mapping is proposed, the flags specified will
never have VM_MERGEABLE set. However all adjacent VMAs will already have
VM_MERGEABLE set, rendering VMAs unmergeable by default.
To work around this, we try to set the VM_MERGEABLE flag prior to
attempting a merge. In the case of brk() this can always be done.
However on mmap() things are more complicated - while KSM is not supported
for MAP_SHARED file-backed mappings, it is supported for MAP_PRIVATE
file-backed mappings.
These mappings may have deprecated .mmap() callbacks specified which
could, in theory, adjust flags and thus KSM eligibility.
So we check to determine whether this is possible. If not, we set
VM_MERGEABLE prior to the merge attempt on mmap(), otherwise we retain the
previous behaviour.
This fixes VMA merging for all new anonymous mappings, which covers the
majority of real-world cases, so we should see a significant improvement
in VMA mergeability.
For MAP_PRIVATE file-backed mappings, those which implement the
.mmap_prepare() hook and shmem are both known to be safe, so we allow
these, disallowing all other cases.
Also add stubs for newly introduced function invocations to VMA userland
testing.
[lorenzo.stoakes@oracle.com: correctly invoke late KSM check after mmap hook]
Link: https://lkml.kernel.org/r/5861f8f6-cf5a-4d82-a062-139fb3f9cddb@lucifer.local
Link: https://lkml.kernel.org/r/3ba660af716d87a18ca5b4e635f2101edeb56340.1748537921.git.lorenzo.stoakes@oracle.com
Fixes: d7597f59d1 ("mm: add new api to enable ksm per process") # please no backport!
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Xu Xin <xu.xin16@zte.com.cn>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Stefan Roesch <shr@devkernel.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "mm: ksm: prevent KSM from breaking merging of new VMAs", v3.
When KSM-by-default is established using prctl(PR_SET_MEMORY_MERGE), this
defaults all newly mapped VMAs to having VM_MERGEABLE set, and thus makes
them available to KSM for samepage merging. It also sets VM_MERGEABLE in
all existing VMAs.
However this causes an issue upon mapping of new VMAs - the initial flags
will never have VM_MERGEABLE set when attempting a merge with adjacent
VMAs (this is set later in the mmap() logic), and adjacent VMAs will
ALWAYS have VM_MERGEABLE set.
This renders all newly mapped VMAs unmergeable.
To avoid this, this series performs the check for PR_SET_MEMORY_MERGE far
earlier in the mmap() logic, prior to the merge being attempted.
However we run into complexity with the depreciated .mmap() callback - if
a driver hooks this, it might change flags which adjust KSM merge
eligibility.
We have to worry about this because, while KSM is only applicable to
private mappings, this includes both anonymous and MAP_PRIVATE-mapped
file-backed mappings.
This isn't a problem for brk(), where the VMA must be anonymous. However
in mmap() we must be conservative - if the VMA is anonymous then we can
always proceed, however if not, we permit only shmem mappings (whose .mmap
hook does not affect KSM eligibility) and drivers which implement
.mmap_prepare() (invoked prior to the KSM eligibility check).
If we can't be sure of the driver changing things, then we maintain the
same behaviour of performing the KSM check later in the mmap() logic (and
thus losing new VMA mergeability).
A great many use-cases for this logic will use anonymous mappings any
rate, so this change should already cover the majority of actual KSM
use-cases.
This patch (of 4):
In subsequent commits we are going to determine KSM eligibility prior to a
VMA being constructed, at which point we will of course not yet have
access to a VMA pointer.
It is trivial to boil down the check logic to be parameterised on
mm_struct, file and VMA flags, so do so.
As a part of this change, additionally expose and use file_is_dax() to
determine whether a file is being mapped under a DAX inode.
Link: https://lkml.kernel.org/r/cover.1748537921.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/36ad13eb50cdbd8aac6dcfba22c65d5031667295.1748537921.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Xu Xin <xu.xin16@zte.com.cn>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Stefan Roesch <shr@devkernel.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Introduces a new drgn script, `show_page_info.py`, which allows users
to analyze the state of a page given a process ID (PID) and a virtual
address (VADDR). This can help kernel developers or debuggers easily
inspect page-related information in a live kernel or vmcore.
The script extracts information such as the page flags, mapping, and
other metadata relevant to diagnosing memory issues.
Output example:
sudo ./show_page_info.py 1 0x7fc988181000
PID: 1 Comm: systemd mm: 0xffff8d22c4089700
RAW: 0017ffffc000416c fffff939062ff708 fffff939062ffe08 ffff8d23062a12a8
RAW: 0000000000000000 ffff8d2323438f60 0000002500000007 ffff8d23203ff500
Page Address: 0xfffff93905664e00
Page Flags: PG_referenced|PG_uptodate|PG_lru|PG_head|PG_active|
PG_private|PG_reported|PG_has_hwpoisoned
Page Size: 4096
Page PFN: 0x159938
Page Physical: 0x159938000
Page Virtual: 0xffff8d2319938000
Page Refcount: 37
Page Mapcount: 7
Page Index: 0x0
Page Memcg Data: 0xffff8d23203ff500
Memcg Name: init.scope
Memcg Path: /sys/fs/cgroup/memory/init.scope
Page Mapping: 0xffff8d23062a12a8
Page Anon/File: File
Page VMA: 0xffff8d22e06e0e40
VMA Start: 0x7fc988181000
VMA End: 0x7fc988185000
This page is part of a compound page.
This page is the head page of a compound page.
Head Page: 0xfffff93905664e00
Compound Order: 2
Number of Pages: 4
Link: https://lkml.kernel.org/r/20250530055855.687067-1-ye.liu@linux.dev
Signed-off-by: Ye Liu <liuye@kylinos.cn>
Tested-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Omar Sandoval <osandov@osandov.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The scan implementation for MGLRU was missing proportional reclaim
pressure for memcg, which contradicts the description in
Documentation/admin-guide/cgroup-v2.rst (memory.{low,min} section).
This issue can be observed in kselftest cgroup:test_memcontrol
(specifically test_memcg_min and test_memcg_low). The following table
shows the actual values observed in my local test env (on xfs) and the
error "e", which is the symmetric absolute percentage error from the ideal
values of 29M for c[0] and 21M for c[1].
test_memcg_min
| MGLRU enabled | MGLRU enabled | MGLRU disabled
| Without patch | With patch |
-----|-----------------|-----------------|---------------
c[0] | 25964544 (e=8%) | 28770304 (e=3%) | 27820032 (e=4%)
c[1] | 26214400 (e=9%) | 23998464 (e=4%) | 24776704 (e=6%)
test_memcg_low
| MGLRU enabled | MGLRU enabled | MGLRU disabled
| Without patch | With patch |
-----|-----------------|-----------------|---------------
c[0] | 26214400 (e=7%) | 27930624 (e=4%) | 27688960 (e=5%)
c[1] | 26214400 (e=9%) | 24764416 (e=6%) | 24920064 (e=6%)
Factor out the proportioning logic to a new function and have MGLRU reuse
it. While at it, update the eviction behavior via debugfs 'lru_gen'
interface ('-' command with an explicit 'nr_to_reclaim' parameter) to
ensure eviction is limited to the specified number.
Link: https://lkml.kernel.org/r/20250530162353.541882-1-den@valinux.co.jp
Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
Reviewed-by: Yuanchu Xie <yuanchu@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull /proc/sys dcache lookup fix from Al Viro:
"Fix for the breakage spotted by Neil in the interplay between
/proc/sys ->d_compare() weirdness and parallel lookups"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix proc_sys_compare() handling of in-lookup dentries
Pull scheduler fixes from Borislav Petkov:
- Fix the calculation of the deadline server task's runtime as this
mishap was preventing realtime tasks from running
- Avoid a race condition during migrate-swapping two tasks
- Fix the string reported for the "none" dynamic preemption option
* tag 'sched_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/deadline: Fix dl_server runtime calculation formula
sched/core: Fix migrate_swap() vs. hotplug
sched: Fix preemption string of preempt_dynamic_none
Pull objtool fix from Borislav Petkov:
- Fix the compilation of an x86 kernel on a big engian machine due to a
missed endianness conversion
* tag 'objtool_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Add missing endian conversion to read_annotate()
Pull perf fixes from Borislav Petkov:
- Revert uprobes to using CAP_SYS_ADMIN again as currently they can
destructively modify kernel code from an unprivileged process
- Move a warning to where it belongs
* tag 'perf_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Revert to requiring CAP_SYS_ADMIN for uprobes
perf/core: Fix the WARN_ON_ONCE is out of lock protected region
Pull x86 fix from Borislav Petkov:
- Make sure AMD SEV guests using secure TSC, include a TSC_FACTOR which
prevents their TSCs from going skewed from the hypervisor's
* tag 'x86_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Use TSC_FACTOR for Secure TSC frequency calculation
Pull locking fixes from Borislav Petkov:
- Disable FUTEX_PRIVATE_HASH for this cycle due to a performance
regression
- Add a selftests compilation product to the corresponding .gitignore
file
* tag 'locking_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/futex: Add futex_numa to .gitignore
futex: Temporary disable FUTEX_PRIVATE_HASH
Pull EDAC fix from Borislav Petkov:
- Initialize sysfs attributes properly to avoid lockdep complaining
about an uninitialized lock class
* tag 'edac_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC: Initialize EDAC features sysfs attributes
Pull RAS fixes from Borislav Petkov:
- Do not remove the MCE sysfs hierarchy if thresholding sysfs nodes
init fails due to new/unknown banks present, which in itself is not
fatal anyway; add default names for new banks
- Make sure MCE polling settings are honored after CMCI storms
- Make sure MCE threshold limit is reset after the thresholding
interrupt has been serviced
- Clean up properly and disable CMCI banks on shutdown so that a
second/kexec-ed kernel can rediscover those banks again
* tag 'ras_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Make sure CMCI banks are cleared during shutdown on Intel
x86/mce/amd: Fix threshold limit reset
x86/mce/amd: Add default names for MCA banks and blocks
x86/mce: Ensure user polling settings are honored when restarting timer
x86/mce: Don't remove sysfs if thresholding sysfs init fails
Pull irq fix from Borislav Petkov:
- Have irq-msi-lib select CONFIG_GENERIC_MSI_IRQ explicitly as it uses
its facilities
* tag 'irq_urgent_for_v6.16_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-msi-lib: Select CONFIG_GENERIC_MSI_IRQ
Pull HID fixes from Jiri Kosina:
- Memory corruption fixes in hid-appletb-kbd driver (Qasim Ijaz)
- New device ID in hid-elecom driver (Leonard Dizon)
- Fixed several HID debugfs contants (Vicki Pfau)
* tag 'hid-for-linus-2025070502' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: appletb-kbd: fix slab use-after-free bug in appletb_kbd_probe
HID: Fix debug name for BTN_GEAR_DOWN, BTN_GEAR_UP, BTN_WHEEL
HID: elecom: add support for ELECOM HUGE 019B variant
HID: appletb-kbd: fix memory corruption of input_handler_list
Pull smb client fixes from Steve French:
- Two reconnect fixes including one for a reboot/reconnect race
- Fix for incorrect file type that can be returned by SMB3.1.1 POSIX
extensions
- tcon initialization fix
- Fix for resolving Windows symlinks with absolute paths
* tag 'v6.16-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix native SMB symlink traversal
smb: client: fix race condition in negotiate timeout by using more precise timing
cifs: all initializations for tcon should happen in tcon_info_alloc
smb: client: fix warning when reconnecting channel
smb: client: fix readdir returning wrong type with POSIX extensions
Pull i2c fixes from Wolfram Sang:
- designware: initialise msg_write_idx during transfer
- microchip: check return value from core xfer call
- realtek: add 'reg' property constraint to the device tree
* tag 'i2c-for-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
dt-bindings: i2c: realtek,rtl9301: Fix missing 'reg' constraint
i2c: microchip-core: re-fix fake detections w/ i2cdetect
i2c/designware: Fix an initialization issue
Pull power management fixes from Rafael Wysocki:
"These address system suspend failures under memory pressure in some
configurations, fix up RAPL handling on platforms where PL1 cannot be
disabled, and fix a documentation typo:
- Prevent the Intel RAPL power capping driver from allowing PL1 to be
exceeded by mistake on systems when PL1 cannot be disabled (Zhang
Rui)
- Fix a typo in the ABI documentation (Sumanth Gavini)
- Allow swap to be used a bit longer during system suspend and
hibernation to avoid suspend failures under memory pressure (Mario
Limonciello)"
* tag 'pm-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: docs: Replace "diasble" with "disable"
powercap: intel_rapl: Do not change CLAMPING bit if ENABLE bit cannot be changed
PM: Restrict swap use to later in the suspend sequence
Pull ACPI fix from Rafael Wysocki:
"Revert a problematic ACPI battery driver change merged recently"
* tag 'acpi-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI: battery: negate current when discharging"
Merge fixes related to system sleep for 6.16-rc5:
- Fix typo in the ABI documentation (Sumanth Gavini).
- Allow swap to be used a bit longer during system suspend and
hibernation to avoid suspend failures under memory pressure (Mario
Limonciello).
* pm-sleep:
PM: sleep: docs: Replace "diasble" with "disable"
PM: Restrict swap use to later in the suspend sequence
Pull SoC fixes from Arnd Bergmann:
"A couple of fixes for firmware drivers have come up, addressing kernel
side bugs in op-tee and ff-a code, as well as compatibility issues
with exynos-acpm and ff-a protocols.
The only devicetree fixes are for the Apple platform, addressing
issues with conformance to the bindings for the wlan, spi and mipi
nodes"
* tag 'soc-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: dts: apple: Move touchbar mipi {address,size}-cells from dtsi to dts
arm64: dts: apple: Drop {address,size}-cells from SPI NOR
arm64: dts: apple: t8103: Fix PCIe BCM4377 nodename
optee: ffa: fix sleep in atomic context
firmware: exynos-acpm: fix timeouts on xfers handling
arm64: defconfig: update renamed PHY_SNPS_EUSB2
firmware: arm_ffa: Fix the missing entry in struct ffa_indirect_msg_hdr
firmware: arm_ffa: Replace mutex with rwlock to avoid sleep in atomic context
firmware: arm_ffa: Move memory allocation outside the mutex locking
firmware: arm_ffa: Fix memory leak by freeing notifier callback node
Pull RISC-V fixes from Palmer Dabbelt:
- kCFI is restricted to clang-17 or newer, as earlier versions have
known bugs
- sbi_hsm_hart_start is now staticly allocated, to avoid tripping up
the SBI HSM page mapping on sparse systems.
* tag 'riscv-for-linus-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: cpu_ops_sbi: Use static array for boot_data
riscv: Require clang-17 or newer for kCFI
Pull regulator fixes from Mark Brown:
"A few driver fixes (the GPIO one being potentially nasty, though it
has been there for a while without anyone reporting it), and one core
fix for the rarely used combination of coupled regulators and
unbinding"
* tag 'regulator-fix-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: gpio: Fix the out-of-bounds access to drvdata::gpiods
regulator: mp886x: Fix ID table driver_data
regulator: sy8824x: Fix ID table driver_data
regulator: tps65219: Fix devm_kmalloc size allocation
regulator: core: fix NULL dereference on unbind due to stale coupling data
Pull spi fixes from Mark Brown:
"As well as a few driver specific fixes we've got a core change here
which raises the hard coded limit on the number of devices we can
support on one SPI bus since some FPGA based systems are running into
the existing limit. This is not a good solution but it's one suitable
for this point in the release cycle, we should dynamically size the
relevant data structures which I hope will happen in the next couple
of merge windows.
We also pull in a MTD fix for the Qualcomm SNAND driver, the two fixes
cover the same issue and merging them together minimises bisection
issues"
* tag 'spi-fix-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: cadence-quadspi: fix cleanup of rx_chan on failure paths
spi: spi-fsl-dspi: Clear completion counter before initiating transfer
spi: Raise limit on number of chip selects to 24
mtd: nand: qpic_common: prevent out of bounds access of BAM arrays
spi: spi-qpic-snand: reallocate BAM transactions
Pull x86 platform drivers fixes from Ilpo Järvinen:
"Mostly a few lines fixed here and there except amd/isp4 which improves
swnodes relationships but that is a new driver not in any stable
kernels yet. The think-lmi driver changes also look relatively large
but there are just many fixes to it.
The i2c/piix4 change is a effectively a revert of the commit
7e173eb82a ("i2c: piix4: Make CONFIG_I2C_PIIX4 dependent on
CONFIG_X86") but that required moving the header out from arch/x86
under include/linux/platform_data/
Summary:
- amd/isp4: Improve swnode graph (new driver exception)
- asus-nb-wmi: Use duo keyboard quirk for Zenbook Duo UX8406CA
- dell-lis3lv02d: Add Latitude 5500 accelerometer address
- dell-wmi-sysman: Fix WMI data block retrieval and class dev unreg
- hp-bioscfg: Fix class device unregistration
- i2c: piix4: Re-enable on non-x86 + move FCH header under platform_data/
- intel/hid: Wildcat Lake support
- mellanox:
- mlxbf-pmc: Fix duplicate event ID
- mlxbf-tmfifo: Fix vring_desc.len assignment
- mlxreg-lc: Fix bit-not-set logic check
- nvsw-sn2201: Fix bus number in error message & spelling errors
- portwell-ec: Move watchdog device under correct platform hierarchy
- think-lmi: Error handling fixes (sysfs, kset, kobject, class dev unreg)
- thinkpad_acpi: Handle HKEY 0x1402 event (2025 Thinkpads)
- wmi: Fix WMI event enablement"
* tag 'platform-drivers-x86-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (22 commits)
platform/x86: think-lmi: Fix sysfs group cleanup
platform/x86: think-lmi: Fix kobject cleanup
platform/x86: think-lmi: Create ksets consecutively
platform/mellanox: mlxreg-lc: Fix logic error in power state check
i2c: Re-enable piix4 driver on non-x86
Move FCH header to a location accessible by all archs
platform/x86/intel/hid: Add Wildcat Lake support
platform/x86: dell-wmi-sysman: Fix class device unregistration
platform/x86: think-lmi: Fix class device unregistration
platform/x86: hp-bioscfg: Fix class device unregistration
platform/x86: Update swnode graph for amd isp4
platform/x86: dell-wmi-sysman: Fix WMI data block retrieval in sysfs callbacks
platform/x86: wmi: Update documentation of WCxx/WExx ACPI methods
platform/x86: wmi: Fix WMI event enablement
platform/mellanox: nvsw-sn2201: Fix bus number in adapter error message
platform/mellanox: Fix spelling and comment clarity in Mellanox drivers
platform/mellanox: mlxbf-pmc: Fix duplicate event ID for CACHE_DATA1
platform/x86: thinkpad_acpi: handle HKEY 0x1402 event
platform/x86: asus-nb-wmi: add DMI quirk for ASUS Zenbook Duo UX8406CA
platform/x86: dell-lis3lv02d: Add Latitude 5500
...
Pull USB fixes from Greg KH:
"Here are some USB driver fixes for 6.16-rc5. I originally wanted this
to get into -rc4, but there were some regressions that had to be
handled first. Now all looks good. Included in here are the following
fixes:
- cdns3 driver fixes
- xhci driver fixes
- typec driver fixes
- USB hub fixes (this is what took the longest to get right)
- new USB driver quirks added
- chipidea driver fixes
All of these have been in linux-next for a while and now we have no
more reported problems with them"
* tag 'usb-6.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits)
usb: hub: Fix flushing of delayed work used for post resume purposes
xhci: dbc: Flush queued requests before stopping dbc
xhci: dbctty: disable ECHO flag by default
xhci: Disable stream for xHC controller with XHCI_BROKEN_STREAMS
usb: xhci: quirk for data loss in ISOC transfers
usb: dwc3: gadget: Fix TRB reclaim logic for short transfers and ZLPs
usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm
usb: typec: displayport: Fix potential deadlock
usb: typec: altmodes/displayport: do not index invalid pin_assignments
usb: cdnsp: Fix issue with CV Bad Descriptor test
usb: typec: tcpm: apply vbus before data bringup in tcpm_src_attach
Revert "usb: xhci: Implement xhci_handshake_check_state() helper"
usb: xhci: Skip xhci_reset in xhci_resume if xhci is being removed
usb: gadget: u_serial: Fix race condition in TTY wakeup
Revert "usb: gadget: u_serial: Add null pointer check in gs_start_io"
usb: chipidea: udc: disconnect/reconnect from host when do suspend/resume
usb: acpi: fix device link removal
usb: hub: fix detection of high tier USB3 devices behind suspended hubs
Logitech C-270 even more broken
usb: dwc3: Abort suspend on soft disconnect failure
...
Pull input updates from Dmitry Torokhov:
- support for Acer NGR 200 Controller added to xpad driver
- xpad driver will no longer log errors about URBs at sudden disconnect
- a fix for potential NULL dereference in cs40l50-vibra driver
- several drivers have been switched to using scnprintf() to suppress
warnings about potential output truncation
* tag 'input-for-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: cs40l50-vibra - fix potential NULL dereference in cs40l50_upload_owt()
Input: alps - use scnprintf() to suppress truncation warning
Input: iqs7222 - explicitly define number of external channels
Input: xpad - support Acer NGR 200 Controller
Input: xpad - return errors from xpad_try_sending_next_out_packet() up
Input: xpad - adjust error handling for disconnect
Input: apple_z2 - drop default ARCH_APPLE in Kconfig
Input: Fully open-code compatible for grepping
dt-bindings: HID: i2c-hid: elan: Introduce Elan eKTH8D18
Input: psmouse - switch to use scnprintf() to suppress truncation warning
Input: lifebook - switch to use scnprintf() to suppress truncation warning
Input: alps - switch to use scnprintf() to suppress truncation warning
Input: atkbd - switch to use scnprintf() to suppress truncation warning
Input: fsia6b - suppress buffer truncation warning for phys
Input: iqs626a - replace snprintf() with scnprintf()