Merge cpupower utility updates, including a fix and improvements of the
existing functionality, for 7.0-rc4.
* pm-tools:
cpupower: Add intel_pstate turbo boost support for Intel platforms
cpupower: Add support for setting EPP via systemd service
cpupower: fix swapped power/energy unit labels
Pull AppArmor fixes from John Johansen:
- fix race between freeing data and fs accessing it
- fix race on unreferenced rawdata dereference
- fix differential encoding verification
- fix unconfined unprivileged local user can do privileged policy management
- Fix double free of ns_name in aa_replace_profiles()
- fix missing bounds check on DEFAULT table in verify_dfa()
- fix side-effect bug in match_char() macro usage
- fix: limit the number of levels of policy namespaces
- replace recursive profile removal with iterative approach
- fix memory leak in verify_header
- validate DFA start states are in bounds in unpack_pdb
* tag 'apparmor-pr-mainline-2026-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: fix race between freeing data and fs accessing it
apparmor: fix race on rawdata dereference
apparmor: fix differential encoding verification
apparmor: fix unprivileged local user can do privileged policy management
apparmor: Fix double free of ns_name in aa_replace_profiles()
apparmor: fix missing bounds check on DEFAULT table in verify_dfa()
apparmor: fix side-effect bug in match_char() macro usage
apparmor: fix: limit the number of levels of policy namespaces
apparmor: replace recursive profile removal with iterative approach
apparmor: fix memory leak in verify_header
apparmor: validate DFA start states are in bounds in unpack_pdb
Merge an ACPI OS services layer (OSL) fix that addresses sparse warnings
in acpi_os_initialize() (Ben Dooks)
* acpi-osl:
ACPI: OSL: fix __iomem type on return from acpi_os_map_generic_address()
Add "do no harm" testing of EFER, CR0, CR4, and CR8 for SEV+ guests to
verify that the guest can read and write the registers, without hitting
e.g. a #VC on SEV-ES guests due to KVM incorrectly trying to intercept a
register.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20260310211841.2552361-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There's a gap between when the buffer was grabbed and when it
potentially gets recycled, where if the list is empty, someone could've
upgraded it to a ring provided type. This can happen if the request
is forced via io-wq. The legacy recycling is missing checking if the
buffer_list still exists, and if it's of the correct type. Add those
checks.
Cc: stable@vger.kernel.org
Fixes: c7fb19428d ("io_uring: add support for ring mapped supplied buffers")
Reported-by: Keenan Dong <keenanat2000@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Starting with the upcoming Rust 1.96.0 (to be released 2026-05-28),
`rustc` introduces the new lint `unused_features` [1], which warns [2]:
warning: feature `used_with_arg` is declared but not used
--> <crate attribute>:1:93
|
1 | #![feature(asm_const,asm_goto,arbitrary_self_types,lint_reasons,offset_of_nested,raw_ref_op,used_with_arg)]
| ^^^^^^^^^^^^^
|
= note: `#[warn(unused_features)]` (part of `#[warn(unused)]`) on by default
The original goal of using `-Zcrate-attr` automatically was that there
is a consistent set of features enabled and managed globally for all
Rust kernel code (modulo exceptions like the `rust/` crated).
While we could require crates to enable features manually (even if we
still keep the `-Zallow-features=` list, i.e. removing the `-Zcrate-attr`
list), it is not really worth making all developers worry about it just
for a new lint.
The features are expected to eventually become stable anyway (most already
did), and thus having to remove features in every file that may use them
is not worth it either.
Thus just allow the new lint globally.
The lint actually existed for a long time, which is why `rustc` does
not complain about an unknown lint in the stable versions we support,
but it was "disabled" years ago [3], and now it was made to work again.
For extra context, the new implementation of the lint has already been
improved to avoid linting about features that became stable thanks to
Benno's report and the ensuing discussion [4] [5], but while that helps,
it is still the case that we may have features enabled that are not used
for one reason or another in a particular crate.
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust/pull/152164 [1]
Link: https://github.com/Rust-for-Linux/pin-init/pull/114 [2]
Link: https://github.com/rust-lang/rust/issues/44232 [3]
Link: https://github.com/rust-lang/rust/issues/153523 [4]
Link: https://github.com/rust-lang/rust/pull/153610 [5]
Reviewed-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260312111014.74198-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
ASoC: Fixes for v7.0
Quite a large pull request, but nothing too concerning here - everything
is fairly small. We've got a couple of smaller core fixes for races on
card teardown from Matteo Cotifava, a fix for handling dodgy DMI
information generated by u-boot, some driver specific fixes and some new
device IDs for Tegra.
We use a unit struct `__InitOk` in the closure generated by the
initializer macros as the return value. We shadow it by creating a
struct with the same name again inside of the closure, preventing early
returns of `Ok` in the initializer (before all fields have been
initialized).
In the face of Type Alias Impl Trait (TAIT) and the next trait solver,
this solution no longer works [1]. The shadowed struct can be named
through type inference. In addition, there is an RFC proposing to add
the feature of path inference to Rust, which would similarly allow [2].
Thus remove the shadowed token and replace it with an `unsafe` to create
token.
The reason we initially used the shadowing solution was because an
alternative solution used a builder pattern. Gary writes [3]:
In the early builder-pattern based InitOk, having a single InitOk
type for token is unsound because one can launder an InitOk token
used for one place to another initializer. I used a branded lifetime
solution, and then you figured out that using a shadowed type would
work better because nobody could construct it at all.
The laundering issue does not apply to the approach we ended up with
today.
With this change, the example by Tim Chirananthavat in [1] no longer
compiles and results in this error:
error: cannot construct `pin_init::__internal::InitOk` with struct literal syntax due to private fields
--> src/main.rs:26:17
|
26 | InferredType {}
| ^^^^^^^^^^^^
|
= note: private field `0` that was not provided
help: you might have meant to use the `new` associated function
|
26 - InferredType {}
26 + InferredType::new()
|
Applying the suggestion of using the `::new()` function, results in
another expected error:
error[E0133]: call to unsafe function `pin_init::__internal::InitOk::new` is unsafe and requires unsafe block
--> src/main.rs:26:17
|
26 | InferredType::new()
| ^^^^^^^^^^^^^^^^^^^ call to unsafe function
|
= note: consult the function's documentation for information on how to avoid undefined behavior
Reported-by: Tim Chirananthavat <theemathas@gmail.com>
Link: https://github.com/rust-lang/rust/issues/153535 [1]
Link: https://github.com/rust-lang/rfcs/pull/3444#issuecomment-4016145373 [2]
Link: https://github.com/rust-lang/rust/issues/153535#issuecomment-4017620804 [3]
Fixes: fc6c6baa1f ("rust: init: add initialization macros")
Cc: stable@vger.kernel.org
Signed-off-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260311105056.1425041-1-lossin@kernel.org
[ Added period as mentioned. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
The new PowerPC VMX fast path (__copy_tofrom_user_power7_vmx) is not
exercised by existing copyloops selftests. This patch updates
the selftest to exercise the VMX variant, ensuring the VMX copy path
is validated.
Changes include:
- COPY_LOOP=test___copy_tofrom_user_power7_vmx with -D VMX_TEST is used
in existing selftest build targets.
- Inclusion of ../utils.c to provide get_auxv_entry() for hardware
feature detection.
- At runtime, the test skips execution if Altivec is not available.
- Copy sizes above VMX_COPY_THRESHOLD are used to ensure the VMX
path is taken.
This enables validation of the VMX fast path without affecting systems
that do not support Altivec.
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260304122201.153049-2-sayalip@linux.ibm.com
On powerpc with PREEMPT_FULL or PREEMPT_LAZY and function tracing enabled,
KUAP warnings can be triggered from the VMX usercopy path under memory
stress workloads.
KUAP requires that no subfunctions are called once userspace access has
been enabled. The existing VMX copy implementation violates this
requirement by invoking enter_vmx_usercopy() from the assembly path after
userspace access has already been enabled. If preemption occurs
in this window, the AMR state may not be preserved correctly,
leading to unexpected userspace access state and resulting in
KUAP warnings.
Fix this by restructuring the VMX usercopy flow so that VMX selection
and VMX state management are centralized in raw_copy_tofrom_user(),
which is invoked by the raw_copy_{to,from,in}_user() wrappers.
The new flow is:
- raw_copy_{to,from,in}_user() calls raw_copy_tofrom_user()
- raw_copy_tofrom_user() decides whether to use the VMX path
based on size and CPU capability
- Call enter_vmx_usercopy() before enabling userspace access
- Enable userspace access as per the copy direction
and perform the VMX copy
- Disable userspace access as per the copy direction
- Call exit_vmx_usercopy()
- Fall back to the base copy routine if the VMX copy faults
With this change, the VMX assembly routines no longer perform VMX state
management or call helper functions; they only implement the
copy operations.
The previous feature-section based VMX selection inside
__copy_tofrom_user_power7() is removed, and a dedicated
__copy_tofrom_user_power7_vmx() entry point is introduced.
This ensures correct KUAP ordering, avoids subfunction calls
while KUAP is unlocked, and eliminates the warnings while preserving
the VMX fast path.
Fixes: de78a9c42a ("powerpc: Add a framework for Kernel Userspace Access Protection")
Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Closes: https://lore.kernel.org/all/20260109064917.777587-2-sshegde@linux.ibm.com/
Suggested-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260304122201.153049-1-sayalip@linux.ibm.com
It may happen that mm is already released, which leads to kernel panic.
This adds the NULL check for current->mm, similarly to
commit 20afc60f89 ("x86, perf: Check that current->mm is alive before getting user callchain").
I was getting this panic when running a profiling BPF program
(profile.py from bcc-tools):
[26215.051935] Kernel attempted to read user page (588) - exploit attempt? (uid: 0)
[26215.051950] BUG: Kernel NULL pointer dereference on read at 0x00000588
[26215.051952] Faulting instruction address: 0xc00000000020fac0
[26215.051957] Oops: Kernel access of bad area, sig: 11 [#1]
[...]
[26215.052049] Call Trace:
[26215.052050] [c000000061da6d30] [c00000000020fc10] perf_callchain_user_64+0x2d0/0x490 (unreliable)
[26215.052054] [c000000061da6dc0] [c00000000020f92c] perf_callchain_user+0x1c/0x30
[26215.052057] [c000000061da6de0] [c0000000005ab2a0] get_perf_callchain+0x100/0x360
[26215.052063] [c000000061da6e70] [c000000000573bc8] bpf_get_stackid+0x88/0xf0
[26215.052067] [c000000061da6ea0] [c008000000042258] bpf_prog_16d4ab9ab662f669_do_perf_event+0xf8/0x274
[...]
In addition, move storing the top-level stack entry to generic
perf_callchain_user to make sure the top-evel entry is always captured,
even if current->mm is NULL.
Fixes: 20002ded4d ("perf_counter: powerpc: Add callchain support")
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Tested-by: Qiao Zhao <qzhao@redhat.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
[Maddy: fixed message to avoid checkpatch format style error]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260309144045.169427-1-vmalik@redhat.com
commit 4267739cab ("arch, mm: consolidate initialization of SPARSE memory model"),
changed the initialization order of "pageblock_order" from...
start_kernel()
- setup_arch()
- initmem_init()
- sparse_init()
- set_pageblock_order(); // this sets the pageblock_order
- xxx_cma_reserve();
to...
start_kernel()
- setup_arch()
- xxx_cma_reserve();
- mm_core_init_early()
- free_area_init()
- sparse_init()
- set_pageblock_order() // this sets the pageblock_order.
So this means, pageblock_order is not initialized before these cma
reservation function calls, hence we are seeing CMA failures like...
[ 0.000000] kvm_cma_reserve: reserving 3276 MiB for global area
[ 0.000000] cma: pageblock_order not yet initialized. Called during early boot?
[ 0.000000] cma: Failed to reserve 3276 MiB
....
[ 0.000000][ T0] cma: pageblock_order not yet initialized. Called during early boot?
[ 0.000000][ T0] cma: Failed to reserve 1024 MiB
This patch moves these CMA reservations to arch_mm_preinit() which
happens in mm_core_init() (which happens after pageblock_order is
initialized), but before the memblock moves the free memory to buddy.
Fixes: 4267739cab ("arch, mm: consolidate initialization of SPARSE memory model")
Suggested-by: Mike Rapoport <rppt@kernel.org>
Reported-and-tested-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Closes: https://lore.kernel.org/linuxppc-dev/4c338a29-d190-44f3-8874-6cfa0a031f0b@linux.ibm.com/
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Tested-by: Dan Horák <dan@danny.cz>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/6e532cf0db5be99afbe20eed699163d5e86cd71f.1772303986.git.ritesh.list@gmail.com
Fixes for v7.0:
Core:
- Adjusted msm_iommu_pagetable_prealloc_allocate() allocation type
DPU:
- Fixed blue screens on Hamoa laptops by reverting the LM reservation
- Fixed the size of the LM block on several platforms
- Dropped usage of %pK (again)
- Fixed smatch warning on SSPP v13+ code
- Fixed INTF_6 interrupts on Lemans
DSI:
- Fixed DSI PHY revision on Kaanapali
- Fixed pixel clock calculation for the bonded DSI mode panels with
compression enabled
DT bindings:
- Fixed DisplayPort description on Glymur
- Fixed model name in SM8750 MDSS schema
GPU:
- Added MODULE_DEVICE_TABLE to the GPU driver
- Fix bogus protect error on X2-85
- Fix dma_free_attrs() buffer size
- Gen8 UBWC fix for Glymur
From: Rob Clark <rob.clark@oss.qualcomm.com>
Link: https://patch.msgid.link/CACSVV00wZ95gFDLfzJ0Ywb8rsjPSjZ1aHdwE4smnyuZ=Fg-g8Q@mail.gmail.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
The NIX RAS health reporter recovery routine checks nix_af_rvu_int to
decide whether to re-enable NIX_AF_RAS interrupts. This is the RVU
interrupt status field and is unrelated to RAS events, so the recovery
flow may incorrectly skip re-enabling NIX_AF_RAS interrupts.
Check nix_af_rvu_ras instead before writing NIX_AF_RAS_ENA_W1S.
Fixes: 5ed66306ea ("octeontx2-af: Add devlink health reporters for NIX")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Link: https://patch.msgid.link/20260310184824.1183651-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The "rx_filter" member of "hwtstamp_config" structure is an enum field and
does not support bitwise OR combination of multiple filter values. It
causes error while linuxptp application tries to match rx filter version.
Fix this by storing the requested filter type in a new port field.
Fixes: 97248adb5a ("net: ti: am65-cpsw: Update hw timestamping filter for PTPv1 RX packets")
Signed-off-by: Chintan Vankar <c-vankar@ti.com>
Link: https://patch.msgid.link/20260310160940.109822-1-c-vankar@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Florian Westphal says:
====================
netfilter: updates for net
Due to large volume of backlogged patches its unlikely I will make the
2nd planned PR this week, so several legit fixes will be pushed back
to next week. Sorry for the inconvenience but I am out of ideas and
alternatives.
1) syzbot managed to add/remove devices to a flowtable, due to a bug in
the flowtable netdevice notifier this gets us a double-add and
eventually UaF when device is removed again (we only expect one
entry, duplicate remains past net_device end-of-life).
From Phil Sutter, bug added in 6.16.
2) Yiming Qian reports another nf_tables transaction handling bug:
in some cases error unwind misses to undo certain set elements,
resulting in refcount underflow and use-after-free, bug added in 6.4.
3) Jenny Guanni Qu found out-of-bounds read in pipapo set type.
While the value is never used, it still rightfully triggers KASAN
splats. Bug exists since this set type was added in 5.6.
4) a few x_tables modules contain copypastry tcp option parsing code which
can read 1 byte past the option area. This bug is ancient, fix from
David Dull.
5) nfnetlink_queue leaks kernel memory if userspace provides bad
NFQA_VLAN/NFQA_L2HDR attributes. From Hyunwoo Kim, bug stems from
from 4.7 days.
6) nfnetlink_cthelper has incorrect loop restart logic which may result
in reading one pointer past end of array. From 3.6 days, fix also from
Hyunwoo Kim.
7) xt_IDLETIMER v0 extension must reject working with timers added
by revision v1, else we get list corruption. Bug added in v5.7.
From Yifan Wu, Juefei Pu and Yuan Tan via Xin Lu.
* tag 'nf-26-03-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: xt_IDLETIMER: reject rev0 reuse of ALARM timer labels
netfilter: nfnetlink_cthelper: fix OOB read in nfnl_cthelper_dump_table()
netfilter: nfnetlink_queue: fix entry leak in bridge verdict error path
netfilter: x_tables: guard option walkers against 1-byte tail reads
netfilter: nft_set_pipapo: fix stack out-of-bounds read in pipapo_drop()
netfilter: nf_tables: always walk all pending catchall elements
netfilter: nf_tables: Fix for duplicate device in netdev hooks
====================
Link: https://patch.msgid.link/20260310132050.630-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2026-03-10 (ice, iavf, i40e, e1000e, e1000)
Nikolay Aleksandrov changes return code of RDMA related ice devlink get
parameters when irdma is not enabled to -EOPNOTSUPP as current return
of -ENODEV causes issues with devlink output.
Petr Oros resolves a couple of issues in iavf; freeing PTP resources
before reset and disable. Fixing contention issues with the netdev lock
between reset and some ethtool operations.
Alok Tiwari corrects an incorrect comparison of cloud filter values and
adjust some passed arguments to sizeof() for consistency on i40e.
Matt Vollrath removes an incorrect decrement for DMA error on e1000 and
e1000e drivers.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000/e1000e: Fix leak in DMA error cleanup
i40e: fix src IP mask checks and memcpy argument names in cloud filter
iavf: fix incorrect reset handling in callbacks
iavf: fix PTP use-after-free during reset
drivers: net: ice: fix devlink parameters get without irdma
====================
Link: https://patch.msgid.link/20260310205654.4109072-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca says:
====================
neighbour: fix update of proxy neighbour
While re-reading some "old" patches I ran into a small change of
behavior in commit dc2a27e524 ("neighbour: Update pneigh_entry in
pneigh_create().").
The old behavior was not consistent between ->protocol and ->flags,
and didn't offer a way to clear protocol, so maybe it's better to
change that (7-years-old [1]) behavior. But then we should change
non-proxy neighbours as well to keep neigh/pneigh consistent.
[1] df9b0e30d4 ("neighbor: Add protocol attribute")
====================
Link: https://patch.msgid.link/cover.1772894876.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The rtl8366rb_led_group_port_mask() function always returns LED port
bit in LED group 0; the switch statement returns the same thing in all
non-default cases.
This means that the driver does not currently support configuring LEDs
in non-zero LED groups.
Fix this.
Fixes: 32d6170054 ("net: dsa: realtek: add LED drivers for rtl8366rb")
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260311111237.29002-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A user can set conn_timeout to any value via
setsockopt(TIPC_CONN_TIMEOUT), including values less than 4. When a
SYN is rejected with TIPC_ERR_OVERLOAD and the retry path in
tipc_sk_filter_connect() executes:
delay %= (tsk->conn_timeout / 4);
If conn_timeout is in the range [0, 3], the integer division yields 0,
and the modulo operation triggers a divide-by-zero exception, causing a
kernel oops/panic.
Fix this by clamping conn_timeout to a minimum of 4 at the point of use
in tipc_sk_filter_connect().
Oops: divide error: 0000 [#1] SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 119 Comm: poc-F144 Not tainted 7.0.0-rc2+
RIP: 0010:tipc_sk_filter_rcv (net/tipc/socket.c:2236 net/tipc/socket.c:2362)
Call Trace:
tipc_sk_backlog_rcv (include/linux/instrumented.h:82 include/linux/atomic/atomic-instrumented.h:32 include/net/sock.h:2357 net/tipc/socket.c:2406)
__release_sock (include/net/sock.h:1185 net/core/sock.c:3213)
release_sock (net/core/sock.c:3797)
tipc_connect (net/tipc/socket.c:2570)
__sys_connect (include/linux/file.h:62 include/linux/file.h:83 net/socket.c:2098)
Fixes: 6787927475 ("tipc: buffer overflow handling in listener socket")
Cc: stable@vger.kernel.org
Signed-off-by: Mehul Rao <mehulrao@gmail.com>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Link: https://patch.msgid.link/20260310170730.28841-1-mehulrao@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never
initialized because inet6_init() exits before ndisc_init() is called which
initializes it. If bpf_redirect_neigh() is called with explicit AF_INET6
nexthop parameters, __bpf_redirect_neigh_v6() can skip the IPv6 FIB lookup
and call bpf_out_neigh_v6() directly. bpf_out_neigh_v6() then calls
ip_neigh_gw6(), which uses ipv6_stub->nd_tbl.
BUG: kernel NULL pointer dereference, address: 0000000000000248
Oops: Oops: 0000 [#1] SMP NOPTI
RIP: 0010:skb_do_redirect+0x44f/0xf40
Call Trace:
<TASK>
? srso_alias_return_thunk+0x5/0xfbef5
? __tcf_classify.constprop.0+0x83/0x160
? srso_alias_return_thunk+0x5/0xfbef5
? tcf_classify+0x2b/0x50
? srso_alias_return_thunk+0x5/0xfbef5
? tc_run+0xb8/0x120
? srso_alias_return_thunk+0x5/0xfbef5
__dev_queue_xmit+0x6fa/0x1000
? srso_alias_return_thunk+0x5/0xfbef5
packet_sendmsg+0x10da/0x1700
? srso_alias_return_thunk+0x5/0xfbef5
__sys_sendto+0x1f3/0x220
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x101/0xf80
? exc_page_fault+0x6e/0x170
? srso_alias_return_thunk+0x5/0xfbef5
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Fix this by adding an early check in bpf_out_neigh_v6(). If IPv6 is
disabled, drop the packet before neighbor lookup.
Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de>
Fixes: ba452c9e99 ("bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-4-e2677e85628c@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never
initialized because inet6_init() exits before ndisc_init() is called which
initializes it. If bpf_redirect_neigh() is called from tc with an explicit
nexthop of nh_family == AF_INET6, bpf_out_neigh_v4() takes the AF_INET6
branch and calls ip_neigh_gw6(), which relies on ipv6_stub->nd_tbl.
BUG: kernel NULL pointer dereference, address: 0000000000000248
Oops: Oops: 0000 [#1] SMP NOPTI
RIP: 0010:skb_do_redirect+0xb93/0xf00
Call Trace:
<TASK>
? srso_alias_return_thunk+0x5/0xfbef5
? __tcf_classify.constprop.0+0x83/0x160
? srso_alias_return_thunk+0x5/0xfbef5
? tcf_classify+0x2b/0x50
? srso_alias_return_thunk+0x5/0xfbef5
? tc_run+0xb8/0x120
? srso_alias_return_thunk+0x5/0xfbef5
__dev_queue_xmit+0x6fa/0x1000
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? alloc_skb_with_frags+0x58/0x200
packet_sendmsg+0x10da/0x1700
? srso_alias_return_thunk+0x5/0xfbef5
__sys_sendto+0x1f3/0x220
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x101/0xf80
? exc_page_fault+0x6e/0x170
? srso_alias_return_thunk+0x5/0xfbef5
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Fix this by adding an early check in the AF_INET6 branch of
bpf_out_neigh_v4(). If IPv6 is disabled, unlock RCU and drop the packet.
Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de>
Fixes: ba452c9e99 ("bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-3-e2677e85628c@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When booting with the 'ipv6.disable=1' parameter, the nd_tbl is never
initialized because inet6_init() exits before ndisc_init() is called
which initializes it. If bonding ARP/NS validation is enabled, an IPv6
NS/NA packet received on a slave can reach bond_validate_na(), which
calls bond_has_this_ip6(). That path calls ipv6_chk_addr() and can
crash in __ipv6_chk_addr_and_flags().
BUG: kernel NULL pointer dereference, address: 00000000000005d8
Oops: Oops: 0000 [#1] SMP NOPTI
RIP: 0010:__ipv6_chk_addr_and_flags+0x69/0x170
Call Trace:
<IRQ>
ipv6_chk_addr+0x1f/0x30
bond_validate_na+0x12e/0x1d0 [bonding]
? __pfx_bond_handle_frame+0x10/0x10 [bonding]
bond_rcv_validate+0x1a0/0x450 [bonding]
bond_handle_frame+0x5e/0x290 [bonding]
? srso_alias_return_thunk+0x5/0xfbef5
__netif_receive_skb_core.constprop.0+0x3e8/0xe50
? srso_alias_return_thunk+0x5/0xfbef5
? update_cfs_rq_load_avg+0x1a/0x240
? srso_alias_return_thunk+0x5/0xfbef5
? __enqueue_entity+0x5e/0x240
__netif_receive_skb_one_core+0x39/0xa0
process_backlog+0x9c/0x150
__napi_poll+0x30/0x200
? srso_alias_return_thunk+0x5/0xfbef5
net_rx_action+0x338/0x3b0
handle_softirqs+0xc9/0x2a0
do_softirq+0x42/0x60
</IRQ>
<TASK>
__local_bh_enable_ip+0x62/0x70
__dev_queue_xmit+0x2d3/0x1000
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? packet_parse_headers+0x10a/0x1a0
packet_sendmsg+0x10da/0x1700
? kick_pool+0x5f/0x140
? srso_alias_return_thunk+0x5/0xfbef5
? __queue_work+0x12d/0x4f0
__sys_sendto+0x1f3/0x220
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x101/0xf80
? exc_page_fault+0x6e/0x170
? srso_alias_return_thunk+0x5/0xfbef5
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Fix this by checking ipv6_mod_enabled() before dispatching IPv6 packets to
bond_na_rcv(). If IPv6 is disabled, return early from bond_rcv_validate()
and avoid the path to ipv6_chk_addr().
Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de>
Fixes: 4e24be018e ("bonding: add new parameter ns_targets")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20260307-net-nd_tbl_fixes-v4-2-e2677e85628c@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When retrans mount option was introduced, the default value was set
as 1. However, in the light of some bugs that this has exposed recently
we should change it to 0 and retain the old behaviour before this option
was introduced.
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
When looking up open handles to be re-used in cifs_open(), calling
cifs_get_{writable,readable}_path() is wrong as it will look up for
the first matching open handle, and if @file->f_flags doesn't match,
it will ignore the remaining open handles in
cifsInodeInfo::openFileList that might potentially match
@file->f_flags.
For writable and readable handles, fix this by calling
__cifs_get_writable_file() and __find_readable_file(), respectively,
with FIND_OPEN_FLAGS set.
With the patch, the following program ends up with two opens instead
of three sent over the wire.
```
#define _GNU_SOURCE
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
int fd;
fd = open("/mnt/1/foo", O_CREAT | O_WRONLY | O_TRUNC, 0664);
close(fd);
fd = open("/mnt/1/foo", O_DIRECT | O_WRONLY);
close(fd);
fd = open("/mnt/1/foo", O_WRONLY);
close(fd);
fd = open("/mnt/1/foo", O_DIRECT | O_WRONLY);
close(fd);
return 0;
}
```
```
$ mount.cifs //srv/share /mnt/1 -o ...
$ gcc test.c && ./a.out
```
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
parse_server_interfaces() initializes interface socket addresses with
CIFS_PORT. When the mount uses a non-default port this overwrites the
configured destination port.
Later, cifs_chan_update_iface() copies this sockaddr into server->dstaddr,
causing reconnect attempts to use the wrong port after server interface
updates.
Use the existing port from server->dstaddr instead.
Cc: stable@vger.kernel.org
Fixes: fe856be475 ("CIFS: parse and store info on iface queries")
Tested-by: Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The DesignWare I3C master controller ACKs IBIs as soon as a valid
Device Address Table (DAT) entry is present. This can create a race
between device attachment (after DAA) and the point where the client
driver enables IBIs via i3c_device_enable_ibi().
Set DEV_ADDR_TABLE_SIR_REJECT in the DAT entry during
attach_i3c_dev() and reattach_i3c_dev() so that IBIs are rejected
by default. The bit is managed thereafter by the existing
dw_i3c_master_set_sir_enabled() function, which clears it in
enable_ibi() after ENEC is issued, and restores it in disable_ibi()
after DISEC.
Fixes: 1dd728f5d4 ("i3c: master: Add driver for Synopsys DesignWare IP")
Signed-off-by: Adrian Ng Ho Yin <adrianhoyin.ng@altera.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/53f5b8cbdd8af789ec38b95b02873f32f9182dd6.1770962368.git.adrianhoyin.ng@altera.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The DesignWare I3C master driver creates a virtual I2C adapter to
provide backward compatibility with I2C devices. However, the current
implementation does not associate this virtual adapter with any
Device Tree node.
Propagate the of_node from the I3C master platform device to the
virtual I2C adapter's device structure. This ensures that standard
I2C aliases are correctly resolved and bus numbering remains consistent.
Signed-off-by: Peter Yin <peteryin.openbmc@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260302075645.1492766-1-peteryin.openbmc@gmail.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Disruption of the MIPI I3C HCI controller's internal state can cause
i3c_hci_bus_disable() to fail when attempting to shut down the bus.
In the code paths where bus disable is invoked - bus clean-up and runtime
suspend - the controller does not need to remain operational afterward, so
a full controller reset is a safe recovery mechanism.
Add a fallback to issue a software reset when disabling the bus fails.
This ensures the bus is reliably halted even if the controller's state
machine is stuck or unresponsive.
The fallback is used both during bus clean-up and in the runtime suspend
path. In the latter case, ensure interrupts are quiesced after reset.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-15-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Shared interrupts may fire unexpectedly, including during periods when the
controller is not yet fully initialized. Commit b9a15012a1
("i3c: mipi-i3c-hci: Add optional Runtime PM support") addressed this issue
for the runtime-suspended state, but the same problem can also occur before
the bus is enabled for the first time.
Ensure the IRQ handler ignores interrupts until initialization is complete
by making consistent use of the existing irq_inactive flag. The flag is
now set to false immediately before enabling the bus.
To guarantee correct ordering with respect to the IRQ handler, protect
all transitions of irq_inactive with the same spinlock used inside the
handler.
Fixes: b8460480f6 ("i3c: mipi-i3c-hci: Allow for Multi-Bus Instances")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-14-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The DMA ring halts whenever a transfer encounters an error. The interrupt
handler previously attempted to detect this situation and restart the ring
if a transfer completed at the same time. However, this restart logic runs
entirely in interrupt context and is inherently racy: it interacts with
other paths manipulating the ring state, and fully serializing it within
the interrupt handler is not practical.
Move this error-recovery logic out of the interrupt handler and into the
transfer-processing path (i3c_hci_process_xfer()), where serialization and
state management are already controlled. Introduce a new optional I/O-ops
callback, handle_error(), invoked when a completed transfer reports an
error. For DMA operation, the implementation simply calls the existing
dequeue function, which safely aborts and restarts the ring when needed.
This removes the fragile ring-restart logic from the interrupt handler and
centralizes error handling where proper sequencing can be ensured.
Fixes: ccdb2e0e3b ("i3c: mipi-i3c-hci: Add Intel specific quirk to ring resuming")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-13-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Several parts of the MIPI I3C HCI driver duplicate the same sequence for
queuing a transfer, waiting for completion, and handling timeouts. This
logic appears in five separate locations and will be affected by an
upcoming fix.
Refactor the repeated code into a new helper, i3c_hci_process_xfer(), and
store the timeout value in the hci_xfer structure so that callers do not
need to pass it as a separate parameter.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-12-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The DMA dequeue path attempts to restart the ring after aborting an
in-flight transfer, but the current sequence is incomplete. The controller
must be brought out of the aborted state and the ring control registers
must be programmed in the correct order: first clearing ABORT, then
re-enabling the ring and asserting RUN_STOP to resume operation.
Add the missing controller resume step and update the ring control writes
so that the ring is restarted using the proper sequence.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-11-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The internal control command descriptor used for no-op commands includes a
Transaction ID (TID) field, but the no-op command constructed in
hci_dma_dequeue_xfer() omitted it. As a result, the hardware receives a
no-op descriptor without the expected TID.
This bug has gone unnoticed because the TID is currently not validated in
the no-op completion path, but the descriptor format requires it to be
present.
Add the missing TID field when generating a no-op descriptor so that its
layout matches the defined command structure.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-10-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The logic used to abort the DMA ring contains several flaws:
1. The driver unconditionally issues a ring abort even when the ring has
already stopped.
2. The completion used to wait for abort completion is never
re-initialized, resulting in incorrect wait behavior.
3. The abort sequence unintentionally clears RING_CTRL_ENABLE, which
resets hardware ring pointers and disrupts the controller state.
4. If the ring is already stopped, the abort operation should be
considered successful without attempting further action.
Fix the abort handling by checking whether the ring is running before
issuing an abort, re-initializing the completion when needed, ensuring that
RING_CTRL_ENABLE remains asserted during abort, and treating an already
stopped ring as a successful condition.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-9-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The DMA ring bookkeeping in the MIPI I3C HCI driver is updated from two
contexts: the DMA ring dequeue path (hci_dma_dequeue_xfer()) and the
interrupt handler (hci_dma_xfer_done()). Both modify the ring's
in-flight transfer state - specifically rh->src_xfers[] and
xfer->ring_entry - but without any serialization. This allows the two
paths to race, potentially leading to inconsistent ring state.
Serialize access to the shared ring state by extending the existing
spinlock to cover the DMA dequeue path and the entire interrupt handler.
Since the core IRQ handler now holds this lock, remove the per-function
locking from the PIO and DMA sub-handlers.
Additionally, clear the completed entry in rh->src_xfers[] in
hci_dma_xfer_done() so it cannot be matched or completed again.
Finally, place the ring restart sequence under the same lock in
hci_dma_dequeue_xfer() to avoid concurrent enqueue or completion
operations while the ring state is being modified.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-8-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The HCI DMA dequeue path (hci_dma_dequeue_xfer()) may be invoked for
multiple transfers that timeout around the same time. However, the
function is not serialized and can race with itself.
When a timeout occurs, hci_dma_dequeue_xfer() stops the ring, processes
incomplete transfers, and then restarts the ring. If another timeout
triggers a parallel call into the same function, the two instances may
interfere with each other - stopping or restarting the ring at unexpected
times.
Add a mutex so that hci_dma_dequeue_xfer() is serialized with respect to
itself.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-7-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The I3C subsystem allows multiple transfers to be queued concurrently.
However, the MIPI I3C HCI driver's DMA enqueue path, hci_dma_queue_xfer(),
lacks sufficient serialization.
In particular, the allocation of the enqueue_ptr and its subsequent update
in the RING_OPERATION1 register, must be done atomically. Otherwise, for
example, it would be possible for 2 transfers to be allocated the same
enqueue_ptr.
Extend the use of the existing spinlock for that purpose. Keep a count of
the number of xfers enqueued so that it is easy to determine if the ring
has enough space.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-6-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
The MIPI I3C HCI driver currently uses separate spinlocks for different
contexts (PIO vs. DMA rings). This split is unnecessary and complicates
upcoming fixes. The driver does not support concurrent PIO and DMA
operation, and it only supports a single DMA ring, so a single lock is
sufficient for all paths.
Introduce a unified spinlock in struct i3c_hci, switch both PIO and DMA
code to use it, and remove the per-context locks.
No functional change is intended in this patch.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-5-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Prepare for fixing a race in the DMA ring enqueue path when handling
parallel transfers. Move all DMA mapping out of hci_dma_queue_xfer()
and into a new helper that performs the mapping up front.
This refactoring allows the upcoming fix to extend the spinlock coverage
around the enqueue operation without performing DMA mapping under the
spinlock.
No functional change is intended in this patch.
Fixes: 9ad9a52cce ("i3c/master: introduce the mipi-i3c-hci driver")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260306072451.11131-4-adrian.hunter@intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>