Fix missing configuration for LAN865x silicon revisions B0 and B1 as per
Microchip Application Note AN1760 (Rev F, June 2024).
The Timer Increment register was not being set, which is required for
accurate timestamping. As per the application note, configure the MAC to
set timestamping at the end of the Start of Frame Delimiter (SFD), and
set the Timer Increment register to 40 ns (corresponding to a 25 MHz
internal clock).
Link: https://www.microchip.com/en-us/application-notes/an1760
Fixes: 5cd2340cb6 ("microchip: lan865x: add driver support for Microchip's LAN865X MAC-PHY")
Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250818060514.52795-3-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This fixes an issue where the transmit queue is started implicitly only
the very first time the device is registered. When the device is taken
down and brought back up again (using `ip` or `ifconfig`), the transmit
queue is not restarted, causing packet transmission to hang.
Adding an explicit call to netif_start_queue() in lan865x_net_open()
ensures the transmit queue is properly started every time the device
is reopened.
Fixes: 5cd2340cb6 ("microchip: lan865x: add driver support for Microchip's LAN865X MAC-PHY")
Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20250818060514.52795-2-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mark Bloch says:
====================
mlx5 HWS fixes 2025-08-17
The following patch set focuses on hardware steering fixes
found by the team.
====================
Link: https://patch.msgid.link/20250817202323.308604-1-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Specifying the counter action is not enough, as it is used by multiple
counters that were allocated in a bulk. By omitting the offset, rules
will be associated with a different counter from the same bulk.
Subsequently, the CT subsystem checks the correct counter, assumes that
no traffic has triggered the rule, and ages out the rule. The end result
is intermittent offloading of long lived connections, as rules are aged
out then promptly re-added.
Fix this by specifying the correct offset along with the counter rule.
Fixes: 34eea5b12a ("net/mlx5e: CT: Add initial support for Hardware Steering")
Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250817202323.308604-8-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
During table creation, caller passes a UID using ft_attr. The UID
value was ignored, which leads to problems when the caller sets the
UID to a non-zero value, such as SHARED_RESOURCE_UID (0xffff) - the
internal FT objects will be created with UID=0.
Fixes: 0869701cba ("net/mlx5: HWS, added FW commands handling")
Signed-off-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250817202323.308604-7-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If rule creation failed due to a full queue, due to timeout
in polling for completion, or due to matcher being in resize,
don't try to initiate rehash sequence - rehash would have
failed anyway.
Fixes: 2111bb970c ("net/mlx5: HWS, added backward-compatible API handling")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250817202323.308604-6-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While moving the rules during rehash, CQ is not drained. The flush
and drain happens only when all the rules of a certain queue have been
moved. This behaviour can lead to accumulating large quantity of rules
that haven't got their completion yet, and eventually will fill up
the queue and will cause the rehash to fail.
Fix this problem by requiring drain once the number of outstanding
completions reaches a certain threshold.
Fixes: ef94799a87 ("net/mlx5: HWS, rework rehash loop")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250817202323.308604-5-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Moving rules from matcher to matcher should not fail.
However, if it does fail due to various reasons, the error flow
should allow the kernel to continue functioning (albeit with broken
steering rules) instead of going into series of soft lock-ups or
some other problematic behaviour.
Similar to the simple rules, complex rules rehash logic suffers
from the same problems. This patch fixes the error flow for moving
complex rules:
- If new rule creation fails before it was even enqeued, do not
poll for completion
- If TIMEOUT happened while moving the rule, no point trying
to poll for completions for other rules. Something is broken,
completion won't come, just abort the rehash sequence.
- If some other completion with error received, don't give up.
Continue handling rest of the rules to minimize the damage.
- Make sure that the first error code that was received will
be actually returned to the caller instead of replacing it
with the generic error code.
All the aforementioned issues stem from the same bad error flow,
so no point fixing them one by one and leaving partially broken
code - fixing them in one patch.
Fixes: 17e0accac5 ("net/mlx5: HWS, support complex matchers")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250817202323.308604-4-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Moving rules from matcher to matcher should not fail.
However, if it does fail due to various reasons, the error flow
should allow the kernel to continue functioning (albeit with broken
steering rules) instead of going into series of soft lock-ups or
some other problematic behaviour.
This patch fixes the error flow for moving simple rules:
- If new rule creation fails before it was even enqeued, do not
poll for completion
- If TIMEOUT happened while moving the rule, no point trying
to poll for completions for other rules. Something is broken,
completion won't come, just abort the rehash sequence.
- If some other completion with error received, don't give up.
Continue handling rest of the rules to minimize the damage.
- Make sure that the first error code that was received will
be actually returned to the caller instead of replacing it
with the generic error code.
All the aforementioned issues stem from the same bad error flow,
so no point fixing them one by one and leaving partially broken
code - fixing them in one patch.
Fixes: ef94799a87 ("net/mlx5: HWS, rework rehash loop")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250817202323.308604-3-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The clk_tx_i clock must be supplied to the MAC for successful
initialization. On TH1520 SoC, the clock is provided by an internal
divider configured through GMAC_PLLCLK_DIV register when using RGMII
interface. However, currently we don't setup the divider before
initialization of the MAC, resulting in DMA reset failures if the
bootloader/firmware doesn't enable the divider,
[ 7.839601] thead-dwmac ffe7060000.ethernet eth0: Register MEM_TYPE_PAGE_POOL RxQ-0
[ 7.938338] thead-dwmac ffe7060000.ethernet eth0: PHY [stmmac-0:02] driver [RTL8211F Gigabit Ethernet] (irq=POLL)
[ 8.160746] thead-dwmac ffe7060000.ethernet eth0: Failed to reset the dma
[ 8.170118] thead-dwmac ffe7060000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
[ 8.179384] thead-dwmac ffe7060000.ethernet eth0: __stmmac_open: Hw setup failed
Let's simply write GMAC_PLLCLK_DIV_EN to GMAC_PLLCLK_DIV to enable the
divider before MAC initialization. Note that for reconfiguring the
divisor, the divider must be disabled first and re-enabled later to make
sure the new divisor take effect.
The exact clock rate doesn't affect MAC's initialization according to my
test. It's set to the speed required by RGMII when the linkspeed is
1Gbps and could be reclocked later after link is up if necessary.
Fixes: 33a1a01e3a ("net: stmmac: Add glue layer for T-HEAD TH1520 SoC")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Reviewed-by: Drew Fustini <fustini@kernel.org>
Link: https://patch.msgid.link/20250815104803.55294-1-ziyao@disroot.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A crash can occur if an ethtool operation is invoked
after shutdown() is called.
shutdown() is invoked during system shutdown to stop DMA operations
without performing expensive deallocations. It is discouraged to
unregister the netdev in this path, so the device may still be visible
to userspace and kernel helpers.
In gve, shutdown() tears down most internal data structures. If an
ethtool operation is dispatched after shutdown(), it will dereference
freed or NULL pointers, leading to a kernel panic. While graceful
shutdown normally quiesces userspace before invoking the reboot
syscall, forced shutdowns (as observed on GCP VMs) can still trigger
this path.
Fix by calling netif_device_detach() in shutdown().
This marks the device as detached so the ethtool ioctl handler
will skip dispatching operations to the driver.
Fixes: 974365e518 ("gve: Implement suspend/resume/shutdown")
Signed-off-by: Jordan Rhee <jordanrhee@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Link: https://patch.msgid.link/20250818211245.1156919-1-jeroendb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There was a problem when we received frames and the frames were
timestamped. The driver is configured to store the nanosecond part of
the timestmap in the ptp reserved bits and it would take the second part
by reading the LTC. The problem is that when reading the LTC we are in
atomic context and to read the second part will go over mdio bus which
might sleep, so we get an error.
The fix consists in actually put all the frames in a queue and start the
aux work and in that work to read the LTC and then calculate the full
received time.
Fixes: 7d272e63e0 ("net: phy: mscc: timestamping and PHC support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250818081029.1300780-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a user creates a dualpi2 qdisc it automatically sets a timer. This
timer will run constantly and update the qdisc's probability field.
The issue is that the timer acquires the qdisc root lock and runs in
hardirq. The qdisc root lock is also acquired in dev.c whenever a packet
arrives for this qdisc. Since the dualpi2 timer callback runs in hardirq,
it may interrupt the packet processing running in softirq. If that happens
and it runs on the same CPU, it will acquire the same lock and cause a
deadlock. The following splat shows up when running a kernel compiled with
lock debugging:
[ +0.000224] WARNING: inconsistent lock state
[ +0.000224] 6.16.0+ #10 Not tainted
[ +0.000169] --------------------------------
[ +0.000029] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ +0.000000] ping/156 [HC0[0]:SC0[2]:HE1:SE0] takes:
[ +0.000000] ffff897841242110 (&sch->root_lock_key){?.-.}-{3:3}, at: __dev_queue_xmit+0x86d/0x1140
[ +0.000000] {IN-HARDIRQ-W} state was registered at:
[ +0.000000] lock_acquire.part.0+0xb6/0x220
[ +0.000000] _raw_spin_lock+0x31/0x80
[ +0.000000] dualpi2_timer+0x6f/0x270
[ +0.000000] __hrtimer_run_queues+0x1c5/0x360
[ +0.000000] hrtimer_interrupt+0x115/0x260
[ +0.000000] __sysvec_apic_timer_interrupt+0x6d/0x1a0
[ +0.000000] sysvec_apic_timer_interrupt+0x6e/0x80
[ +0.000000] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ +0.000000] pv_native_safe_halt+0xf/0x20
[ +0.000000] default_idle+0x9/0x10
[ +0.000000] default_idle_call+0x7e/0x1e0
[ +0.000000] do_idle+0x1e8/0x250
[ +0.000000] cpu_startup_entry+0x29/0x30
[ +0.000000] rest_init+0x151/0x160
[ +0.000000] start_kernel+0x6f3/0x700
[ +0.000000] x86_64_start_reservations+0x24/0x30
[ +0.000000] x86_64_start_kernel+0xc8/0xd0
[ +0.000000] common_startup_64+0x13e/0x148
[ +0.000000] irq event stamp: 6884
[ +0.000000] hardirqs last enabled at (6883): [<ffffffffa75700b3>] neigh_resolve_output+0x223/0x270
[ +0.000000] hardirqs last disabled at (6882): [<ffffffffa7570078>] neigh_resolve_output+0x1e8/0x270
[ +0.000000] softirqs last enabled at (6880): [<ffffffffa757006b>] neigh_resolve_output+0x1db/0x270
[ +0.000000] softirqs last disabled at (6884): [<ffffffffa755b533>] __dev_queue_xmit+0x73/0x1140
[ +0.000000]
other info that might help us debug this:
[ +0.000000] Possible unsafe locking scenario:
[ +0.000000] CPU0
[ +0.000000] ----
[ +0.000000] lock(&sch->root_lock_key);
[ +0.000000] <Interrupt>
[ +0.000000] lock(&sch->root_lock_key);
[ +0.000000]
*** DEADLOCK ***
[ +0.000000] 4 locks held by ping/156:
[ +0.000000] #0: ffff897842332e08 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x41e/0xf40
[ +0.000000] #1: ffffffffa816f880 (rcu_read_lock){....}-{1:3}, at: ip_output+0x2c/0x190
[ +0.000000] #2: ffffffffa816f880 (rcu_read_lock){....}-{1:3}, at: ip_finish_output2+0xad/0x950
[ +0.000000] #3: ffffffffa816f840 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x73/0x1140
I am able to reproduce it consistently when running the following:
tc qdisc add dev lo handle 1: root dualpi2
ping -f 127.0.0.1
To fix it, make the timer run in softirq.
Fixes: 320d031ad6 ("sched: Struct definition and parsing of dualpi2 qdisc")
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20250815135317.664993-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This lets NetworkManager/ModemManager know that this is a modem and
needs to be connected first.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Link: https://patch.msgid.link/20250814154214.250103-1-lkundrak@v3.sk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To enable HSR / Switch offload, certain configurations are needed.
Currently they are done inside icssg_change_mode(). This function only
gets called if we move from one mode to another without bringing the
links up / down.
Once in HSR / Switch mode, if we bring the links down and bring it back
up again. The callback sequence is,
- emac_ndo_stop()
Firmwares are stopped
- emac_ndo_open()
Firmwares are loaded
In this path icssg_change_mode() doesn't get called and as a result the
configurations needed for HSR / Switch is not done.
To fix this, put all these configurations in a separate function
icssg_enable_fw_offload() and call this from both icssg_change_mode()
and emac_ndo_open()
Fixes: 56375086d0 ("net: ti: icssg-prueth: Enable HSR Tx duplication, Tx Tag and Rx Tag offload")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Link: https://patch.msgid.link/20250814105106.1491871-1-danishanwar@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
ppp_fill_forward_path() has two race conditions:
1. The ppp->channels list can change between list_empty() and
list_first_entry(), as ppp_lock() is not held. If the only channel
is deleted in ppp_disconnect_channel(), list_first_entry() may
access an empty head or a freed entry, and trigger a panic.
2. pch->chan can be NULL. When ppp_unregister_channel() is called,
pch->chan is set to NULL before pch is removed from ppp->channels.
Fix these by using a lockless RCU approach:
- Use list_first_or_null_rcu() to safely test and access the first list
entry.
- Convert list modifications on ppp->channels to their RCU variants and
add synchronize_net() after removal.
- Check for a NULL pch->chan before dereferencing it.
Fixes: f6efc675c9 ("net: ppp: resolve forwarding path for bridge pppoe devices")
Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20250814012559.3705-2-dqfext@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Ensure ndo_fill_forward_path() is called with RCU lock held.
Fixes: 2830e31477 ("net: ethernet: mtk-ppe: fix traffic offload with bridged wlan")
Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20250814012559.3705-1-dqfext@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts says:
====================
mptcp: misc fixes for v6.17-rc
Here are various fixes:
- Patch 1: Better handling SKB extension allocation failures.
A fix for v5.7.
- Patches 2, 3: Avoid resetting MPTCP limits when flushing MPTCP
endpoints. With a validation in the selftests. Fixes for v5.7.
- Patches 4, 5, 6: Disallow '0' as ADD_ADDR retransmission timeout.
With a preparation patch, and a validation in the selftests.
Fixes for v5.11.
- Patches 8, 9: Fix C23 extension warnings in the selftests,
spotted by GCC. Fixes for v6.16.
====================
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-0-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
GCC was complaining about the new label:
mptcp_inq.c:79:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
79 | int err = getaddrinfo(node, service, hints, res);
| ^
mptcp_sockopt.c:166:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
166 | int err = getaddrinfo(node, service, hints, res);
| ^
Simply declare 'err' before the label to avoid this warning.
Fixes: dd367e81b7 ("selftests: mptcp: sockopt: use IPPROTO_MPTCP for getaddrinfo")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-8-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
GCC was complaining about the new label:
mptcp_connect.c:187:2: warning: label followed by a declaration is a C23 extension [-Wc23-extensions]
187 | int err = getaddrinfo(node, service, hints, res);
| ^
Simply declare 'err' before the label to avoid this warning.
Fixes: a862771d1a ("selftests: mptcp: use IPPROTO_MPTCP for getaddrinfo")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-7-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To prevent test instability in the "delete re-add signal" test caused by
ADD_ADDR retransmissions, disable retransmissions for this test by setting
net.mptcp.add_addr_timeout to 0.
Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-6-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When add_addr_timeout was set to 0, this caused the ADD_ADDR to be
retransmitted immediately, which looks like a buggy behaviour. Instead,
interpret 0 as "no retransmissions needed".
The documentation is updated to explicitly state that setting the timeout
to 0 disables retransmission.
Fixes: 93f323b9cc ("mptcp: add a new sysctl add_addr_timeout")
Cc: stable@vger.kernel.org
Suggested-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-5-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
sk_reset_timer() was called twice in mptcp_pm_alloc_anno_list.
Simplify the code by using a 'goto' statement to eliminate the
duplication.
Note that this is not a fix, but it will help backporting the following
patch. The same "Fixes" tag has been added for this reason.
Fixes: 93f323b9cc ("mptcp: add a new sysctl add_addr_timeout")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-4-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This modification is linked to the parent commit where the received
ADD_ADDR limit was accidentally reset when the endpoints were flushed.
To validate that, the test is now flushing endpoints after having set
new limits, and before checking them.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes: 01cacb00b3 ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-3-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A flush of the MPTCP endpoints should not affect the MPTCP limits. In
other words, 'ip mptcp endpoint flush' should not change 'ip mptcp
limits'.
But it was the case: the MPTCP_PM_ATTR_RCV_ADD_ADDRS (add_addr_accepted)
limit was reset by accident. Removing the reset of this counter during a
flush fixes this issue.
Fixes: 01cacb00b3 ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Reported-by: Thomas Dreibholz <dreibh@simula.no>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/579
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-2-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When skb_ext_add(skb, SKB_EXT_MPTCP) fails in mptcp_incoming_options(),
we used to return true, letting the segment proceed through the TCP
receive path without a DSS mapping. Such segments can leave inconsistent
mapping state and trigger a mid-stream fallback to TCP, which in testing
collapsed (by artificially forcing failures in skb_ext_add) throughput
to zero.
Return false instead so the TCP input path drops the skb (see
tcp_data_queue() and step-7 processing). This is the safer choice
under memory pressure: it preserves MPTCP correctness and provides
backpressure to the sender.
Control packets remain unaffected: ACK updates and DATA_FIN handling
happen before attempting the extension allocation, and tcp_reset()
continues to ignore the return value.
With this change, MPTCP continues to work at high throughput if we
artificially inject failures into skb_ext_add.
Fixes: 6787b7e350 ("mptcp: avoid processing packet if a subflow reset")
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250815-net-mptcp-misc-fixes-6-17-rc2-v1-1-521fe9957892@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The seg6_genl_sethmac() directly uses the algorithm ID provided by the
userspace without verifying whether it is an HMAC algorithm supported
by the system.
If an unsupported HMAC algorithm ID is configured, packets using SRv6 HMAC
will be dropped during encapsulation or decapsulation.
Fixes: 4f4853dc1c ("ipv6: sr: implement API to control SR HMAC structure")
Signed-off-by: Minhong He <heminhong@kylinos.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250815063845.85426-1-heminhong@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When performing Generic Segmentation Offload (GSO) on an IPv6 packet that
contains extension headers, the kernel incorrectly requests checksum offload
if the egress device only advertises NETIF_F_IPV6_CSUM feature, which has
a strict contract: it supports checksum offload only for plain TCP or UDP
over IPv6 and explicitly does not support packets with extension headers.
The current GSO logic violates this contract by failing to disable the feature
for packets with extension headers, such as those used in GREoIPv6 tunnels.
This violation results in the device being asked to perform an operation
it cannot support, leading to a `skb_warn_bad_offload` warning and a collapse
of network throughput. While device TSO/USO is correctly bypassed in favor
of software GSO for these packets, the GSO stack must be explicitly told not
to request checksum offload.
Mask NETIF_F_IPV6_CSUM, NETIF_F_TSO6 and NETIF_F_GSO_UDP_L4
in gso_features_check if the IPv6 header contains extension headers to compute
checksum in software.
The exception is a BIG TCP extension, which, as stated in commit
68e068cabd ("net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets"):
"The feature is only enabled on devices that support BIG TCP TSO.
The header is only present for PF_PACKET taps like tcpdump,
and not transmitted by physical devices."
kernel log output (truncated):
WARNING: CPU: 1 PID: 5273 at net/core/dev.c:3535 skb_warn_bad_offload+0x81/0x140
...
Call Trace:
<TASK>
skb_checksum_help+0x12a/0x1f0
validate_xmit_skb+0x1a3/0x2d0
validate_xmit_skb_list+0x4f/0x80
sch_direct_xmit+0x1a2/0x380
__dev_xmit_skb+0x242/0x670
__dev_queue_xmit+0x3fc/0x7f0
ip6_finish_output2+0x25e/0x5d0
ip6_finish_output+0x1fc/0x3f0
ip6_tnl_xmit+0x608/0xc00 [ip6_tunnel]
ip6gre_tunnel_xmit+0x1c0/0x390 [ip6_gre]
dev_hard_start_xmit+0x63/0x1c0
__dev_queue_xmit+0x6d0/0x7f0
ip6_finish_output2+0x214/0x5d0
ip6_finish_output+0x1fc/0x3f0
ip6_xmit+0x2ca/0x6f0
ip6_finish_output+0x1fc/0x3f0
ip6_xmit+0x2ca/0x6f0
inet6_csk_xmit+0xeb/0x150
__tcp_transmit_skb+0x555/0xa80
tcp_write_xmit+0x32a/0xe90
tcp_sendmsg_locked+0x437/0x1110
tcp_sendmsg+0x2f/0x50
...
skb linear: 00000000: e4 3d 1a 7d ec 30 e4 3d 1a 7e 5d 90 86 dd 60 0e
skb linear: 00000010: 00 0a 1b 34 3c 40 20 11 00 00 00 00 00 00 00 00
skb linear: 00000020: 00 00 00 00 00 12 20 11 00 00 00 00 00 00 00 00
skb linear: 00000030: 00 00 00 00 00 11 2f 00 04 01 04 01 01 00 00 00
skb linear: 00000040: 86 dd 60 0e 00 0a 1b 00 06 40 20 23 00 00 00 00
skb linear: 00000050: 00 00 00 00 00 00 00 00 00 12 20 23 00 00 00 00
skb linear: 00000060: 00 00 00 00 00 00 00 00 00 11 bf 96 14 51 13 f9
skb linear: 00000070: ae 27 a0 a8 2b e3 80 18 00 40 5b 6f 00 00 01 01
skb linear: 00000080: 08 0a 42 d4 50 d5 4b 70 f8 1a
Fixes: 04c20a9356 ("net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Suggested-by: Michal Schmidt <mschmidt@redhat.com>
Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Jakub Ramaseuski <jramaseu@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250814105119.1525687-1-jramaseu@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The CI has hit a couple of cases of:
RUN global.data_steal ...
tls.c:2762:data_steal:Expected recv(cfd, buf2, sizeof(buf2), MSG_DONTWAIT) (20000) == -1 (-1)
data_steal: Test terminated by timeout
FAIL global.data_steal
Looks like the 2msec sleep is not long enough. Make the sleep longer,
and then instead of second sleep wait for the thieving process to exit.
That way we can be sure it called recv() before us.
While at it also avoid trying to steal more than a record, this seems
to be causing issues in manual testing as well.
Fixes: d7e82594a4 ("selftests: tls: test TCP stealing data from under the TLS socket")
Link: https://patch.msgid.link/20250814194323.2014650-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While trying to fill a random RSS key, the size of the pointer
is being used rather than the actual size of the RSS key.
Fix by passing an appropriate value of the RSS key.
This issue was reported by static coverity analyser.
Fixes: eb4898fde1 ("net: libwx: add wangxun vf common api")
Signed-off-by: Chandra Mohan Sundar <chandramohan.explore@gmail.com>
Link: https://patch.msgid.link/20250814163014.613004-1-chandramohan.explore@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- hci_conn: Fix running bis_cleanup for hci_conn->type PA_LINK
- hci_conn: Fix not cleaning up Broadcaster/Broadcast Source
- hci_core: Fix using {cis,bis}_capable for current settings
- hci_core: Fix using ll_privacy_capable for current settings
- hci_core: Fix not accounting for BIS/CIS/PA links separately
- hci_conn: do return error from hci_enhanced_setup_sync()
- hci_event: fix MTU for BN == 0 in CIS Established
- hci_sync: Fix scan state after PA Sync has been established
- hci_sync: Avoid adding default advertising on startup
- hci_sync: Prevent unintended PA sync when SID is 0xFF
- ISO: Fix getname not returning broadcast fields
- btmtk: Fix wait_on_bit_timeout interruption during shutdown
- btnxpuart: Uses threaded IRQ for host wakeup handling
-----BEGIN PGP SIGNATURE-----
iQJNBAABCgA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmifQh4ZHGx1aXoudm9u
LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKSpyD/4+a98UD/FMiu2MuowbCJvK
EWneAUc9l0j/bhVSaTnRzpXM6gov/SNdSFr9dLrTeYY75X+ky1YDFca/deP2fZzl
v+2H306yIw73k4DAEqkzOQk+YRpi/N7+rruytlZd7TC29zwer1IEg2Rv6hMqmZQt
bkFiEcR2HpIFaXGDAmsKmXEQ3xsNnbHpeuqQIYPC0uzTYDdW8O+vV/etgHeycfzc
dGDJOX+3oEAYCi3NtJY/DZLlIhKneLHRn7qGpDAXUWTRvzHESdzmOzSQ9wD0fVqD
pMMJv0dmaubH32JHafnIGixGb6LwhHWbse2yaei2uWfwukQJAzflfMwkManVsEBa
SM5mNXS586xeLfahLy1FmlbOaL5e7enL8Bhd4jZJGtIb1WRMZBsXCNKX0cZUKn8B
mNrM1zqQUvBrmS9G5KC2lAfgco0mqxoSKipgctJLamzNR2z6Yz4No0MLTzh89MnV
0waBToTn8sXidN/ENzOrxq6GGFZmaV0x3UXickwcbxLKC30+r3nXR4j0xK56N2ly
2sM+wU9CkXitQIW8PUWcYAWZ1k3/7ixv5/pIGVHnozQEcHiBX0OOmReauF3LWZjC
pWZU0/T7vduS4/3ThSBQNOjjWp3uhyKYCbbumMQsTVSsJ5MOgvf7IqLcVWAVi3Eg
1lbQkLQNC3uaosoV/JJfRg==
=ZtN+
-----END PGP SIGNATURE-----
Merge tag 'for-net-2025-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- hci_conn: Fix running bis_cleanup for hci_conn->type PA_LINK
- hci_conn: Fix not cleaning up Broadcaster/Broadcast Source
- hci_core: Fix using {cis,bis}_capable for current settings
- hci_core: Fix using ll_privacy_capable for current settings
- hci_core: Fix not accounting for BIS/CIS/PA links separately
- hci_conn: do return error from hci_enhanced_setup_sync()
- hci_event: fix MTU for BN == 0 in CIS Established
- hci_sync: Fix scan state after PA Sync has been established
- hci_sync: Avoid adding default advertising on startup
- hci_sync: Prevent unintended PA sync when SID is 0xFF
- ISO: Fix getname not returning broadcast fields
- btmtk: Fix wait_on_bit_timeout interruption during shutdown
- btnxpuart: Uses threaded IRQ for host wakeup handling
* tag 'for-net-2025-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: hci_core: Fix not accounting for BIS/CIS/PA links separately
Bluetooth: btnxpuart: Uses threaded IRQ for host wakeup handling
Bluetooth: hci_conn: do return error from hci_enhanced_setup_sync()
Bluetooth: hci_event: fix MTU for BN == 0 in CIS Established
Bluetooth: hci_sync: Prevent unintended PA sync when SID is 0xFF
Bluetooth: hci_core: Fix using ll_privacy_capable for current settings
Bluetooth: hci_core: Fix using {cis,bis}_capable for current settings
Bluetooth: btmtk: Fix wait_on_bit_timeout interruption during shutdown
Bluetooth: hci_conn: Fix not cleaning up Broadcaster/Broadcast Source
Bluetooth: hci_conn: Fix running bis_cleanup for hci_conn->type PA_LINK
Bluetooth: ISO: Fix getname not returning broadcast fields
Bluetooth: hci_sync: Fix scan state after PA Sync has been established
Bluetooth: hci_sync: Avoid adding default advertising on startup
====================
Link: https://patch.msgid.link/20250815142229.253052-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Petr Machata says:
====================
mlxsw: spectrum: Forward packets with an IPv4 link-local source IP
By default, Spectrum devices do not forward IPv4 packets with a link-local
source IP (i.e., 169.254.0.0/16). This behavior does not align with the
kernel which does forward them. Fix the issue and add a selftest.
====================
Link: https://patch.msgid.link/cover.1755174341.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
By default, the device does not forward IPv4 packets with a link-local
source IP (i.e., 169.254.0.0/16). This behavior does not align with the
kernel which does forward them.
Fix by instructing the device to forward such packets instead of
dropping them.
Fixes: ca360db4b8 ("mlxsw: spectrum: Disable DIP_LINK_LOCAL check in hardware pipeline")
Reported-by: Zoey Mertes <zoey@cloudflare.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/6721e6b2c96feb80269e72ce8d0b426e2f32d99c.1755174341.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This fixes the likes of hci_conn_num(CIS_LINK) returning the total of
ISO connection which includes BIS_LINK as well, so this splits the
iso_num into each link type and introduces hci_iso_num that can be used
in places where the total number of ISO connection still needs to be
used.
Fixes: 23205562ff ("Bluetooth: separate CIS_LINK and BIS_LINK link types")
Fixes: a7bcffc673 ("Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This replaces devm_request_irq() with devm_request_threaded_irq().
On iMX93 11x11 EVK platform, the BT chip's BT_WAKE_OUT pin is connected
to an I2C GPIO expander instead of directly been connected to iMX GPIO.
When I2C GPIO expander's (PCAL6524) host driver receives an interrupt on
it's INTR line, the driver's interrupt handler needs to query the
interrupt source with PCAL6524 first, and then call the actual interrupt
handler, in this case the IRQ handler in BTNXPUART.
In order to handle interrupts when such I2C GPIO expanders are between
the host and interrupt source, devm_request_threaded_irq() is needed.
This commit also removes the IRQF_TRIGGER_FALLING flag, to allow setting
the IRQ trigger type from the device tree setting instead of hardcoding
in the driver.
Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
Reviewed-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The commit e07a06b4eb ("Bluetooth: Convert SCO configure_datapath to
hci_sync") missed to update the *return* statement under the *case* of
BT_CODEC_TRANSPARENT in hci_enhanced_setup_sync(), which led to returning
success (0) instead of the negative error code (-EINVAL). However, the
result of hci_enhanced_setup_sync() seems to be ignored anyway, since NULL
gets passed to hci_cmd_sync_queue() as the last argument in that case and
the only function interested in that result is specified by that argument.
Fixes: e07a06b4eb ("Bluetooth: Convert SCO configure_datapath to hci_sync")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
BN == 0x00 in CIS Established means no isochronous data for the
corresponding direction (Core v6.1 pp. 2394). In this case SDU MTU
should be 0.
However, the specification does not say the Max_PDU_C_To_P or P_To_C are
then zero. Intel AX210 in Framed CIS mode sets nonzero Max_PDU for
direction with zero BN. This causes failure later when we try to LE
Setup ISO Data Path for disabled direction, which is disallowed (Core
v6.1 pp. 2750).
Fix by setting SDU MTU to 0 if BN == 0.
Fixes: 2be22f1941 ("Bluetooth: hci_event: Fix parsing of CIS Established Event")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
ll_privacy_capable only indicates that the controller supports the
feature but it doesnt' check that LE is enabled so it end up being
marked as active in the current settings when it shouldn't.
Fixes: ad383c2c65 ("Bluetooth: hci_sync: Enable advertising when LL privacy is enabled")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
{cis,bis}_capable only indicates the controller supports the feature
since it doesn't check that LE is enabled so it shall not be used for
current setting, instead this introduces {cis,bis}_enabled macros that
can be used to indicate that these features are currently enabled.
Fixes: 26afbd826e ("Bluetooth: Add initial implementation of CIS connections")
Fixes: eca0ae4aea ("Bluetooth: Add initial implementation of BIS connections")
Fixes: ae75336131 ("Bluetooth: Check for ISO support in controller")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
During the shutdown process, an interrupt occurs that
prematurely terminates the wait for the expected event.
This change replaces TASK_INTERRUPTIBLE with
TASK_UNINTERRUPTIBLE in the wait_on_bit_timeout call to ensure
the shutdown process completes as intended without being
interrupted by signals.
Fixes: d019930b00 ("Bluetooth: btmtk: move btusb_mtk_hci_wmt_sync to btmtk.c")
Signed-off-by: Jiande Lu <jiande.lu@mediatek.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This fixes Broadcaster/Broadcast Source not sending HCI_OP_LE_TERM_BIG
because HCI_CONN_PER_ADV where not being set.
Fixes: a7bcffc673 ("Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Connections with type of PA_LINK shall be considered temporary just to
track the lifetime of PA Sync setup, once the BIG Sync is established
and connection are created with BIS_LINK the existing PA_LINK
connection shall not longer use bis_cleanup otherwise it terminates the
PA Sync when that shall be left to BIS_LINK connection to do it.
Fixes: a7bcffc673 ("Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
getname shall return iso_bc fields for both BIS_LINK and PA_LINK since
the likes of bluetoothd do use the getpeername to retrieve the SID both
when enumerating the broadcasters and when synchronizing.
Fixes: a7bcffc673 ("Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>