A few variables linked to the Path-Managers are confusing, and clarifying
them would help current and future developers.
One of them is 'subflows', which in fact represents the number of extra
subflows: all the additional subflows created after the initial one, and
not the total number of subflows.
While at it, add an additional name for the corresponding variable in
MPTCP INFO: mptcpi_extra_subflows. To avoid breaking the current uAPI, the
new name is added as a 'define' pointing to the former name. This will
also help userspace developers.
No functional changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-5-ad126cc47c6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The previous commit adds an exception for the C-flag case. The
'mptcp_join.sh' selftest is extended to validate this case.
In this subtest, there is a typical CDN deployment with a client where
MPTCP endpoints have been 'automatically' configured:
- the server sets net.mptcp.allow_join_initial_addr_port=0
- the client has multiple 'subflow' endpoints, and the default limits:
not accepting ADD_ADDRs.
Without the parent patch, the client is not able to establish new
subflows using its 'subflow' endpoints. The parent commit fixes that.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes: df377be387 ("mptcp: add deny_join_id0 in mptcp_options_received")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-2-ad126cc47c6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Avoid a script error when the `res` shell variable is empty in a
comparison.
The comparison is changed to a string comparison to handle the other
possible values the variable could assume.
The issue can be reproduced with the command:
make kselftest TARGETS=net
It solves the error:
./tfo_passive.sh: line 98: [: -eq: unary operator expected
Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250925132832.9828-1-alessandro.zanni87@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix a verification failure. filter_udphdr() calls bpf_xdp_pull_data(),
which will invalidate all pkt pointers. Therefore, all ctx->data loaded
before filter_udphdr() cannot be used. Reload it to prevent verification
errors.
The error may not appear on some compiler versions if they decide to
load ctx->data after filter_udphdr() when it is first used.
Fixes: efec2e55bd ("selftests: drv-net: Pull data before parsing headers")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250925161452.1290694-1-ameryhung@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Paolo Abeni:
"Including fixes from Bluetooth, IPsec and CAN.
No known regressions at this point.
Current release - regressions:
- xfrm: xfrm_alloc_spi shouldn't use 0 as SPI
Previous releases - regressions:
- xfrm: fix offloading of cross-family tunnels
- bluetooth: fix several races leading to UaFs
- dsa: lantiq_gswip: fix FDB entries creation for the CPU port
- eth:
- tun: update napi->skb after XDP process
- mlx: fix UAF in flow counter release
Previous releases - always broken:
- core: forbid FDB status change while nexthop is in a group
- smc: fix warning in smc_rx_splice() when calling get_page()
- can: provide missing ndo_change_mtu(), to prevent buffer overflow.
- eth:
- i40e: fix VF config validation
- broadcom: fix support for PTP_EXTTS_REQUEST2 ioctl"
* tag 'net-6.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits)
octeontx2-pf: Fix potential use after free in otx2_tc_add_flow()
net: dsa: lantiq_gswip: suppress -EINVAL errors for bridge FDB entries added to the CPU port
net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup()
libie: fix string names for AQ error codes
net/mlx5e: Fix missing FEC RS stats for RS_544_514_INTERLEAVED_QUAD
net/mlx5: HWS, ignore flow level for multi-dest table
net/mlx5: fs, fix UAF in flow counter release
selftests: fib_nexthops: Add test cases for FDB status change
selftests: fib_nexthops: Fix creation of non-FDB nexthops
nexthop: Forbid FDB status change while nexthop is in a group
net: allow alloc_skb_with_frags() to use MAX_SKB_FRAGS
bnxt_en: correct offset handling for IPv6 destination address
ptp: document behavior of PTP_STRICT_FLAGS
broadcom: fix support for PTP_EXTTS_REQUEST2 ioctl
broadcom: fix support for PTP_PEROUT_DUTY_CYCLE
Bluetooth: MGMT: Fix possible UAFs
Bluetooth: hci_event: Fix UAF in hci_acl_create_conn_sync
Bluetooth: hci_event: Fix UAF in hci_conn_tx_dequeue
Bluetooth: hci_sync: Fix hci_resume_advertising_sync
Bluetooth: Fix build after header cleanup
...
Currently, packets with fixed IDs will be merged only if their
don't-fragment bit is set. This restriction is unnecessary since
packets without the don't-fragment bit will be forwarded as-is even
if they were merged together. The merged packets will be segmented
into their original forms before being forwarded, either by GSO or
by TSO. The IDs will also remain identical unless NETIF_F_TSO_MANGLEID
is set, in which case the IDs can become incrementing, which is also fine.
Clean up the code by removing the unnecessary don't-fragment checks.
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250923085908.4687-5-richardbgobert@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add simple tests to validate that the driver sets up timestamping
configuration according to what is reported in capabilities.
For RX timestamping, we allow the driver to fall back to a wider
timestamping scope if a filter is applied. In practice, this means that
the driver can enable ptpv2-event when it reports that ptpv2-l4-event is
supported, but not vice versa.
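The acceptance rule above can be sketched as follows. This is a minimal
illustration, not the selftest itself: the helper name and the lookup table
are hypothetical, and only the filter pair named in this message is listed.

```python
# Hypothetical table: for a requested RX filter, which wider filters the
# driver may fall back to. Only the pair named above is listed here.
WIDER_RX_FILTERS = {
    "ptpv2-l4-event": {"ptpv2-event"},
}


def rx_filter_acceptable(requested: str, configured: str) -> bool:
    """Return True if `configured` equals `requested` or is a wider fallback."""
    if configured == requested:
        return True
    return configured in WIDER_RX_FILTERS.get(requested, set())
```

A check built this way accepts the fallback in one direction only: requesting
ptpv2-l4-event and getting ptpv2-event passes, the reverse does not.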
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250923173310.139623-5-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Florian Westphal says:
====================
netfilter: fixes for net-next
These fixes target next because the bug is either not severe or has
existed for so long that there is no reason to cram them in at the last
minute.
1) Fix IPVS ftp unregistering during netns cleanup, broken since netns
support was introduced in 2011 in the 2.6.39 kernel.
From Slavin Liu.
2) nfnetlink must reset the 'nlh' pointer back to the original
address when a batch is replayed, else we emit bogus ACK messages
and conceal real errno from userspace.
From Fernando Fernandez Mancera. This was broken since 6.10.
3) Recent fix for nftables 'pipapo' set type was incomplete, it only
made things work for the AVX2 version of the algorithm.
4) Testing revealed another problem with the AVX2 version that results in
an out-of-bounds read access. This bug has existed since the feature was
added in the 5.7 kernel. This also comes with a selftest update.
Last fix resolves a long-standing bug (since 4.9) in conntrack /proc
interface:
Decrease skip count when we reap an expired entry during dump.
As-is we erroneously elide one conntrack entry from dump for every expired
entry seen. From Eric Dumazet.
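The effect of the skip count can be modelled with a small simulation. This is
a simplified sketch, not the kernel code: the function and its arguments are
hypothetical, and a single bucket is treated as a Python list that is
re-walked from the start for every entry, skipping `skip` elements.

```python
def dump_bucket(entries, expired, fix):
    """Dump one bucket entry by entry, re-walking it with a skip count.

    entries: list of unique names in the bucket (mutated when reaping)
    expired: set of names that are expired and get reaped when visited
    fix:     if True, decrease the skip count after reaping an entry
    """
    shown = []
    skip = 0  # number of entries to skip on the next walk of the bucket
    while skip < len(entries):
        entry = entries[skip]      # walk the bucket, skipping `skip` entries
        skip += 1                  # the reader advances past this position
        if entry in expired:
            entries.remove(entry)  # reap: later entries shift down by one
            if fix:
                skip -= 1          # compensate for the removed entry
            continue
        shown.append(entry)
    return shown


print(dump_bucket(["a", "x", "b", "c"], {"x"}, fix=False))
print(dump_bucket(["a", "x", "b", "c"], {"x"}, fix=True))
```

Without the adjustment, the entry that follows the reaped one is never shown;
with it, every live entry appears in the dump.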
* tag 'nf-next-25-09-24' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_conntrack: do not skip entries in /proc/net/nf_conntrack
selftests: netfilter: nft_concat_range.sh: add check for double-create bug
netfilter: nft_set_pipapo_avx2: fix skip of expired entries
netfilter: nft_set_pipapo: use 0 genmask for packetpath lookups
netfilter: nfnetlink: reset nlh pointer during batch replay
ipvs: Defer ip_vs_ftp unregister during netns cleanup
====================
Link: https://patch.msgid.link/20250924140654.10210-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Martin KaFai Lau says:
====================
pull-request: bpf-next 2025-09-23
We've added 9 non-merge commits during the last 33 day(s) which contain
a total of 10 files changed, 480 insertions(+), 53 deletions(-).
The main changes are:
1) A new bpf_xdp_pull_data kfunc that supports pulling data from
a frag into the linear area of a xdp_buff, from Amery Hung.
This includes changes in the xdp_native.bpf.c selftest, which
Nimrod's future work depends on.
It is a merge from a stable branch 'xdp_pull_data' which has
also been merged to bpf-next.
There is a conflict with recent changes in 'include/net/xdp.h'
in the net-next tree that will need to be resolved.
2) A compiler warning fix when CONFIG_NET=n in the recent dynptr
skb_meta support, from Jakub Sitnicki.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
selftests: drv-net: Pull data before parsing headers
selftests/bpf: Test bpf_xdp_pull_data
bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN
bpf: Make variables in bpf_prog_test_run_xdp less confusing
bpf: Clear packet pointers after changing packet data in kfuncs
bpf: Support pulling non-linear xdp data
bpf: Allow bpf_xdp_shrink_data to shrink a frag from head and tail
bpf: Clear pfmemalloc flag when freeing all fragments
bpf: Return an error pointer for skb metadata when CONFIG_NET=n
====================
Link: https://patch.msgid.link/20250924050303.2466356-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a test case for bug resolved with:
'netfilter: nft_set_pipapo_avx2: fix skip of expired entries'.
It passes on nf.git (it uses the generic/C version for insertion
duplicate check) but fails on unpatched nf-next if AVX2 is supported:
cannot create same element twice 0s [FAIL]
Could create element twice in same transaction
table inet filter { # handle 8
[..]
elements = { 1.2.3.4 . 1.2.4.1 counter packets 0 bytes 0,
1.2.4.1 . 1.2.3.4 counter packets 0 bytes 0,
1.2.3.4 . 1.2.4.1 counter packets 0 bytes 0,
1.2.4.1 . 1.2.3.4 counter packets 0 bytes 0 }
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Add the following test cases for both IPv4 and IPv6:
* Can change from FDB nexthop to non-FDB nexthop and vice versa.
* Can change FDB nexthop address while in a group.
* Cannot change from FDB nexthop to non-FDB nexthop and vice versa while
in a group.
Output without "nexthop: Forbid FDB status change while nexthop is in a
group":
# ./fib_nexthops.sh -t "ipv6_fdb_grp_fcnal ipv4_fdb_grp_fcnal"
IPv6 fdb groups functional
--------------------------
[...]
TEST: Replace FDB nexthop to non-FDB nexthop [ OK ]
TEST: Replace non-FDB nexthop to FDB nexthop [ OK ]
TEST: Replace FDB nexthop address while in a group [ OK ]
TEST: Replace FDB nexthop to non-FDB nexthop while in a group [FAIL]
TEST: Replace non-FDB nexthop to FDB nexthop while in a group [FAIL]
[...]
IPv4 fdb groups functional
--------------------------
[...]
TEST: Replace FDB nexthop to non-FDB nexthop [ OK ]
TEST: Replace non-FDB nexthop to FDB nexthop [ OK ]
TEST: Replace FDB nexthop address while in a group [ OK ]
TEST: Replace FDB nexthop to non-FDB nexthop while in a group [FAIL]
TEST: Replace non-FDB nexthop to FDB nexthop while in a group [FAIL]
[...]
Tests passed: 36
Tests failed: 4
Tests skipped: 0
Output with "nexthop: Forbid FDB status change while nexthop is in a
group":
# ./fib_nexthops.sh -t "ipv6_fdb_grp_fcnal ipv4_fdb_grp_fcnal"
IPv6 fdb groups functional
--------------------------
[...]
TEST: Replace FDB nexthop to non-FDB nexthop [ OK ]
TEST: Replace non-FDB nexthop to FDB nexthop [ OK ]
TEST: Replace FDB nexthop address while in a group [ OK ]
TEST: Replace FDB nexthop to non-FDB nexthop while in a group [ OK ]
TEST: Replace non-FDB nexthop to FDB nexthop while in a group [ OK ]
[...]
IPv4 fdb groups functional
--------------------------
[...]
TEST: Replace FDB nexthop to non-FDB nexthop [ OK ]
TEST: Replace non-FDB nexthop to FDB nexthop [ OK ]
TEST: Replace FDB nexthop address while in a group [ OK ]
TEST: Replace FDB nexthop to non-FDB nexthop while in a group [ OK ]
TEST: Replace non-FDB nexthop to FDB nexthop while in a group [ OK ]
[...]
Tests passed: 40
Tests failed: 0
Tests skipped: 0
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250921150824.149157-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The test creates non-FDB nexthops without a nexthop device which leads
to the expected failure, but for the wrong reason:
# ./fib_nexthops.sh -t "ipv6_fdb_grp_fcnal ipv4_fdb_grp_fcnal" -v
IPv6 fdb groups functional
--------------------------
[...]
COMMAND: ip -netns me-nRsN3E nexthop add id 63 via 2001:db8:91::4
Error: Device attribute required for non-blackhole and non-fdb nexthops.
COMMAND: ip -netns me-nRsN3E nexthop add id 64 via 2001:db8:91::5
Error: Device attribute required for non-blackhole and non-fdb nexthops.
COMMAND: ip -netns me-nRsN3E nexthop add id 103 group 63/64 fdb
Error: Invalid nexthop id.
TEST: Fdb Nexthop group with non-fdb nexthops [ OK ]
[...]
IPv4 fdb groups functional
--------------------------
[...]
COMMAND: ip -netns me-nRsN3E nexthop add id 14 via 172.16.1.2
Error: Device attribute required for non-blackhole and non-fdb nexthops.
COMMAND: ip -netns me-nRsN3E nexthop add id 15 via 172.16.1.3
Error: Device attribute required for non-blackhole and non-fdb nexthops.
COMMAND: ip -netns me-nRsN3E nexthop add id 103 group 14/15 fdb
Error: Invalid nexthop id.
TEST: Fdb Nexthop group with non-fdb nexthops [ OK ]
COMMAND: ip -netns me-nRsN3E nexthop add id 16 via 172.16.1.2 fdb
COMMAND: ip -netns me-nRsN3E nexthop add id 17 via 172.16.1.3 fdb
COMMAND: ip -netns me-nRsN3E nexthop add id 104 group 14/15
Error: Invalid nexthop id.
TEST: Non-Fdb Nexthop group with fdb nexthops [ OK ]
[...]
COMMAND: ip -netns me-0dlhyd ro add 172.16.0.0/22 nhid 15
Error: Nexthop id does not exist.
TEST: Route add with fdb nexthop [ OK ]
In addition, as can be seen in the above output, a couple of IPv4 test
cases used the non-FDB nexthops (14 and 15) when they intended to use
the FDB nexthops (16 and 17). These test cases only passed because
failure was expected, but they failed for the wrong reason.
Fix the test to create the non-FDB nexthops with a nexthop device and
adjust the IPv4 test cases to use the FDB nexthops instead of the
non-FDB nexthops.
Output after the fix:
# ./fib_nexthops.sh -t "ipv6_fdb_grp_fcnal ipv4_fdb_grp_fcnal" -v
IPv6 fdb groups functional
--------------------------
[...]
COMMAND: ip -netns me-lNzfHP nexthop add id 63 via 2001:db8:91::4 dev veth1
COMMAND: ip -netns me-lNzfHP nexthop add id 64 via 2001:db8:91::5 dev veth1
COMMAND: ip -netns me-lNzfHP nexthop add id 103 group 63/64 fdb
Error: FDB nexthop group can only have fdb nexthops.
TEST: Fdb Nexthop group with non-fdb nexthops [ OK ]
[...]
IPv4 fdb groups functional
--------------------------
[...]
COMMAND: ip -netns me-lNzfHP nexthop add id 14 via 172.16.1.2 dev veth1
COMMAND: ip -netns me-lNzfHP nexthop add id 15 via 172.16.1.3 dev veth1
COMMAND: ip -netns me-lNzfHP nexthop add id 103 group 14/15 fdb
Error: FDB nexthop group can only have fdb nexthops.
TEST: Fdb Nexthop group with non-fdb nexthops [ OK ]
COMMAND: ip -netns me-lNzfHP nexthop add id 16 via 172.16.1.2 fdb
COMMAND: ip -netns me-lNzfHP nexthop add id 17 via 172.16.1.3 fdb
COMMAND: ip -netns me-lNzfHP nexthop add id 104 group 16/17
Error: Non FDB nexthop group cannot have fdb nexthops.
TEST: Non-Fdb Nexthop group with fdb nexthops [ OK ]
[...]
COMMAND: ip -netns me-lNzfHP ro add 172.16.0.0/22 nhid 16
Error: Route cannot point to a fdb nexthop.
TEST: Route add with fdb nexthop [ OK ]
[...]
Tests passed: 30
Tests failed: 0
Tests skipped: 0
Fixes: 0534c5489c ("selftests: net: add fdb nexthop tests")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250921150824.149157-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To test bpf_xdp_pull_data(), an xdp packet containing fragments as well
as free linear data area after xdp->data_end needs to be created.
However, bpf_prog_test_run_xdp() always fills the linear area with
data_in before creating fragments, leaving no space to pull data. This
patch will allow users to specify the linear data size through
ctx->data_end.
Currently, ctx_in->data_end must match data_size_in and will not be the
final ctx->data_end seen by xdp programs. This is because ctx->data_end
is populated according to the xdp_buff passed to test_run. The linear
data area available in an xdp_buff, max_linear_sz, is always filled up
before copying data_in into fragments.
This patch will allow users to specify the size of data that goes into
the linear area. When ctx_in->data_end is different from data_size_in,
only ctx_in->data_end bytes of data will be put into the linear area when
creating the xdp_buff.
While ctx_in->data_end will be allowed to be different from data_size_in,
it cannot be larger than the data_size_in as there will be no data to
copy from user space. If it is larger than the maximum linear data area
size, the layout suggested by the user will not be honored. Data beyond
max_linear_sz bytes will still be copied into fragments.
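The resulting split between the linear area and the fragments can be sketched
as below. This is a simplified model of the rules described above, assuming no
metadata; the function name and the example sizes are hypothetical.

```python
def split_test_run_data(data_size_in: int, data_end: int, max_linear_sz: int):
    """Return (linear_bytes, frag_bytes) for a test-run XDP packet.

    Simplified model: no metadata, all sizes in bytes.
    """
    if data_end > data_size_in:
        # There is no data to copy from user space beyond data_size_in.
        raise ValueError("data_end cannot be larger than data_size_in")
    # A request larger than the linear area is capped, not honored as-is.
    linear = min(data_end, max_linear_sz)
    # Everything that does not fit in the linear area goes into fragments.
    return linear, data_size_in - linear


print(split_test_run_data(4096, 64, 3000))    # small linear area requested
print(split_test_run_data(4096, 4096, 3000))  # request capped at max_linear_sz
```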
Finally, since it is possible for a NIC to produce an xdp_buff with an empty
linear data area, allow it when calling bpf_test_init() from
bpf_prog_test_run_xdp() so that we can test XDP kfuncs with such an
xdp_buff. This is done by moving the lower-bound check to the callers, as
most of them already perform it, except bpf_prog_test_run_skb(). The change also fixes a
bug that allows passing an xdp_buff with data < ETH_HLEN. This can
happen when ctx is used and metadata is at least ETH_HLEN.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250922233356.3356453-7-ameryhung@gmail.com
Exercise the scenario described in detail in the cover letter:
1) socket A: connect() from ephemeral port X
2) socket B: explicitly bind() to port X
3) check that port X is now excluded from ephemeral ports
4) close socket B to release the port bind
5) socket C: connect() from ephemeral port X
As well as a corner case to test that the connect-bind flag is cleared:
1) connect() from ephemeral port X
2) disconnect the socket with connect(AF_UNSPEC)
3) bind() it explicitly to port X
4) check that port X is now excluded from ephemeral ports
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://patch.msgid.link/20250917-update-bind-bucket-state-on-unhash-v5-2-57168b661b47@cloudflare.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This attribute is a boolean, so there is no need to add it just to set it to
'false': the default value when this attribute is not set is naturally
'false'. A few bytes can then be saved by not adding this attribute if
the connection is not on the server side.
This prepares for the future deprecation of this attribute, in favour of a
new flag.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250919-net-next-mptcp-server-side-flag-v1-1-a97a5d561a8b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Quoted from musl wiki:
GNU getopt permutes argv to pull options to the front, ahead of
non-option arguments. musl and the POSIX standard getopt stop
processing options at the first non-option argument with no
permutation.
Thus these scripts stop working on musl since non-option arguments for
tools using getopt() (in this case, (ar)ping) do not always come last.
Fix it by reordering arguments.
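The difference is easy to reproduce outside of C: Python's standard getopt
module offers both behaviours. The sketch below is only an illustration of the
two parsing styles. The ping-like command line is hypothetical and is not
taken from the scripts being fixed.

```python
import getopt
import os

# gnu_getopt() falls back to POSIX behaviour when POSIXLY_CORRECT is set,
# so clear it to make the contrast visible.
os.environ.pop("POSIXLY_CORRECT", None)

# Hypothetical ping-like command line with the destination before the option.
args = ["192.0.2.1", "-c", "1"]

# POSIX-style parsing: stops at the first non-option argument.
posix_opts, posix_rest = getopt.getopt(args, "c:")

# GNU-style parsing: options are recognised wherever they appear.
gnu_opts, gnu_rest = getopt.gnu_getopt(args, "c:")

# Reordered command line, options first: works with both styles.
fixed_opts, fixed_rest = getopt.getopt(["-c", "1", "192.0.2.1"], "c:")

print(posix_opts, posix_rest)
print(gnu_opts, gnu_rest)
print(fixed_opts, fixed_rest)
```

With POSIX-style parsing the `-c 1` option is left among the positional
arguments unless it comes first, which is why putting options before
non-option arguments is the portable ordering.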
Signed-off-by: David Yang <mmyangfl@gmail.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250919053538.1106753-1-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull iommufd fixes from Jason Gunthorpe:
"Fix two user triggerable use-after-free issues:
- Possible race UAF setting up mmaps
- Syzkaller found UAF when erroring a file descriptor creation ioctl
due to the fput() work queue"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd/selftest: Update the fail_nth limit
iommufd: WARN if an object is aborted with an elevated refcount
iommufd: Fix race during abort for file descriptors
iommufd: Fix refcounting race during mmap
Pull LoongArch fixes from Huacai Chen:
"Fix some build warnings for RUST-enabled objtool check, align ACPI
structures for ARCH_STRICT_ALIGN, fix an unreliable stack for live
patching, add some NULL pointer checks, and fix some bugs around
KVM"
* tag 'loongarch-fixes-6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_pch_pic_regs_access()
LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_eiointc_sw_status_access()
LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_eiointc_regs_access()
LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_eiointc_ctrl_access()
LoongArch: KVM: Fix VM migration failure with PTW enabled
LoongArch: KVM: Remove unused returns and semicolons
LoongArch: vDSO: Check kcalloc() result in init_vdso()
LoongArch: Fix unreliable stack for live patching
LoongArch: Replace sprintf() with sysfs_emit()
LoongArch: Check the return value when creating kobj
LoongArch: Align ACPI structures if ARCH_STRICT_ALIGN enabled
LoongArch: Update help info of ARCH_STRICT_ALIGN
LoongArch: Handle jump tables options for RUST
LoongArch: Make LTO case independent in Makefile
objtool/LoongArch: Mark special atomic instruction as INSN_BUG type
objtool/LoongArch: Mark types based on break immediate code
Cross-merge networking fixes after downstream PR (net-6.17-rc7).
No conflicts.
Adjacent changes:
drivers/net/ethernet/mellanox/mlx5/core/en/fs.h
9536fbe10c ("net/mlx5e: Add PSP steering in local NIC RX")
7601a0a462 ("net/mlx5e: Add a miss level for ipsec crypto offload")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless. No known regressions at this point.
Current release - fix to a fix:
- eth: Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set"
- wifi: iwlwifi: pcie: fix byte count table for 7000/8000 devices
- net: clear sk->sk_ino in sk_set_socket(sk, NULL), fix CRIU
Previous releases - regressions:
- bonding: set random address only when slaves already exist
- rxrpc: fix untrusted unsigned subtract
- eth:
- ice: fix Rx page leak on multi-buffer frames
- mlx5: don't return mlx5_link_info table when speed is unknown
Previous releases - always broken:
- tls: make sure to abort the stream if headers are bogus
- tcp: fix null-deref when using TCP-AO with TCP_REPAIR
- dpll: fix skipping last entry in clock quality level reporting
- eth: qed: don't collect too many protection override GRC elements,
fix memory corruption"
* tag 'net-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp()
cnic: Fix use-after-free bugs in cnic_delete_task
devlink rate: Remove unnecessary 'static' from a couple places
MAINTAINERS: update sundance entry
net: liquidio: fix overflow in octeon_init_instr_queue()
net: clear sk->sk_ino in sk_set_socket(sk, NULL)
Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set"
selftests: tls: test skb copy under mem pressure and OOB
tls: make sure to abort the stream if headers are bogus
selftest: packetdrill: Add tcp_fastopen_server_reset-after-disconnect.pkt.
tcp: Clear tcp_sk(sk)->fastopen_rsk in tcp_disconnect().
octeon_ep: fix VF MAC address lifecycle handling
selftests: bonding: add vlan over bond testing
bonding: don't set oif to bond dev when getting NS target destination
net: rfkill: gpio: Fix crash due to dereferencering uninitialized pointer
net/mlx5e: Add a miss level for ipsec crypto offload
net/mlx5e: Harden uplink netdev access against device unbind
MAINTAINERS: make the DPLL entry cover drivers
doc/netlink: Fix typos in operation attributes
igc: don't fail igc_probe() on LED setup error
...
When compiling with LLVM and CONFIG_RUST is set, there exists the
following objtool warning:
rust/compiler_builtins.o: warning: objtool: __rust__unordsf2(): unexpected end of section .text.unlikely.
objdump shows that the end of section .text.unlikely is an atomic
instruction:
amswap.w $zero, $ra, $zero
According to the LoongArch Reference Manual, if the amswap.w atomic
memory access instruction uses the same register number for rd and rj,
its execution will trigger an Instruction Non-defined Exception, so
mark the above instruction as INSN_BUG type to fix the warning.
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
If the break immediate code is 0, the instruction type should be marked
as INSN_TRAP. If the break immediate code is 1, the type should be
marked as INSN_BUG.
While at it, tidy up the code style and add a code comment for nop.
Cc: stable@vger.kernel.org
Suggested-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add a netlink family for PSP and allow drivers to register support.
The "PSP device" is its own object. This allows us to perform more
flexible reference counting / lifetime control than if PSP information
was part of net_device. In the future we should also be able
to "delegate" PSP access to software devices, such as *vlan, veth
or netkit more easily.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250917000954.859376-3-daniel.zahka@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The test reproduces the scenario explained in the previous patch.
Without the patch, the test triggers the warning and cannot see the last
retransmitted packet.
# ./ksft_runner.sh tcp_fastopen_server_reset-after-disconnect.pkt
TAP version 13
1..2
[ 29.229250] ------------[ cut here ]------------
[ 29.231414] WARNING: CPU: 26 PID: 0 at net/ipv4/tcp_timer.c:542 tcp_retransmit_timer+0x32/0x9f0
...
tcp_fastopen_server_reset-after-disconnect.pkt:26: error handling packet: Timed out waiting for packet
not ok 1 ipv4
tcp_fastopen_server_reset-after-disconnect.pkt:26: error handling packet: Timed out waiting for packet
not ok 2 ipv6
# Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250915175800.118793-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull perf tools fixes from Namhyung Kim:
"A small set of fixes for crashes in different commands and conditions"
* tag 'perf-tools-fixes-for-v6.17-2025-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf maps: Ensure kmap is set up for all inserts
perf lock: Provide a host_env for session new
perf subcmd: avoid crash in exclude_cmds when excludes is empty
The attribute WGALLOWEDIP_A_IPADDR can contain either an IPv4
or an IPv6 address depending on WGALLOWEDIP_A_FAMILY, however
in practice it is enough to look at the attribute length.
This patch implements an ipv4-or-v6 display hint, that can
deal with this kind of attribute.
The display hint is only implemented for genetlink-legacy. It can be
added to other protocol variants if needed, but we don't want to
encourage its use.
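The idea of choosing the address family from the attribute length can be
sketched as follows. This is a minimal illustration, not the pyynl
implementation; the helper name is hypothetical.

```python
import ipaddress


def format_ipv4_or_v6(raw: bytes) -> str:
    """Format a binary address, picking the family from the payload length."""
    if len(raw) == 4:
        return str(ipaddress.IPv4Address(raw))
    if len(raw) == 16:
        return str(ipaddress.IPv6Address(raw))
    raise ValueError(f"unexpected address length: {len(raw)}")


print(format_ipv4_or_v6(bytes([192, 0, 2, 1])))
print(format_ipv4_or_v6(bytes.fromhex("20010db8" + "00" * 11 + "01")))
```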
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250915144301.725949-12-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds support for decoding hex input, so
that binary attributes can be read through --json.
Example (using future wireguard.yaml):
$ sudo ./tools/net/ynl/pyynl/cli.py --family wireguard \
--do set-device --json '{"ifindex":3,
"private-key":"2a ae 6c 35 c9 4f cf <... to 32 bytes>"}'
To somewhat mirror what is done in _formatted_string(), attempt to
convert the input to an int for non-binary attributes.
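A minimal sketch of this kind of decoding is shown below. The helper name and
its `is_binary` argument are hypothetical, and parsing the non-binary case as
base 16 is an assumption made for the illustration.

```python
def decode_hex_input(text: str, is_binary: bool):
    """Decode a hex string to bytes for binary attributes, else to an int."""
    if is_binary:
        # bytes.fromhex() ignores the spaces between byte pairs.
        return bytes.fromhex(text)
    # Assumption: non-binary values are parsed as base-16 integers.
    return int(text, 16)


print(decode_hex_input("2a ae 6c", is_binary=True))
print(decode_hex_input("ff", is_binary=False))
```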
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250915144301.725949-11-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since TypeArrayNest can now be used with many sub-types other
than nest, rename it to TypeIndexedArray to reduce
confusion.
This patch continues the rename that was started in commit
aa6485d813 ("ynl: rename array-nest to indexed-array"),
when the YNL type was renamed.
In order to get rid of all references to the old naming
within ynl, also rename some variables in _multi_parse().
This is a trivial patch with no behavioural changes intended.
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250915144301.725949-8-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In nested arrays, don't require the intermediate attribute
type to be a valid attribute type: it might just be zero
or an incrementing index, and it is often not even used.
See include/net/netlink.h about NLA_NESTED_ARRAY:
> The difference to NLA_NESTED is the structure:
> NLA_NESTED has the nested attributes directly inside
> while an array has the nested attributes at another
> level down and the attribute types directly in the
> nesting don't matter.
Example based on include/uapi/linux/wireguard.h:
> WGDEVICE_A_PEERS: NLA_NESTED
> 0: NLA_NESTED
> WGPEER_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
> [..]
> 0: NLA_NESTED
> ...
> ...
Previously, the check required that the nested type was valid
in the parent attribute set, which in this case resolves to
WGDEVICE_A_UNSPEC, which is YNL_PT_REJECT, and it took the
early exit and returned YNL_PARSE_CB_ERROR.
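The array layout described above can be illustrated with a small parser. This
is a sketch, not the ynl C library: all helper names are hypothetical, flag
bits in the attribute type are ignored, and the inner attribute type number
used in the example is chosen for illustration only.

```python
import struct

NLA_HDRLEN = 4  # struct nlattr: u16 nla_len, u16 nla_type


def nla_align(length: int) -> int:
    """Round a length up to the 4-byte netlink attribute alignment."""
    return (length + 3) & ~3


def pack_attr(attr_type: int, payload: bytes) -> bytes:
    """Build one attribute: header, payload, then padding to 4 bytes."""
    length = NLA_HDRLEN + len(payload)
    pad = b"\0" * (nla_align(length) - length)
    return struct.pack("=HH", length, attr_type) + payload + pad


def iter_attrs(buf: bytes):
    """Yield (type, payload) for each attribute found in buf."""
    offset = 0
    while offset + NLA_HDRLEN <= len(buf):
        length, attr_type = struct.unpack_from("=HH", buf, offset)
        if length < NLA_HDRLEN or offset + length > len(buf):
            raise ValueError("truncated attribute")
        yield attr_type, buf[offset + NLA_HDRLEN:offset + length]
        offset += nla_align(length)


def parse_nested_array(buf: bytes):
    """Return one {type: payload} dict per entry, ignoring the entry types."""
    return [dict(iter_attrs(entry)) for _index, entry in iter_attrs(buf)]


# Two array entries, both using 0 as the intermediate type.
KEY = 1  # illustrative inner attribute type
peers = pack_attr(0, pack_attr(KEY, b"k1")) + pack_attr(0, pack_attr(KEY, b"k2"))
print(parse_nested_array(peers))
```

The intermediate type (`_index`) is read but never validated, which matches
the point that the types directly in the nesting do not matter.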
This patch renames the old nl_attr_validate() to
__nl_attr_validate(), and creates a new inline function
nl_attr_validate() to mimic the old one.
The new __nl_attr_validate() takes the attribute type as an
argument, so we can use it to validate attributes of a
nested attribute, in the context of the parent's attribute
type, which in the above case is generated as:
[WGDEVICE_A_PEERS] = {
.name = "peers",
.type = YNL_PT_NEST,
.nest = &wireguard_wgpeer_nest,
},
__nl_attr_validate() only checks if the attribute length
is plausible for a given attribute type, so the .nest in
the above example is not used.
As the new inline function needs to be defined after
ynl_attr_type(), the definitions are moved down to
avoid a forward declaration of ynl_attr_type().
Some other examples are NL80211_BAND_ATTR_FREQS (nest) and
NL80211_ATTR_SUPPORTED_COMMANDS (u32) both in nl80211-user.c
$ make -C tools/net/ynl/generated nl80211-user.c
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250915144301.725949-7-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Refactor the generation of local variables needed when building
requests, by moving the logic from put_req_nested() into a new
helper put_local_vars(), and use the helper before .attr_put() is
called, thus generating the local variables assumed by .attr_put().
Previously only put_req_nested() generated the variables assumed
by .attr_put(), print_req() only generated the count iterator `i`,
and print_dump() neither generated `i` nor `array`.
This patch fixes the build errors below:
$ make -C tools/net/ynl/generated/
[...]
-e GEN wireguard-user.c
-e GEN wireguard-user.h
-e CC wireguard-user.o
wireguard-user.c: In function ‘wireguard_get_device_dump’:
wireguard-user.c:480:9: error: ‘array’ undeclared (first use in func)
480 | array = ynl_attr_nest_start(nlh, WGDEVICE_A_PEERS);
| ^~~~~
wireguard-user.c:480:9: note: each undeclared identifier is reported
only once for each function it appears in
wireguard-user.c:481:14: error: ‘i’ undeclared (first use in func)
481 | for (i = 0; i < req->_count.peers; i++)
| ^
wireguard-user.c: In function ‘wireguard_set_device’:
wireguard-user.c:533:9: error: ‘array’ undeclared (first use in func)
533 | array = ynl_attr_nest_start(nlh, WGDEVICE_A_PEERS);
| ^~~~~
make: *** [Makefile:52: wireguard-user.o] Error 1
make: Leaving directory '/usr/src/linux/tools/net/ynl/generated'
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250915144301.725949-5-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds support for NLA_POLICY_NESTED_ARRAY() policies.
Example spec (from future wireguard.yaml):
-
name: wgpeer
attributes:
-
name: allowedips
type: indexed-array
sub-type: nest
nested-attributes: wgallowedip
yields NLA_POLICY_NESTED_ARRAY(wireguard_wgallowedip_nl_policy).
This doesn't change any currently generated code, as it isn't
used in any specs currently used for generating code.
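The mapping from the spec entry to the policy macro can be sketched as below.
This is a hypothetical helper written for illustration, not the code in the
generator; it only handles the indexed-array of nests case shown above.

```python
def policy_for_attr(family: str, attr: dict) -> str:
    """Hypothetical helper: policy macro for an indexed-array of nests."""
    if attr.get("type") == "indexed-array" and attr.get("sub-type") == "nest":
        nested = attr["nested-attributes"].replace("-", "_")
        return f"NLA_POLICY_NESTED_ARRAY({family}_{nested}_nl_policy)"
    raise NotImplementedError("only indexed-array of nests is sketched here")


# The attribute from the example spec above, as a plain dict.
allowedips = {
    "name": "allowedips",
    "type": "indexed-array",
    "sub-type": "nest",
    "nested-attributes": "wgallowedip",
}
print(policy_for_attr("wireguard", allowedips))
```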
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250915144301.725949-3-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>