mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
linux/net/core
Xin Long a12c76a033 net: sched: refine software bypass handling in tc_run
This patch addresses issues with filter counting in block (tcf_block),
particularly for software bypass scenarios, by introducing a more
accurate mechanism using useswcnt.

Previously, filtercnt and skipswcnt were introduced by:

  Commit 2081fd3445 ("net: sched: cls_api: add filter counter") and
  Commit f631ef39d8 ("net: sched: cls_api: add skip_sw counter")

  filtercnt tracked all tp (tcf_proto) objects added to a block, and
  skipswcnt counted tp objects with the skipsw attribute set.

The problem is that a single tp can contain multiple filters, some with
skipsw and others without. The current implementation fails in the
following case:

  When the first filter in a tp has skipsw, both skipswcnt and filtercnt
  are incremented, then adding a second filter without skipsw to the same
  tp does not modify these counters because tp->counted is already set.

  As a result, software is bypassed solely because skipswcnt equals
  filtercnt, even though the block includes filters without skipsw.
  Consequently, filters without skipsw are inadvertently bypassed.

To address this, the patch introduces useswcnt in block to explicitly count
tp objects containing at least one filter without skipsw. Key changes
include:

  Whenever a filter without skipsw is added, its tp is marked with usesw
  and counted in useswcnt. tc_run() now uses useswcnt to determine software
  bypass, eliminating reliance on filtercnt and skipswcnt.

  This refined approach prevents software bypass for blocks containing
  mixed filters, ensuring correct behavior in tc_run().
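
The difference between the two counting schemes can be sketched with a small
user-space model. This is only an illustration under stated assumptions, not
the kernel code: `model_block`, `model_tp`, `model_add_filter()`,
`old_bypass()` and `new_bypass()` are hypothetical names, the structs are
stripped-down stand-ins for tcf_block and tcf_proto, and locking is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for tcf_block: only the three counters. */
struct model_block {
    atomic_int filtercnt; /* old scheme: number of counted tp objects */
    atomic_int skipswcnt; /* old scheme: counted tp objects whose first filter had skip_sw */
    atomic_int useswcnt;  /* new scheme: tp objects with >= 1 filter without skip_sw */
};

/* Hypothetical stand-in for tcf_proto: only the two flags. */
struct model_tp {
    bool counted; /* old scheme: tp already contributed to the counters */
    bool usesw;   /* new scheme: tp already contributed to useswcnt */
};

/* Add one filter to tp and update both schemes side by side. */
static void model_add_filter(struct model_block *b, struct model_tp *tp,
                             bool skip_sw)
{
    /* Old scheme: only the first filter of a tp touches the counters. */
    if (!tp->counted) {
        tp->counted = true;
        atomic_fetch_add(&b->filtercnt, 1);
        if (skip_sw)
            atomic_fetch_add(&b->skipswcnt, 1);
    }

    /* New scheme: any filter without skip_sw marks the tp, once. */
    if (!skip_sw && !tp->usesw) {
        tp->usesw = true;
        atomic_fetch_add(&b->useswcnt, 1);
    }
}

/* Old decision: bypass software when every counted tp looked skip_sw. */
static bool old_bypass(struct model_block *b)
{
    int n = atomic_load(&b->filtercnt);

    return n > 0 && n == atomic_load(&b->skipswcnt);
}

/* New decision: bypass software only when no tp needs software. */
static bool new_bypass(struct model_block *b)
{
    return atomic_load(&b->useswcnt) == 0;
}
```

Adding a skip_sw filter and then a non-skip_sw filter to the same
`model_tp` leaves `old_bypass()` true, which is the failure described above,
while `new_bypass()` turns false as soon as the second filter is added.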

Additionally, as atomic operations on useswcnt ensure thread safety and
tp->lock guards access to tp->usesw and tp->counted, the broader lock
down_write(&block->cb_lock) is no longer required in tc_new_tfilter(),
and this resolves a performance regression caused by the filter counting
mechanism during parallel filter insertions.

  The improvement can be demonstrated using the following script:

  # cat insert_tc_rules.sh

    tc qdisc add dev ens1f0np0 ingress
    for i in $(seq 16); do
        taskset -c $i tc -b rules_$i.txt &
    done
    wait

  Each rules_$i.txt file above contains 100000 tc filter rules for the
  mlx5 driver NIC ens1f0np0.

  Without this patch:

  # time sh insert_tc_rules.sh

    real    0m50.780s
    user    0m23.556s
    sys     4m13.032s

  With this patch:

  # time sh insert_tc_rules.sh

    real    0m17.718s
    user    0m7.807s
    sys     3m45.050s

Fixes: 047f340b36 ("net: sched: make skip_sw actually skip software")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Tested-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-20 09:21:27 +00:00
bpf_sk_storage.c bpf: Add "bool swap_uptrs" arg to bpf_local_storage_update() and bpf_selem_alloc() 2024-10-24 10:25:59 -07:00
datagram.c net: add support for skbs with unreadable frags 2024-09-11 20:44:31 -07:00
dev_addr_lists_test.c net: dev_addr_lists: move locking out of init/exit in kunit 2024-04-15 10:26:35 +01:00
dev_addr_lists.c net: ti: icssg-prueth: Add Multicast Filtering support for VLAN in MAC mode 2025-01-14 12:17:27 +01:00
dev_ioctl.c dev: Hold rtnl_net_lock() for dev_ifsioc(). 2025-01-16 17:20:50 -08:00
dev.c net: sched: refine software bypass handling in tc_run 2025-01-20 09:21:27 +00:00
dev.h net: make netdev netlink ops hold netdev_lock() 2025-01-15 19:13:34 -08:00
devmem.c net: devmem: add ring parameter filtering 2025-01-15 14:42:11 -08:00
devmem.h tcp: RX path for devmem TCP 2024-09-11 20:44:32 -07:00
drop_monitor.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
dst_cache.c net: dst_cache: add two DEBUG_NET warnings 2024-06-03 18:50:09 -07:00
dst.c net: do not delay dst_entries_add() in dst_release() 2024-10-10 11:28:17 +02:00
failover.c net: failover: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf 2022-12-12 15:18:25 -08:00
fib_notifier.c net: do not acquire rtnl in fib_seq_sum() 2024-10-11 15:35:05 -07:00
fib_rules.c net: fib_rules: Enable flow label selector usage 2024-12-19 16:02:22 +01:00
filter.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-01-16 10:34:59 -08:00
flow_dissector.c net: flow_dissector: use DEBUG_NET_WARN_ON_ONCE 2024-07-18 10:52:17 +02:00
flow_offload.c tc: flower: Enable offload support IPSEC SPI field. 2023-08-02 10:09:32 +01:00
gen_estimator.c net: use unrcu_pointer() helper 2024-06-06 11:52:52 +02:00
gen_stats.c net: Remove the obsolte u64_stats_fetch_*_irq() users (net). 2022-10-28 20:13:54 -07:00
gro_cells.c net: move netdev_max_backlog to net_hotdata 2024-03-07 21:12:42 -08:00
gro.c net: Add netif_get_gro_max_size helper for GRO 2024-10-01 10:48:51 +02:00
gso.c net: introduce struct net_hotdata 2024-03-07 21:12:41 -08:00
hotdata.c net: move sysctl_mem_pcpu_rsv to net_hotdata 2024-04-30 18:46:52 -07:00
hwbm.c
ieee8021q_helpers.c net: add IEEE 802.1q specific helpers 2024-05-08 10:35:09 +01:00
link_watch.c ipvlan: Fix use-after-free in ipvlan_get_iflink(). 2025-01-07 17:50:49 -08:00
lwt_bpf.c bpf: lwtunnel: Prepare bpf_lwt_xmit_reroute() to future .flowi4_tos conversion. 2024-11-14 19:07:49 -08:00
lwtunnel.c
Makefile net: Implement fault injection forcing skb reallocation 2024-11-12 12:05:33 +01:00
mp_dmabuf_devmem.h memory-provider: dmabuf devmem memory provider 2024-09-11 20:44:31 -07:00
neighbour.c net/neighbor: clear error in case strict check is not set 2024-11-18 18:42:21 -08:00
net_namespace.c net: expedite synchronize_net() for cleanup_net() 2025-01-15 19:17:03 -08:00
net_test.c pfcp: always set pfcp metadata 2024-04-01 10:49:28 +01:00
net-procfs.c net: make softnet_data.dropped an atomic_t 2024-04-01 11:28:32 +01:00
net-sysfs.c net: protect NAPI config fields with netdev_lock() 2025-01-15 19:13:34 -08:00
net-sysfs.h
net-traces.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
netclassid_cgroup.c cgroup, netclassid: on modifying netclassid in cgroup, only consider the main process. 2023-10-16 16:36:53 -07:00
netdev_rx_queue.c netdev: define NETDEV_INTERNAL 2025-01-09 15:33:08 +01:00
netdev-genl-gen.c netdev: avoid CFI problems with sock priv helpers 2025-01-16 13:15:40 +01:00
netdev-genl-gen.h netdev-genl: Support setting per-NAPI config values 2024-10-14 17:54:29 -07:00
netdev-genl.c netdev-genl: remove rtnl_lock protection from NAPI ops 2025-01-15 19:13:35 -08:00
netevent.c
netmem_priv.h page_pool: devmem support 2024-09-11 20:44:31 -07:00
netpoll.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-01-16 10:34:59 -08:00
netprio_cgroup.c
of_net.c net: Explicitly include correct DT includes 2023-07-27 20:33:16 -07:00
page_pool_priv.h memory-provider: dmabuf devmem memory provider 2024-09-11 20:44:31 -07:00
page_pool_user.c netdev: add dmabuf introspection 2024-09-11 20:44:32 -07:00
page_pool.c selftests: drv-net-hw: inject pp_alloc_fail errors in the right place 2025-01-16 17:18:53 -08:00
pktgen.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-01-16 10:34:59 -08:00
ptp_classifier.c
request_sock.c tcp: make sure init the accept_queue's spinlocks once 2024-01-19 21:13:25 -08:00
rtnetlink.c rtnetlink: Add rtnl_net_lock_killable(). 2025-01-07 13:45:53 +01:00
rtnl_net_debug.c dev: Hold rtnl_net_lock() for dev_ifsioc(). 2025-01-16 17:20:50 -08:00
scm.c af_unix: Add dead flag to struct scm_fp_list. 2024-05-10 18:52:45 -07:00
secure_seq.c
selftests.c net: fill in MODULE_DESCRIPTION()s under net/core 2023-10-28 11:29:27 +01:00
skb_fault_injection.c net: Implement fault injection forcing skb reallocation 2024-11-12 12:05:33 +01:00
skbuff.c bpf, xdp: constify some bpf_prog * function arguments 2024-12-05 18:41:06 -08:00
skmsg.c skmsg: Return copied bytes in sk_msg_memcopy_from_iter 2024-12-20 22:53:36 +01:00
sock_destructor.h
sock_diag.c net: use unrcu_pointer() helper 2024-06-06 11:52:52 +02:00
sock_map.c bpf, sockmap: Fix race between element replace and close() 2024-12-10 17:38:05 +01:00
sock_reuseport.c net: core: annotate socks of struct sock_reuseport with __counted_by 2024-08-02 17:16:59 -07:00
sock.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-01-03 16:29:29 -08:00
stream.c net: Return error from sk_stream_wait_connect() if sk_wait_event() fails 2023-12-15 10:48:51 +00:00
sysctl_net_core.c net: sysctl: allow dump_cpumask to handle higher numbers of CPUs 2024-10-23 10:28:26 +02:00
timestamping.c net: Add the possibility to support a selected hwtstamp in netdevice 2024-12-16 12:51:40 +00:00
tso.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
utils.c net: Correct spelling in net/core 2024-08-26 09:37:23 -07:00
xdp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-01-16 10:34:59 -08:00