sugar/linux - Sugar's Linux Kernel Patch

Age	Commit message (Collapse)	Author
6 days	Merge tag 'net-next-6.19' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API: - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers: - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. - Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support" * tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits) net: page_pool: sanitise allocation order net: page pool: xa init with destroy on pp init net/mlx5e: Support XDP target xmit with dummy program net/mlx5e: Update XDP features in switch channels selftests/tc-testing: Test CAKE scheduler when enqueue drops packets net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop wireguard: netlink: generate netlink code wireguard: uapi: generate header with ynl-gen wireguard: uapi: move flag enums wireguard: uapi: move enum wg_cmd wireguard: netlink: add YNL specification selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py selftests: drv-net: introduce Iperf3Runner for measurement use cases selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive() Documentation: net: dsa: mention simple HSR offload helpers Documentation: net: dsa: mention availability of RedBox ...
10 days	selftests/bpf: do not hardcode target rate in test_tc_edt BPF program	Alexis Lothoré (eBPF Foundation)
	test_tc_edt currently defines the target rate in both the userspace and BPF parts. This value could be defined once in the userspace part if we make it able to configure the BPF program before starting the test. Add a target_rate variable in the BPF part, and make the userspace part set it to the desired rate before attaching the shaping program. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-4-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 days	selftests/bpf: remove test_tc_edt.sh	Alexis Lothoré (eBPF Foundation)
	Now that test_tc_edt has been integrated in test_progs, remove the legacy shell script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-3-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 days	selftests/bpf: integrate test_tc_edt into test_progs	Alexis Lothoré (eBPF Foundation)
	test_tc_edt.sh uses a pair of veth and a BPF program attached to the TX veth to shape the traffic to 5MBps. It then checks that the amount of received bytes (at interface level), compared to the TX duration, indeed matches 5Mbps. Convert this test script to the test_progs framework: - keep the double veth setup, isolated in two veths - run a small tcp server, and connect client to server - push a pre-configured amount of bytes, and measure how much time has been needed to push those - ensure that this rate is in a 2% error margin around the target rate This two percent value, while being tight, is hopefully large enough to not make the test too flaky in CI, while also turning it into a small example of BPF-based shaping. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-2-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 days	selftests/bpf: rename test_tc_edt.bpf.c section to expose program type	Alexis Lothoré (eBPF Foundation)
	The test_tc_edt BPF program uses a custom section name, which works fine when manually loading it with tc, but prevents it from being loaded with libbpf. Update the program section name to "tc" to be able to manipulate it with a libbpf-based C test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-1-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 days	selftests/bpf: Add success stats to rqspinlock stress test	Kumar Kartikeya Dwivedi
	Add stats to observe the success and failure rate of lock acquisition attempts in various contexts. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251128232802.1031906-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
11 days	bpf: Remove runqslower tool	Hoyeon Lee
	runqslower was added in commit 9c01546d26d2 "tools/bpf: Add runqslower tool to tools/bpf" as a BCC port to showcase early BPF CO-RE + libbpf workflows. runqslower continues to live in BCC (libbpf-tools), so there is no need to keep building and maintaining it. Drop tools/bpf/runqslower and remove all build hooks in tools/bpf and selftests accordingly. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Link: https://lore.kernel.org/r/20251126093821.373291-1-hoyeon.lee@suse.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
11 days	selftests/bpf: Remove usage of lsm/file_alloc_security in selftest	Amery Hung
	file_alloc_security hook is disabled. Use other LSM hooks in selftests instead. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20251126202927.2584874-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
11 days	bpf: force BPF_F_RDONLY_PROG on insn array creation	Anton Protopopov
	The original implementation added a hack to check_mem_access() to prevent programs from writing into insn arrays. To get rid of this hack, enforce BPF_F_RDONLY_PROG on map creation. Also fix the corresponding selftest, as the error message changes with this patch. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251128063224.1305482-2-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
14 days	tcp: remove icsk->icsk_retransmit_timer	Eric Dumazet
	Now sk->sk_timer is no longer used by TCP keepalive, we can use its storage for TCP and MPTCP retransmit timers for better cache locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 days	tcp: introduce icsk->icsk_keepalive_timer	Eric Dumazet
	sk->sk_timer has been used for TCP keepalives. Keepalive timers are not in fast path, we want to use sk->sk_timer storage for retransmit timers, for better cache locality. Create icsk->icsk_keepalive_timer and change keepalive code to no longer use sk->sk_timer. Added space is reclaimed in the following patch. This includes changes to MPTCP, which was also using sk_timer. Alias icsk->mptcp_tout_timer and icsk->icsk_keepalive_timer for inet_sk_diag_fill() sake. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 days	selftests/bpf: Make CS length configurable for rqspinlock stress test	Kumar Kartikeya Dwivedi
	Allow users to configure the critical section delay for both task/normal and NMI contexts, and set to 20ms and 10ms as before by default. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
14 days	selftests/bpf: Add lock wait time stats to rqspinlock stress test	Kumar Kartikeya Dwivedi
	Add statistics per-CPU broken down by context and various timing windows for the time taken to acquire an rqspinlock. Cases where all acquisitions fit into the 10ms window are skipped from printing, otherwise the full breakdown is displayed when printing the summary. This allows capturing precisely the number of times outlier attempts happened for a given lock in a given context. A critical detail is that time is captured regardless of success or failure, which is important to capture events for failed but long waiting timeout attempts. Output: [ 64.279459] rqspinlock acquisition latency histogram (ms): [ 64.279472] cpu1: total 528426 (normal 526559, nmi 1867) [ 64.279477] 0-1ms: total 524697 (normal 524697, nmi 0) [ 64.279480] 2-2ms: total 3652 (normal 1811, nmi 1841) [ 64.279482] 3-3ms: total 66 (normal 47, nmi 19) [ 64.279485] 4-4ms: total 2 (normal 1, nmi 1) [ 64.279487] 5-5ms: total 1 (normal 1, nmi 0) [ 64.279489] 6-6ms: total 1 (normal 0, nmi 1) [ 64.279490] 101-150ms: total 1 (normal 0, nmi 1) [ 64.279492] >= 251ms: total 6 (normal 2, nmi 4) ... Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
14 days	selftests/bpf: Relax CPU requirements for rqspinlock stress test	Kumar Kartikeya Dwivedi
	Only require 2 CPUs for AA, 3 for ABBA, 4 for ABBCCA, which is calculated nicely by adding to the mode enum. Enables running single CPU AA tests. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
14 days	selftests/bpf: Call bpf_get_numa_node_id() in trigger_count()	Menglong Dong
	The bench test "trig-kernel-count" can be used as a baseline comparison for fentry and other benchmarks, and the calling to bpf_get_numa_node_id() should be considered as composition of the baseline. So, let's call it in trigger_count(). Meanwhile, rename trigger_count() to trigger_kernel_count() to make it easier understand. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251116014242.151110-1-dongml2@chinatelecom.cn
2025-11-24	selftests/bpf: Fix htab_update/reenter_update selftest failure	Saket Kumar Bhaskar
	Since commit 31158ad02ddb ("rqspinlock: Add deadlock detection and recovery") the updated path on re-entrancy now reports deadlock via -EDEADLK instead of the previous -EBUSY. Also, the way reentrancy was exercised (via fentry/lookup_elem_raw) has been fragile because lookup_elem_raw may be inlined (find_kernel_btf_id() will return -ESRCH). To fix this fentry is attached to bpf_obj_free_fields() instead of lookup_elem_raw() and: - The htab map is made to use a BTF-described struct val with a struct bpf_timer so that check_and_free_fields() reliably calls bpf_obj_free_fields() on element replacement. - The selftest is updated to do two updates to the same key (insert + replace) in prog_test. - The selftest is updated to align with expected errno with the kernel’s current behavior. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/r/20251117060752.129648-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24	selftests/bpf: Allow selftests to build with older xxd	Alan Maguire
	Currently selftests require xxd with the "-n <name>" option which allows the user to specify a name not derived from the input object path. Instead of relying on this newer feature, older xxd can be used if we link our desired name ("test_progs_verification_cert") to the input object. Many distros ship xxd in vim-common package and do not have the latest xxd with -n support. Fixes: b720903e2b14d ("selftests/bpf: Enable signature verification for some lskel tests") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20251120084754.640405-3-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests: bpf: Add tests for unbalanced rcu_read_lock	Puranjay Mohan
	As verifier now supports nested rcu critical sections, add new test cases to make sure unbalanced usage of rcu_read_lock()/unlock() is rejected. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251117200411.25563-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	bpf: support nested rcu critical sections	Puranjay Mohan
	Currently, nested rcu critical sections are rejected by the verifier and rcu_lock state is managed by a boolean variable. Add support for nested rcu critical sections by make active_rcu_locks a counter similar to active_preempt_locks. bpf_rcu_read_lock() increments this counter and bpf_rcu_read_unlock() decrements it, MEM_RCU -> PTR_UNTRUSTED transition happens when active_rcu_locks drops to 0. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251117200411.25563-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	bpf: test the correct stack liveness of tail calls	Eduard Zingerman
	A new test is added: caller_stack_write_tail_call tests that the live stack is correctly tracked for a tail call. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu> Link: https://lore.kernel.org/r/20251119160355.1160932-5-martin.teichmann@xfel.eu Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	bpf: test the proper verification of tail calls	Martin Teichmann
	Three tests are added: - invalidate_pkt_pointers_by_tail_call checks that one can use the packet pointer after a tail call. This was originally possible and also poses not problems, but was made impossible by 1a4607ffba35. - invalidate_pkt_pointers_by_static_tail_call tests a corner case found by Eduard Zingerman during the discussion of the original fix, which was broken in that fix. - subprog_result_tail_call tests that precision propagation works correctly across tail calls. This did not work before. Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251119160355.1160932-3-martin.teichmann@xfel.eu Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests/bpf: Update test_tag to use sha256	Xing Guo
	commit 603b44162325 ("bpf: Update the bpf_prog_calc_tag to use SHA256") changed digest of prog_tag to SHA256 but forgot to update tests correspondingly. Fix it. Fixes: 603b44162325 ("bpf: Update the bpf_prog_calc_tag to use SHA256") Signed-off-by: Xing Guo <higuoxing@gmail.com> Link: https://lore.kernel.org/r/20251121061458.3145167-1-higuoxing@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests/bpf: Improve reliability of test_perf_branches_no_hw()	Matt Bobrowski
	Currently, test_perf_branches_no_hw() relies on the busy loop within test_perf_branches_common() being slow enough to allow at least one perf event sample tick to occur before starting to tear down the backing perf event BPF program. With a relatively small fixed iteration count of 1,000,000, this is not guaranteed on modern fast CPUs, resulting in the test run to subsequently fail with the following: bpf_testmod.ko is already unloaded. Loading bpf_testmod.ko... Successfully loaded bpf_testmod.ko. test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:PASS:output not valid 0 nsec check_good_sample:PASS:read_branches_size 0 nsec check_good_sample:PASS:read_branches_stack 0 nsec check_good_sample:PASS:read_branches_stack 0 nsec check_good_sample:PASS:read_branches_global 0 nsec check_good_sample:PASS:read_branches_global 0 nsec check_good_sample:PASS:read_branches_size 0 nsec test_perf_branches_no_hw:PASS:perf_event_open 0 nsec test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_bad_sample:FAIL:output not valid no valid sample from prog Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED Successfully unloaded bpf_testmod.ko. On a modern CPU (i.e. one with a 3.5 GHz clock rate), executing 1 million increments of a volatile integer can take significantly less than 1 millisecond. If the spin loop and detachment of the perf event BPF program elapses before the first 1 ms sampling interval elapses, the perf event will never end up firing. Fix this by bumping the loop iteration counter a little within test_perf_branches_common(), along with ensuring adding another loop termination condition which is directly influenced by the backing perf event BPF program executing. Notably, a concious decision was made to not adjust the sample_freq value as that is just not a reliable way to go about fixing the problem. It effectively still leaves the race window open. Fixes: 67306f84ca78c ("selftests/bpf: Add bpf_read_branch_records() selftest") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251119143540.2911424-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests/bpf: skip test_perf_branches_hw() on unsupported platforms	Matt Bobrowski
	Gracefully skip the test_perf_branches_hw subtest on platforms that do not support LBR or require specialized perf event attributes to enable branch sampling. For example, AMD's Milan (Zen 3) supports BRS rather than traditional LBR. This requires specific configurations (attr.type = PERF_TYPE_RAW, attr.config = RETIRED_TAKEN_BRANCH_INSTRUCTIONS) that differ from the generic setup used within this test. Notably, it also probably doesn't hold much value to special case perf event configurations for selected micro architectures. Fixes: 67306f84ca78c ("selftests/bpf: Add bpf_read_branch_records() selftest") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20251120142059.2836181-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests: bpf: Enable gotox tests from arm64	Puranjay Mohan
	arm64 JIT now supports gotox instruction and jumptables, so run tests in verifier_gotox.c for arm64. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251117130732.11107-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests/bpf: Use sockaddr_storage instead of sa46 in select_reuseport test	Hoyeon Lee
	The select_reuseport selftest uses a custom sa46 union to represent IPv4 and IPv6 addresses. This custom wrapper requires extra manual handling for address family and field extraction. Replace sa46 with sockaddr_storage and update the helper functions to operate on native socket structures. This simplifies the code and removes unnecessary custom address-handling logic. No functional changes intended. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251121081332.2309838-3-hoyeon.lee@suse.com
2025-11-21	selftests/bpf: Use sockaddr_storage directly in cls_redirect test	Hoyeon Lee
	The cls_redirect test uses a custom addr_port/tuple wrapper to represent IPv4/IPv6 addresses and ports. This custom wrapper requires extra conversion logic and specific helpers such as fill_addr_port(), which are no longer necessary when using standard socket address structures. This commit replaces addr_port/tuple with the standard sockaddr_storage so test handles address families and ports using native socket types. It removes the custom helper, eliminates redundant casts, and simplifies the setup helpers without functional changes. set_up_conn() and build_input() now take src/dst sockaddr_storage directly. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251121081332.2309838-2-hoyeon.lee@suse.com
2025-11-20	selftests/bpf: Use ASSERT_STRNEQ to factor in long slab cache names	Matt Bobrowski
	subtest_kmem_cache_iter_check_slabinfo() fundamentally compares slab cache names parsed out from /proc/slabinfo against those stored within struct kmem_cache_result. The current problem is that the slab cache name within struct kmem_cache_result is stored within a bounded fixed-length array (sized to SLAB_NAME_MAX(32)), whereas the name parsed out from /proc/slabinfo is not. Meaning, using ASSERT_STREQ() can certainly lead to test failures, particularly when dealing with slab cache names that are longer than SLAB_NAME_MAX(32) bytes. Notably, kmem_cache_create() allows callers to create slab caches with somewhat arbitrarily sized names via its __name identifier argument, so exceeding the SLAB_NAME_MAX(32) limit that is in place now can certainly happen. Make subtest_kmem_cache_iter_check_slabinfo() more reliable by only checking up to sizeof(struct kmem_cache_result.name) - 1 using ASSERT_STRNEQ(). Fixes: a496d0cdc84d ("selftests/bpf: Add a test for kmem_cache_iter") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://patch.msgid.link/20251118073734.4188710-1-mattbobrowski@google.com
2025-11-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.18-rc7). No conflicts, adjacent changes: tools/testing/selftests/net/af_unix/Makefile e1bb28bf13f4 ("selftest: af_unix: Add test for SO_PEEK_OFF.") 45a1cd8346ca ("selftests: af_unix: Add tests for ECONNRESET and EOF semantics") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18	selftests/bpf: Replace TCP CC string comparisons with bpf_strncmp	Hoyeon Lee
	The connect4_prog and bpf_iter_setsockopt tests duplicate the same open-coded TCP congestion control string comparison logic. Since bpf_strncmp() provides the same functionality, use it instead to avoid repeated open-coded loops. This change applies only to functional BPF tests and does not affect the verifier performance benchmarks (veristat.cfg). No functional changes intended. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251115225550.1086693-5-hoyeon.lee@suse.com
2025-11-18	selftests/bpf: Move common TCP helpers into bpf_tracing_net.h	Hoyeon Lee
	Some BPF selftests contain identical copies of the min(), max(), before(), and after() helpers. These repeated snippets are the same across the tests and do not need to be defined separately. Move these helpers into bpf_tracing_net.h so they can be shared by TCP related BPF programs. This removes repeated code and keeps the helpers in a single place. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251115225550.1086693-4-hoyeon.lee@suse.com
2025-11-14	selftests/bpf: Test bpf_skb_check_mtu(BPF_MTU_CHK_SEGS) when ↵	Martin KaFai Lau
	transport_header is not set Add a test to check that bpf_skb_check_mtu(BPF_MTU_CHK_SEGS) is rejected (-EINVAL) if skb->transport_header is not set. The test needs to lower the MTU of the loopback device. Thus, take this opportunity to run the test in a netns by adding "ns_" to the test name. The "serial_" prefix can then be removed. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20251112232331.1566074-2-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-14	selftests/bpf: Align kfuncs renamed in bpf tree	Mykyta Yatsenko
	bpf_task_work_schedule_resume() and bpf_task_work_schedule_signal() have been renamed in bpf tree to bpf_task_work_schedule_resume_impl() and bpf_task_work_schedule_signal_impl() accordingly. There are few uses of these kfuncs in selftests that are not in bpf tree, so that when we port [1] into bpf-next, those BPF programs will not compile. This patch aligns those remaining callsites with the kfunc renaming. It should go on top of [1] when applying on bpf-next. 1: https://lore.kernel.org/all/20251104-implv2-v3-0-4772b9ae0e06@meta.com/ Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20251105132105.597344-1-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after 6.18-rc5+	Alexei Starovoitov
	Cross-merge BPF and other fixes after downstream PR. Minor conflict in kernel/bpf/helpers.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-14	selftests/bpf: Add BTF dedup tests for recursive typedef definitions	Paul Houssel
	Add several ./test_progs tests: 1. btf/dedup:recursive typedef ensures that deduplication no longer fails on recursive typedefs. 2. btf/dedup:typedef ensures that typedefs are deduplicated correctly just as they were before this patch. Signed-off-by: Paul Houssel <paul.houssel@orange.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/9fac2f744089f6090257d4c881914b79f6cd6c6a.1763037045.git.paul.houssel@orange.com
2025-11-14	selftests/bpf: Fix failure paths in send_signal test	Alexei Starovoitov
	When test_send_signal_kern__open_and_load() fails parent closes the pipe which cases ASSERT_EQ(read(pipe_p2c...)) to fail, but child continues and enters infinite loop, while parent is stuck in wait(NULL). Other error paths have similar issue, so kill the child before waiting on it. The bug was discovered while compiling all of selftests with -O1 instead of -O2 which caused progs/test_send_signal_kern.c to fail to load. Fixes: ab8b7f0cb358 ("tools/bpf: Add self tests for bpf_send_signal_thread()") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20251113171153.2583-1-alexei.starovoitov@gmail.com
2025-11-14	selftests/bpf: Convert glob_match() to bpf arena	Alexei Starovoitov
	Increase arena test coverage. Convert glob_match() to bpf arena in two steps: 1. Copy paste lib/glob.c into bpf_arena_strsearch.h Copy paste lib/globtests.c into progs/arena_strsearch.c 2. Add __arena to pointers Add __arg_arena to global functions that accept arena pointers Add cond_break to loops The test also serves as a good example of what's possible with bpf arena and how existing algorithms can be converted. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251111032931.21430-1-alexei.starovoitov@gmail.com
2025-11-14	selftests/bpf: Test widen_imprecise_scalars() with different stack depth	Eduard Zingerman
	A test case for a situation when widen_imprecise_scalars() is called with old->allocated_stack > cur->allocated_stack. Test structure: def widening_stack_size_bug(): r1 = 0 for r6 in 0..1: iterator_with_diff_stack_depth(r1) r1 = 42 def iterator_with_diff_stack_depth(r1): if r1 != 42: use 128 bytes of stack iterator based loop iterator_with_diff_stack_depth() is verified with r1 == 0 first and r1 == 42 next. Causing stack usage of 128 bytes on a first visit and 8 bytes on a second. Such arrangement triggered a KASAN error in widen_imprecise_scalars(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251114025730.772723-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-13	selftests/bpf: retry bpf_map_update_elem() when E2BIG is returned	Matt Bobrowski
	Executing the test_maps binary on platforms with extremely high core counts may cause intermittent assertion failures in test_update_delete() (called via test_map_parallel()). This can occur because bpf_map_update_elem() under some circumstances (specifically in this case while performing bpf_map_update_elem() with BPF_NOEXIST on a BPF_MAP_TYPE_HASH with its map_flags set to BPF_F_NO_PREALLOC) can return an E2BIG error code i.e. error -7 7 tools/testing/selftests/bpf/test_maps.c:#: void test_update_delete(unsigned int, void ): Assertion `err == 0' failed. tools/testing/selftests/bpf/test_maps.c:#: void __run_parallel(unsigned int, void ()(unsigned int, void ), void ): Assertion `status == 0' failed. As it turns out, is_map_full() which is called from alloc_htab_elem() can take on a conservative approach when htab->use_percpu_counter is true (which is the case here because the percpu_counter is used when a BPF_MAP_TYPE_HASH is created with its map_flags set to BPF_F_NO_PREALLOC). This conservative approach prioritizes preventing over-allocation and potential issues that could arise from possibly exceeding htab->map.max_entries in highly concurrent environments, even if it means slightly under-utilizing the htab map's capacity. Given that bpf_map_update_elem() from test_update_delete() can return E2BIG, update can_retry() such that it also accounts for the E2BIG error code (specifically only when running with map_flags being set to BPF_F_NO_PREALLOC). The retry loop will allow the global count belonging to the percpu_counter to become synchronized and better reflect the current htab map's capacity. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20251113092519.2632079-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-13	selftests/bpf: Add mptcp test with sockmap	Jiayuan Chen
	Add test cases to verify that when MPTCP falls back to plain TCP sockets, they can properly work with sockmap. Additionally, add test cases to ensure that sockmap correctly rejects MPTCP sockets as expected. Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251111060307.194196-4-jiayuan.chen@linux.dev
2025-11-13	selftests/bpf: Add test to verify freeing the special fields in pcpu maps	Leon Hwang
	Add test to verify that updating [lru_,]percpu_hash maps decrements refcount when BPF_KPTR_REF objects are involved. The tests perform the following steps: . Call update_elem() to insert an initial value. . Use bpf_refcount_acquire() to increment the refcount. . Store the node pointer in the map value. . Add the node to a linked list. . Probe-read the refcount and verify it is 2. . Call update_elem() again to trigger refcount decrement. . Probe-read the refcount and verify it is 1. Signed-off-by: Leon Hwang <leon.hwang@linux.dev> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251105151407.12723-3-leon.hwang@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-10	bpf/selftests: Add selftest for bpf_smc_hs_ctrl	D. Wythe
	This tests introduces a tiny smc_hs_ctrl for filtering SMC connections based on IP pairs, and also adds a realistic topology model to verify it. Also, we can only use SMC loopback under CI test, so an additional configuration needs to be enabled. Follow the steps below to run this test. make -C tools/testing/selftests/bpf cd tools/testing/selftests/bpf sudo ./test_progs -t smc Results shows: Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Tested-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20251107035632.115950-4-alibuda@linux.alibaba.com
2025-11-10	selftests/bpf: Cover skb metadata access after bpf_skb_change_proto	Jakub Sitnicki
	Add a test to verify that skb metadata remains accessible after calling bpf_skb_change_proto(), which modifies packet headroom to accommodate different IP header sizes. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-16-5ceb08a9b37b@cloudflare.com
2025-11-10	selftests/bpf: Cover skb metadata access after change_head/tail helper	Jakub Sitnicki
	Add a test to verify that skb metadata remains accessible after calling bpf_skb_change_head() and bpf_skb_change_tail(), which modify packet headroom/tailroom and can trigger head reallocation. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-15-5ceb08a9b37b@cloudflare.com
2025-11-10	selftests/bpf: Cover skb metadata access after bpf_skb_adjust_room	Jakub Sitnicki
	Add a test to verify that skb metadata remains accessible after calling bpf_skb_adjust_room(), which modifies the packet headroom and can trigger head reallocation. The helper expects an Ethernet frame carrying an IP packet so switch test packet identification by source MAC address since we can no longer rely on Ethernet proto being set to zero. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-14-5ceb08a9b37b@cloudflare.com
2025-11-10	selftests/bpf: Cover skb metadata access after vlan push/pop helper	Jakub Sitnicki
	Add a test to verify that skb metadata remains accessible after calling bpf_skb_vlan_push() and bpf_skb_vlan_pop(), which modify the packet headroom. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-13-5ceb08a9b37b@cloudflare.com
2025-11-10	selftests/bpf: Expect unclone to preserve skb metadata	Jakub Sitnicki
	Since pskb_expand_head() no longer clears metadata on unclone, update tests for cloned packets to expect metadata to remain intact. Also simplify the clone_dynptr_kept_on_{data,meta}_slice_write tests. Creating an r/w dynptr slice is sufficient to trigger an unclone in the prologue, so remove the extraneous writes to the data/meta slice. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-12-5ceb08a9b37b@cloudflare.com
2025-11-10	selftests/bpf: Dump skb metadata on verification failure	Jakub Sitnicki
	Add diagnostic output when metadata verification fails to help with troubleshooting test failures. Introduce a check_metadata() helper that prints both expected and received metadata to the BPF program's stderr stream on mismatch. The userspace test reads and dumps this stream on failure. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-11-5ceb08a9b37b@cloudflare.com
2025-11-10	selftests/bpf: Verify skb metadata in BPF instead of userspace	Jakub Sitnicki
	Move metadata verification into the BPF TC programs. Previously, userspace read metadata from a map and verified it once at test end. Now TC programs compare metadata directly using __builtin_memcmp() and set a test_pass flag. This enables verification at multiple points during test execution rather than a single final check. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-10-5ceb08a9b37b@cloudflare.com
2025-11-06	selftests/bpf: Use start_server_str rather than start_reuseport_server in ↵	Alexis Lothoré (eBPF Foundation)
	tc_tunnel Now that start_server_str enforces SO_REUSEADDR, there's no need to keep using start_reusport_server in tc_tunnel, especially since it only uses one server at a time. Replace start_reuseport_server with start_server_str in tc_tunnel test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251105-start-server-soreuseaddr-v1-2-1bbd9c1f8d65@bootlin.com