|
Merge tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Replace busylock at the Tx queuing layer with a lockless list.
Resulting in a 300% (4x) improvement on heavy TX workloads, sending
twice the number of packets per second, for half the cpu cycles.
- Allow constantly busy flows to migrate to a more suitable CPU/NIC
queue.
Normally we perform queue re-selection when flow comes out of idle,
but under extreme circumstances the flows may be constantly busy.
Add sysctl to allow periodic rehashing even if it'd risk packet
reordering.
- Optimize the NAPI skb cache, make it larger, use it in more paths.
- Attempt returning Tx skbs to the originating CPU (like we already
did for Rx skbs).
- Various data structure layout and prefetch optimizations from Eric.
- Remove ktime_get() from the recvmsg() fast path, ktime_get() is
sadly quite expensive on recent AMD machines.
- Extend threaded NAPI polling to allow the kthread busy poll for
packets.
- Make MPTCP use Rx backlog processing. This lowers the lock
pressure, improving the Rx performance.
- Support memcg accounting of MPTCP socket memory.
- Allow admin to opt sockets out of global protocol memory accounting
(using a sysctl or BPF-based policy). The global limits are a poor
fit for modern container workloads, where limits are imposed using
cgroups.
- Improve heuristics for when to kick off AF_UNIX garbage collection.
- Allow users to control TCP SACK compression, and default to 33% of
RTT.
- Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid
unnecessarily aggressive rcvbuf growth and overshoot when the
connection RTT is low.
- Preserve skb metadata space across skb_push / skb_pull operations.
- Support for IPIP encapsulation in the nftables flowtable offload.
- Support appending IP interface information to ICMP messages (RFC
5837).
- Support setting max record size in TLS (RFC 8449).
- Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL.
- Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock.
- Let users configure the number of write buffers in SMC.
- Add new struct sockaddr_unsized for sockaddr of unknown length,
from Kees.
- Some conversions away from the crypto_ahash API, from Eric Biggers.
- Some preparations for slimming down struct page.
- YAML Netlink protocol spec for WireGuard.
- Add a tool on top of YAML Netlink specs/lib for reporting commonly
computed derived statistics and summarized system state.
Driver API:
- Add CAN XL support to the CAN Netlink interface.
- Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as
defined by the OPEN Alliance's "Advanced diagnostic features for
100BASE-T1 automotive Ethernet PHYs" specification.
- Add DPLL phase-adjust-gran pin attribute (and implement it in
zl3073x).
- Refactor xfrm_input lock to reduce contention when NIC offloads
IPsec and performs RSS.
- Add info to devlink params whether the current setting is the
default or a user override. Allow resetting back to default.
- Add standard device stats for PSP crypto offload.
- Leverage DSA frame broadcast to implement simple HSR frame
duplication for a lot of switches without dedicated HSR offload.
- Add uAPI defines for 1.6Tbps link modes.
Device drivers:
- Add Motorcomm YT921x gigabit Ethernet switch support.
- Add MUCSE driver for N500/N210 1GbE NIC series.
- Convert drivers to support dedicated ops for timestamping control,
and away from the direct IOCTL handling. While at it support GET
operations for PHY timestamping.
- Add (and convert most drivers to) a dedicated ethtool callback for
reading the Rx ring count.
- Significant refactoring efforts in the STMMAC driver, which
supports Synopsys turn-key MAC IP integrated into a ton of SoCs.
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support PPS in/out on all pins
- Intel (100G, ice, idpf):
- ice: implement standard ethtool and timestamping stats
- i40e: support setting the max number of MAC addresses per VF
- iavf: support RSS of GTP tunnels for 5G and LTE deployments
- nVidia/Mellanox (mlx5):
- reduce downtime on interface reconfiguration
- disable being an XDP redirect target by default (same as
other drivers) to avoid wasting resources if feature is
unused
- Meta (fbnic):
- add support for Linux-managed PCS on 25G, 50G, and 100G links
- Wangxun:
- support Rx descriptor merge, and Tx head writeback
- support Rx coalescing offload
- support 25G SFP and 40G QSFP modules
- Ethernet virtual:
- Google (gve):
- allow ethtool to configure rx_buf_len
- implement XDP HW RX Timestamping support for DQ descriptor
format
- Microsoft vNIC (mana):
- support HW link state events
- handle hardware recovery events when probing the device
- Ethernet NICs consumer, and embedded:
- usbnet: add support for Byte Queue Limits (BQL)
- AMD (amd-xgbe):
- add device selftests
- NXP (enetc):
- add i.MX94 support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- bcmasp: add support for PHY-based Wake-on-LAN
- Broadcom switches (b53):
- support port isolation
- support BCM5389/97/98 and BCM63XX ARL formats
- Lantiq/MaxLinear switches:
- support bridge FDB entries on the CPU port
- use regmap for register access
- allow user to enable/disable learning
- support Energy Efficient Ethernet
- support configuring RMII clock delays
- add tagging driver for MaxLinear GSW1xx switches
- Synopsys (stmmac):
- support using the HW clock in free running mode
- add Eswin EIC7700 support
- add Rockchip RK3506 support
- add Altera Agilex5 support
- Cadence (macb):
- cleanup and consolidate descriptor and DMA address handling
- add EyeQ5 support
- TI:
- icssg-prueth: support AF_XDP
- Airoha access points:
- add missing Ethernet stats and link state callback
- add AN7583 support
- support out-of-order Tx completion processing
- Power over Ethernet:
- pd692x0: preserve PSE configuration across reboots
- add support for TPS23881B devices
- Ethernet PHYs:
- Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support
- Support 50G SerDes and 100G interfaces in Linux-managed PHYs
- micrel:
- support for non PTP SKUs of lan8814
- enable in-band auto-negotiation on lan8814
- realtek:
- cable testing support on RTL8224
- interrupt support on RTL8221B
- motorcomm: support for PHY LEDs on YT853
- microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag
- mscc: support for PHY LED control
- CAN drivers:
- m_can: add support for optional reset and system wake up
- remove can_change_mtu() obsoleted by core handling
- mcp251xfd: support GPIO controller functionality
- Bluetooth:
- add initial support for PASTa
- WiFi:
- split ieee80211.h file, it's way too big
- improvements in VHT radiotap reporting, S1G, Channel Switch
Announcement handling, rate tracking in mesh networks
- improve multi-radio monitor mode support, and add a cfg80211
debugfs interface for it
- HT action frame handling on 6 GHz
- initial chanctx work towards NAN
- MU-MIMO sniffer improvements
- WiFi drivers:
- RealTek (rtw89):
- support USB devices RTL8852AU and RTL8852CU
- initial work for RTL8922DE
- improved injection support
- Intel:
- iwlwifi: new sniffer API support
- MediaTek (mt76):
- WED support for >32-bit DMA
- airoha NPU support
- regdomain improvements
- continued WiFi7/MLO work
- Qualcomm/Atheros:
- ath10k: factory test support
- ath11k: TX power insertion support
- ath12k: BSS color change support
- ath12k: statistics improvements
- brcmfmac: Acer A1 840 tablet quirk
- rtl8xxxu: 40 MHz connection fixes/support"
* tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits)
net: page_pool: sanitise allocation order
net: page pool: xa init with destroy on pp init
net/mlx5e: Support XDP target xmit with dummy program
net/mlx5e: Update XDP features in switch channels
selftests/tc-testing: Test CAKE scheduler when enqueue drops packets
net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
wireguard: netlink: generate netlink code
wireguard: uapi: generate header with ynl-gen
wireguard: uapi: move flag enums
wireguard: uapi: move enum wg_cmd
wireguard: netlink: add YNL specification
selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py
selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py
selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py
selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py
selftests: drv-net: introduce Iperf3Runner for measurement use cases
selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS
net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive()
Documentation: net: dsa: mention simple HSR offload helpers
Documentation: net: dsa: mention availability of RedBox
...
|
|
test_tc_edt currently defines the target rate in both the userspace and
BPF parts. This value could be defined once in the userspace part if we
make it able to configure the BPF program before starting the test.
Add a target_rate variable in the BPF part, and make the userspace part
set it to the desired rate before attaching the shaping program.
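A rough sketch of the idea, assuming the generated skeleton is named test_tc_edt and that target_rate is expressed in bytes per second (both assumptions, not taken from this patch):
  #include "test_tc_edt.skel.h"            /* assumed skeleton header name */

  static int setup_shaper(void)
  {
      struct test_tc_edt *skel = test_tc_edt__open();

      if (!skel)
          return -1;
      /* set the global before load; it lands in the program's .bss
       * (or .data/.rodata depending on how the variable is declared) */
      skel->bss->target_rate = 5 * 1000 * 1000 / 8;    /* 5 Mbit/s as bytes/s */
      if (test_tc_edt__load(skel)) {
          test_tc_edt__destroy(skel);
          return -1;
      }
      /* ... attach to clsact egress, e.g. via bpf_tc_attach() ... */
      return 0;
  }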
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Link: https://lore.kernel.org/r/20251128-tc_edt-v2-4-26db48373e73@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
test_tc_edt.sh uses a pair of veths and a BPF program attached to the TX
veth to shape the traffic to 5 Mbps. It then checks that the amount of
received bytes (at interface level), compared to the TX duration, indeed
matches 5 Mbps.
Convert this test script to the test_progs framework:
- keep the double veth setup, isolated in two veths
- run a small tcp server, and connect client to server
- push a pre-configured amount of bytes, and measure how much time has
been needed to push those
- ensure that this rate is within a 2% error margin of the target rate
This two percent margin, while tight, is hopefully large enough to keep
the test from being too flaky in CI, while also turning it into a small
example of BPF-based shaping.
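As a hedged sketch of the pass criterion, using test_progs' ASSERT_* macros (the variable names below are illustrative, not taken from the test):
  /* bytes_sent, duration_ms and target_rate are assumed names */
  __u64 rate = bytes_sent * 1000 / duration_ms;        /* measured bytes/s */

  ASSERT_GE(rate, target_rate * 98 / 100, "rate above lower bound");
  ASSERT_LE(rate, target_rate * 102 / 100, "rate below upper bound");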
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Link: https://lore.kernel.org/r/20251128-tc_edt-v2-2-26db48373e73@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The test_tc_edt BPF program uses a custom section name, which works fine
when manually loading it with tc, but prevents it from being loaded with
libbpf.
Update the program section name to "tc" to be able to manipulate it with
a libbpf-based C test.
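For illustration, the kind of change this implies (function name and return value are placeholders):
  #include <linux/bpf.h>
  #include <linux/pkt_cls.h>
  #include <bpf/bpf_helpers.h>

  /* was: SEC("<custom name>"), only loadable by hand with tc */
  SEC("tc")
  int tc_prog(struct __sk_buff *skb)
  {
      return TC_ACT_OK;                    /* shaping logic elided */
  }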
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Link: https://lore.kernel.org/r/20251128-tc_edt-v2-1-26db48373e73@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The file_alloc_security hook is disabled. Use other LSM hooks in selftests
instead.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Link: https://lore.kernel.org/r/20251126202927.2584874-2-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The original implementation added a hack to check_mem_access()
to prevent programs from writing into insn arrays. To get rid
of this hack, enforce BPF_F_RDONLY_PROG on map creation.
Also fix the corresponding selftest, as the error message changes
with this patch.
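A minimal sketch of a map carrying the flag, shown on a plain array map for illustration (the patch enforces it for instruction-array maps):
  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  struct {
      __uint(type, BPF_MAP_TYPE_ARRAY);
      __uint(max_entries, 1);
      __type(key, __u32);
      __type(value, __u64);
      __uint(map_flags, BPF_F_RDONLY_PROG);  /* programs may read, never write */
  } ro_map SEC(".maps");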
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20251128063224.1305482-2-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Now sk->sk_timer is no longer used by TCP keepalive, we can use
its storage for TCP and MPTCP retransmit timers for better
cache locality.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251124175013.1473655-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk->sk_timer has been used for TCP keepalives.
Keepalive timers are not in fast path, we want to use sk->sk_timer
storage for retransmit timers, for better cache locality.
Create icsk->icsk_keepalive_timer and change keepalive
code to no longer use sk->sk_timer.
Added space is reclaimed in the following patch.
This includes changes to MPTCP, which was also using sk_timer.
Alias icsk->mptcp_tout_timer and icsk->icsk_keepalive_timer
for inet_sk_diag_fill() sake.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251124175013.1473655-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The bench test "trig-kernel-count" can be used as a baseline comparison
for fentry and other benchmarks, and the call to bpf_get_numa_node_id()
should be considered part of that baseline. So, let's call it in
trigger_count(). Meanwhile, rename trigger_count() to
trigger_kernel_count() to make it easier to understand.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20251116014242.151110-1-dongml2@chinatelecom.cn
|
|
Since commit 31158ad02ddb ("rqspinlock: Add deadlock detection
and recovery") the updated path on re-entrancy now reports deadlock
via -EDEADLK instead of the previous -EBUSY.
Also, the way reentrancy was exercised (via fentry/lookup_elem_raw)
has been fragile because lookup_elem_raw may be inlined
(find_kernel_btf_id() will return -ESRCH).
To fix this fentry is attached to bpf_obj_free_fields() instead of
lookup_elem_raw() and:
- The htab map is made to use a BTF-described struct val with a
struct bpf_timer so that check_and_free_fields() reliably calls
bpf_obj_free_fields() on element replacement.
- The selftest is updated to do two updates to the same key (insert +
replace) in prog_test.
- The selftest is updated to align the expected errno with the
kernel's current behavior.
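A minimal sketch of the map value shape described in the first point above (struct and map names are illustrative):
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  struct val {
      struct bpf_timer t;   /* BTF-managed field, freed by bpf_obj_free_fields() on replace */
      __u64 payload;
  };

  struct {
      __uint(type, BPF_MAP_TYPE_HASH);
      __uint(max_entries, 8);
      __type(key, __u32);
      __type(value, struct val);
  } htab SEC(".maps");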
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Link: https://lore.kernel.org/r/20251117060752.129648-1-skb99@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
As the verifier now supports nested rcu critical sections, add new test
cases to make sure unbalanced usage of rcu_read_lock()/unlock() is
rejected.
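A hedged sketch of the two shapes being exercised (the attach point is purely illustrative):
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  extern void bpf_rcu_read_lock(void) __ksym;
  extern void bpf_rcu_read_unlock(void) __ksym;

  char _license[] SEC("license") = "GPL";

  SEC("?fentry/bpf_fentry_test1")
  int balanced(void *ctx)
  {
      bpf_rcu_read_lock();
      bpf_rcu_read_lock();       /* nested critical section: now accepted */
      bpf_rcu_read_unlock();
      bpf_rcu_read_unlock();
      return 0;
  }

  SEC("?fentry/bpf_fentry_test1")
  int unbalanced(void *ctx)
  {
      bpf_rcu_read_lock();       /* missing unlock: must be rejected */
      return 0;
  }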
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251117200411.25563-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
A new test is added: caller_stack_write_tail_call tests that the live
stack is correctly tracked for a tail call.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu>
Link: https://lore.kernel.org/r/20251119160355.1160932-5-martin.teichmann@xfel.eu
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Three tests are added:
- invalidate_pkt_pointers_by_tail_call checks that one can use the
packet pointer after a tail call. This was originally possible
and poses no problems, but was made impossible by 1a4607ffba35.
- invalidate_pkt_pointers_by_static_tail_call tests a corner case
found by Eduard Zingerman during the discussion of the original fix,
which was broken in that fix.
- subprog_result_tail_call tests that precision propagation works
correctly across tail calls. This did not work before.
Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251119160355.1160932-3-martin.teichmann@xfel.eu
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, test_perf_branches_no_hw() relies on the busy loop within
test_perf_branches_common() being slow enough to allow at least one
perf event sample tick to occur before starting to tear down the
backing perf event BPF program. With a relatively small fixed
iteration count of 1,000,000, this is not guaranteed on modern fast
CPUs, causing the test run to subsequently fail with the
following:
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_perf_branches_common:PASS:test_perf_branches_load 0 nsec
test_perf_branches_common:PASS:attach_perf_event 0 nsec
test_perf_branches_common:PASS:set_affinity 0 nsec
check_good_sample:PASS:output not valid 0 nsec
check_good_sample:PASS:read_branches_size 0 nsec
check_good_sample:PASS:read_branches_stack 0 nsec
check_good_sample:PASS:read_branches_stack 0 nsec
check_good_sample:PASS:read_branches_global 0 nsec
check_good_sample:PASS:read_branches_global 0 nsec
check_good_sample:PASS:read_branches_size 0 nsec
test_perf_branches_no_hw:PASS:perf_event_open 0 nsec
test_perf_branches_common:PASS:test_perf_branches_load 0 nsec
test_perf_branches_common:PASS:attach_perf_event 0 nsec
test_perf_branches_common:PASS:set_affinity 0 nsec
check_bad_sample:FAIL:output not valid no valid sample from prog
Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED
Successfully unloaded bpf_testmod.ko.
On a modern CPU (i.e. one with a 3.5 GHz clock rate), executing 1
million increments of a volatile integer can take significantly less
than 1 millisecond. If the spin loop and the detachment of the perf
event BPF program complete before the first 1 ms sampling interval
elapses, the perf event will never end up firing. Fix this by bumping
the loop iteration counter a little within test_perf_branches_common(),
along with adding another loop termination condition which is directly
influenced by the backing perf event BPF program executing. Notably, a
conscious decision was made not to adjust the sample_freq value, as
that is just not a reliable way to fix the problem; it effectively
still leaves the race window open.
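The shape of the loop change, roughly (the flag name is an assumption, not the actual field):
  /* busy loop bounded both by an iteration cap and by a flag the perf
   * event BPF program sets once it has fired (sample_seen is assumed) */
  static void spin_until_sample(volatile int *sample_seen)
  {
      volatile int i;

      for (i = 0; i < 100 * 1000 * 1000 && !*sample_seen; i++)
          ;
  }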
Fixes: 67306f84ca78c ("selftests/bpf: Add bpf_read_branch_records() selftest")
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20251119143540.2911424-1-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
arm64 JIT now supports gotox instruction and jumptables, so run tests in
verifier_gotox.c for arm64.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20251117130732.11107-4-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.18-rc7).
No conflicts, adjacent changes:
tools/testing/selftests/net/af_unix/Makefile
e1bb28bf13f4 ("selftest: af_unix: Add test for SO_PEEK_OFF.")
45a1cd8346ca ("selftests: af_unix: Add tests for ECONNRESET and EOF semantics")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The connect4_prog and bpf_iter_setsockopt tests duplicate the same
open-coded TCP congestion control string comparison logic. Since
bpf_strncmp() provides the same functionality, use it instead to
avoid repeated open-coded loops.
This change applies only to functional BPF tests and does not affect
the verifier performance benchmarks (veristat.cfg). No functional
changes intended.
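Illustrative use of the helper for the congestion-control check (how the buffer gets filled is left out):
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  static __always_inline bool cc_is_cubic(const char *cc, __u32 len)
  {
      /* bpf_strncmp() compares against a constant string, like strncmp() */
      return bpf_strncmp(cc, len, "cubic") == 0;
  }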
Reviewed-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251115225550.1086693-5-hoyeon.lee@suse.com
|
|
Some BPF selftests contain identical copies of the min(), max(),
before(), and after() helpers. These repeated snippets are the same
across the tests and do not need to be defined separately.
Move these helpers into bpf_tracing_net.h so they can be shared by
TCP related BPF programs. This removes repeated code and keeps the
helpers in a single place.
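For reference, the shared helpers boil down to the usual wrap-safe TCP sequence-number comparisons plus simple min()/max() macros, roughly:
  /* as in the kernel's include/net/tcp.h */
  static inline bool before(__u32 seq1, __u32 seq2)
  {
      return (__s32)(seq1 - seq2) < 0;
  }
  #define after(seq2, seq1)   before(seq1, seq2)

  #define min(a, b)   ((a) < (b) ? (a) : (b))
  #define max(a, b)   ((a) > (b) ? (a) : (b))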
Reviewed-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251115225550.1086693-4-hoyeon.lee@suse.com
|
|
Add a test to check that bpf_skb_check_mtu(BPF_MTU_CHK_SEGS) is
rejected (-EINVAL) if skb->transport_header is not set. The test
needs to lower the MTU of the loopback device. Thus, take this
opportunity to run the test in a netns by adding "ns_" to the test
name. The "serial_" prefix can then be removed.
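A minimal sketch of the rejected call from a tc program (return codes written out as raw values to keep it short):
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  char _license[] SEC("license") = "GPL";

  SEC("tc")
  int chk_mtu_segs(struct __sk_buff *skb)
  {
      __u32 mtu_len = 0;

      /* BPF_MTU_CHK_SEGS asks the helper to walk GSO segments; with no
       * transport header set it is expected to fail with -EINVAL */
      if (bpf_check_mtu(skb, 0, &mtu_len, 0, BPF_MTU_CHK_SEGS) != -22 /* -EINVAL */)
          return 2;   /* TC_ACT_SHOT */
      return 0;       /* TC_ACT_OK */
  }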
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20251112232331.1566074-2-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
bpf_task_work_schedule_resume() and bpf_task_work_schedule_signal() have
been renamed in bpf tree to bpf_task_work_schedule_resume_impl() and
bpf_task_work_schedule_signal_impl() accordingly.
There are a few uses of these kfuncs in selftests that are not in the
bpf tree, so when we port [1] into bpf-next, those BPF programs will
not compile.
This patch aligns those remaining callsites with the kfunc renaming.
It should go on top of [1] when applying on bpf-next.
1: https://lore.kernel.org/all/20251104-implv2-v3-0-4772b9ae0e06@meta.com/
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20251105132105.597344-1-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Cross-merge BPF and other fixes after downstream PR.
Minor conflict in kernel/bpf/helpers.c
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Increase arena test coverage.
Convert glob_match() to bpf arena in two steps:
1.
Copy paste lib/glob.c into bpf_arena_strsearch.h
Copy paste lib/globtests.c into progs/arena_strsearch.c
2.
Add __arena to pointers
Add __arg_arena to global functions that accept arena pointers
Add cond_break to loops
The test also serves as a good example of what's possible
with bpf arena and how existing algorithms can be converted.
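A tiny sketch of what step 2 looks like on a converted function (names are illustrative and rely on the selftests' bpf_arena_common.h / bpf_experimental.h helpers):
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include "bpf_arena_common.h"      /* __arena and the arena map plumbing */
  #include "bpf_experimental.h"      /* __arg_arena, cond_break */

  /* a global function taking an arena pointer marks it with __arg_arena */
  __noinline int sum_bytes(char __arena *buf __arg_arena, int len)
  {
      int i, s = 0;

      for (i = 0; i < len; i++) {
          cond_break;                /* keeps otherwise-unbounded loops verifiable */
          s += buf[i];
      }
      return s;
  }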
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20251111032931.21430-1-alexei.starovoitov@gmail.com
|
|
A test case for a situation when widen_imprecise_scalars() is called
with old->allocated_stack > cur->allocated_stack. Test structure:
def widening_stack_size_bug():
    r1 = 0
    for r6 in 0..1:
        iterator_with_diff_stack_depth(r1)
        r1 = 42
def iterator_with_diff_stack_depth(r1):
    if r1 != 42:
        use 128 bytes of stack
    iterator based loop
iterator_with_diff_stack_depth() is verified with r1 == 0 first and
r1 == 42 next, causing stack usage of 128 bytes on the first visit and 8
bytes on the second. Such an arrangement triggered a KASAN error in
widen_imprecise_scalars().
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251114025730.772723-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add test cases to verify that when MPTCP falls back to plain TCP sockets,
they can properly work with sockmap.
Additionally, add test cases to ensure that sockmap correctly rejects
MPTCP sockets as expected.
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251111060307.194196-4-jiayuan.chen@linux.dev
|
|
Add test to verify that updating [lru_,]percpu_hash maps decrements
refcount when BPF_KPTR_REF objects are involved.
The tests perform the following steps:
. Call update_elem() to insert an initial value.
. Use bpf_refcount_acquire() to increment the refcount.
. Store the node pointer in the map value.
. Add the node to a linked list.
. Probe-read the refcount and verify it is *2*.
. Call update_elem() again to trigger refcount decrement.
. Probe-read the refcount and verify it is *1*.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20251105151407.12723-3-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This test introduces a tiny smc_hs_ctrl for filtering SMC connections
based on IP pairs, and also adds a realistic topology model to verify it.
Also, we can only use SMC loopback under CI test, so an additional
configuration needs to be enabled.
Follow the steps below to run this test.
make -C tools/testing/selftests/bpf
cd tools/testing/selftests/bpf
sudo ./test_progs -t smc
The results show:
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Tested-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://patch.msgid.link/20251107035632.115950-4-alibuda@linux.alibaba.com
|
|
Add a test to verify that skb metadata remains accessible after calling
bpf_skb_change_proto(), which modifies packet headroom to accommodate
different IP header sizes.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-16-5ceb08a9b37b@cloudflare.com
|
|
Add a test to verify that skb metadata remains accessible after calling
bpf_skb_change_head() and bpf_skb_change_tail(), which modify packet
headroom/tailroom and can trigger head reallocation.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-15-5ceb08a9b37b@cloudflare.com
|
|
Add a test to verify that skb metadata remains accessible after calling
bpf_skb_adjust_room(), which modifies the packet headroom and can trigger
head reallocation.
The helper expects an Ethernet frame carrying an IP packet, so switch
test packet identification to the source MAC address, since we can no
longer rely on the Ethernet proto being set to zero.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-14-5ceb08a9b37b@cloudflare.com
|
|
Add a test to verify that skb metadata remains accessible after calling
bpf_skb_vlan_push() and bpf_skb_vlan_pop(), which modify the packet
headroom.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-13-5ceb08a9b37b@cloudflare.com
|
|
Since pskb_expand_head() no longer clears metadata on unclone, update tests
for cloned packets to expect metadata to remain intact.
Also simplify the clone_dynptr_kept_on_{data,meta}_slice_write tests.
Creating an r/w dynptr slice is sufficient to trigger an unclone in the
prologue, so remove the extraneous writes to the data/meta slice.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-12-5ceb08a9b37b@cloudflare.com
|
|
Add diagnostic output when metadata verification fails to help with
troubleshooting test failures. Introduce a check_metadata() helper that
prints both expected and received metadata to the BPF program's stderr
stream on mismatch. The userspace test reads and dumps this stream on
failure.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-11-5ceb08a9b37b@cloudflare.com
|
|
Move metadata verification into the BPF TC programs. Previously,
userspace read metadata from a map and verified it once at test end.
Now TC programs compare metadata directly using __builtin_memcmp() and
set a test_pass flag. This enables verification at multiple points during
test execution rather than a single final check.
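A rough sketch of the in-program check (the expected pattern, its size, and the flag name are placeholders):
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  #define META_SIZE 8

  __u8 expected[META_SIZE];              /* filled in by userspace */
  bool test_pass;

  char _license[] SEC("license") = "GPL";

  SEC("tc")
  int check_meta(struct __sk_buff *skb)
  {
      __u8 *meta = (void *)(long)skb->data_meta;
      __u8 *data = (void *)(long)skb->data;

      if (meta + META_SIZE > data)       /* bounds check required by the verifier */
          return 0;
      if (!__builtin_memcmp(meta, expected, META_SIZE))
          test_pass = true;
      return 0;
  }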
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251105-skb-meta-rx-path-v4-10-5ceb08a9b37b@cloudflare.com
|
|
Add C-level selftests for indirect jumps to validate LLVM and libbpf
functionality. The tests are intentionally disabled, to be run
locally by developers, but will not make the CI red.
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20251105090410.1250500-13-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a set of tests to validate core gotox functionality
without the need to rely on compilers.
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20251105090410.1250500-12-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Adding a test that verifies we get the expected initial 2 entries from
the stacktrace for a rawtp probe via ORC unwind.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20251104215405.168643-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Adding a test that attaches kprobe/kretprobe multi and verifies that
the ORC stacktrace matches the expected functions.
Adding a bpf_testmod_stacktrace_test function to the bpf_testmod kernel
module which is called through several functions so we get a reliable
call path for the stacktrace.
The test is only for ORC unwinder to keep it simple.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20251104215405.168643-4-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Rename bpf_stream_vprintk() to bpf_stream_vprintk_impl().
This makes bpf_stream_vprintk() follow the already established "_impl"
suffix-based naming convention for kfuncs with the bpf_prog_aux
argument provided by the verifier implicitly. This convention will be
taken advantage of with the upcoming KF_IMPLICIT_ARGS feature to
preserve backwards compatibility to BPF programs.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20251104-implv2-v3-2-4772b9ae0e06@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
|
|
Rename:
bpf_task_work_schedule_resume()->bpf_task_work_schedule_resume_impl()
bpf_task_work_schedule_signal()->bpf_task_work_schedule_signal_impl()
This aligns task work scheduling kfuncs with the established naming
scheme for kfuncs with the bpf_prog_aux argument provided by the
verifier implicitly. This convention will be taken advantage of with the
upcoming KF_IMPLICIT_ARGS feature to preserve backwards compatibility to
BPF programs.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20251104-implv2-v3-1-4772b9ae0e06@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
|
|
Add test cases to verify the correctness of the BPF verifier's branch analysis
when conditional jumps are performed on the same scalar register, and make sure
that JGT does not trigger a verifier BUG.
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251103063108.1111764-3-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Both livepatch and BPF trampoline use ftrace. Special attention is needed
when livepatch and fexit program touch the same function at the same
time, because livepatch updates a kernel function and the BPF trampoline
need to call into the right version of the kernel function.
Use samples/livepatch/livepatch-sample.ko for the test.
The test covers two cases:
1) When a fentry program is loaded first. This exercises the
modify_ftrace_direct code path.
2) When a fentry program is loaded first. This exercises the
register_ftrace_direct code path.
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20251027175023.1521602-4-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
file_reader/on_open_expect_fault intermittently fails when test_progs
runs tests in parallel, because it expects a page fault on first read.
Another file_reader test running concurrently may have already pulled
the same pages into the page cache, eliminating the fault and causing a
spurious failure.
Make file_reader/on_open_expect_fault read from a file region that does
not overlap with other file_reader tests, so the initial access still
faults even under parallel execution.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20251029195907.858217-1-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The test_tc_tunnel.sh script checks that a large variety of tunneling
mechanisms handled by the kernel can be handled as well by eBPF
programs. While this test shares similarities with test_tunnel.c (which
is already integrated in test_progs), those are testing slightly
different things:
- test_tunnel.c creates a tunnel interface, and then gets and sets tunnel
keys in packet metadata, from BPF programs.
- test_tc_tunnel.sh manually parses/crafts packet contents
Bring the tests covered by test_tc_tunnel.sh into the test_progs
framework, by creating a dedicated tc_tunnel test. This new test
defines a "generic" runner which, for each test configuration:
- will configure the relevant veth pair, each of them isolated in a
dedicated namespace
- will check that traffic fails if there is only an encapsulating
program attached to one veth egress
- will check that traffic succeeds if we enable some decapsulation
module on the kernel side
- will check that traffic still succeeds if we replace the kernel
decapsulation with some eBPF ingress decapsulation.
Example of the new test execution:
# ./test_progs -a tc_tunnel
#447/1 tc_tunnel/ipip_none:OK
#447/2 tc_tunnel/ipip6_none:OK
#447/3 tc_tunnel/ip6tnl_none:OK
#447/4 tc_tunnel/sit_none:OK
#447/5 tc_tunnel/vxlan_eth:OK
#447/6 tc_tunnel/ip6vxlan_eth:OK
#447/7 tc_tunnel/gre_none:OK
#447/8 tc_tunnel/gre_eth:OK
#447/9 tc_tunnel/gre_mpls:OK
#447/10 tc_tunnel/ip6gre_none:OK
#447/11 tc_tunnel/ip6gre_eth:OK
#447/12 tc_tunnel/ip6gre_mpls:OK
#447/13 tc_tunnel/udp_none:OK
#447/14 tc_tunnel/udp_eth:OK
#447/15 tc_tunnel/udp_mpls:OK
#447/16 tc_tunnel/ip6udp_none:OK
#447/17 tc_tunnel/ip6udp_eth:OK
#447/18 tc_tunnel/ip6udp_mpls:OK
#447 tc_tunnel:OK
Summary: 1/18 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251027-tc_tunnel-v3-3-505c12019f9d@bootlin.com
|
|
When trying to run BPF-based encapsulation in a s390x environment, some
parts of test_tc_tunnel.bpf.o do not correctly encapsulate the traffic,
leading to test failures. Adding some logs shows, for example, that
packets about to be sent on an interface with the ip6vxlan_eth program
attached do not have the expected value 5 in the ip header ihl field,
and so are ignored by the program.
This phenomenon appears when trying to cross-compile the selftests,
rather than compiling it from a virtualized host: the selftests build
system may then wrongly pick some host headers. If <asm/byteorder.h>
ends up being picked from the host (and the host has an endianness
different from the target one), it will then expose the wrong endianness
defines (e.g. __LITTLE_ENDIAN_BITFIELD instead of __BIG_ENDIAN_BITFIELD),
and it will, for example, mess up the iphdr structure layout used in the
eBPF program.
To prevent this, directly use the vmlinux.h header generated by the
selftests build system rather than directly including specific kernel
headers. As a consequence, add some missing definitions that are not
exposed by vmlinux.h, and adapt the bitfield manipulations to allow
building and using the program on both types of platforms.
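As an illustration of the endian-agnostic style of rework involved (not the literal patch), the version/IHL nibbles can be read from the raw first byte of the IP header instead of relying on bitfield order:
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  /* the first IP header byte is version (high nibble) + IHL (low nibble)
   * on the wire, regardless of __LITTLE_/__BIG_ENDIAN_BITFIELD */
  static __always_inline int ihl_is_5(struct __sk_buff *skb)
  {
      __u8 vihl;

      if (bpf_skb_load_bytes(skb, 14 /* ETH_HLEN */, &vihl, sizeof(vihl)) < 0)
          return 0;
      return (vihl & 0x0f) == 5;
  }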
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20251027-tc_tunnel-v3-2-505c12019f9d@bootlin.com
|
|
Add --rb-overwrite option to benchmark BPF ring buffer in overwrite mode.
Since overwrite mode is not yet supported by libbpf for consumer, also add
--rb-bench-producer option to benchmark producer directly without a consumer.
Benchmarks on an x86_64 and an arm64 CPU are shown below for reference.
- AMD EPYC 9654 (x86_64)
Ringbuf, multi-producer contention in overwrite mode, no consumer
=================================================================
rb-prod nr_prod 1 32.180 ± 0.033M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 2 9.617 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 3 8.810 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 4 9.272 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 8 9.173 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 12 3.086 ± 0.032M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 16 2.945 ± 0.021M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 20 2.519 ± 0.021M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 24 2.545 ± 0.021M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 28 2.363 ± 0.024M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 32 2.357 ± 0.021M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 36 2.267 ± 0.011M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 40 2.284 ± 0.020M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 44 2.215 ± 0.025M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 48 2.193 ± 0.023M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 52 2.208 ± 0.024M/s (drops 0.000 ± 0.000M/s)
- HiSilicon Kunpeng 920 (arm64)
Ringbuf, multi-producer contention in overwrite mode, no consumer
=================================================================
rb-prod nr_prod 1 14.478 ± 0.006M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 2 21.787 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 3 6.045 ± 0.001M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 4 5.352 ± 0.003M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 8 4.850 ± 0.002M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 12 3.542 ± 0.016M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 16 3.509 ± 0.021M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 20 3.171 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 24 3.154 ± 0.014M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 28 2.974 ± 0.015M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 32 3.167 ± 0.014M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 36 2.903 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 40 2.866 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 44 2.914 ± 0.010M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 48 2.806 ± 0.012M/s (drops 0.000 ± 0.000M/s)
rb-prod nr_prod 52 2.840 ± 0.012M/s (drops 0.000 ± 0.000M/s)
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20251018035738.4039621-4-xukuohai@huaweicloud.com
|
|
Add overwrite mode test for BPF ring buffer. The test creates a BPF ring
buffer in overwrite mode, then repeatedly reserves and commits records
to check if the ring buffer works as expected both before and after
overwriting occurs.
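The producer side is the standard reserve/submit pattern; a minimal sketch (the new map flag that selects overwrite mode is omitted here, as its exact name comes from this series):
  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  char _license[] SEC("license") = "GPL";

  struct {
      __uint(type, BPF_MAP_TYPE_RINGBUF);
      __uint(max_entries, 4096);
      /* plus the overwrite-mode map flag introduced by this series */
  } rb SEC(".maps");

  SEC("fentry/bpf_fentry_test1")
  int produce(void *ctx)
  {
      __u64 *rec = bpf_ringbuf_reserve(&rb, sizeof(*rec), 0);

      if (!rec)
          return 0;
      *rec = bpf_ktime_get_ns();
      bpf_ringbuf_submit(rec, 0);        /* "commit" the reserved record */
      return 0;
  }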
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20251018035738.4039621-3-xukuohai@huaweicloud.com
|
|
Introducing selftests for validating that file-backed dynptrs work as
expected:
* validate that the implementation supports dynptr slice and read operations
* validate that destructors are paired with initializers
* validate that sleepable progs can page in.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251026203853.135105-11-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Dynptr currently caps size and offset at 24 bits, which isn’t sufficient
for file-backed use cases; even 32 bits can be limiting. Refactor dynptr
helpers/kfuncs to use 64-bit size and offset, ensuring consistency
across the APIs.
This change does not affect internals of xdp, skb or other dynptrs,
which continue to behave as before. Also it does not break binary
compatibility.
The widening enables large-file access support via dynptr, implemented
in the next patches.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251026203853.135105-3-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Remove unnecessary kfunc prototypes from test programs; these are
provided by vmlinux.h.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251026203853.135105-2-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.18-rc3).
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|