| Age | Commit message (Collapse) | Author |
|
Pull rdma updates from Jason Gunthorpe:
"This has another new RDMA driver 'bng_en' for latest generation
Broadcom NICs. There might be one more new driver still to come.
Otherwise it is a fairly quite cycle. Summary:
- Minor driver bug fixes and updates to cxgb4, rxe, rdmavt, bnxt_re,
mlx5
- Many bug fix patches for irdma
- WQ_PERCPU annotations and system_dfl_wq changes
- Improved mlx5 support for "other eswitches" and multiple PFs
- 1600Gbps link speed reporting support. Four Digits Now!
- New driver bng_en for latest generation Broadcom NICs
- Bonding support for hns
- Adjust mlx5's hmm based ODP to work with the very large address
space created by the new 5 level paging default on x86
- Lockdep fixups in rxe and siw"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (65 commits)
RDMA/rxe: reclassify sockets in order to avoid false positives from lockdep
RDMA/siw: reclassify sockets in order to avoid false positives from lockdep
RDMA/bng_re: Remove prefetch instruction
RDMA/core: Reduce cond_resched() frequency in __ib_umem_release
RDMA/irdma: Fix SRQ shadow area address initialization
RDMA/irdma: Remove doorbell elision logic
RDMA/irdma: Do not set IBK_LOCAL_DMA_LKEY for GEN3+
RDMA/irdma: Do not directly rely on IB_PD_UNSAFE_GLOBAL_RKEY
RDMA/irdma: Add missing mutex destroy
RDMA/irdma: Fix SIGBUS in AEQ destroy
RDMA/irdma: Add a missing kfree of struct irdma_pci_f for GEN2
RDMA/irdma: Fix data race in irdma_free_pble
RDMA/irdma: Fix data race in irdma_sc_ccq_arm
RDMA/mlx5: Add support for 1600_8x lane speed
RDMA/core: Add new IB rate for XDR (8x) support
IB/mlx5: Reduce IMR KSM size when 5-level paging is enabled
RDMA/bnxt_re: Pass correct flag for dma mr creation
RDMA/bnxt_re: Fix the inline size for GenP7 devices
RDMA/hns: Support reset recovery for bond
RDMA/hns: Support link state reporting for bond
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan
Williams)
- Switch vmd from custom domain number allocator to the common
allocator to prevent a potential race with new non-VMD buses (Dan
Williams)
- Enable Precision Time Measurement (PTM) only if device advertises
support for a relevant role, to prevent invalid PTM Requests that
cause ACS violations that are reported as AER Uncorrectable
Non-Fatal errors (Mika Westerberg)
Resource management:
- Prevent resource tree corruption when BAR resize fails (Ilpo
Järvinen)
- Restore BARs to the original size if a BAR resize fails (Ilpo
Järvinen)
- Remove BAR release from BAR resize attempts by the xe, i915, and
amdgpu drivers so the PCI core can restore BARs if the resize fails
(Ilpo Järvinen)
- Move Resizable BAR code to rebar.c (Ilpo Järvinen)
- Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo
Järvinen)
- Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo
Järvinen)
Power management and error handling:
- For drivers using PCI legacy suspend, save config state at suspend
so that state (not any earlier state from enumeration, probe, or
error recovery) will be restored when resuming (Lukas Wunner)
- For devices with no driver or a driver that lacks power management,
save config state at hibernate so that state (not any earlier state
from enumeration, probe, or error recovery) will be restored when
resuming (Lukas Wunner)
- Save device config space on device addition, before driver binding,
so error recovery works more reliably (Lukas Wunner)
- Drop pci_save_state() from several drivers that no longer need it
since the PCI core always does it and pci_restore_state() no longer
invalidates the saved state (Lukas Wunner)
- Document use of pci_save_state() by drivers to capture the state
they want restored during error recovery (Lukas Wunner)
Power control:
- Add a struct pci_ops.assert_perst() function pointer to
assert/deassert PCIe PERST# and implement it for the qcom driver
(Krishna Chaitanya Chundru)
- Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe
switch, which must be held in reset after poweron so the pwrctrl
driver can configure the switch via I2C before bringing up the
links (Krishna Chaitanya Chundru)
Endpoint framework:
- Convert the endpoint doorbell test to use a threaded IRQ to fix a
'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri)
- Add endpoint VNTB MSI doorbell support to reduce latency between
host and endpoint (Frank Li)
New native PCIe controller drivers:
- Add CIX Sky1 host controller DT binding and driver (Hans Zhang)
- Add NXP S32G host controller DT binding and driver (Vincent
Guittot)
- Add Renesas RZ/G3S host controller DT binding and driver (Claudiu
Beznea)
- Add SpacemiT K1 host controller DT binding and driver (Alex Elder)
Amlogic Meson PCIe controller driver:
- Update DT binding to name DBI region 'dbi', not 'elbi', and update
driver to support both (Manivannan Sadhasivam)
Apple PCIe controller driver:
- Move struct pci_host_bridge allocation from pci_host_common_init()
to callers, which significantly simplifies pcie-apple (Marc
Zyngier)
Broadcom STB PCIe controller driver:
- Disable advertising ASPM L0s support correctly (Jim Quinlan)
- Add a panic/die handler to print diagnostic info in case PCIe
caused an unrecoverable abort (Jim Quinlan)
Cadence PCIe controller driver:
- Add module support for Cadence platform host and endpoint
controller driver (Manikandan K Pillai)
- Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare
for new CIX Sky1 driver (Manikandan K Pillai)
MediaTek PCIe controller driver:
- Convert DT binding to YAML schema (Christian Marangi)
- Add Airoha AN7583 DT compatible and driver support (Christian
Marangi)
Qualcomm PCIe controller driver:
- Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu)
- Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280,
sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT
schemas (Krzysztof Kozlowski)
- Look up OPP using both frequency and data rate (not just frequency)
so RPMh votes can account for both (Krishna Chaitanya Chundru)
Rockchip DesignWare PCIe controller driver:
- Add Rockchip RK3528 compatible strings in DT binding (Yao Zi)
STMicroelectronics STM32MP25 PCIe controller driver:
- Fix a race between link training and endpoint register
initialization (Christian Bruel)
- Align endpoint allocations to match the ATU requirements (Christian
Bruel)
Synopsys DesignWare PCIe controller driver:
- Clear L1 PM Substate Capability 'Supported' bits unless glue driver
says it's supported, which prevents users from enabling non-working
L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas)
- Remove now-superfluous L1SS disable code from tegra194 (Bjorn
Helgaas)
- Configure L1SS support in dw-rockchip when DT says
'supports-clkreq' (Shawn Lin)
TI Keystone PCIe controller driver:
- Fail the probe instead of silently succeeding if ks_pcie_of_data
didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli)
- Make keystone buildable as a loadable module, except on ARM32 where
hook_fault_code() is __init (Siddharth Vadapalli)"
* tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits)
MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer
MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer
PCI: sky1: Add PCIe host support for CIX Sky1
dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings
PCI: cadence: Add support for High Perf Architecture (HPA) controller
MAINTAINERS: Add NXP S32G PCIe controller driver maintainer
PCI: s32g: Add NXP S32G PCIe controller driver (RC)
PCI: dwc: Add register and bitfield definitions
dt-bindings: PCI: s32g: Add NXP S32G PCIe controller
PCI: Add Renesas RZ/G3S host controller driver
PCI: host-generic: Move bridge allocation outside of pci_host_common_init()
dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding
PCI: Validate pci_rebar_size_supported() input
Documentation: PCI: Amend error recovery doc with pci_save_state() rules
treewide: Drop pci_save_state() after pci_restore_state()
PCI/ERR: Ensure error recoverability at all times
PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
PCI: dw-rockchip: Configure L1SS support
PCI: tegra194: Remove unnecessary L1SS disable code
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Replace busylock at the Tx queuing layer with a lockless list.
Resulting in a 300% (4x) improvement on heavy TX workloads, sending
twice the number of packets per second, for half the cpu cycles.
- Allow constantly busy flows to migrate to a more suitable CPU/NIC
queue.
Normally we perform queue re-selection when flow comes out of idle,
but under extreme circumstances the flows may be constantly busy.
Add sysctl to allow periodic rehashing even if it'd risk packet
reordering.
- Optimize the NAPI skb cache, make it larger, use it in more paths.
- Attempt returning Tx skbs to the originating CPU (like we already
did for Rx skbs).
- Various data structure layout and prefetch optimizations from Eric.
- Remove ktime_get() from the recvmsg() fast path, ktime_get() is
sadly quite expensive on recent AMD machines.
- Extend threaded NAPI polling to allow the kthread busy poll for
packets.
- Make MPTCP use Rx backlog processing. This lowers the lock
pressure, improving the Rx performance.
- Support memcg accounting of MPTCP socket memory.
- Allow admin to opt sockets out of global protocol memory accounting
(using a sysctl or BPF-based policy). The global limits are a poor
fit for modern container workloads, where limits are imposed using
cgroups.
- Improve heuristics for when to kick off AF_UNIX garbage collection.
- Allow users to control TCP SACK compression, and default to 33% of
RTT.
- Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid
unnecessarily aggressive rcvbuf growth and overshot when the
connection RTT is low.
- Preserve skb metadata space across skb_push / skb_pull operations.
- Support for IPIP encapsulation in the nftables flowtable offload.
- Support appending IP interface information to ICMP messages (RFC
5837).
- Support setting max record size in TLS (RFC 8449).
- Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL.
- Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock.
- Let users configure the number of write buffers in SMC.
- Add new struct sockaddr_unsized for sockaddr of unknown length,
from Kees.
- Some conversions away from the crypto_ahash API, from Eric Biggers.
- Some preparations for slimming down struct page.
- YAML Netlink protocol spec for WireGuard.
- Add a tool on top of YAML Netlink specs/lib for reporting commonly
computed derived statistics and summarized system state.
Driver API:
- Add CAN XL support to the CAN Netlink interface.
- Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as
defined by the OPEN Alliance's "Advanced diagnostic features for
100BASE-T1 automotive Ethernet PHYs" specification.
- Add DPLL phase-adjust-gran pin attribute (and implement it in
zl3073x).
- Refactor xfrm_input lock to reduce contention when NIC offloads
IPsec and performs RSS.
- Add info to devlink params whether the current setting is the
default or a user override. Allow resetting back to default.
- Add standard device stats for PSP crypto offload.
- Leverage DSA frame broadcast to implement simple HSR frame
duplication for a lot of switches without dedicated HSR offload.
- Add uAPI defines for 1.6Tbps link modes.
Device drivers:
- Add Motorcomm YT921x gigabit Ethernet switch support.
- Add MUCSE driver for N500/N210 1GbE NIC series.
- Convert drivers to support dedicated ops for timestamping control,
and away from the direct IOCTL handling. While at it support GET
operations for PHY timestamping.
- Add (and convert most drivers to) a dedicated ethtool callback for
reading the Rx ring count.
- Significant refactoring efforts in the STMMAC driver, which
supports Synopsys turn-key MAC IP integrated into a ton of SoCs.
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support PPS in/out on all pins
- Intel (100G, ice, idpf):
- ice: implement standard ethtool and timestamping stats
- i40e: support setting the max number of MAC addresses per VF
- iavf: support RSS of GTP tunnels for 5G and LTE deployments
- nVidia/Mellanox (mlx5):
- reduce downtime on interface reconfiguration
- disable being an XDP redirect target by default (same as
other drivers) to avoid wasting resources if feature is
unused
- Meta (fbnic):
- add support for Linux-managed PCS on 25G, 50G, and 100G links
- Wangxun:
- support Rx descriptor merge, and Tx head writeback
- support Rx coalescing offload
- support 25G SPF and 40G QSFP modules
- Ethernet virtual:
- Google (gve):
- allow ethtool to configure rx_buf_len
- implement XDP HW RX Timestamping support for DQ descriptor
format
- Microsoft vNIC (mana):
- support HW link state events
- handle hardware recovery events when probing the device
- Ethernet NICs consumer, and embedded:
- usbnet: add support for Byte Queue Limits (BQL)
- AMD (amd-xgbe):
- add device selftests
- NXP (enetc):
- add i.MX94 support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- bcmasp: add support for PHY-based Wake-on-LAN
- Broadcom switches (b53):
- support port isolation
- support BCM5389/97/98 and BCM63XX ARL formats
- Lantiq/MaxLinear switches:
- support bridge FDB entries on the CPU port
- use regmap for register access
- allow user to enable/disable learning
- support Energy Efficient Ethernet
- support configuring RMII clock delays
- add tagging driver for MaxLinear GSW1xx switches
- Synopsys (stmmac):
- support using the HW clock in free running mode
- add Eswin EIC7700 support
- add Rockchip RK3506 support
- add Altera Agilex5 support
- Cadence (macb):
- cleanup and consolidate descriptor and DMA address handling
- add EyeQ5 support
- TI:
- icssg-prueth: support AF_XDP
- Airoha access points:
- add missing Ethernet stats and link state callback
- add AN7583 support
- support out-of-order Tx completion processing
- Power over Ethernet:
- pd692x0: preserve PSE configuration across reboots
- add support for TPS23881B devices
- Ethernet PHYs:
- Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support
- Support 50G SerDes and 100G interfaces in Linux-managed PHYs
- micrel:
- support for non PTP SKUs of lan8814
- enable in-band auto-negotiation on lan8814
- realtek:
- cable testing support on RTL8224
- interrupt support on RTL8221B
- motorcomm: support for PHY LEDs on YT853
- microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag
- mscc: support for PHY LED control
- CAN drivers:
- m_can: add support for optional reset and system wake up
- remove can_change_mtu() obsoleted by core handling
- mcp251xfd: support GPIO controller functionality
- Bluetooth:
- add initial support for PASTa
- WiFi:
- split ieee80211.h file, it's way too big
- improvements in VHT radiotap reporting, S1G, Channel Switch
Announcement handling, rate tracking in mesh networks
- improve multi-radio monitor mode support, and add a cfg80211
debugfs interface for it
- HT action frame handling on 6 GHz
- initial chanctx work towards NAN
- MU-MIMO sniffer improvements
- WiFi drivers:
- RealTek (rtw89):
- support USB devices RTL8852AU and RTL8852CU
- initial work for RTL8922DE
- improved injection support
- Intel:
- iwlwifi: new sniffer API support
- MediaTek (mt76):
- WED support for >32-bit DMA
- airoha NPU support
- regdomain improvements
- continued WiFi7/MLO work
- Qualcomm/Atheros:
- ath10k: factory test support
- ath11k: TX power insertion support
- ath12k: BSS color change support
- ath12k: statistics improvements
- brcmfmac: Acer A1 840 tablet quirk
- rtl8xxxu: 40 MHz connection fixes/support"
* tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits)
net: page_pool: sanitise allocation order
net: page pool: xa init with destroy on pp init
net/mlx5e: Support XDP target xmit with dummy program
net/mlx5e: Update XDP features in switch channels
selftests/tc-testing: Test CAKE scheduler when enqueue drops packets
net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
wireguard: netlink: generate netlink code
wireguard: uapi: generate header with ynl-gen
wireguard: uapi: move flag enums
wireguard: uapi: move enum wg_cmd
wireguard: netlink: add YNL specification
selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py
selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py
selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py
selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py
selftests: drv-net: introduce Iperf3Runner for measurement use cases
selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS
net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive()
Documentation: net: dsa: mention simple HSR offload helpers
Documentation: net: dsa: mention availability of RedBox
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Allow creaing nbcon console drivers with an unsafe write_atomic()
callback that can only be called by the final nbcon_atomic_flush_unsafe().
Otherwise, the driver would rely on the kthread.
It is going to be used as the-best-effort approach for an
experimental nbcon netconsole driver, see
https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org
Note that a safe .write_atomic() callback is supposed to work in NMI
context. But some networking drivers are not safe even in IRQ
context:
https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35
In an ideal world, all networking drivers would be fixed first and
the atomic flush would be blocked only in NMI context. But it brings
the question how reliable networking drivers are when the system is
in a bad state. They might block flushing more reliable serial
consoles which are more suitable for serious debugging anyway.
- Allow to use the last 4 bytes of the printk ring buffer.
- Prevent queuing IRQ work and block printk kthreads when consoles are
suspended. Otherwise, they create non-necessary churn or even block
the suspend.
- Release console_lock() between each record in the kthread used for
legacy consoles on RT. It might significantly speed up the boot.
- Release nbcon context between each record in the atomic flush. It
prevents stalls of the related printk kthread after it has lost the
ownership in the middle of a record
- Add support for NBCON consoles into KDB
- Add %ptsP modifier for printing struct timespec64 and use it where
possible
- Misc code clean up
* tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits)
printk: Use console_is_usable on console_unblank
arch: um: kmsg_dump: Use console_is_usable
drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT
lib/vsprintf: Unify FORMAT_STATE_NUM handlers
printk: Avoid irq_work for printk_deferred() on suspend
printk: Avoid scheduling irq_work on suspend
printk: Allow printk_trigger_flush() to flush all types
tracing: Switch to use %ptSp
scsi: snic: Switch to use %ptSp
scsi: fnic: Switch to use %ptSp
s390/dasd: Switch to use %ptSp
ptp: ocp: Switch to use %ptSp
pps: Switch to use %ptSp
PCI: epf-test: Switch to use %ptSp
net: dsa: sja1105: Switch to use %ptSp
mmc: mmc_test: Switch to use %ptSp
media: av7110: Switch to use %ptSp
ipmi: Switch to use %ptSp
igb: Switch to use %ptSp
e1000e: Switch to use %ptSp
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers:
"This is a core arm64 change. However, I was asked to take this because
most uses of kernel-mode FPSIMD are in crypto or CRC code.
In v6.8, the size of task_struct on arm64 increased by 528 bytes due
to the new 'kernel_fpsimd_state' field. This field was added to allow
kernel-mode FPSIMD code to be preempted.
Unfortunately, 528 bytes is kind of a lot for task_struct. This
regression in the task_struct size was noticed and reported.
Recover that space by making this state be allocated on the stack at
the beginning of each kernel-mode FPSIMD section.
To make it easier for all the users of kernel-mode FPSIMD to do that
correctly, introduce and use a 'scoped_ksimd' abstraction"
* tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits)
lib/crypto: arm64: Move remaining algorithms to scoped ksimd API
lib/crypto: arm/blake2b: Move to scoped ksimd API
arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
arm64/fpu: Enforce task-context only for generic kernel mode FPU
net/mlx5: Switch to more abstract scoped ksimd guard API on arm64
arm64/xorblocks: Switch to 'ksimd' scoped guard API
crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API
crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API
crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API
crypto/arm64: polyval - Switch to 'ksimd' scoped guard API
crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API
crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API
raid6: Move to more abstract 'ksimd' guard API
crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
...
|
|
Merge in late fixes in preparation for the net-next PR.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Save per-channel resources in default, in device and host memory.
As no better API exist, make the XDP-redirect-target SQ available by
loading a dummy XDP program.
This improves the latency of interface up/down operations when feature
is disabled.
Perf numbers:
NIC: Connect-X7.
Setup: 248 channels, default mtu and rx/tx ring sizes.
Interface up + down:
Before: 2.246 secs
After: 1.798 secs (-0.448 sec)
Saves ~1.8 msec per channel.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: William Tu <witu@nvidia.com>
Link: https://patch.msgid.link/1764497617-1326331-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The XDP features state might depend of the state of other features, like
HW-LRO / HW-GRO.
In general, move the re-evaluation announcement of the XDP features
(xdp_set_features_flag_locked) into the flow where configuration gets
changed. There's no point in updating them elsewhere.
This is a more appropriate place, as this modifies the announced
features while channels are inactive, which avoids the small interval
between channel activation and the proper setting of the XDP features.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: William Tu <witu@nvidia.com>
Link: https://patch.msgid.link/1764497617-1326331-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Use the napi functions napi_alloc_skb() and napi_gro_receive() instead
of netdev_alloc_skb() and netif_receive_skb() for more efficient packet
receiving. The switch to napi aware functions increases the RX
throughput, reduces the occurrence of retransmissions and improves the
resilience against SKB allocation failures.
Signed-off-by: Florian Fuchs <fuchsfl@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251130194155.1950980-1-fuchsfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MLX5E_100MB and MLX5E_1GB defines are confusing, MLX5E_100MB is not
equal to 100 * MEGA, and MLX5E_1GB is not equal to one GIGA, as they
hide the Kbps rate conversion required for ieee_maxrate.
Replace hardcoded bandwidth conversion values with standard unit
definitions from linux/units.h. Rename MLX5E_100MB/MLX5E_1GB to
MLX5E_100MB_TO_KB/MLX5E_1GB_TO_KB to clarify these are conversion
factors to Kbps, not absolute bandwidth values.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1764498334-1327918-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace hard coded 255 magic number with U8_MAX (the register field is 8
bits).
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1764498334-1327918-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clarify that the limit represents the threshold for using 100 Mbps
units rather than a general Mbps limit.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1764498334-1327918-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change upper_limit_mbps/gbps from __u64 to u64 to follow kernel coding
conventions.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1764498334-1327918-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 17e9f841dd227a4dc976b22d000d5f669bc14493.
Nathan reports error messages appearing in dmesg since commit
under Fixes:
[ 3.844125] r8169 0000:01:00.0 (unnamed net_device) (uninitialized): rtl_eriar_cond == 0 (loop: 100, delay: 100).
[ 3.864844] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 3.878825] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 3.892632] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 5.002551] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 5.016286] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
[ 5.030027] r8169 0000:01:00.0 eth0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
Let's drop the bad change and revisit in the next release cycle.
Repoted-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/20251201224238.GA604467@ax162
Fixes: 17e9f841dd22 ("r8169: add DASH support for RTL8127AP")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Under heavy load, Rx Buffer Unavailable (RBU) can occur if Rx processing
is slower than network. When an RBU is signaled, try to schedule NAPI to
help recover from such situation (including cases where an IRQ may be
missed or such)
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Link: https://patch.msgid.link/20251129175016.3034185-3-Raju.Rangoju@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor the DMA interrupt bottom-half handling to improve the
readability, maintainability, without changing the intended behavior.
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Link: https://patch.msgid.link/20251129175016.3034185-2-Raju.Rangoju@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When MANA is being probed, it's possible that hardware is in recovery
mode and the device may get GDMA_EQE_HWC_RESET_REQUEST over HWC in the
middle of the probe. Detect such condition and go through the recovery
service procedure.
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1764193552-9712-1-git-send-email-longli@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch fixes multiple spelling mistakes in dl2k driver comments:
- "deivices" -> "devices"
- "Ttransmit" -> "Transmit"
- "catastronphic" -> "catastrophic"
- "Extened" -> "Extended"
Also fix incorrect unit description: `rx_timeout` uses 640ns increments,
not 64ns.
- "64ns" -> "640ns"
These are comment-only changes and do not affect runtime behavior.
Signed-off-by: Yeounsu Moon <yyyynoom@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251130220652.5425-2-yyyynoom@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert the enetc driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc for handling
ETHTOOL_GRXRINGS command. This simplifies the code in two ways:
1. For enetc_get_rxnfc(): Remove the ETHTOOL_GRXRINGS case from the
switch statement while keeping other cases for classifier rules.
2. For enetc4_get_rxnfc(): Remove it completely and use
enetc_get_rxnfc() instead.
Now on, enetc_get_rx_ring_count() is the callback that returns the
number of RX rings for enetc driver.
Also, remove the documentation around enetc4_get_rxnfc(), which was not
matching what the function did(?!).
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251128-gxring_freescale-v1-3-22a978abf29e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert the dpaa2 driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
ETHTOOL_GRXRINGS case from the switch statement and replacing it with
a direct return of the queue count.
The driver still maintains .get_rxnfc for other commands including
ETHTOOL_GRXCLSRLCNT, ETHTOOL_GRXCLSRULE, and ETHTOOL_GRXCLSRLALL.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251128-gxring_freescale-v1-2-22a978abf29e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert the gianfar driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
ETHTOOL_GRXRINGS case from the switch statement and replacing it with
a direct return of the queue count.
The driver still maintains .get_rxnfc for other commands including
ETHTOOL_GRXCLSRLCNT, ETHTOOL_GRXCLSRULE, and ETHTOOL_GRXCLSRLALL.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251128-gxring_freescale-v1-1-22a978abf29e@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns bcmgenet with the
new ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20251127-grxrings_broadcom-v1-2-b0b182864950@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns bnxt with the new
ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20251127-grxrings_broadcom-v1-1-b0b182864950@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The firmware can now cache the virtual link admin state (auto/on/off) of
all VFs and as such, the PF driver no longer has to intercept the VF
driver's port_phy_qcfg() call and then provide the link admin state.
If the FW does not have this capability, fall back to the existing
interception method.
The initial default link admin state (auto) is also set initially when
the VFs are created.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Mohammad Shuab Siddique <mohammad-shuab.siddique@broadcom.com>
Signed-off-by: Rob Miller <rmiller@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With End-of-Packet padding (EOP) set, the chip will disable Relaxed
Ordering (RO) of TPA data packets. A TPA segment with EOP set will be
padded to the next cache boundary and can potentially overwrite the
beginning bytes of the next TPA segment when RO is enabled on 5760X.
To prevent that, the chip disables RO for TPA when EOP is set.
To take advantge of RO and higher performance, do not set EOP on
5760X chips when TPA is enabled. Define a proper RX_BD_FLAGS_AGG_EOP
constant to make it clear that we are setting EOP.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On newer chips that use NQs and CQs, add the CQ ring dump to
bnxt_dump_cp_sw_state() to make it more complete.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MSIX is always requested when the RoCE driver calls bnxt_register_dev().
We already check bnxt_ulp_registered(), so checking the flag is
redundant. It was a left-over flag after converting to auxbus, so
remove it.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rturn early with -EOPNOTSUPP and an extack message if the PHY type is
BaseT since module status is not available for BaseT.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Gautam R A <gautam-r.a@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The priority packet and byte counters in ethtool -S are returned by
the driver based on the pri2cos mapping. The assumption is that each
priority is mapped to one and only one hardware CoS queue. In a
special RoCE configuration, the FW uses combined CoS queue 0 and CoS
queue 1 for the priority mapped to CoS queue 0. In this special
case, we need to add the CoS queue 0 and CoS queue 1 counters for
the priority packet and byte counters.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The current dev_warn messages for too many VLAN changes are confusing
and one place incorrectly references "add" instead of "delete" VLANs
due to copy-paste errors.
- Use dev_info instead of dev_warn to lower the log level.
- Rephrase the message to: "virtchnl: Too many VLAN [add|delete]
([v1|v2]) requests; splitting into multiple messages to PF\n".
Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-12-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
- Fix a typo in the ice_fdir_has_frag() kernel-doc comment ("is" -> "if")
- Correct the NVM erase error message format string from "0x02%x" to
"0x%02x" so the module value is printed correctly.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-11-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The error messages in idpf_rx_desc_alloc_all() used the group index i
when reporting memory allocation failures for individual Rx and Rx buffer
queues. This is incorrect.
Update the messages to use the correct queue index j and include the
queue group index i for clearer identification of the affected Rx and Rx
buffer queues.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-10-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
idpf_compl_queue uses a union for comp, comp_4b, and desc_ring. The
release path should check complq->desc_ring to determine whether the DMA
descriptor ring is allocated. The current check against comp works but is
leftover from a previous commit and is misleading in this context.
Switching the check to desc_ring improves readability and more directly
reflects the intended meaning, since desc_ring is the field representing
the allocated DMA-backed descriptor ring.
No functional change.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-9-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ixgbe_non_sfp_link_config() is called twice in ixgbe_open()
once to assign its return value to err and again in the
conditional check. This patch uses the stored err value
instead of calling the function a second time. This avoids
redundant work and ensures consistent error reporting.
Also fix a small typo in the ixgbe_remove() comment:
"The could be caused" -> "This could be caused".
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-8-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The caller, ethtool_set_eeprom(), already performs the same checks so
these are unnecessary in the driver. This reverts commit
90fb7db49c6d ("e1000e: fix heap overflow in e1000_set_eeprom"), however,
corrections for RCT have been kept.
Link: https://lore.kernel.org/all/db92fcc8-114d-4e85-9d15-7860545bc65e@suse.de/
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-7-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert vport state to a bitmap and remove the DOWN state which is
redundant in the existing logic. There are no functional changes aside
from the use of bitwise operations when setting and checking the states.
Removed the double underscore to be consistent with the naming of other
bitmaps in the header and renamed current_state to vport_is_up to match
the meaning of the new variable.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Chittim Madhu <madhu.chittim@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-6-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Flex array should be at the end of the structure and use [] syntax
Remove unused fields of ixgbevf_q_vector.
They aren't used since busy poll was moved to core code in commit
508aac6dee02 ("ixgbevf: get rid of custom busy polling code").
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Natalia Wochtman <natalia.wochtman@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-5-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert the Cavium Thunder NIC VF driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc solely for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
switch statement and replacing it with a direct return of the queue
count.
The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20251126-gxring_cavium-v1-1-a066c0c9e0c6@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The extra "count >= limit" check in stmmac_rx_zc() is redundant and
has no effect because the value of "count" doesn't change after the
while condition at this point.
However, it can change after "read_again:" label:
while (count < limit) {
...
if (count >= limit)
break;
read_again:
...
/* XSK pool expects RX frame 1:1 mapped to XSK buffer */
if (likely(status & rx_not_ls)) {
xsk_buff_free(buf->xdp);
buf->xdp = NULL;
dirty++;
count++;
goto read_again;
}
...
This patch addresses the same issue previously resolved in stmmac_rx()
by commit fa02de9e7588 ("net: stmmac: fix rx budget limit check").
The fix is the same: move the check after the label to ensure that it
bounds the goto loop.
Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy")
Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20251126104327.175590-1-aleksei.kodanev@bell-sw.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This adds DASH support for chip RTL8127AP. Its mac version is
RTL_GIGA_MAC_VER_80. DASH is a standard for remote management of network
device, allowing out-of-band control.
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
Link: https://patch.msgid.link/20251126055950.2050-1-javen_xu@realsil.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ptp_clock_settime() assumes every ptp_clock has implemented settime64().
Stub it with -EOPNOTSUPP to prevent a NULL dereference.
The fix is similar to commit 329d050bbe63 ("gve: Implement settime64
with -EOPNOTSUPP").
Fixes: d734223b2f0d ("iavf: add initial framework for registering PTP clock")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Tim Hostetler <thostet@google.com>
Link: https://patch.msgid.link/20251126094850.2842557-1-mschmidt@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The tx->dropped_pkt counter is a 64-bit integer that is incremented
directly. On 32-bit architectures, this operation is not atomic and
can lead to read/write tearing if a reader accesses the counter during
the update. This can result in incorrect values being reported for
dropped packets.
To prevent this potential data corruption, wrap the increment
operation with u64_stats_update_begin() and u64_stats_update_end().
This ensures that updates to the 64-bit counter are atomic, even on
32-bit systems, by using a sequence lock.
The u64_stats_sync API requires the writer to have exclusive access,
which is already provided in this context by the network stack's
serialization of the transmit path (net_device_ops::ndo_start_xmit
[1]) for a given queue.
[1]: https://www.kernel.org/doc/Documentation/networking/netdevices.txt
Signed-off-by: Max Yuan <maxyuan@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As those following recent changes from Eric know very well
using NAPI skb cache is crucial to achieve good perf, at
least on recent AMD platforms. Make sure bnxt feeds the skb
cache with Tx skbs.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Conflicts:
net/xdp/xsk.c
0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
8da7bea7db69 ("xsk: add indirect call for xsk_destruct_skb")
30ed05adca4a ("xsk: use a smaller new lock for shared pool case")
https://lore.kernel.org/20251127105450.4a1665ec@canb.auug.org.au
https://lore.kernel.org/eb4eee14-7e24-4d1b-b312-e9ea738fefee@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In Store and Forward mode, flushing frames when the receive buffer is
unavailable, can cause the MTL Rx FIFO to go out of sync. This results
in buffering of a few frames in the FIFO without triggering Rx DMA
from transferring the data to the system memory until another packet
is received. Once the issue happens, for a ping request, the packet is
forwarded to the system memory only after we receive another packet
and hece we observe a latency equivalent to the ping interval.
64 bytes from 192.168.2.100: seq=1 ttl=64 time=1000.344 ms
Also, we can observe constant gmacgrp_debug register value of
0x00000120, which indicates "Reading frame data".
The issue is not reproducible after disabling frame flushing when Rx
buffer is unavailable. But in that case, the Rx DMA enters a suspend
state due to buffer unavailability. To resume operation, software
must write to the receive_poll_demand register after adding new
descriptors, which reactivates the Rx DMA.
This issue is observed in the socfpga platforms which has dwmac1000 IP
like Arria 10, Cyclone V and Agilex 7. Issue is reproducible after
running iperf3 server at the DUT for UDP lower packet sizes.
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20251126-a10_ext_fix-v1-1-d163507f646f@altera.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There are currently two situations that can trigger the PTP interrupt,
one is the PPS event, the other is the PEROUT event. However, the irq
handler fec_pps_interrupt() does not check the irq event type and
directly registers a PPS event into the system, but the event may be
a PEROUT event. This is incorrect because PEROUT is an output signal,
while PPS is the input of the kernel PPS system. Therefore, add a check
for the event type, if pps_enable is true, it means that the current
event is a PPS event, and then the PPS event is registered.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251125085210.1094306-5-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In the current driver, PPS and PEROUT use the same channel to generate
the events, so they cannot be enabled at the same time. Otherwise, the
later configuration will overwrite the earlier configuration. Therefore,
when configuring PPS, the driver will check whether PEROUT is enabled.
Similarly, when configuring PEROUT, the driver will check whether PPS
is enabled.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251125085210.1094306-4-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If the previously set PEROUT is already active, updating it will cause
the new PEROUT to start immediately instead of at the specified time.
This is because fep->reload_period is updated whithout check whether
the PEROUT is enabled, and the old PEROUT is not disabled. Therefore,
the pulse period will be updated immediately in the pulse interrupt
handler fec_pps_interrupt().
Currently, the driver does not support directly updating PEROUT and it
will make the logic be more complicated. To fix the current issue, add
a check before enabling the PEROUT, the driver will return an error if
PEROUT is enabled. If users wants to update a new PEROUT, they should
disable the old PEROUT first.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251125085210.1094306-3-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The PEROUT allows the user to set a specified future time to output the
periodic signal. If the future time is far from the current time, the FEC
driver will use hrtimer to configure PEROUT one second before the future
time. However, the hrtimer will not be canceled if the PEROUT is disabled
before the hrtimer expires. So the PEROUT will be configured when the
hrtimer expires, which is not as expected. Therefore, cancel the hrtimer
in fec_ptp_pps_disable() to fix this issue.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251125085210.1094306-2-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
As we have exposed the PCS registers via the SWMII we can now start looking
at connecting the XPCS driver to those registers and let it mange the PCS
instead of us doing it directly from the fbnic driver.
For now this just gets us the ability to detect link. The hope is in the
future to add some of the vendor specific registers to begin enabling XPCS
configuration of the interface.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374325295.959489.14521115864034905277.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|