summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/ice
AgeCommit message (Collapse)Author
2025-09-11ice: allow calling custom send function in fwlogMichal Swiatkowski
Fwlog code needs to communicate with FW. In ice it is done through admin queue command. Allow indirect calling the send function to move the specific admin queue send function from fwlog core code. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-11ice: add pdev into fwlog structure and use it for loggingMichal Swiatkowski
Prepare the code to be moved to the library. ice_debug() won't be there so switch to dev_dbg(). Add struct pdev pointer in fwlog to track on which pdev the fwlog was created. Switch the dev passed in dev_warn() to the one stored in fwlog. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-11ice: introduce ice_fwlog structureMichal Swiatkowski
The new structure is needed to make the fwlog code a library. A goal is to drop ice_hw structure in all fwlog related functions calls. Pass a ice_fwlog pointer across fwlog functions and use it wherever it is possible. Still use &hw->fwlog in debugfs code as it needs changing the value being passed in priv. It will be done in one of the next patches. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-11ice: drop ice_pf_fwlog_update_module()Michal Swiatkowski
Any other access to fwlog_cfg isn't done through a function. Follow scheme that is used to access other fwlog_cfg elements from debugfs and write to the log_level directly. ice_pf_fwlog_update_module() is called only twice (from one function). Remove it. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-11ice: move get_fwlog_data() to fwlog fileMichal Swiatkowski
Change the function prototype to receive hw structure instead of pf to simplify the call. Instead of passing whole event pass only msg_buf pointer and length. Make ice_fwlog_ring_full() static as it isn't called from any other context. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-11ice: make fwlog functions staticMichal Swiatkowski
ice_fwlog_supported(), ice_fwlog_get() and ice_fwlog_supported() aren't called outside the ice_fwlog.c file. Make it static and move in the file to allow clean build. Drop ice_fwlog_get(). It is called only from ice_fwlog_init() function where the fwlog support is already checked. There is no need to check it again, call ice_aq_fwlog_get() instead. Drop no longer valid comment from ice_fwlog_get_supported(). Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-11net: xdp: pass full flags to xdp_update_skb_shared_info()Jakub Kicinski
xdp_update_skb_shared_info() needs to update skb state which was maintained in xdp_buff / frame. Pass full flags into it, instead of breaking it out bit by bit. We will need to add a bit for unreadable frags (even tho XDP doesn't support those the driver paths may be common), at which point almost all call sites would become: xdp_update_skb_shared_info(skb, num_frags, sinfo->xdp_frags_size, MY_PAGE_SIZE * num_frags, xdp_buff_is_frag_pfmemalloc(xdp), xdp_buff_is_frag_unreadable(xdp)); Keep a helper for accessing the flags, in case we need to transform them somehow in the future (e.g. to cover up xdp_buff vs xdp_frame differences). While we are touching call callers - rename the helper to xdp_update_skb_frags_info(), previous name may have implied that it's shinfo that's updated. We are updating flags in struct sk_buff based on frags that got attched. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/20250905221539.2930285-2-kuba@kernel.org Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc5). No conflicts. Adjacent changes: include/net/sock.h c51613fa276f ("net: add sk->sk_drop_counters") 5d6b58c932ec ("net: lockless sock_i_ino()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02ice: fix NULL access of tx->in_use in ice_ll_ts_intrJacob Keller
Recent versions of the E810 firmware have support for an extra interrupt to handle report of the "low latency" Tx timestamps coming from the specialized low latency firmware interface. Instead of polling the registers, software can wait until the low latency interrupt is fired. This logic makes use of the Tx timestamp tracking structure, ice_ptp_tx, as it uses the same "ready" bitmap to track which Tx timestamps complete. Unfortunately, the ice_ll_ts_intr() function does not check if the tracker is initialized before its first access. This results in NULL dereference or use-after-free bugs similar to the issues fixed in the ice_ptp_ts_irq() function. Fix this by only checking the in_use bitmap (and other fields) if the tracker is marked as initialized. The reset flow will clear the init field under lock before it tears the tracker down, thus preventing any use-after-free or NULL access. Fixes: 82e71b226e0e ("ice: Enable SW interrupt from FW for LL TS") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02ice: fix NULL access of tx->in_use in ice_ptp_ts_irqJacob Keller
The E810 device has support for a "low latency" firmware interface to access and read the Tx timestamps. This interface does not use the standard Tx timestamp logic, due to the latency overhead of proxying sideband command requests over the firmware AdminQ. The logic still makes use of the Tx timestamp tracking structure, ice_ptp_tx, as it uses the same "ready" bitmap to track which Tx timestamps complete. Unfortunately, the ice_ptp_ts_irq() function does not check if the tracker is initialized before its first access. This results in NULL dereference or use-after-free bugs similar to the following: [245977.278756] BUG: kernel NULL pointer dereference, address: 0000000000000000 [245977.278774] RIP: 0010:_find_first_bit+0x19/0x40 [245977.278796] Call Trace: [245977.278809] ? ice_misc_intr+0x364/0x380 [ice] This can occur if a Tx timestamp interrupt races with the driver reset logic. Fix this by only checking the in_use bitmap (and other fields) if the tracker is marked as initialized. The reset flow will clear the init field under lock before it tears the tracker down, thus preventing any use-after-free or NULL access. Fixes: f9472aaabd1f ("ice: Process TSYN IRQ in a separate function") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.17-rc4). No conflicts. Adjacent changes: drivers/net/ethernet/intel/idpf/idpf_txrx.c 02614eee26fb ("idpf: do not linearize big TSO packets") 6c4e68480238 ("idpf: remove obsolete stashing code") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-28Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: split ice_virtchnl.c git-blame friendly way Przemek Kitszel says: Split ice_virtchnl.c into two more files (+headers), in a way that git-blame works better. Then move virtchnl files into a new subdir. No logic changes. I have developed (or discovered ;)) how to split a file in a way that both old and new are nice in terms of git-blame There was not much discussion on [RFC], so I would like to propose to go forward with this approach. There are more commits needed to have it nice, so it forms a git-log vs git-blame tradeoff, but (after the brief moment that this is on the top) we spend orders of magnitude more time looking at the blame output (and commit messages linked from that) - so I find it much better to see actual logic changes instead of "move xx to yy" stuff (typical for "squashed/single-commit splits"). Cherry-picks/rebases work the same with this method as with simple "squashed/single-commit" approach (literally all commits squashed into one (to have better git-log, but shitty git-blame output). Rationale for the split itself is, as usual, "file is big and we want to extend it". * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: finish virtchnl.c split into rss.c ice: extract virt/rss.c: cleanup - p2 ice: extract virt/rss.c: cleanup - p1 ice: split RSS stuff out of virtchnl.c - copy back ice: split RSS stuff out of virtchnl.c - tmp rename ice: finish virtchnl.c split into queues.c ice: extract virt/queues.c: cleanup - p3 ice: extract virt/queues.c: cleanup - p2 ice: extract virt/queues.c: cleanup - p1 ice: split queue stuff out of virtchnl.c - copy back ice: split queue stuff out of virtchnl.c - tmp rename ice: add virt/ and move ice_virtchnl* files there ==================== Link: https://patch.msgid.link/20250827224641.415806-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27ice: finish virtchnl.c split into rss.cPrzemek Kitszel
Move functions out of virt/virtchnl.c to virt/rss.c. Same "git tricks" used as for the split into virt/queues.c that is immediately preceding this split. Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: extract virt/rss.c: cleanup - p2Przemek Kitszel
Remove remaining portion of the stuff that stays within virtchnl.c, (separate commits to have nicer, removal-only, history). Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: extract virt/rss.c: cleanup - p1Przemek Kitszel
Start removing stuff from virt/rss.c that will be kept in the main virtchnl file. Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: split RSS stuff out of virtchnl.c - copy backPrzemek Kitszel
Add copy of virtchnl.c under the original name/location. Now both virt/virtchnl.c and virt/rss.c have the same content, and only the former of the two in use. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: split RSS stuff out of virtchnl.c - tmp renamePrzemek Kitszel
Temporary rename of virtchnl.c into rss.c In order to split virtchnl.c in a way that makes it much easier to still blame new file, we do it via multiple git steps. Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: finish virtchnl.c split into queues.cPrzemek Kitszel
Move queue configuration functions out of virtchnl.c to queues.c. Technically the move was started by an earlier commit of the series, that was duplicating virtchnl.c as queues.c (what forces git to nicely show blame when asked), followed by a couple of cleanup commits (that removed stuff that is not moved from the new file, again - multiple commits, to avoid git saving on changes lines by reusing removed lines as a content of some kept function), with this final commit actually enabling compilation of the new file, removing stuff from the virtchnl.c, and making some moved functions visible (via static keyword removal). Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: extract virt/queues.c: cleanup - p3Przemek Kitszel
Remove final portion of the stuff that stays within virtchnl.c, (separate commits to have nicer, removal-only, history). Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: extract virt/queues.c: cleanup - p2Przemek Kitszel
Remove next piece of the content that stays in virtchnl.c, (separate commits to have nicer git history). Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: extract virt/queues.c: cleanup - p1Przemek Kitszel
Start removing stuff from virt/queues.c that will be kept in the main virtchnl file. Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: split queue stuff out of virtchnl.c - copy backPrzemek Kitszel
Add copy of virtchnl.c under the original name/location. Now both virt/virtchnl.c and virt/queues.c have the same content, and only the former of the two is in use. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-27ice: split queue stuff out of virtchnl.c - tmp renamePrzemek Kitszel
Temporary rename of virtchnl.c into queues.c In order to split virtchnl.c in a way that makes it much easier to still blame new file, we do it via multiple git steps. Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-26devlink: Move graceful period parameter to reporter opsShahar Shitrit
Move the default graceful period from a parameter to devlink_health_reporter_create() to a field in the devlink_health_reporter_ops structure. This change improves consistency, as the graceful period is inherently tied to the reporter's behavior and recovery policy. It simplifies the signature of devlink_health_reporter_create() and its internal helper functions. It also centralizes the reporter configuration at the ops structure, preparing the groundwork for a downstream patch that will introduce a devlink health reporter burst period attribute whose default value will similarly be provided by the driver via the ops structure. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250824084354.533182-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-25ice: fix incorrect counter for buffer allocation failuresMichal Kubiak
Currently, the driver increments `alloc_page_failed` when buffer allocation fails in `ice_clean_rx_irq()`. However, this counter is intended for page allocation failures, not buffer allocation issues. This patch corrects the counter by incrementing `alloc_buf_failed` instead, ensuring accurate statistics reporting for buffer allocation failures. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Reported-by: Jacob Keller <jacob.e.keller@intel.com> Suggested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Priya Singh <priyax.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25ice: use fixed adapter index for E825C embedded devicesJacob Keller
The ice_adapter structure is used by the ice driver to connect multiple physical functions of a device in software. It was introduced by commit 0e2bddf9e5f9 ("ice: add ice_adapter for shared data across PFs on the same NIC") and is primarily used for PTP support, as well as for handling certain cross-PF synchronization. The original design of ice_adapter used PCI address information to determine which devices should be connected. This was extended to support E825C devices by commit fdb7f54700b1 ("ice: Initial support for E825C hardware in ice_adapter"), which used the device ID for E825C devices instead of the PCI address. Later, commit 0093cb194a75 ("ice: use DSN instead of PCI BDF for ice_adapter index") replaced the use of Bus/Device/Function addressing with use of the device serial number. E825C devices may appear in "Dual NAC" configuration which has multiple physical devices tied to the same clock source and which need to use the same ice_adapter. Unfortunately, each "NAC" has its own NVM which has its own unique Device Serial Number. Thus, use of the DSN for connecting ice_adapter does not work properly. It "worked" in the pre-production systems because the DSN was not initialized on the test NVMs and all the NACs had the same zero'd serial number. Since we cannot rely on the DSN, lets fall back to the logic in the original E825C support which used the device ID. This is safe for E825C only because of the embedded nature of the device. It isn't a discreet adapter that can be plugged into an arbitrary system. All E825C devices on a given system are connected to the same clock source and need to be configured through the same PTP clock. To make this separation clear, reserve bit 63 of the 64-bit index values as a "fixed index" indicator. Always clear this bit when using the device serial number as an index. For E825C, use a fixed value defined as the 0x579C E825C backplane device ID bitwise ORed with the fixed index indicator. This is slightly different than the original logic of just using the device ID directly. Doing so prevents a potential issue with systems where only one of the NACs is connected with an external PHY over SGMII. In that case, one NAC would have the E825C_SGMII device ID, but the other would not. Separate the determination of the full 64-bit index from the 32-bit reduction logic. Provide both ice_adapter_index() and a wrapping ice_adapter_xa_index() which handles reducing the index to a long on 32-bit systems. As before, cache the full index value in the adapter structure to warn about collisions. This fixes issues with E825C not initializing PTP on both NACs, due to failure to connect the appropriate devices to the same ice_adapter. Fixes: 0093cb194a75 ("ice: use DSN instead of PCI BDF for ice_adapter index") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25ice: don't leave device non-functional if Tx scheduler config failsJacob Keller
The ice_cfg_tx_topo function attempts to apply Tx scheduler topology configuration based on NVM parameters, selecting either a 5 or 9 layer topology. As part of this flow, the driver acquires the "Global Configuration Lock", which is a hardware resource associated with programming the DDP package to the device. This "lock" is implemented by firmware as a way to guarantee that only one PF can program the DDP for a device. Unlike a traditional lock, once a PF has acquired this lock, no other PF will be able to acquire it again (including that PF) until a CORER of the device. Future requests to acquire the lock report that global configuration has already completed. The following flow is used to program the Tx topology: * Read the DDP package for scheduler configuration data * Acquire the global configuration lock * Program Tx scheduler topology according to DDP package data * Trigger a CORER which clears the global configuration lock This is followed by the flow for programming the DDP package: * Acquire the global configuration lock (again) * Download the DDP package to the device * Release the global configuration lock. However, if configuration of the Tx topology fails, (i.e. ice_get_set_tx_topo returns an error code), the driver exits ice_cfg_tx_topo() immediately, and fails to trigger CORER. While the global configuration lock is held, the firmware rejects most AdminQ commands, as it is waiting for the DDP package download (or Tx scheduler topology programming) to occur. The current driver flows assume that the global configuration lock has been reset by CORER after programming the Tx topology. Thus, the same PF attempts to acquire the global lock again, and fails. This results in the driver reporting "an unknown error occurred when loading the DDP package". It then attempts to enter safe mode, but ultimately fails to finish ice_probe() since nearly all AdminQ command report error codes, and the driver stops loading the device at some point during its initialization. The only currently known way that ice_get_set_tx_topo() can fail is with certain older DDP packages which contain invalid topology configuration, on firmware versions which strictly validate this data. The most recent releases of the DDP have resolved the invalid data. However, it is still poor practice to essentially brick the device, and prevent access to the device even through safe mode or recovery mode. It is also plausible that this command could fail for some other reason in the future. We cannot simply release the global lock after a failed call to ice_get_set_tx_topo(). Releasing the lock indicates to firmware that global configuration (downloading of the DDP) has completed. Future attempts by this or other PFs to load the DDP will fail with a report that the DDP package has already been downloaded. Then, PFs will enter safe mode as they realize that the package on the device does not meet the minimum version requirement to load. The reported error messages are confusing, as they indicate the version of the default "safe mode" package in the NVM, rather than the version of the file loaded from /lib/firmware. Instead, we need to trigger CORER to clear global configuration. This is the lowest level of hardware reset which clears the global configuration lock and related state. It also clears any already downloaded DDP. Crucially, it does *not* clear the Tx scheduler topology configuration. Refactor ice_cfg_tx_topo() to always trigger a CORER after acquiring the global lock, regardless of success or failure of the topology configuration. We need to re-initialize the HW structure when we trigger the CORER. Thus, it makes sense for this to be the responsibility of ice_cfg_tx_topo() rather than its caller, ice_init_tx_topology(). This avoids needless re-initialization in cases where we don't attempt to update the Tx scheduler topology, such as if it has already been programmed. There is one catch: failure to re-initialize the HW struct should stop ice_probe(). If this function fails, we won't have a valid HW structure and cannot ensure the device is functioning properly. To handle this, ensure ice_cfg_tx_topo() returns a limited set of error codes. Set aside one specifically, -ENODEV, to indicate that the ice_init_tx_topology() should fail and stop probe. Other error codes indicate failure to apply the Tx scheduler topology. This is treated as a non-fatal error, with an informational message informing the system administrator that the updated Tx topology did not apply. This allows the device to load and function with the default Tx scheduler topology, rather than failing to load entirely. Note that this use of CORER will not result in loops with future PFs attempting to also load the invalid Tx topology configuration. The first PF will acquire the global configuration lock as part of programming the DDP. Each PF after this will attempt to acquire the global lock as part of programming the Tx topology, and will fail with the indication from firmware that global configuration is already complete. Tx scheduler topology configuration is only performed during driver init (probe or devlink reload) and not during cleanup for a CORER that happens after probe completes. Fixes: 91427e6d9030 ("ice: Support 5 layer topology") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25ice: fix NULL pointer dereference in ice_unplug_aux_dev() on resetEmil Tantilov
Issuing a reset when the driver is loaded without RDMA support, will results in a crash as it attempts to remove RDMA's non-existent auxbus device: echo 1 > /sys/class/net/<if>/device/reset BUG: kernel NULL pointer dereference, address: 0000000000000008 ... RIP: 0010:ice_unplug_aux_dev+0x29/0x70 [ice] ... Call Trace: <TASK> ice_prepare_for_reset+0x77/0x260 [ice] pci_dev_save_and_disable+0x2c/0x70 pci_reset_function+0x88/0x130 reset_store+0x5a/0xa0 kernfs_fop_write_iter+0x15e/0x210 vfs_write+0x273/0x520 ksys_write+0x6b/0xe0 do_syscall_64+0x79/0x3b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e ice_unplug_aux_dev() checks pf->cdev_info->adev for NULL pointer, but pf->cdev_info will also be NULL, leading to the deref in the trace above. Introduce a flag to be set when the creation of the auxbus device is successful, to avoid multiple NULL pointer checks in ice_unplug_aux_dev(). Fixes: c24a65b6a27c7 ("iidc/ice/irdma: Update IDC to support multiple consumers") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-25ice: add virt/ and move ice_virtchnl* files therePrzemek Kitszel
Introduce virt/ directory to collect virtchnl files. We are going to implement a few sizable extensions soon, each of them increasing virt/ size, so it looks sensible to introduce a new dir. Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-14ice: Implement support for SRIOV VFs across Active/Active bondsDave Ertman
This patch implements the software flows to handle SRIOV VF communication across an Active/Active link aggregate. The same restrictions apply as are in place for the support of Active/Backup bonds. - the two interfaces must be on the same NIC - the FW LLDP engine needs to be disabled - the DDP package that supports VF LAG must be loaded on device - the two interfaces must have the same QoS config - only the first interface added to the bond will have VF support - the interface with VFs must be in switchdev mode With the additional requirement of - the version of the FW on the NIC needs to have VF Active/Active support This requirement is indicated in the capabilities struct associated with the NVM loaded on the NIC. The balancing of traffic between the two interfaces is done on a queue basis. Taking the queues allocated to all of the VFs as a whole, one half of them will be distributed to each interface. When a link goes down, then the queues allocated to the down interface will migrate to the active port. When the down port comes back up, then the same queues as were originally assigned there will be moved back. Co-developed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-14ice: Cleanup variable initialization in LAG codeDave Ertman
In preparation for implementing SRIOV Active-Active LAG support, cleanup several unneeded variable initializations in declaration blocks. Also move a couple of variable initializations into declaration block that should be there. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-14ice: move LAG function in code to prepare for Active-ActiveDave Ertman
In the SRIOV LAG Active-Active, the functions ice_lag_cfg_pf_fltr's and ice_lag_config_eswitch's content are moved to earlier locations in the source file. Also, ice_lag_cfg_pf_fltr is renamed, and its flow is changed. To reduce the delta in the larger patch, move the original functions to their new location so that only functional changes are needed in the larger patch. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-14ice: Add driver specific prefix to LAG definesDave Ertman
A define in the LAG code is missing a driver specific prefix. Add a prefix to the define. Also shorten a defines name and move to a more logical place. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-14ice: replace u8 elements with bool where appropriateDave Ertman
In preparation for the new LAG functionality implementation, there are a couple of existing LAG elements of the capabilities struct that should be bool instead of u8. Since we are adding a new element to this struct that should also be a bool, fix the existing LAG u8 in this patch and eliminate !! operators where possible. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-14ice: Remove casts on void pointers in LAG codeDave Ertman
This series will be touching on the LAG code in the ice driver, to prevent moving or propagating casting on void pointers, clean them up first. This also allows for moving the variable initialization into the variable declaration. Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-08-13ice: Don't use %pK through printk or tracepointsThomas Weißschuh
In the past %pK was preferable to %p as it would not leak raw pointer values into the kernel log. Since commit ad67b74d2469 ("printk: hash addresses printed with %p") the regular %p has been improved to avoid this issue. Furthermore, restricted pointers ("%pK") were never meant to be used through printk(). They can still unintentionally leak raw pointers or acquire sleeping locks in atomic contexts. Switch to the regular pointer formatting which is safer and easier to reason about. There are still a few users of %pK left, but these use it through seq_file, for which its usage is safe. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Acked-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250811-restricted-pointers-net-v5-1-2e2fdc7d3f2c@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== libie: commonize adminq structure Michal Swiatkowski says: It is a prework to allow reusing some specific Intel code (eq. fwlog). Move common *_aq_desc structure to libie header and changing it in ice, ixgbe, i40e and iavf. Only generic adminq commands can be easily moved to common header, as rest is slightly different. Format remains the same. It will be better to correctly move it when it will be needed to commonize other part of the code. Move *_aq_str() to new libie module (libie_adminq) and use it across drivers. The functions are exactly the same in each driver. Some more adminq helpers/functions can be moved to libie_adminq when needed. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: i40e: use libie_aq_str iavf: use libie_aq_str ice: use libie_aq_str libie: add adminq helper for converting err to str iavf: use libie adminq descriptors i40e: use libie adminq descriptors ixgbe: use libie adminq descriptors ice, libie: move generic adminq descriptors to lib ==================== Link: https://patch.msgid.link/20250724182826.3758850-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-25net: Fix typosBjorn Helgaas
Fix typos in comments and error messages. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: David Arinzon <darinzon@amazon.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250723201528.2908218-1-helgaas@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc8). Conflicts: drivers/net/ethernet/microsoft/mana/gdma_main.c 9669ddda18fb ("net: mana: Fix warnings for missing export.h header inclusion") 755391121038 ("net: mana: Allocate MSI-X vectors dynamically") https://lore.kernel.org/20250711130752.23023d98@canb.auug.org.au Adjacent changes: drivers/net/ethernet/ti/icssg/icssg_prueth.h 6e86fb73de0f ("net: ti: icssg-prueth: Fix buffer allocation for ICSSG") ffe8a4909176 ("net: ti: icssg-prueth: Read firmware-names from device tree") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-24ice: use libie_aq_strMichal Swiatkowski
Simple: s/ice_aq_str/libie_aq_str Add libie_aminq module in ice Kconfig. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-24i40e: use libie adminq descriptorsMichal Swiatkowski
Use libie_aq_desc instead of i40e_aq_desc. Do needed changes to allow clean build. Get version descriptor is a little less detailed on i40e. To not mess up with shifting or union inside libie desc use get version descriptor from i40e. Move additional caps for i40e to libie. Fix RCT in declaration that is using libie_aq_desc; Use libie_aq_raw() wherever it can be used. The libie aq error is extended, cover it in ice driver just to clean build. In next patches the libie code for that will be used in each of intel driver. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-24ice, libie: move generic adminq descriptors to libMichal Swiatkowski
The descriptor structure is the same in ice, ixgbe and i40e. Move it to common libie header to use it across different driver. Leave device specific adminq commands in separate folders. This lead to a change that need to be done in filling/getting descriptor: - previous: struct specific_desc *cmd; cmd = &desc.params.specific_desc; - now: struct specific_desc *cmd; cmd = libie_aq_raw(&desc); Do this changes across the driver to allow clean build. The casting only have to be done in case of specific descriptors, for generic one union can still be used. Changes beside code moving: - change ICE_ prefix to LIBIE_ prefix (ice_ and libie_ too) - remove shift variables not otherwise needed (in libie_aq_flags) - fill/get descriptor data based on desc.params.raw whenever the descriptor isn't defined in libie - move defines from the libie_aq_sth structure outside - add libie_aq_raw helper and use it instead of explicit casting Reviewed by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-21ice: Fix a null pointer dereference in ice_copy_and_init_pkg()Haoxiang Li
Add check for the return value of devm_kmemdup() to prevent potential null pointer dereference. Fixes: c76488109616 ("ice: Implement Dynamic Device Personalization (DDP) download") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-18ice: breakout common LAG code into helpersDave Ertman
In the VF handling code, parts of the code for lag can be broken out into helper functions to reduce code duplication. Break this code out into helper functions Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-18ice: convert ice_add_prof() to bitmapJesse Brandeburg
Previously the ice_add_prof() took an array of u8 and looped over it with for_each_set_bit(), examining each 8 bit value as a bitmap. This was just hard to understand and unnecessary, and was triggering undefined behavior sanitizers with unaligned accesses within bitmap fields (on our internal tools/builds). Since the @ptype being passed in was already declared as a bitmap, refactor this to use native types with the advantage of simplifying the code to use a single loop. Co-developed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> CC: Jesse Brandeburg <jbrandeburg@cloudflare.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-18ice: add E835 device IDsDawid Osuchowski
E835 is an enhanced version of the E830. It continues to use the same set of commands, registers and interfaces as other devices in the 800 Series. Following device IDs are added: - 0x1248: Intel(R) Ethernet Controller E835-CC for backplane - 0x1249: Intel(R) Ethernet Controller E835-CC for QSFP - 0x124A: Intel(R) Ethernet Controller E835-CC for SFP - 0x1261: Intel(R) Ethernet Controller E835-C for backplane - 0x1262: Intel(R) Ethernet Controller E835-C for QSFP - 0x1263: Intel(R) Ethernet Controller E835-C for SFP - 0x1265: Intel(R) Ethernet Controller E835-L for backplane - 0x1266: Intel(R) Ethernet Controller E835-L for QSFP - 0x1267: Intel(R) Ethernet Controller E835-L for SFP Reviewed-by: Konrad Knitter <konrad.knitter@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-18ice: add 40G speed to Admin Command GET PORT OPTIONAleksandr Loktionov
Introduce the ICE_AQC_PORT_OPT_MAX_LANE_40G constant and update the code to process this new option in both the devlink and the Admin Queue Command GET PORT OPTION (opcode 0x06EA) message, similar to existing constants like ICE_AQC_PORT_OPT_MAX_LANE_50G, ICE_AQC_PORT_OPT_MAX_LANE_100G, and so on. This feature allows the driver to correctly report configuration options for 2x40G on E823 and other cards in the future via devlink. Example command: devlink port split pci/0000:01:00.0/0 count 2 Example dmesg: ice 0000:01:00.0: Available port split options and max port speeds (Gbps): ice 0000:01:00.0: Status Split Quad 0 Quad 1 ice 0000:01:00.0: count L0 L1 L2 L3 L4 L5 L6 L7 ice 0000:01:00.0: 2 40 - - - 40 - - - ice 0000:01:00.0: 2 50 - 50 - - - - - ice 0000:01:00.0: 4 25 25 25 25 - - - - ice 0000:01:00.0: 4 25 25 - - 25 25 - - ice 0000:01:00.0: Active 8 10 10 10 10 10 10 10 10 ice 0000:01:00.0: 1 100 - - - - - - - Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc7). Conflicts: Documentation/netlink/specs/ovpn.yaml 880d43ca9aa4 ("netlink: specs: clean up spaces in brackets") af52020fc599 ("ovpn: reject unexpected netlink attributes") drivers/net/phy/phy_device.c a44312d58e78 ("net: phy: Don't register LEDs for genphy") f0f2b992d818 ("net: phy: Don't register LEDs for genphy") https://lore.kernel.org/20250710114926.7ec3a64f@kernel.org drivers/net/wireless/intel/iwlwifi/fw/regulatory.c drivers/net/wireless/intel/iwlwifi/mld/regulatory.c 5fde0fcbd760 ("wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap") ea045a0de3b9 ("wifi: iwlwifi: add support for accepting raw DSM tables by firmware") net/ipv6/mcast.c ae3264a25a46 ("ipv6: mcast: Delay put pmc->idev in mld_del_delrec()") a8594c956cc9 ("ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec()") https://lore.kernel.org/8cc52891-3653-4b03-a45e-05464fe495cf@kernel.org No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-15ice: check correct pointer in fwlog debugfsMichal Swiatkowski
pf->ice_debugfs_pf_fwlog should be checked for an error here. Fixes: 96a9a9341cda ("ice: configure FW logging") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-07-15ice: add NULL check in eswitch lag checkDave Ertman
The function ice_lag_is_switchdev_running() is being called from outside of the LAG event handler code. This results in the lag->upper_netdev being NULL sometimes. To avoid a NULL-pointer dereference, there needs to be a check before it is dereferenced. Fixes: 776fe19953b0 ("ice: block default rule setting on LAG interface") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>