diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-03-26 21:48:21 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-03-26 21:48:21 -0700 |
| commit | 1a9239bb4253f9076b5b4b2a1a4e8d7defd77a95 (patch) | |
| tree | 286dda5e84757594218e684b94b01b3a3cac15a2 /drivers/net/virtio_net.c | |
| parent | e61f33273ca755b3e2ebee4520a76097199dc7a8 (diff) | |
| parent | 023b1e9d265ca0662111a9df23d22b4632717a8a (diff) | |
Merge tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Continue Netlink conversions to per-namespace RTNL lock
(IPv4 routing, routing rules, routing next hops, ARP ioctls)
- Continue extending the use of netdev instance locks. As a driver
opt-in protect queue operations and (in due course) ethtool
operations with the instance lock and not RTNL lock.
- Support collecting TCP timestamps (data submitted, sent, acked) in
BPF, allowing for transparent (to the application) and lower
overhead tracking of TCP RPC performance.
- Tweak existing networking Rx zero-copy infra to support zero-copy
Rx via io_uring.
- Optimize MPTCP performance in single subflow mode by 29%.
- Enable GRO on packets which went thru XDP CPU redirect (were queued
for processing on a different CPU). Improving TCP stream
performance up to 2x.
- Improve performance of contended connect() by 200% by searching for
an available 4-tuple under RCU rather than a spin lock. Bring an
additional 229% improvement by tweaking hash distribution.
- Avoid unconditionally touching sk_tsflags on RX, improving
performance under UDP flood by as much as 10%.
- Avoid skb_clone() dance in ping_rcv() to improve performance under
ping flood.
- Avoid FIB lookup in netfilter if socket is available, 20% perf win.
- Rework network device creation (in-kernel) API to more clearly
identify network namespaces and their roles. There are up to 4
namespace roles but we used to have just 2 netns pointer arguments,
interpreted differently based on context.
- Use sysfs_break_active_protection() instead of trylock to avoid
deadlocks between unregistering objects and sysfs access.
- Add a new sysctl and sockopt for capping max retransmit timeout in
TCP.
- Support masking port and DSCP in routing rule matches.
- Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.
- Support specifying at what time packet should be sent on AF_XDP
sockets.
- Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin
users.
- Add Netlink YAML spec for WiFi (nl80211) and conntrack.
- Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
which only need to be exported when IPv6 support is built as a
module.
- Age FDB entries based on Rx not Tx traffic in VxLAN, similar to
normal bridging.
- Allow users to specify source port range for GENEVE tunnels.
- netconsole: allow attaching kernel release, CPU ID and task name to
messages as metadata
Driver API:
- Continue rework / fixing of Energy Efficient Ethernet (EEE) across
the SW layers. Delegate the responsibilities to phylink where
possible. Improve its handling in phylib.
- Support symmetric OR-XOR RSS hashing algorithm.
- Support tracking and preserving IRQ affinity by NAPI itself.
- Support loopback mode speed selection for interface selftests.
Device drivers:
- Remove the IBM LCS driver for s390
- Remove the sb1000 cable modem driver
- Add support for SFP module access over SMBus
- Add MCTP transport driver for MCTP-over-USB
- Enable XDP metadata support in multiple drivers
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- add PCIe TLP Processing Hints (TPH) support for new AMD
platforms
- support dumping RoCE queue state for debug
- opt into instance locking
- Intel (100G, ice, idpf):
- ice: rework MSI-X IRQ management and distribution
- ice: support for E830 devices
- iavf: add support for Rx timestamping
- iavf: opt into instance locking
- nVidia/Mellanox:
- mlx4: use page pool memory allocator for Rx
- mlx5: support for one PTP device per hardware clock
- mlx5: support for 200Gbps per-lane link modes
- mlx5: move IPSec policy check after decryption
- AMD/Solarflare:
- support FW flashing via devlink
- Cisco (enic):
- use page pool memory allocator for Rx
- enable 32, 64 byte CQEs
- get max rx/tx ring size from the device
- Meta (fbnic):
- support flow steering and RSS configuration
- report queue stats
- support TCP segmentation
- support IRQ coalescing
- support ring size configuration
- Marvell/Cavium:
- support AF_XDP
- Wangxun:
- support for PTP clock and timestamping
- Huawei (hibmcge):
- checksum offload
- add more statistics
- Ethernet virtual:
- VirtIO net:
- aggressively suppress Tx completions, improve perf by 96%
with 1 CPU and 55% with 2 CPUs
- expose NAPI to IRQ mapping and persist NAPI settings
- Google (gve):
- support XDP in DQO RDA Queue Format
- opt into instance locking
- Microsoft vNIC:
- support BIG TCP
- Ethernet NICs consumer, and embedded:
- Synopsys (stmmac):
- cleanup Tx and Tx clock setting and other link-focused
cleanups
- enable SGMII and 2500BASEX mode switching for Intel platforms
- support Sophgo SG2044
- Broadcom switches (b53):
- support for BCM53101
- TI:
- iep: add perout configuration support
- icssg: support XDP
- Cadence (macb):
- implement BQL
- Xilinx (axinet):
- support dynamic IRQ moderation and changing coalescing at
runtime
- implement BQL
- report standard stats
- MediaTek:
- support phylink managed EEE
- Intel:
- igc: don't restart the interface on every XDP program change
- RealTek (r8169):
- support reading registers of internal PHYs directly
- increase max jumbo packet size on RTL8125/RTL8126
- Airoha:
- support for RISC-V NPU packet processing unit
- enable scatter-gather and support MTU up to 9kB
- Tehuti (tn40xx):
- support cards with TN4010 MAC and an Aquantia AQR105 PHY
- Ethernet PHYs:
- support for TJA1102S, TJA1121
- dp83tg720: add randomized polling intervals for link detection
- dp83822: support changing the transmit amplitude voltage
- support for LEDs on 88q2xxx
- CAN:
- canxl: support Remote Request Substitution bit access
- flexcan: add S32G2/S32G3 SoC
- WiFi:
- remove cooked monitor support
- strict mode for better AP testing
- basic EPCS support
- OMI RX bandwidth reduction support
- batman-adv: add support for jumbo frames
- WiFi drivers:
- RealTek (rtw88):
- support RTL8814AE and RTL8814AU
- RealTek (rtw89):
- switch using wiphy_lock and wiphy_work
- add BB context to manipulate two PHY as preparation of MLO
- improve BT-coexistence mechanism to play A2DP smoothly
- Intel (iwlwifi):
- add new iwlmld sub-driver for latest HW/FW combinations
- MediaTek (mt76):
- preparation for mt7996 Multi-Link Operation (MLO) support
- Qualcomm/Atheros (ath12k):
- continued work on MLO
- Silabs (wfx):
- Wake-on-WLAN support
- Bluetooth:
- add support for skb TX SND/COMPLETION timestamping
- hci_core: enable buffer flow control for SCO/eSCO
- coredump: log devcd dumps into the monitor
- Bluetooth drivers:
- intel: add support to configure TX power
- nxp: handle bootloader error during cmd5 and cmd7"
* tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits)
unix: fix up for "apparmor: add fine grained af_unix mediation"
mctp: Fix incorrect tx flow invalidation condition in mctp-i2c
net: usb: asix: ax88772: Increase phy_name size
net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string
net: libwx: fix Tx L4 checksum
net: libwx: fix Tx descriptor content for some tunnel packets
atm: Fix NULL pointer dereference
net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards
net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card
net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus
net: phy: aquantia: add essential functions to aqr105 driver
net: phy: aquantia: search for firmware-name in fwnode
net: phy: aquantia: add probe function to aqr105 for firmware loading
net: phy: Add swnode support to mdiobus_scan
gve: add XDP DROP and PASS support for DQ
gve: update XDP allocation path support RX buffer posting
gve: merge packet buffer size fields
gve: update GQ RX to use buf_size
gve: introduce config-based allocation for XDP
gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics
...
Diffstat (limited to 'drivers/net/virtio_net.c')
| -rw-r--r-- | drivers/net/virtio_net.c | 265 |
1 files changed, 149 insertions, 116 deletions
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 9d7c37e968b5..7e4617216a4b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -360,24 +360,7 @@ struct receive_queue { struct xdp_buff **xsk_buffs; }; -/* This structure can contain rss message with maximum settings for indirection table and keysize - * Note, that default structure that describes RSS configuration virtio_net_rss_config - * contains same info but can't handle table values. - * In any case, structure would be passed to virtio hw through sg_buf split by parts - * because table sizes may be differ according to the device configuration. - */ #define VIRTIO_NET_RSS_MAX_KEY_SIZE 40 -struct virtio_net_ctrl_rss { - u32 hash_types; - u16 indirection_table_mask; - u16 unclassified_queue; - u16 hash_cfg_reserved; /* for HASH_CONFIG (see virtio_net_hash_config for details) */ - u16 max_tx_vq; - u8 hash_key_length; - u8 key[VIRTIO_NET_RSS_MAX_KEY_SIZE]; - - u16 *indirection_table; -}; /* Control VQ buffers: protected by the rtnl lock */ struct control_buf { @@ -421,7 +404,9 @@ struct virtnet_info { u16 rss_indir_table_size; u32 rss_hash_types_supported; u32 rss_hash_types_saved; - struct virtio_net_ctrl_rss rss; + struct virtio_net_rss_config_hdr *rss_hdr; + struct virtio_net_rss_config_trailer rss_trailer; + u8 rss_hash_key_data[VIRTIO_NET_RSS_MAX_KEY_SIZE]; /* Has control virtqueue */ bool has_cvq; @@ -523,23 +508,16 @@ enum virtnet_xmit_type { VIRTNET_XMIT_TYPE_XSK, }; -static int rss_indirection_table_alloc(struct virtio_net_ctrl_rss *rss, u16 indir_table_size) +static size_t virtnet_rss_hdr_size(const struct virtnet_info *vi) { - if (!indir_table_size) { - rss->indirection_table = NULL; - return 0; - } + u16 indir_table_size = vi->has_rss ? vi->rss_indir_table_size : 1; - rss->indirection_table = kmalloc_array(indir_table_size, sizeof(u16), GFP_KERNEL); - if (!rss->indirection_table) - return -ENOMEM; - - return 0; + return struct_size(vi->rss_hdr, indirection_table, indir_table_size); } -static void rss_indirection_table_free(struct virtio_net_ctrl_rss *rss) +static size_t virtnet_rss_trailer_size(const struct virtnet_info *vi) { - kfree(rss->indirection_table); + return struct_size(&vi->rss_trailer, hash_key_data, vi->rss_key_size); } /* We use the last two bits of the pointer to distinguish the xmit type. */ @@ -1088,11 +1066,10 @@ static bool is_xdp_raw_buffer_queue(struct virtnet_info *vi, int q) return false; } -static void check_sq_full_and_disable(struct virtnet_info *vi, - struct net_device *dev, - struct send_queue *sq) +static bool tx_may_stop(struct virtnet_info *vi, + struct net_device *dev, + struct send_queue *sq) { - bool use_napi = sq->napi.weight; int qnum; qnum = sq - vi->sq; @@ -1114,6 +1091,25 @@ static void check_sq_full_and_disable(struct virtnet_info *vi, u64_stats_update_begin(&sq->stats.syncp); u64_stats_inc(&sq->stats.stop); u64_stats_update_end(&sq->stats.syncp); + + return true; + } + + return false; +} + +static void check_sq_full_and_disable(struct virtnet_info *vi, + struct net_device *dev, + struct send_queue *sq) +{ + bool use_napi = sq->napi.weight; + int qnum; + + qnum = sq - vi->sq; + + if (tx_may_stop(vi, dev, sq)) { + struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum); + if (use_napi) { if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) virtqueue_napi_schedule(&sq->napi, sq->vq); @@ -2789,7 +2785,8 @@ static void skb_recv_done(struct virtqueue *rvq) virtqueue_napi_schedule(&rq->napi, rvq); } -static void virtnet_napi_enable(struct virtqueue *vq, struct napi_struct *napi) +static void virtnet_napi_do_enable(struct virtqueue *vq, + struct napi_struct *napi) { napi_enable(napi); @@ -2802,10 +2799,21 @@ static void virtnet_napi_enable(struct virtqueue *vq, struct napi_struct *napi) local_bh_enable(); } -static void virtnet_napi_tx_enable(struct virtnet_info *vi, - struct virtqueue *vq, - struct napi_struct *napi) +static void virtnet_napi_enable(struct receive_queue *rq) +{ + struct virtnet_info *vi = rq->vq->vdev->priv; + int qidx = vq2rxq(rq->vq); + + virtnet_napi_do_enable(rq->vq, &rq->napi); + netif_queue_set_napi(vi->dev, qidx, NETDEV_QUEUE_TYPE_RX, &rq->napi); +} + +static void virtnet_napi_tx_enable(struct send_queue *sq) { + struct virtnet_info *vi = sq->vq->vdev->priv; + struct napi_struct *napi = &sq->napi; + int qidx = vq2txq(sq->vq); + if (!napi->weight) return; @@ -2817,13 +2825,30 @@ static void virtnet_napi_tx_enable(struct virtnet_info *vi, return; } - return virtnet_napi_enable(vq, napi); + virtnet_napi_do_enable(sq->vq, napi); + netif_queue_set_napi(vi->dev, qidx, NETDEV_QUEUE_TYPE_TX, napi); } -static void virtnet_napi_tx_disable(struct napi_struct *napi) +static void virtnet_napi_tx_disable(struct send_queue *sq) { - if (napi->weight) + struct virtnet_info *vi = sq->vq->vdev->priv; + struct napi_struct *napi = &sq->napi; + int qidx = vq2txq(sq->vq); + + if (napi->weight) { + netif_queue_set_napi(vi->dev, qidx, NETDEV_QUEUE_TYPE_TX, NULL); napi_disable(napi); + } +} + +static void virtnet_napi_disable(struct receive_queue *rq) +{ + struct virtnet_info *vi = rq->vq->vdev->priv; + struct napi_struct *napi = &rq->napi; + int qidx = vq2rxq(rq->vq); + + netif_queue_set_napi(vi->dev, qidx, NETDEV_QUEUE_TYPE_RX, NULL); + napi_disable(napi); } static void refill_work(struct work_struct *work) @@ -2836,9 +2861,23 @@ static void refill_work(struct work_struct *work) for (i = 0; i < vi->curr_queue_pairs; i++) { struct receive_queue *rq = &vi->rq[i]; + /* + * When queue API support is added in the future and the call + * below becomes napi_disable_locked, this driver will need to + * be refactored. + * + * One possible solution would be to: + * - cancel refill_work with cancel_delayed_work (note: + * non-sync) + * - cancel refill_work with cancel_delayed_work_sync in + * virtnet_remove after the netdev is unregistered + * - wrap all of the work in a lock (perhaps the netdev + * instance lock) + * - check netif_running() and return early to avoid a race + */ napi_disable(&rq->napi); still_empty = !try_fill_recv(vi, rq, GFP_KERNEL); - virtnet_napi_enable(rq->vq, &rq->napi); + virtnet_napi_do_enable(rq->vq, &rq->napi); /* In theory, this can happen: if we don't get any buffers in * we will *never* try to fill again. @@ -3035,8 +3074,8 @@ static int virtnet_poll(struct napi_struct *napi, int budget) static void virtnet_disable_queue_pair(struct virtnet_info *vi, int qp_index) { - virtnet_napi_tx_disable(&vi->sq[qp_index].napi); - napi_disable(&vi->rq[qp_index].napi); + virtnet_napi_tx_disable(&vi->sq[qp_index]); + virtnet_napi_disable(&vi->rq[qp_index]); xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq); } @@ -3055,8 +3094,8 @@ static int virtnet_enable_queue_pair(struct virtnet_info *vi, int qp_index) if (err < 0) goto err_xdp_reg_mem_model; - virtnet_napi_enable(vi->rq[qp_index].vq, &vi->rq[qp_index].napi); - virtnet_napi_tx_enable(vi, vi->sq[qp_index].vq, &vi->sq[qp_index].napi); + virtnet_napi_enable(&vi->rq[qp_index]); + virtnet_napi_tx_enable(&vi->sq[qp_index]); return 0; @@ -3253,15 +3292,10 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) bool use_napi = sq->napi.weight; bool kick; - /* Free up any pending old buffers before queueing new ones. */ - do { - if (use_napi) - virtqueue_disable_cb(sq->vq); - + if (!use_napi) free_old_xmit(sq, txq, false); - - } while (use_napi && !xmit_more && - unlikely(!virtqueue_enable_cb_delayed(sq->vq))); + else + virtqueue_disable_cb(sq->vq); /* timestamp packet in software */ skb_tx_timestamp(skb); @@ -3287,7 +3321,10 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) nf_reset_ct(skb); } - check_sq_full_and_disable(vi, dev, sq); + if (use_napi) + tx_may_stop(vi, dev, sq); + else + check_sq_full_and_disable(vi, dev,sq); kick = use_napi ? __netdev_tx_sent_queue(txq, skb->len, xmit_more) : !xmit_more || netif_xmit_stopped(txq); @@ -3299,6 +3336,9 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) } } + if (use_napi && kick && unlikely(!virtqueue_enable_cb_delayed(sq->vq))) + virtqueue_napi_schedule(&sq->napi, sq->vq); + return NETDEV_TX_OK; } @@ -3307,7 +3347,7 @@ static void virtnet_rx_pause(struct virtnet_info *vi, struct receive_queue *rq) bool running = netif_running(vi->dev); if (running) { - napi_disable(&rq->napi); + virtnet_napi_disable(rq); virtnet_cancel_dim(vi, &rq->dim); } } @@ -3320,7 +3360,7 @@ static void virtnet_rx_resume(struct virtnet_info *vi, struct receive_queue *rq) schedule_delayed_work(&vi->refill, 0); if (running) - virtnet_napi_enable(rq->vq, &rq->napi); + virtnet_napi_enable(rq); } static int virtnet_rx_resize(struct virtnet_info *vi, @@ -3349,7 +3389,7 @@ static void virtnet_tx_pause(struct virtnet_info *vi, struct send_queue *sq) qindex = sq - vi->sq; if (running) - virtnet_napi_tx_disable(&sq->napi); + virtnet_napi_tx_disable(sq); txq = netdev_get_tx_queue(vi->dev, qindex); @@ -3383,7 +3423,7 @@ static void virtnet_tx_resume(struct virtnet_info *vi, struct send_queue *sq) __netif_tx_unlock_bh(txq); if (running) - virtnet_napi_tx_enable(vi, sq->vq, &sq->napi); + virtnet_napi_tx_enable(sq); } static int virtnet_tx_resize(struct virtnet_info *vi, struct send_queue *sq, @@ -3576,15 +3616,16 @@ static void virtnet_rss_update_by_qpairs(struct virtnet_info *vi, u16 queue_pair for (; i < vi->rss_indir_table_size; ++i) { indir_val = ethtool_rxfh_indir_default(i, queue_pairs); - vi->rss.indirection_table[i] = indir_val; + vi->rss_hdr->indirection_table[i] = cpu_to_le16(indir_val); } - vi->rss.max_tx_vq = queue_pairs; + vi->rss_trailer.max_tx_vq = cpu_to_le16(queue_pairs); } static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) { struct virtio_net_ctrl_mq *mq __free(kfree) = NULL; - struct virtio_net_ctrl_rss old_rss; + struct virtio_net_rss_config_hdr *old_rss_hdr; + struct virtio_net_rss_config_trailer old_rss_trailer; struct net_device *dev = vi->dev; struct scatterlist sg; @@ -3599,24 +3640,28 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs) * update (VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET below) and return directly. */ if (vi->has_rss && !netif_is_rxfh_configured(dev)) { - memcpy(&old_rss, &vi->rss, sizeof(old_rss)); - if (rss_indirection_table_alloc(&vi->rss, vi->rss_indir_table_size)) { - vi->rss.indirection_table = old_rss.indirection_table; + old_rss_hdr = vi->rss_hdr; + old_rss_trailer = vi->rss_trailer; + vi->rss_hdr = devm_kzalloc(&dev->dev, virtnet_rss_hdr_size(vi), GFP_KERNEL); + if (!vi->rss_hdr) { + vi->rss_hdr = old_rss_hdr; return -ENOMEM; } + *vi->rss_hdr = *old_rss_hdr; virtnet_rss_update_by_qpairs(vi, queue_pairs); if (!virtnet_commit_rss_command(vi)) { /* restore ctrl_rss if commit_rss_command failed */ - rss_indirection_table_free(&vi->rss); - memcpy(&vi->rss, &old_rss, sizeof(old_rss)); + devm_kfree(&dev->dev, vi->rss_hdr); + vi->rss_hdr = old_rss_hdr; + vi->rss_trailer = old_rss_trailer; dev_warn(&dev->dev, "Fail to set num of queue pairs to %d, because committing RSS failed\n", queue_pairs); return -EINVAL; } - rss_indirection_table_free(&old_rss); + devm_kfree(&dev->dev, old_rss_hdr); goto succ; } @@ -4061,28 +4106,12 @@ static int virtnet_set_ringparam(struct net_device *dev, static bool virtnet_commit_rss_command(struct virtnet_info *vi) { struct net_device *dev = vi->dev; - struct scatterlist sgs[4]; - unsigned int sg_buf_size; + struct scatterlist sgs[2]; /* prepare sgs */ - sg_init_table(sgs, 4); - - sg_buf_size = offsetof(struct virtio_net_ctrl_rss, hash_cfg_reserved); - sg_set_buf(&sgs[0], &vi->rss, sg_buf_size); - - if (vi->has_rss) { - sg_buf_size = sizeof(uint16_t) * vi->rss_indir_table_size; - sg_set_buf(&sgs[1], vi->rss.indirection_table, sg_buf_size); - } else { - sg_set_buf(&sgs[1], &vi->rss.hash_cfg_reserved, sizeof(uint16_t)); - } - - sg_buf_size = offsetof(struct virtio_net_ctrl_rss, key) - - offsetof(struct virtio_net_ctrl_rss, max_tx_vq); - sg_set_buf(&sgs[2], &vi->rss.max_tx_vq, sg_buf_size); - - sg_buf_size = vi->rss_key_size; - sg_set_buf(&sgs[3], vi->rss.key, sg_buf_size); + sg_init_table(sgs, 2); + sg_set_buf(&sgs[0], vi->rss_hdr, virtnet_rss_hdr_size(vi)); + sg_set_buf(&sgs[1], &vi->rss_trailer, virtnet_rss_trailer_size(vi)); if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ, vi->has_rss ? VIRTIO_NET_CTRL_MQ_RSS_CONFIG @@ -4099,17 +4128,17 @@ err: static void virtnet_init_default_rss(struct virtnet_info *vi) { - vi->rss.hash_types = vi->rss_hash_types_supported; + vi->rss_hdr->hash_types = cpu_to_le32(vi->rss_hash_types_supported); vi->rss_hash_types_saved = vi->rss_hash_types_supported; - vi->rss.indirection_table_mask = vi->rss_indir_table_size - ? vi->rss_indir_table_size - 1 : 0; - vi->rss.unclassified_queue = 0; + vi->rss_hdr->indirection_table_mask = vi->rss_indir_table_size + ? cpu_to_le16(vi->rss_indir_table_size - 1) : 0; + vi->rss_hdr->unclassified_queue = 0; virtnet_rss_update_by_qpairs(vi, vi->curr_queue_pairs); - vi->rss.hash_key_length = vi->rss_key_size; + vi->rss_trailer.hash_key_length = vi->rss_key_size; - netdev_rss_key_fill(vi->rss.key, vi->rss_key_size); + netdev_rss_key_fill(vi->rss_hash_key_data, vi->rss_key_size); } static void virtnet_get_hashflow(const struct virtnet_info *vi, struct ethtool_rxnfc *info) @@ -4220,7 +4249,7 @@ static bool virtnet_set_hashflow(struct virtnet_info *vi, struct ethtool_rxnfc * if (new_hashtypes != vi->rss_hash_types_saved) { vi->rss_hash_types_saved = new_hashtypes; - vi->rss.hash_types = vi->rss_hash_types_saved; + vi->rss_hdr->hash_types = cpu_to_le32(vi->rss_hash_types_saved); if (vi->dev->features & NETIF_F_RXHASH) return virtnet_commit_rss_command(vi); } @@ -5400,11 +5429,11 @@ static int virtnet_get_rxfh(struct net_device *dev, if (rxfh->indir) { for (i = 0; i < vi->rss_indir_table_size; ++i) - rxfh->indir[i] = vi->rss.indirection_table[i]; + rxfh->indir[i] = le16_to_cpu(vi->rss_hdr->indirection_table[i]); } if (rxfh->key) - memcpy(rxfh->key, vi->rss.key, vi->rss_key_size); + memcpy(rxfh->key, vi->rss_hash_key_data, vi->rss_key_size); rxfh->hfunc = ETH_RSS_HASH_TOP; @@ -5428,7 +5457,7 @@ static int virtnet_set_rxfh(struct net_device *dev, return -EOPNOTSUPP; for (i = 0; i < vi->rss_indir_table_size; ++i) - vi->rss.indirection_table[i] = rxfh->indir[i]; + vi->rss_hdr->indirection_table[i] = cpu_to_le16(rxfh->indir[i]); update = true; } @@ -5440,7 +5469,7 @@ static int virtnet_set_rxfh(struct net_device *dev, if (!vi->has_rss && !vi->has_rss_hash_report) return -EOPNOTSUPP; - memcpy(vi->rss.key, rxfh->key, vi->rss_key_size); + memcpy(vi->rss_hash_key_data, rxfh->key, vi->rss_key_size); update = true; } @@ -5617,8 +5646,11 @@ static void virtnet_freeze_down(struct virtio_device *vdev) netif_tx_lock_bh(vi->dev); netif_device_detach(vi->dev); netif_tx_unlock_bh(vi->dev); - if (netif_running(vi->dev)) + if (netif_running(vi->dev)) { + rtnl_lock(); virtnet_close(vi->dev); + rtnl_unlock(); + } } static int init_vqs(struct virtnet_info *vi); @@ -5638,7 +5670,9 @@ static int virtnet_restore_up(struct virtio_device *vdev) enable_rx_mode_work(vi); if (netif_running(vi->dev)) { + rtnl_lock(); err = virtnet_open(vi->dev); + rtnl_unlock(); if (err) return err; } @@ -5928,8 +5962,8 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, /* Make sure NAPI is not using any XDP TX queues for RX. */ if (netif_running(dev)) { for (i = 0; i < vi->max_queue_pairs; i++) { - napi_disable(&vi->rq[i].napi); - virtnet_napi_tx_disable(&vi->sq[i].napi); + virtnet_napi_disable(&vi->rq[i]); + virtnet_napi_tx_disable(&vi->sq[i]); } } @@ -5966,9 +6000,8 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, if (old_prog) bpf_prog_put(old_prog); if (netif_running(dev)) { - virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi); - virtnet_napi_tx_enable(vi, vi->sq[i].vq, - &vi->sq[i].napi); + virtnet_napi_enable(&vi->rq[i]); + virtnet_napi_tx_enable(&vi->sq[i]); } } @@ -5983,9 +6016,8 @@ err: if (netif_running(dev)) { for (i = 0; i < vi->max_queue_pairs; i++) { - virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi); - virtnet_napi_tx_enable(vi, vi->sq[i].vq, - &vi->sq[i].napi); + virtnet_napi_enable(&vi->rq[i]); + virtnet_napi_tx_enable(&vi->sq[i]); } } if (prog) @@ -6046,9 +6078,9 @@ static int virtnet_set_features(struct net_device *dev, if ((dev->features ^ features) & NETIF_F_RXHASH) { if (features & NETIF_F_RXHASH) - vi->rss.hash_types = vi->rss_hash_types_saved; + vi->rss_hdr->hash_types = cpu_to_le32(vi->rss_hash_types_saved); else - vi->rss.hash_types = VIRTIO_NET_HASH_REPORT_NONE; + vi->rss_hdr->hash_types = cpu_to_le32(VIRTIO_NET_HASH_REPORT_NONE); if (!virtnet_commit_rss_command(vi)) return -EINVAL; @@ -6392,8 +6424,9 @@ static int virtnet_alloc_queues(struct virtnet_info *vi) INIT_DELAYED_WORK(&vi->refill, refill_work); for (i = 0; i < vi->max_queue_pairs; i++) { vi->rq[i].pages = NULL; - netif_napi_add_weight(vi->dev, &vi->rq[i].napi, virtnet_poll, - napi_weight); + netif_napi_add_config(vi->dev, &vi->rq[i].napi, virtnet_poll, + i); + vi->rq[i].napi.weight = napi_weight; netif_napi_add_tx_weight(vi->dev, &vi->sq[i].napi, virtnet_poll_tx, napi_tx ? napi_weight : 0); @@ -6737,9 +6770,11 @@ static int virtnet_probe(struct virtio_device *vdev) virtio_cread16(vdev, offsetof(struct virtio_net_config, rss_max_indirection_table_length)); } - err = rss_indirection_table_alloc(&vi->rss, vi->rss_indir_table_size); - if (err) + vi->rss_hdr = devm_kzalloc(&vdev->dev, virtnet_rss_hdr_size(vi), GFP_KERNEL); + if (!vi->rss_hdr) { + err = -ENOMEM; goto free; + } if (vi->has_rss || vi->has_rss_hash_report) { vi->rss_key_size = @@ -7018,8 +7053,6 @@ static void virtnet_remove(struct virtio_device *vdev) remove_vq_common(vi); - rss_indirection_table_free(&vi->rss); - free_netdev(vi->dev); } |