2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

44 Commits

Author SHA1 Message Date
Maciej Fijalkowski
617f3e1b58 ice: xsk: allocate separate memory for XDP SW ring
Currently, the zero-copy data path is reusing the memory region that was
initially allocated for an array of struct ice_rx_buf for its own
purposes. This is error prone as it is based on the ice_rx_buf struct
always being the same size or bigger than what the zero-copy path needs.
There can also be old values present in that array giving rise to errors
when the zero-copy path uses it.

Fix this by freeing the ice_rx_buf region and allocating a new array for
the zero-copy path that has the right length and is initialized to zero.

Fixes: 57f7f8b6bc ("ice: Use xdp_buf instead of rx_buf for xsk zero-copy")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17 11:09:04 -08:00
Brett Creeley
b385cca473 ice: Fix not stopping Tx queues for VFs
When a VF is removed and/or reset its Tx queues need to be
stopped from the PF. This is done by calling the ice_dis_vf_qs()
function, which calls ice_vsi_stop_lan_tx_rings(). Currently
ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA.
Unfortunately, this is causing the Tx queues to not be disabled in some
cases and when the VF tries to re-enable/reconfigure its Tx queues over
virtchnl the op is failing. This is because a VF can be reset and/or
removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues
were already configured via ice_vsi_cfg_single_txq() in the
VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit
is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always
happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op.

This was causing the following error message when loading the ice
driver, creating VFs, and modifying VF trust in an endless loop:

[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6

Fix this by always calling ice_dis_vf_qs() and silencing the error
message in ice_vsi_stop_tx_ring() since the calling code ignores the
return anyway. Also, all other places that call ice_vsi_stop_tx_ring()
catch the error, so this doesn't affect those flows since there was no
change to the values the function returns.

Other solutions were considered (i.e. tracking which VF queues had been
"started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES, but it seemed
more complicated than it was worth. This solution also brings in the
chance for other unexpected conditions due to invalid state bit checks.
So, the proposed solution seemed like the best option since there is no
harm in failing to stop Tx queues that were never started.

This issue can be seen using the following commands:

for i in {0..50}; do
        rmmod ice
        modprobe ice

        sleep 1

        echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs

        ip link set ens785f1 vf 0 trust on
        ip link set ens785f0 vf 0 trust on

        sleep 2

        echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
        sleep 1
        echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs

        ip link set ens785f1 vf 0 trust on
        ip link set ens785f0 vf 0 trust on
done

Fixes: 77ca27c417 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-03 08:16:08 -07:00
Kiran Patil
0754d65bd4 ice: Add infrastructure for mqprio support via ndo_setup_tc
Add infrastructure required for "ndo_setup_tc:qdisc_mqprio".
ice_vsi_setup is modified to configure traffic classes based
on mqprio data received from the stack. This includes low-level
functions to configure min, max rate-limit parameters in hardware
for traffic classes. Each traffic class gets mapped to a hardware
channel (VSI) which can be individually configured with different
bandwidth parameters.

Co-developed-by: Tarun Singh <tarun.k.singh@intel.com>
Signed-off-by: Tarun Singh <tarun.k.singh@intel.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-20 15:57:54 -07:00
Maciej Fijalkowski
e72bba2135 ice: split ice_ring onto Tx/Rx separate structs
While it was convenient to have a generic ring structure that served
both Tx and Rx sides, next commits are going to introduce several
Tx-specific fields, so in order to avoid hurting the Rx side, let's
pull out the Tx ring onto new ice_tx_ring and ice_rx_ring structs.

Rx ring could be handled by the old ice_ring which would reduce the code
churn within this patch, but this would make things asymmetric.

Make the union out of the ring container within ice_q_vector so that it
is possible to iterate over newly introduced ice_tx_ring.

Remove the @size as it's only accessed from control path and it can be
calculated pretty easily.

Change definitions of ice_update_ring_stats and
ice_fetch_u64_stats_per_ring so that they are ring agnostic and can be
used for both Rx and Tx rings.

Sizes of Rx and Tx ring structs are 256 and 192 bytes, respectively. In
Rx ring xdp_rxq_info occupies its own cacheline, so it's the major
difference now.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-15 07:39:02 -07:00
Maciej Fijalkowski
dc23715cf3 ice: move ice_container_type onto ice_ring_container
Currently ice_container_type is scoped only for ice_ethtool.c. Next
commit that will split the ice_ring struct onto Rx/Tx specific ring
structs is going to also modify the type of linked list of rings that is
within ice_ring_container. Therefore, the functions that are taking the
ice_ring_container as an input argument will need to be aware of a ring
type that will be looked up.

Embed ice_container_type within ice_ring_container and initialize it
properly when allocating the q_vectors.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-15 07:39:02 -07:00
Grzegorz Nitka
f66756e0ea ice: introduce new type of VSI for switchdev
New type of VSI has to be defined for switchdev control plane
VSI. Number of allocated Tx and Rx queue has to be equal to
amount of VFs, because each port representor should have one
Tx and Rx queue.

Also to not increase number of used irqs too much, control plane
VSI uses only one q_vector and handle all queues in one irq.
To allow handling all queues in one irq , new function to clean
msix for eswitch was introduced. This function will schedule napi
for each representor instead of scheduling it only for one like in
normal clean irq function.

Only one additional msix has to be requested. Always try to request
it in ice_ena_msix_range function.

Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-07 10:41:42 -07:00
Jacob Keller
ea9b847cda ice: enable transmit timestamps for E810 devices
Add support for enabling Tx timestamp requests for outgoing packets on
E810 devices.

The ice hardware can support multiple outstanding Tx timestamp requests.
When sending a descriptor to hardware, a Tx timestamp request is made by
setting a request bit, and assigning an index that represents which Tx
timestamp index to store the timestamp in.

Hardware makes no effort to synchronize the index use, so it is up to
software to ensure that Tx timestamp indexes are not re-used before the
timestamp is reported back.

To do this, introduce a Tx timestamp tracker which will keep track of
currently in-use indexes.

In the hot path, if a packet has a timestamp request, an index will be
requested from the tracker. Unfortunately, this does require a lock as
the indexes are shared across all queues on a PHY. There are not enough
indexes to reliably assign only 1 to each queue.

For the E810 devices, the timestamp indexes are not shared across PHYs,
so each port can have its own tracking.

Once hardware captures a timestamp, an interrupt is fired. In this
interrupt, trigger a new work item that will figure out which timestamp
was completed, and report the timestamp back to the stack.

This function loops through the Tx timestamp indexes and checks whether
there is now a valid timestamp. If so, it clears the PHY timestamp
indication in the PHY memory, locks and removes the SKB and bit in the
tracker, then reports the timestamp to the stack.

It is possible in some cases that a timestamp request will be initiated
but never completed. This might occur if the packet is dropped by
software or hardware before it reaches the PHY.

Add a task to the periodic work function that will check whether
a timestamp request is more than a few seconds old. If so, the timestamp
index is cleared in the PHY, and the SKB is released.

Just as with Rx timestamps, the Tx timestamps are only 40 bits wide, and
use the same overall logic for extending to 64 bits of nanoseconds.

With this change, E810 devices should be able to perform basic PTP
functionality.

Future changes will extend the support to cover the E822-based devices.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11 08:47:41 -07:00
Jacob Keller
77a781155a ice: enable receive hardware timestamping
Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to
requests to enable timestamping support. If the request is for enabling
Rx timestamps, set a bit in the Rx descriptors to indicate that receive
timestamps should be reported.

Hardware captures receive timestamps in the PHY which only captures part
of the timer, and reports only 40 bits into the Rx descriptor. The upper
32 bits represent the contents of GLTSYN_TIME_L at the point of packet
reception, while the lower 8 bits represent the upper 8 bits of
GLTSYN_TIME_0.

The networking and PTP stack expect 64 bit timestamps in nanoseconds. To
support this, implement some logic to extend the timestamps by using the
full PHC time.

If the Rx timestamp was captured prior to the PHC time, then the real
timestamp is

  PHC - (lower_32_bits(PHC) - timestamp)

If the Rx timestamp was captured after the PHC time, then the real
timestamp is

  PHC + (timestamp - lower_32_bits(PHC))

These calculations are correct as long as neither the PHC timestamp nor
the Rx timestamps are more than 2^32-1 nanseconds old. Further, we can
detect when the Rx timestamp is before or after the PHC as long as the
PHC timestamp is no more than 2^31-1 nanoseconds old.

In that case, we calculate the delta between the lower 32 bits of the
PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx
timestamp must have been captured in the past. If it's smaller, then the
Rx timestamp must have been captured after PHC time.

Add an ice_ptp_extend_32b_ts function that relies on a cached copy of
the PHC time and implements this algorithm to calculate the proper upper
32bits of the Rx timestamps.

Cache the PHC time periodically in all of the Rx rings. This enables
each Rx ring to simply call the extension function with a recent copy of
the PHC time. By ensuring that the PHC time is kept up to date
periodically, we ensure this algorithm doesn't use stale data and
produce incorrect results.

To cache the time, introduce a kworker and a kwork item to periodically
store the Rx time. It might seem like we should use the .do_aux_work
interface of the PTP clock. This doesn't work because all PFs must cache
this time, but only one PF owns the PTP clock device.

Thus, the ice driver will manage its own kthread instead of relying on
the PTP do_aux_work handler.

With this change, the driver can now report Rx timestamps on all
incoming packets.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11 08:47:41 -07:00
Krzysztof Kazimierczak
43c7f9198d ice: Refactor ice_setup_rx_ctx
Move AF_XDP logic and buffer allocation out of ice_setup_rx_ctx() to a
new function ice_vsi_cfg_rxq(), so the function actually sets up the Rx
context.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Co-developed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
2021-06-07 08:58:56 -07:00
Jesse Brandeburg
d59684a07e ice: refactor ITR data structures
Use a dedicated bitfield in order to both increase
the amount of checking around the length of ITR writes
as well as simplify the checks of dynamic mode.

Basically unpack the "high bit means dynamic" logic
into bitfields.

Also, remove some unused ITR defines.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14 17:00:06 -07:00
Jesse Brandeburg
b8b4772377 ice: refactor interrupt moderation writes
Introduce several new helpers for writing ITR and GLINT_RATE
registers, and refactor the code calling them.  This resulted
in removal of several duplicate functions and rolled a bunch
of simple code back into the calling routines.

In particular this removes some code that was doing both
a store and a set in a helper function, which seems better
done as separate tasks in the caller (and generally takes
less lines of code even with a tiny bit of repetition).

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-14 17:00:05 -07:00
Benita Bose
634da4c118 ice: Add Support for XPS
Enable and configure XPS. The driver code implemented sets up the Transmit
Packet Steering Map, which in turn will be used by the kernel in queue
selection during Tx.

Signed-off-by: Benita Bose <benita.bose@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31 14:21:27 -07:00
Maciej Fijalkowski
89861c485c ice: move headroom initialization to ice_setup_rx_ctx
ice_rx_offset(), that is supposed to initialize the Rx buffer headroom,
relies on ICE_RX_FLAGS_RING_BUILD_SKB flag as well as XDP prog presence.

Currently, the callsite of mentioned function is placed incorrectly
within ice_setup_rx_ring() where Rx ring's build skb flag is not
set yet. This causes the XDP_REDIRECT to be partially broken due to
inability to create xdp_frame in the headroom space, as the headroom is
0.

Fix this by moving ice_rx_offset() to ice_setup_rx_ctx() after the flag
setting.

Fixes: f1b1f409bf ("ice: store the result of ice_rx_offset() onto ice_ring")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-12 07:43:46 -08:00
Magnus Karlsson
ed0907e3bd ice: fix napi work done reporting in xsk path
Fix the wrong napi work done reporting in the xsk path of the ice
driver. The code in the main Rx processing loop was written to assume
that the buffer allocation code returns true if all allocations where
successful and false if not. In contrast with all other Intel NIC xsk
drivers, the ice_alloc_rx_bufs_zc() has the inverted logic messing up
the work done reporting in the napi loop.

This can be fixed either by inverting the return value from
ice_alloc_rx_bufs_zc() in the function that uses this in an incorrect
way, or by changing the return value of ice_alloc_rx_bufs_zc(). We
chose the latter as it makes all the xsk allocation functions for
Intel NICs behave in the same way. My guess is that it was this
unexpected discrepancy that gave rise to this bug in the first place.

Fixes: 5bb0c4b5eb ("ice, xsk: Move Rx allocation out of while-loop")
Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-12 07:43:46 -08:00
Björn Töpel
b02e5a0ebb xsk: Propagate napi_id to XDP socket Rx path
Add napi_id to the xdp_rxq_info structure, and make sure the XDP
socket pick up the napi_id in the Rx path. The napi_id is used to find
the corresponding NAPI structure for socket busy polling.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com
2020-12-01 00:09:25 +01:00
Magnus Karlsson
c4655761d3 xsk: i40e: ice: ixgbe: mlx5: Rename xsk zero-copy driver interfaces
Rename the AF_XDP zero-copy driver interface functions to better
reflect what they do after the replacement of umems with buffer
pools in the previous commit. Mostly it is about replacing the
umem name from the function names with xsk_buff and also have
them take the a buffer pool pointer instead of a umem. The
various ring functions have also been renamed in the process so
that they have the same naming convention as the internal
functions in xsk_queue.h. This so that it will be clearer what
they do and also for consistency.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1598603189-32145-3-git-send-email-magnus.karlsson@intel.com
2020-08-31 21:15:04 +02:00
Magnus Karlsson
1742b3d528 xsk: i40e: ice: ixgbe: mlx5: Pass buffer pool to driver instead of umem
Replace the explicit umem reference passed to the driver in AF_XDP
zero-copy mode with the buffer pool instead. This in preparation for
extending the functionality of the zero-copy mode so that umems can be
shared between queues on the same netdev and also between netdevs. In
this commit, only an umem reference has been added to the buffer pool
struct. But later commits will add other entities to it. These are
going to be entities that are different between different queue ids
and netdevs even though the umem is shared between them.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/1598603189-32145-2-git-send-email-magnus.karlsson@intel.com
2020-08-31 21:15:03 +02:00
Bruce Allan
66486d8943 ice: replace single-element array used for C struct hack
Convert the pre-C90-extension "C struct hack" method (using a single-
element array at the end of a structure for implementing variable-length
types) to the preferred use of C99 flexible array member.

Additional code cleanups were done near areas affected by this change.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-01 16:35:23 -07:00
Brett Creeley
401ce33b32 ice: Always clear QRXFLXP_CNTXT before writing new value
Always clear the previous value in QRXFLXP_CNTXT before writing a new
value. This will make it so re-used queues will not accidentally take the
previously configured settings.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-31 03:47:35 -07:00
Krzysztof Kazimierczak
3f0d97cdfe ice: Check UMEM FQ size when allocating bufs
If a UMEM is present on a queue when an interface/queue pair is being
enabled, the driver will try to prepare the Rx buffers in advance to
improve performance. However, if fill queue is shorter than HW Rx ring,
the driver will report failure after getting the last address from the
fill queue.

This still lets the driver process the packets correctly during the NAPI
poll, but leads to a constant NAPI rescheduling. Not allocating the
buffers in advance would result in a potential performance decrease.

Commit d57d76428a ("xsk: Add API to check for available entries in FQ")
provides an API that lets drivers check the number of addresses that the
fill queue holds.

Notify the user if fill queue is not long enough to prepare all buffers
before packet processing starts, and allocate the buffers during the
NAPI poll. If the fill queue size is sufficient, prepare Rx buffers in
advance.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 18:13:59 -07:00
Bruce Allan
7e34786a74 ice: avoid undefined behavior
When writing the driver's struct ice_tlan_ctx structure, do not write the
8-bit element int_q_state with the associated internal-to-hardware field
which is 122-bits, otherwise the helper function ice_write_byte() will use
undefined behavior when setting the mask used for that write.  This should
not cause any functional change and will avoid use of undefined behavior.
Also, update a comment to highlight this structure element is not written.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:58:21 -07:00
Jesse Brandeburg
22bef5e78f ice: fix signed vs unsigned comparisons
Fix the remaining signed vs unsigned issues, which appear
when compiling with -Werror=sign-compare.

Many of these are because there is an external interface that is passing
an int to us (which we can't change) but that we (rightfully) store
and compare against as an unsigned in our data structures.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27 17:02:47 -07:00
David S. Miller
2b1a7f741a Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2020-05-22

This series contains updates to virtchnl and the ice driver.

Geert Uytterhoeven fixes a data structure alignment issue in the
virtchnl structures.

Henry adds Flow Director support which allows for the redirection on
ntuple rules over six patches.  Initially Henry adds the initial
infrastructure for Flow Director, and then later adds IPv4 and IPv6
support, as well as being able to display the ntuple rules.

Bret add Accelerated Receive Flow Steering (aRFS) support which is used
to steer receive flows to a specific queue.  Fixes a transmit timeout
when the VF link transitions from up/down/up because the transmit and
receive queue interrupts are not enabled as part of VF's link up.  Fixed
an issue when the default VF LAN address is changed and after reset the
PF will attempt to add the new MAC, which fails because it already
exists. This causes the VF to be disabled completely until it is removed
and enabled via sysfs.

Anirudh (Ani) makes a fix where the ice driver needs to call set_mac_cfg
to enable jumbo frames, so ensure it gets called during initialization
and after reset.  Fix bad register reads during a register dump in
ethtool by removing the bad registers.

Paul fixes an issue where the receive Malicious Driver Detection (MDD)
auto reset message was not being logged because it occurred after the VF
reset.

Victor adds a check for compatibility between the Dynamic Device
Personalization (DDP) package and the NIC firmware to ensure that
everything aligns.

Jesse fixes a administrative queue string call with the appropriate
error reporting variable.  Also fixed the loop variables that are
comparing or assigning signed against unsigned values.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-23 16:51:26 -07:00
Henry Tieman
148beb6120 ice: Initialize Flow Director resources
Flow Director allows for redirection based on ntuple rules. Rules are
programmed using the ethtool set-ntuple interface. Supported actions are
redirect to queue and drop.

Setup the initial framework to process Flow Director filters. Create and
allocate resources to manage and program filters to the hardware. Filters
are processed via a sideband interface; a control VSI is created to manage
communication and process requests through the sideband. Upon allocation of
resources, update the hardware tables to accept perfect filters.

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22 21:26:37 -07:00
David S. Miller
a152b85984 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-05-23

The following pull-request contains BPF updates for your *net-next* tree.

We've added 50 non-merge commits during the last 8 day(s) which contain
a total of 109 files changed, 2776 insertions(+), 2887 deletions(-).

The main changes are:

1) Add a new AF_XDP buffer allocation API to the core in order to help
   lowering the bar for drivers adopting AF_XDP support. i40e, ice, ixgbe
   as well as mlx5 have been moved over to the new API and also gained a
   small improvement in performance, from Björn Töpel and Magnus Karlsson.

2) Add getpeername()/getsockname() attach types for BPF sock_addr programs
   in order to allow for e.g. reverse translation of load-balancer backend
   to service address/port tuple from a connected peer, from Daniel Borkmann.

3) Improve the BPF verifier is_branch_taken() logic to evaluate pointers
   being non-NULL, e.g. if after an initial test another non-NULL test on
   that pointer follows in a given path, then it can be pruned right away,
   from John Fastabend.

4) Larger rework of BPF sockmap selftests to make output easier to understand
   and to reduce overall runtime as well as adding new BPF kTLS selftests
   that run in combination with sockmap, also from John Fastabend.

5) Batch of misc updates to BPF selftests including fixing up test_align
   to match verifier output again and moving it under test_progs, allowing
   bpf_iter selftest to compile on machines with older vmlinux.h, and
   updating config options for lirc and v6 segment routing helpers, from
   Stanislav Fomichev, Andrii Nakryiko and Alan Maguire.

6) Conversion of BPF tracing samples outdated internal BPF loader to use
   libbpf API instead, from Daniel T. Lee.

7) Follow-up to BPF kernel test infrastructure in order to fix a flake in
   the XDP selftests, from Jesper Dangaard Brouer.

8) Minor improvements to libbpf's internal hashmap implementation, from
   Ian Rogers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22 18:30:34 -07:00
Karol Kolacinski
88865fc4bb ice: Fix casting issues
Change min() macros to min_t() which has compare type specified and it
helps avoid precision loss.

In some cases there was precision loss during calls or assignments.
Some fields in structs were unnecessarily large and gave multiple
warnings.

There were also some minor type differences which are now fixed as well as
some cases where a simple cast was needed.

Callers were were passing data that is a u16 to
ice_sched_cfg_node_bw_alloc() but the function was truncating that to a u8.
Fix that by changing the function to take a u16.

Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Lihong Yang
0fee35774d ice: Provide more meaningful error message
When printing the ice status or AQ error codes, instead of printing out the
numerical value, provide the description of the error code. This provides
more info about the issue than a number.

Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21 22:10:04 -07:00
Björn Töpel
175fc43067 ice, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL
APIs.

v4->v5: Fixed "warning: Excess function parameter 'alloc' description
        in 'ice_alloc_rx_bufs_zc'" and "warning: Excess function
        parameter 'xdp' description in
        'ice_construct_skb_zc'". (Jakub)

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/bpf/20200520192103.355233-10-bjorn.topel@gmail.com
2020-05-21 17:31:26 -07:00
Jesse Brandeburg
af23635a53 ice: add backslash-n to strings
There were several strings found without line feeds, fix
them by adding a line feed, as is typical.  Without this
lotsofmessagescanbejumbledtogether.

This patch has known checkpatch warnings from long lines
for the NL_* messages, because checkpatch doesn't know
how to ignore them.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19 13:26:45 -08:00
Brett Creeley
36be2baa09 ice: Always clear the QRXFLXP_CNTXT register for VF Rx queues
Currently when the PF reduces its number of channels via ethtool and
then VFs are created there may be stale data for some of the Rx queues
belonging to VFs. This happens when a VF reuses an Rx queue that was
previously used by the PF. Specifically, the QRXFLXP_CNTXT register
will have incorrect values. Fix this by always clearing the relevant
values in the QRXFLXP_CNTXT register for VF queues.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19 13:01:51 -08:00
Bruce Allan
752eee0678 ice: remove unnecessary fallthrough comments
Fallthrough comments are used to explicitly indicate the code is intended
to flow from one case statement to the next in a switch statement rather
than break out of the switch statement.  They are only needed when a case
has one or more statements to execute before falling through to the next
case, not when there is a list of cases for which the same statement(s)
should be executed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-15 16:56:48 -08:00
Brett Creeley
13a6233b03 ice: Add support to enable/disable all Rx queues before waiting
Currently when we enable/disable all Rx queues we do the following
sequence for each Rx queue and then move to the next queue.

1. Enable/Disable the Rx queue via register write.
2. Read the configuration register to determine if the Rx queue was
enabled/disabled successfully.

In some cases enabling/disabling queue 0 fails because of step 2 above.
Fix this by doing step 1 for all of the Rx queues and then step 2 for
all of the Rx queues.

Also, there are cases where we enable/disable a single queue (i.e.
SR-IOV and XDP) so add a new function that does step 1 and 2 above with
a read flush in between.

This change also required a single Rx queue to be enabled/disabled with
and without waiting for the change to propagate through hardware. Fix
this by adding a boolean wait flag to the necessary functions.

Also, add the keywords "one" and "all" to distinguish between
enabling/disabling a single Rx queue and all Rx queues respectively.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-15 16:39:55 -08:00
Anirudh Venkataramanan
3306f79f42 ice: Cleanup ice_vsi_alloc_q_vectors
1. Remove local variable num_q_vectors and use vsi->num_q_vectors instead
2. Remove local variable pf and pass vsi->back to ice_pf_to_dev

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:49:04 -08:00
Anirudh Venkataramanan
19cce2c6f6 ice: Make print statements more compact
Formatting strings in print function calls (like dev_info, dev_err, etc.)
can exceed 80 columns without making checkpatch unhappy. So remove
newlines where applicable and make print statements more compact.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:49:00 -08:00
Anirudh Venkataramanan
9a946843ba ice: Use ice_pf_to_dev
Use ice_pf_to_dev(pf) instead of &pf->pdev->dev
Use ice_pf_to_dev(vsi->back) instead of &vsi->back->pdev->dev
When a pointer to the pf instance is available, use ice_pf_to_dev
instead of ice_hw_to_dev

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:55 -08:00
Krzysztof Kazimierczak
9112539934 ice: Suppress Coverity warnings for xdp_rxq_info_reg
Coverity reports some of the calls to xdp_rxq_info_reg() as potential
issues, because the driver does not check its return value. However,
those calls are wrapped with "if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))"
and this check alone is enough to be sure that the function will never
fail.

All possible states of xdp_rxq_info are:
 - NEW,
 - REGISTERED,
 - UNREGISTERED,
 - UNUSED.

The driver won't mark a queue as UNUSED under no circumstance, so the
return value can be ignored safely.

Add comments for Coverity right above calls to xdp_rxq_info_reg() to
suppress the warnings.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-03 16:08:33 -08:00
Krzysztof Kazimierczak
65bb559b6c ice: Add a boundary check in ice_xsk_umem()
In ice_xsk_umem(), variable qid which is later used as an array index,
is not validated for a possible boundary exceedance. Because of that,
a calling function might receive an invalid address, which causes
general protection fault when dereferenced.

To address this, add a boundary check to see if qid is greater than the
size of a UMEM array. Also, don't let user change vsi->num_xsk_umems
just by trying to setup a second UMEM if its value is already set up
(i.e. UMEM region has already been allocated for this VSI).

While at it, make sure that ring->zca.free pointer is always zeroed out
if there is no UMEM on a specified ring.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-03 16:08:33 -08:00
Michal Swiatkowski
118e0e1002 ice: Set default value for ITR in alloc function
When the user sets itr_setting to zero from ethtool -C, the driver changes
this value to default in ice_cfg_itr (for example after changing ring
param). Remove code that sets default value in ice_cfg_itr and move it to
place where the driver allocates q_vectors.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-03 16:08:33 -08:00
Brett Creeley
4015d11e4b ice: Add ice_pf_to_dev(pf) macro
We use &pf->dev->pdev all over the code. Add a simple
macro to do this for us. When multiple de-references
like this are being done add a local struct device
variable.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-22 13:15:17 -08:00
Maciej Fijalkowski
59bb080805 ice: introduce frame padding computation logic
Take into account the underlying architecture specific settings and
based on that calculate the possible padding that can be supplied.
Typically, for x86 and standard MTU size we will end up with 192 bytes
of headroom. This is the same behavior as our other drivers have and we
can dedicate it for XDP purposes.

Furthermore, introduce the Rx ring flag for indicating whether build_skb
is used on particular. Based on that invoke the routines for padding
calculation.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04 13:09:50 -08:00
Krzysztof Kazimierczak
2d4238f556 ice: Add support for AF_XDP
Add zero copy AF_XDP support.  This patch adds zero copy support for
Tx and Rx; code for zero copy is added to ice_xsk.h and ice_xsk.c.

For Tx, implement ndo_xsk_wakeup. As with other drivers, reuse
existing XDP Tx queues for this task, since XDP_REDIRECT guarantees
mutual exclusion between different NAPI contexts based on CPU ID. In
turn, a netdev can XDP_REDIRECT to another netdev with a different
NAPI context, since the operation is bound to a specific core and each
core has its own hardware ring.

For Rx, allocate frames as MEM_TYPE_ZERO_COPY on queues that AF_XDP is
enabled.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04 12:01:55 -08:00
Maciej Fijalkowski
efc2214b60 ice: Add support for XDP
Add support for XDP. Implement ndo_bpf and ndo_xdp_xmit.  Upon load of
an XDP program, allocate additional Tx rings for dedicated XDP use.
The following actions are supported: XDP_TX, XDP_DROP, XDP_REDIRECT,
XDP_PASS, and XDP_ABORTED.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04 10:23:59 -08:00
Maciej Fijalkowski
e75d1b2c37 ice: get rid of per-tc flow in Tx queue configuration routines
There's no reason for treating DCB as first class citizen when configuring
the Tx queues and going through TCs. Reverse the logic and base the
configuration logic on rings, which is the object of interest anyway.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04 10:03:14 -08:00
Anirudh Venkataramanan
eff380aaff ice: Introduce ice_base.c
Remove a few uses of kernel configuration flags from ice_lib.c by
introducing a new source file ice_base.c. Also move corresponding
function prototypes from ice_lib.h to ice_base.h and include ice_base.h
where required.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04 10:03:14 -08:00