mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
5265 Commits

Author SHA1 Message Date
David S. Miller
71f9b61c5b Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-09-25

This series contains updates to i40e and xsk.

Mariusz fixes an issue where the VF link state was not being updated
properly when the PF is down or up.  Also cleaned up the promiscuous
configuration during a VF reset.

Patryk simplifies the code a bit to use the variables for PF and HW that
are declared, rather than using the VSI pointers.  Cleaned up the
message length parameter to several virtchnl functions, since it was not
being used (or needed).

Harshitha fixes two potential race conditions when trying to change VF
settings by creating a helper function to validate that the VF is
enabled and that the VSI is set up.

Sergey corrects a double "link down" message by putting in a check for
whether or not the link is up or going down.

Björn addresses an AF_XDP zero-copy issue where buffers passed
from userspace to the kernel were leaked when the hardware descriptor
ring was torn down.  A zero-copy capable driver picks buffers off the
fill ring and places them on the hardware receive ring to be completed at
a later point when DMA is complete. The transmit side is similar: the
driver picks buffers off the transmit ring and places them on the
transmit hardware ring.

In the typical flow, the receive buffer will be placed onto a receive
ring (completed to the user), and the transmit buffer will be placed on
the completion ring to notify the user that the transfer is done.

However, if the driver needs to tear down the hardware rings for some
reason (interface goes down, reconfiguration and such), the userspace
buffers cannot be leaked. They have to be reused or completed back to
userspace.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 20:25:00 -07:00
Björn Töpel
3ab52af58f i40e: disallow changing the number of descriptors when AF_XDP is on
When an AF_XDP UMEM is attached to any of the Rx rings, we disallow a
user to change the number of descriptors via e.g. "ethtool -G IFNAME".

Otherwise, the size of the stash/reuse queue can grow unbounded, which
would result in OOM or leaking userspace buffers.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:16:19 -07:00
Björn Töpel
411dc16ff1 i40e: clean zero-copy XDP Rx ring on shutdown/reset
Outstanding Rx descriptors are temporarily stored on a stash/reuse
queue. When/if the HW rings come up again, entries from the stash are
used to re-populate the ring.

The latter required some restructuring of the allocation scheme for
the AF_XDP zero-copy implementation. There is now a fast, and a slow
allocation. The "fast allocation" is used from the fast-path and
obtains free buffers from the fill ring and the internal recycle
mechanism. The "slow allocation" is only used in ring setup, and
obtains buffers from the fill ring and the stash (if any).
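
The stash and the slow allocation can be pictured with a small userspace
sketch. This is not the i40e code: every name below (ex_queue,
ex_stash_outstanding, ex_alloc_slow) and the fixed capacity are
hypothetical, and the choice to drain the stash before the fill ring is an
assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_QUEUE_CAP 8 /* hypothetical fixed capacity */

/* Simple LIFO of buffer addresses; stands in for the HW ring, the stash
 * and the fill ring alike. */
struct ex_queue {
	uint64_t addr[EX_QUEUE_CAP];
	size_t len;
};

static bool ex_push(struct ex_queue *q, uint64_t a)
{
	if (q->len == EX_QUEUE_CAP)
		return false;
	q->addr[q->len++] = a;
	return true;
}

static bool ex_pop(struct ex_queue *q, uint64_t *a)
{
	if (q->len == 0)
		return false;
	*a = q->addr[--q->len];
	return true;
}

/* Teardown: move outstanding buffers from the HW ring to the stash so they
 * are not leaked. Returns the number of buffers stashed. */
static size_t ex_stash_outstanding(struct ex_queue *hw_ring,
				   struct ex_queue *stash)
{
	uint64_t a;
	size_t n = 0;

	while (stash->len < EX_QUEUE_CAP && ex_pop(hw_ring, &a)) {
		ex_push(stash, a);
		n++;
	}
	return n;
}

/* Slow allocation (ring setup only): take a stashed buffer if there is
 * one, otherwise fall back to the fill ring. */
static bool ex_alloc_slow(struct ex_queue *stash, struct ex_queue *fill,
			  uint64_t *a)
{
	return ex_pop(stash, a) || ex_pop(fill, a);
}
```

In this model a buffer handed over by userspace is always in exactly one
of the three queues, which is the property the commit is after.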

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:15:16 -07:00
Björn Töpel
9dbb137045 i40e: clean zero-copy XDP Tx ring on shutdown/reset
When the zero-copy enabled XDP Tx ring is torn down, due to
configuration changes, outstanding frames on the hardware descriptor
ring are queued on the completion ring.

The completion ring has a back-pressure mechanism that will guarantee
that there is sufficient space on the ring.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:10:24 -07:00
Patryk Małek
679b05c053 i40e: Remove unused msglen parameter from virtchnl functions
The msglen parameter seems to be unused in several virtchnl functions.
This patch removes it from the signatures of those functions.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:08:42 -07:00
Sergey Nemov
fd835129ab i40e: fix double 'NIC Link is Down' messages
When isup is false, meaning that the interface is going to shut down,
set the new speed to 0 to avoid double 'NIC Link is Down' messages.

Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:06:50 -07:00
Harshitha Ramamurthy
ed277c50c0 i40e: add a helper function to validate a VF based on the vf id
When we are trying to change VF settings, it is possible for 2 race
conditions to happen. One, when the VF is created but not yet enabled.
Second, the VF is enabled but the VSI is still not created or not yet
re-created in the VF reset flow.

This patch introduces a helper function to validate that the VF is
enabled and that the VSI is set up. This patch also calls this
function from other functions which could get into these race conditions.
While we are poking around here, remove unnecessary parentheses that
checkpatch was complaining about.
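
A minimal sketch of what such a validation helper might look like, using
simplified stand-in types rather than the driver's real structures. The
names example_vf, example_pf and example_validate_vf are hypothetical, and
the range check on vf_id is an added assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for driver state. */
struct example_vf {
	bool enabled;   /* the VF has been enabled */
	bool vsi_ready; /* the VF's VSI exists (also after a VF reset) */
};

struct example_pf {
	struct example_vf *vfs;
	size_t num_vfs;
};

/* Return 0 when vf_id names a VF that is enabled and whose VSI is set up,
 * -1 otherwise. Functions that change VF settings would call this first. */
static int example_validate_vf(const struct example_pf *pf, size_t vf_id)
{
	if (vf_id >= pf->num_vfs)
		return -1; /* assumption: out-of-range ids are rejected too */
	if (!pf->vfs[vf_id].enabled)
		return -1; /* created but not yet enabled */
	if (!pf->vfs[vf_id].vsi_ready)
		return -1; /* VSI not created, or not yet re-created */
	return 0;
}
```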

Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:05:37 -07:00
Patryk Małek
e7bac7afa6 i40e: use declared variables for pf and hw
In order to slightly simplify the code use the variables for pf and hw
that are declared in i40e_set_mac function.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 13:03:36 -07:00
Mariusz Stachura
0ce5233e6c i40e: Unset promiscuous settings on VF reset
This patch cleans up promiscuous configuration when a VF reset occurs.
Previously the promiscuous mode settings were still there after the VF
driver removal.

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 12:49:05 -07:00
Mariusz Stachura
f3fc7915a5 i40e: Fix VF's link state notification
This resolves an issue where the VF link state was not being updated
when the PF is down or up, and the VF link state would always show
that it is running.

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25 12:46:49 -07:00
David S. Miller
a06ee256e5 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Version bump conflict in batman-adv, take what's in net-next.

iavf conflict, adjustment of netdev_ops in net-next conflicting
with poll controller method removal in net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:35:29 -07:00
Eric Dumazet
1aa28fb983 i40evf: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

i40evf uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
158a08a694 ice: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ice uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:25 -07:00
Eric Dumazet
0542997ede igb: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

igb uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Eric Dumazet
2753166e4b ixgb: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ixgb uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

This also removes a problematic use of disable_irq() in
a context it is forbidden, as explained in commit
af3e0fcf78 ("8139too: Use disable_irq_nosync() in
rtl8139_poll_controller()")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Eric Dumazet
dda9d57e2d fm10k: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

fm10k uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Eric Dumazet
6f5d941eba ixgbevf: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ixgbevf uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
Eric Dumazet
b80e71a986 ixgbe: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ixgbe uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Reported-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-23 21:55:24 -07:00
YueHaibing
da2cfbd3e7 e1000: remove set but not used variable 'txb2b'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/intel/e1000/e1000_main.c: In function 'e1000_watchdog':
drivers/net/ethernet/intel/e1000/e1000_main.c:2436:9: warning:
 variable 'txb2b' set but not used [-Wunused-but-set-variable]

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-19 23:09:23 -07:00
Jesse Brandeburg
98674ebec8 intel-ethernet: use correct module license
We recently updated all our SPDX identifiers to correctly
indicate our net/ethernet/intel/* drivers were always released
and intended to be released under GPL v2, but the MODULE_LICENSE
declaration was never updated.

Fix the MODULE_LICENSE to be GPL v2, for all our drivers.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:32:59 -07:00
Jesse Brandeburg
66bc8e0f59 iavf: finish renaming files to iavf
This finishes the process of renaming the files that
make sense to rename (skipping adminq related files that
talk to i40e), and fixes up the build and the #includes
so that everything builds nicely.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:32:55 -07:00
Jesse Brandeburg
56184e01c0 iavf: rename most of i40e strings
This is the big rename patch, it takes most of the i40e_
and I40E_ strings and renames them to iavf_ and IAVF_.

Some of the adminq code, as well as most of the client
interface code used by RDMA is left unchanged in order
to indicate that the driver is talking to code that is not
internal to iavf.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:32:29 -07:00
Jesse Brandeburg
ad64ed8bf9 iavf: tracing infrastructure rename
Rename the i40e_trace file and fix up all the callers
to the new names inside the iavf_trace.h file.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:18:20 -07:00
Jesse Brandeburg
f1aa1abaf5 iavf: replace i40e_debug with iavf version
Change another string (i40e_debug)

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:18:14 -07:00
Jesse Brandeburg
f349daa588 iavf: rename i40e_hw to iavf_hw
Fix up the i40e_hw names to new name, including versions
inside other strings.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:18:10 -07:00
Jesse Brandeburg
83eafc4922 iavf: rename I40E_ADMINQ_DESC
Take care of some renames containing I40E_ADMINQ_DESC.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:18:02 -07:00
Jesse Brandeburg
4dbc76e014 iavf: rename device ID defines
Rename the device ID defines to have IAVF in them
and remove all the unused defines.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:17:59 -07:00
Jesse Brandeburg
f1cad2ce06 iavf: remove references to old names
Remove the register name references to I40E_VF* and change to
IAVF_VF. Update the descriptor names and defines to the IAVF
name.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:17:56 -07:00
Jesse Brandeburg
5ec8b7d114 iavf: move i40evf files to new name
Simply move the i40evf files to the new name, updating the #includes
to track the new names, and updating the Makefile as well.

A future patch will remove the i40e references (after the code
removal patches later in this series).

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:17:53 -07:00
Jesse Brandeburg
0b6591e646 iavf: rename i40e_status to iavf_status
This is just a rename of an internal variable i40e_status, but
it was a pretty big change and so deserved its own patch.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:17:50 -07:00
Jesse Brandeburg
129cf89e58 iavf: rename functions and structs to new name
This basically begins the internal portion of the rename of i40evf to iavf,
by renaming many of the functions, structs, variables and defines.

Most of the changes were made mechanically, which introduces some
alignment issues.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:13:49 -07:00
Jesse Brandeburg
ee61022acf iavf: diet and reformat
Remove a bunch of unused code and reformat a few lines. Also
remove some now un-necessary files.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 15:05:55 -07:00
Jesse Brandeburg
8062b2263a intel-ethernet: rename i40evf to iavf
Rename the Intel Ethernet Adaptive Virtual Function driver
(i40evf) to a new name (iavf) that is more consistent with
the ongoing maintenance of the driver as the universal VF driver
for multiple product lines.

This first patch fixes up the directory names and the .ko name,
intentionally ignoring the function names inside the driver
for now.  Basically this is the simplest patch that gets
the rename done and will be followed by other patches that
rename the internal functions.

This patch also addresses a couple of string/name issues
and updates the Copyright year.

Also, made sure to add a MODULE_ALIAS to the old name.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-18 08:43:03 -07:00
Jacob Keller
6ad96bdca8 i40e(vf): remove i40e_ethtool_stats.h header file
Essentially reverts commit 8fd75c58a0 ("i40e: move ethtool
stats boiler plate code to i40e_ethtool_stats.h", 2018-08-30), and
additionally moves the similar code in i40evf into i40evf_ethtool.c.

The code was initially moved from i40e_ethtool.c into i40e_ethtool_stats.h
as a way of better logically organizing the code. This has two problems.
First, we can't have an inline function with variadic arguments on all
platforms. Second, it gave the appearance that we had plans to share
code between the i40e and i40evf drivers, due to having a near copy of
the contents in the i40evf/i40e_ethtool_stats.h file.

Patches which actually attempt to combine or share code between the i40e
and i40evf drivers have not materialized, and are likely a ways off.

Rather than fixing the one function which causes build issues, just move
this code back into the i40e_ethtool.c and i40evf_ethtool.c files. Note
that we also change these functions back from static inlines to just
statics, since they're no longer in a header file.

We can revisit this if/when work is done to actually attempt to share
code between drivers. Alternatively, this stats code could be made more
generic so that it can be shared across drivers as part of ethtool
kernel work.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-08 10:06:17 -07:00
David S. Miller
fd3c040b24 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-09-01

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add AF_XDP zero-copy support for i40e driver (!), from Björn and Magnus.

2) BPF verifier improvements by giving each register its own liveness
   chain, which simplifies the code and gets rid of the skip_callee()
   logic, from Edward.

3) Add bpf fs pretty print support for percpu arraymap, percpu hashmap
   and percpu lru hashmap. Also add generic percpu formatted print on
   bpftool so the same can be dumped there, from Yonghong.

4) Add bpf_{set,get}sockopt() helper support for TCP_SAVE_SYN and
   TCP_SAVED_SYN options to allow reflection of tos/tclass from received
   SYN packet, from Nikita.

5) Misc improvements to the BPF sockmap test cases in terms of cgroup v2
   interaction and removal of incorrect shutdown() calls, from John.

6) Few cleanups in xdp_umem_assign_dev() and xdpsock samples, from Prashant.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-31 17:41:08 -07:00
Magnus Karlsson
93ee30f3e8 xsk: i40e: get rid of useless struct xdp_umem_props
This commit gets rid of the structure xdp_umem_props. It was there to
be able to break a dependency at one point, but this is no longer
needed. The values in the struct are instead stored directly in the
xdp_umem structure. This simplifies the xsk code as well as af_xdp
zero-copy drivers and as a bonus gets rid of one internal header file.

The i40e driver is also adapted to the new interface in this commit.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-01 01:38:16 +02:00
Magnus Karlsson
cf484f9f91 i40e: fix possible compiler warning in xsk TX path
With certain gcc versions, it was possible to get the warning
"'tx_desc' may be used uninitialized in this function" for the
i40e_xmit_zc. This was not possible, however this commit simplifies
the code path so that this warning is no longer emitted.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-01 01:38:16 +02:00
Patryk Małek
5907cf6c5b i40e: Prevent deleting MAC address from VF when set by PF
To prevent VF from deleting MAC address that was assigned by the
PF we need to check for that scenario when we try to delete a MAC
address from a VF.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Lihong Yang
babbcc6004 i40evf: cancel workqueue sync for adminq when a VF is removed
If a VF is being removed, there is no need to continue with the
workqueue sync for the adminq task, thus cancel it. Without this call,
when VFs are created and removed right away, there might be a chance for
the driver to crash with events stuck in the adminq.

Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Patryk Małek
5cba17b141 i40e: hold the rtnl lock on clearing interrupt scheme
Hold the rtnl lock when we're clearing interrupt scheme
in i40e_shutdown and in i40e_remove.

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Patryk Małek
3bd77e2ae1 i40evf: Don't enable vlan stripping when rx offload is turned on
With the current implementation of i40evf_set_features, when the user
sets any offload via ethtool we set I40EVF_FLAG_AQ_ENABLE_VLAN_STRIPPING
as a required aq, which triggers the driver to call
i40evf_enable_vlan_stripping. This shouldn't take place.
This patch fixes it by setting the flag only when VLAN offload
is turned on.
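
The idea can be sketched as a check on which feature bits actually
changed. This is an illustration under assumptions, not the driver's
implementation: the bit values, the EX_ names and the ex_vlan_requests
function are hypothetical, and the "disable" request is included only to
make the example complete.

```c
#include <stdint.h>

/* Hypothetical feature bits and admin-queue request flags. */
#define EX_FEATURE_RX_CSUM        (1u << 0)
#define EX_FEATURE_VLAN_RX        (1u << 1)
#define EX_AQ_ENABLE_VLAN_STRIP   (1u << 0)
#define EX_AQ_DISABLE_VLAN_STRIP  (1u << 1)

/* Return the VLAN-stripping request needed to move from old_features to
 * wanted. Changes to unrelated offloads produce no request at all. */
static uint32_t ex_vlan_requests(uint32_t old_features, uint32_t wanted)
{
	uint32_t changed = old_features ^ wanted;

	if (!(changed & EX_FEATURE_VLAN_RX))
		return 0; /* VLAN offload untouched: nothing to request */

	return (wanted & EX_FEATURE_VLAN_RX) ? EX_AQ_ENABLE_VLAN_STRIP
					      : EX_AQ_DISABLE_VLAN_STRIP;
}
```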

Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Jan Sokolowski
e78d9a39fd i40e: Check and correct speed values for link on open
If our card has been put in an unstable state due to
other drivers interacting with it, speed settings
might be incorrect. If incorrect, forcefully reset them
on open to known default values.

Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Björn Töpel
cdec2141c2 i40e: report correct statistics when XDP is enabled
When XDP is enabled, the driver will report incorrect
statistics. Received frames will be reported as transmitted frames.

This commit fixes the i40e implementation of ndo_get_stats64 (struct
net_device_ops), so that iproute2 will report correct statistics
(e.g. when running "ip -stats link show dev eth0") even when XDP is
enabled.

Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Fixes: 74608d17fe ("i40e: add support for XDP_TX action")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Martyna Szapar
cfe396991a i40e: static analysis report from community
Static analysis tools report a problem from original driver submission.
Removing unnecessary check in condition.

Signed-off-by: Martyna Szapar <martyna.szapar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Lihong Yang
e65aae0863 i40evf: set IFF_UNICAST_FLT flag for the VF
Set IFF_UNICAST_FLT flag for the VF to prevent it from entering
promiscuous mode when macvlan is added to the VF.

Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Mitch Williams
7eb74ff891 i40e: use correct length for strncpy
Caught by GCC 8. When we provide a length for strncpy, we should not
include the terminating null. So we must tell it one less than the size
of the destination buffer.
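
A short userspace sketch of the pattern, assuming a fixed-size destination
buffer. The helper name copy_name is hypothetical and this is not the
driver's code; it only shows the "one less than the buffer size" length
together with explicit termination.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (dst_size bytes, dst_size > 0),
 * always leaving dst NUL-terminated. */
static void copy_name(char *dst, size_t dst_size, const char *src)
{
	/* Copy at most dst_size - 1 bytes so one byte is left for the NUL. */
	strncpy(dst, src, dst_size - 1);
	/* strncpy does not terminate dst when src is too long, so do it here. */
	dst[dst_size - 1] = '\0';
}
```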

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Paul M Stillwell Jr
3c81891091 i40evf: Validate the number of queues a PF sends
A PF can send any number of queues to the VF and the VF may not
be able to support that many. Check to see that the number of
queues is less than or equal to the max number of queues the
VF can have.
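
As a rough illustration, the check amounts to a bounds test on the value
received from the PF. The limit EXAMPLE_MAX_VF_QUEUES and the function name
are hypothetical; the real driver defines its own maximum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit, chosen only for this example. */
#define EXAMPLE_MAX_VF_QUEUES 16

/* Return true when the queue count sent by the PF is one the VF can
 * support, false when it exceeds the VF's maximum. */
static bool vf_queue_count_is_valid(uint16_t num_queue_pairs)
{
	return num_queue_pairs <= EXAMPLE_MAX_VF_QUEUES;
}
```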

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:04 -07:00
Paweł Jabłoński
ae1e29f671 i40evf: Change a VF mac without reloading the VF driver
Add possibility to change a VF mac address from host side
without reloading the VF driver on the guest side. Without
this patch it is not possible to change the VF mac because
executing i40evf_virtchnl_completion function with
VIRTCHNL_OP_GET_VF_RESOURCES opcode resets the VF mac
address to previous value.

Signed-off-by: Paweł Jabłoński <pawel.jablonski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:03 -07:00
Jacob Keller
6dba41cd02 i40evf: update ethtool stats code and use helper functions
Fix a bug in the way we handled VF queues, by always showing stats for
the maximum number of queues, even if they aren't allocated. It is not
safe to change the number of strings reported to ethtool, as grabbing
statistics occurs over multiple ethtool ops for which the rtnl_lock()
cannot be held the entire time.

Avoid this by always reporting queue stats for the maximum number of
queues in the netdevice. Share some of the helper functionality for
adding stats with the PF code in i40e_ethtool_stats.h

This should reduce the chance of potential future bugs, and make adding
new statistics easier.

Note for the queue stats, unlike the PF driver we do not keep an array
of queue pointers, but an array of queues, so care must be taken to
avoid accessing queue memory that hasn't yet been allocated.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:03 -07:00
Jacob Keller
8fd75c58a0 i40e: move ethtool stats boiler plate code to i40e_ethtool_stats.h
Move the boiler plate structures and helper functions we recently
added into their own header file, so that the complete collection is
located together.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:03 -07:00
Jacob Keller
4b59938b20 i40e: convert queue stats to i40e_stats array
Use an i40e_stats array to handle the queue stats, instead of coding
similar functionality separately. Because of how the queue stats are
accessed on some kernels, we can't easily use i40e_add_ethtool_stats.

Instead, implement a separate helper, i40e_add_queue_stats, which we'll
use instead. This helper will correctly implement the
u64_stats_fetch_begin_irq logic and allow retries until successful. We
share the most complex code by re-using i40e_add_one_ethtool_stat.

This logic additionally easily supports skipping disabled rings by using
a ternary operator before calling the u64_stats_fetch_begin_irq()
function, so that we correctly zero-out the stats values without having
to perform two separate sections of code.
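
The retry loop and the ternary for disabled rings could look roughly like
the sketch below. It is a single-threaded userspace model of the control
flow only: the ex_ names are hypothetical, the sequence counter is a plain
integer, and the memory barriers that the kernel's u64_stats helpers
provide are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-ring stats guarded by a sequence counter
 * (even: stable, odd: a writer is in the middle of an update). */
struct ex_ring_stats {
	unsigned int seq;
	uint64_t packets;
	uint64_t bytes;
};

static unsigned int ex_fetch_begin(const struct ex_ring_stats *s)
{
	return s->seq;
}

static int ex_fetch_retry(const struct ex_ring_stats *s, unsigned int start)
{
	/* Retry if a write was in progress or completed during the read. */
	return (start & 1) || s->seq != start;
}

/* Copy one ring's stats into out[0..1]; a NULL (disabled) ring yields
 * zeros through the same code path instead of a separate branch. */
static void ex_add_queue_stats(uint64_t out[2],
			       const struct ex_ring_stats *ring)
{
	unsigned int start;

	do {
		start = ring ? ex_fetch_begin(ring) : 0;
		out[0] = ring ? ring->packets : 0;
		out[1] = ring ? ring->bytes : 0;
	} while (ring && ex_fetch_retry(ring, start));
}
```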

This significantly reduces the boiler plate code in
i40e_get_ethtool_stats, and helps keep the complex logic contained to as
few functions as possible.

With this patch, we've finally converted all the statistics to use the
helpers and the i40e_stats function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-30 13:53:03 -07:00
Magnus Karlsson
1328dcddbd i40e: add AF_XDP zero-copy Tx support
This patch adds zero-copy Tx support for AF_XDP sockets. It implements
the ndo_xsk_async_xmit netdev ndo and performs all the Tx logic from a
NAPI context. This means pulling egress packets from the Tx ring,
placing the frames on the NIC HW descriptor ring and completing sent
frames back to the application via the completion ring.

The regular XDP Tx ring is used for AF_XDP as well. The rationale for
this is as follows: XDP_REDIRECT guarantees mutual exclusion between
different NAPI contexts based on CPU id. In other words, a netdev can
XDP_REDIRECT to another netdev with a different NAPI context, since
the operation is bound to a specific core and each core has its own
hardware ring.

As the AF_XDP Tx action is running in the same NAPI context and using
the same ring, it will also be protected from XDP_REDIRECT actions
with the exact same mechanism.

As with AF_XDP Rx, all AF_XDP Tx specific functions are added to
i40e_xsk.c.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29 12:25:53 -07:00
Magnus Karlsson
a96e747273 i40e: move common Tx functions to i40e_txrx_common.h
This patch prepares for the upcoming zero-copy Tx functionality, by
moving common functions and refactoring chunks of code into re-usable
functions, used both by the regular path and zero-copy path.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29 12:25:53 -07:00
Björn Töpel
0a714186d3 i40e: add AF_XDP zero-copy Rx support
This patch adds zero-copy Rx support for AF_XDP sockets. Instead of
allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are
allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain
queue.

All AF_XDP specific functions are added to a new file, i40e_xsk.c.

Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS
will allocate a new buffer and copy the zero-copy frame prior to passing
it to the kernel stack.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29 12:25:53 -07:00
Björn Töpel
20a739dbef i40e: move common Rx functions to i40e_txrx_common.h
This patch prepares for the upcoming zero-copy Rx functionality, by
moving/changing linkage of common functions, used both by the regular
path and zero-copy path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29 12:25:53 -07:00
Björn Töpel
6d7aad1da2 i40e: refactor Rx path for re-use
In this commit, the Rx path is refactored somewhat, as a step towards the
introduction of AF_XDP Rx zero-copy.

The page re-use counter is moved into i40e_reuse_rx_page, instead
of bumping the counter in many places. The Rx buffer page clearing is
moved for better readability. Lastly, functions to update statistics
and bump the XDP Tx ring are introduced.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29 12:25:53 -07:00
Björn Töpel
123cecd427 i40e: added queue pair disable/enable functions
Add functions for queue pair enable/disable. Instead of resetting the
whole device, only the affected queue pair is disabled or enabled.

This plumbing is used in a later commit, when zero-copy AF_XDP support
is introduced.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29 12:25:53 -07:00
David S. Miller
09990ad164 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-08-28

This series contains new features and implementation updates for the
ice driver.

Anirudh reworks the current flex programming logic to add support for
a second flex descriptor profile.  Updated the transmit scheduler
code to handle changes to the spec, specifically the firmware expects
a 4KB buffer at all times so fix the default scheduler topology buffer
size.  Also the maximum children per node per layer is replaced by
maximum sibling group size.  Adds a check to ensure a reset is not in
progress before exercising a control queue operation.  Refactored the
switch rule management functions and structures to simplify the logic and
to add a common function to search for a rule entry and add a new rule
entry.  Refactored the VSI allocation, deletion and rebuild flow so that
on reset we can restore all the filters that were previously added.  Did
some spring cleaning of define names and macros.

Dan updates the admin queue command for requesting resource ownership
to the latest specification by adding new enums and changing the locks.

Zhenning optimizes the driver by using the existing buffer in a
structure directly versus a local array.

Chinh implements handlers for ethtool for get and set link settings.

Sudheer implements transmit hang/timeout detection and malicious driver
detection in the driver.

Md Fahad Iqbal implements the get and set bridge mode operations.

Hieu adds the ability for firmware logging during initialization.

Brett updates the driver to only enable VSI transmit and receive pruning
when VLAN 0 is active, and when VLAN 0 is removed/not active, pruning is
disabled.

Akeem adds a flag to use for stopping the service task.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-28 15:57:25 -07:00
Shannon Nelson
5ed4e9e990 ixgbe: fix the return value for unsupported VF offload
When failing the request because we can't support that offload,
reporting EOPNOTSUPP makes much more sense than ENXIO.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:33:38 -07:00
Shannon Nelson
47b6f50077 ixgbe: disallow IPsec Tx offload when in SR-IOV mode
There seems to be a problem in the x540's internal switch wherein if SR-IOV
mode is enabled and an offloaded IPsec packet is sent to a local VF,
the packet is silently dropped.  This might never be a problem as it is
somewhat a corner case, but if someone happens to be using IPsec offload
from the PF to a VF that just happens to get migrated to the local box,
communication will mysteriously fail.

Not good.

A simple way to protect from this is to simply not allow any IPsec offloads
for outgoing packets when num_vfs != 0.  This doesn't help any offloads that
were created before SR-IOV was enabled, but we'll get to that later.
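
The guard described above amounts to one early return. A minimal sketch,
assuming a hypothetical adapter struct and an error code chosen only for
illustration (the message does not say which errno the driver returns):

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical adapter state; only the field the check needs. */
struct sketch_adapter {
	unsigned int num_vfs;	/* non-zero once SR-IOV is enabled */
};

/* Return 0 if a new IPsec offload may be added, or a negative errno. */
static int sketch_ipsec_add_allowed(const struct sketch_adapter *ad,
				    bool is_tx)
{
	/* Only outgoing (Tx) offloads are refused while VFs exist. */
	if (is_tx && ad->num_vfs != 0)
		return -EOPNOTSUPP;	/* assumed code, for illustration */
	return 0;
}
```

As the message notes, a check at add time does nothing for offloads that
already existed before SR-IOV was turned on.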

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:33:33 -07:00
Shannon Nelson
7f68d43067 ixgbevf: enable VF IPsec offload operations
Add the IPsec initialization into the driver startup and
add the Rx and Tx processing hooks.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:33:30 -07:00
Shannon Nelson
0062e7cc95 ixgbevf: add VF IPsec offload code
Add the IPsec offload support code.  This is based off of the similar
code in ixgbe, but instead of writing the SA registers, the VF asks
the PF to setup the offload by sending the offload information to the
PF via the standard mailbox.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:33:26 -07:00
Shannon Nelson
adef9a26d6 ixgbevf: add defines for IPsec offload request
Fix up the register definitions for using IPsec offloads and
add the new mailbox message IDs.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:33:19 -07:00
Shannon Nelson
7269824046 ixgbe: add VF IPsec offload request message handling
Add an add and a delete message for IPsec offload requests from
the VF.  These call into the IPsec functions that can translate
the message buffer into a useful IPsec offload.

These new messages bump the mbox API version to 1.4.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:33:14 -07:00
Shannon Nelson
9e4e30cc0c ixgbe: add VF IPsec offload enable flag
Add a private flag to expressly enable support for VF IPsec offload.
The VF will have to be "trusted" in order to use the hardware offload,
but because of the general concerns of managing VF access, we want to
be sure the user specifically is enabling the feature.

This is likely a candidate for becoming a netdev feature flag.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:33:10 -07:00
Shannon Nelson
eda0333ac2 ixgbe: add VF IPsec management
Add functions to translate VF IPsec offload add and delete requests
into something the existing code can work with.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:33:03 -07:00
Shannon Nelson
99a7b0c14c ixgbe: prep IPsec constants for later use
Pull out a couple of values from a function so they can be used
later elsewhere.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:32:58 -07:00
Shannon Nelson
b2875fbf6c ixgbe: reload IPsec IP table after sa tables
Restore the IPsec hardware IP table after reloading the SA tables.
This doesn't make much difference now, but will matter when we add
support for VF IPsec offloads.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:32:53 -07:00
Shannon Nelson
9e3f2f5ece ixgbe: don't clear IPsec sa counters on HW clearing
The software SA record counters should not be cleared when clearing
the hardware tables.  This causes the counters to be out of sync
after a driver reset.

Fixes: 63a67fe229 ("ixgbe: add ipsec offload add and remove SA")
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 14:32:22 -07:00
Sebastian Basierski
7fb94bd58d ixgbevf: VF2VF TCP RSS
During VF2VF communication with RSS, the RSS Type was wrongly recognized
and the RSS hash was not calculated as it should be. Packets were
distributed on various queues by accident.
This commit fixes that behaviour and causes proper RSS Type recognition.

Signed-off-by: Sebastian Basierski <sebastianx.basierski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 13:28:49 -07:00
Sebastian Basierski
59dd45d550 ixgbe: firmware recovery mode
Add a check for FW NVM recovery mode during driver initialization and
in the service task. If in recovery mode, log a message and unregister
the device.

Signed-off-by: Sebastian Basierski <sebastianx.basierski@intel.com>
Tested-by: Don Buchholz <donald.buchholz@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 12:17:15 -07:00
Anirudh Venkataramanan
9ea47d81a7 ice: Fix and update driver version string
Remove the "ice" prefix for the driver version string and bump version
to 0.7.1-k.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 11:14:19 -07:00
Akeem G Abodunrin
8d81fa55ba ice: Introduce SERVICE_DIS flag and service routine functions
This patch introduces the SERVICE_DIS flag to use for stopping the service
task. This flag will be checked before scheduling new tasks. Also add a new
function, ice_service_task_stop, to stop the service task.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 11:11:18 -07:00
Brett Creeley
4f74dcc1b8 ice: Enable VSI Rx/Tx pruning only when VLAN 0 is active
VLAN pruning is not valid when VLAN 0 is not active. If VLAN
pruning is enabled and VLAN 0 is not active (8021q driver not loaded)
then normal, non-VLAN, traffic will not pass.

TX/RX VLAN pruning is enabled when the VLAN 0 is added to the
active_vlan bitmap and it is disabled when VLAN 0 is removed from the
active_vlan bitmap.

So, only enable VLAN pruning when VLAN 0 is active. Setting RX VLAN
pruning causes the switch to drop received VLAN packets when there
are no matching VLAN ids in the associated VSI's switch filters. Setting
TX pruning makes it so the switch will not send out any packets with
VLAN tags that don't match the associated VSI's switch filters.

With this patch, if the VF or PF tries to send a VLAN tagged packet with
a VLAN tag that it does not have a pruning rule for, it will trigger an
MDD event. For example, if PF0 has VLAN10 and VLAN11 interfaces and
scapy is used to send a packet with VLAN8 then the MDD is triggered.

Also make ice_vsi_kill_vlan return a value which the caller can check
before updating VLAN related data structures (counts, pruning bits, etc.).
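
The coupling between VLAN 0 and pruning described above can be sketched
as follows. The names are hypothetical and the real driver programs the
switch through admin queue commands; this only models the bookkeeping of
the active_vlan bitmap and the pruning state.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_N_VID 4096	/* VLAN IDs are 12 bits wide */

/* Hypothetical VSI state: active VLAN bitmap plus the pruning toggle. */
struct sketch_vsi {
	uint64_t active_vlan[SKETCH_N_VID / 64];
	bool pruning;
};

static void sketch_add_vlan(struct sketch_vsi *vsi, uint16_t vid)
{
	if (vid >= SKETCH_N_VID)
		return;
	vsi->active_vlan[vid / 64] |= UINT64_C(1) << (vid % 64);
	if (vid == 0)
		vsi->pruning = true;	/* VLAN 0 active: enable pruning */
}

static void sketch_kill_vlan(struct sketch_vsi *vsi, uint16_t vid)
{
	if (vid >= SKETCH_N_VID)
		return;
	vsi->active_vlan[vid / 64] &= ~(UINT64_C(1) << (vid % 64));
	if (vid == 0)
		vsi->pruning = false;	/* VLAN 0 gone: disable pruning */
}
```

Adding or removing any other VLAN ID leaves the pruning state alone.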

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 11:07:13 -07:00
Hieu Tran
8b97ceb1dc ice: Enable firmware logging during device initialization.
To enable FW logging, the "cq_en" and "uart_en" enable bits of the
"fw_log" element in struct ice_hw need to be set accordingly based on
some user-provided parameters during driver loading. To select which
FW log events are emitted, the "cfg" elements of the corresponding FW
modules in the "evnts" array member of "fw_log" need to be configured.

Signed-off-by: Hieu Tran <hieu.t.tran@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 11:04:04 -07:00
Md Fahad Iqbal Polash
b1edc14a3f ice: Implement ice_bridge_getlink and ice_bridge_setlink
ice_bridge_getlink returns the current bridge mode using
ndo_dflt_bridge_getlink and the mode parameter available in
first_switch->bridge_mode.

ice_bridge_setlink is invoked when the bridge mode needs to be
changed. The value to be changed to is available as a netlink
message which is parsed in this function. If the mode has to
be changed, switch_flags is set appropriately (set ALLOW_LB
for VEB mode and clear it for VEPA mode) and ice_aq_update_vsi
is called. Also change the unicast switch filter rules.
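
The flag handling described above reduces to setting or clearing one bit.
A minimal sketch, assuming a hypothetical flag value and mode enum (the
driver's actual constants and the netlink parsing are not shown):

```c
#include <stdint.h>

/* Hypothetical names and bit value, for illustration only. */
enum sketch_bridge_mode { SKETCH_MODE_VEB, SKETCH_MODE_VEPA };
#define SKETCH_SW_FLAG_ALLOW_LB 0x1u

/* Return sw_flags with the loopback bit set for VEB, cleared for VEPA. */
static uint32_t sketch_apply_bridge_mode(uint32_t sw_flags,
					 enum sketch_bridge_mode mode)
{
	if (mode == SKETCH_MODE_VEB)
		return sw_flags | SKETCH_SW_FLAG_ALLOW_LB;
	return sw_flags & ~SKETCH_SW_FLAG_ALLOW_LB;
}
```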

Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 11:01:06 -07:00
Sudheer Mogilappagari
b3969fd727 ice: Add support for Tx hang, Tx timeout and malicious driver detection
When a malicious operation is detected, the firmware triggers an
interrupt, which is then picked up by the service task (specifically by
ice_handle_mdd_event). A reset is scheduled if required.

Tx hang detection works in a similar way, except the logic here monitors
the VSI's Tx queues and tries to revive them if stalled. If the hang is
not resolved, the kernel eventually calls ndo_tx_timeout, which is
handled by ice_tx_timeout.

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 10:58:42 -07:00
Anirudh Venkataramanan
f80eaa4210 ice: Clean up register file
This patch cleans up the existing register definitions.

1) Several instances of long define names used in the BIT() macro
   were replaced to use the actual values they represent. As a
   result some defines for shifts (ending with _S) that were used
   only to create bitmasks were removed completely.

2) Apply more consistent tab spacing.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 10:49:31 -07:00
Chinh Cao
48cb27f2fd ice: Implement handlers for ethtool PHY/link operations
This patch implements handlers for ethtool get_link_ksettings and
set_link_ksettings. Helper functions used by these handlers are also
introduced in this patch.

Signed-off-by: Chinh Cao <chinh.t.cao@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 10:48:26 -07:00
Anirudh Venkataramanan
0f9d5027a7 ice: Refactor VSI allocation, deletion and rebuild flow
This patch refactors aspects of the VSI allocation, deletion and rebuild
flow. Some of the more noteworthy changes are described below.

1) On reset, all switch filters applied in the hardware are lost. In
   the rebuild flow, only MAC and broadcast filters are being restored.
   Instead, use a new function ice_replay_all_fltr to restore all the
   filters that were previously added. To do this, remove calls to
   ice_remove_vsi_fltr to prevent cleaning out the internal bookkeeping
   structures that ice_replay_all_fltr uses to replay filters.

2) Introduce a new state bit __ICE_PREPARED_FOR_RESET to distinguish the
   PF that requested the reset (and consequently prepared for it) from
   the rest of the PFs. These other PFs will prepare for reset only
   when they receive an interrupt from the firmware.

3) Use new functions ice_add_vsi and ice_free_vsi to create and destroy
   VSIs respectively. These functions accept a handle to uniquely
   identify a VSI. This same handle is required to rebuild the VSI post
   reset. To prevent confusion, the existing ice_vsi_add was renamed to
   ice_vsi_init.

4) Enhance ice_vsi_setup for the upcoming SR-IOV changes and expose a
   new wrapper function ice_pf_vsi_setup to create PF VSIs. Rework the
   error handling path in ice_setup_pf_sw.

5) Introduce a new function ice_vsi_release_all to release all PF VSIs.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 10:34:01 -07:00
Anirudh Venkataramanan
80d144c9ac ice: Refactor switch rule management structures and functions
This patch is an adaptation of the work originally done by Grishma
Kotecha <grishma.kotecha@intel.com> that in summary refactors the
switch filtering logic in the driver. More specifically,
 - Update the recipe structure to also store list of rules
 - Update the existing code for recipes like MAC, VLAN, ethtype etc to
   use list head that is attached to switch recipe structure
 - Add a common function to search for a rule entry and add a new rule
   entry. Update the code to use this new function.
 - Refactor the rem_handle_vsi_list function to simplify the logic

CC: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 10:29:38 -07:00
Zhenning Xiao
74118f7af0 ice: Code optimization for ice_fill_sw_rule()
Use the buffer in the s_rule structure directly instead of using
a local array eth_hdr[DUMMY_ETH_HDR_LEN].

Signed-off-by: Zhenning Xiao <zhenning.xiao@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 10:21:36 -07:00
Anirudh Venkataramanan
fd2a981777 ice: Prevent control queue operations during reset
Once reset is issued, the driver loses all control queue interfaces.
Exercising control queue operations during reset is incorrect and
may result in long timeouts.

This patch introduces a new field 'reset_ongoing' in the hw structure.
This is set to 1 by the core driver when it receives a reset interrupt.
ice_sq_send_cmd checks reset_ongoing before actually issuing the control
queue operation. If a reset is in progress, it returns a soft error code
(ICE_ERR_RESET_PENDING) to the caller. The caller may or may not have to
take any action based on this return. Once the driver knows that the
reset is done, it has to set reset_ongoing back to 0. This will allow
control queue operations to be posted to the hardware again.

This "bail out" logic was specifically added to ice_sq_send_cmd (which
is a pretty low-level function) so that we have one solution in one place
that applies to all types of control queues.
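
The gate described above can be sketched in a few lines. The types and
names below are hypothetical stand-ins (only reset_ongoing and the idea
of a "reset pending" soft error come from the message), and the counter
merely stands in for posting a descriptor to hardware.

```c
#include <stdint.h>

/* Hypothetical status codes modelled on the message's description. */
enum sketch_status {
	SKETCH_OK = 0,
	SKETCH_ERR_RESET_PENDING = -1,
};

/* Hypothetical hw struct: the gate flag plus a counter for the demo. */
struct sketch_hw {
	uint8_t reset_ongoing;		/* set to 1 on reset interrupt */
	unsigned int cmds_posted;	/* stands in for real HW posting */
};

/* Single choke point: every control queue send passes through here. */
static enum sketch_status sketch_sq_send_cmd(struct sketch_hw *hw)
{
	if (hw->reset_ongoing)
		return SKETCH_ERR_RESET_PENDING;	/* soft error, no timeout */
	hw->cmds_posted++;
	return SKETCH_OK;
}
```

Because the check sits in the lowest-level send routine, callers need no
reset handling of their own beyond deciding whether to act on the soft
error.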

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 10:20:00 -07:00
Dan Nowlin
ff2b13213a ice: Update request resource command to latest specification
Align Request Resource Ownership AQ command (0x0008) to the latest
specification. This includes:

- Correcting the resource IDs for the Global Cfg and Change locks.
- new enum ICE_CHANGE_LOCK_RES_ID
- new enum ICE_GLOBAL_CFG_LOCK_RES_ID
- Altering the flow for Global Config Lock to allow only the first PF to
  download the package.

Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 10:17:06 -07:00
Anirudh Venkataramanan
b36c598c99 ice: Updates to Tx scheduler code
1) The maximum number of device nodes is a global value shared by the
   whole device. The add element AQ command would fail if there is no
   space to add new nodes, so the check for max nodes isn't required.
   So remove ice_sched_get_num_nodes_per_layer and ice_sched_val_max_nodes.

2) In ice_sched_add_elems, set default node's CIR/EIR bandwidth weight.

3) Fix default scheduler topology buffer size as the firmware expects
   a 4KB buffer at all times, and will error out if one of any other
   size is provided.

4) In the latest spec, max children per node per layer is replaced by
   max sibling group size. Now it provides the max children of the below
   layer node, not the current layer node.

5) Fix some newline/whitespace issues for consistency.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 09:58:13 -07:00
Anirudh Venkataramanan
22ef683b48 ice: Rework flex descriptor programming
The driver can support two flex descriptor profiles, ICE_RXDID_FLEX_NIC
and ICE_RXDID_FLEX_NIC_2. This patch reworks the current flex programming
logic to add support for the latter profile.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-28 09:18:47 -07:00
Jacob Keller
07f3701387 i40e: fix condition of WARN_ONCE for stat strings
Commit 9b10df596b ("i40e: use WARN_ONCE to replace the commented
BUG_ON size check") introduced a warning check to make sure
that the size of the stat strings was always the expected value. This
code accidentally inverted the check of the data pointer. Fix this so
that we accurately count the size of the stats we copied in.

This fixes an erroneous WARN kernel splat that occurs when requesting
ethtool statistics.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Tested-by: Mauro S M Rodrigues <maurosr@linux.vnet.ibm.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Martyna Szapar
fa38e30ac7 i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled
If the interface is connected to a switch port configured for DCB, Tx
timeouts occur when bringing up the interface. The problem started
appearing after the mqprio hardware offload mode was added to the i40e
driver. That change added a BW rate reset to i40e_vsi_configure_bw_alloc,
which should execute when the mqprio qdisc is removed, but it also ran
when no mqprio qdisc had been added and DCB was enabled. This patch adds
a check for the DCB flag, so that when DCB is enabled the correct
DCB configs from before the mqprio patch are restored.

Signed-off-by: Martyna Szapar <martyna.szapar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Sebastian Basierski
939b701ad6 ixgbe: fix driver behaviour after issuing VFLR
Since VFLR doesn't clear VFMBMEM (VF Mailbox Memory)
and does not re-enable queues correctly, we should fix
this behavior.

Signed-off-by: Sebastian Basierski <sebastianx.basierski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Tony Nguyen
fabf1bce10 ixgbe: Prevent unsupported configurations with XDP
These changes address comments by Jakub Kicinski on
commit 38b7e7f8ae ("ixgbe: Do not allow LRO or MTU change with XDP").

Change the MTU check with XDP to allow any supported value and only
reject those outside of the range as opposed to rejecting any change
when XDP is active. In situations where MTU size is not supported,
return -EINVAL instead of -EPERM.
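
The revised MTU check can be sketched as a range test. The function name
and the xdp_max_mtu parameter are assumptions made for illustration; the
message does not state how the driver derives the supported limit.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Return 0 if new_mtu is acceptable, -EINVAL otherwise. With XDP
 * active, only values above the (hypothetical) xdp_max_mtu limit are
 * rejected; in-range changes are allowed rather than refused outright.
 */
static int sketch_check_mtu(bool xdp_active, int new_mtu, int xdp_max_mtu)
{
	if (xdp_active && new_mtu > xdp_max_mtu)
		return -EINVAL;
	return 0;
}
```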

Add checks when enabling SRIOV, DCB, or adding L2FW offloaded device
as they are not supported with XDP.

CC: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Jia-Ju Bai
374f78f75b ixgbe: Replace GFP_ATOMIC with GFP_KERNEL
ixgbe_fcoe_ddp_setup(), ixgbe_setup_fcoe_ddp_resources() and
ixgbe_sw_init() are never called in atomic context.
They call kmalloc(), dma_pool_alloc() and kzalloc() with GFP_ATOMIC,
which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Acked-by: Sebastian Basierski <sebastianx.basierski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Jia-Ju Bai
69a64658de igb: Replace mdelay() with msleep() in igb_integrated_phy_loopback()
igb_integrated_phy_loopback() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Jia-Ju Bai
151356270b igb: Replace GFP_ATOMIC with GFP_KERNEL in igb_sw_init()
igb_sw_init() is never called in atomic context.
It calls kzalloc() and kcalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Jesus Sanchez-Palencia
a798fbac33 igb: Use an advanced ctx descriptor for launchtime
On i210, Launchtime (TxTime) requires the usage of an "Advanced
Transmit Context Descriptor" for retrieving the timestamp of a packet.

The igb driver correctly builds such descriptor on the segmentation
flow (i.e. igb_tso()) or on the checksum one (i.e. igb_tx_csum()), but the
feature is broken for AF_PACKET if the IGB_TX_FLAGS_VLAN is not set,
which happens due to an early return on igb_tx_csum().

This flag is only set by the kernel when a VLAN interface is used,
thus we can't just rely on it. Here we are fixing this issue by checking
if launchtime is enabled for the current tx_ring before performing the
early return.

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Bo Chen
ee400a3f1b e1000: ensure to free old tx/rx rings in set_ringparam()
In 'e1000_set_ringparam()', the tx_ring and rx_ring are updated with new value
and the old tx/rx rings are freed only when the device is up. There are resource
leaks on old tx/rx rings when the device is not up. This bug is reported by COD,
a tool for testing kernel module binaries I am building.

This patch fixes the bug by always calling 'kfree()' on old tx/rx rings in
'e1000_set_ringparam()'.

Signed-off-by: Bo Chen <chenbo@pdx.edu>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Bo Chen
cf1acec008 e1000: check on netif_running() before calling e1000_up()
When the device is not up, the call to 'e1000_up()' from the error handling path
of 'e1000_set_ringparam()' causes a kernel oops with a null-pointer
dereference. The null-pointer dereference is triggered in function
'e1000_alloc_rx_buffers()' at line 'buffer_info = &rx_ring->buffer_info[i]'.

This bug was reported by COD, a tool for testing kernel module binaries I am
building. This bug was also detected by KFI from Dr. Kai Cong.

This patch fixes the bug by checking on 'netif_running()' before calling
'e1000_up()' in 'e1000_set_ringparam()'.

Signed-off-by: Bo Chen <chenbo@pdx.edu>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
YueHaibing
a9910c0886 ixgb: use dma_zalloc_coherent instead of allocator/memset
Use dma_zalloc_coherent instead of dma_alloc_coherent
followed by memset 0.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24 08:52:35 -07:00
Anirudh Venkataramanan
3968540ba6 ice: Trivial formatting fixes
1) Add missing "\n" when printing link event error message.

2) Update dev_err statement in probe.

3) Add function description for ice_clear_pf_cfg.

4) Fix coding style for ice_acquire_nvm.

5) netdev->mtu is unsigned so use %u.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 11:34:26 -07:00
Bruce Allan
43f8b22450 ice: Change struct members from bool to u8
Recent versions of checkpatch have a new warning based on a documented
preference of Linus to not use bool in structures due to wasted space and
the size of bool is implementation dependent.  For more information, see
the email thread at https://lkml.org/lkml/2017/11/21/384.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 11:32:59 -07:00
Jesse Brandeburg
dab0588fb6 ice: Fix potential return of uninitialized value
In ice_vsi_setup_[tx|rx]_rings, err is uninitialized, which can result in
a garbage value being returned to the caller. Fix that.
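
The class of bug described above shows up when a loop that assigns err
runs zero times. A minimal sketch with hypothetical names, assuming the
fix is simply to initialize err (the message does not show the change):

```c
#include <errno.h>

/* Hypothetical ring: "fail" forces an allocation error in the demo. */
struct sketch_ring_setup {
	int fail;
	int ready;
};

static int sketch_setup_one(struct sketch_ring_setup *ring)
{
	if (ring->fail)
		return -ENOMEM;
	ring->ready = 1;
	return 0;
}

static int sketch_setup_rings(struct sketch_ring_setup *rings, int n)
{
	int i, err = 0;	/* initialized: stays 0 if the loop never runs */

	for (i = 0; i < n; i++) {
		err = sketch_setup_one(&rings[i]);
		if (err)
			break;
	}
	return err;
}
```

Without the initializer, calling the function with n == 0 would return
whatever happened to be in err.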

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 11:30:38 -07:00
Anirudh Venkataramanan
c7f2c42b80 ice: Fix a few null pointer dereference issues
1) When ice_ena_msix_range() fails to reserve vectors, a devm_kfree()
   warning was seen in the error flow path. So check pf->irq_tracker
   before use in ice_clear_interrupt_scheme().

2) In ice_vsi_cfg(), check vsi->netdev before use.

3) In ice_get_link_status, check link_up before use.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 11:28:56 -07:00
Bruce Allan
3bcd7fa37f ice: Update to interrupts enabled in OICR
Remove the following interrupt causes that are not applicable or not
handled:
- PFINT_OICR_HLP_RDY_M
- PFINT_OICR_CPM_RDY_M
- PFINT_OICR_GPIO_M
- PFINT_OICR_STORM_DETECT_M

Add the following interrupt cause that's actually handled in ice_misc_intr:
- PFINT_OICR_PE_CRITERR_M

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 11:26:47 -07:00
Brett Creeley
5d8778d803 ice: Set VLAN flags correctly
In the struct ice_aqc_vsi_props the field port_vlan_flags is an
overloaded term because it is used for both port VLANs (PVLANs) and
regular VLANs. This is an issue and is very confusing especially when
dealing with VFs because normal VLANs and port VLANs are not the same.
To fix this the field was renamed to vlan_flags and all of the #define's
labeled *_PVLAN_* were renamed to *_VLAN_* if they are not specific to
port VLANs.

Also in ice_vsi_manage_vlan_stripping, set the ICE_AQ_VSI_VLAN_MODE_ALL
bit to allow the driver to add a VLAN tag to all packets it sends.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 10:02:15 -07:00
Jacob Keller
1eb43fc754 ice: Use order_base_2 to calculate higher power of 2
Currently, we use a combination of ilog2 and is_power_of_2() to
calculate the next power of 2 for the qcount. This appears to be causing
a warning on some combinations of GCC and the Linux kernel:

MODPOST 1 modules
WARNING: "____ilog2_NaN" [ice.ko] undefined!

This appears to be because GCC realizes that qcount could be zero
in some circumstances and thus attempts to link against the
intentionally undefined ____ilog2_NaN function.

The order_base_2 function is intentionally defined to return 0 when
passed 0 as an argument, and thus will be safe to use here.

This not only fixes the warning but makes the resulting code slightly
cleaner, and is really what we should have used originally.

Also update the comment to make it more clear that we are rounding up,
not just incrementing the ilog2 of qcount unconditionally.
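
The rounding behaviour can be sketched in user space as below. The name
order_base_2_sketch() is hypothetical; the kernel's real order_base_2() is
provided by <linux/log2.h> and is not reproduced here.

```c
/*
 * User-space sketch of the rounding described above: returns
 * ceil(log2(n)), with 0 and 1 both mapping to 0.
 */
static unsigned int order_base_2_sketch(unsigned long n)
{
	unsigned int order = 0;

	if (n <= 1)
		return 0;

	n--;			/* so exact powers of 2 are not rounded up */
	while (n) {
		order++;
		n >>= 1;
	}
	return order;
}
```

For example, a qcount of 5 gives 3 (rounded up to 8 queues), while a qcount
of 4 gives 2 and a qcount of 0 gives 0.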

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 09:56:28 -07:00
Anirudh Venkataramanan
3d6b640efc ice: Fix bugs in control queue processing
This patch is a consolidation of multiple bug fixes for control queue
processing.

1)  In ice_clean_adminq_subtask() remove unnecessary reads/writes to
    registers. The bits PFINT_FW_CTL, PFINT_MBX_CTL and PFINT_SB_CTL
    are not set when an interrupt arrives, which means that clearing them
    again can be omitted.

2)  Get an accurate value in "pending" by re-reading the control queue
    head register from the hardware.

3)  Fix a corner case involving lost control queue messages by checking
    for new control messages (using ice_ctrlq_pending) before exiting the
    cleanup routine.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 09:54:24 -07:00
Preethi Banala
b29bc220e2 ice: Clean control queues only when they are initialized
Clean control queues only when they are initialized. One way to
validate that basic initialization is done is to check the values of the
cq->sq.head and cq->rq.head variables, which specify the register address.
This patch adds a check to avoid a NULL pointer dereference crash when
trying to shut down an uninitialized control queue.

Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 09:51:44 -07:00
Jacob Keller
f8ba7db850 ice: Report stats for allocated queues via ethtool stats
It is not safe to have the string table for statistics change order or
size over the lifetime of a given netdevice. This is because of the
nature of the 3-step process for obtaining stats. First, user space
performs a request for the size of the strings table. Second it performs
a separate request for the strings themselves, after allocating space
for the table. Third, it requests the stats themselves, also allocating
space for the table.

If the size decreased, there is potential to see garbage data or stats
values. In the worst case, we could potentially see stats values become
mis-aligned with their strings, so that it looks like a statistic is
being reported differently than it actually is.

Even worse, if the size increased, there is potential that the strings
table or stats table was not allocated large enough and the stats code
could access and write to memory it should not, potentially resulting in
undefined behavior and system crashes.

It isn't even safe if the size always changes under the RTNL lock. This
is because the calls take place over multiple user space commands, so it
is not possible to hold the RTNL lock for the entire duration of
obtaining strings and stats. Further, not all consumers of the ethtool
API are the user space ethtool program, and it is possible that one
assumes the strings will not change (valid under the current contract),
and thus only requests the stats values when requesting stats in a loop.

Finally, it's not possible in the general case to detect when the size
changes, because it is quite possible that one value which could impact
the stat size increased, while another decreased. This would result in
the same total number of stats, but reordering them so that stats no
longer line up with the strings they belong to. Since only size changes
aren't enough, we would need some sort of hash or token to determine
when the strings no longer match. This would require extending the
ethtool stats commands, but there is no more space in the relevant
structures.

The real solution to resolve this would be to add a completely new API
for stats, probably over netlink.

In the ice driver, the only thing impacting the stats that is not
constant is the number of queues. Instead of reporting stats for each
used queue, report stats for each allocated queue. We do not change the
number of queues allocated for a given netdevice, as we pass this into
the alloc_etherdev_mq() function to set the num_tx_queues and
num_rx_queues.

This resolves the potential bugs at the slight cost of displaying many
queue statistics which will not be activated.
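
The idea of sizing the tables from allocated rather than active queues can
be sketched as follows. Everything here is hypothetical (struct sketch_vsi,
the SKETCH_* constants, sketch_stats_count()); it only illustrates why the
count stays stable, and is not the driver's implementation.

```c
#include <stddef.h>

#define SKETCH_FIXED_STATS	4	/* hypothetical count of non-queue stats */
#define SKETCH_STATS_PER_QUEUE	2	/* e.g. packets and bytes */

/* Hypothetical stand-in for the per-netdev queue bookkeeping. */
struct sketch_vsi {
	unsigned int alloc_txq;	/* fixed when the netdev is allocated */
	unsigned int alloc_rxq;
	unsigned int num_txq;	/* queues in use; may change at runtime */
	unsigned int num_rxq;
};

/* Size the tables from the allocated queues so the count never changes. */
static size_t sketch_stats_count(const struct sketch_vsi *vsi)
{
	return SKETCH_FIXED_STATS +
	       (size_t)(vsi->alloc_txq + vsi->alloc_rxq) *
	       SKETCH_STATS_PER_QUEUE;
}
```

Because num_txq and num_rxq do not enter the calculation, the size reported
in the first step still matches the buffers filled in the second and third
steps, even if the number of active queues changes in between.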

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 09:49:16 -07:00
Anirudh Venkataramanan
5ab522443b ice: Cleanup magic number
Use define for the unit size shift of the Rx LAN context descriptor base
address instead of the magic number 7.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 09:46:17 -07:00
Bruce Allan
6efa6239e7 ice: Remove unnecessary node owner check
There is already a check for owner == ICE_SCHED_NODE_OWNER_LAN at the
beginning of ice_sched_update_vsi_child_nodes. Remove the additional
check to address the static analysis tool smatch issue "warn: we tested
'owner' before and it was 'false'".

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 09:24:27 -07:00
Anirudh Venkataramanan
4381147df9 ice: Fix multiple static analyser warnings
This patch fixes the following smatch errors:

1) Fix "odd binop '0x0 & 0xc'" when performing the bitwise-and with a
   constant value of zero (ICE_AQC_GSET_RSS_LUT_TABLE_SIZE_128_FLAG).
   Remove a similar bitwise-and with 0 in ice_add_marker_act() and use the
   right mask ICE_LG_ACT_GENERIC_OFFSET_M in the expression.

2) Fix a similar issue "odd binop '0x0 & 0x1800'" in ice_req_irq_msix_misc.

3) Fix "odd binop '0x380000 & 0x7fff8'" in ice_add_marker_act(). Also, use
   a new define ICE_LG_ACT_GENERIC_OFF_RX_DESC_PROF_IDX instead of magic
   number '7'.

4) Fix warn: odd binop '0x0 & 0x18' in ice_set_dflt_vsi_ctx() by removing
   unnecessary logic to explicitly unset bits 3 and 4 in port_vlan_bits.
   These bits are unset already by the memset on ctxt->info.
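
A minimal sketch of what an "odd binop" with a zero constant looks like,
using hypothetical SKETCH_* defines rather than the driver's real ones:

```c
#include <stdint.h>

#define SKETCH_SIZE_FLAG_128	0x0u	/* a "flag" whose encoded value is zero */
#define SKETCH_SIZE_M		0xCu	/* mask covering the size field */
#define SKETCH_SIZE_S		2	/* shift of the size field */

/* Always 0: and-ing with a zero constant cannot test or set anything. */
static uint32_t sketch_odd_binop(void)
{
	return SKETCH_SIZE_FLAG_128 & SKETCH_SIZE_M;
}

/* Intended form: shift the field value into place, then apply the mask. */
static uint32_t sketch_encode_size(uint32_t size_val)
{
	return (size_val << SKETCH_SIZE_S) & SKETCH_SIZE_M;
}
```

The first helper is what a static analyser would flag; the second shows the
usual shift-then-mask pattern for encoding a field.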

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23 09:20:36 -07:00
Cong Wang
244cd96adb net_sched: remove list_head from tc_action
After commit 90b73b77d0, list_head is no longer needed.
Now we just need to convert the list iteration to array
iteration for drivers.

Fixes: 90b73b77d0 ("net: sched: change action API to use array of pointers to actions")
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-21 12:45:44 -07:00
Linus Torvalds
4e31843f68 pci-v4.19-changes
Merge tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:

 - Decode AER errors with names similar to "lspci" (Tyler Baicar)

 - Expose AER statistics in sysfs (Rajat Jain)

 - Clear AER status bits selectively based on the type of recovery (Oza
   Pawandeep)

 - Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST (Alexandru
   Gagniuc)

 - Don't clear AER status bits if we're using the "Firmware-First"
   strategy where firmware owns the registers (Alexandru Gagniuc)

 - Use sysfs_match_string() to simplify ASPM sysfs parsing (Andy
   Shevchenko)

 - Remove unnecessary includes of <linux/pci-aspm.h> (Bjorn Helgaas)

 - Defer DPC event handling to work queue (Keith Busch)

 - Use threaded IRQ for DPC bottom half (Keith Busch)

 - Print AER status while handling DPC events (Keith Busch)

 - Work around IDT switch ACS Source Validation erratum (James
   Puthukattukaran)

 - Emit diagnostics for all cases of PCIe Link downtraining (Links
   operating slower than they're capable of) (Alexandru Gagniuc)

 - Skip VFs when configuring Max Payload Size (Myron Stowe)

 - Reduce Root Port Max Payload Size if necessary when hot-adding a
   device below it (Myron Stowe)

 - Simplify SHPC existence/permission checks (Bjorn Helgaas)

 - Remove hotplug sample skeleton driver (Lukas Wunner)

 - Convert pciehp to threaded IRQ handling (Lukas Wunner)

 - Improve pciehp tolerance of missed events and initially unstable
   links (Lukas Wunner)

 - Clear spurious pciehp events on resume (Lukas Wunner)

 - Add pciehp runtime PM support, including for Thunderbolt controllers
   (Lukas Wunner)

 - Support interrupts from pciehp bridges in D3hot (Lukas Wunner)

 - Mark fall-through switch cases before enabling -Wimplicit-fallthrough
   (Gustavo A. R. Silva)

 - Move DMA-debug PCI init from arch code to PCI core (Christoph
   Hellwig)

 - Fix pci_request_irq() usage of IRQF_ONESHOT when no handler is
   supplied (Heiner Kallweit)

 - Unify PCI and DMA direction #defines (Shunyong Yang)

 - Add PCI_DEVICE_DATA() macro (Andy Shevchenko)

 - Check for VPD completion before checking for timeout (Bert Kenward)

 - Limit Netronome NFP5000 config space size to work around erratum
   (Jakub Kicinski)

 - Set IRQCHIP_ONESHOT_SAFE for PCI MSI irqchips (Heiner Kallweit)

 - Document ACPI description of PCI host bridges (Bjorn Helgaas)

 - Add "pci=disable_acs_redir=" parameter to disable ACS redirection for
   peer-to-peer DMA support (we don't have the peer-to-peer support yet;
   this is just one piece) (Logan Gunthorpe)

 - Clean up devm_of_pci_get_host_bridge_resources() resource allocation
   (Jan Kiszka)

 - Fixup resizable BARs after suspend/resume (Christian König)

 - Make "pci=earlydump" generic (Sinan Kaya)

 - Fix ROM BAR access routines to stay in bounds and check for signature
   correctly (Rex Zhu)

 - Add DMA alias quirk for Microsemi Switchtec NTB (Doug Meyer)

 - Expand documentation for pci_add_dma_alias() (Logan Gunthorpe)

 - To avoid bus errors, enable PASID only if entire path supports
   End-End TLP prefixes (Sinan Kaya)

 - Unify slot and bus reset functions and remove hotplug knowledge from
   callers (Sinan Kaya)

 - Add Function-Level Reset quirks for Intel and Samsung NVMe devices to
   fix guest reboot issues (Alex Williamson)

 - Add function 1 DMA alias quirk for Marvell 88SS9183 PCIe SSD
   Controller (Bjorn Helgaas)

 - Remove Xilinx AXI-PCIe host bridge arch dependency (Palmer Dabbelt)

 - Remove Aardvark outbound window configuration (Evan Wang)

 - Fix Aardvark bridge window sizing issue (Zachary Zhang)

 - Convert Aardvark to use pci_host_probe() to reduce code duplication
   (Thomas Petazzoni)

 - Correct the Cadence cdns_pcie_writel() signature (Alan Douglas)

 - Add Cadence support for optional generic PHYs (Alan Douglas)

 - Add Cadence power management ops (Alan Douglas)

 - Remove redundant variable from Cadence driver (Colin Ian King)

 - Add Kirin MSI support (Xiaowei Song)

 - Drop unnecessary root_bus_nr setting from exynos, imx6, keystone,
   armada8k, artpec6, designware-plat, histb, qcom, spear13xx (Shawn
   Guo)

 - Move link notification settings from DesignWare core to individual
   drivers (Gustavo Pimentel)

 - Add endpoint library MSI-X interfaces (Gustavo Pimentel)

 - Correct signature of endpoint library IRQ interfaces (Gustavo
   Pimentel)

 - Add DesignWare endpoint library MSI-X callbacks (Gustavo Pimentel)

 - Add endpoint library MSI-X test support (Gustavo Pimentel)

 - Remove unnecessary GFP_ATOMIC from Hyper-V "new child" allocation
   (Jia-Ju Bai)

 - Add more devices to Broadcom PAXC quirk (Ray Jui)

 - Work around corrupted Broadcom PAXC config space to enable SMMU and
   GICv3 ITS (Ray Jui)

 - Disable MSI parsing to work around broken Broadcom PAXC logic in some
   devices (Ray Jui)

 - Hide unconfigured functions to work around a Broadcom PAXC defect
   (Ray Jui)

 - Lower iproc log level to reduce console output during boot (Ray Jui)

 - Fix mobiveil iomem/phys_addr_t type usage (Lorenzo Pieralisi)

 - Fix mobiveil missing include file (Lorenzo Pieralisi)

 - Add mobiveil Kconfig/Makefile support (Lorenzo Pieralisi)

 - Fix mvebu I/O space remapping issues (Thomas Petazzoni)

 - Use generic pci_host_bridge in mvebu instead of ARM-specific API
   (Thomas Petazzoni)

 - Whitelist VMD devices with fast interrupt handlers to avoid sharing
   vectors with slow handlers (Keith Busch)

* tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (153 commits)
  PCI/AER: Don't clear AER bits if error handling is Firmware-First
  PCI: Limit config space size for Netronome NFP5000
  PCI/MSI: Set IRQCHIP_ONESHOT_SAFE for PCI-MSI irqchips
  PCI/VPD: Check for VPD access completion before checking for timeout
  PCI: Add PCI_DEVICE_DATA() macro to fully describe device ID entry
  PCI: Match Root Port's MPS to endpoint's MPSS as necessary
  PCI: Skip MPS logic for Virtual Functions (VFs)
  PCI: Add function 1 DMA alias quirk for Marvell 88SS9183
  PCI: Check for PCIe Link downtraining
  PCI: Add ACS Redirect disable quirk for Intel Sunrise Point
  PCI: Add device-specific ACS Redirect disable infrastructure
  PCI: Convert device-specific ACS quirks from NULL termination to ARRAY_SIZE
  PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support
  PCI: Allow specifying devices using a base bus and path of devfns
  PCI: Make specifying PCI devices in kernel parameters reusable
  PCI: Hide ACS quirk declarations inside PCI core
  PCI: Delay after FLR of Intel DC P3700 NVMe
  PCI: Disable Samsung SM961/PM961 NVMe before FLR
  PCI: Export pcie_has_flr()
  PCI: mvebu: Drop bogus comment above mvebu_pcie_map_registers()
  ...
2018-08-16 09:21:54 -07:00
Gustavo A. R. Silva
76df93b177 igbvf: netdev: Mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
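
As an illustration of the annotation, here is a sketch with a hypothetical
enum and function (not taken from igbvf). It assumes the comment form of the
marker, which GCC accepts at its default -Wimplicit-fallthrough level.

```c
/* Hypothetical link-speed codes, for illustration only. */
enum sketch_speed { SKETCH_SPEED_10, SKETCH_SPEED_100, SKETCH_SPEED_1000 };

static int sketch_sum_for_speed(enum sketch_speed speed)
{
	int sum = 0;

	switch (speed) {
	case SKETCH_SPEED_10:
		sum += 10;
		/* fall through */
	case SKETCH_SPEED_100:
		sum += 100;
		/* fall through */
	case SKETCH_SPEED_1000:
		sum += 1000;
		break;
	}
	return sum;
}
```

Each marked case deliberately continues into the next one; an unmarked case
that did the same would be reported once the warning is enabled.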

Addresses-Coverity-ID: 114801 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:20 -07:00
Gustavo A. R. Silva
eed05a094a igb: e1000_phy: Mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114800 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:20 -07:00
Gustavo A. R. Silva
b9e0e23f91 igb: e1000_82575: Mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114799 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:20 -07:00
Gustavo A. R. Silva
7e9660ff6f igb_main: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 200521 ("Missing break in switch")
Addresses-Coverity-ID: 114797 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:20 -07:00
Gustavo A. R. Silva
f7c3ca2da4 i40e_txrx: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114791 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:20 -07:00
Gustavo A. R. Silva
1e84374f1c i40e_main: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114790 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:19 -07:00
Jacob Keller
333e2f2cea i40e: fix i40e_add_queue_stats data pointer update
This function accidentally failed to update the data pointer, which
caused the reported stats to be incorrect. Additionally, statistics
which follow queue stats in the output would potentially read non-zeroed
garbage data from the ethtool buffer.

This occurred because the data double pointer was not dereferenced
before incrementing the size.

Additionally, make sure this issue is more visible by adding a WARN_ONCE
to the i40e_get_ethtool_stats function. This warning will trigger
whenever the data pointer is not at the expected address, similar to the
check that we make in the i40e_get_stat_strings() function.
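
The pointer mistake can be sketched in user space like this. The helper
sketch_add_stats() is hypothetical and only mirrors the shape of a function
that takes a u64 ** cursor into the ethtool buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy 'count' values into the buffer and advance the caller's cursor. */
static void sketch_add_stats(uint64_t **data, const uint64_t *values,
			     size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		(*data)[i] = values[i];

	/*
	 * Writing "data += count" here would only move the local copy of
	 * the double pointer, so the next helper would overwrite the
	 * values just stored. Dereference first, then advance.
	 */
	*data += count;
}
```

A caller can then compare the final cursor against the expected end of the
buffer, which is the kind of check the added WARN_ONCE performs.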

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 12:20:52 -07:00
Piotr Azarewicz
f05798b4ff i40e: Add AQ command for rearrange NVM structure
When switching from the old NVM structure (called structured NVM) to
the new one (called flat NVM), or back, the flash needs to be
rearranged into the required NVM structure. This is part of the
transition from one NVM structure to another. This patch introduces a
function that commands firmware to start the rearrangement process.

Signed-off-by: Piotr Azarewicz <piotr.azarewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 12:20:45 -07:00
Piotr Azarewicz
b2b57b2958 i40e: Add additional return code to i40e_asq_send_command
Firmware can return a busy state, in which case the function returns
I40E_ERR_NOT_READY.

Signed-off-by: Piotr Azarewicz <piotr.azarewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 12:20:38 -07:00
Jacob Keller
6e2feaa344 i40e: fix warning about shadowed ring parameter
In commit 147e81ec75 ("i40e: Test memory before ethtool alloc succeeds")
code was added to handle ring allocation on systems with low memory.

It shadowed the ring parameter pointer by introducing a local ring
pointer inside the for loop. Most of the code in the loop already just
accessed the ring via &rx_rings[i]. Since most of the code already does
this, just remove the local variable.

If someone considers it worth keeping a local around, they should use it
for the whole section instead of just a couple of accesses.

This fixes a warning when -Wshadow is enabled.
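
A sketch of the shadowing pattern, with hypothetical names (struct
sketch_ring, sketch_resize_rings()) standing in for the driver code:

```c
#include <stddef.h>

struct sketch_ring {
	unsigned int count;
};

/*
 * Illustrative only. An inner declaration such as
 *     struct sketch_ring *ring = &rx_rings[i];
 * inside the loop would hide the 'ring' parameter and trigger -Wshadow.
 * Indexing the array directly avoids the extra name.
 */
static void sketch_resize_rings(const struct sketch_ring *ring,
				struct sketch_ring *rx_rings, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		rx_rings[i].count = ring->count;
}
```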

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 12:20:15 -07:00
Jacob Keller
4d9768237c i40e: remove unnecessary i variable causing -Wshadow warning
Commit c61c8fe1d5 ("i40e: Implement an ethtool private flag to stop
LLDP in FW") added an extra for-loop which added a shadowing 'i'
variable as the index.

However, the local variable i already exists, and we already use it as
a loop index. Additionally, at this point, there is no further use of
the variable, so it's safe to simply overwrite the variable contents.

This fixes a -Wshadow warning which has started being enabled on some
distributions.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Patryk Malek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 12:20:00 -07:00
Jacob Keller
f25848d4cd i40e: convert priority flow control stats to use helpers
The priority flow control statistics are laid out in the stats structure
using arrays. This made it unwieldy to use as part of an i40e_stats
array.

Add a new structure type, i40e_pfc_stats, and a helper function
i40e_get_pfc_stats which can return the stats for a given priority
value as an i40e_pfc_stats structure.

Use this to create an i40e_stats array, which we'll use to format and
copy the strings and stats into the supplied buffers.

This reduces even more boiler plate code in i40e_get_ethtool_stats and
i40e_get_stat_strings.

An alternative would be to modify the structure definition for the pfc
stats, but this is more invasive to the rest of the code base.

Note that a macro was used to setup the copy of stats from the
pf->stats, as this reduces the chance of typos in the code names. It
will produce a checkpatch.pl warning due to re-use of a macro argument.
In this case, it should be safe, as the macro will fail to compile in
cases where the argument is not a simple structure member name, and thus
arguments with side effects should not be an issue.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 08:19:09 -07:00
Jacob Keller
1510ae0be2 i40e: convert VEB TC stats to use an i40e_stats array
The VEB TC stats are currently implemented with separate parsing,
instead of using the i40e_stats array and associated helper functions.
This is likely because the stats rely on embedding the TC number into
the stat name.

Update i40e_add_stat_strings to take variadic arguments, and use these
to vsnprintf the i40e_stats string as a string containing format
specifiers.
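
A user-space sketch of the variadic approach follows. The names
sketch_add_stat_strings() and struct sketch_stat are hypothetical, and the
buffer handling is simplified; only the use of vsnprintf with the stat name
as the format string reflects the description above.

```c
#include <stdarg.h>
#include <stdio.h>

#define SKETCH_GSTRING_LEN 32	/* same value as the kernel's ETH_GSTRING_LEN */

struct sketch_stat {
	char name[SKETCH_GSTRING_LEN];	/* may contain printf-style specifiers */
};

/* Format each stat name with the same variadic arguments. */
static void sketch_add_stat_strings(char **p, const struct sketch_stat *stats,
				    unsigned int size, ...)
{
	unsigned int i;

	for (i = 0; i < size; i++) {
		va_list args;

		/* Restart the argument list for every string. */
		va_start(args, size);
		vsnprintf(*p, SKETCH_GSTRING_LEN, stats[i].name, args);
		va_end(args);
		*p += SKETCH_GSTRING_LEN;
	}
}
```

Passing a TC number as the variadic argument lets one array of names such
as "veb.tc_%u_tx_packets" serve every traffic class.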

Create a stats array for the VEB TC related stats,
i40e_gstrings_veb_tc_stats, and use this along with the helper functions
to remove the specialized boiler plate code.

Always call i40e_add_ethtool_stats for both this array and the general
VEB stats array. This ensures that we zero out any memory in case it was
not zero-allocated for us.

This ultimately results in less boiler plate code for the
i40e_get_stat_strings and i40e_get_ethtool_stats.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 08:19:09 -07:00
Mariusz Stachura
1ac2ee231f i40e: Set fec_config when forcing link state
This patch configures FEC setting in i40e_force_link_state().
For some reason setting this field was overlooked thus causing
25G link to be configured incorrectly.

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 08:19:08 -07:00
Jacob Keller
f303048067 i40e: add helper to copy statistic values into ethtool buffer
Similar to the helper function to copy the ethtool stats strings, add
and use a helper function for copying the ethtool stats into the
supplied buffer.

Just like before, we use a macro to avoid having to pass ARRAY_SIZE
manually, so as to reduce chance of bugs.

Some of the stats, especially queue stats, are a bit trickier, and will
be handled in future patches.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 08:19:08 -07:00
Jacob Keller
91f0654461 i40e: add helper function for copying strings from stat arrays
Many of the ethtool statistics use the same basic logic for copying
strings into the supplied buffer. A set of stats are stored in a const
array of i40e_stats structures, and we apply these all together.

Simplify the stats code by introducing a helper function which can take
a stats array and copy the strings into the buffer, updating the buffer
pointer as we go.

We use a macro to implement i40e_add_stat_strings so that ARRAY_SIZE can
be used on the array passed in. This ensures that we always use the
matching size in __i40e_add_stat_strings.
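
The macro-plus-helper pattern can be sketched as below. All names are
hypothetical stand-ins (sketch_copy_strings, sketch_copy_strings_n,
SKETCH_ARRAY_SIZE); the point is only that the wrapper macro computes the
array length at the call site.

```c
#include <stdio.h>

#define SKETCH_GSTRING_LEN 32	/* same value as the kernel's ETH_GSTRING_LEN */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct sketch_stat {
	char name[SKETCH_GSTRING_LEN];
};

/* Copy 'size' stat names into the buffer, advancing the cursor. */
static void sketch_copy_strings_n(char **p, const struct sketch_stat *stats,
				  unsigned int size)
{
	unsigned int i;

	for (i = 0; i < size; i++) {
		snprintf(*p, SKETCH_GSTRING_LEN, "%s", stats[i].name);
		*p += SKETCH_GSTRING_LEN;
	}
}

/* 'stats' must be a true array so that the size can be derived from it. */
#define sketch_copy_strings(p, stats) \
	sketch_copy_strings_n(p, stats, SKETCH_ARRAY_SIZE(stats))
```

Callers never pass a length by hand, so the array and its size cannot drift
apart.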

More complex stats currently do not use i40e_stats arrays, usually due
to custom formatted strings, or because the stats are not laid out in
the expected way. These stats will be updated to use the helper function
in separate future patches.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 08:19:08 -07:00
YueHaibing
1b4b6f3a2a i40e/i40evf: remove redundant functions i40evf_aq_{set/get}_phy_register
There are no in-tree callers.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 08:19:08 -07:00
Sergey Nemov
e661414c98 i40e: Remove duplicated prepare call in i40e_shutdown
Function call to i40e_prep_for_reset() is duplicated in
i40e_shutdown routine and gets called before
i40e_enable_mc_magic_wake() which blocks it from being executed
correctly on system reboot or shutdown because adminq is already
disabled by first i40e_prep_for_reset() call.

Two register write calls are also duplicated.

Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-07 08:19:08 -07:00
Bjorn Helgaas
f5ddcf71e6 igb: Remove unnecessary include of <linux/pci-aspm.h>
The igb driver doesn't need anything provided by pci-aspm.h, so remove
the unnecessary include of it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-06 14:32:21 -05:00
Alexander Duyck
1918e937ca ixgbe: Refactor queue disable logic to take completion time into account
This change is meant to allow us to take completion time into account when
disabling queues. Previously we were just working with hard coded values
for how long we should wait. This worked fine for the standard case where
completion timeout was operating in the 50us to 50ms range, however on
platforms that have higher completion timeout times this was resulting in
Rx queues disable messages being displayed as we weren't waiting long
enough for outstanding Rx DMA completions.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Don Buchholz <donald.buchholz@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:06 -07:00
Alexander Duyck
3b5f14b50e ixgbe: Reorder Tx/Rx shutdown to reduce time needed to stop device
This change is meant to help reduce the time needed to shutdown the
transmit and receive paths for the device. Specifically what we now do
after this patch is disable the transmit path first at the netdev level,
and then work on disabling the Rx. This way while we are waiting on the Rx
queues to be disabled the Tx queues have an opportunity to drain out.

In addition I have dropped the 10ms timeout that was left in the ixgbe_down
function that seems to have been carried through from back in e1000 as far
as I can tell. We shouldn't need it since we don't actually disable the Tx
until much later and we have additional logic in place for verifying the Tx
queues have been disabled.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Don Buchholz <donald.buchholz@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:06 -07:00
Venkatesh Srinivas
73017f4e05 igb: Use dma_wmb() instead of wmb() before doorbell writes
igb writes to doorbells to post transmit and receive descriptors;
after writing descriptors to memory but before writing to doorbells,
use dma_wmb() rather than wmb(). wmb() is more heavyweight than
necessary before doorbell writes.

On x86, this avoids SFENCEs before doorbell writes in both the
tx and rx refill paths.

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:05 -07:00
Christian Grönke
2a83fba6ca igb: Remove superfluous reset to PHY and page 0 selection
This patch reverts two previously applied patches to fix an issue
that appeared when using SGMII based SFP modules. In the current
state the driver will try to reset the PHY before obtaining the
phy_addr of the SGMII attached PHY. That leads to an error in
e1000_write_phy_reg_sgmii_82575, causing the initialization to
fail:

    igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
    igb: Copyright (c) 2007-2014 Intel Corporation.
    igb: probe of ????:??:??.? failed with error -3

The patches being reverted are:

    commit 1827853354
    Author: Aaron Sierra <asierra@xes-inc.com>
    Date:   Tue Nov 29 10:03:56 2016 -0600

        igb: reset the PHY before reading the PHY ID

    commit 440aeca4b9
    Author: Matwey V Kornilov <matwey@sai.msu.ru>
    Date:   Thu Nov 24 13:32:48 2016 +0300

         igb: Explicitly select page 0 at initialization

The first reverted patch directly causes the problem mentioned above.
In case of SGMII the phy_addr is not known at this point and will
only be obtained by 'igb_get_phy_id_82575' further down in the code.
The second reverted patch forces selection of page 0 in the
PHY, something that the reset tries to address as well.

As pointed out by Alexander Duyck, the patch below fixes the same
issue but in the proper location:

    commit 4e684f59d7
    Author: Chris J Arges <christopherarges@gmail.com>
    Date:   Wed Nov 2 09:13:42 2016 -0500

        igb: Workaround for igb i210 firmware issue

Reverts: 440aeca4b9.
Reverts: 1827853354.

Signed-off-by: Christian Grönke <c.groenke@infodas.de>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:05 -07:00
Shannon Nelson
7f6cdbdafb ixgbe: add ipsec security registers into ethtool register dump
Add the ixgbe's security configuration registers into
the register dump.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:05 -07:00
Tony Nguyen
38b7e7f8ae ixgbe: Do not allow LRO or MTU change with XDP
XDP does not support jumbo frames or LRO.  These checks are being made
outside the driver when an XDP program is loaded; however, there is
nothing preventing these from changing after an XDP program is loaded.
Add the checks so that while an XDP program is loaded, the MTU cannot
be changed and LRO cannot be enabled.
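
A minimal sketch of the two guards described above, written as
standalone C rather than driver code. The demo_adapter structure, the
DEMO_F_LRO bit and both function names are hypothetical stand-ins; the
error code is an assumption and not necessarily what the driver
returns.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit standing in for the LRO feature flag. */
#define DEMO_F_LRO (1u << 0)

/* Hypothetical, reduced adapter state. */
struct demo_adapter {
	const void *xdp_prog;	/* non-NULL while an XDP program is loaded */
	unsigned int mtu;
};

/* Refuse any MTU change while an XDP program is attached. */
static int demo_change_mtu(struct demo_adapter *a, unsigned int new_mtu)
{
	if (a->xdp_prog != NULL)
		return -EPERM;	/* assumed error code */

	a->mtu = new_mtu;
	return 0;
}

/* Strip LRO from a requested feature set while XDP is attached. */
static uint32_t demo_fix_features(const struct demo_adapter *a,
				  uint32_t features)
{
	if (a->xdp_prog != NULL)
		features &= ~DEMO_F_LRO;

	return features;
}
```

Doing the filtering in a "fix features" style hook means the request
is silently corrected instead of failing, while the MTU path reports
an error to the caller.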

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-26 09:04:05 -07:00
David S. Miller
c4c5551df1 Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux
All conflicts were trivial overlapping changes, so reasonably
easy to resolve.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-20 21:17:12 -07:00
David S. Miller
2aa4a3378a Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-07-15

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Various different arm32 JIT improvements in order to optimize code emission
   and make the JIT code itself more robust, from Russell.

2) Support simultaneous driver and offloaded XDP in order to allow for advanced
   use-cases where some work is offloaded to the NIC and some to the host. Also
   add ability for bpftool to load programs and maps beyond just the cgroup case,
   from Jakub.

3) Add BPF JIT support in nfp for multiplication as well as division. For the
   latter in particular, it uses the reciprocal algorithm to emulate it, from Jiong.

4) Add BTF pretty print functionality to bpftool in plain and JSON output
   format, from Okash.

5) Add build and installation of the BPF helper man page to bpftool, from Quentin.

6) Add a TCP BPF callback for listening sockets which is triggered right after
   the socket transitions to TCP_LISTEN state, from Andrey.

7) Add a new cgroup tree command to bpftool which iterates over the whole cgroup
   tree and prints all attached programs, from Roman.

8) Improve xdp_redirect_cpu sample to support parsing of double VLAN tagged
   packets, from Jesper.
====================
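
Item 3 above mentions emulating division with a reciprocal. The
following is a minimal sketch of the general multiply-and-shift
technique for 32-bit unsigned division by an invariant divisor, in
plain C. It is not the nfp JIT code; the demo_ names are hypothetical
and the divisor is assumed to be non-zero.

```c
#include <stdint.h>

/* Precomputed constants for dividing by one fixed divisor. */
struct demo_reciprocal {
	uint32_t m;		/* magic multiplier */
	uint8_t sh1, sh2;	/* post-multiply shift amounts */
};

/* Index of the highest set bit, counting from 1; 0 when x == 0. */
static int demo_fls(uint32_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Precompute the reciprocal of d. The caller must ensure d != 0. */
static struct demo_reciprocal demo_reciprocal_value(uint32_t d)
{
	struct demo_reciprocal r;
	int l = demo_fls(d - 1);	/* ceil(log2(d)) */
	uint64_t m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;

	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l < 1 ? l : 1);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* Compute a / d with one multiplication and two shifts. */
static uint32_t demo_reciprocal_divide(uint32_t a, struct demo_reciprocal r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

The precomputation is paid once per divisor, so the approach suits
cases where the divisor is known ahead of time and the target has a
multiplier but no divide instruction.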

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-14 18:47:44 -07:00
Jakub Kicinski
6b86758973 xdp: don't make drivers report attachment mode
prog_attached of struct netdev_bpf should have been superseded
by simply setting prog_id a long time ago, but we kept it around
to allow offloading drivers to communicate attachment mode (drv
vs hw).  Subsequently drivers were also allowed to report back
attachment flags (prog_flags), and since nowadays only programs
attached with XDP_FLAGS_HW_MODE can get offloaded, we can tell
the attachment mode from the flags the driver reports.  Remove
the prog_attached member.
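
As an illustration of the reasoning above, the attachment mode can be
derived from the program id and the reported flags alone. This is a
hedged sketch, not the core networking code: the demo_ types and
names are hypothetical, and DEMO_XDP_FLAGS_HW_MODE is assumed to
mirror the value of XDP_FLAGS_HW_MODE in the uapi headers.

```c
#include <stdint.h>

/* Assumed to mirror XDP_FLAGS_HW_MODE from the uapi headers. */
#define DEMO_XDP_FLAGS_HW_MODE (1u << 3)

enum demo_xdp_mode {
	DEMO_XDP_NONE,	/* no program attached */
	DEMO_XDP_DRV,	/* program runs in the driver */
	DEMO_XDP_HW,	/* program is offloaded to the NIC */
};

/* Hypothetical, reduced form of what a driver reports on query. */
struct demo_xdp_query {
	uint32_t prog_id;	/* 0 means nothing is attached */
	uint32_t prog_flags;	/* flags the program was attached with */
};

static enum demo_xdp_mode demo_xdp_mode(const struct demo_xdp_query *q)
{
	if (q->prog_id == 0)
		return DEMO_XDP_NONE;
	if (q->prog_flags & DEMO_XDP_FLAGS_HW_MODE)
		return DEMO_XDP_HW;
	return DEMO_XDP_DRV;
}
```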

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-13 20:26:35 +02:00
Dan Carpenter
c411104115 ixgbe: Off by one in ixgbe_ipsec_tx()
The ipsec->tx_tbl[] has IXGBE_IPSEC_MAX_SA_COUNT elements so the > needs
to be changed to >= so we don't read one element beyond the end of the
array.
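
A small standalone illustration of the bounds check, assuming a
hypothetical table of DEMO_MAX_SA_COUNT entries in place of
ipsec->tx_tbl[]. With "idx > COUNT" the value idx == COUNT would slip
through and index one element past the end; ">=" rejects it.

```c
#include <stddef.h>

/* Hypothetical size; stands in for IXGBE_IPSEC_MAX_SA_COUNT. */
#define DEMO_MAX_SA_COUNT 8

struct demo_tx_sa {
	int used;
};

static struct demo_tx_sa demo_tx_tbl[DEMO_MAX_SA_COUNT];

/* Return the table entry for idx, or NULL if idx is out of range. */
static struct demo_tx_sa *demo_tx_sa_lookup(unsigned int idx)
{
	/* Valid indices are 0 .. DEMO_MAX_SA_COUNT - 1, hence ">=". */
	if (idx >= DEMO_MAX_SA_COUNT)
		return NULL;

	return &demo_tx_tbl[idx];
}
```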

Fixes: 5925947047 ("ixgbe: process the Tx ipsec offload")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-12 08:03:09 -07:00
Alexander Duyck
d14c780c11 ixgbe: Be more careful when modifying MAC filters
This change makes it so that we are much more explicit about the ordering
of updates to the receive address register (RAR) table. Prior to this patch
I believe we may have been updating the table while entries were still
active, or possibly allowing for reordering of things since we weren't
explicitly flushing writes to either the lower or upper portion of the
register prior to accessing the other half.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-12 07:17:13 -07:00
Alexander Duyck
8ec56fc3c5 net: allow fallback function to pass netdev
For most of these calls we can just pass NULL through to the fallback
function as the sb_dev. The only cases where we cannot are the cases where
we might be dealing with either an upper device or a driver that would
have configured things to support an sb_dev itself.

The only driver that has any significant change in this patch set should be
ixgbe as we can drop the redundant functionality that existed in both the
ndo_select_queue function and the fallback function that was passed through
to us.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 13:57:25 -07:00
Alexander Duyck
4f49dec907 net: allow ndo_select_queue to pass netdev
This patch makes it so that instead of passing a void pointer as the
accel_priv we instead pass a net_device pointer as sb_dev. Making this
change allows us to pass the subordinate device through to the fallback
function eventually so that we can keep the actual code in the
ndo_select_queue call as focused as possible on the exception cases.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 13:41:34 -07:00
Alexander Duyck
eadec877ce net: Add support for subordinate traffic classes to netdev_pick_tx
This change makes it so that we can support the concept of subordinate
device traffic classes to the core networking code. In doing this we can
start pulling out the driver specific bits needed to support selecting a
queue based on an upper device.

The solution as it currently stands is only partially implemented. I have
the start of some XPS bits in here, but I would still need to allow for
configuration of the XPS maps on the queues reserved for the subordinate
devices. For now I am using the reference to the sb_dev XPS map as just a
way to skip the lookup of the lower device XPS map, as that would
result in the wrong queue being picked.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 12:53:58 -07:00
Alexander Duyck
58b0b3ed4c ixgbe: Add code to populate and use macvlan TC to Tx queue map
This patch makes it so that we use the tc_to_txq mapping in the macvlan
device in order to select the Tx queue for outgoing packets.

The idea here is to move away from using ixgbe_select_queue and to
come up with a generic way to make this work for devices going forward.
Encoding this information in the netdev turns it into something that
can be reused as a generic solution for similar setups.
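
A rough sketch of selecting a Tx queue from a per-device TC to queue
range map, in standalone C. The demo_ structures only imitate the
count/offset shape of a tc_to_txq entry, and the hash scaling helper
is an assumption about how a flow hash could be folded into the
range; this is not the driver or core networking code.

```c
#include <stdint.h>

#define DEMO_MAX_TC 16

/* One contiguous range of Tx queues reserved for a traffic class. */
struct demo_tc_txq {
	uint16_t count;		/* number of queues in the range */
	uint16_t offset;	/* first queue of the range */
};

/* Hypothetical, reduced view of a subordinate (macvlan) device. */
struct demo_dev {
	struct demo_tc_txq tc_to_txq[DEMO_MAX_TC];
};

/* Scale a 32-bit value into [0, range) without a division. */
static uint32_t demo_scale(uint32_t val, uint32_t range)
{
	return (uint32_t)(((uint64_t)val * range) >> 32);
}

/*
 * Pick a queue for a flow: start at the range offset for the traffic
 * class and spread flows over the range by hash. The caller must pass
 * tc < DEMO_MAX_TC and a map entry with count > 0.
 */
static uint16_t demo_pick_txq(const struct demo_dev *sb_dev, uint8_t tc,
			      uint32_t hash)
{
	const struct demo_tc_txq *map = &sb_dev->tc_to_txq[tc];

	return (uint16_t)(map->offset + demo_scale(hash, map->count));
}
```

Because the range lives in the device structure rather than in a
driver callback, the same lookup works for any driver that fills in
the map.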

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 12:27:52 -07:00
Jesus Sanchez-Palencia
3048cf84d1 igb: Add support for ETF offload
Implement HW offload support for SO_TXTIME through igb's Launchtime
feature. This is done by extending igb_setup_tc() so it supports
TC_SETUP_QDISC_ETF and configuring i210 so time based transmit
arbitration is enabled.

The FQTSS transmission mode added before is extended so strict
priority (SP) queues wait for stream reservation (SR) ones.
igb_config_tx_modes() is extended so it can support enabling/disabling
Launchtime following the previous approach used for the credit-based
shaper (CBS).

As in the previous flow, FQTSS transmission mode is enabled automatically
by the driver once Launchtime (or CBS, as before) is enabled.
Similarly, it's automatically disabled when the feature is disabled
for the last queue that had it set up.

The driver just consumes the transmit times from the skbuffs directly,
so no special handling is done in case an 'invalid' time is provided.
We assume this has been handled by the ETF qdisc already.

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04 22:30:28 +09:00
Jesus Sanchez-Palencia
1b9231e7e1 igb: Only call skb_tx_timestamp after descriptors are ready
Currently, skb_tx_timestamp() is being called before the Tx
descriptors are prepared in igb_xmit_frame_ring(), which happens
during either the igb_tso() or igb_tx_csum() calls.

Given that now the skb->tstamp might be used to carry the timestamp
for SO_TXTIME, we must only call skb_tx_timestamp() after the
information has been copied into the Tx descriptors.

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04 22:30:28 +09:00
Jesus Sanchez-Palencia
8080e6ab4e igb: Refactor igb_offload_cbs()
Split code into a separate function (igb_offload_apply()) that will be
used by ETF offload implementation.

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04 22:30:28 +09:00
Jesus Sanchez-Palencia
0364a0d0e7 igb: Only change Tx arbitration when CBS is on
Currently the data transmission arbitration algorithm - DataTranARB
field on TQAVCTRL reg - is always set to CBS when the Tx mode is
changed from legacy to 'Qav' mode.

Make that configuration a bit more granular in preparation for the
upcoming Launchtime enabling patches, since CBS and Launchtime can be
enabled separately. That is achieved by moving the DataTranARB setup
to igb_config_tx_modes() instead.

Similarly, when disabling CBS we must check if it has been disabled
for all queues, and clear the DataTranARB accordingly.
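
The "disabled for all queues" check can be pictured as below. This is
a standalone sketch under stated assumptions: the demo_ring structure,
the DEMO_ARB_CBS bit and both helpers are hypothetical and do not
reflect the real TQAVCTRL register layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit standing in for the DataTranARB field. */
#define DEMO_ARB_CBS (1u << 0)

/* Hypothetical, reduced per-queue state. */
struct demo_ring {
	bool cbs_enable;
};

/* True if at least one queue still has CBS enabled. */
static bool demo_any_cbs_enabled(const struct demo_ring *rings, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (rings[i].cbs_enable)
			return true;
	}
	return false;
}

/*
 * Compute the new register value: keep CBS arbitration while any
 * queue uses CBS, and clear it once the last queue has it disabled.
 */
static uint32_t demo_update_arb(uint32_t reg, const struct demo_ring *rings,
				size_t n)
{
	if (demo_any_cbs_enabled(rings, n))
		return reg | DEMO_ARB_CBS;

	return reg & ~DEMO_ARB_CBS;
}
```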

Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04 22:30:28 +09:00