Make the linkage between the DSA user port ethtool_ops :: get_ts_info
and the implementation from the Ocelot switch library.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250116104628.123555-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
drivers/net/ethernet/broadcom/bnxt/bnxt.h
c948c0973d ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops")
f2878cdeb7 ("bnxt_en: Add support to call FW to update a VNIC")
Link: https://patch.msgid.link/20240822210125.1542769-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use scoped for_each_available_child_of_node_scoped() when iterating over
device nodes to make code a bit simpler.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://patch.msgid.link/20240820074805.680674-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a major design bug with ocelot-8021q, which is that it expects
more of the hardware than the hardware can actually do. The short
summary of the issue is that when a port is under a VLAN-aware bridge
and we use this tagging protocol, VLAN upper interfaces of this port do
not see RX traffic.
We use VCAP ES0 (egress rewriter) rules towards the tag_8021q CPU port
to encapsulate packets with an outer tag, later stripped by software,
that depends on the source user port. We do this so that packets can be
identified in ocelot_rcv(). To be precise, we create rules with
push_outer_tag = OCELOT_ES0_TAG and push_inner_tag = 0.
With this configuration, we expect the switch to keep the inner tag
configuration as found in the packet (if it was untagged on user port
ingress, keep it untagged, otherwise preserve the VLAN tag unmodified
as the inner tag towards the tag_8021q CPU port). But this is not what
happens.
Instead, table "Tagging Combinations" from the user manual suggests
that when the ES0 action is "PUSH_OUTER_TAG=1 and PUSH_INNER_TAG=0",
there will be "no inner tag". Experimentation further clarifies what
this means.
It appears that this "inner tag" which is not pushed into the packet on
its egress towards the CPU is none other than the classified VLAN.
When the ingress user port is standalone or under a VLAN-unaware bridge,
the classified VLAN is a discardable quantity: it is a fixed value - the
result of ocelot_vlan_unaware_pvid()'s configuration, and actually
independent of the VID from any 802.1Q header that may be in the frame.
It is actually preferable to discard the "inner tag" in this case.
The problem is when the ingress port is under a VLAN-aware bridge.
Then, the classified VLAN is taken from the frame's 802.1Q header, with
a fallback on the bridge port's PVID. It would be very good to not
discard the "inner tag" here, because if we do, we break communication
with any 8021q VLAN uppers that the port might have. These have a
processing path outside the bridge.
There seems to be nothing else we can do except to change the
configuration for VCAP ES0 rules, to actually push the inner VLAN into
the frame. There are 2 options for that, first is to push a fixed value
specified in the rule, and second is to push a fixed value, plus
(aka arithmetic +) the classified VLAN. We choose the second option,
and we select that fixed value as 0. Thus, what is pushed in the inner
tag is just the classified VLAN.
From there, we need to perform software untagging, in the receive path,
of stuff that was untagged on the wire.
Fixes: 7c83a7c539 ("net: dsa: add a second tagger for Ocelot switches based on tag_8021q")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As explained by Horatiu Vultur in commit 603ead9658 ("net: sparx5: Add
spinlock for frame transmission from CPU") which is for a similar
hardware design, multiple CPUs can simultaneously perform injection
or extraction. There are only 2 register groups for injection and 2
for extraction, and the driver only uses one of each. So we'd better
serialize access using spin locks, otherwise frame corruption is
possible.
Note that unlike in sparx5, FDMA in ocelot does not have this issue
because struct ocelot_fdma_tx_ring already contains an xmit_lock.
I guess this is mostly a problem for NXP LS1028A, as that is dual core.
I don't think VSC7514 is. So I'm blaming the commit where LS1028A (aka
the felix DSA driver) started using register-based packet injection and
extraction.
Fixes: 0a6f17c6ae ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In prevision to add new UAPI for hwtstamp we will be limited to the struct
ethtool_ts_info that is currently passed in fixed binary format through the
ETHTOOL_GET_TS_INFO ethtool ioctl. It would be good if new kernel code
already started operating on an extensible kernel variant of that
structure, similar in concept to struct kernel_hwtstamp_config vs struct
hwtstamp_config.
Since struct ethtool_ts_info is in include/uapi/linux/ethtool.h, here
we introduce the kernel-only structure in include/linux/ethtool.h.
The manual copy is then made in the function called by ETHTOOL_GET_TS_INFO.
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-6-b5317f50df2a@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now that the common felix_register_switch() from the umbrella driver
is the only entity that accesses these data structures, we can remove
them from the list of the exported symbols.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Sai Krishna <saikrishnag@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King suggested that felix_vsc9959, seville_vsc9953 and
ocelot_ext have a large portion of duplicated init code, which could be
made common [1].
[1]: https://lore.kernel.org/all/Zh1GvcOTXqb7CpQt@shell.armlinux.org.uk/
Here, we take the following common steps:
- "felix" and "ds" structure allocation
- "felix", "ocelot" and "ds" basic structure initialization
- dsa_register_switch() call
and we make a common function out of them.
For every driver except felix_vsc9959, this is also the entire probing
procedure. For felix_vsc9959, we also need to do some PCI-specific
stuff, which can easily be reordered to be done before, and unwound on
failure.
We also have to convert the bus-specific platform_set_drvdata() and
pci_set_drvdata() calls into dev_set_drvdata(). But this should have no
impact on the behavior.
Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current placement of devm_request_threaded_irq() is inconvenient.
It is between the allocation of the "felix" structure and
dsa_register_switch(), both of which we'd like to refactor into a
function that's common for all switches. But the IRQ is specific to
felix_vsc9959.
A closer inspection of the felix_irq_handler() code suggests that
it does things that depend on the data structures having been fully
initialized. For example, ocelot_get_txtstamp() takes
&port->tx_skbs.lock, which has only been initialized in
ocelot_init_port() which has not run yet.
It is not one of those IRQF_SHARED IRQs, so CONFIG_DEBUG_SHIRQ_FIXME
shouldn't apply here, and thus, it doesn't really matter, because in
practice, the IRQ will not be triggered so early. Nonetheless, it is a
good practice for the driver to be prepared for it to fire as soon as it
is requested.
Create a new felix->info method for running custom code for vsc9959 from
within felix_setup(), and move the request_irq() call there. The
ocelot_ext should have an IRQ as well, so this should be a step in the
right direction for that model (VSC7512) as well.
Some minor changes are made while moving the code. Casts from void *
aren't necessary, so drop them, and rename felix_irq_handler() to the
more specific vsc9959_irq_handler().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert felix to provide its own phylink MAC operations, thus
avoiding the shim layer in DSA's port.c.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/E1sByYA-00EM0y-Jn@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use more inclusive terms throughout the DSA subsystem by moving away
from "master" which is replaced by "conduit" and "slave" which is
replaced by "user". No functional changes.
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20231023181729.1191071-2-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the tagging protocol in current use is "ocelot-8021q" and we unbind
the driver, we see this splat:
$ echo '0000:00:00.2' > /sys/bus/pci/drivers/fsl_enetc/unbind
mscc_felix 0000:00:00.5 swp0: left promiscuous mode
sja1105 spi2.0: Link is Down
DSA: tree 1 torn down
mscc_felix 0000:00:00.5 swp2: left promiscuous mode
sja1105 spi2.2: Link is Down
DSA: tree 3 torn down
fsl_enetc 0000:00:00.2 eno2: left promiscuous mode
mscc_felix 0000:00:00.5: Link is Down
------------[ cut here ]------------
RTNL: assertion failed at net/dsa/tag_8021q.c (409)
WARNING: CPU: 1 PID: 329 at net/dsa/tag_8021q.c:409 dsa_tag_8021q_unregister+0x12c/0x1a0
Modules linked in:
CPU: 1 PID: 329 Comm: bash Not tainted 6.5.0-rc3+ #771
pc : dsa_tag_8021q_unregister+0x12c/0x1a0
lr : dsa_tag_8021q_unregister+0x12c/0x1a0
Call trace:
dsa_tag_8021q_unregister+0x12c/0x1a0
felix_tag_8021q_teardown+0x130/0x150
felix_teardown+0x3c/0xd8
dsa_tree_teardown_switches+0xbc/0xe0
dsa_unregister_switch+0x168/0x260
felix_pci_remove+0x30/0x60
pci_device_remove+0x4c/0x100
device_release_driver_internal+0x188/0x288
device_links_unbind_consumers+0xfc/0x138
device_release_driver_internal+0xe0/0x288
device_driver_detach+0x24/0x38
unbind_store+0xd8/0x108
drv_attr_store+0x30/0x50
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------
RTNL: assertion failed at net/8021q/vlan_core.c (376)
WARNING: CPU: 1 PID: 329 at net/8021q/vlan_core.c:376 vlan_vid_del+0x1b8/0x1f0
CPU: 1 PID: 329 Comm: bash Tainted: G W 6.5.0-rc3+ #771
pc : vlan_vid_del+0x1b8/0x1f0
lr : vlan_vid_del+0x1b8/0x1f0
dsa_tag_8021q_unregister+0x8c/0x1a0
felix_tag_8021q_teardown+0x130/0x150
felix_teardown+0x3c/0xd8
dsa_tree_teardown_switches+0xbc/0xe0
dsa_unregister_switch+0x168/0x260
felix_pci_remove+0x30/0x60
pci_device_remove+0x4c/0x100
device_release_driver_internal+0x188/0x288
device_links_unbind_consumers+0xfc/0x138
device_release_driver_internal+0xe0/0x288
device_driver_detach+0x24/0x38
unbind_store+0xd8/0x108
drv_attr_store+0x30/0x50
DSA: tree 0 torn down
This was somewhat not so easy to spot, because "ocelot-8021q" is not the
default tagging protocol, and thus, not everyone who tests the unbinding
path may have switched to it beforehand. The default
felix_tag_npi_teardown() does not require rtnl_lock() to be held.
Fixes: 7c83a7c539 ("net: dsa: add a second tagger for Ocelot switches based on tag_8021q")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230803134253.2711124-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since DSA no longer marks anything as phylink-legacy, there is now no
need for DSA drivers to set this member to false. Remove all instances
of this.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Remove unnecessary of_node_put from the continue path to prevent
child node from being released twice, which could avoid resource
leak or other unexpected issues.
Signed-off-by: Lu Hongfei <luhongfei@vivo.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Fixes: de879a016a ("net: dsa: felix: add functionality when not all ports are supported")
Link: https://lore.kernel.org/r/20230710031859.36784-1-luhongfei@vivo.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In a future change we will need to make
ocelot_port_update_active_preemptible_tcs() call
vsc9959_tas_guard_bands_update(), but that is currently not possible,
since the ocelot switch lib does not have access to functions private to
the DSA wrapper.
Move the pointer to vsc9959_tas_guard_bands_update() from felix->info
(which is private to the DSA driver) to ocelot->ops (which is also
visible to the ocelot switch lib).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Message-ID: <20230705104422.49025-3-vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In a future commit we will have to call vsc9959_tas_guard_bands_update()
from ocelot_port_update_active_preemptible_tcs(), and that will be
impossible due to the AB/BA locking dependencies between
ocelot->tas_lock and ocelot->fwd_domain_lock.
Just like we did in commit 3ff468ef98 ("net: mscc: ocelot: remove
struct ocelot_mm_state :: lock"), the only solution is to expand the
scope of ocelot->fwd_domain_lock for it to also serialize changes made
to the Time-Aware Shaper, because those will have to result in a
recalculation of cut-through TCs, which is something that depends on the
forwarding domain.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Message-ID: <20230705104422.49025-2-vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver implements a workaround for the fact that it doesn't have an
IRQ source to tell it whether PTP frames are available through the
extraction registers, for those frames to be processed and passed
towards the network stack. That workaround is to configure the switch,
through felix_hwtstamp_set() -> felix_update_trapping_destinations(),
to create two copies of PTP packets: one sent over Ethernet to the DSA
master, and one to be consumed through the aforementioned CPU extraction
queue registers.
The reason why we want PTP packets to be consumed through the CPU
extraction registers in the first place is because we want to see their
hardware RX timestamp. With tag_8021q, that is only visible that way,
and it isn't visible with the copy of the packet that's transmitted over
Ethernet.
The problem with the workaround implementation is that it drops the
packet received over Ethernet, in expectation of its copy being present
in the CPU extraction registers. However, if felix_hwtstamp_set() hasn't
run (aka PTP RX timestamping is disabled), the driver will drop the
original PTP frame and there will be no copy of it in the CPU extraction
registers. So, the network stack will simply not see any PTP frame.
Look at the port's trapping configuration to see whether the driver has
previously enabled the CPU extraction registers. If it hasn't, just
don't RX timestamp the frame and let it be passed up the stack by DSA,
which is perfectly fine.
Fixes: 0a6f17c6ae ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
That comment was written prior to knowing that what I was actually
seeing was a manifestation of the bug fixed in commit b4024c9e5c
("felix: Fix initialization of ioremap resources").
There isn't any particular reason now why the hardware initialization is
done in felix_setup(), so just delete that comment to avoid spreading
misinformation.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ports for Ocelot devices (VSC7511, VSC7512, VSC7513 and VSC7514) support
external phys. When external phys are used, additional configuration on
each port is required to enable QSGMII mode and set external phy modes.
Add a configurable hook into these routines, so the external ports can be
used.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a user of the Felix driver has a port running in SGMII / QSGMII mode, it
will need to utilize phylink_mac_config(). Add this configurability.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VSC7512 and VSC7514 have internal PLLs that can be used to control
different peripherals. Initialize these high speed I/O (HSIO) PLLs when
they exist, so that dependent peripherals like QSGMII can function.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the Felix driver would probe the ports and verify functionality, it
would fail if it hit single port mode that wasn't supported by the driver.
The initial case for the VSC7512 driver will have physical ports that
exist, but aren't supported by the driver implementation. Add the
OCELOT_PORT_MODE_NONE macro to handle this scenario, and allow the Felix
driver to continue with all the ports that are currently functional.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The architecture around the VSC7512 differs from existing felix drivers. In
order to add support for all the chip's features (pinctrl, MDIO, gpio) the
device had to be laid out as a multi-function device (MFD).
One difference between an MFD and a standard platform device is that the
regmaps are allocated to the parent device before the child devices are
probed. As such, there is no need for felix to initialize new regmaps in
these configurations, they can simply be requested from the parent device.
Add support for MFD configurations by performing this request from the
parent device.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The define FELIX_MAC_QUIRKS was used directly in the felix.c shared driver.
Other devices (VSC7512 for example) don't require the same quirks, so they
need to be configured on a per-device basis.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> # regression
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The build system currently complains:
scripts/Makefile.build:252: drivers/net/dsa/ocelot/Makefile:
felix.o is added to multiple modules: mscc_felix mscc_seville
Since felix.c holds the DSA glue layer, create a mscc_felix_dsa_lib.ko.
This is similar to how mscc_ocelot_switch_lib.ko holds a library for
configuring the hardware.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20230125145716.271355-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Felix (VSC9959) has a DEV_GMII:MM_CONFIG block composed of 2 registers
(ENABLE_CONFIG and VERIF_CONFIG). Because the MAC Merge statistics and
pMAC statistics are already in the Ocelot switch lib even if just Felix
supports them, I'm adding support for the whole MAC Merge layer in the
common Ocelot library too.
There is an interrupt (shared with the PTP interrupt) which signals
changes to the MM verification state. This is done because the
preemptible traffic classes should be committed to hardware only once
the verification procedure has declared the link partner of being
capable of receiving preemptible frames.
We implement ethtool getters and setters for the MAC Merge layer state.
The "TX enabled" and "verify status" are taken from the IRQ handler,
using a mutex to ensure serialized access.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Felix VSC9959 switch supports frame preemption and has a MAC Merge
layer. In addition to the structured stats that exist for the eMAC,
export the counters associated with its pMAC (pause, RMON, MAC, PHY,
control) plus the high-level MAC Merge layer stats. The unstructured
ethtool counters, as well as the rtnl_link_stats64 were left to report
only the eMAC counters.
Because statistics processing is quite self-contained in ocelot_stats.c
now, I've opted for introducing an ocelot->mm_supported bool, based on
which the common switch lib does everything, rather than pushing the
TSN-specific code in felix_vsc9959.c, as happens for other TSN stuff.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ever since commit 4d1d157fb6 ("net: mscc: ocelot: share the common stat
definitions between all drivers") the stats_layout entry in ocelot and
felix drivers have become redundant. Remove the unnecessary code.
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Drop the custom implementation of phylink_validate() in favor of the
generic one, which requires config->mac_capabilities to be set.
This was used up until now because of the possibility of being paired
with Aquantia PHYs with support for rate matching. The phylink framework
gained generic support for these, and knows to advertise all 10/100/1000
lower speed link modes when our SERDES protocol is 2500base-x
(fixed speed).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Existing felix DSA drivers (vsc9959, vsc9953) are all switches that were
integrated in NXP SoCs, which makes them a bit unusual compared to the
usual Microchip branded Ocelot switches.
To be precise, looking at
Documentation/devicetree/bindings/net/mscc,vsc7514-switch.yaml, one can
see 21 memory regions for the "switch" node, and these correspond to the
"targets" of the switch IP, which are spread throughout the guts of that
SoC's memory space.
In NXP integrations, those targets still exist, but they were condensed
within a single memory region, with no other peripheral in between them,
so it made more sense for the driver to ioremap the entire memory space
of the switch, and then find the targets within that memory space via
some offsets hardcoded in the driver.
The effect of this design decision is that now, the felix driver expects
hardware instantiations to provide their own resource definitions, which
is kind of odd when considering a typical device (those are retrieved
from 'reg' properties in the device tree, using platform_get_resource()
or similar).
Allow other hardware instantiations that share the felix driver to not
provide a hardcoded array of resources in the future. Instead, make the
common denominator based on which regmaps are created be just the
resource "names". Each instantiation comes with its own array of names
that are mandatory for it, and with an optional array of resources.
So we split the resources in 2 arrays, one is what's requested and the
other is what's provided. There is one pool of provided resources, in
felix->info->resources (of length felix->info->num_resources). There are
2 different ways of requesting a resource. One is by enum ocelot_target
(this handles the global regmaps), and one is by int port (this handles
the per-port ones).
For the existing vsc9959 and vsc9953, it would be a bit stupid to
request something that's not provided, given that the 2 arrays are both
defined in the same place.
The advantage is that we can now modify felix_request_regmap_by_name()
to make felix->info->resources[] optional, and if absent, the
implementation can call dev_get_regmap() and this is something that is
compatible with MFD.
Co-developed-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use less verbose resource definitions in vsc9959 and vsc9953. This also
sets IORESOURCE_MEM in the constant array of resources, so we don't have
to do this from felix_init_structs() - in fact, in the future, we may
even support IORESOURCE_REG resources.
Note that this macro takes start and length as argument, and we had
start and end before. So transform end into length.
While at it, sort the resources according to their offset.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It turns out that the idea of having a customizable implementation of a
regmap creation from a resource is not exactly useful. The idea was for
the new MFD-based VSC7512 driver to use something that creates a SPI
regmap from a resource. But there are problems in actually getting those
resources (it involves getting them from MFD).
To avoid all that, we'll be getting resources by name, so this custom
init_regmap() method won't be needed. Remove it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Changing the DSA master means different things depending on the tagging
protocol in use.
For NPI mode ("ocelot" and "seville"), there is a single port which can
be configured as NPI, but DSA only permits changing the CPU port
affinity of user ports one by one. So changing a user port to a
different NPI port globally changes what the NPI port is, and breaks the
user ports still using the old one.
To address this while still permitting the change of the NPI port,
require that the user ports which are still affine to the old NPI port
are down, and cannot be brought up until they are all affine to the same
NPI port.
The tag_8021q mode ("ocelot-8021q") is more flexible, in that each user
port can be freely assigned to one CPU port or to the other. This works
by filtering host addresses towards both tag_8021q CPU ports, and then
restricting the forwarding from a certain user port only to one of the
two tag_8021q CPU ports.
Additionally, the 2 tag_8021q CPU ports can be placed in a LAG. This
works by enabling forwarding via PGID_SRC from a certain user port
towards the logical port ID containing both tag_8021q CPU ports, but
then restricting forwarding per packet, via the LAG hash codes in
PGID_AGGR, to either one or the other.
When we change the DSA master to a LAG device, DSA guarantees us that
the LAG has at least one lower interface as a physical DSA master.
But DSA masters can come and go as lowers of that LAG, and
ds->ops->port_change_master() will not get called, because the DSA
master is still the same (the LAG). So we need to hook into the
ds->ops->port_lag_{join,leave} calls on the CPU ports and update the
logical port ID of the LAG that user ports are assigned to.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Drivers could refuse to offload a LAG configuration for a variety of
reasons, mainly having to do with its TX type. Additionally, since DSA
masters may now also be LAG interfaces, and this will translate into a
call to port_lag_join on the CPU ports, there may be extra restrictions
there. Propagate the netlink extack to this DSA method in order for
drivers to give a meaningful error message back to the user.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
DSA is integrated with the new standardized ethtool -S --groups option,
but the felix driver only exports unstructured statistics.
Reuse the array of 64-bit statistics collected by ocelot_check_stats_work(),
but just export select values from it.
Since ocelot_check_stats_work() runs periodically to avoid 32-bit
overflow, and the ethtool calling context is sleepable, we update the
64-bit stats one more time, to provide up-to-date values. The locking
scheme with a mutex followed by a spinlock is a bit hard to digest, so
we create and use a ocelot_port_stats_run() helper with a callback that
populates the ethool stats group the caller is interested in.
The exported stats are:
ethtool -S swp0 --groups eth-phy
ethtool -S swp0 --groups eth-mac
ethtool -S swp0 --groups eth-ctrl
ethtool -S swp0 --groups rmon
ethtool --include-statistics --show-pause swp0
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the logic from the ocelot switchdev driver's ocelot_get_stats64()
method to the common switch lib and reuse it for the DSA driver.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a partial revert of commit c295f9831f ("net: mscc: ocelot:
switch from {,un}set to {,un}assign for tag_8021q CPU ports"), because
as it turns out, this isn't how tag_8021q CPU ports under a LAG are
supposed to work.
Under that scenario, all user ports are "assigned" to the single
tag_8021q CPU port represented by the logical port corresponding to the
bonding interface. So one CPU port in a LAG would have is_dsa_8021q_cpu
set to true (the one whose physical port ID is equal to the logical port
ID), and the other one to false.
In turn, this makes 2 undesirable things happen:
(1) PGID_CPU contains only the first physical CPU port, rather than both
(2) only the first CPU port will be added to the private VLANs used by
ocelot for VLAN-unaware bridging
To make the driver behave in the same way for both bonded CPU ports, we
need to bring back the old concept of setting up a port as a tag_8021q
CPU port, and this is what deals with VLAN membership and PGID_CPU
updating. But we also need the CPU port "assignment" (the user to CPU
port affinity), and this is what updates the PGID_SRC forwarding rules.
All DSA CPU ports are statically configured for tag_8021q mode when the
tagging protocol is changed to ocelot-8021q. User ports are "assigned"
to one CPU port or the other dynamically (this will be handled by a
future change).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The way in which dsa_tree_change_tag_proto() works is that when
dsa_tree_notify() fails, it doesn't know whether the operation failed
mid way in a multi-switch tree, or it failed for a single-switch tree.
So even though drivers need to fail cleanly in
ds->ops->change_tag_protocol(), DSA will still call dsa_tree_notify()
again, to restore the old tag protocol for potential switches in the
tree where the change did succeeed (before failing for others).
This means for the felix driver that if we report an error in
felix_change_tag_protocol(), we'll get another call where proto_ops ==
old_proto_ops. If we proceed to act upon that, we may do unexpected
things. For example, we will call dsa_tag_8021q_register() twice in a
row, without any dsa_tag_8021q_unregister() in between. Then we will
actually call dsa_tag_8021q_unregister() via old_proto_ops->teardown,
which (if it manages to run at all, after walking through corrupted data
structures) will leave the ports inoperational anyway.
The bug can be readily reproduced if we force an error while in
tag_8021q mode; this crashes the kernel.
echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
echo edsa > /sys/class/net/eno2/dsa/tagging # -EPROTONOSUPPORT
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
Call trace:
vcap_entry_get+0x24/0x124
ocelot_vcap_filter_del+0x198/0x270
felix_tag_8021q_vlan_del+0xd4/0x21c
dsa_switch_tag_8021q_vlan_del+0x168/0x2cc
dsa_switch_event+0x68/0x1170
dsa_tree_notify+0x14/0x34
dsa_port_tag_8021q_vlan_del+0x84/0x110
dsa_tag_8021q_unregister+0x15c/0x1c0
felix_tag_8021q_teardown+0x16c/0x180
felix_change_tag_protocol+0x1bc/0x230
dsa_switch_event+0x14c/0x1170
dsa_tree_change_tag_proto+0x118/0x1c0
Fixes: 7a29d220f4 ("net: dsa: felix: reimplement tagging protocol change with function pointers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220808125127.3344094-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, sending a packet into a time gate too small for it (or always
closed) causes the queue system to hold the frame forever. Even worse,
this frame isn't subject to aging either, because for that to happen, it
needs to be scheduled for transmission in the first place. But the frame
will consume buffer memory and frame references while it is forever held
in the queue system.
Before commit a4ae997adc ("net: mscc: ocelot: initialize watermarks to
sane defaults"), this behavior was somewhat subtle, as the switch had a
more intricately tuned default watermark configuration out of reset,
which did not allow any single port and tc to consume the entire switch
buffer space. Nonetheless, the held frames are still there, and they
reduce the total backplane capacity of the switch.
However, after the aforementioned commit, the behavior can be very
clearly seen, since we deliberately allow each {port, tc} to consume the
entire shared buffer of the switch minus the reservations (and we
disable all reservations by default). That is to say, we allow a
permanently closed tc-taprio gate to hang the entire switch.
A careful inspection of the documentation shows that the QSYS:Q_MAX_SDU
per-port-tc registers serve 2 purposes: one is for guard band calculation
(when zero, this falls back to QSYS:PORT_MAX_SDU), and the other is to
enable oversized frame dropping (when non-zero).
Currently the QSYS:Q_MAX_SDU registers are all zero, so oversized frame
dropping is disabled. The goal of the change is to enable it seamlessly.
For that, we need to hook into the MTU change, tc-taprio change, and
port link speed change procedures, since we depend on these variables.
Frames are not dropped on egress due to a queue system oversize
condition, instead that egress port is simply excluded from the mask of
valid destination ports for the packet. If there are no destination
ports at all, the ingress counter that increments is the generic
"drop_tail" in ethtool -S.
The issue exists in various forms since the tc-taprio offload was introduced.
Fixes: de143c0e27 ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Reported-by: Richie Pearn <richard.pearn@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Update the VCAP filters to support multiple tag_8021q CPU ports.
TX works using a filter for VLAN ID on the ingress of the CPU port, with
a redirect and a VLAN pop action. This can be updated trivially by
amending the ingress port mask of this rule to match on all tag_8021q
CPU ports.
RX works using a filter for ingress port on the egress of the CPU port,
with a VLAN push action. Here we need to replicate these filters for
each tag_8021q CPU port, and let them all have the same action.
This means that the OCELOT_VCAP_ES0_TAG_8021Q_RXVLAN() cookie needs to
encode a unique value for every {user port, CPU port} pair it's given.
Do this by encoding the CPU port in the upper 16 bits of the cookie, and
the user port in the lower 16 bits.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a desire for the felix driver to gain support for multiple
tag_8021q CPU ports, but the current model prevents it.
This is because ocelot_apply_bridge_fwd_mask() only takes into
consideration whether a port is a tag_8021q CPU port, but not whose CPU
port it is.
We need a model where we can have a direct affinity between an ocelot
port and a tag_8021q CPU port. This serves as the basis for multiple CPU
ports.
Declare a "dsa_8021q_cpu" backpointer in struct ocelot_port which
encodes that affinity. Repurpose the "ocelot_set_dsa_8021q_cpu" API to
"ocelot_assign_dsa_8021q_cpu" to express the change of paradigm.
Note that this change makes the first practical use of the new
ocelot_port->index field in ocelot_port_unassign_dsa_8021q_cpu(), where
we need to remove the old tag_8021q CPU port from the reserved VLAN range.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Absorb the final details of calling ocelot_port_{,un}set_dsa_8021q_cpu(),
i.e. the need to lock &ocelot->fwd_domain_lock, into the callee, to
simplify the caller and permit easier code reuse later.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add more logic to ocelot_port_{,un}set_dsa_8021q_cpu() from the ocelot
switch lib by encapsulating the ocelot_apply_bridge_fwd_mask() call that
felix used to have.
This is necessary because the CPU port change procedure will also need
to do this, and it's good to reduce code duplication by having an entry
point in the ocelot switch lib that does all that is needed.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PGID_CPU must be updated every time a port is configured or unconfigured
as a tag_8021q CPU port. The ocelot switch lib already has a hook for
that operation, so move the updating of PGID_CPU to those hooks.
These bits are pretty specific to DSA, so normally I would keep them out
of the common switch lib, but when tag_8021q is in use, this has
implications upon the forwarding mask determined by
ocelot_apply_bridge_fwd_mask() and called extensively by the switch lib.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PGID_BC is configured statically by ocelot_init() to flood towards the
CPU port module, and dynamically by ocelot_port_set_bcast_flood()
towards all user ports.
When the tagging protocol changes, the intention is to turn off flooding
towards the old pipe towards the host, and to turn it on towards the new
pipe.
Due to a recent change which removed the adjustment of PGID_BC from
felix_set_host_flood(), 3 things happen.
- when we change from NPI to tag_8021q mode: in this mode, the CPU port
module is accessed via registers, and used to read PTP packets with
timestamps. We fail to disable broadcast flooding towards the CPU port
module, and to enable broadcast flooding towards the physical port
that serves as a DSA tag_8021q CPU port.
- from tag_8021q to NPI mode: in this mode, the CPU port module is
redirected to a physical port. We fail to disable broadcast flooding
towards the physical tag_8021q CPU port, and to enable it towards the
CPU port module at ocelot->num_phys_ports.
- when the ports are put in promiscuous mode, we also fail to update
PGID_BC towards the host pipe of the current protocol.
First issue means that felix_check_xtr_pkt() has to do extra work,
because it will not see only PTP packets, but also broadcasts. It needs
to dequeue these packets just to drop them.
Third issue is inconsequential, since PGID_BC is allocated from the
nonreserved multicast PGID space, and these PGIDs are conveniently
initialized to 0x7f (i.e. flood towards all ports except the CPU port
module). Broadcasts reach the NPI port via ocelot_init(), and reach the
tag_8021q CPU port via the hardware defaults.
Second issue is also inconsequential, because we fail both at disabling
and at enabling broadcast flooding on a port, so the defaults mentioned
above are preserved, and they are fine except for the performance impact.
Fixes: 7a29d220f4 ("net: dsa: felix: reimplement tagging protocol change with function pointers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the ocelot switch lib is unaware of the index of a struct
ocelot_port, since that is kept in the encapsulating structures of outer
drivers (struct dsa_port :: index, struct ocelot_port_private :: chip_port).
With the upcoming increase in complexity associated with assigning DSA
tag_8021q CPU ports to certain user ports, it becomes necessary for the
switch lib to be able to retrieve the index of a certain ocelot_port.
Therefore, introduce a new u8 to ocelot_port (same size as the chip_port
used by the ocelot switchdev driver) and rework the existing code to
populate and use it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The error handling for the current tagging protocol change procedure is
a bit brittle (we dismantle the previous tagging protocol entirely
before setting up the new one). By identifying which parts of a tagging
protocol are unique to itself and which parts are shared with the other,
we can implement a protocol change procedure where error handling is a
bit more robust, because we start setting up the new protocol first, and
tear down the old one only after the setup of the specific and shared
parts succeeded.
The protocol change is a bit too open-coded too, in the area of
migrating host flood settings and MDBs. By identifying what differs
between tagging protocols (the forwarding masks for host flooding) we
can implement a more straightforward migration procedure which is
handled in the shared portion of the protocol change, rather than
individually by each protocol.
Therefore, a more structured approach calls for the introduction of a
structure of function pointers per tagging protocol. This covers setup,
teardown and the host forwarding mask. In the future it will also cover
how to prepare for a new DSA master.
The initial tagging protocol setup (at driver probe time) and the final
teardown (at driver removal time) are also adapted to call into the
structured methods of the specific protocol in current use. This is
especially relevant for teardown, where we previously called
felix_del_tag_protocol() only for the first CPU port. But by not
specifying which CPU port this is for, we gain more flexibility to
support multiple CPU ports in the future.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ocelot switches support a single active CPU port at a time (at least as
a trapping destination, i.e. for control traffic). This is true
regardless of whether we are using the native copy-to-CPU-port-module
functionality, or a redirect action towards the software-defined
tag_8021q CPU port.
Currently we assume that the trapping destination in tag_8021q mode is
the first CPU port, yet in the future we may want to migrate the user
ports to the second CPU port.
For that to work, we need to make sure that the tag_8021q trapping
destination is a CPU port that is active, i.e. is used by at least some
user port on which the trap was added. Otherwise, we may end up
redirecting the traffic to a CPU port which isn't even up.
Note that due to the current design where we simply choose the CPU port
of the first port from the trap's ingress port mask, it may be that a
CPU port absorbes control traffic from user ports which aren't affine to
it as per user space's request. This isn't ideal, but is the lesser of
two evils. Following the user-configured affinity for traps would mean
that we can no longer reuse a single TCAM entry for multiple traps,
which is what we actually do for e.g. PTP. Either we duplicate and
deduplicate TCAM entries on the fly when user-to-CPU-port mappings
change (which is unnecessarily complicated), or we redirect trapped
traffic to all tag_8021q CPU ports if multiple such ports are in use.
The latter would have actually been nice, if it actually worked, but it
doesn't, since a OCELOT_MASK_MODE_REDIRECT action towards multiple ports
would not take PGID_SRC into consideration, and it would just duplicate
the packet towards each (CPU) port, leading to duplicates in software.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
DSA has not supported (and probably will not support in the future
either) independent tagging protocols per CPU port.
Different switch drivers have different requirements, some may need to
replicate some settings for each CPU port, some may need to apply some
settings on a single CPU port, while some may have to configure some
global settings and then some per-CPU-port settings.
In any case, the current model where DSA calls ->change_tag_protocol for
each CPU port turns out to be impractical for drivers where there are
global things to be done. For example, felix calls dsa_tag_8021q_register(),
which makes no sense per CPU port, so it suppresses the second call.
Let drivers deal with replication towards all CPU ports, and remove the
CPU port argument from the function prototype.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>