2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

258 Commits

Author SHA1 Message Date
Vladimir Oltean
7e092af2f3 net: dsa: tag_8021q: setup tagging via a single function call
There is no point in calling dsa_port_setup_8021q_tagging for each
individual port. Additionally, it will become more difficult to do that
when we'll have a context structure to tag_8021q (next patch). So
refactor this now.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-11 17:30:43 -07:00
Nathan Chancellor
5978fac03e net: dsa: sja1105: Do not use address of compatible member in sja1105_check_device_id
Clang warns:

drivers/net/dsa/sja1105/sja1105_main.c:3418:38: warning: address of
array 'match->compatible' will always evaluate to 'true'
[-Wpointer-bool-conversion]
        for (match = sja1105_dt_ids; match->compatible; match++) {
        ~~~                          ~~~~~~~^~~~~~~~~~
1 warning generated.

We should check the value of the first character in compatible to see if
it is empty or not. This matches how the rest of the tree iterates over
IDs.

Fixes: 0b0e299720 ("net: dsa: sja1105: use detected device id instead of DT one on mismatch")
Link: https://github.com/ClangBuiltLinux/linux/issues/1139
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 16:13:25 -07:00
Vladimir Oltean
0b0e299720 net: dsa: sja1105: use detected device id instead of DT one on mismatch
Although we can detect the chip revision 100% at runtime, it is useful
to specify it in the device tree compatible string too, because
otherwise there would be no way to assess the correctness of device tree
bindings statically, without booting a board (only some switch versions
have internal RGMII delays and/or an SGMII port).

But for testing the P/Q/R/S support, what I have is a reworked board
with the SJA1105T replaced by a pin-compatible SJA1105Q, and I don't
want to keep a separate device tree blob just for this one-off board.
Since just the chip has been replaced, its RGMII delay setup is
inherently the same (meaning: delays added by the PHY on the slave
ports, and by PCB traces on the fixed-link CPU port).

For this board, I'd rather have the driver shout at me, but go ahead and
use what it found even if it doesn't match what it's been told is there.

[    2.970826] sja1105 spi0.1: Device tree specifies chip SJA1105T but found SJA1105Q, please fix it!
[    2.980010] sja1105 spi0.1: Probed switch chip: SJA1105Q
[    3.005082] sja1105 spi0.1: Enabled switch tagging

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-05 12:20:55 -07:00
Po Liu
5f035af76e net:qos: police action offloading parameter 'burst' change to the original value
Since 'tcfp_burst' with TICK factor, driver side always need to recover
it to the original value, this patch moves the generic calculation and
recover to the 'burst' original value before offloading to device driver.

Signed-off-by: Po Liu <po.liu@nxp.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:33:42 -07:00
Vladimir Oltean
aef31718a9 net: dsa: sja1105: avoid invalid state in sja1105_vlan_filtering
Be there 2 switches spi/spi2.0 and spi/spi2.1 in a cross-chip setup,
both under the same VLAN-filtering bridge, both in the
SJA1105_VLAN_BEST_EFFORT state.

If we try to change the VLAN state of one of the switches (to
SJA1105_VLAN_FILTERING_FULL) we get the following error:

devlink dev param set spi/spi2.1 name best_effort_vlan_filtering value
false cmode runtime
[   38.325683] sja1105 spi2.1: Not allowed to overcommit frame memory.
               L2 memory partitions and VL memory partitions share the
               same space. The sum of all 16 memory partitions is not
               allowed to be larger than 929 128-byte blocks (or 910
               with retagging). Please adjust
               l2-forwarding-parameters-table.part_spc and/or
               vl-forwarding-parameters-table.partspc.
[   38.356803] sja1105 spi2.1: Invalid config, cannot upload

This is because the spi/spi2.1 switch doesn't support tagging anymore in
the SJA1105_VLAN_FILTERING_FULL state, so it doesn't need to have any
retagging rules defined. Great, so it can use more frame memory
(retagging consumes extra memory).

But the built-in low-level static config checker from the sja1105 driver
says "not so fast, you've increased the frame memory to non-retagging
values, but you still kept the retagging rules in the static config".

So we need to rebuild the VLAN table immediately before re-uploading the
static config, operation which will take care, based on the new VLAN
state, of removing the retagging rules.

Fixes: 3f01c91aab ("net: dsa: sja1105: implement VLAN retagging for dsa_8021q sub-VLANs")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 16:49:50 -07:00
Vladimir Oltean
4d7525085a net: dsa: sja1105: offload the Credit-Based Shaper qdisc
SJA1105, being AVB/TSN switches, provide hardware assist for the
Credit-Based Shaper as described in the IEEE 8021Q-2018 document.

First generation has 10 shapers, freely assignable to any of the 4
external ports and 8 traffic classes, and second generation has 16
shapers.

The Credit-Based Shaper tables are accessed through the dynamic
reconfiguration interface, so we have to restore them manually after a
switch reset. The tables are backed up by the static config only on
P/Q/R/S, and we don't want to add custom code only for that family,
since the procedure that is in place now works for both.

Tested with the following commands:

data_rate_kbps=67000
port_transmit_rate_kbps=1000000
idleslope=$data_rate_kbps
sendslope=$(($idleslope - $port_transmit_rate_kbps))
locredit=$((-0x80000000))
hicredit=$((0x7fffffff))
tc qdisc add dev swp2 root handle 1: mqprio hw 0 num_tc 8 \
        map 0 1 2 3 4 5 6 7 \
        queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7
tc qdisc replace dev swp2 parent 1:1 cbs \
        idleslope $idleslope \
        sendslope $sendslope \
        hicredit $hicredit \
        locredit $locredit \
        offload 1

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-28 11:01:22 -07:00
Vladimir Oltean
3f01c91aab net: dsa: sja1105: implement VLAN retagging for dsa_8021q sub-VLANs
Expand the delta commit procedure for VLANs with additional logic for
treating bridge_vlans in the newly introduced operating mode,
SJA1105_VLAN_BEST_EFFORT.

For every bridge VLAN on every user port, a sub-VLAN index is calculated
and retagging rules are installed towards a dsa_8021q rx_vid that
encodes that sub-VLAN index. This way, the tagger can identify the
original VLANs.

Extra care is taken for VLANs to still work as intended in cross-chip
scenarios. Retagging may have unintended consequences for these because
a sub-VLAN encoding that works for the CPU does not make any sense for a
front-panel port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
aaa270c638 net: dsa: sja1105: implement a common frame memory partitioning function
There are 2 different features that require some reserved frame memory
space: VLAN retagging and virtual links. Create a central function that
modifies the static config and ensures frame memory is never
overcommitted.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
2cafa72e51 net: dsa: sja1105: add a new best_effort_vlan_filtering devlink parameter
This devlink parameter enables the handling of DSA tags when enslaved to
a bridge with vlan_filtering=1. There are very good reasons to want
this, but there are also very good reasons for not enabling it by
default. So a devlink param named best_effort_vlan_filtering, currently
driver-specific and exported only by sja1105, is used to configure this.

In practice, this is perhaps the way that most users are going to use
the switch in. It assumes that no more than 7 VLANs are needed per port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
84eeb5d460 net: dsa: tag_sja1105: implement sub-VLAN decoding
Create a subvlan_map as part of each port's tagger private structure.
This keeps reverse mappings of bridge-to-dsa_8021q VLAN retagging rules.

Note that as of this patch, this piece of code is never engaged, due to
the fact that the driver hasn't installed any retagging rule, so we'll
always see packets with a subvlan code of 0 (untagged).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
38b5beeae7 net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously
In VLAN-unaware mode, sja1105 uses VLAN tags with a custom TPID of
0xdadb. While in the yet-to-be introduced best_effort_vlan_filtering
mode, it needs to work with normal VLAN TPID values.

A complication arises when we must transmit a VLAN-tagged packet to the
switch when it's in VLAN-aware mode. We need to construct a packet with
2 VLAN tags, and the switch will use the outer header for routing and
pop it on egress. But sadly, here the 2 hardware generations don't
behave the same:

- E/T switches won't pop an ETH_P_8021AD tag on egress, it seems
  (packets will remain double-tagged).
- P/Q/R/S switches will drop a packet with 2 ETH_P_8021Q tags (it looks
  like it tries to prevent VLAN hopping).

But looks like the reverse is also true:

- E/T switches have no problem popping the outer tag from packets with
  2 ETH_P_8021Q tags.
- P/Q/R/S will have no problem popping a single tag even if that is
  ETH_P_8021AD.

So it is clear that if we want the hardware to work with dsa_8021q
tagging in VLAN-aware mode, we need to send different TPIDs depending on
revision. Keep that information in priv->info->qinq_tpid.

The per-port tagger structure will hold an xmit_tpid value that depends
not only upon the qinq_tpid, but also upon the VLAN awareness state
itself (in case we must transmit using 0xdadb).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
cfa36b1fff net: dsa: sja1105: exit sja1105_vlan_filtering when called multiple times
VLAN filtering is a global property for sja1105, and that means that we
rely on the DSA core to not call us more than once.

But we need to introduce some per-port state for the tagger, namely the
xmit_tpid, and the best place to do that is where the xmit_tpid changes,
namely in sja1105_vlan_filtering. So at the moment, exit early from the
function to avoid unnecessarily resetting the switch for each port call.
Then we'll change the xmit_tpid prior to the early exit in the next
patch.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
fa83e5d918 net: dsa: sja1105: allow VLAN configuration from the bridge in all states
Let the DSA core call our .port_vlan_add methods every time the bridge
layer requests so. We will deal internally with saving/restoring VLANs
depending on our VLAN awareness state.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
ec5ae61076 net: dsa: sja1105: save/restore VLANs using a delta commit method
Managing the VLAN table that is present in hardware will become very
difficult once we add a third operating state
(best_effort_vlan_filtering). That is because correct cleanup (not too
little, not too much) becomes virtually impossible, when VLANs can be
added from the bridge layer, from dsa_8021q for basic tagging, for
cross-chip bridging, as well as retagging rules for sub-VLANs and
cross-chip sub-VLANs. So we need to rethink VLAN interaction with the
switch in a more scalable way.

In preparation for that, use the priv->expect_dsa_8021q boolean to
classify any VLAN request received through .port_vlan_add or
.port_vlan_del towards either one of 2 internal lists: bridge VLANs and
dsa_8021q VLANs.

Then, implement a central sja1105_build_vlan_table method that creates a
VLAN configuration from scratch based on the 2 lists of VLANs kept by
the driver, and based on the VLAN awareness state. Currently, if we are
VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs.

Then, implement a delta commit procedure that identifies which VLANs
from this new configuration are actually different from the config
previously committed to hardware. We apply the delta through the dynamic
configuration interface (we don't reset the switch). The result is that
the hardware should see the exact sequence of operations as before this
patch.

This also helps remove the "br" argument passed to
dsa_8021q_crosschip_bridge_join, which it was only using to figure out
whether it should commit the configuration back to us or not, based on
the VLAN awareness state of the bridge. We can simplify that, by always
allowing those VLANs inside of our dsa_8021q_vlans list, and committing
those to hardware when necessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
60b33aeb7e net: dsa: sja1105: deny alterations of dsa_8021q VLANs from the bridge
At the moment, this can never happen. The 2 modes that we operate in do
not permit that:

 - SJA1105_VLAN_UNAWARE: we are guarded from bridge VLANs added by the
   user by the DSA core. We will later lift this restriction by setting
   ds->vlan_bridge_vtu = true, and that is where we'll need it.

 - SJA1105_VLAN_FILTERING_FULL: in this mode, dsa_8021q configuration is
   disabled. So the user is free to add these VLANs in the 1024-3071
   range.

The reason for the patch is that we'll introduce a third VLAN awareness
state, where both dsa_8021q as well as the bridge are going to call our
.port_vlan_add and .port_vlan_del methods.

For that, we need a good way to discriminate between the 2. The easiest
(and less intrusive way for upper layers) is to recognize the fact that
dsa_8021q configurations are always driven by our driver - we _know_
when a .port_vlan_add method will be called from dsa_8021q because _we_
initiated it.

So introduce an expect_dsa_8021q boolean which is only used, at the
moment, for blacklisting VLANs in range 1024-3071 in the modes when
dsa_8021q is active.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:08 -07:00
Vladimir Oltean
7f14937fac net: dsa: sja1105: keep the VLAN awareness state in a driver variable
Soon we'll add a third operating mode to the driver. Introduce a
vlan_state to make things more easy to manage, and use it where
applicable.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 13:08:07 -07:00
Vladimir Oltean
ac02a451a6 net: dsa: sja1105: implement cross-chip bridging operations
sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart
and which is compatible with cascading. A complete description of this
tagging format is in net/dsa/tag_8021q.c, but a quick summary is that
each external-facing port tags incoming frames with a unique pvid, and
this special VLAN is transmitted as tagged towards the inside of the
system, and as untagged towards the exterior. The tag encodes the switch
id and the source port index.

This means that cross-chip bridging for dsa_8021q only entails adding
the dsa_8021q pvids of one switch to the RX filter of the other
switches. Everything else falls naturally into place, as long as the
bottom-end of ports (the leaves in the tree) is comprised exclusively of
dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be
a chance that a front-panel switch transmits a packet tagged with a
dsa_8021q header, header which it wouldn't be able to remove, and which
would hence "leak" out.

The only use case I tested (due to lack of board availability) was when
the sja1105 switches are part of disjoint trees (however, this doesn't
change the fact that multiple sja1105 switches still need unique switch
identifiers in such a system). But in principle, even "true" single-tree
setups (with DSA links) should work just as fine, except for a small
change which I can't test: dsa_towards_port should be used instead of
dsa_upstream_port (I made the assumption that the routing port that any
sja1105 should use towards its neighbours is the CPU port. That might
not hold true in other setups).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 19:52:33 -07:00
Vladimir Oltean
834f8933d5 net: dsa: sja1105: implement tc-gate using time-triggered virtual links
Restrict the TTEthernet hardware support on this switch to operate as
closely as possible to IEEE 802.1Qci as possible. This means that it can
perform PTP-time-based ingress admission control on streams identified
by {DMAC, VID, PCP}, which is useful when trying to ensure the
determinism of traffic scheduled via IEEE 802.1Qbv.

The oddity comes from the fact that in hardware (and in TTEthernet at
large), virtual links always need a full-blown action, including not
only the type of policing, but also the list of destination ports. So in
practice, a single tc-gate action will result in all packets getting
dropped. Additional actions (either "trap" or "redirect") need to be
specified in the same filter rule such that the conforming packets are
actually forwarded somewhere.

Apart from the VL Lookup, Policing and Forwarding tables which need to
be programmed for each flow (virtual link), the Schedule engine also
needs to be told to open/close the admission gates for each individual
virtual link. A fairly accurate (and detailed) description of how that
works is already present in sja1105_tas.c, since it is already used to
trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key
point here, we remember that the schedule engine supports 8
"subschedules" (execution threads that iterate through the global
schedule in parallel, and that no 2 hardware threads must execute a
schedule entry at the same time). For tc-taprio, each egress port used
one of these 8 subschedules, leaving a total of 4 subschedules unused.
In principle we could have allocated 1 subschedule for the tc-gate
offload of each ingress port, but actually the schedules of all virtual
links installed on each ingress port would have needed to be merged
together, before they could have been programmed to hardware. So
simplify our life and just merge the entire tc-gate configuration, for
all virtual links on all ingress ports, into a single subschedule. Be
sure to check that against the usual hardware scheduling conflicts, and
program it to hardware alongside any tc-taprio subschedule that may be
present.

The following scenarios were tested:

1. Quantitative testing:

   tc qdisc add dev swp2 clsact
   tc filter add dev swp2 ingress flower skip_sw \
           dst_mac 42:be:24:9b:76:20 \
           action gate index 1 base-time 0 \
           sched-entry OPEN 1200 -1 -1 \
           sched-entry CLOSE 1200 -1 -1 \
           action trap

   ping 192.168.1.2 -f
   PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
   .............................
   --- 192.168.1.2 ping statistics ---
   948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms

2. Qualitative testing (with a phase-aligned schedule - the clocks are
   synchronized by ptp4l, not shown here):

   Receiver (sja1105):

   tc qdisc add dev swp2 clsact
   now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \
           sec=$(echo $now | awk -F. '{print $1}') && \
           base_time="$(((sec + 2) * 1000000000))" && \
           echo "base time ${base_time}"
   tc filter add dev swp2 ingress flower skip_sw \
           dst_mac 42:be:24:9b:76:20 \
           action gate base-time ${base_time} \
           sched-entry OPEN  60000 -1 -1 \
           sched-entry CLOSE 40000 -1 -1 \
           action trap

   Sender (enetc):
   now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \
           sec=$(echo $now | awk -F. '{print $1}') && \
           base_time="$(((sec + 2) * 1000000000))" && \
           echo "base time ${base_time}"
   tc qdisc add dev eno0 parent root taprio \
           num_tc 8 \
           map 0 1 2 3 4 5 6 7 \
           queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
           base-time ${base_time} \
           sched-entry S 01  50000 \
           sched-entry S 00  50000 \
           flags 2

   ping -A 192.168.1.1
   PING 192.168.1.1 (192.168.1.1): 56 data bytes
   ...
   ^C
   --- 192.168.1.1 ping statistics ---
   1425 packets transmitted, 1424 packets received, 0% packet loss
   round-trip min/avg/max = 0.322/0.361/0.990 ms

   And just for comparison, with the tc-taprio schedule deleted:

   ping -A 192.168.1.1
   PING 192.168.1.1 (192.168.1.1): 56 data bytes
   ...
   ^C
   --- 192.168.1.1 ping statistics ---
   33 packets transmitted, 19 packets received, 42% packet loss
   round-trip min/avg/max = 0.336/0.464/0.597 ms

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07 17:31:57 -07:00
Vladimir Oltean
dfacc5a23e net: dsa: sja1105: support flow-based redirection via virtual links
Implement tc-flower offloads for redirect, trap and drop using
non-critical virtual links.

Commands which were tested to work are:

  # Send frames received on swp2 with a DA of 42:be:24:9b:76:20 to the
  # CPU and to swp3. This type of key (DA only) when the port's VLAN
  # awareness state is off.
  tc qdisc add dev swp2 clsact
  tc filter add dev swp2 ingress flower skip_sw dst_mac 42:be:24:9b:76:20 \
          action mirred egress redirect dev swp3 \
          action trap

  # Drop frames received on swp2 with a DA of 42:be:24:9b:76:20, a VID
  # of 100 and a PCP of 0.
  tc filter add dev swp2 ingress protocol 802.1Q flower skip_sw \
          dst_mac 42:be:24:9b:76:20 vlan_id 100 vlan_prio 0 action drop

Under the hood, all rules match on DMAC, VID and PCP, but when VLAN
filtering is disabled, those are set internally by the driver to the
port-based defaults. Because we would be put in an awkward situation if
the user were to change the VLAN filtering state while there are active
rules (packets would no longer match on the specified keys), we simply
deny changing vlan_filtering unless the list of flows offloaded via
virtual links is empty. Then the user can re-add new rules.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-07 17:31:57 -07:00
Vladimir Oltean
a6af77637a net: dsa: sja1105: add broadcast and per-traffic class policers
This patch adds complete support for manipulating the L2 Policing Tables
from this switch. There are 45 table entries, one entry per each port
and traffic class, and one dedicated entry for broadcast traffic for
each ingress port.

Policing entries are shareable, and we use this functionality to support
shared block filters.

We are modeling broadcast policers as simple tc-flower matches on
dst_mac. As for the traffic class policers, the switch only deduces the
traffic class from the VLAN PCP field, so it makes sense to model this
as a tc-flower match on vlan_prio.

How to limit broadcast traffic coming from all front-panel ports to a
cumulated total of 10 Mbit/s:

tc qdisc add dev sw0p0 ingress_block 1 clsact
tc qdisc add dev sw0p1 ingress_block 1 clsact
tc qdisc add dev sw0p2 ingress_block 1 clsact
tc qdisc add dev sw0p3 ingress_block 1 clsact
tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
	action police rate 10mbit burst 64k

How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
100 Mbit/s on port 0 only:

tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
	vlan_prio 0 action police rate 100mbit burst 64k

The broadcast, VLAN PCP and port policers are compatible with one
another (can be installed at the same time on a port).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:01 -07:00
Vladimir Oltean
a7cc081cab net: dsa: sja1105: add configuration of port policers
This adds partial configuration support for the L2 Policing Table. Out
of the 45 policing entries, only 5 are used (one for each port), in a
shared manner. All 8 traffic classes, and the broadcast policer, are
redirected to a common instance which belongs to the ingress port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30 11:44:00 -07:00
Vladimir Oltean
c279c7261a net: dsa: sja1105: implement the port MTU callbacks
On this switch, the frame length enforcements are performed by the
ingress policers. There are 2 types of those: regular L2 (also called
best-effort) and Virtual Link policers (an ARINC664/AFDX concept for
defining L2 streams with certain QoS abilities). To avoid future
confusion, I prefer to call the reset reason "Best-effort policers",
even though the VL policers are not yet supported.

We also need to change the setup of the initial static config, such that
DSA calls to .change_mtu (which are expensive) become no-ops and don't
reset the switch 5 times.

A driver-level decision is to unconditionally allow single VLAN-tagged
traffic on all ports. The CPU port must accept an additional VLAN header
for the DSA tag, which is again a driver-level decision.

The policers actually count bytes not only from the SDU, but also from
the Ethernet header and FCS, so those need to be accounted for as well.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-27 16:07:25 -07:00
Vladimir Oltean
747e5eb31d net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT
The SJA1105 switch family has a PTP_CLK pin which emits a signal with
fixed 50% duty cycle, but variable frequency and programmable start time.

On the second generation (P/Q/R/S) switches, this pin supports even more
functionality. The use case described by the hardware documents talks
about synchronization via oneshot pulses: given 2 sja1105 switches,
arbitrarily designated as a master and a slave, the master emits a
single pulse on PTP_CLK, while the slave is configured to timestamp this
pulse received on its PTP_CLK pin (which must obviously be configured as
input). The difference between the timestamps then exactly becomes the
slave offset to the master.

The only trouble with the above is that the hardware is very much tied
into this use case only, and not very generic beyond that:
 - When emitting a oneshot pulse, instead of being told when to emit it,
   the switch just does it "now" and tells you later what time it was,
   via the PTPSYNCTS register. [ Incidentally, this is the same register
   that the slave uses to collect the ext_ts timestamp from, too. ]
 - On the sync slave, there is no interrupt mechanism on reception of a
   new extts, and no FIFO to buffer them, because in the foreseen use
   case, software is in control of both the master and the slave pins,
   so it "knows" when there's something to collect.

These 2 problems mean that:
 - We don't support (at least yet) the quirky oneshot mode exposed by
   the hardware, just normal periodic output.
 - We abuse the hardware a little bit when we expose generic extts.
   Because there's no interrupt mechanism, we need to poll at double the
   frequency we expect to receive a pulse. Currently that means a
   non-configurable "twice a second".

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23 22:15:07 -07:00
Vladimir Oltean
79d5511cc0 net: dsa: sja1105: unconditionally set DESTMETA and SRCMETA in AVB table
These fields configure the destination and source MAC address that the
switch will put in the Ethernet frames sent towards the CPU port that
contain RX timestamps for PTP.

These fields do not enable the feature itself, that is configured via
SEND_META0 and SEND_META1 in the General Params table.

The implication of this patch is that the AVB Params table will always
be present in the static config. Which doesn't really hurt.

This is needed because in a future patch, we will add another field from
this table, CAS_MASTER, for configuring the PTP_CLK pin function. That
can be configured irrespective of whether RX timestamping is enabled or
not, so always having this table present is going to simplify things a
bit.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23 22:15:07 -07:00
Vladimir Oltean
ffe10e679c net: dsa: sja1105: Add support for the SGMII port
SJA1105 switches R and S have one SerDes port with an 802.3z
quasi-compatible PCS, hardwired on port 4. The other ports are still
MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds;
the MAC on this port is hardwired at gigabit. Only full duplex is
supported.

The SGMII port can be configured as part of the static config tables, as
well as through a dedicated SPI address region for its pseudo-clause-22
registers. However it looks like the static configuration is not
able to change some out-of-reset values (like the value of MII_BMCR), so
at the end of the day, having code for it is utterly pointless. We are
just going to use the pseudo-C22 interface.

Because the PCS gets reset when the switch resets, we have to add even
more restoration logic to sja1105_static_config_reload, otherwise the
SGMII port breaks after operations such as enabling PTP timestamping
which require a switch reset.

>From PHYLINK perspective, the switch supports *only* SGMII (it doesn't
support 1000Base-X). It also doesn't expose access to the raw config
word for in-band AN in registers MII_ADV/MII_LPA.
It is able to work in the following modes:
 - Forced speed
 - SGMII in-band AN slave (speed received from PHY)
 - SGMII in-band AN master (acting as a PHY)

The latter mode is not supported by this patch. It is even unclear to me
how that would be described. There is some code for it left in the
patch, but 'an_master' is always passed as false.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 08:55:21 -07:00
Vladimir Oltean
ee9d0cb6c4 net: dsa: sja1105: Avoid error message for unknown PHY mode on disabled ports
When sja1105_init_mii_settings iterates over the port list, it prints
this message for disabled ports, because they don't have a valid
phy-mode:

[    4.778702] sja1105 spi2.0: Unsupported PHY mode unknown!

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Suggested-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-19 21:37:40 -07:00
Vladimir Oltean
ec8582d134 net: dsa: sja1105: move MAC configuration to .phylink_mac_link_up
The switches supported so far by the driver only have non-SerDes ports,
so they should be configured in the PHYLINK callback that provides the
resolved PHY link parameters.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-14 20:49:48 -07:00
David S. Miller
1d34357931 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor overlapping changes, nothing serious.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-12 22:34:48 -07:00
Oleksij Rempel
ca68e1384f net: dsa: sja1105: add 100baseT1_Full support
Validate 100baseT1_Full to make this driver work with TJA1102 PHY.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-03 14:54:05 -08:00
Vladimir Oltean
52c0d4e306 net: dsa: sja1105: Don't destroy not-yet-created xmit_worker
Fixes the following NULL pointer dereference on PHY connect error path
teardown:

[    2.291010] sja1105 spi0.1: Probed switch chip: SJA1105T
[    2.310044] sja1105 spi0.1: Enabled switch tagging
[    2.314970] fsl-gianfar soc:ethernet@2d90000 eth2: error -19 setting up slave phy
[    2.322463] 8<--- cut here ---
[    2.325497] Unable to handle kernel NULL pointer dereference at virtual address 00000018
[    2.333555] pgd = (ptrval)
[    2.336241] [00000018] *pgd=00000000
[    2.339797] Internal error: Oops: 5 [#1] SMP ARM
[    2.344384] Modules linked in:
[    2.347420] CPU: 1 PID: 64 Comm: kworker/1:1 Not tainted 5.5.0-rc5 #1
[    2.353820] Hardware name: Freescale LS1021A
[    2.358070] Workqueue: events deferred_probe_work_func
[    2.363182] PC is at kthread_destroy_worker+0x4/0x74
[    2.368117] LR is at sja1105_teardown+0x70/0xb4
[    2.372617] pc : [<c036cdd4>]    lr : [<c0b89238>]    psr: 60000013
[    2.378845] sp : eeac3d30  ip : eeab1900  fp : eef45480
[    2.384036] r10: eef4549c  r9 : 00000001  r8 : 00000000
[    2.389227] r7 : eef527c0  r6 : 00000034  r5 : ed8ddd0c  r4 : ed8ddc40
[    2.395714] r3 : 00000000  r2 : 00000000  r1 : eef4549c  r0 : 00000000
[    2.402204] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[    2.409297] Control: 10c5387d  Table: 8020406a  DAC: 00000051
[    2.415008] Process kworker/1:1 (pid: 64, stack limit = 0x(ptrval))
[    2.421237] Stack: (0xeeac3d30 to 0xeeac4000)
[    2.612635] [<c036cdd4>] (kthread_destroy_worker) from [<c0b89238>] (sja1105_teardown+0x70/0xb4)
[    2.621379] [<c0b89238>] (sja1105_teardown) from [<c10717fc>] (dsa_switch_teardown.part.1+0x48/0x74)
[    2.630467] [<c10717fc>] (dsa_switch_teardown.part.1) from [<c1072438>] (dsa_register_switch+0x8b0/0xbf4)
[    2.639984] [<c1072438>] (dsa_register_switch) from [<c0b89c30>] (sja1105_probe+0x2ac/0x464)
[    2.648378] [<c0b89c30>] (sja1105_probe) from [<c0b11a5c>] (spi_drv_probe+0x7c/0xa0)
[    2.656081] [<c0b11a5c>] (spi_drv_probe) from [<c0a26ab8>] (really_probe+0x208/0x480)
[    2.663871] [<c0a26ab8>] (really_probe) from [<c0a26f0c>] (driver_probe_device+0x78/0x1c4)
[    2.672093] [<c0a26f0c>] (driver_probe_device) from [<c0a24c48>] (bus_for_each_drv+0x80/0xc4)
[    2.680574] [<c0a24c48>] (bus_for_each_drv) from [<c0a26810>] (__device_attach+0xd0/0x168)
[    2.688794] [<c0a26810>] (__device_attach) from [<c0a259d8>] (bus_probe_device+0x84/0x8c)
[    2.696927] [<c0a259d8>] (bus_probe_device) from [<c0a25f24>] (deferred_probe_work_func+0x84/0xc4)
[    2.705842] [<c0a25f24>] (deferred_probe_work_func) from [<c03667b0>] (process_one_work+0x22c/0x560)
[    2.714926] [<c03667b0>] (process_one_work) from [<c0366d8c>] (worker_thread+0x2a8/0x5d4)
[    2.723059] [<c0366d8c>] (worker_thread) from [<c036cf94>] (kthread+0x150/0x154)
[    2.730416] [<c036cf94>] (kthread) from [<c03010e8>] (ret_from_fork+0x14/0x2c)

Checking for NULL pointer is correct because the per-port xmit kernel
threads are created in sja1105_probe immediately after calling
dsa_register_switch.

Fixes: a68578c20a ("net: dsa: Make deferred_xmit private to sja1105")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-29 21:58:46 -08:00
Russell King
5b502a7b29 net: dsa: propagate resolved link config via mac_link_up()
Propagate the resolved link configuration down via DSA's
phylink_mac_link_up() operation to allow split PCS/MAC to work.

Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 12:02:14 -08:00
David S. Miller
b3f7e3f23a Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
Vladimir Oltean
27afe0d34e net: dsa: sja1105: Don't error out on disabled ports with no phy-mode
The sja1105_parse_ports_node function was tested only on device trees
where all ports were enabled. Fix this check so that the driver
continues to probe only with the ports where status is not "disabled",
as expected.

Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-17 13:22:12 +01:00
Florian Fainelli
4d776482ec net: dsa: Get information about stacked DSA protocol
It is possible to stack multiple DSA switches in a way that they are not
part of the tree (disjoint) but the DSA master of a switch is a DSA
slave of another. When that happens switch drivers may have to know this
is the case so as to determine whether their tagging protocol has a
remove chance of working.

This is useful for specific switch drivers such as b53 where devices
have been known to be stacked in the wild without the Broadcom tag
protocol supporting that feature. This allows b53 to continue supporting
those devices by forcing the disabling of Broadcom tags on the outermost
switches if necessary.

The get_tag_protocol() function is therefore updated to gain an
additional enum dsa_tag_protocol argument which denotes the current
tagging protocol used by the DSA master we are attached to, else
DSA_TAG_PROTO_NONE for the top of the dsa_switch_tree.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 16:01:13 -08:00
Vladimir Oltean
a68578c20a net: dsa: Make deferred_xmit private to sja1105
There are 3 things that are wrong with the DSA deferred xmit mechanism:

1. Its introduction has made the DSA hotpath ever so slightly more
   inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs
   to be initialized to false for every transmitted frame, in order to
   figure out whether the driver requested deferral or not (a very rare
   occasion, rare even for the only driver that does use this mechanism:
   sja1105). That was necessary to avoid kfree_skb from freeing the skb.

2. Because L2 PTP is a link-local protocol like STP, it requires
   management routes and deferred xmit with this switch. But as opposed
   to STP, the deferred work mechanism needs to schedule the packet
   rather quickly for the TX timstamp to be collected in time and sent
   to user space. But there is no provision for controlling the
   scheduling priority of this deferred xmit workqueue. Too bad this is
   a rather specific requirement for a feature that nobody else uses
   (more below).

3. Perhaps most importantly, it makes the DSA core adhere a bit too
   much to the NXP company-wide policy "Innovate Where It Doesn't
   Matter". The sja1105 is probably the only DSA switch that requires
   some frames sent from the CPU to be routed to the slave port via an
   out-of-band configuration (register write) rather than in-band (DSA
   tag). And there are indeed very good reasons to not want to do that:
   if that out-of-band register is at the other end of a slow bus such
   as SPI, then you limit that Ethernet flow's throughput to effectively
   the throughput of the SPI bus. So hardware vendors should definitely
   not be encouraged to design this way. We do _not_ want more
   widespread use of this mechanism.

Luckily we have a solution for each of the 3 issues:

For 1, we can just remove that variable in the skb->cb and counteract
the effect of kfree_skb with skb_get, much to the same effect. The
advantage, of course, being that anybody who doesn't use deferred xmit
doesn't need to do any extra operation in the hotpath.

For 2, we can create a kernel thread for each port's deferred xmit work.
If the user switch ports are named swp0, swp1, swp2, the kernel threads
will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15
character length limit on kernel thread names). With this, the user can
change the scheduling priority with chrt $(pidof swp2_xmit).

For 3, we can actually move the entire implementation to the sja1105
driver.

So this patch deletes the generic implementation from the DSA core and
adds a new one, more adequate to the requirements of PTP TX
timestamping, in sja1105_main.c.

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 15:13:13 -08:00
Vladimir Oltean
0a51826c6e net: dsa: sja1105: Always send through management routes in slot 0
I finally found out how the 4 management route slots are supposed to
be used, but.. it's not worth it.

The description from the comment I've just deleted in this commit is
still true: when more than 1 management slot is active at the same time,
the switch will match frames incoming [from the CPU port] on the lowest
numbered management slot that matches the frame's DMAC.

My issue was that one was not supposed to statically assign each port a
slot. Yes, there are 4 slots and also 4 non-CPU ports, but that is a
mere coincidence.

Instead, the switch can be used like this: every management frame gets a
slot at the right of the most recently assigned slot:

Send mgmt frame 1 through S0:    S0 x  x  x
Send mgmt frame 2 through S1:    S0 S1 x  x
Send mgmt frame 3 through S2:    S0 S1 S2 x
Send mgmt frame 4 through S3:    S0 S1 S2 S3

The difference compared to the old usage is that the transmission of
frames 1-4 doesn't need to wait until the completion of the management
route. It is safe to use a slot to the right of the most recently used
one, because by protocol nobody will program a slot to your left and
"steal" your route towards the correct egress port.

So there is a potential throughput benefit here.

But mgmt frame 5 has no more free slot to use, so it has to wait until
_all_ of S0, S1, S2, S3 are full, in order to use S0 again.

And that's actually exactly the problem: I was looking for something
that would bring more predictable transmission latency, but this is
exactly the opposite: 3 out of 4 frames would be transmitted quicker,
but the 4th would draw the short straw and have a worse worst-case
latency than before.

Useless.

Things are made even worse by PTP TX timestamping, which is something I
won't go deeply into here. Suffice to say that the fact there is a
driver-level lock on the SPI bus offsets any potential throughput gains
that parallelism might bring.

So there's no going back to the multi-slot scheme, remove the
"mgmt_slot" variable from sja1105_port and the dummy static assignment
made at probe time.

While passing by, also remove the assignment to casc_port altogether.
Don't pretend that we support cascaded setups.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 15:13:12 -08:00
Vladimir Oltean
54fa49ee88 net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S
For first-generation switches (SJA1105E and SJA1105T):
- TPID means C-Tag (typically 0x8100)
- TPID2 means S-Tag (typically 0x88A8)

While for the second generation switches (SJA1105P, SJA1105Q, SJA1105R,
SJA1105S) it is the other way around:
- TPID means S-Tag (typically 0x88A8)
- TPID2 means C-Tag (typically 0x8100)

In other words, E/T tags untagged traffic with TPID, and P/Q/R/S with
TPID2.

So the patch mentioned below fixed VLAN filtering for P/Q/R/S, but broke
it for E/T.

We strive for a common code path for all switches in the family, so just
lie in the static config packing functions that TPID and TPID2 are at
swapped bit offsets than they actually are, for P/Q/R/S. This will make
both switches understand TPID to be ETH_P_8021Q and TPID2 to be
ETH_P_8021AD. The meaning from the original E/T was chosen over P/Q/R/S
because E/T is actually the one with public documentation available
(UM10944.pdf).

Fixes: f9a1a7646c ("net: dsa: sja1105: Reverse TPID and TPID2")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30 20:15:02 -08:00
Vladimir Oltean
9fcf024dd6 net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slot
The PTP egress timestamp N must be captured from register PTPEGR_TS[n],
where n = 2 * PORT + TSREG. There are 10 PTPEGR_TS registers, 2 per
port. We are only using TSREG=0.

As opposed to the management slots, which are 4 in number
(SJA1105_NUM_PORTS, minus the CPU port). Any management frame (which
includes PTP frames) can be sent to any non-CPU port through any
management slot. When the CPU port is not the last port (#4), there will
be a mismatch between the slot and the port number.

Luckily, the only mainline occurrence with this switch
(arch/arm/boot/dts/ls1021a-tsn.dts) does have the CPU port as #4, so the
issue did not manifest itself thus far.

Fixes: 47ed985e97 ("net: dsa: sja1105: Add logic for TX timestamping")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30 20:10:20 -08:00
David S. Miller
adf6f8cb3f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in networking bug fixes for merge window.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25 14:57:26 -08:00
Oleksij Rempel
9bca3a0a92 net: dsa: sja1105: fix sja1105_parse_rgmii_delays()
This function was using configuration of port 0 in devicetree for all ports.
In case CPU port was not 0, the delay settings was ignored. This resulted not
working communication between CPU and the switch.

Fixes: f5b8631c29 ("net: dsa: sja1105: Error out if RGMII delays are requested in DT")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25 10:56:12 -08:00
Vladimir Oltean
2eea1fa82f net: dsa: sja1105: Print the reset reason
Sometimes it can be quite opaque even for me why the driver decided to
reset the switch. So instead of adding dump_stack() calls each time for
debugging, just add a reset reason to sja1105_static_config_reload
calls which gets printed to the console.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12 19:53:07 -08:00
Vladimir Oltean
af580ae2dc net: dsa: sja1105: Disallow management xmit during switch reset
The purpose here is to avoid ptp4l fail due to this condition:

  timed out while polling for tx timestamp
  increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
  port 1: send peer delay request failed

So either reset the switch before the management frame was sent, or
after it was timestamped as well, but not in the middle.

The condition may arise either due to a true timeout (i.e. because
re-uploading the static config takes time), or due to the TX timestamp
actually getting lost due to reset. For the former we can increase
tx_timestamp_timeout in userspace, for the latter we need this patch.

Locking all traffic during switch reset does not make sense at all,
though. Forcing all CPU-originated traffic to potentially block waiting
for a sleepable context to send > 800 bytes over SPI is not a good idea.
Flows that are autonomously forwarded by the switch will get dropped
anyway during switch reset no matter what. So just let all other
CPU-originated traffic be dropped as well.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11 12:45:31 -08:00
Vladimir Oltean
6cf99c13ea net: dsa: sja1105: Restore PTP time after switch reset
The PTP time of the switch is not preserved when uploading a new static
configuration. Work around this hardware oddity by reading its PTP time
before a static config upload, and restoring it afterwards.

Static config changes are expected to occur at runtime even in scenarios
directly related to PTP, i.e. the Time-Aware Scheduler of the switch is
programmed in this way.

Perhaps the larger implication of this patch is that the PTP .gettimex64
and .settime functions need to be exposed to sja1105_main.c, where the
PTP lock needs to be held during this entire process. So their core
implementation needs to move to some common functions which get exposed
in sja1105_ptp.h.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11 12:45:30 -08:00
Vladimir Oltean
34d76e9fa8 net: dsa: sja1105: Implement the .gettimex64 system call for PTP
Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace
applications (i.e. phc2sys) to compensate for the delays incurred while
reading the PHC's time.

The task itself of taking the software timestamp is delegated to the SPI
subsystem, through the newly introduced API in struct spi_transfer. The
goal is to cross-timestamp I/O operations on the switch's PTP clock with
values in the local system clock (CLOCK_REALTIME). For that we need to
understand a bit of the hardware internals.

The 'read PTP time' message is a 12 byte structure, first 4 bytes of
which represent the SPI header, and the last 8 bytes represent the
64-bit PTP time. The switch itself starts processing the command
immediately after receiving the last bit of the address, i.e. at the
middle of byte 3 (last byte of header). The PTP time is shadowed to a
buffer register in the switch, and retrieved atomically during the
subsequent SPI frames.

A similar thing goes on for the 'write PTP time' message, although in
that case the switch waits until the 64-bit PTP time becomes fully
available before taking any action. So the byte that needs to be
software-timestamped is byte 11 (last) of the transfer.

The patch creates a common (and local) sja1105_xfer implementation for
the SPI I/O, and offers 3 front-ends:

- sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally
  requesting a PTP timestamp

- sja1105_xfer_buf: this is for large transfers (e.g. the static config
  buffer) and other misc data, and there is no point in giving
  timestamping capabilities to this.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11 12:45:30 -08:00
Andrew Lunn
0c65b2b90d net: of_get_phy_mode: Change API to solve int/unit warnings
Before this change of_get_phy_mode() returned an enum,
phy_interface_t. On error, -ENODEV etc, is returned. If the result of
the function is stored in a variable of type phy_interface_t, and the
compiler has decided to represent this as an unsigned int, comparision
with -ENODEV etc, is a signed vs unsigned comparision.

Fix this problem by changing the API. Make the function return an
error, or 0 on success, and pass a pointer, of type phy_interface_t,
where the phy mode should be stored.

v2:
Return with *interface set to PHY_INTERFACE_MODE_NA on error.
Add error checks to all users of of_get_phy_mode()
Fixup a few reverse christmas tree errors
Fixup a few slightly malformed reverse christmas trees

v3:
Fix 0-day reported errors.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-04 11:21:25 -08:00
Vivien Didelot
7e99e34701 net: dsa: remove dsa_switch_alloc helper
Now that ports are dynamically listed in the fabric, there is no need
to provide a special helper to allocate the dsa_switch structure. This
will give more flexibility to drivers to embed this structure as they
wish in their private structure.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22 12:37:07 -07:00
Vivien Didelot
d5a619bf60 net: dsa: sja1105: register switch before assigning port private data
Like the dsa_switch_tree structures, the dsa_port structures will be
allocated on switch registration.

The SJA1105 driver is the only one accessing the dsa_port structure
after the switch allocation and before the switch registration.
For that reason, move switch registration prior to assigning the priv
member of the dsa_port structures.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22 12:37:07 -07:00
Vivien Didelot
68bb8ea8ad net: dsa: use dsa_to_port helper everywhere
Do not let the drivers access the ds->ports static array directly
while there is a dsa_to_port helper for this purpose.

At the same time, un-const this helper since the SJA1105 driver
assigns the priv member of the returned dsa_port structure.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-10-22 12:37:06 -07:00
Vladimir Oltean
a9d6ed7a8b net: dsa: sja1105: Move PTP data to its own private structure
This is a non-functional change with 2 goals (both for the case when
CONFIG_NET_DSA_SJA1105_PTP is not enabled):

- Reduce the size of the sja1105_private structure.
- Make the PTP code more self-contained.

Leaving priv->ptp_data.lock to be initialized in sja1105_main.c is not a
leftover: it will be used in a future patch "net: dsa: sja1105: Restore
PTP time after switch reset".

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-14 16:45:40 -07:00
Vladimir Oltean
61c7712627 net: dsa: sja1105: Make all public PTP functions take dsa_switch as argument
The new rule (as already started for sja1105_tas.h) is for functions of
optional driver components (ones which may be disabled via Kconfig - PTP
and TAS) to take struct dsa_switch *ds instead of struct sja1105_private
*priv as first argument.

This is so that forward-declarations of struct sja1105_private can be
avoided.

So make sja1105_ptp.h the second user of this rule.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-14 16:45:40 -07:00
David S. Miller
6f4c930e02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-10-05 13:37:23 -07:00
Vladimir Oltean
511e6ca047 net: dsa: sja1105: Add support for port mirroring
Amazingly, of all features, this does not require a switch reset.

Tested with:

tc qdisc add dev swp2 clsact
tc filter add dev swp2 ingress matchall skip_sw \
	action mirred egress mirror dev swp3
tc filter show dev swp2 ingress
tc filter del dev swp2 ingress pref 49152

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-04 14:43:25 -07:00
Vladimir Oltean
1bd4487038 net: dsa: sja1105: Rename sja1105_spi_send_packed_buf to sja1105_xfer_buf
The most commonly called function in the driver is long due for a
rename. The "packed" word is redundant (it doesn't make sense to
transfer an unpacked structure, since that is in CPU endianness yadda
yadda), and the "spi" word is also redundant since argument 2 of the
function is SPI_READ or SPI_WRITE.

As for the sja1105_spi_send_long_packed_buf function, it is only being
used from sja1105_spi.c, so remove its global prototype.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02 12:25:11 -04:00
Vladimir Oltean
dff79620c3 net: dsa: sja1105: Replace sja1105_spi_send_int with sja1105_xfer_{u32, u64}
Having a function that takes a variable number of unpacked bytes which
it generically calls an "int" is confusing and makes auditing patches
next to impossible.

We only use spi_send_int with the int sizes of 32 and 64 bits. So just
make the spi_send_int function less generic and replace it with the
appropriate two explicit functions, which can now type-check the int
pointer type.

Note that there is still a small weirdness in the u32 function, which
has to convert it to a u64 temporary. This is because of how the packing
API works at the moment, but the weirdness is at least hidden from
callers of sja1105_xfer_u32 now.

Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02 12:25:11 -04:00
Vladimir Oltean
09c1b41255 net: dsa: sja1105: Don't use "inline" function declarations in C files
Let the compiler decide.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02 12:25:11 -04:00
Vladimir Oltean
3e8db7e560 net: dsa: sja1105: Fix sleeping while atomic in .port_hwtstamp_set
Currently this stack trace can be seen with CONFIG_DEBUG_ATOMIC_SLEEP=y:

[   41.568348] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
[   41.576757] in_atomic(): 1, irqs_disabled(): 0, pid: 208, name: ptp4l
[   41.583212] INFO: lockdep is turned off.
[   41.587123] CPU: 1 PID: 208 Comm: ptp4l Not tainted 5.3.0-rc6-01445-ge950f2d4bc7f-dirty #1827
[   41.599873] [<c0313d7c>] (unwind_backtrace) from [<c030e13c>] (show_stack+0x10/0x14)
[   41.607584] [<c030e13c>] (show_stack) from [<c1212d50>] (dump_stack+0xd4/0x100)
[   41.614863] [<c1212d50>] (dump_stack) from [<c037dfc8>] (___might_sleep+0x1c8/0x2b4)
[   41.622574] [<c037dfc8>] (___might_sleep) from [<c122ea90>] (__mutex_lock+0x48/0xab8)
[   41.630368] [<c122ea90>] (__mutex_lock) from [<c122f51c>] (mutex_lock_nested+0x1c/0x24)
[   41.638340] [<c122f51c>] (mutex_lock_nested) from [<c0c6fe08>] (sja1105_static_config_reload+0x30/0x27c)
[   41.647779] [<c0c6fe08>] (sja1105_static_config_reload) from [<c0c7015c>] (sja1105_hwtstamp_set+0x108/0x1cc)
[   41.657562] [<c0c7015c>] (sja1105_hwtstamp_set) from [<c0feb650>] (dev_ifsioc+0x18c/0x330)
[   41.665788] [<c0feb650>] (dev_ifsioc) from [<c0febbd8>] (dev_ioctl+0x320/0x6e8)
[   41.673064] [<c0febbd8>] (dev_ioctl) from [<c0f8b1f4>] (sock_ioctl+0x334/0x5e8)
[   41.680340] [<c0f8b1f4>] (sock_ioctl) from [<c05404a8>] (do_vfs_ioctl+0xb0/0xa10)
[   41.687789] [<c05404a8>] (do_vfs_ioctl) from [<c0540e3c>] (ksys_ioctl+0x34/0x58)
[   41.695151] [<c0540e3c>] (ksys_ioctl) from [<c0301000>] (ret_fast_syscall+0x0/0x28)
[   41.702768] Exception stack(0xe8495fa8 to 0xe8495ff0)
[   41.707796] 5fa0:                   beff4a8c 00000001 00000011 000089b0 beff4a8c beff4a80
[   41.715933] 5fc0: beff4a8c 00000001 0000000c 00000036 b6fa98c8 004e19c1 00000001 00000000
[   41.724069] 5fe0: 004dcedc beff4a6c 004c0738 b6e7af4c
[   41.729860] BUG: scheduling while atomic: ptp4l/208/0x00000002
[   41.735682] INFO: lockdep is turned off.

Enabling RX timestamping will logically disturb the fastpath (processing
of meta frames). Replace bool hwts_rx_en with a bit that is checked
atomically from the fastpath and temporarily unset from the sleepable
context during a change of the RX timestamping process (a destructive
operation anyways, requires switch reset).
If found unset, the fastpath (net/dsa/tag_sja1105.c) will just drop any
received meta frame and not take the meta_lock at all.

Fixes: a602afd200 ("net: dsa: sja1105: Expose PTP timestamping ioctls to userspace")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02 12:19:53 -04:00
Vladimir Oltean
d6530e5ad4 net: dsa: sja1105: Initialize the meta_lock
Otherwise, with CONFIG_DEBUG_SPINLOCK=y, this stack trace gets printed
when enabling RX timestamping and receiving a PTP frame:

[  318.537078] INFO: trying to register non-static key.
[  318.542040] the code is fine but needs lockdep annotation.
[  318.547500] turning off the locking correctness validator.
[  318.552972] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-13257-g0825b0669811-dirty #1962
[  318.561283] Hardware name: Freescale LS1021A
[  318.565566] [<c03144bc>] (unwind_backtrace) from [<c030e164>] (show_stack+0x10/0x14)
[  318.573289] [<c030e164>] (show_stack) from [<c11b9f50>] (dump_stack+0xd4/0x100)
[  318.580579] [<c11b9f50>] (dump_stack) from [<c03b9b40>] (register_lock_class+0x728/0x734)
[  318.588731] [<c03b9b40>] (register_lock_class) from [<c03b60c4>] (__lock_acquire+0x78/0x25cc)
[  318.597227] [<c03b60c4>] (__lock_acquire) from [<c03b8ef8>] (lock_acquire+0xd8/0x234)
[  318.605033] [<c03b8ef8>] (lock_acquire) from [<c11db934>] (_raw_spin_lock+0x44/0x54)
[  318.612755] [<c11db934>] (_raw_spin_lock) from [<c1164370>] (sja1105_rcv+0x1f8/0x4e8)
[  318.620561] [<c1164370>] (sja1105_rcv) from [<c115d7cc>] (dsa_switch_rcv+0x80/0x204)
[  318.628283] [<c115d7cc>] (dsa_switch_rcv) from [<c0f58c80>] (__netif_receive_skb_one_core+0x50/0x6c)
[  318.637386] [<c0f58c80>] (__netif_receive_skb_one_core) from [<c0f58f04>] (netif_receive_skb_internal+0xac/0x264)
[  318.647611] [<c0f58f04>] (netif_receive_skb_internal) from [<c0f59e98>] (napi_gro_receive+0x1d8/0x338)
[  318.656887] [<c0f59e98>] (napi_gro_receive) from [<c0c298a4>] (gfar_clean_rx_ring+0x328/0x724)
[  318.665472] [<c0c298a4>] (gfar_clean_rx_ring) from [<c0c29e60>] (gfar_poll_rx_sq+0x34/0x94)
[  318.673795] [<c0c29e60>] (gfar_poll_rx_sq) from [<c0f5b40c>] (net_rx_action+0x128/0x4f8)
[  318.681860] [<c0f5b40c>] (net_rx_action) from [<c03022f0>] (__do_softirq+0x148/0x5ac)
[  318.689666] [<c03022f0>] (__do_softirq) from [<c0355af4>] (irq_exit+0x160/0x170)
[  318.697040] [<c0355af4>] (irq_exit) from [<c03c6818>] (__handle_domain_irq+0x60/0xb4)
[  318.704847] [<c03c6818>] (__handle_domain_irq) from [<c07e9440>] (gic_handle_irq+0x58/0x9c)
[  318.713172] [<c07e9440>] (gic_handle_irq) from [<c0301a70>] (__irq_svc+0x70/0x98)
[  318.720622] Exception stack(0xc2001f18 to 0xc2001f60)
[  318.725656] 1f00:                                                       00000001 00000006
[  318.733805] 1f20: 00000000 c20165c0 ffffe000 c2010cac c2010cf4 00000001 00000000 c2010c88
[  318.741955] 1f40: c1f7a5a8 00000000 00000000 c2001f68 c03ba140 c030a288 200e0013 ffffffff
[  318.750110] [<c0301a70>] (__irq_svc) from [<c030a288>] (arch_cpu_idle+0x24/0x3c)
[  318.757486] [<c030a288>] (arch_cpu_idle) from [<c038a480>] (do_idle+0x1b8/0x2a4)
[  318.764859] [<c038a480>] (do_idle) from [<c038a94c>] (cpu_startup_entry+0x18/0x1c)
[  318.772407] [<c038a94c>] (cpu_startup_entry) from [<c1e00f10>] (start_kernel+0x4cc/0x4fc)

Fixes: 844d7edc6a ("net: dsa: sja1105: Add a global sja1105_tagger_data structure")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02 12:19:53 -04:00
Vladimir Oltean
b6f2494d31 net: dsa: sja1105: Ensure PTP time for rxtstamp reconstruction is not in the past
Sometimes the PTP synchronization on the switch 'jumps':

  ptp4l[11241.155]: rms    8 max   16 freq -21732 +/-  11 delay   742 +/-   0
  ptp4l[11243.157]: rms    7 max   17 freq -21731 +/-  10 delay   744 +/-   0
  ptp4l[11245.160]: rms 33592410 max 134217731 freq +192422 +/- 8530253 delay   743 +/-   0
  ptp4l[11247.163]: rms 811631 max 964131 freq +10326 +/- 557785 delay   743 +/-   0
  ptp4l[11249.166]: rms 261936 max 533876 freq -304323 +/- 126371 delay   744 +/-   0
  ptp4l[11251.169]: rms 48700 max 57740 freq -20218 +/- 30532 delay   744 +/-   0
  ptp4l[11253.171]: rms 14570 max 30163 freq  -5568 +/- 7563 delay   742 +/-   0
  ptp4l[11255.174]: rms 2914 max 3440 freq -22001 +/- 1667 delay   744 +/-   1
  ptp4l[11257.177]: rms  811 max 1710 freq -22653 +/- 451 delay   744 +/-   1
  ptp4l[11259.180]: rms  177 max  218 freq -21695 +/-  89 delay   741 +/-   0
  ptp4l[11261.182]: rms   45 max   92 freq -21677 +/-  32 delay   742 +/-   0
  ptp4l[11263.186]: rms   14 max   32 freq -21733 +/-  11 delay   742 +/-   0
  ptp4l[11265.188]: rms    9 max   14 freq -21725 +/-  12 delay   742 +/-   0
  ptp4l[11267.191]: rms    9 max   16 freq -21727 +/-  13 delay   742 +/-   0
  ptp4l[11269.194]: rms    6 max   15 freq -21726 +/-   9 delay   743 +/-   0
  ptp4l[11271.197]: rms    8 max   15 freq -21728 +/-  11 delay   743 +/-   0
  ptp4l[11273.200]: rms    6 max   12 freq -21727 +/-   8 delay   743 +/-   0
  ptp4l[11275.202]: rms    9 max   17 freq -21720 +/-  11 delay   742 +/-   0
  ptp4l[11277.205]: rms    9 max   18 freq -21725 +/-  12 delay   742 +/-   0

Background: the switch only offers partial RX timestamps (24 bits) and
it is up to the driver to read the PTP clock to fill those timestamps up
to 64 bits. But the PTP clock readout needs to happen quickly enough (in
0.135 seconds, in fact), otherwise the PTP clock will wrap around 24
bits, condition which cannot be detected.

Looking at the 'max 134217731' value on output line 3, one can see that
in hex it is 0x8000003. Because the PTP clock resolution is 8 ns,
that means 0x1000000 in ticks, which is exactly 2^24. So indeed this is
a PTP clock wraparound, but the reason might be surprising.

What is going on is that sja1105_tstamp_reconstruct(priv, now, ts)
expects a "now" time that is later than the "ts" was snapshotted at.
This, of course, is obvious: we read the PTP time _after_ the partial RX
timestamp was received. However, the workqueue is processing frames from
a skb queue and reuses the same PTP time, read once at the beginning.
Normally the skb queue only contains one frame and all goes well. But
when the skb queue contains two frames, the second frame that gets
dequeued might have been partially timestamped by the RX MAC _after_ we
had read our PTP time initially.

The code was originally like that due to concerns that SPI access for
PTP time readout is a slow process, and we are time-constrained anyway
(aka: premature optimization). But some timing analysis reveals that the
time spent until the RX timestamp is completely reconstructed is 1 order
of magnitude lower than the 0.135 s deadline even under worst-case
conditions. So we can afford to read the PTP time for each frame in the
RX timestamping queue, which of course ensures that the full PTP time is
in the partial timestamp's future.

Fixes: f3097be21b ("net: dsa: sja1105: Add a state machine for RX timestamping")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-30 17:20:50 -07:00
Vladimir Oltean
317ab5b86c net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload
This qdisc offload is the closest thing to what the SJA1105 supports in
hardware for time-based egress shaping. The switch core really is built
around SAE AS6802/TTEthernet (a TTTech standard) but can be made to
operate similarly to IEEE 802.1Qbv with some constraints:

- The gate control list is a global list for all ports. There are 8
  execution threads that iterate through this global list in parallel.
  I don't know why 8, there are only 4 front-panel ports.

- Care must be taken by the user to make sure that two execution threads
  never get to execute a GCL entry simultaneously. I created a O(n^4)
  checker for this hardware limitation, prior to accepting a taprio
  offload configuration as valid.

- The spec says that if a GCL entry's interval is shorter than the frame
  length, you shouldn't send it (and end up in head-of-line blocking).
  Well, this switch does anyway.

- The switch has no concept of ADMIN and OPER configurations. Because
  it's so simple, the TAS settings are loaded through the static config
  tables interface, so there isn't even place for any discussion about
  'graceful switchover between ADMIN and OPER'. You just reset the
  switch and upload a new OPER config.

- The switch accepts multiple time sources for the gate events. Right
  now I am using the standalone clock source as opposed to PTP. So the
  base time parameter doesn't really do much. Support for the PTP clock
  source will be added in a future series.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-16 21:32:58 +02:00
Vladimir Oltean
5f06c63bd3 net: dsa: sja1105: Advertise the 8 TX queues
This is a preparation patch for the tc-taprio offload (and potentially
for other future offloads such as tc-mqprio).

Instead of looking directly at skb->priority during xmit, let's get the
netdev queue and the queue-to-traffic-class mapping, and put the
resulting traffic class into the dsa_8021q PCP field. The switch is
configured with a 1-to-1 PCP-to-ingress-queue-to-egress-queue mapping
(see vlan_pmap in sja1105_main.c), so the effect is that we can inject
into a front-panel's egress traffic class through VLAN tagging from
Linux, completely transparently.

Unfortunately the switch doesn't look at the VLAN PCP in the case of
management traffic to/from the CPU (link-local frames at
01-80-C2-xx-xx-xx or 01-1B-19-xx-xx-xx) so we can't alter the
transmission queue of this type of traffic on a frame-by-frame basis. It
is only selected through the "hostprio" setting which ATM is harcoded in
the driver to 7.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-16 21:32:57 +02:00
Vladimir Oltean
e9bf96943b net: dsa: sja1105: Clear VLAN filtering offload netdev feature
The switch barely supports traffic I/O, and it does that by repurposing
VLANs when there is no bridge that is taking control of them.

Letting DSA declare this netdev feature as supported (see
dsa_slave_create) would mean that VLAN sub-interfaces created on sja1105
switch ports will be hardware offloaded. That means that
net/8021q/vlan_core.c would install the VLAN into the filter tables of
the switch, potentially interfering with the tag_8021q VLANs.

We need to prevent that from happening and not let the 8021q core
offload VLANs to the switch hardware tables. In vlan_filtering=0 modes
of operation, the switch ports can pass through VLAN-tagged frames with
no problem.

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-27 20:46:26 -07:00
YueHaibing
e3e3af9aa2 net: dsa: sja1105: remove set but not used variables 'tx_vid' and 'rx_vid'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/dsa/sja1105/sja1105_main.c: In function sja1105_fdb_dump:
drivers/net/dsa/sja1105/sja1105_main.c:1226:14: warning:
 variable tx_vid set but not used [-Wunused-but-set-variable]
drivers/net/dsa/sja1105/sja1105_main.c:1226:6: warning:
 variable rx_vid set but not used [-Wunused-but-set-variable]

They are not used since commit 6d7c7d948a ("net: dsa:
sja1105: Fix broken learning with vlan_filtering disabled")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08 22:29:02 -07:00
Vladimir Oltean
6cb0abbdf9 net: dsa: sja1105: Really fix panic on unregistering PTP clock
The IS_ERR_OR_NULL(priv->clock) check inside
sja1105_ptp_clock_unregister() is preventing cancel_delayed_work_sync
from actually being run.

Additionally, sja1105_ptp_clock_unregister() does not actually get run,
when placed in sja1105_remove(). The DSA switch gets torn down, but the
sja1105 module does not get unregistered. So sja1105_ptp_clock_unregister
needs to be moved to sja1105_teardown, to be symmetrical with
sja1105_ptp_clock_register which is called from the DSA sja1105_setup.

It is strange to fix a "fixes" patch, but the probe failure can only be
seen when the attached PHY does not respond to MDIO (issue which I can't
pinpoint the reason to) and it goes away after I power-cycle the board.
This time the patch was validated on a failing board, and the kernel
panic from the fixed commit's message can no longer be seen.

Fixes: 29dd908d35 ("net: dsa: sja1105: Cancel PTP delayed work on unregister")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:37:02 -07:00
Vladimir Oltean
4b7da3d808 net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well
It looks like the FDB dump taken from first-generation switches also
contains information on whether entries are static or not. So use that
instead of searching through the driver's tables.

Fixes: d763778224 ("net: dsa: sja1105: Implement is_static for FDB entries on E/T")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:37:02 -07:00
Vladimir Oltean
6d7c7d948a net: dsa: sja1105: Fix broken learning with vlan_filtering disabled
When put under a bridge with vlan_filtering 0, the SJA1105 ports will
flood all traffic as if learning was broken. This is because learning
interferes with the rx_vid's configured by dsa_8021q as unique pvid's.

So learning technically still *does* work, it's just that the learnt
entries never get matched due to their unique VLAN ID.

The setting that saves the day is Shared VLAN Learning, which on this
switch family works exactly as desired: VLAN tagging still works
(untagged traffic gets the correct pvid) and FDB entries are still
populated with the correct contents including VID. Also, a frame cannot
violate the forwarding domain restrictions enforced by its classified
VLAN. It is just that the VID is ignored when looking up the FDB for
taking a forwarding decision (selecting the egress port).

This patch activates SVL, and the result is that frames with a learnt
DMAC are no longer flooded in the scenario described above.

Now exactly *because* SVL works as desired, we have to revisit some
earlier patches:

- It is no longer necessary to manipulate the VID of the 'bridge fdb
  {add,del}' command when vlan_filtering is off. This is because now,
  SVL is enabled for that case, so the actual VID does not matter*.

- It is still desirable to hide dsa_8021q VID's in the FDB dump
  callback. But right now the dump callback should no longer hide
  duplicates (one per each front panel port's pvid, plus one for the
  VLAN that the CPU port is going to tag a TX frame with), because there
  shouldn't be any (the switch will match a single FDB entry no matter
  its VID anyway).

* Not really... It's no longer necessary to transform a 'bridge fdb add'
  into 5 fdb add operations, but the user might still add a fdb entry with
  any vid, and all of them would appear as duplicates in 'bridge fdb
  show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we
  can prune the duplicates at insertion time.

** The VID of 0 is better than 1 because it is always guaranteed to be
   in the ports' hardware filter. DSA also avoids putting the VID inside
   the netlink response message towards the bridge driver when we return
   this particular VID, which makes it suitable for FDB entries learnt
   with vlan_filtering off.

Fixes: 227d07a07e ("net: dsa: sja1105: Add support for traffic through standalone ports")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06 14:37:02 -07:00
Nishka Dasgupta
7ba771e3e2 net: dsa: sja1105: sja1105_main: Add of_node_put()
Each iteration of for_each_child_of_node puts the previous node, but in
the case of a return from the middle of the loop, there is no put, thus
causing a memory leak. Hence add an of_node_put before the return.
Issue found with Coccinelle.

Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-23 13:38:26 -07:00
Vladimir Oltean
9f971573d9 net: dsa: sja1105: Mark in-band AN modes not supported for PHYLINK
We need a better way to signal this, perhaps in phylink_validate, but
for now just print this error message as guidance for other people
looking at this driver's code while trying to rework PHYLINK.

Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28 09:31:31 -07:00
Vladimir Oltean
39710229af net: dsa: sja1105: Check for PHY mode mismatches with what PHYLINK reports
PHYLINK being designed with PHYs in mind that can change MII protocol,
for correct operation it is necessary to ensure that the PHY interface
mode stays the same (otherwise clear the supported bit mask, as
required).

Because this is just a hypothetical situation for now, we don't bother
to check whether we could actually support the new PHY interface mode.
Actually we could modify the xMII table, reset the switch and send an
updated static configuration, but adding that would just be dead code.

Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28 09:31:31 -07:00
Vladimir Oltean
a979a0ab36 net: dsa: sja1105: Don't check state->link in phylink_mac_config
It has been pointed out that PHYLINK can call mac_config only to update
the phy_interface_type and without knowing what the AN results are.

Experimentally, when this was observed to happen, state->link was also
unset, and therefore was used as a proxy to ignore this call. However it
is also suggested that state->link is undefined for this callback and
should not be relied upon.

So let the previously-dead codepath for SPEED_UNKNOWN be called, and
update the comment to make sure the MAC's behavior is sane.

Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28 09:31:31 -07:00
Vladimir Oltean
d763778224 net: dsa: sja1105: Implement is_static for FDB entries on E/T
The first generation switches don't tell us through the dynamic config
interface whether the dumped FDB entries are static or not (the LOCKEDS
bit from P/Q/R/S).

However, now that we're keeping a mirror of all 'bridge fdb' commands in
the static config, this is an opportunity to compare a dumped FDB entry
to the driver's private database.  After all, what makes an entry static
is that *we* added it.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 11:03:22 -07:00
Vladimir Oltean
b3ee526a88 net: dsa: sja1105: Use correct dsa_8021q VIDs for FDB commands
A FDB entry means that "frames that match this VID and DMAC must be
forwarded to this port".

In the case of dsa_8021q however, the VID is not a single one (and
neither two, as my previous patch assumed). The VID can be set either by
the CPU port (1 tx_vid), or by any of the other front-panel port (n-1
rx_vid's).

Fixes: 93647594d8 ("net: dsa: sja1105: Hide the dsa_8021q VLANs from the bridge fdb command")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 11:03:21 -07:00
Vladimir Oltean
17ae655540 net: dsa: sja1105: Populate is_static for FDB entries on P/Q/R/S
The reason why this wasn't tackled earlier is that I had hoped I
understood the user manual wrong.  But unfortunately hacks are required
in order to retrieve the static/dynamic nature of FDB entries on SJA1105
P/Q/R/S, since this info is stored in the writeback buffer of the
dynamic config command.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 11:03:21 -07:00
Vladimir Oltean
60f6053ff1 net: dsa: sja1105: Back up static FDB entries in kernel memory
After commit 8456721dd4 ("net: dsa: sja1105: Add support for
configuring address ageing time"), we started to reset the switch rather
often (each time the bridge core changes the ageing time on a switch
port).

The unfortunate reality is that SJA1105 doesn't have any {cold, warm,
whatever} reset mode in which it accepts a new configuration stream
without flushing the FDB.  Instead, in its world, the FDB *is* an
optional part of the static configuration.

So we play its game, and do what we also do for VLANs: for each 'bridge
fdb' command, we add the FDB entry through the dynamic interface, and we
append the in-kernel static config memory with info that we're going to
use later, when the next reset command is going to be issued.

The result is that 'bridge fdb' commands are now persistent (dynamically
learned entries are lost, but that's ok).

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 11:03:21 -07:00
Vladimir Oltean
6c56e167cc net: dsa: sja1105: Make P/Q/R/S learn MAC addresses
At the end of the commit 1da7382134 ("net: dsa: sja1105: Add FDB
operations for P/Q/R/S series") message, I said that:

    At the moment only FDB entries installed statically through 'bridge fdb'
    are visible in the dump callback - the dynamically learned ones are
    still under investigation.

It looks like the reason why they were not visible in 'bridge fdb' was
that they were never learned - always flooded.

SJA1105 P/Q/R/S manual says about the MAXADDRP[port] field:

    Specify the maximum number of MAC address dynamically learned from
    the respective port. It is used to limit the number of learned MAC
    addresses per port.

It looks like not providing a value in the static config (aka providing
zeroes) is enough for it to not store the learned addresses in the FDB.

For now we divide the 1024 entry FDB "equally" amongst the 5 ports. This
may be revisited if the situation calls for that - for now I'm happy
that learning works.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 11:03:21 -07:00
Vladimir Oltean
e3502b8297 net: dsa: sja1105: Make vid 1 the default pvid
In SJA1105 there is no concept of 'default values' per se, everything
needs to be driver-supplied through the static configuration tables.

The issue is that the hardware manual says that 'at least the default
untagging VLAN' is mandatory to be provided through the static config.
But VLAN 0 isn't a very good initial pvid - its use is reserved for
priority-tagged frames, and the layers of the stack that care about
those already make sure that this VLAN is installed, as can be seen in
the message below:

  8021q: adding VLAN 0 to HW filter on device swp2

So change the pvid provided through the static configuration to 1, which
matches the bridge core's defaults.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27 11:03:21 -07:00
YueHaibing
1dbb98699c net: dsa: sja1105: Make two functions static
Fix sparse warnings:

drivers/net/dsa/sja1105/sja1105_main.c:1848:6:
 warning: symbol 'sja1105_port_rxtstamp' was not declared. Should it be static?
drivers/net/dsa/sja1105/sja1105_main.c:1869:6:
 warning: symbol 'sja1105_port_txtstamp' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-13 13:58:32 -07:00
Vladimir Oltean
8400cff60b net: dsa: sja1105: Rethink the PHYLINK callbacks
The first fact that needs to be stated is that the per-MAC settings in
SJA1105 called EGRESS and INGRESS do *not* disable egress and ingress on
the MAC. They only prevent non-link-local traffic from being
sent/received on this port.

So instead of having .phylink_mac_config essentially mess with the STP
state and force it to DISABLED/BLOCKING (which also brings useless
complications in sja1105_static_config_reload), simply add the
.phylink_mac_link_down and .phylink_mac_link_up callbacks which inhibit
TX at the MAC level, while leaving RX essentially enabled.

Also stop from trying to put the link down in .phylink_mac_config, which
is incorrect.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:58:59 -07:00
Vladimir Oltean
1fd4a173f0 net: dsa: sja1105: Update some comments about PHYLIB
Since the driver is now using PHYLINK exclusively, it makes sense to
remove all references to it and replace them with PHYLINK.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:58:58 -07:00
Vladimir Oltean
c44d05358e net: dsa: sja1105: Use SPEED_{10, 100, 1000, UNKNOWN} macros
This is a cosmetic patch that replaces the link speed numbers used in
the driver with the corresponding ethtool macros.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:58:58 -07:00
Vladimir Oltean
a602afd200 net: dsa: sja1105: Expose PTP timestamping ioctls to userspace
This enables the PTP support towards userspace applications such as
linuxptp.

The switches can timestamp only trapped multicast MAC frames, and
therefore only the profiles of 1588 over L2 are supported.

TX timestamping can be enabled per port, but RX timestamping is enabled
globally. As long as RX timestamping is enabled, the switch will emit
metadata follow-up frames that will be processed by the tagger. It may
be a problem that linuxptp does not restore the RX timestamping settings
when exiting.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
f3097be21b net: dsa: sja1105: Add a state machine for RX timestamping
Meta frame reception relies on the hardware keeping its promise that it
will send no other traffic towards the CPU port between a link-local
frame and a meta frame.  Otherwise there is no other way to associate
the meta frame with the link-local frame it's holding a timestamp of.
The receive function is made stateful, and buffers a timestampable frame
until its meta frame arrives, then merges the two, drops the meta and
releases the link-local frame up the stack.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
08fde09a0d net: dsa: sja1105: Increase priority of CPU-trapped frames
Without noticing any particular issue, this patch ensures that
management traffic is treated with the maximum priority on RX by the
switch.  This is generally desirable, as the driver keeps a state
machine that waits for metadata follow-up frames as soon as a management
frame is received.  Increasing the priority helps expedite the reception
(and further reconstruction) of the RX timestamp to the driver after the
MAC has generated it.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
844d7edc6a net: dsa: sja1105: Add a global sja1105_tagger_data structure
This will be used to keep state for RX timestamping. It is global
because the switch serializes timestampable and meta frames when
trapping them towards the CPU port (lower port indices have higher
priority) and therefore having one state machine per port would create
unnecessary complications.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
24c01949e5 net: dsa: sja1105: Add support for the AVB Parameters Table
This table is used to program the switch to emit "meta" follow-up
Ethernet frames (which contain partial RX timestamps) after each
link-local frame that was trapped to the CPU port through MAC filtering.
This includes PTP frames.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
47ed985e97 net: dsa: sja1105: Add logic for TX timestamping
On TX, timestamping is performed synchronously from the
port_deferred_xmit worker thread.
In management routes, the switch is requested to take egress timestamps
(again partial), which are reconstructed and appended to a clone of the
skb that was just sent.  The cloning is done by DSA and we retrieve the
pointer from the structure that DSA keeps in skb->cb.
Then these clones are enqueued to the socket's error queue for
application-level processing.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
bb77f36ac2 net: dsa: sja1105: Add support for the PTP clock
The design of this PHC driver is influenced by the switch's behavior
w.r.t. timestamping.  It exposes two PTP counters, one free-running
(PTPTSCLK) and the other offset- and frequency-corrected in hardware
through PTPCLKVAL, PTPCLKADD and PTPCLKRATE.  The MACs can sample either
of these for frame timestamps.

However, the user manual warns that taking timestamps based on the
corrected clock is less than useful, as the switch can deliver corrupted
timestamps in a variety of circumstances.

Therefore, this PHC uses the free-running PTPTSCLK together with a
timecounter/cyclecounter structure that translates it into a software
time domain.  Thus, the settime/adjtime and adjfine callbacks are
hardware no-ops.

The timestamps (introduced in a further patch) will also be translated
to the correct time domain before being handed over to the userspace PTP
stack.

The introduction of a second set of PHC operations that operate on the
hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat
unavoidable, as the TTEthernet core uses the corrected PTP time domain.
However, the free-running counter + timecounter structure combination
will suffice for now, as the resulting timestamps yield a sub-50 ns
synchronization offset in steady state using linuxptp.

For this patch, in absence of frame timestamping, the operations of the
switch PHC were tested by syncing it to the system time as a local slave
clock with:

phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
42824463d3 net: dsa: sja1105: Limit use of incl_srcpt to bridge+vlan mode
The incl_srcpt setting makes the switch mangle the destination MACs of
multicast frames trapped to the CPU - a primitive tagging mechanism that
works even when we cannot use the 802.1Q software features.

The downside is that the two multicast MAC addresses that the switch
traps for L2 PTP (01-80-C2-00-00-0E and 01-1B-19-00-00-00) quickly turn
into a lot more, as the switch encodes the source port and switch id
into bytes 3 and 4 of the MAC. The resulting range of MAC addresses
would need to be installed manually into the DSA master port's multicast
MAC filter, and even then, most devices might not have a large enough
MAC filtering table.

As a result, only limit use of incl_srcpt to when it's strictly
necessary: when under a VLAN filtering bridge.  This fixes PTP in
non-bridged mode (standalone ports). Otherwise, PTP frames, as well as
metadata follow-up frames holding RX timestamps won't be received
because they will be blocked by the master port's MAC filter.
Linuxptp doesn't help, because it only requests the addition of the
unmodified PTP MACs to the multicast filter.
This issue is not seen in bridged mode because the master port is put in
promiscuous mode when the slave ports are enslaved to a bridge.
Therefore, there is no downside to having the incl_srcpt mechanism
active there.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
f9a1a7646c net: dsa: sja1105: Reverse TPID and TPID2
>From reading the P/Q/R/S user manual, it appears that TPID is used by
the switch for detecting S-tags and TPID2 for C-tags.  Their meaning is
not clear from the E/T manual.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:40 -07:00
Vladimir Oltean
070ca3bb95 net: dsa: sja1105: Move sja1105_change_tpid into sja1105_vlan_filtering
This is a cosmetic patch, pre-cursor to making another change to the
General Parameters Table (incl_srcpt) which does not logically pertain
to the sja1105_change_tpid function name, but not putting it there would
otherwise create a need of resetting the switch twice.

So simply move the existing code into the .port_vlan_filtering callback,
where the incl_srcpt change will be added as well.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:20:39 -07:00
David S. Miller
a6cdeeb16b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-07 11:00:14 -07:00
Vladimir Oltean
f4cfcfbdf0 net: dsa: sja1105: Fix link speed not working at 100 Mbps and below
The hardware values for link speed are held in the sja1105_speed_t enum.
However they do not increase in the order that sja1105_get_speed_cfg was
iterating over them (basically from SJA1105_SPEED_AUTO - 0 - to
SJA1105_SPEED_1000MBPS - 1 - skipping the other two).

Another bug is that the code in sja1105_adjust_port_config relies on the
fact that an invalid link speed is detected by sja1105_get_speed_cfg and
returned as -EINVAL.  However storing this into an enum that only has
positive members will cast it into an unsigned value, and it will miss
the negative check.

So take the simplest approach and remove the sja1105_get_speed_cfg
function and replace it with a simple switch-case statement.

Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 11:51:57 -07:00
Vladimir Oltean
93647594d8 net: dsa: sja1105: Hide the dsa_8021q VLANs from the bridge fdb command
TX VLANs and RX VLANs are an internal implementation detail of DSA for
frame tagging.  They work by installing special VLANs on switch ports in
the operating modes where no behavior change w.r.t. VLANs can be
observed by the user.

Therefore it makes sense to hide these VLANs in the 'bridge fdb'
command, as well as translate the pvid into the RX VID and TX VID on
'bridge fdb add' and 'bridge fdb del' commands.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 11:49:20 -07:00
Vladimir Oltean
7752e937f1 net: dsa: sja1105: Unset port from forwarding mask unconditionally on fdb_del
This is a cosmetic patch that simplifies the code by removing a
redundant check. A logical AND-with-zero performed on a zero is still
zero.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 11:49:20 -07:00
Vladimir Oltean
1da7382134 net: dsa: sja1105: Add FDB operations for P/Q/R/S series
This adds support for manipulating the L2 forwarding database (dump,
add, delete) for the second generation of NXP SJA1105 switches.

At the moment only FDB entries installed statically through 'bridge fdb'
are visible in the dump callback - the dynamically learned ones are
still under investigation.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 11:49:20 -07:00
Vladimir Oltean
2a7e740929 net: dsa: sja1105: Add P/Q/R/S management route support via dynamic interface
Management routes are one-shot FDB rules installed on the CPU port for
sending link-local traffic.  They are a prerequisite for STP, PTP etc to
work.

Also make a note that removing a management route was not supported on
the previous generation of switches.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 11:49:20 -07:00
Vladimir Oltean
def846042f net: dsa: sja1105: Make dynamic_config_read return -ENOENT if not found
Conceptually, if an entry is not found in the requested hardware table,
it is not an invalid request - so change the error returned
appropriately.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 11:49:19 -07:00
Vladimir Oltean
9dfa69118f net: dsa: sja1105: Make room for P/Q/R/S FDB operations
The DSA callbacks were written with the E/T (first generation) in mind,
which is quite different.

For P/Q/R/S completely new implementations need to be provided, which
are held as function pointers in the priv->info structure.  We are
taking a slightly roundabout way for this (a function from
sja1105_main.c reads a structure defined in sja1105_spi.c that
points to a function defined in sja1105_main.c), but it is what it is.

The FDB dump callback works for both families, hence no function pointer
for that.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-04 11:49:19 -07:00
Vladimir Oltean
af7cd0366e net: dsa: sja1105: Fix broken fixed-link interfaces on user ports
PHYLIB and PHYLINK handle fixed-link interfaces differently. PHYLIB
wraps them in a software PHY ("pseudo fixed link") phydev construct such
that .adjust_link driver callbacks see an unified API. Whereas PHYLINK
simply creates a phylink_link_state structure and passes it to
.mac_config.

At the time the driver was introduced, DSA was using PHYLIB for the
CPU/cascade ports (the ones with no net devices) and PHYLINK for
everything else.

As explained below:

commit aab9c4067d
Author: Florian Fainelli <f.fainelli@gmail.com>
Date:   Thu May 10 13:17:36 2018 -0700

  net: dsa: Plug in PHYLINK support

  Drivers that utilize fixed links for user-facing ports (e.g: bcm_sf2)
  will need to implement phylink_mac_ops from now on to preserve
  functionality, since PHYLINK *does not* create a phy_device instance
  for fixed links.

In the above patch, DSA guards the .phylink_mac_config callback against
a NULL phydev pointer.  Therefore, .adjust_link is not called in case of
a fixed-link user port.

This patch fixes the situation by converting the driver from using
.adjust_link to .phylink_mac_config.  This can be done now in a unified
fashion for both slave and CPU/cascade ports because DSA now uses
PHYLINK for all ports.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 21:48:54 -07:00
Vladimir Oltean
3b2c4f4d63 net: dsa: sja1105: Don't return a negative in u8 sja1105_stp_state_get
Dan Carpenter says:

The patch 640f763f98: "net: dsa: sja1105: Add support for Spanning
Tree Protocol" from May 5, 2019, leads to the following static
checker warning:

        drivers/net/dsa/sja1105/sja1105_main.c:1073 sja1105_stp_state_get()
        warn: signedness bug returning '(-22)'

The caller doesn't check for negative errors anyway.

Fixes: 640f763f98: ("net: dsa: sja1105: Add support for Spanning Tree Protocol")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-08 13:57:50 -07:00
Vladimir Oltean
640f763f98 net: dsa: sja1105: Add support for Spanning Tree Protocol
While not explicitly documented as supported in UM10944, compliance with
the STP states can be obtained by manipulating 3 settings at the
(per-port) MAC config level: dynamic learning, inhibiting reception of
regular traffic, and inhibiting transmission of regular traffic.

In all these modes, transmission and reception of special BPDU frames
from the stack is still enabled (not inhibited by the MAC-level
settings).

On ingress, BPDUs are classified by the MAC filter as link-local
(01-80-C2-00-00-00) and forwarded to the CPU port.  This mechanism works
under all conditions (even without the custom 802.1Q tagging) because
the switch hardware inserts the source port and switch ID into bytes 4
and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put
back zeroes into the MAC address after decoding the source port
information.

On egress, BPDUs are transmitted using management routes from the xmit
worker thread. Again this does not require switch tagging, as the switch
port is programmed through SPI to hold a temporary (single-fire) route
for a frame with the programmed destination MAC (01-80-C2-00-00-00).

STP is activated using the following commands and was tested by
connecting two front-panel ports together and noticing that switching
loops were prevented (one port remains in the blocking state):

$ ip link add name br0 type bridge stp_state 1 && ip link set br0 up
$ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/);
  do ip link set ${eth} master br0 && ip link set ${eth} up; done

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 21:52:42 -07:00
Vladimir Oltean
227d07a07e net: dsa: sja1105: Add support for traffic through standalone ports
In order to support this, we are creating a make-shift switch tag out of
a VLAN trunk configured on the CPU port. Termination of normal traffic
on switch ports only works when not under a vlan_filtering bridge.
Termination of management (PTP, BPDU) traffic works under all
circumstances because it uses a different tagging mechanism
(incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.

There are two types of traffic: regular and link-local.

The link-local traffic received on the CPU port is trapped from the
switch's regular forwarding decisions because it matched one of the two
DMAC filters for management traffic.

On transmission, the switch requires special massaging for these
link-local frames. Due to a weird implementation of the switching IP, by
default it drops link-local frames that originate on the CPU port.
It needs to be told where to forward them to, through an SPI command
("management route") that is valid for only a single frame.
So when we're sending link-local traffic, we are using the
dsa_defer_xmit mechanism.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 21:52:42 -07:00
Vladimir Oltean
ad9f299a87 net: dsa: sja1105: Reject unsupported link modes for AN
Ethernet flow control:

The switch MAC does not consume, nor does it emit pause frames. It
simply forwards them as any other Ethernet frame (and since the DMAC is,
per IEEE spec, 01-80-C2-00-00-01, it means they are filtered as
link-local traffic and forwarded to the CPU, which can't do anything
useful with them).

Duplex:

There is no duplex setting in the SJA1105 MAC. It is known to forward
traffic at line rate on the same port in both directions. Therefore it
must be that it only supports full duplex.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03 10:49:17 -04:00
Vladimir Oltean
8456721dd4 net: dsa: sja1105: Add support for configuring address ageing time
If STP is active, this setting is applied on bridged ports each time an
Ethernet link is established (topology changes).

Since the setting is global to the switch and a reset is required to
change it, resets are prevented if the new callback does not change the
value that the hardware already is programmed for.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03 10:49:17 -04:00
Vladimir Oltean
52c34e6e12 net: dsa: sja1105: Add support for ethtool port counters
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03 10:49:17 -04:00
Vladimir Oltean
6666cebc5e net: dsa: sja1105: Add support for VLAN operations
VLAN filtering cannot be properly disabled in SJA1105. So in order to
emulate the "no VLAN awareness" behavior (not dropping traffic that is
tagged with a VID that isn't configured on the port), we need to hack
another switch feature: programmable TPID (which is 0x8100 for 802.1Q).
We are reprogramming the TPID to a bogus value which leaves the switch
thinking that all traffic is untagged, and therefore accepts it.

Under a vlan_filtering bridge, the proper TPID of ETH_P_8021Q is
installed again, and the switch starts identifying 802.1Q-tagged
traffic.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03 10:49:17 -04:00
Vladimir Oltean
f5b8631c29 net: dsa: sja1105: Error out if RGMII delays are requested in DT
Documentation/devicetree/bindings/net/ethernet.txt is confusing because
it says what the MAC should not do, but not what it *should* do:

  * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
     should not add an RX delay in this case)

The gap in semantics is threefold:
1. Is it illegal for the MAC to apply the Rx internal delay by itself,
   and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
   passing it to of_phy_connect? The documentation would suggest yes.
1. For "rgmii-rxid", while the situation with the Rx clock skew is more
   or less clear (needs to be added by the PHY), what should the MAC
   driver do about the Tx delays? Is it an implicit wild card for the
   MAC to apply delays in the Tx direction if it can? What if those were
   already added as serpentine PCB traces, how could that be made more
   obvious through DT bindings so that the MAC doesn't attempt to add
   them twice and again potentially break the link?
3. If the interface is a fixed-link and therefore the PHY object is
   fixed (a purely software entity that obviously cannot add clock
   skew), what is the meaning of the above property?

So an interpretation of the RGMII bindings was chosen that hopefully
does not contradict their intention but also makes them more applied.
The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings
if the port is in the PHY role (either explicitly, or if it is a
fixed-link). Otherwise it always passes the duty of setting up delays to
the PHY driver.

The error behavior that this patch adds is required on SJA1105E/T where
the MAC really cannot apply internal delays. If the other end of the
fixed-link cannot apply RGMII delays either (this would be specified
through its own DT bindings), then the situation requires PCB delays.

For SJA1105P/Q/R/S, this is however hardware supported and the error is
thus only temporary. I created a stub function pointer for configuring
delays per-port on RXC and TXC, and will implement it when I have access
to a board with this hardware setup.

Meanwhile do not allow the user to select an invalid configuration.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03 10:49:17 -04:00
Vladimir Oltean
291d1e72b7 net: dsa: sja1105: Add support for FDB and MDB management
Currently only the (more difficult) first generation E/T series is
supported. Here the TCAM is only 4-way associative, and to know where
the hardware will search for a FDB entry, we need to perform the same
hash algorithm in order to install the entry in the correct bin.

On P/Q/R/S, the TCAM should be fully associative. However the SPI
command interface is different, and because I don't have access to a
new-generation device at the moment, support for it is TODO.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03 10:49:17 -04:00
Vladimir Oltean
8aa9ebccae net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch
At this moment the following is supported:
* Link state management through phylib
* Autonomous L2 forwarding managed through iproute2 bridge commands.

IP termination must be done currently through the master netdevice,
since the switch is unmanaged at this point and using
DSA_TAG_PROTO_NONE.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03 10:49:17 -04:00