1. 19 Dec, 2018 9 commits
    • Merge branch 'vxlan-Various-fixes' · 59fc137e
      David S. Miller authored
      Petr Machata says:
      
      ====================
      vxlan: Various fixes
      
      This patch set contains three fixes for the vxlan driver.
      
      Patch #1 fixes handling of offload mark on replaced VXLAN FDB entries. A
      way to trigger this is to replace the FDB entry with one that can not be
      offloaded. A future patch set should make it possible to veto such FDB
      changes. However the FDB might still fail to be offloaded due to another
      issue, and the offload mark should reflect that.
      
      Patch #2 fixes problems in __vxlan_dev_create() when a call to
      rtnl_configure_link() fails. These failures would be tricky to hit on a
      real system, the most likely vector is through an error in vxlan_open().
      However, with the abovementioned vetoing patchset, vetoing the created
      entry would trigger the same problems (and be easier to reproduce).
      
      Patch #3 fixes a problem in vxlan_changelink(). In situations where the
      default remote configured in the FDB table (if any) does not exactly
      match the remote address configured at the VXLAN device, changing the
      remote address breaks the default FDB entry. Patch #4 is then a self
      test for this issue.
      
      v3:
      - Patch #2:
          - Reuse the same errout block for both cleanup paths. Use a bool to
            decide whether the unregister_netdevice() call should be made.
      
      v2:
      - Drop former patch #3
      - Patch #2:
          - Delete the default entry before calling unregister_netdevice(). That
            takes care of former patch #3, hence tweak the commit message to
            mention that problem as well.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: net: Add test_vxlan_fdb_changelink.sh · 55cbe079
      Petr Machata authored
      Add a test to exercise the fix from the previous patch.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: changelink: Fix handling of default remotes · ce5e098f
      Petr Machata authored
      Default remotes are stored as FDB entries with an Ethernet address of
      00:00:00:00:00:00. When a request is made to change a remote address of
      a VXLAN device, vxlan_changelink() first deletes the existing default
      remote, and then creates a new FDB entry.
      
      This works well as long as the list of default remotes matches exactly
      the configuration of a VXLAN remote address. Thus when the VXLAN device
      has a remote of X, there should be exactly one default remote FDB entry
      X. If the VXLAN device has no remote address, there should be no such
      entry.
      
      Besides using "ip link set", it is possible to manipulate the list of
      default remotes by using "bridge fdb". It is therefore easy to break
      the above condition. Under such circumstances, the __vxlan_fdb_delete()
      call doesn't delete the FDB entry itself, but just one remote. The
      following vxlan_fdb_create() then creates a new FDB entry, leading to a
      situation where two entries exist for the address 00:00:00:00:00:00,
      each with a different subset of default remotes.
      
      An even more obvious breakage rooted in the same cause can be observed
      when a remote address is configured for a VXLAN device that did not have
      one before. In that case vxlan_changelink() doesn't remove any remote,
      and just creates a new FDB entry for the new address:
      
      $ ip link add name vx up type vxlan id 2000 dstport 4789
      $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.20 self permanent
      $ bridge fdb ap dev vx 00:00:00:00:00:00 dst 192.0.2.30 self permanent
      $ ip link set dev vx type vxlan remote 192.0.2.30
      $ bridge fdb sh dev vx | grep 00:00:00:00:00:00
      00:00:00:00:00:00 dst 192.0.2.30 self permanent <- new entry, 1 rdst
      00:00:00:00:00:00 dst 192.0.2.20 self permanent <- orig. entry, 2 rdsts
      00:00:00:00:00:00 dst 192.0.2.30 self permanent
      
      To fix this, instead of calling vxlan_fdb_create() directly, defer to
      vxlan_fdb_update(). That has logic to handle the duplicates properly.
      Additionally, it also handles notifications, so drop that call from
      changelink as well.
      
      Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Fix error path in __vxlan_dev_create() · 6db92468
      Petr Machata authored
      When a failure occurs in rtnl_configure_link(), the current code
      calls unregister_netdevice() to roll back the earlier call to
      register_netdevice(), and jumps to errout, which calls
      vxlan_fdb_destroy().
      
      However, unregister_netdevice() transitively calls ndo_uninit, which is
      vxlan_uninit(), and that already takes care of deleting the default FDB
      entry by calling vxlan_fdb_delete_default(). Since the entry added
      earlier in __vxlan_dev_create() is exactly the default entry, the
      cleanup code in the errout block always leads to a double free and thus a
      panic.
      
      Besides, since vxlan_fdb_delete_default() always destroys the FDB entry
      with notification enabled, the deletion of the default entry is notified
      even before the addition was notified.
      
      Instead, move the unregister_netdevice() call after the manual destroy,
      which solves both problems.
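
      The reordered cleanup can be illustrated with a small user-space model.
      This is only a sketch: every sim_* name is hypothetical, the counters
      stand in for the real FDB entry and netdevice, and the bool mirrors the
      approach the cover letter describes for deciding whether to unregister.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the device; not the kernel's data structures. */
struct sim_dev {
    bool registered;
    int fdb_refs;       /* 1 while the default FDB entry exists */
    int destroy_calls;  /* how many times the entry was freed */
};

static void sim_fdb_destroy(struct sim_dev *d)
{
    d->destroy_calls++;
    d->fdb_refs--;
}

/* Models the uninit hook: deletes the default entry only if it still exists. */
static void sim_unregister(struct sim_dev *d)
{
    if (d->fdb_refs > 0)
        sim_fdb_destroy(d);
    d->registered = false;
}

/* Returns 0 on success, -1 on failure. The fail_* flags force error paths. */
static int sim_dev_create(struct sim_dev *d, bool fail_register,
                          bool fail_configure)
{
    bool unregister = false;

    assert(d);
    d->fdb_refs = 1;            /* default FDB entry created */
    if (fail_register)
        goto errout;
    d->registered = true;
    unregister = true;
    if (fail_configure)
        goto errout;
    return 0;

errout:
    /* Destroy the entry manually first, then unregister if needed. */
    sim_fdb_destroy(d);
    if (unregister)
        sim_unregister(d);
    return -1;
}
```

      Because the entry is already gone when the unregister step runs, the
      model frees it exactly once on either failure path.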
      
      Fixes: 0241b836 ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Unmark offloaded bit on replaced FDB entries · 6ad0b5a4
      Petr Machata authored
      When rdst of an offloaded FDB entry is replaced, it certainly isn't
      offloaded anymore. Drivers are notified about such replacements, and can
      re-mark the entry as offloaded again if they so wish. However until a
      driver does so explicitly, assume a replaced FDB entry is not offloaded.
      
      Note that replaces coming via vxlan_fdb_external_learn_add() are always
      immediately followed by an explicit offload marking.
      
      Fixes: 0efe1173 ("vxlan: Support marking RDSTs as offloaded")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'macb-DMA-race-fixes' · a9d6d897
      David S. Miller authored
      Anssi Hannula says:
      
      ====================
      net: macb: DMA race condition fixes
      
      Here are a couple of race condition fixes for the macb driver. The first
      two are for issues observed at runtime on real HW.
      
      v2:
      - added received Tested-bys and Acked-bys to the first two patches
      - in patch 3/3, moved the timestamp protection barrier closer to the
        timestamp reads
      - in patch 3/3, removed unnecessary move of the addr assignment in
        gem_rx() to keep the patch minimal for maximum clarity
      - in patch 3/3, clarified commit message and comments
      
      The 3/3 is the same one I improperly sent last week as a standalone
      patch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: add missing barriers when reading descriptors · 6e0af298
      Anssi Hannula authored
      When reading buffer descriptors on RX or on TX completion, an
      RX_USED/TX_USED bit is checked first to ensure that the descriptors have
      been populated, i.e. the ownership has been transferred. However, there
      are no memory barriers to ensure that the data protected by the
      RX_USED/TX_USED bit is up-to-date with respect to that bit.
      
      Specifically:
      
      - TX timestamp descriptors may be loaded before ctrl is loaded for the
        TX_USED check, which is racy as the descriptors may be updated between
        the loads, causing old timestamp descriptor data to be used.
      
      - RX ctrl may be loaded before addr is loaded for the RX_USED check,
        which is racy as a new frame may be written between the loads, causing
        old ctrl descriptor data to be used.
        This issue exists for both macb_rx() and gem_rx() variants.
      
      Fix the races by adding DMA read memory barriers on those paths and
      reordering the reads in macb_rx().
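
      The read ordering described above can be sketched in user-space C. This
      is an analogy only: the sim_* names and the descriptor layout are
      hypothetical, and the C11 acquire fence merely stands in for the
      kernel's DMA read barrier, which orders accesses against a device
      rather than another thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_RX_USED 0x1u  /* ownership bit, assumed to be bit 0 of addr */

/* Hypothetical RX descriptor; not the driver's real structure. */
struct sim_rx_desc {
    _Atomic uint32_t addr;  /* carries the RX_USED bit */
    uint32_t ctrl;          /* frame status written by the controller */
};

/* Returns true and stores ctrl only if the descriptor was handed to software. */
static bool sim_rx_poll(struct sim_rx_desc *desc, uint32_t *ctrl_out)
{
    uint32_t addr;

    assert(desc && ctrl_out);
    addr = atomic_load_explicit(&desc->addr, memory_order_relaxed);
    if (!(addr & SIM_RX_USED))
        return false;

    /* Stand-in for the DMA read barrier: ctrl is read only after the check. */
    atomic_thread_fence(memory_order_acquire);
    *ctrl_out = desc->ctrl;
    return true;
}
```

      The point of the sketch is the order: ownership check first, barrier,
      then the protected field.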
      
      I have not observed any actual problems in practice caused by these
      being missing, though.
      
      Tested on a ZynqMP based system.
      
      Fixes: 89e5785f ("[PATCH] Atmel MACB ethernet driver")
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix dropped RX frames due to a race · 8159ecab
      Anssi Hannula authored
      Bit RX_USED set to 0 in the address field allows the controller to write
      data to the receive buffer descriptor.
      
      The driver does not ensure the ctrl field is ready (cleared) when the
      controller sees the RX_USED=0 written by the driver. The ctrl field might
      only be cleared after the controller has already updated it according to
      a newly received frame, causing the frame to be discarded in gem_rx() due
      to unexpected ctrl field contents.
      
      A message is logged when the above scenario occurs:
      
        macb ff0b0000.ethernet eth0: not whole frame pointed by descriptor
      
      Fix the issue by ensuring that when the controller sees RX_USED=0 the
      ctrl field is already cleared.
      
      This issue was observed on a ZynqMP based system.
      
      Fixes: 4df95131 ("net/macb: change RX path for GEM")
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix random memory corruption on RX with 64-bit DMA · e100a897
      Anssi Hannula authored
      64-bit DMA addresses are split into upper and lower halves that are
      written in separate fields on GEM. For RX, bit 0 of the address is used
      as the ownership bit (RX_USED). When the RX_USED bit is unset the
      controller is allowed to write data to the buffer.
      
      The driver does not guarantee that the controller already sees the upper
      half when the RX_USED bit is cleared, possibly resulting in the
      controller writing an incoming frame to an address with an incorrect
      upper half and therefore possibly corrupting unrelated system memory.
      
      Fix that by adding the necessary DMA memory barrier between the writes.
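
      A user-space sketch of the write ordering follows. It rests on
      assumptions: the sim_* names and the descriptor layout are
      hypothetical, and the C11 release fence is only an analogue of the
      kernel's DMA write barrier, since the real consumer is a device.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SIM_RX_USED 0x1u  /* ownership bit: bit 0 of the lower address word */

/* Hypothetical 64-bit RX descriptor; not the driver's real structure. */
struct sim_rx_desc64 {
    _Atomic uint32_t addr_lo;  /* lower half, carries RX_USED */
    uint32_t addr_hi;          /* upper half */
};

/* Hands a buffer to the (simulated) controller. */
static void sim_rx_give_to_hw(struct sim_rx_desc64 *desc, uint64_t dma_addr)
{
    assert(desc);
    assert((dma_addr & SIM_RX_USED) == 0);  /* bit 0 is reserved for ownership */

    desc->addr_hi = (uint32_t)(dma_addr >> 32);

    /* Stand-in for the DMA write barrier: publish the upper half first. */
    atomic_thread_fence(memory_order_release);

    /* Writing the lower half with RX_USED clear transfers ownership. */
    atomic_store_explicit(&desc->addr_lo, (uint32_t)dma_addr,
                          memory_order_relaxed);
}
```

      Writing the ownership-carrying word last, after the barrier, is what
      keeps the consumer from pairing a new lower half with a stale upper
      half.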
      
      This corruption was observed on a ZynqMP based system.
      
      Fixes: fff8019a ("net: macb: Add 64 bit addressing support for GEM")
      Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
      Acked-by: Harini Katakam <harini.katakam@xilinx.com>
      Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 18 Dec, 2018 16 commits
  3. 17 Dec, 2018 1 commit
  4. 16 Dec, 2018 12 commits
    • net: mvneta: fix operation for 64K PAGE_SIZE · e735fd55
      Marcin Wojtas authored
      Recent changes in the mvneta driver reworked allocation
      and handling of the ingress buffers to use entire pages.
      In addition, in the SW BM scenario the HW must be informed
      via PRXDQS about the biggest possible incoming buffer
      that can be propagated by RX descriptors.
      
      The BufferSize field was filled according to the MTU-dependent
      pkt_size value. The later change to PAGE_SIZE broke RX operation
      when using 64K pages, as the field is simply too small.
      
      This patch conditionally limits the value passed to the BufferSize
      field of the PRXDQS register, depending on the PAGE_SIZE used.
      While at it, remove the now unused frag_size field of the mvneta_port
      structure.
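
      The conditional limit can be sketched as below. The field geometry
      (13 bits counting 8-byte units) and the fallback to pkt_size are
      assumptions made for illustration, and all sim_* and SIM_* names are
      hypothetical rather than taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout: 13 bits, each unit is 8 bytes. */
#define SIM_BUF_SIZE_FIELD_BITS 13u
#define SIM_BUF_SIZE_UNIT       8u
#define SIM_BUF_SIZE_MAX \
    (((1u << SIM_BUF_SIZE_FIELD_BITS) - 1u) * SIM_BUF_SIZE_UNIT)

/*
 * Picks the value to program into the buffer-size field: the page size
 * when it fits, otherwise the MTU-dependent packet size.
 */
static uint32_t sim_rx_buf_size(uint32_t page_size, uint32_t pkt_size)
{
    assert(pkt_size <= SIM_BUF_SIZE_MAX);
    return page_size <= SIM_BUF_SIZE_MAX ? page_size : pkt_size;
}
```

      Under these assumptions the largest encodable size is 65528 bytes, so a
      4K page fits while a 64K page does not.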
      
      Fixes: 562e2f46 ("net: mvneta: Improve the buffer allocation method for SWBM")
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hns-fixes' · 369a094d
      David S. Miller authored
      Peng Li says:
      
      ====================
      net: hns: Code improvements & fixes for HNS driver
      
      This patchset introduces some code improvements and fixes
      for the identified problems in the HNS driver.
      
      Every patch is independent.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Fix ping failed when use net bridge and send multicast · 6adafc35
      Yonglong Liu authored
      Create a net bridge and add eth and vnet to it. The vnet is used
      by a virtual machine. When an outside host pings the virtual machine
      while the virtual machine is sending multicast, the ping packets
      are lost.
      
      The multicast packet is sent to the eth, which forwards it to the
      bridge as well, so the bridge learns the MAC of the eth. When the outside
      host then pings the virtual machine, the packet unexpectedly matches the
      promisc entry of the eth, and the bridge sends it to the eth instead of
      the vnet, so the ping is lost.
      
      This patch therefore moves the promisc TCAM entry to the end of the 512
      TCAM entries, which indicates lower priority, and splits the single promisc
      entry into two, mc and uc, so that packets do not match the wrong TCAM entry.
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Add mac pcs config when enable|disable mac · 726ae5c9
      Yonglong Liu authored
      In some cases, enabling or disabling the mac and then adjusting the link
      can make it hard to establish a link between the mac and the phy, or leave
      the link in an abnormal state. This patch adds rx PCS handling to avoid this bug.
      
      Disable the rx PCS when the driver disables the gmac, and enable the rx PCS
      when the driver enables the mac.
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Fix ntuple-filters status error. · 7e74a19c
      Yonglong Liu authored
      The ntuple-filters feature is forced on by the chip,
      but ethtool shows "ntuple-filters: off [fixed]".
      This patch corrects the output to "ntuple-filters: on [fixed]".
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Avoid net reset caused by pause frames storm · a57275d3
      Yonglong Liu authored
      A large number of MAC pause frames on the network can cause
      a tx timeout of the net device, after which the net device
      is reset in an attempt to recover. The reset does not help and
      causes other problems.
      
      So double ndev->watchdog_timeo each time the device watchdog fires
      until watchdog_timeo reaches 40s, and only then try resetting to
      recover.
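
      The back-off on the watchdog timeout can be sketched as follows. The
      40s limit comes from the text. The 5s starting value and restoring it
      after a reset are assumptions, and the sim_* and SIM_* names are
      hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_TIMEO_LIMIT_S 40  /* upper bound named in the commit message */
#define SIM_TIMEO_INIT_S   5  /* assumed default timeout */

/*
 * Called on each TX timeout. Doubles the timeout while it is below the
 * limit and returns true only when a reset should be attempted.
 */
static bool sim_tx_timeout(int *watchdog_timeo_s)
{
    assert(watchdog_timeo_s && *watchdog_timeo_s > 0);

    if (*watchdog_timeo_s < SIM_TIMEO_LIMIT_S) {
        *watchdog_timeo_s *= 2;
        return false;
    }

    /* Assumed: fall back to the default once the reset is requested. */
    *watchdog_timeo_s = SIM_TIMEO_INIT_S;
    return true;
}
```

      With these values the timeout steps through 10s, 20s and 40s before a
      reset is requested on the fourth consecutive timeout.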
      
      dfx information such as hardware registers is collected on tx timeout.
      Some counter registers are cleared when read, so move this task
      before the net state update, which also reads the counter registers.
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Free irq when exit from abnormal branch · c82bd077
      Yonglong Liu authored
      1. In "hns_nic_init_irq", if requesting an irq fails at index i,
         the function returns directly without releasing the irq resources
         that were already requested.
      
      2. In "hns_nic_net_up", if an error branch is taken after
         "hns_nic_init_irq", the irqs that were already requested
         are not released.
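
      The first case is the usual unwind-on-failure pattern, sketched here
      with stubs. The sim_* helpers are hypothetical stand-ins that only
      track which irqs are held. They are not the kernel's irq API.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NUM_IRQS 4

static bool sim_irq_held[SIM_NUM_IRQS];
static int sim_fail_at = -1;  /* index at which the stub request fails */

static int sim_request_irq(int i)
{
    if (i == sim_fail_at)
        return -1;
    sim_irq_held[i] = true;
    return 0;
}

static void sim_free_irq(int i)
{
    assert(sim_irq_held[i]);  /* only free what was actually requested */
    sim_irq_held[i] = false;
}

/* Requests every irq; on failure at index i, releases indices 0..i-1. */
static int sim_init_irqs(void)
{
    int i;

    for (i = 0; i < SIM_NUM_IRQS; i++) {
        if (sim_request_irq(i) != 0)
            goto out_free;
    }
    return 0;

out_free:
    while (--i >= 0)
        sim_free_irq(i);
    return -1;
}
```

      The second case follows the same idea one level up: a caller that
      fails after the irqs were set up has to release all of them before
      returning.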
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Clean rx fbd when ae stopped. · 31f6b61d
      Yonglong Liu authored
      If there are packets in the hardware when the speed or duplex is changed,
      the hardware may hang.
      
      This patch adds code to wait for the rx fbd to be cleaned up when the ae is stopped.
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Fixed bug that netdev was opened twice · 5778b13b
      Yonglong Liu authored
      After dsaf is reset to try to repair a chip error such as an ecc error,
      the net device is opened if the net interface is up. If a user
      brings the net device up with ifconfig at the same time,
      the net device is opened twice in a row.
      
      napi_enable() is called when the device is opened, and the kernel panics
      if it is called twice in a row, as follows:
      static inline void napi_enable(struct napi_struct *n)
      {
               BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
               smp_mb__before_clear_bit();
               clear_bit(NAPI_STATE_SCHED, &n->state);
      }
      
      [37255.571996] Kernel panic - not syncing: BUG!
      [37255.595234] Call trace:
      [37255.597694] [<ffff80000008ab48>] dump_backtrace+0x0/0x1a0
      [37255.603114] [<ffff80000008ad08>] show_stack+0x20/0x28
      [37255.608187] [<ffff8000009c4944>] dump_stack+0x98/0xb8
      [37255.613258] [<ffff8000009c149c>] panic+0x10c/0x26c
      [37255.618070] [<ffff80000070f134>] hns_nic_net_up+0x30c/0x4e0
      [37255.623664] [<ffff80000070f39c>] hns_nic_net_open+0x94/0x12c
      [37255.629346] [<ffff80000084be78>] __dev_open+0xf4/0x168
      [37255.634504] [<ffff80000084c1ac>] __dev_change_flags+0x98/0x15c
      [37255.640359] [<ffff80000084c29c>] dev_change_flags+0x2c/0x68
      [37255.769580] [<ffff8000008dc400>] devinet_ioctl+0x650/0x704
      [37255.775086] [<ffff8000008ddc38>] inet_ioctl+0x98/0xb4
      [37255.780159] [<ffff800000827b7c>] sock_do_ioctl+0x44/0x84
      [37255.785490] [<ffff800000828e04>] sock_ioctl+0x248/0x30c
      [37255.790737] [<ffff80000026dc6c>] do_vfs_ioctl+0x480/0x618
      [37255.796156] [<ffff80000026de94>] SyS_ioctl+0x90/0xa4
      [37255.801139] SMP: stopping secondary CPUs
      [37255.805079] kbox: catch panic event.
      [37255.809586] collected_len = 128928, LOG_BUF_LEN_LOCAL = 131072
      [37255.816103] flush cache 0xffff80003f000000  size 0x800000
      [37255.822192] flush cache 0xffff80003f000000  size 0x800000
      [37255.828289] flush cache 0xffff80003f000000  size 0x800000
      [37255.834378] kbox: no notify die func register. no need to notify
      [37255.840413] ---[ end Kernel panic - not syncing: BUG!
      
      This patch fixes the bug based on the NIC_STATE_DOWN flag.
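
      One way to guard an open path with a state flag is sketched below. The
      commit message only says that the fix relies on NIC_STATE_DOWN, so the
      atomic test-and-clear used here is an assumption, and the sim_* names
      are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SIM_STATE_DOWN 0x1u  /* set while the device is down */

/* Hypothetical device model; the counter stands in for napi_enable(). */
struct sim_nic {
    atomic_uint state;
    int napi_enable_calls;
};

/* Brings the device up once; a second concurrent or repeated call is a no-op. */
static bool sim_net_up(struct sim_nic *nic)
{
    unsigned int old;

    assert(nic);
    /* Atomically clear DOWN and learn whether it was set before. */
    old = atomic_fetch_and(&nic->state, ~SIM_STATE_DOWN);
    if (!(old & SIM_STATE_DOWN))
        return false;  /* already up: skip the enable step */

    nic->napi_enable_calls++;
    return true;
}
```

      Only the caller that observes the flag still set proceeds, so the
      enable step runs once however many open requests arrive.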
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Some registers use wrong address according to the datasheet. · 4ad26f11
      Yonglong Liu authored
      According to the hip06 datasheet:
      1. Six registers use the wrong address:
         RCB_COM_SF_CFG_INTMASK_RING
         RCB_COM_SF_CFG_RING_STS
         RCB_COM_SF_CFG_RING
         RCB_COM_SF_CFG_INTMASK_BD
         RCB_COM_SF_CFG_BD_RINT_STS
         DSAF_INODE_VC1_IN_PKT_NUM_0_REG
      2. The offset of DSAF_INODE_VC1_IN_PKT_NUM_0_REG should be
         0x103C + 0x80 * all_chn_num
      3. The offset used to show the value of DSAF_INODE_IN_DATA_STP_DISC_0_REG
         is wrong, so the value of DSAF_INODE_SW_VLAN_TAG_DISC_0_REG is
         overwritten
      
      These registers are only used by "ethtool -d", so this did not cause ndev
      to malfunction.
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: All ports can not work when insmod hns ko after rmmod. · 308c6caf
      Yonglong Liu authored
      There are two test cases:
      1. Remove the 4 modules hns_enet_drv/hns_dsaf/hnae/hns_mdio
         and install them again. The "ifconfig down/ifconfig up"
         command pair is then needed to bring a port up.
      
         This patch calls the phy_stop function when initializing the phy to fix this bug.
      
      2. Remove the 2 modules hns_enet_drv/hns_dsaf and install them again.
         No port can be used anymore, because registering the phy devices
         fails (the phy devices already exist).
      
         Phy devices are registered when hns_dsaf is installed; this patch
         removes them when hns_dsaf is removed.
      
      The two cases are sometimes related, and fixing the second case also requires
      fixing the first, so fix them together.
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns: Incorrect offset address used for some registers. · 4e1d4be6
      Yonglong Liu authored
      According to the hip06 Datasheet:
      1. The offset of INGRESS_SW_VLAN_TAG_DISC should be 0x1A00+4*all_chn_num
      2. The offset of INGRESS_IN_DATA_STP_DISC should be 0x1A50+4*all_chn_num
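
      The two offset formulas above can be written out as small helpers. The
      base addresses and the 4-byte stride come from the text. The sim_* and
      SIM_* names are hypothetical, and chn stands for the all_chn_num index.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_INGRESS_SW_VLAN_TAG_DISC_BASE 0x1A00u
#define SIM_INGRESS_IN_DATA_STP_DISC_BASE 0x1A50u
#define SIM_CHN_STRIDE                    4u

/* Offset of the per-channel INGRESS_SW_VLAN_TAG_DISC counter. */
static uint32_t sim_sw_vlan_tag_disc_off(uint32_t chn)
{
    return SIM_INGRESS_SW_VLAN_TAG_DISC_BASE + SIM_CHN_STRIDE * chn;
}

/* Offset of the per-channel INGRESS_IN_DATA_STP_DISC counter. */
static uint32_t sim_in_data_stp_disc_off(uint32_t chn)
{
    return SIM_INGRESS_IN_DATA_STP_DISC_BASE + SIM_CHN_STRIDE * chn;
}
```

      For example, channel 3 of the first counter lands at 0x1A0C and
      channel 2 of the second at 0x1A58.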
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 15 Dec, 2018 2 commits