1. 17 Apr, 2016 1 commit
  2. 16 Apr, 2016 7 commits
    • Merge branch 'dsa-mv88e6xxx-fix-cross-chip-bridging' · cf6b5fb2
      David S. Miller authored
      Vivien Didelot says:
      
      ====================
      net: dsa: mv88e6xxx: fix hardware cross-chip bridging
      
      In order to accelerate cross-chip switching of frames with the hardware,
      the DSA Tag ports, used to interconnect switch devices, must learn SA
      and DA addresses, and share the same FDB with the user ports.
      
      The first two patches restore address learning on DSA links. This fixes
      hardware cross-chip bridging in a VLAN filtering enabled system, which
      implements a bridge group as an 802.1Q VLAN and thus shares an isolated
      address database between DSA and user ports.
      
      The third patch changes the distinct default databases used for each
      port, to the same address database. This fixes the hardware cross-chip
      bridging in a VLAN filtering disabled system, where a bridge group gets
      implemented only as a port-based VLAN.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cf6b5fb2
    • net: dsa: mv88e6xxx: share the same default FDB · 207afda1
      Vivien Didelot authored
      For hardware cross-chip bridging to work, user ports *and* DSA ports
      need to share a common address database, in order to switch a frame to
      the correct interconnected device.
      
      This is currently working for VLAN filtering aware systems, since Linux
      will implement a bridge group as an 802.1Q VLAN, which has its own FDB,
      including DSA and CPU links as members.
      
      However when the system doesn't support VLAN filtering, Linux only
      relies on the port-based VLAN to implement a bridge group.
      
      To fix hardware cross-chip bridging for such systems, set the same
      default address database 0 for user and DSA ports, instead of giving
      them all a different default database.
      
      Note that the bridging code prevents frames from egressing between unbridged
      ports, and flushes FDB entries of a port when changing its STP state.
      
      Also note that the FID 0 is special and means "all" for ATU operations,
      but it's OK since it is used as a default forwarding address database.
      
      Fixes: 2db9ce1f ("net: dsa: mv88e6xxx: assign default FDB to ports")
      Fixes: 466dfa07 ("net: dsa: mv88e6xxx: assign dynamic FDB to bridges")
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      207afda1
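The effect of a shared default database can be sketched with a toy model, assuming an address table keyed by (FID, MAC). The class `ToyAtu`, the port names and the FID numbers are hypothetical and are not taken from the mv88e6xxx driver; the sketch only shows why a lookup for a frame entering a user port cannot see an address learned on a DSA port unless both use the same default FID.

```python
class ToyAtu:
    """Toy address table keyed by (fid, mac); not the real ATU layout."""

    def __init__(self, default_fid):
        # default_fid maps a (hypothetical) port name to its default database.
        self.default_fid = dict(default_fid)
        self.entries = {}

    def learn(self, port, mac):
        # Source-address learning: remember the ingress port for this MAC.
        self.entries[(self.default_fid[port], mac)] = port

    def lookup(self, ingress_port, mac):
        # Destination lookup only searches the ingress port's database.
        return self.entries.get((self.default_fid[ingress_port], mac))


MAC = "00:11:22:33:44:55"

# Distinct default databases: the entry learned on the DSA port is invisible
# to a lookup made for a frame entering on the user port.
split = ToyAtu({"user0": 1, "dsa0": 2})
split.learn("dsa0", MAC)
split_result = split.lookup("user0", MAC)

# Shared default database 0: the same lookup now finds the DSA port.
shared = ToyAtu({"user0": 0, "dsa0": 0})
shared.learn("dsa0", MAC)
shared_result = shared.lookup("user0", MAC)

print(split_result, shared_result)
```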
    • net: dsa: mv88e6xxx: enable SA learning on DSA ports · 996ecb82
      Vivien Didelot authored
      In multi-chip systems, DSA Tag ports must learn SA addresses in order to
      correctly switch frames between interconnected chips.
      
      This fixes cross-chip hardware bridging in a VLAN filtering aware
      system, because a bridge group gets implemented as a hardware 802.1Q
      VLAN and thus DSA and user ports share the same FDB.
      
      Fixes: 4c7ea3c0 ("net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports")
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      996ecb82
    • net: dsa: mv88e6xxx: unlock DSA and CPU ports · 65fa4027
      Vivien Didelot authored
      Locking a port generates a hardware interrupt when a new SA address is
      received. This enables CPU directed learning, which is needed for 802.1X
      MAC authentication.
      
      To disable automatic learning on a port, the only configuration needed
      is to set its Port Association Vector to all zero.
      
      Clear PAV when SA learning should be disabled instead of locking a port.
      
      Fixes: 4c7ea3c0 ("net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports")
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65fa4027
    • RDS: Fix the atomicity for congestion map update · e47db94e
      santosh.shilimkar@oracle.com authored
      Two different threads with different rds sockets may be in
      rds_recv_rcvbuf_delta() via receive path. If their ports
      both map to the same word in the congestion map, then
      using non-atomic ops to update it could cause the map to
      be incorrect. Let's use atomics to avoid such an issue.
      
      Full credit to Wengang <wen.gang.wang@oracle.com> for
      finding the issue, analysing it and also pointing out
      the offending code along with a spin lock based fix.
      Reviewed-by: Leon Romanovsky <leon@leon.nu>
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e47db94e
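The race described in this commit comes from two ports sharing one word of a bitmap. A rough sketch follows, with the caveat that Python has no atomic bit operations, so a lock stands in for the kernel's atomic ops here. The class `ToyCongestionMap`, the 64-bit word size and the port-to-bit mapping are assumptions made for illustration, not the RDS layout.

```python
import threading

BITS_PER_WORD = 64  # assumed word size, for illustration only


class ToyCongestionMap:
    def __init__(self, nports=65536):
        self.words = [0] * (nports // BITS_PER_WORD)
        self._lock = threading.Lock()

    def set_port(self, port):
        word, bit = divmod(port, BITS_PER_WORD)
        # The read-modify-write must be indivisible; otherwise two threads
        # updating different bits of the same word can lose one update.
        with self._lock:
            self.words[word] |= 1 << bit

    def test_port(self, port):
        word, bit = divmod(port, BITS_PER_WORD)
        return bool((self.words[word] >> bit) & 1)


cmap = ToyCongestionMap()
# Ports 8080 and 8081 both land in word 126 (8080 // 64 == 126).
threads = [threading.Thread(target=cmap.set_port, args=(p,)) for p in (8080, 8081)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(cmap.test_port(8080), cmap.test_port(8081))
```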
    • RDS: fix endianness for dp_ack_seq · a7c55654
      Qing Huang authored
      dp->dp_ack_seq is used in big endian format. We need to do the
      big endianness conversion when we assign a value in host format
      to it.
      Signed-off-by: Qing Huang <qing.huang@oracle.com>
      Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a7c55654
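The conversion this commit refers to can be illustrated in user space, assuming a 64-bit sequence value. The function `cpu_to_be64` below is a hypothetical Python stand-in named after the kernel helper; it is not kernel code and the sample value is made up.

```python
import struct
import sys


def cpu_to_be64(value):
    """Return the 8 bytes that represent value in big-endian order."""
    return struct.pack(">Q", value)


ack_seq = 0x0102030405060708  # made-up sample value
wire = cpu_to_be64(ack_seq)
native = struct.pack("=Q", ack_seq)  # host byte order, no conversion

print(wire.hex())                    # most significant byte first
print(sys.byteorder, native.hex())   # differs from wire on little-endian hosts
```

Storing `native` where a big-endian field is expected is the kind of mismatch that only shows up on little-endian hosts.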
    • vlan: pull on __vlan_insert_tag error path and fix csum correction · 9241e2df
      Daniel Borkmann authored
      When __vlan_insert_tag() fails from skb_vlan_push() path due to the
      skb_cow_head(), we need to undo the __skb_push() in the error path
      as well that was done earlier to move skb->data pointer to mac header.
      
      Moreover, I noticed that when in the non-error path the __skb_pull()
      is done and the original offset to mac header was non-zero, we fixup
      from a wrong skb->data offset in the checksum complete processing.
      
      So the skb_postpush_rcsum() really needs to be done before __skb_pull()
      where skb->data still points to the mac header start and thus operates
      under the same conditions as in __vlan_insert_tag().
      
      Fixes: 93515d53 ("net: move vlan pop/push functions into common code")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9241e2df
  3. 15 Apr, 2016 6 commits
    • cpsw: Prevent NULL pointer dereference with two PHYs · cfe25560
      Andrew Goodbody authored
      Adding a 2nd PHY to cpsw results in a NULL pointer dereference
      as below. Fix by maintaining a reference to each PHY node in slave
      struct instead of a single reference in the priv struct which was
      overwritten by the 2nd PHY.
      
      [   17.870933] Unable to handle kernel NULL pointer dereference at virtual address 00000180
      [   17.879557] pgd = dc8bc000
      [   17.882514] [00000180] *pgd=9c882831, *pte=00000000, *ppte=00000000
      [   17.889213] Internal error: Oops: 17 [#1] ARM
      [   17.893838] Modules linked in:
      [   17.897102] CPU: 0 PID: 1657 Comm: connmand Not tainted 4.5.0-ge463dfb-dirty #11
      [   17.904947] Hardware name: Cambrionix whippet
      [   17.909576] task: dc859240 ti: dc968000 task.ti: dc968000
      [   17.915339] PC is at phy_attached_print+0x18/0x8c
      [   17.920339] LR is at phy_attached_info+0x14/0x18
      [   17.925247] pc : [<c042baec>]    lr : [<c042bb74>]    psr: 600f0113
      [   17.925247] sp : dc969cf8  ip : dc969d28  fp : dc969d18
      [   17.937425] r10: dda7a400  r9 : 00000000  r8 : 00000000
      [   17.942971] r7 : 00000001  r6 : ddb00480  r5 : ddb8cb34  r4 : 00000000
      [   17.949898] r3 : c0954cc0  r2 : c09562b0  r1 : 00000000  r0 : 00000000
      [   17.956829] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [   17.964401] Control: 10c5387d  Table: 9c8bc019  DAC: 00000051
      [   17.970500] Process connmand (pid: 1657, stack limit = 0xdc968210)
      [   17.977059] Stack: (0xdc969cf8 to 0xdc96a000)
      [   17.981692] 9ce0:                                                       dc969d28 dc969d08
      [   17.990386] 9d00: c038f9bc c038f6b4 ddb00480 dc969d34 dc969d28 c042bb74 c042bae4 00000000
      [   17.999080] 9d20: c09562b0 c0954cc0 dc969d5c dc969d38 c043ebfc c042bb6c 00000007 00000003
      [   18.007773] 9d40: ddb00000 ddb8cb58 ddb00480 00000001 dc969dec dc969d60 c0441614 c043ea68
      [   18.016465] 9d60: 00000000 00000003 00000000 fffffff4 dc969df4 0000000d 00000000 00000000
      [   18.025159] 9d80: dc969db4 dc969d90 c005dc08 c05839e0 dc969df4 0000000d ddb00000 00001002
      [   18.033851] 9da0: 00000000 00000000 dc969dcc dc969db8 c005ddf4 c005dbc8 00000000 00000118
      [   18.042544] 9dc0: dc969dec dc969dd0 ddb00000 c06db27c ffff9003 00001002 00000000 00000000
      [   18.051237] 9de0: dc969e0c dc969df0 c057c88c c04410dc dc969e0c ddb00000 ddb00000 00000001
      [   18.059930] 9e00: dc969e34 dc969e10 c057cb44 c057c7d8 ddb00000 ddb00138 00001002 beaeda20
      [   18.068622] 9e20: 00000000 00000000 dc969e5c dc969e38 c057cc28 c057cac0 00000000 dc969e80
      [   18.077315] 9e40: dda7a40c beaeda20 00000000 00000000 dc969ecc dc969e60 c05e36d0 c057cc14
      [   18.086007] 9e60: dc969e84 00000051 beaeda20 00000000 dda7a40c 00000014 ddb00000 00008914
      [   18.094699] 9e80: 30687465 00000000 00000000 00000000 00009003 00000000 00000000 00000000
      [   18.103391] 9ea0: 00001002 00008914 dd257ae0 beaeda20 c098a428 beaeda20 00000011 00000000
      [   18.112084] 9ec0: dc969edc dc969ed0 c05e4e54 c05e3030 dc969efc dc969ee0 c055f5ac c05e4cc4
      [   18.120777] 9ee0: beaeda20 dd257ae0 dc8ab4c0 00008914 dc969f7c dc969f00 c010b388 c055f45c
      [   18.129471] 9f00: c071ca40 dd257ac0 c00165e8 dc968000 dc969f3c dc969f20 dc969f64 dc969f28
      [   18.138164] 9f20: c0115708 c0683ec8 dd257ac0 dd257ac0 dc969f74 dc969f40 c055f350 c00fc66c
      [   18.146857] 9f40: dd82e4d0 00000011 00000000 00080000 dd257ac0 00000000 dc8ab4c0 dc8ab4c0
      [   18.155550] 9f60: 00008914 beaeda20 00000011 00000000 dc969fa4 dc969f80 c010bc34 c010b2fc
      [   18.164242] 9f80: 00000000 00000011 00000002 00000036 c00165e8 dc968000 00000000 dc969fa8
      [   18.172935] 9fa0: c00163e0 c010bbcc 00000000 00000011 00000011 00008914 beaeda20 00009003
      [   18.181628] 9fc0: 00000000 00000011 00000002 00000036 00081018 00000001 00000000 beaedc10
      [   18.190320] 9fe0: 00083188 beaeda1c 00043a5d b6d29c0c 600b0010 00000011 00000000 00000000
      [   18.198989] Backtrace:
      [   18.201621] [<c042bad8>] (phy_attached_print) from [<c042bb74>] (phy_attached_info+0x14/0x18)
      [   18.210664]  r3:c0954cc0 r2:c09562b0 r1:00000000
      [   18.215588]  r4:ddb00480
      [   18.218322] [<c042bb60>] (phy_attached_info) from [<c043ebfc>] (cpsw_slave_open+0x1a0/0x280)
      [   18.227293] [<c043ea5c>] (cpsw_slave_open) from [<c0441614>] (cpsw_ndo_open+0x544/0x674)
      [   18.235874]  r7:00000001 r6:ddb00480 r5:ddb8cb58 r4:ddb00000
      [   18.241944] [<c04410d0>] (cpsw_ndo_open) from [<c057c88c>] (__dev_open+0xc0/0x128)
      [   18.249972]  r9:00000000 r8:00000000 r7:00001002 r6:ffff9003 r5:c06db27c r4:ddb00000
      [   18.258255] [<c057c7cc>] (__dev_open) from [<c057cb44>] (__dev_change_flags+0x90/0x154)
      [   18.266745]  r5:00000001 r4:ddb00000
      [   18.270575] [<c057cab4>] (__dev_change_flags) from [<c057cc28>] (dev_change_flags+0x20/0x50)
      [   18.279523]  r9:00000000 r8:00000000 r7:beaeda20 r6:00001002 r5:ddb00138 r4:ddb00000
      [   18.287811] [<c057cc08>] (dev_change_flags) from [<c05e36d0>] (devinet_ioctl+0x6ac/0x76c)
      [   18.296483]  r9:00000000 r8:00000000 r7:beaeda20 r6:dda7a40c r5:dc969e80 r4:00000000
      [   18.304762] [<c05e3024>] (devinet_ioctl) from [<c05e4e54>] (inet_ioctl+0x19c/0x1c8)
      [   18.312882]  r10:00000000 r9:00000011 r8:beaeda20 r7:c098a428 r6:beaeda20 r5:dd257ae0
      [   18.321235]  r4:00008914
      [   18.323956] [<c05e4cb8>] (inet_ioctl) from [<c055f5ac>] (sock_ioctl+0x15c/0x2d8)
      [   18.331829] [<c055f450>] (sock_ioctl) from [<c010b388>] (do_vfs_ioctl+0x98/0x8d0)
      [   18.339765]  r7:00008914 r6:dc8ab4c0 r5:dd257ae0 r4:beaeda20
      [   18.345822] [<c010b2f0>] (do_vfs_ioctl) from [<c010bc34>] (SyS_ioctl+0x74/0x84)
      [   18.353573]  r10:00000000 r9:00000011 r8:beaeda20 r7:00008914 r6:dc8ab4c0 r5:dc8ab4c0
      [   18.361924]  r4:00000000
      [   18.364653] [<c010bbc0>] (SyS_ioctl) from [<c00163e0>] (ret_fast_syscall+0x0/0x3c)
      [   18.372682]  r9:dc968000 r8:c00165e8 r7:00000036 r6:00000002 r5:00000011 r4:00000000
      [   18.380960] Code: e92dd810 e24cb010 e24dd010 e59b4004 (e5902180)
      [   18.387580] ---[ end trace c80529466223f3f3 ]---
      Signed-off-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cfe25560
    • bgmac: fix MAC soft-reset bit for corerev > 4 · c02bc350
      Felix Fietkau authored
      Only core revisions older than 4 use BGMAC_CMDCFG_SR_REV0. This mainly
      fixes support for BCM4708A0KF SoCs with Ethernet core rev 5 (only some
      devices are affected, since most BCM4708A0KF chips have core rev 4).
      This was tested for regressions on BCM47094 which doesn't seem to care
      which bit gets used.
      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
      Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c02bc350
    • Merge branch 'soreuseport-mixed-v4-v6-fixes' · 01c445a4
      David S. Miller authored
      Craig Gallek says:
      
      ====================
      Fixes for SO_REUSEPORT and mixed v4/v6 sockets
      
      Recent changes to the data structures associated with SO_REUSEPORT broke
      an existing behavior when equivalent SO_REUSEPORT sockets are created
      using both AF_INET and AF_INET6.  This patch series restores the previous
      behavior and includes a test to validate it.
      
      This series should be a trivial merge to stable kernels (if deemed
      necessary), but will have conflicts in net-next.  The following patches
      recently replaced the use of hlist_nulls with hlists for UDP and TCP
      socket lists:
      ca065d0c ("udp: no longer use SLAB_DESTROY_BY_RCU")
      3b24d854 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
      
      If this series is accepted, I will send an RFC for the net-next change
      to assist with the merge.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      01c445a4
    • soreuseport: test mixed v4/v6 sockets · d6a61f80
      Craig Gallek authored
      Test to validate the behavior of SO_REUSEPORT sockets that are
      created with both AF_INET and AF_INET6.  See the commit prior to this
      for a description of this behavior.
      Signed-off-by: Craig Gallek <kraig@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6a61f80
    • soreuseport: fix ordering for mixed v4/v6 sockets · d894ba18
      Craig Gallek authored
      With the SO_REUSEPORT socket option, it is possible to create sockets
      in the AF_INET and AF_INET6 domains which are bound to the same IPv4 address.
      This is only possible with SO_REUSEPORT and when not using IPV6_V6ONLY on
      the AF_INET6 sockets.
      
      Prior to the commits referenced below, an incoming IPv4 packet would
      always be routed to a socket of type AF_INET when this mixed-mode was used.
      After those changes, the same packet would be routed to the most recently
      bound socket (if this happened to be an AF_INET6 socket, it would
      have an IPv4 mapped IPv6 address).
      
      The change in behavior occurred because the recent SO_REUSEPORT optimizations
      short-circuit the socket scoring logic as soon as they find a match.  They
      did not take into account the scoring logic that favors AF_INET sockets
      over AF_INET6 sockets in the event of a tie.
      
      To fix this problem, this patch changes the insertion order of AF_INET
      and AF_INET6 addresses in the TCP and UDP socket lists when the sockets
      have SO_REUSEPORT set.  AF_INET sockets will be inserted at the head of the
      list and AF_INET6 sockets with SO_REUSEPORT set will always be inserted at
      the tail of the list.  This will force AF_INET sockets to always be
      considered first.
      
      Fixes: e32ea7e7 ("soreuseport: fast reuseport UDP socket selection")
      Fixes: 125e80b88687 ("soreuseport: fast reuseport TCP socket selection")
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Signed-off-by: Craig Gallek <kraig@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d894ba18
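The insertion-order rule from this commit can be simulated in a few lines. This is a simplified model, assuming a single list and a lookup that stops at the first match; `insert_socket` and `first_match` are hypothetical names and the socket labels are made up.

```python
from collections import deque

AF_INET, AF_INET6 = "AF_INET", "AF_INET6"


def insert_socket(socket_list, family, name, reuseport=True):
    # Per the commit text: with SO_REUSEPORT set, AF_INET6 sockets go to the
    # tail of the list; everything else is inserted at the head.
    if reuseport and family == AF_INET6:
        socket_list.append((family, name))
    else:
        socket_list.appendleft((family, name))


def first_match(socket_list):
    # Stand-in for a lookup that short-circuits on the first matching socket.
    return socket_list[0] if socket_list else None


sockets = deque()
insert_socket(sockets, AF_INET, "v4-a")
insert_socket(sockets, AF_INET6, "v6-a")  # bound later, but placed at the tail
insert_socket(sockets, AF_INET6, "v6-b")

print(first_match(sockets))
```

Because the AF_INET6 sockets never reach the head, the most recently bound one no longer wins the lookup in this model.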
    • cdc_mbim: apply "NDP to end" quirk to all Huawei devices · c5b5343c
      Bjørn Mork authored
      We now have a positive report of another Huawei device needing
      this quirk: The ME906s-158 (12d1:15c1).  This is an m.2 form
      factor modem with no obvious relationship to the E3372 (12d1:157d)
      we already have a quirk entry for.  This is reason enough to
      believe the quirk might be necessary for any number of current
      and future Huawei devices.
      
      Applying the quirk to all Huawei devices, since it is crucial
      to any device affected by the firmware bug, while the impact
      on non-affected devices is negligible.
      
      The quirk can if necessary be disabled per-device by writing
      N to /sys/class/net/<iface>/cdc_ncm/ndp_to_end
      Reported-by: Andreas Fett <andreas.fett@secunet.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c5b5343c
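Disabling the quirk per device amounts to one write to the sysfs attribute named in the commit message. A small helper might look like this; `set_ndp_to_end` is a hypothetical name, the `sysfs_root` parameter exists only so the function can be tried against a scratch directory, and accepting "Y" to re-enable the quirk is an assumption (the commit message only mentions writing N).

```python
import os


def set_ndp_to_end(iface, enabled, sysfs_root="/sys/class/net"):
    """Write Y or N to <sysfs_root>/<iface>/cdc_ncm/ndp_to_end."""
    path = os.path.join(sysfs_root, iface, "cdc_ncm", "ndp_to_end")
    with open(path, "w") as attr:
        # "N" disables the quirk, as described in the commit message.
        attr.write("Y" if enabled else "N")
    return path
```

On a real system this would be called as `set_ndp_to_end("wwan0", False)` with root privileges, where `wwan0` is a placeholder interface name.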
  4. 14 Apr, 2016 20 commits
    • bgmac: reset & enable Ethernet core before using it · b4dfd8e9
      Rafał Miłecki authored
      This fixes Ethernet on D-Link DIR-885L with BCM47094 SoC. Felix reported
      a similar fix was needed for his BCM4709 device (Buffalo WXR-1900DHP?).
      I tested this for regressions on BCM4706, BCM4708A0 and BCM47081A0.
      
      Cc: Felix Fietkau <nbd@openwrt.org>
      Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b4dfd8e9
    • Merge branch 'ipv6-dgram-dst-cache' · 97cc931f
      David S. Miller authored
      Martin KaFai Lau says:
      
      ====================
      ipv6: datagram: Update dst cache of a connected udp sk during pmtu update
      
      v2:
      ~ Protect __sk_dst_get() operations with rcu_read_lock in
        release_cb() because another thread may do ip6_dst_store()
        for a udp sk without taking the sk lock (e.g. in sendmsg).
      ~ Do an ipv6_addr_v4mapped(&sk->sk_v6_daddr) check before
        calling ip6_datagram_dst_update() in patch 3 and 4.  It is
        similar to how __ip6_datagram_connect handles it.
      ~ One fix in ip6_datagram_dst_update() in patch 2.  It needs
        to check (np->flow_label & IPV6_FLOWLABEL_MASK) before
        doing fl6_sock_lookup.  I was confused with the naming
        of IPV6_FLOWLABEL_MASK and IPV6_FLOWINFO_MASK.
      ~ Check dst->obsolete just on the safe side, although I think it
        should at least have DST_OBSOLETE_FORCE_CHK by now.
      ~ Add Fixes tag to patch 3 and 4
      ~ Add some points from the previous discussion about holding
        sk lock to the commit message in patch 3.
      
      v1:
      There is a case in connected UDP socket such that
      getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
      sequence could be the following:
      1. Create a connected UDP socket
      2. Send some datagrams out
      3. Receive an ICMPV6_PKT_TOOBIG
      4. No new outgoing datagrams to trigger the sk_dst_check()
         logic to update the sk->sk_dst_cache.
      5. getsockopt(IPV6_MTU) returns the mtu from the invalid
         sk->sk_dst_cache instead of the newly created RTF_CACHE clone.
      
      Patch 1 and 2 are the prep work.
      Patch 3 and 4 are the fixes.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      97cc931f
    • ipv6: udp: Do a route lookup and update during release_cb · e646b657
      Martin KaFai Lau authored
      This patch adds a release_cb for UDPv6.  It does a route lookup
      and updates sk->sk_dst_cache if it is needed.  It picks up the
      left-over job from ip6_sk_update_pmtu() if the sk was owned
      by user during the pmtu update.
      
      It takes a rcu_read_lock to protect the __sk_dst_get() operations
      because another thread may do ip6_dst_store() without taking the
      sk lock (e.g. sendmsg).
      
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Reported-by: Wei Wang <weiwan@google.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e646b657
    • ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update · 33c162a9
      Martin KaFai Lau authored
      There is a case in connected UDP socket such that
      getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
      sequence could be the following:
      1. Create a connected UDP socket
      2. Send some datagrams out
      3. Receive an ICMPV6_PKT_TOOBIG
      4. No new outgoing datagrams to trigger the sk_dst_check()
         logic to update the sk->sk_dst_cache.
      5. getsockopt(IPV6_MTU) returns the mtu from the invalid
         sk->sk_dst_cache instead of the newly created RTF_CACHE clone.
      
      This patch updates the sk->sk_dst_cache for a connected datagram sk
      during pmtu-update code path.
      
      Note that the sk->sk_v6_daddr is used to do the route lookup
      instead of skb->data (i.e. iph).  This is because a UDP socket can become
      connected after sending out some datagrams in an unconnected state, or
      it can be connected multiple times to different destinations.  Hence,
      iph may not be related to where sk is currently connected.
      
      It is done under '!sock_owned_by_user(sk)' condition because
      the user may make another ip6_datagram_connect() (i.e. changing
      the sk->sk_v6_daddr) while dst lookup is happening in the pmtu-update
      code path.
      
      For the sock_owned_by_user(sk) == true case, the next patch will
      introduce a release_cb() which will update the sk->sk_dst_cache.
      
      Test:
      
      Server (Connected UDP Socket):
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Route Details:
      [root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac'
      2fac::/64 dev eth0  proto kernel  metric 256  pref medium
      2fac:face::/64 via 2fac::face dev eth0  metric 1024  pref medium
      
      A simple python code to create a connected UDP socket:
      
      import socket
      import errno
      
      HOST = '2fac::1'
      PORT = 8080
      IPV6_MTU = 24  # Linux value of IPV6_MTU; level 41 is IPPROTO_IPV6
      
      s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
      s.bind((HOST, PORT))
      s.connect(('2fac:face::face', 53))
      print("connected")
      while True:
          try:
              data = s.recv(1024)
          except socket.error as se:
              # recv() fails with EMSGSIZE after the ICMPV6_PKT_TOOBIG arrives
              if se.errno == errno.EMSGSIZE:
                  pmtu = s.getsockopt(socket.IPPROTO_IPV6, IPV6_MTU)
                  print("PMTU:%d" % pmtu)
                  break
      s.close()
      
      Python program output after getting an ICMPV6_PKT_TOOBIG:
      [root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py
      connected
      PMTU:1300
      
      Cache routes after receiving TOOBIG:
      [root@arch-fb-vm1 ~]# ip -6 r show table cache
      2fac:face::face via 2fac::face dev eth0  metric 0
          cache  expires 463sec mtu 1300 pref medium
      
      Client (Send the ICMPV6_PKT_TOOBIG):
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      scapy is used to generate the TOOBIG message.  Here is the scapy script I have
      used:
      
      >>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac::1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53)
      >>> sendp(p, iface='qemubr0')
      
      Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Reported-by: Wei Wang <weiwan@google.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33c162a9
    • ipv6: datagram: Refactor dst lookup and update codes to a new function · 7e2040db
      Martin KaFai Lau authored
      This patch moves the route lookup and update codes for connected
      datagram sk to a newly created function ip6_datagram_dst_update()
      
      It will be reused during the pmtu update in the later patch.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7e2040db
    • ipv6: datagram: Refactor flowi6 init codes to a new function · 80fbdb20
      Martin KaFai Lau authored
      Move flowi6 init codes for connected datagram sk to a newly created
      function ip6_datagram_flow_key_init().
      
      Notes:
      1. fl6_flowlabel is used instead of fl6.flowlabel in __ip6_datagram_connect
      2. ipv6_addr_is_multicast(&fl6->daddr) is used instead of
         (addr_type & IPV6_ADDR_MULTICAST) in ip6_datagram_flow_key_init()
      
      This new function will be reused during pmtu update in the later patch.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      80fbdb20
    • net: mediatek: update the IRQ part of the binding document · f1d0540d
      John Crispin authored
      The current binding document only describes a single interrupt. Update the
      document by adding the 2 other interrupts.
      
      The driver currently only uses a single interrupt. The HW is, however, able
      to use IRQ grouping to split TX and RX onto separate GIC irqs.
      Signed-off-by: John Crispin <blogic@openwrt.org>
      Cc: devicetree@vger.kernel.org
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f1d0540d
    • Merge tag 'mac80211-for-davem-2016-04-14' of... · 5e265029
      David S. Miller authored
      Merge tag 'mac80211-for-davem-2016-04-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      This has just the single fix from Dmitry Ivanov, adding the missing
      netlink notifier family check to avoid the socket close DoS problem.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5e265029
    • bpf/verifier: reject invalid LD_ABS | BPF_DW instruction · d82bccc6
      Alexei Starovoitov authored
      verifier must check for reserved size bits in instruction opcode and
      reject BPF_LD | BPF_ABS | BPF_DW and BPF_LD | BPF_IND | BPF_DW instructions,
      otherwise interpreter will WARN_RATELIMIT on them during execution.
      
      Fixes: ddd872bc ("bpf: verifier: add checks for BPF_ABS | BPF_IND instructions")
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d82bccc6
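The opcode test described in this commit can be sketched as a bit-field check on the 8-bit opcode. The constants follow the BPF instruction encoding (class in the low 3 bits, size in bits 3-4, mode in the top 3 bits); the helper `is_invalid_ld_abs_ind` is a hypothetical name and is not the verifier's actual code.

```python
# Opcode field values from the BPF instruction encoding.
BPF_LD = 0x00
BPF_W, BPF_H, BPF_B, BPF_DW = 0x00, 0x08, 0x10, 0x18
BPF_IMM, BPF_ABS, BPF_IND = 0x00, 0x20, 0x40


def bpf_class(code):
    return code & 0x07


def bpf_size(code):
    return code & 0x18


def bpf_mode(code):
    return code & 0xE0


def is_invalid_ld_abs_ind(code):
    """True for BPF_LD with BPF_ABS or BPF_IND mode and the BPF_DW size."""
    return (
        bpf_class(code) == BPF_LD
        and bpf_mode(code) in (BPF_ABS, BPF_IND)
        and bpf_size(code) == BPF_DW
    )


print(is_invalid_ld_abs_ind(BPF_LD | BPF_ABS | BPF_DW))
print(is_invalid_ld_abs_ind(BPF_LD | BPF_IND | BPF_W))
print(is_invalid_ld_abs_ind(BPF_LD | BPF_IMM | BPF_DW))
```

The last call is included because BPF_LD | BPF_IMM | BPF_DW uses the same size bits in a different mode, so a check on the size alone would be too broad.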
    • net: sched: do not requeue a NULL skb · 3dcd493f
      Lars Persson authored
      A failure in validate_xmit_skb_list() triggered an unconditional call
      to dev_requeue_skb with skb=NULL. This slowly grows the queue
      discipline's qlen count until all traffic through the queue stops.
      
      We take the optimistic approach and continue running the queue after a
      failure since it is unknown if later packets also will fail in the
      validate path.
      
      Fixes: 55a93b3e ("qdisc: validate skb without holding lock")
      Signed-off-by: Lars Persson <larper@axis.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3dcd493f
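The accounting problem in this commit can be modelled loosely as follows. `ToyQdisc` and `handle_validated` are hypothetical names, `None` plays the role of a NULL skb, and the control flow is a simplification rather than a transcription of the kernel function.

```python
from collections import deque


class ToyQdisc:
    def __init__(self):
        self.queue = deque()
        self.qlen = 0

    def requeue(self, skb):
        # Every requeue bumps the counter, whatever was passed in.
        self.queue.appendleft(skb)
        self.qlen += 1


def handle_validated(qdisc, skb, device_busy):
    """Return True when the caller should keep running the queue."""
    if skb is None:
        # Validation failed and nothing is left to send: do not requeue,
        # and optimistically continue with the next packet.
        return True
    if device_busy:
        qdisc.requeue(skb)
        return False
    return True


q = ToyQdisc()
for _ in range(3):
    handle_validated(q, None, device_busy=True)
after_failures = q.qlen
handle_validated(q, "pkt", device_busy=True)
print(after_failures, q.qlen)
```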
    • packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface · 309cf37f
      Mathias Krause authored
      Because we fail to wipe the remainder of i->addr[] in packet_mc_add(),
      pdiag_put_mclist() leaks uninitialized heap bytes via the
      PACKET_DIAG_MCLIST netlink attribute.
      
      Fix this by explicitly memset(0)ing the remaining bytes in i->addr[].
      
      Fixes: eea68e2f ("packet: Report socket mclist info via diag module")
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      309cf37f
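The fix pattern, copy the used part and zero the rest, can be shown on a byte buffer. This is an illustration only: `store_address` is a hypothetical helper, the 32-byte buffer size is an assumption, and the `0xaa` fill merely simulates stale heap contents.

```python
ADDR_LEN = 32  # assumed size of the fixed address buffer


def store_address(buf, addr):
    """Copy addr into the fixed-size buffer and zero the remaining bytes."""
    alen = len(addr)
    if alen > len(buf):
        raise ValueError("address longer than buffer")
    buf[:alen] = addr
    # Without this step the tail would keep whatever the buffer held before
    # and could be exposed to whoever reads the full buffer.
    buf[alen:] = bytes(len(buf) - alen)


# Simulate an allocation that still contains stale data.
buf = bytearray(b"\xaa" * ADDR_LEN)
store_address(buf, bytes.fromhex("0180c2000001"))
print(buf.hex())
```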
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 41015e89
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2016-04-13
      
      This series contains updates to i40e, i40evf and fm10k.
      
      Alex fixes a bug introduced earlier based on his interpretation of the
      XL710 datasheet.  The limit for fragments with TSO and an skbuff that
      has payload data in the header portion of the buffer is only 7
      fragments, and the skb-data portion counts as 2 buffers: one for the
      TSO header and one for a segment payload buffer.
      
      Jacob fixes a bug where a previous refactor of the code broke
      multi-bit updates for VFs.  The problem occurs because a multi-bit
      request has a non-zero length, and the PF would simply drop any
      request with the upper 16 bits set.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      41015e89
    • route: do not cache fib route info on local routes with oif · d6d5e999
      Chris Friesen authored
      For local routes that require a particular output interface we do not want
      to cache the result.  Caching the result causes incorrect behaviour when
      there are multiple source addresses on the interface.  The end result
      is that a recipient waiting on that interface for the packet never
      receives it: the packet is delivered on the loopback interface, and
      the IP_PKTINFO ipi_ifindex is set to the loopback interface as well.
      
      This can be tested by running a program such as "dhcp_release" which
      attempts to inject a packet on a particular interface so that it is
      received by another program on the same board.  The receiving process
      should see an IP_PKTINFO ipi_ifindex value of the source interface
      (e.g., eth1) instead of the loopback interface (e.g., lo).  The packet
      will still appear on the loopback interface in tcpdump but the important
      aspect is that the CMSG info is correct.
      
      Sample dhcp_release command line:
      
         dhcp_release eth1 192.168.204.222 02:11:33:22:44:66
      Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
      Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
      Reviewed-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6d5e999
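A receiving program can check the value discussed above by walking the control messages returned by recvmsg(). The following is a rough sketch using the standard sockets API; the helper name `pktinfo_ifindex()` is hypothetical, and the socket is assumed to have had IP_PKTINFO enabled with setsockopt() before receiving.

```c
#define _GNU_SOURCE
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Return ipi_ifindex from the first IP_PKTINFO control message attached
 * to msg, or -1 if no such control message is present.
 */
int pktinfo_ifindex(struct msghdr *msg)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo info;

            /* Copy out of the control buffer to avoid alignment issues. */
            memcpy(&info, CMSG_DATA(cmsg), sizeof(info));
            return info.ipi_ifindex;
        }
    }
    return -1;
}
```

Comparing the returned index with if_nametoindex("eth1") and if_nametoindex("lo") would show whether the packet was attributed to the requested interface or to loopback.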
    • fm10k: fix multi-bit VLAN update requests from VF · f808c5db
      Jacob Keller authored
      The VF uses a multi-bit update request to clear unused VLANs whenever it
      resets. However, an accident in a previous refactor broke multi-bit
      updates for VFs, due to misreading a comment in fm10k_vf.c and
      attempting to reduce code duplication. The problem occurs because
      a multi-bit request has a non-zero length, and the PF would simply drop
      any request with the upper 16 bits set.
      
      We can't simply remove the check of the upper 16 bits and the call to
      fm10k_iov_select_vid, because this would remove the checks for default
      VID and for ensuring no other VLANs can be enabled except pf_vid when it
      has been set. To resolve that issue, this revision uses the
      iov_select_vid when we have a single-bit update, and denies any
      multi-bit update when the VLAN was administratively set by the PF. This
      should be ok since the PF properly updates VLAN_TABLE when it assigns
      the PF vid. This ensures that requests to add or remove the PF vid work
      as expected, but a rogue VF could not use the multi-bit update as
      a loophole to attempt receiving traffic on other VLANs.
      Reported-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      f808c5db
    • net: thunderx: Fix broken of_node_put() code. · 65c66af6
      David Daney authored
      commit b7d3e3d3 ("net: thunderx: Don't leak phy device references
      on -EPROBE_DEFER condition.") incorrectly moved the call to
      of_node_put() outside of the loop.  Under normal loop exit, the node
      has already had of_node_put() called, so the extra call results in:
      
      [    8.228020] ERROR: Bad of_node_put() on /soc@0/pci@848000000000/mrml-bridge0@1,0/bgx0/xlaui00
      [    8.239433] CPU: 16 PID: 608 Comm: systemd-udevd Not tainted 4.6.0-rc1-numa+ #157
      [    8.247380] Hardware name: www.cavium.com EBB8800/EBB8800, BIOS 0.3 Mar  2 2016
      [    8.273541] Call trace:
      [    8.273550] [<fffffc0008097364>] dump_backtrace+0x0/0x210
      [    8.273557] [<fffffc0008097598>] show_stack+0x24/0x2c
      [    8.273560] [<fffffc0008399ed0>] dump_stack+0x8c/0xb4
      [    8.273566] [<fffffc00085aa828>] of_node_release+0xa8/0xac
      [    8.273570] [<fffffc000839cad8>] kobject_cleanup+0x8c/0x194
      [    8.273573] [<fffffc000839c97c>] kobject_put+0x44/0x6c
      [    8.273576] [<fffffc00085a9ab0>] of_node_put+0x24/0x30
      [    8.273587] [<fffffc0000bd0f74>] bgx_probe+0x17c/0xcd8 [thunder_bgx]
      [    8.273591] [<fffffc00083ed220>] pci_device_probe+0xa0/0x114
      [    8.273596] [<fffffc0008473fbc>] driver_probe_device+0x178/0x418
      [    8.273599] [<fffffc000847435c>] __driver_attach+0x100/0x118
      [    8.273602] [<fffffc0008471b58>] bus_for_each_dev+0x6c/0xac
      [    8.273605] [<fffffc0008473884>] driver_attach+0x30/0x38
      [    8.273608] [<fffffc00084732f4>] bus_add_driver+0x1f8/0x29c
      [    8.273611] [<fffffc0008475028>] driver_register+0x70/0x110
      [    8.273617] [<fffffc00083ebf08>] __pci_register_driver+0x60/0x6c
      [    8.273623] [<fffffc0000bf0040>] bgx_init_module+0x40/0x48 [thunder_bgx]
      [    8.273626] [<fffffc0008090d04>] do_one_initcall+0xcc/0x1c0
      [    8.273631] [<fffffc0008198abc>] do_init_module+0x68/0x1c8
      [    8.273635] [<fffffc0008125668>] load_module+0xf44/0x11f4
      [    8.273638] [<fffffc0008125b64>] SyS_finit_module+0xb8/0xe0
      [    8.273641] [<fffffc0008093b30>] el0_svc_naked+0x24/0x28
      
      Go back to the previous (correct) code that only did the extra
      of_node_put() call on early exit from the loop.
      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65c66af6
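The reference-counting rule behind this fix can be modelled in a few lines of userspace C. This is a toy sketch, not the driver code: `struct node`, `node_get()`, `node_put()`, `next_child()` and `walk_children()` are all hypothetical names, and the iterator is only assumed to behave like the kernel's child-node iterators, which drop the reference on the previous node while taking one on the next.

```c
#include <stddef.h>

/* Toy node with a reference count (hypothetical, for illustration only). */
struct node {
    int refcount;
    struct node *next;
};

static void node_get(struct node *n) { if (n) n->refcount++; }
static void node_put(struct node *n) { if (n) n->refcount--; }

/* Iterator step: take a reference on the next node, drop the previous one. */
static struct node *next_child(struct node *first, struct node *prev)
{
    struct node *next = prev ? prev->next : first;

    node_get(next);
    node_put(prev);
    return next;
}

/* Visit children, optionally stopping early at stop_at (may be NULL). */
int walk_children(struct node *first, const struct node *stop_at)
{
    struct node *n = NULL;
    int visited = 0;

    while ((n = next_child(first, n)) != NULL) {
        visited++;
        if (n == stop_at) {
            /* Early exit: we still hold a reference, so drop it here. */
            node_put(n);
            break;
        }
    }
    /* Normal exit: the iterator already dropped the last reference. */
    return visited;
}
```

Moving the `node_put()` call to after the loop would drop one reference too many on the normal exit path, which is the kind of imbalance the "Bad of_node_put()" message above reports.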
    • mISDN: Fixing missing validation in base_sock_bind() · b8216468
      Emrah Demir authored
      Add validation code into mISDN/socket.c
      Signed-off-by: Emrah Demir <ed@abdsec.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b8216468
    • i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packet · 3f3f7cb8
      Alexander Duyck authored
      This patch addresses a bug introduced based on my interpretation of the
      XL710 datasheet.  Specifically section 8.4.1 states that "A single transmit
      packet may span up to 8 buffers (up to 8 data descriptors per packet
      including both the header and payload buffers)."  It then later goes on to
      say that each segment for a TSO obeys the previous rule, however it then
      refers to TSO header and the segment payload buffers.
      
      I believe the actual limit for fragments with TSO and a skbuff that has
      payload data in the header portion of the buffer is actually only 7
      fragments as the skb->data portion counts as 2 buffers, one for the TSO
      header, and one for a segment payload buffer.
      
      Fixes: 2d37490b ("i40e/i40evf: Rewrite logic for 8 descriptor per packet check")
      Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      3f3f7cb8
    • net: ipv6: Do not keep linklocal and loopback addresses · 70af921d
      David Ahern authored
      f1705ec1 added the option to retain user configured addresses on an
      admin down. A comment to one of the later revisions suggested using the
      IFA_F_PERMANENT flag rather than adding a user_managed boolean to the
      ifaddr struct. A side effect of this change is that link local and
      loopback addresses are also retained which is not part of the objective
      of f1705ec1. Add check to drop those addresses.
      
      Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      70af921d
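The added check can be pictured with the standard address-classification macros from <netinet/in.h>. This is a simplified userspace sketch under stated assumptions: `keep_addr_on_ifdown()` is a hypothetical helper, the `permanent` argument stands in for the IFA_F_PERMANENT flag, and the real kernel decision also depends on configuration not modelled here.

```c
#include <netinet/in.h>
#include <stdbool.h>

/*
 * Decide whether an address should survive an admin down.
 * Only user-configured (permanent) addresses are candidates, and
 * link-local and loopback addresses are always dropped.
 */
bool keep_addr_on_ifdown(const struct in6_addr *addr, bool permanent)
{
    if (!permanent)
        return false;

    if (IN6_IS_ADDR_LINKLOCAL(addr) || IN6_IS_ADDR_LOOPBACK(addr))
        return false;

    return true;
}
```

With this rule a permanent global address such as 2001:db8::1 would be retained, while fe80::/10 addresses and ::1 would be flushed regardless of the flag.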
    • net: ethernet: renesas: ravb_main: test clock rate to avoid division by 0 · a6d37131
      Wolfram Sang authored
      The clk API may return 0 on clk_get_rate, so we should check the result before
      using it as a divisor.
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a6d37131
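The guard described above amounts to validating the rate before it is used as a divisor. A minimal sketch, assuming a hypothetical helper `clock_period_ns()` that converts a rate in Hz to a period in nanoseconds; the driver's actual computation is not reproduced here.

```c
#include <stdint.h>

/*
 * Compute the clock period in nanoseconds from a rate in Hz.
 * Returns 0 on success, or -1 if the rate is 0 and cannot be
 * used as a divisor.
 */
int clock_period_ns(unsigned long rate_hz, uint32_t *period_ns)
{
    if (rate_hz == 0)
        return -1; /* refuse to divide by zero */

    *period_ns = (uint32_t)(1000000000ULL / rate_hz);
    return 0;
}
```

A caller would treat the error return as a probe failure instead of continuing with an undefined value.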
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 60e19518
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for your net tree. More
      specifically, they are:
      
      1) Fix missing filter table per-netns registration in arptables, from
         Florian Westphal.
      
      2) Resolve out of bound access when parsing TCP options in
         nf_conntrack_tcp, patch from Jozsef Kadlecsik.
      
      3) Prefer NFPROTO_BRIDGE extensions over NFPROTO_UNSPEC in ebtables,
         this resolves conflict between xt_limit and ebt_limit, from Phil Sutter.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      60e19518
  5. 13 Apr, 2016 1 commit
  6. 12 Apr, 2016 3 commits
  7. 11 Apr, 2016 2 commits
    • net: vrf: Fix dev refcnt leak due to IPv6 prefix route · 4f7f34ea
      David Ahern authored
      ifupdown2 found a kernel bug with IPv6 routes and movement from the main
      table to the VRF table. Sequence of events:
      
      Create the interface and add addresses:
          ip link add dev eth4.105 link eth4 type vlan id 105
          ip addr add dev eth4.105 8.105.105.10/24
          ip -6 addr add dev eth4.105 2008:105:105::10/64
      
      At this point IPv6 has inserted a prefix route in the main table even
      though the interface is 'down'. From there the VRF device is created:
          ip link add dev vrf105 type vrf table 105
          ip addr add dev vrf105 9.9.105.10/32
          ip -6 addr add dev vrf105 2000:9:105::10/128
          ip link set vrf105 up
      
      Then the interface is enslaved, while still in the 'down' state:
          ip link set dev eth4.105 master vrf105
      
      Since the device is down the VRF driver cycling the device does not
      send the NETDEV_UP and NETDEV_DOWN but rather the NETDEV_CHANGE event
      which does not flush the routes inserted prior.
      
      When the link is brought up
          ip link set dev eth4.105 up
      
      the prefix route is added to the VRF table, but the route in the main
      table is not removed.
      
      Fix by handling the NETDEV_CHANGEUPPER event, similar to what was
      implemented for IPv4 in 7f49e7a3 ("net: Flush local routes when device
      changes vrf association").
      
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f7f34ea
    • net: vrf: Fix dst reference counting · 9ab179d8
      David Ahern authored
      Vivek reported a kernel exception deleting a VRF with an active
      connection through it. The root cause is that the socket has a cached
      reference to a dst that is destroyed. Converting the dst_destroy to
      dst_release and letting proper reference counting kick in does not
      work as the dst has a reference to the device which needs to be released
      as well.
      
      I talked to Hannes about this at netdev and he pointed out the ipv4 and
      ipv6 dst handling has dst_ifdown for just this scenario. Rather than
      continuing with the reinvented dst wheel in VRF just remove it and
      leverage the ipv4 and ipv6 versions.
      
      Fixes: 193125db ("net: Introduce VRF device driver")
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9ab179d8