1. 21 Mar, 2019 28 commits
    • net: sched: flower: handle concurrent mask insertion · 195c234d
      Vlad Buslov authored
      Without rtnl lock protection masks with same key can be inserted
      concurrently. Insert temporary mask with reference count zero to masks
      hashtable. This will cause any concurrent modifications to retry.
      
      Wait for rcu grace period to complete after removing temporary mask from
      masks hashtable to accommodate concurrent readers.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Suggested-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: flower: add reference counter to flower mask · f48ef4d5
      Vlad Buslov authored
      Extend fl_flow_mask structure with reference counter to allow parallel
      modification without relying on rtnl lock. Use rcu read lock to safely
      lookup mask and increment reference counter in order to accommodate
      concurrent deletes.
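The lookup-then-reference pattern described above can be illustrated with a minimal user-space sketch. It uses C11 atomics rather than the kernel's refcount_t, and struct mask, mask_get_unless_zero() and mask_put() are hypothetical names, not the flower code. The point it shows is that a reader which finds an object whose count has already dropped to zero must not revive it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a flower mask; only the counter matters here. */
struct mask {
	atomic_uint refcnt;
};

/* Take a reference unless the count has already dropped to zero. */
static bool mask_get_unless_zero(struct mask *m)
{
	unsigned int old = atomic_load(&m->refcnt);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&m->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller released the last one. */
static bool mask_put(struct mask *m)
{
	unsigned int old = atomic_fetch_sub(&m->refcnt, 1);

	assert(old > 0);
	return old == 1;
}
```

In the kernel the same idea is available as refcount_inc_not_zero(); the object's memory additionally has to stay valid for the duration of the lookup, which is what the rcu read lock provides.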
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: flower: track filter deletion with flag · b2552b8c
      Vlad Buslov authored
      In order to prevent double deletion of filter by concurrent tasks when rtnl
      lock is not used for synchronization, add 'deleted' filter field. Check
      value of this field when modifying filters and return error if concurrent
      deletion is detected.
      
      Refactor __fl_delete() to accept pointer to 'last' boolean as argument,
      and return error code as function return value instead. This is necessary
      to signal concurrent filter delete to caller.
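A minimal single-threaded sketch of the interface shape described above: the delete helper reports 'last' through a pointer and uses its return value for the error. The types, the function name and the choice of -ENOENT are assumptions made for illustration; the real code also performs the check under a lock, which is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the classifier head and a filter. */
struct head {
	int nfilters;
};

struct filter {
	bool deleted;
};

/*
 * Returns 0 on success, or -ENOENT if the filter was already deleted
 * (for example by a concurrent task). *last is set to true only when
 * this call removed the final filter.
 */
static int filter_delete(struct head *h, struct filter *f, bool *last)
{
	*last = false;

	if (f->deleted)
		return -ENOENT;

	assert(h->nfilters > 0);
	f->deleted = true;
	h->nfilters--;
	*last = (h->nfilters == 0);
	return 0;
}
```

A second call on the same filter returns the error instead of touching the head again, which is the double-deletion case the flag is meant to catch.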
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: flower: introduce reference counting for filters · 06177558
      Vlad Buslov authored
      Extend flower filters with reference counting in order to remove dependency
      on rtnl lock in flower ops and allow to modify filters concurrently.
      Reference to flower filter can be taken/released concurrently as soon as it
      is marked as 'unlocked' by last patch in this series. Use atomic reference
      counter type to make concurrent modifications safe.
      
      Always take reference to flower filter while working with it:
      - Modify fl_get() to take reference to filter.
      - Implement tp->put() callback as fl_put() function to allow cls API to
      release reference taken by fl_get().
      - Modify fl_change() to assume that caller holds reference to fold and take
      reference to fnew.
      - Take reference to filter while using it in fl_walk().
      
      Implement helper functions to get/put filter reference counter.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: flower: refactor fl_change · 620da486
      Vlad Buslov authored
      As a preparation for using classifier spinlock instead of relying on
      external rtnl lock, rearrange code in fl_change. The goal is to group the
      code which changes classifier state in single block in order to allow
      following commits in this set to protect it from parallel modification with
      tp->lock. Data structures that require tp->lock protection are mask
      hashtable and filters list, and classifier handle_idr.
      
      fl_hw_replace_filter() is a sleeping function and cannot be called while
      holding a spinlock. In order to execute all sequence of changes to shared
      classifier data structures atomically, call fl_hw_replace_filter() before
      modifying them.
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: flower: don't check for rtnl on head dereference · e474619a
      Vlad Buslov authored
      Flower classifier only changes root pointer during init and destroy. Cls
      API implements reference counting for tcf_proto, so there is no danger of
      concurrent access to tp when it is being destroyed, even without protection
      provided by rtnl lock.
      
      Implement new function fl_head_dereference() to dereference tp->root
      without checking for rtnl lock. Use it in all flower functions that obtain
      head pointer instead of rtnl_dereference().
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: remove defines for unused control bits · 31f1a0e3
      Jakub Kicinski authored
      NFP driver ABI contains bits for L2 switching which were never
      implemented in initially envisioned form.
      
      Remove the defines, and open up the possibility of
      reclaiming the bits for other uses.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rhashtable-cleanups' · 143eb9ac
      David S. Miller authored
      NeilBrown says:
      
      ====================
      Two clean-ups for rhashtable.
      
      These two patches make small improvements to
      rhashtable, but are otherwise unrelated.
      
      Thanks to Herbert, Miguel, and Paul for the review.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: rename rht_for_each*continue as *from. · f7ad68bf
      NeilBrown authored
      The pattern set by list.h is that for_each..continue()
      iterators start at the next entry after the given one,
      while for_each..from() iterators start at the given
      entry.
      
      The rht_for_each*continue() iterators are documented as though they
      start at the 'next' entry, but actually start at the given entry,
      and they are used expecting that behaviour.
      So fix the documentation and change the names to *from for consistency
      with list.h
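The naming convention referred to above can be shown with a small user-space sketch. The macros below are simplified, hypothetical analogues on a NULL-terminated singly linked list; they are not the list.h or rhashtable macros. They only illustrate which entry each flavour visits first.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int val;
	struct node *next;
};

/* "from": visits 'start' itself, then every entry after it. */
#define for_each_from(pos, start) \
	for ((pos) = (start); (pos) != NULL; (pos) = (pos)->next)

/* "continue": skips 'start' and begins at the entry after it. */
#define for_each_continue(pos, start) \
	for ((pos) = (start)->next; (pos) != NULL; (pos) = (pos)->next)

/* Sum of the values seen by the "from" iterator. */
static int sum_from(struct node *start)
{
	struct node *pos;
	int sum = 0;

	assert(start != NULL);
	for_each_from(pos, start)
		sum += pos->val;
	return sum;
}

/* Sum of the values seen by the "continue" iterator. */
static int sum_continue(struct node *start)
{
	struct node *pos;
	int sum = 0;

	assert(start != NULL);
	for_each_continue(pos, start)
		sum += pos->val;
	return sum;
}
```

For a list holding 1, 2, 3, starting at the first node, the "from" flavour sums all three values while the "continue" flavour leaves out the first.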
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: don't hold lock on first table throughout insertion. · 4feb7c7a
      NeilBrown authored
      rhashtable_try_insert() currently holds a lock on the bucket in
      the first table, while also locking buckets in subsequent tables.
      This is unnecessary and looks like a hold-over from some earlier
      version of the implementation.
      
      As insert and remove always lock a bucket in each table in turn, and
      as insert only inserts in the final table, there cannot be any races
      that are not covered by simply locking a bucket in each table in turn.
      
      When an insert call reaches that last table it can be sure that there
      is no matching entry in any other table as it has searched them all, and
      insertion never happens anywhere but in the last table.  The fact that
      code tests for the existence of future_tbl while holding a lock on
      the relevant bucket ensures that two threads inserting the same key
      will make compatible decisions about which is the "last" table.
      
      This simplifies the code and allows the ->rehash field to be
      discarded.
      
      We still need a way to ensure that a dead bucket_table is never
      re-linked by rhashtable_walk_stop().  This can be achieved by calling
      call_rcu() inside the locked region, and checking with
      rcu_head_after_call_rcu() in rhashtable_walk_stop() to see if the
      bucket table is empty and dead.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-phy-Move-Omega-PHY-entry-to-Cygnus-PHY-driver' · 83b038db
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: phy: Move Omega PHY entry to Cygnus PHY driver
      
      In order to pave the way for adding some specific Omega PHY features
      that may not be desirable on other products covered by the bcm7xxx PHY
      driver, split the Omega PHY entry into the Cygnus PHY driver such that
      the PHY drivers are reflective of product lines/business units
      maintaining them within Broadcom.
      
      No functional changes intended.
      ====================
      Acked-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Move Omega PHY entry to Cygnus PHY driver · 17cc9821
      Florian Fainelli authored
      Cygnus and Omega are part of the same business unit and product line, so it
      makes sense to group PHY entries by products such that a platform can
      select only the drivers that it needs. Bring all the functionality that
      the BCM7XXX_28NM_GPHY() macro hides for us and remove the Omega PHY
      entry from bcm7xxx.c.
      
      As an added bonus, we now have a proper mdio_device_id entry to permit
      auto-loading.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Scott Branden <scott.branden@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Prepare for moving Omega out of bcm7xxx · f878fe56
      Florian Fainelli authored
      The Omega PHY entry was added to bcm7xxx.c out of convenience and this
      breaks the one driver per product line paradigm that was applied up
      until now. Since the AFE initialization is shared between Omega and
      BCM7xxx move the relevant functions to bcm-phy-lib.[ch]. No functional
      changes introduced.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Scott Branden <scott.branden@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dst: remove gc leftovers · 02afc7ad
      Julian Wiedmann authored
      Get rid of some obsolete gc-related documentation and macros that were
      missed in commit 5b7c9a8f ("net: remove dst gc related code").
      
      CC: Wei Wang <weiwan@google.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-broadcom-Remove-print-of-base-address' · 88f808f3
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: broadcom: Remove print of base address
      
      Some broadcom MDIO/switch/Ethernet MAC drivers insist on printing the
      base register virtual address which has little value.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: systemport: Remove print of base address · 62be757f
      Florian Fainelli authored
      Since commit ad67b74d ("printk: hash addresses printed with %p")
      pointers are being hashed when printed. Displaying the virtual memory at
      bootup time is not helpful, especially given we use a dev_info() which
      already displays the platform device's address.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: Remove print of base address · fbb7bc45
      Florian Fainelli authored
      Since commit ad67b74d ("printk: hash addresses printed with %p")
      pointers are being hashed when printed. Displaying the virtual memory at
      bootup time is not helpful, especially given we use a dev_info() print which already
      displays the platform device's address.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: mdio-bcm-unimac: Remove print of base address · 647aed23
      Florian Fainelli authored
      Since commit ad67b74d ("printk: hash addresses printed with %p")
      pointers are being hashed when printed. Displaying the virtual memory at
      bootup time is not helpful, especially given we use a dev_info() which
      already displays the platform device's address.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Remove fallback argument from ip6_hold_safe · 10585b43
      David Ahern authored
      net and null_fallback are redundant. Remove null_fallback in favor of
      !net check.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Allow amount of dirty memory from fib resizing to be controllable · 9ab948a9
      David Ahern authored
      fib_trie implementation calls synchronize_rcu when a certain amount of
      pages are dirty from freed entries. The number of pages was determined
      experimentally in 2009 (commit c3059477).
      
      At the current setting, synchronize_rcu is called often -- 51 times in a
      second in one test with an average of an 8 msec delay adding a fib entry.
      The total impact is a significant slowdown when modifying the fib. This is seen
      in the output of 'time' - the difference between real time and sys+user.
      For example, using 720,022 single path routes and 'ip -batch'[1]:
      
          $ time ./ip -batch ipv4/routes-1-hops
          real    0m14.214s
          user    0m2.513s
          sys     0m6.783s
      
      So roughly 35% of the actual time to install the routes is from the ip
      command getting scheduled out, most notably due to synchronize_rcu (this
      is observed using 'perf sched timehist').
      
      This patch makes the amount of dirty memory configurable between 64k where
      the synchronize_rcu is called often (small, low end systems that are memory
      sensitive) to 64M where synchronize_rcu is called rarely during a large
      FIB change (for high end systems with lots of memory). The default is 512kB
      which corresponds to the current setting of 128 pages with a 4kB page size.
      
      As an example, at 16MB the worst interval shows 4 calls to synchronize_rcu
      in a second blocking for up to 30 msec in a single instance, and a total
      of almost 100 msec across the 4 calls in the second. The trade off is
      allowing FIB entries to consume more memory in a given time window but
      with much better fib insertion rates (~30% increase in prefixes/sec).
      With this patch and net.ipv4.fib_sync_mem set to 16MB, the same batch
      file runs in:
      
          $ time ./ip -batch ipv4/routes-1-hops
          real    0m9.692s
          user    0m2.491s
          sys     0m6.769s
      
      So the dead time is reduced to about 1/2 second or <5% of the real time.
      
      [1] 'ip' modified to not request ACK messages which improves route
          insertion times by about 20%
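The figures quoted above can be re-derived with a few lines of arithmetic. The sketch below uses hypothetical helper names and only the numbers from the two 'time' runs and the stated default (128 pages at 4kB each); "dead time" is taken to mean real time minus user plus sys time, as the message describes.

```c
#include <assert.h>

/* Wall-clock time not accounted for by user or system CPU time, in seconds. */
static double dead_time(double real, double user, double sys)
{
	assert(real >= user + sys);
	return real - (user + sys);
}

/* Dead time as a fraction of the wall-clock time. */
static double dead_fraction(double real, double user, double sys)
{
	return dead_time(real, user, sys) / real;
}

/* Size in bytes of 'pages' pages of 'page_size' bytes each. */
static unsigned long pages_to_bytes(unsigned long pages, unsigned long page_size)
{
	return pages * page_size;
}
```

With the first run (14.214s real, 2.513s user, 6.783s sys) the dead time is about 4.92 seconds, roughly 35% of the real time. With the second run (9.692s real, 2.491s user, 6.769s sys) it is about 0.43 seconds, under 5%. 128 pages of 4096 bytes give 524288 bytes, which is 512kB.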
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun: Add ioctl() TUNGETDEVNETNS cmd to allow obtaining real net ns of tun device · 0c3e0e3b
      Kirill Tkhai authored
      In commit f2780d6d "tun: Add ioctl() SIOCGSKNS cmd to allow
      obtaining net ns of tun device" it was missed that tun may change
      its net ns, while net ns of socket remains the same as it was
      created initially. SIOCGSKNS returns net ns of socket, so it is
      not suitable for obtaining net ns of device.
      
      We may have two tun devices with the same names in two net ns,
      and in this case it's not possible to determine which of them
      fd refers to (TUNGETIFF will return the same name).
      
      This patch adds new ioctl() cmd for obtaining net ns of a device.
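A minimal user-space sketch of how the new command might be called. It assumes a Linux system whose <linux/if_tun.h> defines TUNGETDEVNETNS and an already opened tun file descriptor. The wrapper name tun_dev_netns_fd() is hypothetical, and the fallback branch exists only so the sketch still compiles against older headers.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/if_tun.h>

/*
 * Ask the kernel for a file descriptor referring to the net namespace
 * the tun device behind 'tun_fd' currently lives in.
 * Returns the new descriptor, or -1 with errno set.
 */
static int tun_dev_netns_fd(int tun_fd)
{
#ifdef TUNGETDEVNETNS
	return ioctl(tun_fd, TUNGETDEVNETNS);
#else
	/* Headers predate the command; report it as unsupported. */
	(void)tun_fd;
	errno = ENOTTY;
	return -1;
#endif
}
```

One way a caller could use the result is to fstat() the returned descriptor and compare its device and inode numbers with those of a known namespace file such as /proc/self/ns/net, then close() it. The call is expected to need sufficient privileges in the device's namespace, so an error return should be handled.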
      Reported-by: Harald Albrecht <harald.albrecht@gmx.net>
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipv6-Change-addrconf_f6i_alloc-to-use-ip6_route_info_create' · 28b18b39
      David S. Miller authored
      David Ahern says:
      
      ====================
      ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create
      
      addrconf_f6i_alloc is the last caller of fib6_info_alloc besides
      ip6_route_info_create. There really is no good reason for it to do
      its own fib6_info initialization, so convert it to call
      ip6_route_info_create.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create · c7a1ce39
      David Ahern authored
      Change addrconf_f6i_alloc to generate a fib6_config and call
      ip6_route_info_create. addrconf_f6i_alloc is the last caller to
      fib6_info_alloc besides ip6_route_info_create, and there is no
      reason for it to do its own initialization on a fib6_info.
      
      Host routes need to be created even if the device is down, so add a
      new flag, fc_ignore_dev_down, to fib6_config and update fib6_nh_init
      to not error out if device is not up.
      
      Notes on the conversion:
      - ip_fib_metrics_init behaves the same since fib6_config has fc_mx set to NULL
        and fc_mx_len set to 0
      - dst_nocount is handled by the RTF_ADDRCONF flag
      - dst_host is handled by fc_dst_len = 128
      
      nh_gw does not get set after the conversion to ip6_route_info_create
      but it should not be set in addrconf_f6i_alloc since this is a host
      route not a gateway route.
      
      Everything else is a straightforward map between fib6_info and
      fib6_config.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Move setting default metric for routes · 67f69513
      David Ahern authored
      ip6_route_info_create is a low level function for ensuring fc_metric is
      set. Move the check and default setting to the 2 locations that do not
      already set fc_metric before calling ip6_route_info_create. This is
      required for the next patch which moves addrconf allocations to
      ip6_route_info_create and wants the metric for host routes to be 0.
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/tls: Replace kfree_skb() with consume_skb() · a88c26f6
      Vakul Garg authored
      To free the skb in normal course of processing, consume_skb() should be
      used. Only for failure paths, kfree_skb() is intended to be used.
      
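The distinction can be pictured with a toy user-space model. Both helpers below release the buffer, but only the failure path is counted as a drop, which is roughly how drop-monitoring tools tell the two kernel calls apart. struct buf, struct stats and the function names are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket buffer. */
struct buf {
	int len;
};

/* Counters a monitoring tool might keep. */
struct stats {
	unsigned long consumed;
	unsigned long dropped;
};

/* Normal completion: the data was processed, the buffer is just released. */
static void buf_consume(struct stats *st, struct buf *b)
{
	st->consumed++;
	free(b);
}

/* Failure path: the buffer is discarded and counted as a drop. */
static void buf_drop(struct stats *st, struct buf *b)
{
	st->dropped++;
	free(b);
}

/* Takes ownership of 'b'; returns 0 if processed, -1 if discarded. */
static int process(struct stats *st, struct buf *b)
{
	assert(b != NULL);
	if (b->len <= 0) {
		buf_drop(st, b);
		return -1;
	}
	buf_consume(st, b);
	return 0;
}
```

Using the drop variant on a success path would inflate the drop counter even though nothing went wrong, which is the kind of misleading signal the change avoids.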
      https://www.kernel.org/doc/htmldocs/networking/API-consume-skb.html
      Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix a null pointer deref · 08e046c8
      Hoang Le authored
      In commit c55c8eda ("tipc: smooth change between replicast and
      broadcast") we introduced a new method to eliminate the risk of message
      reordering that can happen between different nodes.
      Unfortunately, we forgot to add a check on the receiving side to ignore
      intra-node messages.

      We fix this by checking for, and returning on, messages that arrive from
      the same node.
      
      syzbot report:
      
      ==================================================================
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 7820 Comm: syz-executor418 Not tainted 5.0.0+ #61
      Hardware name: Google Google Compute Engine/Google Compute Engine,
      BIOS Google 01/01/2011
      RIP: 0010:tipc_mcast_filter_msg+0x21b/0x13d0 net/tipc/bcast.c:782
      Code: 45 c0 0f 84 39 06 00 00 48 89 5d 98 e8 ce ab a5 fa 49 8d bc
       24 c8 00 00 00 48 b9 00 00 00 00 00 fc ff df 48 89 f8 48 c1 e8 03
       <80> 3c 08 00 0f 85 9a 0e 00 00 49 8b 9c 24 c8 00 00 00 48 be 00 00
      RSP: 0018:ffff8880959defc8 EFLAGS: 00010202
      RAX: 0000000000000019 RBX: ffff888081258a48 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: ffffffff86cab862 RDI: 00000000000000c8
      RBP: ffff8880959df030 R08: ffff8880813d0200 R09: ffffed1015d05bc8
      R10: ffffed1015d05bc7 R11: ffff8880ae82de3b R12: 0000000000000000
      R13: 000000000000002c R14: 0000000000000000 R15: ffff888081258a48
      FS:  000000000106a880(0000) GS:ffff8880ae800000(0000)
       knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020001cc0 CR3: 0000000094a20000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       tipc_sk_filter_rcv+0x182d/0x34f0 net/tipc/socket.c:2168
       tipc_sk_enqueue net/tipc/socket.c:2254 [inline]
       tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305
       tipc_sk_mcast_rcv+0x724/0x1020 net/tipc/socket.c:1209
       tipc_mcast_xmit+0x7fe/0x1200 net/tipc/bcast.c:410
       tipc_sendmcast+0xb36/0xfc0 net/tipc/socket.c:820
       __tipc_sendmsg+0x10df/0x18d0 net/tipc/socket.c:1358
       tipc_sendmsg+0x53/0x80 net/tipc/socket.c:1291
       sock_sendmsg_nosec net/socket.c:651 [inline]
       sock_sendmsg+0xdd/0x130 net/socket.c:661
       ___sys_sendmsg+0x806/0x930 net/socket.c:2260
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2298
       __do_sys_sendmsg net/socket.c:2307 [inline]
       __se_sys_sendmsg net/socket.c:2305 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2305
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4401c9
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8
       48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05
       <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffd887fa9d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9
      RDX: 0000000000000000 RSI: 0000000020002140 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401a50
      R13: 0000000000401ae0 R14: 0000000000000000 R15: 0000000000000000
      Modules linked in:
      ---[ end trace ba79875754e1708f ]---
      
      Reported-by: syzbot+be4bdf2cc3e85e952c50@syzkaller.appspotmail.com
      Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast")
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix use-after-free in tipc_sk_filter_rcv · 77d5ad40
      Hoang Le authored
      The skb is freed in:
        1/ condition 1: tipc_sk_filter_rcv -> tipc_sk_proto_rcv
        2/ condition 2: tipc_sk_filter_rcv -> tipc_group_filter_msg
      This leads to a "use-after-free" access in the next condition.

      We fix this by initializing the variable at declaration, so that it is safe
      to check this variable to continue processing if the condition matches.
      
      syzbot report:
      
      ==================================================================
      BUG: KASAN: use-after-free in tipc_sk_filter_rcv+0x2166/0x34f0
       net/tipc/socket.c:2167
      Read of size 4 at addr ffff88808ea58534 by task kworker/u4:0/7
      
      CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.0.0+ #61
      Hardware name: Google Google Compute Engine/Google Compute Engine,
       BIOS Google 01/01/2011
      Workqueue: tipc_send tipc_conn_send_work
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131
       tipc_sk_filter_rcv+0x2166/0x34f0 net/tipc/socket.c:2167
       tipc_sk_enqueue net/tipc/socket.c:2254 [inline]
       tipc_sk_rcv+0xc45/0x25a0 net/tipc/socket.c:2305
       tipc_topsrv_kern_evt+0x3b7/0x580 net/tipc/topsrv.c:610
       tipc_conn_send_to_sock+0x43e/0x5f0 net/tipc/topsrv.c:283
       tipc_conn_send_work+0x65/0x80 net/tipc/topsrv.c:303
       process_one_work+0x98e/0x1790 kernel/workqueue.c:2269
       worker_thread+0x98/0xe40 kernel/workqueue.c:2415
       kthread+0x357/0x430 kernel/kthread.c:253
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
      
      Reported-by: syzbot+e863893591cc7a622e40@syzkaller.appspotmail.com
      Fixes: c55c8eda ("tipc: smooth change between replicast and broadcast")
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 20 Mar, 2019 12 commits