1. 25 Feb, 2019 3 commits
    • tmpfs: fix uninitialized return value in shmem_link · 29b00e60
      Darrick J. Wong authored
      When we made the shmem_reserve_inode call in shmem_link conditional, we
      forgot to update the declaration for ret so that it always has a known
      value.  Dan Carpenter pointed out this deficiency in the original patch.
      
      Fixes: 1062af92 ("tmpfs: fix link accounting when a tmpfile is linked in")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: Matej Kupljen <matej.kupljen@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      29b00e60
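The class of bug described in this commit can be illustrated with a small userspace sketch. The names `reserve_inode` and `link_file` are hypothetical stand-ins and not the kernel's shmem code; the point is only that once the call assigning `ret` becomes conditional, `ret` needs an initializer so the path that skips the call still returns a known value.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reservation step that can fail. */
static int reserve_inode(bool have_space)
{
    return have_space ? 0 : -ENOSPC;
}

/*
 * Hypothetical link routine. ret is initialized to 0 so that the
 * path which skips the reservation still returns a defined value.
 * Without the initializer, that path would return garbage.
 */
static int link_file(bool needs_reservation, bool have_space)
{
    int ret = 0;

    if (needs_reservation) {
        ret = reserve_inode(have_space);
        if (ret)
            goto out;
    }
    /* ... the actual linking work would happen here ... */
out:
    return ret;
}
```

Calling `link_file(false, false)` exercises the path that never touches `reserve_inode`, which is exactly the path that depends on the initializer.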
    • Revert "x86/fault: BUG() when uaccess helpers fault on kernel addresses" · 53a41cb7
      Linus Torvalds authored
      This reverts commit 9da3f2b7.
      
      It was well-intentioned, but wrong.  Overriding the exception tables for
      instructions for random reasons is just wrong, and that is what the new
      code did.
      
      It caused problems for tracing, and it caused problems for strncpy_from_user(),
      because the new checks made perfectly valid use cases break, rather than
      catch things that did bad things.
      
      Unchecked user space accesses are a problem, but that's not a reason to
      add invalid checks that then people have to work around with silly flags
      (in this case, that 'kernel_uaccess_faults_ok' flag, which is just an
      odd way to say "this commit was wrong" and was sprinkled into random
      places to hide the wrongness).
      
      The real fix to unchecked user space accesses is to get rid of the
      special "let's not check __get_user() and __put_user() at all" logic.
      Make __{get|put}_user() be just aliases to the regular {get|put}_user()
      functions, and make it impossible to access user space without having
      the proper checks in places.
      
      The raison d'être of the special double-underscore versions used to be
      that the range check was expensive, and if you did multiple user
      accesses, you'd do the range check up front (like the signal frame
      handling code, for example).  But SMAP (on x86) and PAN (on ARM) have
      made that optimization pointless, because the _real_ expense is the "set
      CPU flag to allow user space access".
      
      So let's not break the valid cases to catch invalid cases that shouldn't
      even exist.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tobin C. Harding <tobin@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Jann Horn <jannh@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      53a41cb7
    • Linux 5.0-rc8 · 5908e6b7
      Linus Torvalds authored
      5908e6b7
  2. 24 Feb, 2019 8 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · c3619a48
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "Bug fixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: MMU: record maximum physical address width in kvm_mmu_extended_role
        kvm: x86: Return LA57 feature based on hardware capability
        x86/kvm/mmu: fix switch between root and guest MMUs
        s390: vsie: Use effective CRYCBD.31 to check CRYCBD validity
      c3619a48
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c4eb1e18
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Hopefully the last pull request for this release. Fingers crossed:
      
         1) Only refcount ESP stats on full sockets, from Martin Willi.
      
         2) Missing barriers in AF_UNIX, from Al Viro.
      
         3) RCU protection fixes in ipv6 route code, from Paolo Abeni.
      
         4) Avoid false positives in untrusted GSO validation, from Willem de
            Bruijn.
      
         5) Forwarded mesh packets in mac80211 need more tailroom allocated,
            from Felix Fietkau.
      
         6) Use operstate consistently for linkup in team driver, from George
            Wilkie.
      
         7) ThunderX bug fixes from Vadim Lomovtsev. Mostly races between VF
            and PF code paths.
      
         8) Purge ipv6 exceptions during netdevice removal, from Paolo Abeni.
      
         9) nfp eBPF code gen fixes from Jiong Wang.
      
        10) bnxt_en firmware timeout fix from Michael Chan.
      
        11) Use after free in udp/udpv6 error handlers, from Paolo Abeni.
      
        12) Fix a race in x25_bind triggerable by syzbot, from Eric Dumazet"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
        net: phy: realtek: Dummy IRQ calls for RTL8366RB
        tcp: repaired skbs must init their tso_segs
        net/x25: fix a race in x25_bind()
        net: dsa: Remove documentation for port_fdb_prepare
        Revert "bridge: do not add port to router list when receives query with source 0.0.0.0"
        selftests: fib_tests: sleep after changing carrier. again.
        net: set static variable an initial value in atl2_probe()
        net: phy: marvell10g: Fix Multi-G advertisement to only advertise 10G
        bpf, doc: add bpf list as secondary entry to maintainers file
        udp: fix possible user after free in error handler
        udpv6: fix possible user after free in error handler
        fou6: fix proto error handler argument type
        udpv6: add the required annotation to mib type
        mdio_bus: Fix use-after-free on device_register fails
        net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255
        bnxt_en: Wait longer for the firmware message response to complete.
        bnxt_en: Fix typo in firmware message timeout logic.
        nfp: bpf: fix ALU32 high bits clearance bug
        nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K
        Documentation: networking: switchdev: Update port parent ID section
        ...
      c4eb1e18
    • net: phy: realtek: Dummy IRQ calls for RTL8366RB · 4c8e0459
      Linus Walleij authored
      This fixes a regression introduced by
      commit 0d2e778e
      "net: phy: replace PHY_HAS_INTERRUPT with a check for
      config_intr and ack_interrupt".
      
      That commit assumes that a PHY cannot trigger interrupts
      unless it implements .config_intr() or .ack_interrupt().
      A later patch makes the code assume that both must be
      implemented for interrupts to be present.
      
      But this PHY (which is inside a DSA) will happily
      fire interrupts without either callback.
      
      Implement dummy callbacks for .config_intr() and
      .ack_interrupt() in the phy header to fix this.
      
      Tested on the RTL8366RB on D-Link DIR-685.
      
      Fixes: 0d2e778e ("net: phy: replace PHY_HAS_INTERRUPT with a check for config_intr and ack_interrupt")
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c8e0459
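The idea of inferring interrupt support from the presence of callbacks, and of satisfying that check with dummy callbacks, can be sketched in plain C. The struct and function names below are hypothetical simplifications and do not reproduce the kernel's `struct phy_driver` or its helpers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PHY driver's callback table. */
struct fake_phy_driver {
    int (*config_intr)(void);
    int (*ack_interrupt)(void);
};

/* Dummy callbacks: this hardware needs no setup or acknowledgement. */
static int dummy_config_intr(void)
{
    return 0;
}

static int dummy_ack_interrupt(void)
{
    return 0;
}

/* Interrupt support is inferred from both callbacks being present. */
static bool driver_has_interrupt(const struct fake_phy_driver *drv)
{
    return drv->config_intr != NULL && drv->ack_interrupt != NULL;
}
```

A driver table with both pointers left `NULL` fails the check even if the hardware fires interrupts; filling in the dummies makes the check pass without changing hardware behavior.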
    • tcp: repaired skbs must init their tso_segs · bf50b606
      Eric Dumazet authored
      syzbot reported a WARN_ON(!tcp_skb_pcount(skb))
      in tcp_send_loss_probe() [1]
      
      This was caused by TCP_REPAIR sent skbs that inadvertently
      were missing a call to tcp_init_tso_segs().
      
      [1]
      WARNING: CPU: 1 PID: 0 at net/ipv4/tcp_output.c:2534 tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc7+ #77
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       panic+0x2cb/0x65c kernel/panic.c:214
       __warn.cold+0x20/0x45 kernel/panic.c:571
       report_bug+0x263/0x2b0 lib/bug.c:186
       fixup_bug arch/x86/kernel/traps.c:178 [inline]
       fixup_bug arch/x86/kernel/traps.c:173 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290
       invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
      RIP: 0010:tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534
      Code: 88 fc ff ff 4c 89 ef e8 ed 75 c8 fb e9 c8 fc ff ff e8 43 76 c8 fb e9 63 fd ff ff e8 d9 75 c8 fb e9 94 f9 ff ff e8 bf 03 91 fb <0f> 0b e9 7d fa ff ff e8 b3 03 91 fb 0f b6 1d 37 43 7a 03 31 ff 89
      RSP: 0018:ffff8880ae907c60 EFLAGS: 00010206
      RAX: ffff8880a989c340 RBX: 0000000000000000 RCX: ffffffff85dedbdb
      RDX: 0000000000000100 RSI: ffffffff85dee0b1 RDI: 0000000000000005
      RBP: ffff8880ae907c90 R08: ffff8880a989c340 R09: ffffed10147d1ae1
      R10: ffffed10147d1ae0 R11: ffff8880a3e8d703 R12: ffff888091b90040
      R13: ffff8880a3e8d540 R14: 0000000000008000 R15: ffff888091b90860
       tcp_write_timer_handler+0x5c0/0x8a0 net/ipv4/tcp_timer.c:583
       tcp_write_timer+0x10e/0x1d0 net/ipv4/tcp_timer.c:607
       call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
       expire_timers kernel/time/timer.c:1362 [inline]
       __run_timers kernel/time/timer.c:1681 [inline]
       __run_timers kernel/time/timer.c:1649 [inline]
       run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
       __do_softirq+0x266/0x95a kernel/softirq.c:292
       invoke_softirq kernel/softirq.c:373 [inline]
       irq_exit+0x180/0x1d0 kernel/softirq.c:413
       exiting_irq arch/x86/include/asm/apic.h:536 [inline]
       smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
       </IRQ>
      RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58
      Code: ff ff ff 48 89 c7 48 89 45 d8 e8 59 0c a1 fa 48 8b 45 d8 e9 ce fe ff ff 48 89 df e8 48 0c a1 fa eb 82 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
      RSP: 0018:ffff8880a98afd78 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
      RAX: 1ffffffff1125061 RBX: ffff8880a989c340 RCX: 0000000000000000
      RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880a989cbbc
      RBP: ffff8880a98afda8 R08: ffff8880a989c340 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
      R13: ffffffff889282f8 R14: 0000000000000001 R15: 0000000000000000
       arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:555
       default_idle_call+0x36/0x90 kernel/sched/idle.c:93
       cpuidle_idle_call kernel/sched/idle.c:153 [inline]
       do_idle+0x386/0x570 kernel/sched/idle.c:262
       cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:353
       start_secondary+0x404/0x5c0 arch/x86/kernel/smpboot.c:271
       secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 79861919 ("tcp: fix TCP_REPAIR xmit queue setup")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bf50b606
    • net/x25: fix a race in x25_bind() · 797a22bd
      Eric Dumazet authored
      syzbot was able to trigger another soft lockup [1]
      
      I first thought it was the O(N^2) issue I mentioned in my
      prior fix (f657d22ee1f "net/x25: do not hold the cpu
      too long in x25_new_lci()"), but I eventually found
      that x25_bind() was not checking SOCK_ZAPPED state under
      socket lock protection.
      
      This means that multiple threads can end up calling
      x25_insert_socket() for the same socket, and corrupt x25_list.
      
      [1]
      watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.2:10492]
      Modules linked in:
      irq event stamp: 27515
      hardirqs last  enabled at (27514): [<ffffffff81006673>] trace_hardirqs_on_thunk+0x1a/0x1c
      hardirqs last disabled at (27515): [<ffffffff8100668f>] trace_hardirqs_off_thunk+0x1a/0x1c
      softirqs last  enabled at (32): [<ffffffff8632ee73>] x25_get_neigh+0xa3/0xd0 net/x25/x25_link.c:336
      softirqs last disabled at (34): [<ffffffff86324bc3>] x25_find_socket+0x23/0x140 net/x25/af_x25.c:341
      CPU: 0 PID: 10492 Comm: syz-executor.2 Not tainted 5.0.0-rc7+ #88
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__sanitizer_cov_trace_pc+0x4/0x50 kernel/kcov.c:97
      Code: f4 ff ff ff e8 11 9f ea ff 48 c7 05 12 fb e5 08 00 00 00 00 e9 c8 e9 ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 <48> 8b 75 08 65 48 8b 04 25 40 ee 01 00 65 8b 15 38 0c 92 7e 81 e2
      RSP: 0018:ffff88806e94fc48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
      RAX: 1ffff1100d84dac5 RBX: 0000000000000001 RCX: ffffc90006197000
      RDX: 0000000000040000 RSI: ffffffff86324bf3 RDI: ffff88806c26d628
      RBP: ffff88806e94fc48 R08: ffff88806c1c6500 R09: fffffbfff1282561
      R10: fffffbfff1282560 R11: ffffffff89412b03 R12: ffff88806c26d628
      R13: ffff888090455200 R14: dffffc0000000000 R15: 0000000000000000
      FS:  00007f3a107e4700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f3a107e3db8 CR3: 00000000a5544000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       __x25_find_socket net/x25/af_x25.c:327 [inline]
       x25_find_socket+0x7d/0x140 net/x25/af_x25.c:342
       x25_new_lci net/x25/af_x25.c:355 [inline]
       x25_connect+0x380/0xde0 net/x25/af_x25.c:784
       __sys_connect+0x266/0x330 net/socket.c:1662
       __do_sys_connect net/socket.c:1673 [inline]
       __se_sys_connect net/socket.c:1670 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:1670
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457e29
      Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f3a107e3c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e29
      RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000005
      RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a107e46d4
      R13: 00000000004be362 R14: 00000000004ceb98 R15: 00000000ffffffff
      Sending NMI from CPU 0 to CPUs 1:
      NMI backtrace for cpu 1
      CPU: 1 PID: 10493 Comm: syz-executor.3 Not tainted 5.0.0-rc7+ #88
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline]
      RIP: 0010:queued_write_lock_slowpath+0x143/0x290 kernel/locking/qrwlock.c:86
      Code: 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 41 0f b6 55 00 <41> 38 d7 7c eb 84 d2 74 e7 48 89 df e8 cc aa 4e 00 eb dd be 04 00
      RSP: 0018:ffff888085c47bd8 EFLAGS: 00000206
      RAX: 0000000000000300 RBX: ffffffff89412b00 RCX: 1ffffffff1282560
      RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89412b00
      RBP: ffff888085c47c70 R08: 1ffffffff1282560 R09: fffffbfff1282561
      R10: fffffbfff1282560 R11: ffffffff89412b03 R12: 00000000000000ff
      R13: fffffbfff1282560 R14: 1ffff11010b88f7d R15: 0000000000000003
      FS:  00007fdd04086700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fdd04064db8 CR3: 0000000090be0000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       queued_write_lock include/asm-generic/qrwlock.h:104 [inline]
       do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203
       __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline]
       _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312
       x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267
       x25_bind+0x273/0x340 net/x25/af_x25.c:703
       __sys_bind+0x23f/0x290 net/socket.c:1481
       __do_sys_bind net/socket.c:1492 [inline]
       __se_sys_bind net/socket.c:1490 [inline]
       __x64_sys_bind+0x73/0xb0 net/socket.c:1490
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457e29
      
      Fixes: 90c27297 ("X.25 remove bkl in bind")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: andrew hendry <andrew.hendry@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      797a22bd
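The pattern behind this fix, testing a state flag only while holding the lock that protects it, can be sketched in userspace with a pthread mutex. Everything here (`struct fake_sock`, `fake_bind`, the `zapped` field) is a hypothetical model and not the x25 code; it only shows why a check-then-insert sequence must sit inside one critical section.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a socket with an "unbound" flag. */
struct fake_sock {
    pthread_mutex_t lock;
    bool zapped;       /* true until the socket has been bound */
    int insert_count;  /* how many times it was put on the list */
};

/*
 * The flag is tested and cleared while the lock is held, so two
 * racing callers cannot both see zapped == true and both insert.
 */
static int fake_bind(struct fake_sock *sk)
{
    int rc = 0;

    pthread_mutex_lock(&sk->lock);
    if (sk->zapped) {
        sk->zapped = false;
        sk->insert_count++;  /* stands in for the list insertion */
    } else {
        rc = -EINVAL;        /* already bound */
    }
    pthread_mutex_unlock(&sk->lock);
    return rc;
}
```

If the `zapped` test happened before taking the lock, two threads could both pass it and both insert, which is the kind of double insertion the commit message describes.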
    • net: dsa: Remove documentation for port_fdb_prepare · 99407d8f
      Hauke Mehrtens authored
      This callback was removed some time ago; remove its documentation as well.
      
      Fixes: 1b6dd556 ("net: dsa: Remove prepare phase for FDB")
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      99407d8f
    • Revert "bridge: do not add port to router list when receives query with source 0.0.0.0" · 278e2148
      Hangbin Liu authored
      This reverts commit 5a2de63f ("bridge: do not add port to router list
      when receives query with source 0.0.0.0") and commit 0fe5119e ("net:
      bridge: remove ipv6 zero address check in mcast queries")
      
      The reason is that RFC 4541 is not a standard but only a set of
      recommendations. Currently we elect 0.0.0.0 as Querier if there is no
      IP address configured on the bridge. If we do not add the port that
      receives a query with source 0.0.0.0 to the router list, IGMP reports
      cannot be forwarded to the Querier, and IGMP data cannot be forwarded
      to its destination either.
      
      As Nikolay suggested, revert this change first and add a boolopt API
      to disable non-zero election in the future if needed.
      
      Reported-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Reported-by: Sebastian Gottschall <s.gottschall@newmedia-net.de>
      Fixes: 5a2de63f ("bridge: do not add port to router list when receives query with source 0.0.0.0")
      Fixes: 0fe5119e ("net: bridge: remove ipv6 zero address check in mcast queries")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      278e2148
    • selftests: fib_tests: sleep after changing carrier. again. · af548a27
      Thadeu Lima de Souza Cascardo authored
      Just like commit e2ba732a ("selftests: fib_tests: sleep after
      changing carrier"), wait one second to allow linkwatch to propagate the
      carrier change to the stack.
      
      There are two sets of carrier tests. The first slept after the carrier
      was set to off, and when the second set ran, it was likely that the
      linkwatch would be able to run again without much delay, reducing the
      likelihood of a race. However, if you run 'fib_tests.sh -t carrier' on a
      loop, you will quickly notice the failures.
      
      Sleeping on the second set of tests makes the failures go away.
      
      Cc: David Ahern <dsahern@gmail.com>
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      af548a27
  3. 23 Feb, 2019 17 commits
  4. 22 Feb, 2019 12 commits
    • mdio_bus: Fix use-after-free on device_register fails · 6ff7b060
      YueHaibing authored
      KASAN has found a use-after-free in fixed_mdio_bus_init().
      Commit 0c692d07 ("drivers/net/phy/mdio_bus.c: call
      put_device on device_register() failure") calls put_device()
      when device_register() fails, which gives up the last reference
      to the device and allows mdiobus_release() to run and kfree
      the bus. However, most drivers call mdiobus_free() to free the
      bus when mdiobus_register() fails, so the bus is accessed again
      after it has already been freed. This patch reverts that commit
      and lets mdiobus_free() free the bus.
      
      KASAN report details as below:
      
      BUG: KASAN: use-after-free in mdiobus_free+0x85/0x90 drivers/net/phy/mdio_bus.c:482
      Read of size 4 at addr ffff8881dc824d78 by task syz-executor.0/3524
      
      CPU: 1 PID: 3524 Comm: syz-executor.0 Not tainted 5.0.0-rc7+ #45
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xfa/0x1ce lib/dump_stack.c:113
       print_address_description+0x65/0x270 mm/kasan/report.c:187
       kasan_report+0x149/0x18d mm/kasan/report.c:317
       mdiobus_free+0x85/0x90 drivers/net/phy/mdio_bus.c:482
       fixed_mdio_bus_init+0x283/0x1000 [fixed_phy]
       ? 0xffffffffc0e40000
       ? 0xffffffffc0e40000
       ? 0xffffffffc0e40000
       do_one_initcall+0xfa/0x5ca init/main.c:887
       do_init_module+0x204/0x5f6 kernel/module.c:3460
       load_module+0x66b2/0x8570 kernel/module.c:3808
       __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
       do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462e99
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f6215c19c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
      RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
      RBP: 00007f6215c19c70 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f6215c1a6bc
      R13: 00000000004bcefb R14: 00000000006f7030 R15: 0000000000000004
      
      Allocated by task 3524:
       set_track mm/kasan/common.c:85 [inline]
       __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:496
       kmalloc include/linux/slab.h:545 [inline]
       kzalloc include/linux/slab.h:740 [inline]
       mdiobus_alloc_size+0x54/0x1b0 drivers/net/phy/mdio_bus.c:143
       fixed_mdio_bus_init+0x163/0x1000 [fixed_phy]
       do_one_initcall+0xfa/0x5ca init/main.c:887
       do_init_module+0x204/0x5f6 kernel/module.c:3460
       load_module+0x66b2/0x8570 kernel/module.c:3808
       __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
       do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 3524:
       set_track mm/kasan/common.c:85 [inline]
       __kasan_slab_free+0x130/0x180 mm/kasan/common.c:458
       slab_free_hook mm/slub.c:1409 [inline]
       slab_free_freelist_hook mm/slub.c:1436 [inline]
       slab_free mm/slub.c:2986 [inline]
       kfree+0xe1/0x270 mm/slub.c:3938
       device_release+0x78/0x200 drivers/base/core.c:919
       kobject_cleanup lib/kobject.c:662 [inline]
       kobject_release lib/kobject.c:691 [inline]
       kref_put include/linux/kref.h:67 [inline]
       kobject_put+0x146/0x240 lib/kobject.c:708
       put_device+0x1c/0x30 drivers/base/core.c:2060
       __mdiobus_register+0x483/0x560 drivers/net/phy/mdio_bus.c:382
       fixed_mdio_bus_init+0x26b/0x1000 [fixed_phy]
       do_one_initcall+0xfa/0x5ca init/main.c:887
       do_init_module+0x204/0x5f6 kernel/module.c:3460
       load_module+0x66b2/0x8570 kernel/module.c:3808
       __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
       do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff8881dc824c80
       which belongs to the cache kmalloc-2k of size 2048
      The buggy address is located 248 bytes inside of
       2048-byte region [ffff8881dc824c80, ffff8881dc825480)
      The buggy address belongs to the page:
      page:ffffea0007720800 count:1 mapcount:0 mapping:ffff8881f6c02800 index:0x0 compound_mapcount: 0
      flags: 0x2fffc0000010200(slab|head)
      raw: 02fffc0000010200 0000000000000000 0000000500000001 ffff8881f6c02800
      raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8881dc824c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff8881dc824c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8881dc824d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                      ^
       ffff8881dc824d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8881dc824e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      Fixes: 0c692d07 ("drivers/net/phy/mdio_bus.c: call put_device on device_register() failure")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6ff7b060
    • net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 · 97f0082a
      Kalash Nainwal authored
      Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 to
      keep legacy software happy. This is similar to what was done for
      ipv4 in commit 709772e6 ("net: Fix routing tables with
      id > 255 for legacy software").
      
      Signed-off-by: Kalash Nainwal <kalash@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      97f0082a
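The compatibility rule described in this commit can be sketched as a small helper. The function name `rtm_table_for` is hypothetical; `RT_TABLE_COMPAT` is defined locally with the value 252 used in `linux/rtnetlink.h`, so the sketch compiles without kernel headers. It assumes, as the commit text implies, that the header field is only 8 bits wide.

```c
#include <stdint.h>

/* Same value as RT_TABLE_COMPAT in linux/rtnetlink.h, redefined here
 * so the sketch does not depend on kernel headers. */
#define RT_TABLE_COMPAT 252

/*
 * Hypothetical helper: choose the 8-bit rtm_table value for a
 * 32-bit routing table id. Ids that do not fit in 8 bits are
 * reported as RT_TABLE_COMPAT instead of being truncated.
 */
static uint8_t rtm_table_for(uint32_t table_id)
{
    return table_id < 256 ? (uint8_t)table_id : RT_TABLE_COMPAT;
}
```

Without the clamp, a table id such as 1000 would be truncated to an unrelated 8-bit value; software that needs the full id is expected to read it from a separate attribute rather than from this field.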
    • Merge branch 'bnxt_en-firmware-message-delay-fixes' · a11f5756
      David S. Miller authored
      Michael Chan says:
      
      ====================
      bnxt_en: firmware message delay fixes.
      
      We were seeing some intermittent firmware message timeouts in our lab and
      these 2 small patches fix them.  Please apply to stable as well.  Thanks.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a11f5756
    • bnxt_en: Wait longer for the firmware message response to complete. · 0000b81a
      Michael Chan authored
      The code waits up to 20 usec for the firmware response to complete
      once we've seen the valid response header in the buffer.  It turns
      out that in some scenarios, this wait time is not long enough.
      Extend it to 150 usec and use usleep_range() instead of udelay().
      
      Fixes: 9751e8e7 ("bnxt_en: reduce timeout on initial HWRM calls")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0000b81a
    • bnxt_en: Fix typo in firmware message timeout logic. · 67681d02
      Michael Chan authored
      The logic that polls for the firmware message response uses a shorter
      sleep interval for the first few passes.  But there was a typo so it
      was using the wrong counter (larger counter) for these short sleep
      passes.  The result is a slightly shorter timeout period for these
      firmware messages than intended.  Fix it by using the proper counter.
      
      Fixes: 9751e8e7 ("bnxt_en: reduce timeout on initial HWRM calls")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      67681d02
    • Merge branch 'bpf-nfp-codegen-fixes' · 7d466e5f
      Daniel Borkmann authored
      Jiong Wang says:
      
      ====================
      Code-gen for BPF_ALU | BPF_XOR | BPF_K is wrong when imm is -1.
      Also, the high 32 bits of the 64-bit register should always be cleared.
      
      This set fixes both bugs.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      7d466e5f
    • nfp: bpf: fix ALU32 high bits clearance bug · f036ebd9
      Jiong Wang authored
      The NFP BPF JIT compiler performs a couple of small optimizations when
      jitting ALU imm instructions. Some of these optimizations can skip
      code-gen entirely, for example:
      
        A & -1 =  A
        A |  0 =  A
        A ^  0 =  A
      
      However, for ALU32, the high 32 bits of the 64-bit register should
      still be cleared according to ISA semantics.
      
      Fixes: cd7df56e ("nfp: add BPF to NFP code translator")
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      f036ebd9
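A small model in C may help show why skipping an identity operation is not safe for 32-bit ALU instructions. The two functions below are hypothetical and model only the register semantics stated in the commit message (a 32-bit operation clears the high 32 bits of the 64-bit register); they are not the NFP JIT code.

```c
#include <stdint.h>

/*
 * Hypothetical model of a 32-bit AND with an immediate on a 64-bit
 * register: operate on the low 32 bits, then zero-extend the result.
 */
static uint64_t alu32_and_imm(uint64_t dst, int32_t imm)
{
    uint32_t lo = (uint32_t)dst & (uint32_t)imm;

    return (uint64_t)lo;
}

/*
 * Model of the buggy shortcut: "A & -1 = A" is treated as a no-op,
 * so nothing is emitted and the high 32 bits survive unchanged.
 */
static uint64_t alu32_and_imm_skipped(uint64_t dst, int32_t imm)
{
    if (imm == -1)
        return dst;  /* wrong for ALU32: high bits not cleared */
    return alu32_and_imm(dst, imm);
}
```

With a register value of `0xdeadbeef00000001` and an immediate of -1, the first function yields `0x1` while the shortcut leaves the stale high half in place.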
    • nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K · 71c19024
      Jiong Wang authored
      The intended optimization should be A ^ 0 = A, not A ^ -1 = A.
      
      Fixes: cd7df56e ("nfp: add BPF to NFP code translator")
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      71c19024
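The identity in question is easy to check in plain C. The helper names below are hypothetical; the sketch only demonstrates that XOR with 0 leaves a value unchanged while XOR with -1 inverts every bit, so only the former can be treated as an identity.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit XOR with a signed immediate, as a plain C expression. */
static uint32_t xor32_imm(uint32_t a, int32_t imm)
{
    return a ^ (uint32_t)imm;
}

/*
 * Hypothetical predicate: true when "A ^ imm" is guaranteed to
 * equal A for every A, which holds only for imm == 0.
 */
static bool xor_imm_is_identity(int32_t imm)
{
    return imm == 0;
}
```

For example, `xor32_imm(0x0f, -1)` gives `0xfffffff0`, the bitwise complement, which is why treating -1 as the identity immediate produces wrong results.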
    • Merge tag 'mac80211-for-davem-2019-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 · ab01f251
      David S. Miller authored
      
      Johannes Berg says:
      
      ====================
      Three more fixes:
       * mac80211 mesh code wasn't allocating SKB tailroom properly
         in some cases
       * tx_sk_pacing_shift should be 7 for better performance
       * mac80211_hwsim wasn't propagating genlmsg_reply() errors
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab01f251
    • Documentation: networking: switchdev: Update port parent ID section · 80d79ad2
      Florian Fainelli authored
      Update the section about switchdev drivers having to implement a
      switchdev_port_attr_get() function to return
      SWITCHDEV_ATTR_ID_PORT_PARENT_ID since that is no longer valid after
      commit bccb3025 ("net: Get rid of
      SWITCHDEV_ATTR_ID_PORT_PARENT_ID").
      
      Fixes: bccb3025 ("net: Get rid of SWITCHDEV_ATTR_ID_PORT_PARENT_ID")
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      80d79ad2
    • Jann Horn's avatar
      net: socket: add check for negative optlen in compat setsockopt · 52baf987
      Jann Horn authored
      __sys_setsockopt() already checks for `optlen < 0`. Add an equivalent check
      to the compat path for robustness. This has to be `> INT_MAX` instead of
      `< 0` because the signedness of `optlen` is different here.
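      The two forms of the check can be compared in a small stand-alone
      sketch. It assumes, as the message implies, that the compat path
      receives the length as an unsigned value; the function names are
      hypothetical and this is not the kernel code.

```c
#include <assert.h>
#include <limits.h>

/* Native-path style: optlen is signed, so a negative length is visible. */
int optlen_valid_signed(int optlen)
{
    return optlen >= 0;
}

/* Compat-path style: optlen is unsigned, so a "negative" length shows up
 * as a value above INT_MAX and must be rejected with that comparison. */
int optlen_valid_unsigned(unsigned int optlen)
{
    return optlen <= INT_MAX;
}
```

      A caller passing -1 is rejected by both checks: as a signed int it is
      below zero, and as an unsigned int it becomes UINT_MAX, which is above
      INT_MAX.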
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52baf987
    • Paolo Abeni's avatar
      ipv6: route: purge exception on removal · f5b51fe8
      Paolo Abeni authored
      When a netdevice is unregistered, we flush the relevant exception
      via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().
      
      Finally, we end up calling rt6_remove_exception(), where we release
      the relevant dst, while we keep the references to the related fib6_info and
      dev. Such references should be released later, when the dst is
      destroyed.
      
      There are a number of caches that can keep the exception around for an
      unlimited amount of time - namely dst_cache, possibly even socket cache.
      As a result device unregistration may hang, as demonstrated by this script:
      
      ip netns add cl
      ip netns add rt
      ip netns add srv
      ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1
      
      ip link add name cl_veth type veth peer name cl_rt_veth
      ip link set dev cl_veth netns cl
      ip -n cl link set dev cl_veth up
      ip -n cl addr add dev cl_veth 2001::2/64
      ip -n cl route add default via 2001::1
      
      ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1 hoplimit 64 dev cl_veth
      ip -n cl link set tunv6 up
      ip -n cl addr add 2013::2/64 dev tunv6
      
      ip link set dev cl_rt_veth netns rt
      ip -n rt link set dev cl_rt_veth up
      ip -n rt addr add dev cl_rt_veth 2001::1/64
      
      ip link add name rt_srv_veth type veth peer name srv_veth
      ip link set dev srv_veth netns srv
      ip -n srv link set dev srv_veth up
      ip -n srv addr add dev srv_veth 2002::1/64
      ip -n srv route add default via 2002::2
      
      ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2 hoplimit 64 dev srv_veth
      ip -n srv link set tunv6 up
      ip -n srv addr add 2013::1/64 dev tunv6
      
      ip link set dev rt_srv_veth netns rt
      ip -n rt link set dev rt_srv_veth up
      ip -n rt addr add dev rt_srv_veth 2002::2/64
      
      ip netns exec srv netserver & sleep 0.1
      ip netns exec cl ping6 -c 4 2013::1
      ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
      ip -n rt link set dev rt_srv_veth mtu 1400
      wait %2
      
      ip -n cl link del cl_veth
      
      This commit addresses the issue by purging all the references held by the
      exception at removal time, as we currently do for e.g. ipv6 pcpu dst entries.
      
      v1 -> v2:
       - re-order the code to avoid accessing dst and net after dst_dev_put()
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f5b51fe8