1. 10 Nov, 2017 2 commits
    • Eric Dumazet's avatar
      tcp: gso: avoid refcount_t warning from tcp_gso_segment() · 7ec318fe
      Eric Dumazet authored
      When a GSO skb of truesize O is segmented into 2 new skbs of truesize N1
      and N2, we want to transfer socket ownership to the new fresh skbs.
      
      In order to avoid expensive atomic operations on a cache line subject to
      cache bouncing, we replace the sequence :
      
      refcount_add(N1, &sk->sk_wmem_alloc);
      refcount_add(N2, &sk->sk_wmem_alloc); // repeated by number of segments
      
      refcount_sub(O, &sk->sk_wmem_alloc);
      
      by a single
      
      refcount_add(sum_of(N) - O, &sk->sk_wmem_alloc);
      
      Problem is :
      
      In some pathological cases, sum(N) - O might be a negative number, and
      syzkaller bot was apparently able to trigger this trace [1]
      
      atomic_t was ok with this construct, but we need to take care of the
      negative delta with refcount_t
      
      [1]
      refcount_t: saturated; leaking memory.
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 8404 at lib/refcount.c:77 refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 0 PID: 8404 Comm: syz-executor2 Not tainted 4.14.0-rc5-mm1+ #20
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:16 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:52
       panic+0x1e4/0x41c kernel/panic.c:183
       __warn+0x1c4/0x1e0 kernel/panic.c:546
       report_bug+0x211/0x2d0 lib/bug.c:183
       fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177
       do_trap_no_signal arch/x86/kernel/traps.c:211 [inline]
       do_trap+0x260/0x390 arch/x86/kernel/traps.c:260
       do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297
       do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310
       invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
      RIP: 0010:refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
      RSP: 0018:ffff8801c606e3a0 EFLAGS: 00010282
      RAX: 0000000000000026 RBX: 0000000000001401 RCX: 0000000000000000
      RDX: 0000000000000026 RSI: ffffc900036fc000 RDI: ffffed0038c0dc68
      RBP: ffff8801c606e430 R08: 0000000000000001 R09: 0000000000000000
      R10: ffff8801d97f5eba R11: 0000000000000000 R12: ffff8801d5acf73c
      R13: 1ffff10038c0dc75 R14: 00000000ffffffff R15: 00000000fffff72f
       refcount_add+0x1b/0x60 lib/refcount.c:101
       tcp_gso_segment+0x10d0/0x16b0 net/ipv4/tcp_offload.c:155
       tcp4_gso_segment+0xd4/0x310 net/ipv4/tcp_offload.c:51
       inet_gso_segment+0x60c/0x11c0 net/ipv4/af_inet.c:1271
       skb_mac_gso_segment+0x33f/0x660 net/core/dev.c:2749
       __skb_gso_segment+0x35f/0x7f0 net/core/dev.c:2821
       skb_gso_segment include/linux/netdevice.h:3971 [inline]
       validate_xmit_skb+0x4ba/0xb20 net/core/dev.c:3074
       __dev_queue_xmit+0xe49/0x2070 net/core/dev.c:3497
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3538
       neigh_hh_output include/net/neighbour.h:471 [inline]
       neigh_output include/net/neighbour.h:479 [inline]
       ip_finish_output2+0xece/0x1460 net/ipv4/ip_output.c:229
       ip_finish_output+0x85e/0xd10 net/ipv4/ip_output.c:317
       NF_HOOK_COND include/linux/netfilter.h:238 [inline]
       ip_output+0x1cc/0x860 net/ipv4/ip_output.c:405
       dst_output include/net/dst.h:459 [inline]
       ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
       ip_queue_xmit+0x8c6/0x18e0 net/ipv4/ip_output.c:504
       tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1137
       tcp_write_xmit+0x663/0x4de0 net/ipv4/tcp_output.c:2341
       __tcp_push_pending_frames+0xa0/0x250 net/ipv4/tcp_output.c:2513
       tcp_push_pending_frames include/net/tcp.h:1722 [inline]
       tcp_data_snd_check net/ipv4/tcp_input.c:5050 [inline]
       tcp_rcv_established+0x8c7/0x18a0 net/ipv4/tcp_input.c:5497
       tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1460
       sk_backlog_rcv include/net/sock.h:909 [inline]
       __release_sock+0x124/0x360 net/core/sock.c:2264
       release_sock+0xa4/0x2a0 net/core/sock.c:2776
       tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462
       inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
       sock_sendmsg_nosec net/socket.c:632 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:642
       ___sys_sendmsg+0x31c/0x890 net/socket.c:2048
       __sys_sendmmsg+0x1e6/0x5f0 net/socket.c:2138
      
      Fixes: 14afee4b ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7ec318fe
    • Håkon Bugge's avatar
      rds: ib: Fix NULL pointer dereference in debug code · 1cb483a5
      Håkon Bugge authored
      rds_ib_recv_refill() is a function that refills an IB receive
      queue. It can be called from both the CQE handler (tasklet) and a
      worker thread.
      
      Just after the call to ib_post_recv(), a debug message is printed with
      rdsdebug():
      
                  ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
                  rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
                           recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
                           (long) ib_sg_dma_address(
                                  ic->i_cm_id->device,
                                  &recv->r_frag->f_sg),
                          ret);
      
      Now consider an invocation of rds_ib_recv_refill() from the worker
      thread, which is preemptible. Further, assume that the worker thread
      is preempted between the ib_post_recv() and rdsdebug() statements.
      
      Then, if the preemption is due to a receive CQE event, the
      rds_ib_recv_cqe_handler() will be invoked. This function processes
      receive completions, including freeing up data structures, such as the
      recv->r_frag.
      
      In this scenario, rds_ib_recv_cqe_handler() will process the receive
      WR posted above. That implies, that the recv->r_frag has been freed
      before the above rdsdebug() statement has been executed. When it is
      later executed, we will have a NULL pointer dereference:
      
      [ 4088.068008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
      [ 4088.076754] IP: rds_ib_recv_refill+0x87/0x620 [rds_rdma]
      [ 4088.082686] PGD 0 P4D 0
      [ 4088.085515] Oops: 0000 [#1] SMP
      [ 4088.089015] Modules linked in: rds_rdma(OE) rds(OE) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) mlx4_ib(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_core(E) binfmt_misc(E) sb_edac(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) crypto_simd(E) iTCO_wdt(E) glue_helper(E) iTCO_vendor_support(E) sg(E) cryptd(E) pcspkr(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) shpchp(E) ioatdma(E) i2c_i801(E) wmi(E) lpc_ich(E) mei_me(E) mei(E) mfd_core(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E) fscrypto(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
      [ 4088.168486]  fb_sys_fops(E) ahci(E) ixgbe(E) libahci(E) ttm(E) mdio(E) ptp(E) pps_core(E) drm(E) sd_mod(E) libata(E) crc32c_intel(E) mlx4_core(E) i2c_core(E) dca(E) megaraid_sas(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: rds]
      [ 4088.193442] CPU: 20 PID: 1244 Comm: kworker/20:2 Tainted: G           OE   4.14.0-rc7.master.20171105.ol7.x86_64 #1
      [ 4088.205097] Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
      [ 4088.216074] Workqueue: ib_cm cm_work_handler [ib_cm]
      [ 4088.221614] task: ffff885fa11d0000 task.stack: ffffc9000e598000
      [ 4088.228224] RIP: 0010:rds_ib_recv_refill+0x87/0x620 [rds_rdma]
      [ 4088.234736] RSP: 0018:ffffc9000e59bb68 EFLAGS: 00010286
      [ 4088.240568] RAX: 0000000000000000 RBX: ffffc9002115d050 RCX: ffffc9002115d050
      [ 4088.248535] RDX: ffffffffa0521380 RSI: ffffffffa0522158 RDI: ffffffffa0525580
      [ 4088.256498] RBP: ffffc9000e59bbf8 R08: 0000000000000005 R09: 0000000000000000
      [ 4088.264465] R10: 0000000000000339 R11: 0000000000000001 R12: 0000000000000000
      [ 4088.272433] R13: ffff885f8c9d8000 R14: ffffffff81a0a060 R15: ffff884676268000
      [ 4088.280397] FS:  0000000000000000(0000) GS:ffff885fbec80000(0000) knlGS:0000000000000000
      [ 4088.289434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 4088.295846] CR2: 0000000000000020 CR3: 0000000001e09005 CR4: 00000000001606e0
      [ 4088.303816] Call Trace:
      [ 4088.306557]  rds_ib_cm_connect_complete+0xe0/0x220 [rds_rdma]
      [ 4088.312982]  ? __dynamic_pr_debug+0x8c/0xb0
      [ 4088.317664]  ? __queue_work+0x142/0x3c0
      [ 4088.321944]  rds_rdma_cm_event_handler+0x19e/0x250 [rds_rdma]
      [ 4088.328370]  cma_ib_handler+0xcd/0x280 [rdma_cm]
      [ 4088.333522]  cm_process_work+0x25/0x120 [ib_cm]
      [ 4088.338580]  cm_work_handler+0xd6b/0x17aa [ib_cm]
      [ 4088.343832]  process_one_work+0x149/0x360
      [ 4088.348307]  worker_thread+0x4d/0x3e0
      [ 4088.352397]  kthread+0x109/0x140
      [ 4088.355996]  ? rescuer_thread+0x380/0x380
      [ 4088.360467]  ? kthread_park+0x60/0x60
      [ 4088.364563]  ret_from_fork+0x25/0x30
      [ 4088.368548] Code: 48 89 45 90 48 89 45 98 eb 4d 0f 1f 44 00 00 48 8b 43 08 48 89 d9 48 c7 c2 80 13 52 a0 48 c7 c6 58 21 52 a0 48 c7 c7 80 55 52 a0 <4c> 8b 48 20 44 89 64 24 08 48 8b 40 30 49 83 e1 fc 48 89 04 24
      [ 4088.389612] RIP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] RSP: ffffc9000e59bb68
      [ 4088.397772] CR2: 0000000000000020
      [ 4088.401505] ---[ end trace fe922e6ccf004431 ]---
      
      This bug was provoked by compiling rds out-of-tree with
      EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
      between the rdsdebug() and ib_ib_port_recv() statements:
      
         	       /* XXX when can this fail? */
      	       ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
      +		if (can_wait)
      +			usleep_range(1000, 5000);
      	       rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
      			recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
      			(long) ib_sg_dma_address(
      
      The fix is simply to move the rdsdebug() statement up before the
      ib_post_recv() and remove the printing of ret, which is taken care of
      anyway by the non-debug code.
      Signed-off-by: default avatarHåkon Bugge <haakon.bugge@oracle.com>
      Reviewed-by: default avatarKnut Omang <knut.omang@oracle.com>
      Reviewed-by: default avatarWei Lin Guay <wei.lin.guay@oracle.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1cb483a5
  2. 09 Nov, 2017 21 commits
    • Linus Torvalds's avatar
      Merge tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 3fefc318
      Linus Torvalds authored
      Pull final power management fixes from Rafael Wysocki:
       "These fix a regression in the schedutil cpufreq governor introduced by
        a recent change and blacklist Dell XPS13 9360 from using the Low Power
        S0 Idle _DSM interface which triggers serious problems on one of these
        machines.
      
        Specifics:
      
         - Prevent the schedutil cpufreq governor from using the utilization
           of a wrong CPU in some cases which started to happen after one of
           the recent changes in it (Chris Redpath).
      
         - Blacklist Dell XPS13 9360 from using the Low Power S0 Idle _DSM
           interface as that causes serious issue (related to NVMe) to appear
           on one of these machines, even though the other Dells XPS13 9360 in
           somewhat different HW configurations behave correctly (Rafael
           Wysocki)"
      
      * tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / PM: Blacklist Low Power S0 Idle _DSM for Dell XPS13 9360
        cpufreq: schedutil: Examine the correct CPU when we update util
      3fefc318
    • Linus Torvalds's avatar
      Merge tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · d93d4ce1
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "The amount of the changes isn't as quite small as wished, nevertheless
        they are straight fixes that deserve merging to 4.14 final.
      
        Most of fixes are about ALSA core bugs spotted by fuzzer: a follow-up
        fix for the previous nested rwsem patch, a fix to avoid the resource
        hogs due to too many concurrent ALSA timer invocations, and a fix for
        a crash with SYSEX MIDI transfer over OSS sequencer emulation that is
        used by none but fuzzer.
      
        The rest are usual HD-audio and USB-audio device-specific quirks,
        which are safe to apply"
      
      * tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - fix headset mic problem for Dell machines with alc274
        ALSA: seq: Fix OSS sysex delivery in OSS emulation
        ALSA: seq: Avoid invalid lockdep class warning
        ALSA: timer: Limit max instances per timer
        ALSA: usb-audio: support new Amanero Combo384 firmware version
      d93d4ce1
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · d1041cdc
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix use-after-free in IPSEC input parsing, desintation address
          pointer was loaded before pskb_may_pull() which can change the SKB
          data pointers. From Florian Westphal.
      
       2) Stack out-of-bounds read in xfrm_state_find(), from Steffen
          Klassert.
      
       3) IPVS state of SKB is not properly reset when moving between
          namespaces, from Ye Yin.
      
       4) Fix crash in asix driver suspend and resume, from Andrey Konovalov.
      
       5) Don't deliver ipv6 l2tp tunnel packets to ipv4 l2tp tunnels, and
          vice versa, from Guillaume Nault.
      
       6) Fix DSACK undo on non-dup ACKs, from Priyaranjan Jha.
      
       7) Fix regression in bond_xmit_hash()'s behavior after the TCP port
          selection changes back in 4.2, from Hangbin Liu.
      
       8) Two divide by zero bugs in USB networking drivers when parsing
          descriptors, from Bjorn Mork.
      
       9) Fix bonding slaves being stuck in BOND_LINK_FAIL state, from Jay
          Vosburgh.
      
      10) Missing skb_reset_mac_header() in qmi_wwan, from Kristian Evensen.
      
      11) Fix the destruction of tc action object races properly, from Cong
          Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
        cls_u32: use tcf_exts_get_net() before call_rcu()
        cls_tcindex: use tcf_exts_get_net() before call_rcu()
        cls_rsvp: use tcf_exts_get_net() before call_rcu()
        cls_route: use tcf_exts_get_net() before call_rcu()
        cls_matchall: use tcf_exts_get_net() before call_rcu()
        cls_fw: use tcf_exts_get_net() before call_rcu()
        cls_flower: use tcf_exts_get_net() before call_rcu()
        cls_flow: use tcf_exts_get_net() before call_rcu()
        cls_cgroup: use tcf_exts_get_net() before call_rcu()
        cls_bpf: use tcf_exts_get_net() before call_rcu()
        cls_basic: use tcf_exts_get_net() before call_rcu()
        net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
        Revert "net_sched: hold netns refcnt for each action"
        net: usb: asix: fill null-ptr-deref in asix_suspend
        Revert "net: usb: asix: fill null-ptr-deref in asix_suspend"
        qmi_wwan: Add missing skb_reset_mac_header-call
        bonding: fix slave stuck in BOND_LINK_FAIL state
        qrtr: Move to postcore_initcall
        net: qmi_wwan: fix divide by 0 on bad descriptors
        net: cdc_ether: fix divide by 0 on bad descriptors
        ...
      d1041cdc
    • Hui Wang's avatar
      ALSA: hda - fix headset mic problem for Dell machines with alc274 · 75ee94b2
      Hui Wang authored
      Confirmed with Kailang of Realtek, the pin 0x19 is for Headset Mic, and
      the pin 0x1a is for Headphone Mic, he suggested to apply
      ALC269_FIXUP_DELL1_MIC_NO_PRESENCE to fix this problem. And we
      verified applying this FIXUP can fix this problem.
      
      Cc: <stable@vger.kernel.org>
      Cc: Kailang Yang <kailang@realtek.com>
      Signed-off-by: default avatarHui Wang <hui.wang@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      75ee94b2
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 6a172802
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2017-11-09
      
      1) Fix a use after free due to a reallocated skb head.
         From Florian Westphal.
      
      2) Fix sporadic lookup failures on labeled IPSEC.
         From Florian Westphal.
      
      3) Fix a stack out of bounds when a socket policy is applied
         to an IPv6 socket that sends IPv4 packets.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6a172802
    • David S. Miller's avatar
      Merge branch 'net-sched-race-fix' · 623859ae
      David S. Miller authored
      Cong Wang says:
      
      ====================
      net_sched: close the race between call_rcu() and cleanup_net()
      
      This patchset tries to fix the race between call_rcu() and
      cleanup_net() again. Without holding the netns refcnt the
      tc_action_net_exit() in netns workqueue could be called before
      filter destroy works in tc filter workqueue. This patchset
      moves the netns refcnt from tc actions to tcf_exts, without
      breaking per-netns tc actions.
      
      Patch 1 reverts the previous fix, patch 2 introduces two new
      API's to help to address the bug and the rest patches switch
      to the new API's. Please see each patch for details.
      
      I was not able to reproduce this bug, but now after adding
      some delay in filter destroy work I manage to trigger the
      crash. After this patchset, the crash is not reproducible
      any more and the debugging printk's show the order is expected
      too.
      ====================
      
      Fixes: ddf97ccd ("net_sched: add network namespace support for tc actions")
      Reported-by: default avatarLucas Bates <lucasb@mojatatu.com>
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      623859ae
    • Cong Wang's avatar
      cls_u32: use tcf_exts_get_net() before call_rcu() · 35c55fc1
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      35c55fc1
    • Cong Wang's avatar
      cls_tcindex: use tcf_exts_get_net() before call_rcu() · f2b75105
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f2b75105
    • Cong Wang's avatar
      cls_rsvp: use tcf_exts_get_net() before call_rcu() · 96585063
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      96585063
    • Cong Wang's avatar
      cls_route: use tcf_exts_get_net() before call_rcu() · 3fd51de5
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3fd51de5
    • Cong Wang's avatar
      cls_matchall: use tcf_exts_get_net() before call_rcu() · 57767e78
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      57767e78
    • Cong Wang's avatar
      cls_fw: use tcf_exts_get_net() before call_rcu() · d5f984f5
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d5f984f5
    • Cong Wang's avatar
      cls_flower: use tcf_exts_get_net() before call_rcu() · 0dadc117
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0dadc117
    • Cong Wang's avatar
      cls_flow: use tcf_exts_get_net() before call_rcu() · 22f7cec9
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      22f7cec9
    • Cong Wang's avatar
      cls_cgroup: use tcf_exts_get_net() before call_rcu() · ed148168
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ed148168
    • Cong Wang's avatar
      cls_bpf: use tcf_exts_get_net() before call_rcu() · aae2c35e
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aae2c35e
    • Cong Wang's avatar
      cls_basic: use tcf_exts_get_net() before call_rcu() · 0b2a5989
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0b2a5989
    • Cong Wang's avatar
      net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net() · e4b95c41
      Cong Wang authored
      Instead of holding netns refcnt in tc actions, we can minimize
      the holding time by saving it in struct tcf_exts instead. This
      means we can just hold netns refcnt right before call_rcu() and
      release it after tcf_exts_destroy() is done.
      
      However, because on netns cleanup path we call tcf_proto_destroy()
      too, obviously we can not hold netns for a zero refcnt, in this
      case we have to do cleanup synchronously. It is fine for RCU too,
      the caller cleanup_net() already waits for a grace period.
      
      For other cases, refcnt is non-zero and we can safely grab it as
      normal and release it after we are done.
      
      This patch provides two new API for each filter to use:
      tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can
      use the following pattern:
      
      void __destroy_filter() {
        tcf_exts_destroy();
        tcf_exts_put_net();  // <== release netns refcnt
        kfree();
      }
      void some_work() {
        rtnl_lock();
        __destroy_filter();
        rtnl_unlock();
      }
      void some_rcu_callback() {
        tcf_queue_work(some_work);
      }
      
      if (tcf_exts_get_net())  // <== hold netns refcnt
        call_rcu(some_rcu_callback);
      else
        __destroy_filter();
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e4b95c41
    • Cong Wang's avatar
      Revert "net_sched: hold netns refcnt for each action" · c7e460ce
      Cong Wang authored
      This reverts commit ceffcc5e.
      If we hold that refcnt, the netns can never be destroyed until
      all actions are destroyed by user, this breaks our netns design
      which we expect all actions are destroyed when we destroy the
      whole netns.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c7e460ce
    • Andrey Konovalov's avatar
      net: usb: asix: fill null-ptr-deref in asix_suspend · 8f562462
      Andrey Konovalov authored
      When asix_suspend() is called dev->driver_priv might not have been
      assigned a value, so we need to check that it's not NULL.
      
      Similar issue is present in asix_resume(), this patch fixes it as well.
      
      Found by syzkaller.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      Modules linked in:
      CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: usb_hub_wq hub_event
      task: ffff88006bb36300 task.stack: ffff88006bba8000
      RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
      RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
      RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
      RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
      R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
      FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
      Call Trace:
       usb_suspend_interface drivers/usb/core/driver.c:1209
       usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
       usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
       __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
       rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
       rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
       __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
       pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
       usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
       hub_port_connect drivers/usb/core/hub.c:4903
       hub_port_connect_change drivers/usb/core/hub.c:5009
       port_event drivers/usb/core/hub.c:5115
       hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
       process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
       worker_thread+0x221/0x1850 kernel/workqueue.c:2253
       kthread+0x3a1/0x470 kernel/kthread.c:231
       ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
      Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
      00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
      3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
      RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
      ---[ end trace dfc4f5649284342c ]---
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8f562462
    • David S. Miller's avatar
      Revert "net: usb: asix: fill null-ptr-deref in asix_suspend" · 1a8e6b48
      David S. Miller authored
      This reverts commit baedf68a.
      
      There is an updated version of this fix which covers
      the problem more thoroughly.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1a8e6b48
  3. 08 Nov, 2017 13 commits
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-cpufreq-sched' · e029b9bf
      Rafael J. Wysocki authored
      * pm-cpufreq-sched:
        cpufreq: schedutil: Examine the correct CPU when we update util
      e029b9bf
    • Jiri Kosina's avatar
      x86/mm: Unbreak modules that rely on external PAGE_KERNEL availability · 87df2617
      Jiri Kosina authored
      Commit 7744ccdb ("x86/mm: Add Secure Memory Encryption (SME)
      support") as a side-effect made PAGE_KERNEL all of a sudden unavailable
      to modules which can't make use of EXPORT_SYMBOL_GPL() symbols.
      
      This is because once SME is enabled, sme_me_mask (which is introduced as
      EXPORT_SYMBOL_GPL) makes its way to PAGE_KERNEL through _PAGE_ENC,
      causing imminent build failure for all the modules which make use of all
      the EXPORT-SYMBOL()-exported API (such as vmap(), __vmalloc(),
      remap_pfn_range(), ...).
      
      Exporting (as EXPORT_SYMBOL()) interfaces (and having done so for ages)
      that take pgprot_t argument, while making it impossible to -- all of a
      sudden -- pass PAGE_KERNEL to it, feels rather incosistent.
      
      Restore the original behavior and make it possible to pass PAGE_KERNEL
      to all its EXPORT_SYMBOL() consumers.
      
      [ This is all so not wonderful. We shouldn't need that "sme_me_mask"
        access at all in all those places that really don't care about that
        level of detail, and just want _PAGE_KERNEL or whatever.
      
        We have some similar issues with _PAGE_CACHE_WP and _PAGE_NOCACHE,
        both of which hide a "cachemode2protval()" call, and which also ends
        up using another EXPORT_SYMBOL(), but at least that only triggers for
        the much more rare cases.
      
        Maybe we could move these dynamic page table bits to be generated much
        deeper down in the VM layer, instead of hiding them in the macros that
        everybody uses.
      
        So this all would merit some cleanup. But not today.   - Linus ]
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Despised-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      87df2617
    • Linus Torvalds's avatar
      Merge branch 'fixes-v4.14-rc8' of... · d6a2cf07
      Linus Torvalds authored
      Merge branch 'fixes-v4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
      
      Pull key handling fix from James Morris:
       "Fix by Eric Biggers for the keys subsystem"
      
      * 'fixes-v4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2]
      d6a2cf07
    • John Johansen's avatar
      apparmor: fix off-by-one comparison on MAXMAPPED_SIG · f7dc4c9a
      John Johansen authored
      This came in yesterday, and I have verified our regression tests
      were missing this and it can cause an oops. Please apply.
      
      There is a an off-by-one comparision on sig against MAXMAPPED_SIG
      that can lead to a read outside the sig_map array if sig
      is MAXMAPPED_SIG. Fix this.
      
      Verified that the check is an out of bounds case that can cause an oops.
      
      Revised: add comparison fix to second case
      Fixes: cd1dbf76 ("apparmor: add the ability to mediate signals")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarJohn Johansen <john.johansen@canonical.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f7dc4c9a
    • Eric Biggers's avatar
      KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2] · 624f5ab8
      Eric Biggers authored
      syzkaller reported a NULL pointer dereference in asn1_ber_decoder().  It
      can be reproduced by the following command, assuming
      CONFIG_PKCS7_TEST_KEY=y:
      
              keyctl add pkcs7_test desc '' @s
      
      The bug is that if the data buffer is empty, an integer underflow occurs
      in the following check:
      
              if (unlikely(dp >= datalen - 1))
                      goto data_overrun_error;
      
      This results in the NULL data pointer being dereferenced.
      
      Fix it by checking for 'datalen - dp < 2' instead.
      
      Also fix the similar check for 'dp >= datalen - n' later in the same
      function.  That one possibly could result in a buffer overread.
      
      The NULL pointer dereference was reproducible using the "pkcs7_test" key
      type but not the "asymmetric" key type because the "asymmetric" key type
      checks for a 0-length payload before calling into the ASN.1 decoder but
      the "pkcs7_test" key type does not.
      
      The bug report was:
      
          BUG: unable to handle kernel NULL pointer dereference at           (null)
          IP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
          PGD 7b708067 P4D 7b708067 PUD 7b6ee067 PMD 0
          Oops: 0000 [#1] SMP
          Modules linked in:
          CPU: 0 PID: 522 Comm: syz-executor1 Not tainted 4.14.0-rc8 #7
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
          task: ffff9b6b3798c040 task.stack: ffff9b6b37970000
          RIP: 0010:asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
          RSP: 0018:ffff9b6b37973c78 EFLAGS: 00010216
          RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000021c
          RDX: ffffffff814a04ed RSI: ffffb1524066e000 RDI: ffffffff910759e0
          RBP: ffff9b6b37973d60 R08: 0000000000000001 R09: ffff9b6b3caa4180
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
          R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
          FS:  00007f10ed1f2700(0000) GS:ffff9b6b3ea00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000000000 CR3: 000000007b6f3000 CR4: 00000000000006f0
          Call Trace:
           pkcs7_parse_message+0xee/0x240 crypto/asymmetric_keys/pkcs7_parser.c:139
           verify_pkcs7_signature+0x33/0x180 certs/system_keyring.c:216
           pkcs7_preparse+0x41/0x70 crypto/asymmetric_keys/pkcs7_key_type.c:63
           key_create_or_update+0x180/0x530 security/keys/key.c:855
           SYSC_add_key security/keys/keyctl.c:122 [inline]
           SyS_add_key+0xbf/0x250 security/keys/keyctl.c:62
           entry_SYSCALL_64_fastpath+0x1f/0xbe
          RIP: 0033:0x4585c9
          RSP: 002b:00007f10ed1f1bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000f8
          RAX: ffffffffffffffda RBX: 00007f10ed1f2700 RCX: 00000000004585c9
          RDX: 0000000020000000 RSI: 0000000020008ffb RDI: 0000000020008000
          RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
          R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff1b2260ae
          R13: 00007fff1b2260af R14: 00007f10ed1f2700 R15: 0000000000000000
          Code: dd ca ff 48 8b 45 88 48 83 e8 01 4c 39 f0 0f 86 a8 07 00 00 e8 53 dd ca ff 49 8d 46 01 48 89 85 58 ff ff ff 48 8b 85 60 ff ff ff <42> 0f b6 0c 30 89 c8 88 8d 75 ff ff ff 83 e0 1f 89 8d 28 ff ff
          RIP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: ffff9b6b37973c78
          CR2: 0000000000000000
      
      Fixes: 42d5ec27 ("X.509: Add an ASN.1 decoder")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: <stable@vger.kernel.org> # v3.7+
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarJames Morris <james.l.morris@oracle.com>
      624f5ab8
    • Kristian Evensen's avatar
      qmi_wwan: Add missing skb_reset_mac_header-call · 0de0add1
      Kristian Evensen authored
      When we receive a packet on a QMI device in raw IP mode, we should call
      skb_reset_mac_header() to ensure that skb->mac_header contains a valid
      offset in the packet. While it shouldn't really matter, the packets have
      no MAC header and the interface is configured as-such, it seems certain
      parts of the network stack expects a "good" value in skb->mac_header.
      
      Without the skb_reset_mac_header() call added in this patch, for example
      shaping traffic (using tc) triggers the following oops on the first
      received packet:
      
      [  303.642957] skbuff: skb_under_panic: text:8f137918 len:177 put:67 head:8e4b0f00 data:8e4b0eff tail:0x8e4b0fb0 end:0x8e4b1520 dev:wwan0
      [  303.655045] Kernel bug detected[#1]:
      [  303.658622] CPU: 1 PID: 1002 Comm: logd Not tainted 4.9.58 #0
      [  303.664339] task: 8fdf05e0 task.stack: 8f15c000
      [  303.668844] $ 0   : 00000000 00000001 0000007a 00000000
      [  303.674062] $ 4   : 8149a2fc 8149a2fc 8149ce20 00000000
      [  303.679284] $ 8   : 00000030 3878303a 31623465 20303235
      [  303.684510] $12   : ded731e3 2626a277 00000000 03bd0000
      [  303.689747] $16   : 8ef62b40 00000043 8f137918 804db5fc
      [  303.694978] $20   : 00000001 00000004 8fc13800 00000003
      [  303.700215] $24   : 00000001 8024ab10
      [  303.705442] $28   : 8f15c000 8fc19cf0 00000043 802cc920
      [  303.710664] Hi    : 00000000
      [  303.713533] Lo    : 74e58000
      [  303.716436] epc   : 802cc920 skb_panic+0x58/0x5c
      [  303.721046] ra    : 802cc920 skb_panic+0x58/0x5c
      [  303.725639] Status: 11007c03 KERNEL EXL IE
      [  303.729823] Cause : 50800024 (ExcCode 09)
      [  303.733817] PrId  : 0001992f (MIPS 1004Kc)
      [  303.737892] Modules linked in: rt2800pci rt2800mmio rt2800lib qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 mt76x2i
      Process logd (pid: 1002, threadinfo=8f15c000, task=8fdf05e0, tls=77b3eee4)
      [  303.962509] Stack : 00000000 80408990 8f137918 000000b1 00000043 8e4b0f00 8e4b0eff 8e4b0fb0
      [  303.970871]         8e4b1520 8fec1800 00000043 802cd2a4 6e000045 00000043 00000000 8ef62000
      [  303.979219]         8eef5d00 8ef62b40 8fea7300 8f137918 00000000 00000000 0002bb01 793e5664
      [  303.987568]         8ef08884 00000001 8fea7300 00000002 8fc19e80 8eef5d00 00000006 00000003
      [  303.995934]         00000000 8030ba90 00000003 77ab3fd0 8149dc80 8004d1bc 8f15c000 8f383700
      [  304.004324]         ...
      [  304.006767] Call Trace:
      [  304.009241] [<802cc920>] skb_panic+0x58/0x5c
      [  304.013504] [<802cd2a4>] skb_push+0x78/0x90
      [  304.017783] [<8f137918>] 0x8f137918
      [  304.021269] Code: 00602825  0c02a3b4  24842888 <000c000d> 8c870060  8c8200a0  0007382b  00070336  8c88005c
      [  304.031034]
      [  304.032805] ---[ end trace b778c482b3f0bda9 ]---
      [  304.041384] Kernel panic - not syncing: Fatal exception in interrupt
      [  304.051975] Rebooting in 3 seconds..
      
      While the oops is for a 4.9-kernel, I was able to trigger the same oops with
      net-next as of yesterday.
      
      Fixes: 32f7adf6 ("net: qmi_wwan: support "raw IP" mode")
      Signed-off-by: default avatarKristian Evensen <kristian.evensen@gmail.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0de0add1
    • Jay Vosburgh's avatar
      bonding: fix slave stuck in BOND_LINK_FAIL state · 055db695
      Jay Vosburgh authored
      The bonding miimon logic has a flaw, in that a failure of the
      rtnl_trylock can cause a slave to become permanently stuck in
      BOND_LINK_FAIL state.
      
      	The sequence of events to cause this is as follows:
      
      	1) bond_miimon_inspect finds that a slave's link is down, and so
      calls bond_propose_link_state, setting slave->new_link_state to
      BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns
      non-zero.
      
      	2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is
      rescheduled.  No change is committed.
      
      	3) bond_miimon_inspect is called again, but this time the slave
      from step 1 has recovered.  slave->new_link is reset to NOCHANGE, and, as
      slave->link was never changed, the switch enters the BOND_LINK_UP case,
      and does nothing.  The pending BOND_LINK_FAIL state from step 1 remains
      pending, as new_link_state is not reset.
      
      	4) The state from step 3 persists until another slave changes link
      state and causes bond_miimon_inspect to return non-zero.  At this point,
      the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed,
      and the slave will remain stuck in BOND_LINK_FAIL state even though it
      is actually link up.
      
      	The remedy for this is to initialize new_link_state on each entry
      to bond_miimon_inspect, as is already done with new_link.
      
      Fixes: fb9eb899 ("bonding: handle link transition from FAIL to UP correctly")
      Reported-by: default avatarAlex Sidorenko <alexandre.sidorenko@hpe.com>
      Reviewed-by: default avatarJarod Wilson <jarod@redhat.com>
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Acked-by: default avatarMahesh Bandewar <maheshb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      055db695
    • Bjorn Andersson's avatar
      qrtr: Move to postcore_initcall · b7e732fa
      Bjorn Andersson authored
      Registering qrtr with module_init makes the ability of typical platform
      code to create AF_QIPCRTR socket during probe a matter of link order
      luck. Moving qrtr to postcore_initcall() avoids this.
      Signed-off-by: default avatarBjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b7e732fa
    • Bjørn Mork's avatar
      net: qmi_wwan: fix divide by 0 on bad descriptors · 7fd07833
      Bjørn Mork authored
      A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will
      cause a divide error in usbnet_probe:
      
      divide error: 0000 [#1] PREEMPT SMP KASAN
      Modules linked in:
      CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: usb_hub_wq hub_event
      task: ffff88006bef5c00 task.stack: ffff88006bf60000
      RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
      RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
      RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
      RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
      RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
      R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
      R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
      Call Trace:
       usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
       qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
       usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
       really_probe drivers/base/dd.c:413
       driver_probe_device+0x522/0x740 drivers/base/dd.c:557
      
      Fix by simply ignoring the bogus descriptor, as it is optional
      for QMI devices anyway.
      
      Fixes: 423ce8ca ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7fd07833
    • Bjørn Mork's avatar
      net: cdc_ether: fix divide by 0 on bad descriptors · 2cb80187
      Bjørn Mork authored
      Setting dev->hard_mtu to 0 will cause a divide error in
      usbnet_probe. Protect against devices with bogus CDC Ethernet
      functional descriptors by ignoring a zero wMaxSegmentSize.
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Acked-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2cb80187
    • Hangbin Liu's avatar
      bonding: discard lowest hash bit for 802.3ad layer3+4 · b5f86218
      Hangbin Liu authored
      After commit 07f4c900 ("tcp/dccp: try to not exhaust ip_local_port_range
      in connect()"), we will try to use even ports for connect(). Then if an
      application (seen clearly with iperf) opens multiple streams to the same
      destination IP and port, each stream will be given an even source port.
      
      So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing
      will always hash all these streams to the same interface. And the total
      throughput will limited to a single slave.
      
      Change the tcp code will impact the whole tcp behavior, only for bonding
      usage. Paolo Abeni suggested fix this by changing the bonding code only,
      which should be more reasonable, and less impact.
      
      Fix this by discarding the lowest hash bit because it contains little entropy.
      After the fix we can re-balance between slaves.
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b5f86218
    • Gustavo A. R. Silva's avatar
      net/mlx5e/core/en_fs: fix pointer dereference after free in mlx5e_execute_l2_action · 39a4b86f
      Gustavo A. R. Silva authored
      hn is being kfree'd in mlx5e_del_l2_from_hash and then dereferenced
      by accessing hn->ai.addr
      
      Fix this by copying the MAC address into a local variable for its safe use
      in all possible execution paths within function mlx5e_execute_l2_action.
      
      Addresses-Coverity-ID: 1417789
      Fixes: eeb66cdb ("net/mlx5: Separate between E-Switch and MPFS")
      Signed-off-by: default avatarGustavo A. R. Silva <garsilva@embeddedor.com>
      Acked-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      39a4b86f
    • Marc Zyngier's avatar
      net: mvpp2: Prevent userspace from changing TX affinities · 13c249a9
      Marc Zyngier authored
      The mvpp2 driver can't cope at all with the TX affinities being
      changed from userspace, and spit an endless stream of
      
      [   91.779920] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.779930] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.780402] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.780406] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.780415] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.780418] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      
      rendering the box completely useless (I've measured around 600k
      interrupts/s on a 8040 box) once irqbalance kicks in and start
      doing its job.
      
      Obviously, the driver was never designed with this in mind. So let's
      work around the problem by preventing userspace from interacting
      with these interrupts altogether.
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      13c249a9
  4. 07 Nov, 2017 3 commits
    • Borislav Petkov's avatar
      drivers/ide-cd: Handle missing driver data during status check gracefully · fbc3edf7
      Borislav Petkov authored
      The 0day bot reports the below failure which happens occasionally, with
      their randconfig testing (once every ~100 boots).  The Code points at
      the private pointer ->driver_data being NULL, which hints at a race of
      sorts where the private driver_data descriptor has disappeared by the
      time we get to run the workqueue.
      
      So let's check that pointer before we continue with issuing the command
      to the drive.
      
      This fix is of the brown paper bag nature but considering that IDE is
      long deprecated, let's do that so that random testing which happens to
      enable CONFIG_IDE during randconfig builds, doesn't fail because of
      this.
      
      Besides, failing the TEST_UNIT_READY command because the drive private
      data is gone is something which we could simply do anyway, to denote
      that there was a problem communicating with the device.
      
        BUG: unable to handle kernel NULL pointer dereference at 000001c0
        IP: cdrom_check_status
        *pde = 00000000
        Oops: 0000 [#1] SMP
        CPU: 1 PID: 155 Comm: kworker/1:2 Not tainted 4.14.0-rc8 #127
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
        Workqueue: events_freezable_power_ disk_events_workfn
        task: 4fe90980 task.stack: 507ac000
        EIP: cdrom_check_status+0x2c/0x90
        EFLAGS: 00210246 CPU: 1
        EAX: 00000000 EBX: 4fefec00 ECX: 00000000 EDX: 00000000
        ESI: 00000003 EDI: ffffffff EBP: 467a9340 ESP: 507aded0
         DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
        CR0: 80050033 CR2: 000001c0 CR3: 06e0f000 CR4: 00000690
        Call Trace:
         ? ide_cdrom_check_events_real
         ? cdrom_check_events
         ? disk_check_events
         ? process_one_work
         ? process_one_work
         ? worker_thread
         ? kthread
         ? process_one_work
         ? __kthread_create_on_node
         ? ret_from_fork
        Code: 53 83 ec 14 89 c3 89 d1 be 03 00 00 00 65 a1 14 00 00 00 89 44 24 10 31 c0 8b 43 18 c7 44 24 04 00 00 00 00 c7 04 24 00 00 00 00 <8a> 80 c0 01 00 00 c7 44 24 08 00 00 00 00 83 e0 03 c7 44 24 0c
        EIP: cdrom_check_status+0x2c/0x90 SS:ESP: 0068:507aded0
        CR2: 00000000000001c0
        ---[ end trace 2410e586dd8f88b2 ]---
      Reported-and-tested-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fbc3edf7
    • Linus Torvalds's avatar
      Revert "scsi: make 'state' device attribute pollable" · a817e73f
      Linus Torvalds authored
      This reverts commit 8a97712e.
      
      This commit added a call to sysfs_notify() from within
      scsi_device_set_state(), which in turn turns out to make libata very
      unhappy, because ata_eh_detach_dev() does
      
              spin_lock_irqsave(ap->lock, flags);
              ..
              if (ata_scsi_offline_dev(dev)) {
                      dev->flags |= ATA_DFLAG_DETACHED;
                      ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
              }
      
      and ata_scsi_offline_dev() then does that scsi_device_set_state() to set
      it offline.
      
      So now we called sysfs_notify() from within a spinlocked region, which
      really doesn't work.  The 0day robot reported this as:
      
         BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
      
      because sysfs_notify() ends up calling kernfs_find_and_get_ns() which
      then does mutex_lock(&kernfs_mutex)..
      
      The pollability of the device state isn't critical, so revert this all
      for now, and maybe we'll do it differently in the future.
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Acked-by: default avatarHannes Reinecke <hare@suse.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a817e73f
    • Takashi Iwai's avatar
      ALSA: seq: Fix OSS sysex delivery in OSS emulation · 132d358b
      Takashi Iwai authored
      The SYSEX event delivery in OSS sequencer emulation assumed that the
      event is encoded in the variable-length data with the straight
      buffering.  This was the normal behavior in the past, but during the
      development, the chained buffers were introduced for carrying more
      data, while the OSS code was left intact.  As a result, when a SYSEX
      event with the chained buffer data is passed to OSS sequencer port,
      it may end up with the wrong memory access, as if it were having a too
      large buffer.
      
      This patch addresses the bug, by applying the buffer data expansion by
      the generic snd_seq_dump_var_event() helper function.
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Reported-by: default avatarMark Salyzyn <salyzyn@android.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      132d358b
  5. 06 Nov, 2017 1 commit