1. 09 Nov, 2017 5 commits
    • Cong Wang's avatar
      cls_basic: use tcf_exts_get_net() before call_rcu() · 0b2a5989
      Cong Wang authored
      Hold netns refcnt before call_rcu() and release it after
      the tcf_exts_destroy() is done.
      
      Note, on ->destroy() path we have to respect the return value
      of tcf_exts_get_net(), on other paths it should always return
      true, so we don't need to care.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0b2a5989
    • Cong Wang's avatar
      net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net() · e4b95c41
      Cong Wang authored
      Instead of holding netns refcnt in tc actions, we can minimize
      the holding time by saving it in struct tcf_exts instead. This
      means we can just hold netns refcnt right before call_rcu() and
      release it after tcf_exts_destroy() is done.
      
      However, because on netns cleanup path we call tcf_proto_destroy()
      too, obviously we can not hold netns for a zero refcnt, in this
      case we have to do cleanup synchronously. It is fine for RCU too,
      the caller cleanup_net() already waits for a grace period.
      
      For other cases, refcnt is non-zero and we can safely grab it as
      normal and release it after we are done.
      
      This patch provides two new API for each filter to use:
      tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can
      use the following pattern:
      
      void __destroy_filter() {
        tcf_exts_destroy();
        tcf_exts_put_net();  // <== release netns refcnt
        kfree();
      }
      void some_work() {
        rtnl_lock();
        __destroy_filter();
        rtnl_unlock();
      }
      void some_rcu_callback() {
        tcf_queue_work(some_work);
      }
      
      if (tcf_exts_get_net())  // <== hold netns refcnt
        call_rcu(some_rcu_callback);
      else
        __destroy_filter();
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e4b95c41
    • Cong Wang's avatar
      Revert "net_sched: hold netns refcnt for each action" · c7e460ce
      Cong Wang authored
      This reverts commit ceffcc5e.
      If we hold that refcnt, the netns can never be destroyed until
      all actions are destroyed by user, this breaks our netns design
      which we expect all actions are destroyed when we destroy the
      whole netns.
      
      Cc: Lucas Bates <lucasb@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c7e460ce
    • Andrey Konovalov's avatar
      net: usb: asix: fill null-ptr-deref in asix_suspend · 8f562462
      Andrey Konovalov authored
      When asix_suspend() is called dev->driver_priv might not have been
      assigned a value, so we need to check that it's not NULL.
      
      Similar issue is present in asix_resume(), this patch fixes it as well.
      
      Found by syzkaller.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      Modules linked in:
      CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: usb_hub_wq hub_event
      task: ffff88006bb36300 task.stack: ffff88006bba8000
      RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
      RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
      RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
      RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
      R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
      FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
      Call Trace:
       usb_suspend_interface drivers/usb/core/driver.c:1209
       usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
       usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
       __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
       rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
       rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
       __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
       pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
       usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
       hub_port_connect drivers/usb/core/hub.c:4903
       hub_port_connect_change drivers/usb/core/hub.c:5009
       port_event drivers/usb/core/hub.c:5115
       hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
       process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
       worker_thread+0x221/0x1850 kernel/workqueue.c:2253
       kthread+0x3a1/0x470 kernel/kthread.c:231
       ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
      Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
      00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
      3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
      RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
      ---[ end trace dfc4f5649284342c ]---
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8f562462
    • David S. Miller's avatar
      Revert "net: usb: asix: fill null-ptr-deref in asix_suspend" · 1a8e6b48
      David S. Miller authored
      This reverts commit baedf68a.
      
      There is an updated version of this fix which covers
      the problem more thoroughly.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1a8e6b48
  2. 08 Nov, 2017 8 commits
    • Kristian Evensen's avatar
      qmi_wwan: Add missing skb_reset_mac_header-call · 0de0add1
      Kristian Evensen authored
      When we receive a packet on a QMI device in raw IP mode, we should call
      skb_reset_mac_header() to ensure that skb->mac_header contains a valid
      offset in the packet. While it shouldn't really matter, the packets have
      no MAC header and the interface is configured as-such, it seems certain
      parts of the network stack expects a "good" value in skb->mac_header.
      
      Without the skb_reset_mac_header() call added in this patch, for example
      shaping traffic (using tc) triggers the following oops on the first
      received packet:
      
      [  303.642957] skbuff: skb_under_panic: text:8f137918 len:177 put:67 head:8e4b0f00 data:8e4b0eff tail:0x8e4b0fb0 end:0x8e4b1520 dev:wwan0
      [  303.655045] Kernel bug detected[#1]:
      [  303.658622] CPU: 1 PID: 1002 Comm: logd Not tainted 4.9.58 #0
      [  303.664339] task: 8fdf05e0 task.stack: 8f15c000
      [  303.668844] $ 0   : 00000000 00000001 0000007a 00000000
      [  303.674062] $ 4   : 8149a2fc 8149a2fc 8149ce20 00000000
      [  303.679284] $ 8   : 00000030 3878303a 31623465 20303235
      [  303.684510] $12   : ded731e3 2626a277 00000000 03bd0000
      [  303.689747] $16   : 8ef62b40 00000043 8f137918 804db5fc
      [  303.694978] $20   : 00000001 00000004 8fc13800 00000003
      [  303.700215] $24   : 00000001 8024ab10
      [  303.705442] $28   : 8f15c000 8fc19cf0 00000043 802cc920
      [  303.710664] Hi    : 00000000
      [  303.713533] Lo    : 74e58000
      [  303.716436] epc   : 802cc920 skb_panic+0x58/0x5c
      [  303.721046] ra    : 802cc920 skb_panic+0x58/0x5c
      [  303.725639] Status: 11007c03 KERNEL EXL IE
      [  303.729823] Cause : 50800024 (ExcCode 09)
      [  303.733817] PrId  : 0001992f (MIPS 1004Kc)
      [  303.737892] Modules linked in: rt2800pci rt2800mmio rt2800lib qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp nf_conntrack_ipv6 mt76x2i
      Process logd (pid: 1002, threadinfo=8f15c000, task=8fdf05e0, tls=77b3eee4)
      [  303.962509] Stack : 00000000 80408990 8f137918 000000b1 00000043 8e4b0f00 8e4b0eff 8e4b0fb0
      [  303.970871]         8e4b1520 8fec1800 00000043 802cd2a4 6e000045 00000043 00000000 8ef62000
      [  303.979219]         8eef5d00 8ef62b40 8fea7300 8f137918 00000000 00000000 0002bb01 793e5664
      [  303.987568]         8ef08884 00000001 8fea7300 00000002 8fc19e80 8eef5d00 00000006 00000003
      [  303.995934]         00000000 8030ba90 00000003 77ab3fd0 8149dc80 8004d1bc 8f15c000 8f383700
      [  304.004324]         ...
      [  304.006767] Call Trace:
      [  304.009241] [<802cc920>] skb_panic+0x58/0x5c
      [  304.013504] [<802cd2a4>] skb_push+0x78/0x90
      [  304.017783] [<8f137918>] 0x8f137918
      [  304.021269] Code: 00602825  0c02a3b4  24842888 <000c000d> 8c870060  8c8200a0  0007382b  00070336  8c88005c
      [  304.031034]
      [  304.032805] ---[ end trace b778c482b3f0bda9 ]---
      [  304.041384] Kernel panic - not syncing: Fatal exception in interrupt
      [  304.051975] Rebooting in 3 seconds..
      
      While the oops is for a 4.9-kernel, I was able to trigger the same oops with
      net-next as of yesterday.
      
      Fixes: 32f7adf6 ("net: qmi_wwan: support "raw IP" mode")
      Signed-off-by: default avatarKristian Evensen <kristian.evensen@gmail.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0de0add1
    • Jay Vosburgh's avatar
      bonding: fix slave stuck in BOND_LINK_FAIL state · 055db695
      Jay Vosburgh authored
      The bonding miimon logic has a flaw, in that a failure of the
      rtnl_trylock can cause a slave to become permanently stuck in
      BOND_LINK_FAIL state.
      
      	The sequence of events to cause this is as follows:
      
      	1) bond_miimon_inspect finds that a slave's link is down, and so
      calls bond_propose_link_state, setting slave->new_link_state to
      BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns
      non-zero.
      
      	2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is
      rescheduled.  No change is committed.
      
      	3) bond_miimon_inspect is called again, but this time the slave
      from step 1 has recovered.  slave->new_link is reset to NOCHANGE, and, as
      slave->link was never changed, the switch enters the BOND_LINK_UP case,
      and does nothing.  The pending BOND_LINK_FAIL state from step 1 remains
      pending, as new_link_state is not reset.
      
      	4) The state from step 3 persists until another slave changes link
      state and causes bond_miimon_inspect to return non-zero.  At this point,
      the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed,
      and the slave will remain stuck in BOND_LINK_FAIL state even though it
      is actually link up.
      
      	The remedy for this is to initialize new_link_state on each entry
      to bond_miimon_inspect, as is already done with new_link.
      
      Fixes: fb9eb899 ("bonding: handle link transition from FAIL to UP correctly")
      Reported-by: default avatarAlex Sidorenko <alexandre.sidorenko@hpe.com>
      Reviewed-by: default avatarJarod Wilson <jarod@redhat.com>
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Acked-by: default avatarMahesh Bandewar <maheshb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      055db695
    • Bjorn Andersson's avatar
      qrtr: Move to postcore_initcall · b7e732fa
      Bjorn Andersson authored
      Registering qrtr with module_init makes the ability of typical platform
      code to create AF_QIPCRTR socket during probe a matter of link order
      luck. Moving qrtr to postcore_initcall() avoids this.
      Signed-off-by: default avatarBjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b7e732fa
    • Bjørn Mork's avatar
      net: qmi_wwan: fix divide by 0 on bad descriptors · 7fd07833
      Bjørn Mork authored
      A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will
      cause a divide error in usbnet_probe:
      
      divide error: 0000 [#1] PREEMPT SMP KASAN
      Modules linked in:
      CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: usb_hub_wq hub_event
      task: ffff88006bef5c00 task.stack: ffff88006bf60000
      RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355
      RSP: 0018:ffff88006bf67508 EFLAGS: 00010246
      RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34
      RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34
      RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881
      R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003
      R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0
      Call Trace:
       usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783
       qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338
       usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361
       really_probe drivers/base/dd.c:413
       driver_probe_device+0x522/0x740 drivers/base/dd.c:557
      
      Fix by simply ignoring the bogus descriptor, as it is optional
      for QMI devices anyway.
      
      Fixes: 423ce8ca ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7fd07833
    • Bjørn Mork's avatar
      net: cdc_ether: fix divide by 0 on bad descriptors · 2cb80187
      Bjørn Mork authored
      Setting dev->hard_mtu to 0 will cause a divide error in
      usbnet_probe. Protect against devices with bogus CDC Ethernet
      functional descriptors by ignoring a zero wMaxSegmentSize.
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Acked-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2cb80187
    • Hangbin Liu's avatar
      bonding: discard lowest hash bit for 802.3ad layer3+4 · b5f86218
      Hangbin Liu authored
      After commit 07f4c900 ("tcp/dccp: try to not exhaust ip_local_port_range
      in connect()"), we will try to use even ports for connect(). Then if an
      application (seen clearly with iperf) opens multiple streams to the same
      destination IP and port, each stream will be given an even source port.
      
      So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing
      will always hash all these streams to the same interface. And the total
      throughput will limited to a single slave.
      
      Change the tcp code will impact the whole tcp behavior, only for bonding
      usage. Paolo Abeni suggested fix this by changing the bonding code only,
      which should be more reasonable, and less impact.
      
      Fix this by discarding the lowest hash bit because it contains little entropy.
      After the fix we can re-balance between slaves.
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b5f86218
    • Gustavo A. R. Silva's avatar
      net/mlx5e/core/en_fs: fix pointer dereference after free in mlx5e_execute_l2_action · 39a4b86f
      Gustavo A. R. Silva authored
      hn is being kfree'd in mlx5e_del_l2_from_hash and then dereferenced
      by accessing hn->ai.addr
      
      Fix this by copying the MAC address into a local variable for its safe use
      in all possible execution paths within function mlx5e_execute_l2_action.
      
      Addresses-Coverity-ID: 1417789
      Fixes: eeb66cdb ("net/mlx5: Separate between E-Switch and MPFS")
      Signed-off-by: default avatarGustavo A. R. Silva <garsilva@embeddedor.com>
      Acked-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      39a4b86f
    • Marc Zyngier's avatar
      net: mvpp2: Prevent userspace from changing TX affinities · 13c249a9
      Marc Zyngier authored
      The mvpp2 driver can't cope at all with the TX affinities being
      changed from userspace, and spit an endless stream of
      
      [   91.779920] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.779930] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.780402] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.780406] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.780415] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      [   91.780418] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing
      
      rendering the box completely useless (I've measured around 600k
      interrupts/s on a 8040 box) once irqbalance kicks in and start
      doing its job.
      
      Obviously, the driver was never designed with this in mind. So let's
      work around the problem by preventing userspace from interacting
      with these interrupts altogether.
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      13c249a9
  3. 05 Nov, 2017 2 commits
    • Priyaranjan Jha's avatar
      tcp: fix DSACK-based undo on non-duplicate ACK · d09b9e60
      Priyaranjan Jha authored
      Fixes DSACK-based undo when sender is in Open State and
      an ACK advances snd_una.
      
      Example scenario:
      - Sender goes into recovery and makes some spurious rtx.
      - It comes out of recovery and enters into open state.
      - It sends some more packets, let's say 4.
      - The receiver sends an ACK for the first two, but this ACK is lost.
      - The sender receives ack for first two, and DSACK for previous
        spurious rtx.
      Signed-off-by: default avatarPriyaranjan Jha <priyarjha@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarYousuk Seung <ysseung@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d09b9e60
    • Guillaume Nault's avatar
      l2tp: don't use l2tp_tunnel_find() in l2tp_ip and l2tp_ip6 · 8f7dc9ae
      Guillaume Nault authored
      Using l2tp_tunnel_find() in l2tp_ip_recv() is wrong for two reasons:
      
        * It doesn't take a reference on the returned tunnel, which makes the
          call racy wrt. concurrent tunnel deletion.
      
        * The lookup is only based on the tunnel identifier, so it can return
          a tunnel that doesn't match the packet's addresses or protocol.
      
      For example, a packet sent to an L2TPv3 over IPv6 tunnel can be
      delivered to an L2TPv2 over UDPv4 tunnel. This is worse than a simple
      cross-talk: when delivering the packet to an L2TP over UDP tunnel, the
      corresponding socket is UDP, where ->sk_backlog_rcv() is NULL. Calling
      sk_receive_skb() will then crash the kernel by trying to execute this
      callback.
      
      And l2tp_tunnel_find() isn't even needed here. __l2tp_ip_bind_lookup()
      properly checks the socket binding and connection settings. It was used
      as a fallback mechanism for finding tunnels that didn't have their data
      path registered yet. But it's not limited to this case and can be used
      to replace l2tp_tunnel_find() in the general case.
      
      Fix l2tp_ip6 in the same way.
      
      Fixes: 0d76751f ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
      Fixes: a32e0eec ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8f7dc9ae
  4. 04 Nov, 2017 3 commits
    • Andrey Konovalov's avatar
      net: usb: asix: fill null-ptr-deref in asix_suspend · baedf68a
      Andrey Konovalov authored
      When asix_suspend() is called dev->driver_priv might not have been
      assigned a value, so we need to check that it's not NULL.
      
      Found by syzkaller.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      Modules linked in:
      CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Workqueue: usb_hub_wq hub_event
      task: ffff88006bb36300 task.stack: ffff88006bba8000
      RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
      RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
      RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
      RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
      R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
      FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
      Call Trace:
       usb_suspend_interface drivers/usb/core/driver.c:1209
       usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
       usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
       __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
       rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
       rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
       __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
       pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
       usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
       hub_port_connect drivers/usb/core/hub.c:4903
       hub_port_connect_change drivers/usb/core/hub.c:5009
       port_event drivers/usb/core/hub.c:5115
       hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
       process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
       worker_thread+0x221/0x1850 kernel/workqueue.c:2253
       kthread+0x3a1/0x470 kernel/kthread.c:231
       ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
      Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
      00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
      3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
      RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
      ---[ end trace dfc4f5649284342c ]---
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      baedf68a
    • Ye Yin's avatar
      netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed · 2b5ec1a5
      Ye Yin authored
      When run ipvs in two different network namespace at the same host, and one
      ipvs transport network traffic to the other network namespace ipvs.
      'ipvs_property' flag will make the second ipvs take no effect. So we should
      clear 'ipvs_property' when SKB network namespace changed.
      
      Fixes: 621e84d6 ("dev: introduce skb_scrub_packet()")
      Signed-off-by: default avatarYe Yin <hustcat@gmail.com>
      Signed-off-by: default avatarWei Zhou <chouryzhou@gmail.com>
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2b5ec1a5
    • Ganesh Goudar's avatar
      cxgb4: update latest firmware version supported · 24de79e5
      Ganesh Goudar authored
      Change t4fw_version.h to update latest firmware version
      number to 1.16.63.0.
      Signed-off-by: default avatarGanesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      24de79e5
  5. 03 Nov, 2017 22 commits
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · d4c2e9fc
      Linus Torvalds authored
      Pull clk fix from Stephen Boyd:
       "One fix for USB clks on Uniphier PXs3 SoCs"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: uniphier: fix clock data for PXs3
      d4c2e9fc
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · 81ca2cae
      Linus Torvalds authored
      Pull arch/tile fixes from Chris Metcalf:
       "Two one-line bug fixes"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
        arch/tile: Implement ->set_state_oneshot_stopped()
        tile: pass machine size to sparse
      81ca2cae
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 01073ac1
      Linus Torvalds authored
      Pull SCSI fix from James Bottomley:
       "One minor fix in the error leg of the qla2xxx driver (it oopses the
        system if we get an error trying to start the internal kernel thread).
      
        The fix is minor because the problem isn't often encountered in the
        field (although it can be induced by inserting the module in a low
        memory environment)"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qla2xxx: Fix oops in qla2x00_probe_one error path
      01073ac1
    • Chris Metcalf's avatar
      arch/tile: Implement ->set_state_oneshot_stopped() · 777a45b4
      Chris Metcalf authored
      set_state_oneshot_stopped() is called by the clkevt core, when the
      next event is required at an expiry time of 'KTIME_MAX'. This normally
      happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes.
      
      This patch makes the clockevent device to stop on such an event, to
      avoid spurious interrupts, as explained by: commit 8fff52fd
      ("clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state").
      Signed-off-by: default avatarChris Metcalf <cmetcalf@mellanox.com>
      777a45b4
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 866ba84e
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Some more powerpc fixes for 4.14.
      
        This is bigger than I like to send at rc7, but that's at least partly
        because I didn't send any fixes last week. If it wasn't for the IMC
        driver, which is new and getting heavy testing, the diffstat would
        look a bit better. I've also added ftrace on big endian to my test
        suite, so we shouldn't break that again in future.
      
         - A fix to the handling of misaligned paste instructions (P9 only),
           where a change to a #define has caused the check for the
           instruction to always fail.
      
         - The preempt handling was unbalanced in the radix THP flush (P9
           only). Though we don't generally use preempt we want to keep it
           working as much as possible.
      
         - Two fixes for IMC (P9 only), one when booting with restricted
           number of CPUs and one in the error handling when initialisation
           fails due to firmware etc.
      
         - A revert to fix function_graph on big endian machines, and then a
           rework of the reverted patch to fix kprobes blacklist handling on
           big endian machines.
      
        Thanks to: Anju T Sudhakar, Guilherme G. Piccoli, Madhavan Srinivasan,
        Naveen N. Rao, Nicholas Piggin, Paul Mackerras"
      
      * tag 'powerpc-4.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/perf: Fix core-imc hotplug callback failure during imc initialization
        powerpc/kprobes: Dereference function pointers only if the address does not belong to kernel text
        Revert "powerpc64/elfv1: Only dereference function descriptor for non-text symbols"
        powerpc/64s/radix: Fix preempt imbalance in TLB flush
        powerpc: Fix check for copy/paste instructions in alignment handler
        powerpc/perf: Fix IMC allocation routine
      866ba84e
    • Linus Torvalds's avatar
      Merge tag 'mmc-v4.14-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 3f46540e
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "Fix dw_mmc request timeout issues"
      
      * tag 'mmc-v4.14-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: dw_mmc: Fix the DTO timeout calculation
        mmc: dw_mmc: Add locking to the CTO timer
        mmc: dw_mmc: Fix the CTO timeout calculation
        mmc: dw_mmc: cancel the CTO timer after a voltage switch
      3f46540e
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.14-rc8' of git://people.freedesktop.org/~airlied/linux · e65a139d
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
      
       - one nouveau regression fix
      
       - some amdgpu fixes for stable to fix hangs on some harvested Polaris
         GPUs
      
       - a set of KASAN and regression fixes for i915, their CI system seems
         to be working pretty well now.
      
      * tag 'drm-fixes-for-v4.14-rc8' of git://people.freedesktop.org/~airlied/linux:
        drm/amdgpu: allow harvesting check for Polaris VCE
        drm/amdgpu: return -ENOENT from uvd 6.0 early init for harvesting
        drm/i915: Check incoming alignment for unfenced buffers (on i915gm)
        drm/nouveau/kms/nv50: use the correct state for base channel notifier setup
        drm/i915: Hold rcu_read_lock when iterating over the radixtree (vma idr)
        drm/i915: Hold rcu_read_lock when iterating over the radixtree (objects)
        drm/i915/edp: read edp display control registers unconditionally
        drm/i915: Do not rely on wm preservation for ILK watermarks
        drm/i915: Cancel the modeset retry work during modeset cleanup
      e65a139d
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7ba3ebff
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Hopefully this is the last batch of networking fixes for 4.14
      
        Fingers crossed...
      
         1) Fix stmmac to use the proper sized OF property read, from Bhadram
            Varka.
      
         2) Fix use after free in net scheduler tc action code, from Cong
            Wang.
      
         3) Fix SKB control block mangling in tcp_make_synack().
      
         4) Use proper locking in fib_dump_info(), from Florian Westphal.
      
         5) Fix IPG encodings in systemport driver, from Florian Fainelli.
      
         6) Fix division by zero in NV TCP congestion control module, from
            Konstantin Khlebnikov.
      
         7) Fix use after free in nf_reject_ipv4, from Tejaswi Tanikella"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: systemport: Correct IPG length settings
        tcp: do not mangle skb->cb[] in tcp_make_synack()
        fib: fib_dump_info can no longer use __in_dev_get_rtnl
        stmmac: use of_property_read_u32 instead of read_u8
        net_sched: hold netns refcnt for each action
        net_sched: acquire RTNL in tc_action_net_exit()
        net: vrf: correct FRA_L3MDEV encode type
        tcp_nv: fix division by zero in tcpnv_acked()
        netfilter: nf_reject_ipv4: Fix use-after-free in send_reset
        netfilter: nft_set_hash: disable fast_ops for 2-len keys
      7ba3ebff
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · f0395d5b
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "7 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm, swap: fix race between swap count continuation operations
        mm/huge_memory.c: deposit page table when copying a PMD migration entry
        initramfs: fix initramfs rebuilds w/ compression after disabling
        fs/hugetlbfs/inode.c: fix hwpoison reserve accounting
        ocfs2: fstrim: Fix start offset of first cluster group during fstrim
        mm, /proc/pid/pagemap: fix soft dirty marking for PMD migration entry
        userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size
      f0395d5b
    • Paul Burton's avatar
      Update MIPS email addresses · fb615d61
      Paul Burton authored
      MIPS will soon not be a part of Imagination Technologies, and as such
      many @imgtec.com email addresses will no longer be valid. This patch
      updates the addresses for those who:
      
       - Have 10 or more patches in mainline authored using an @imgtec.com
         email address, or any patches dated within the past year.
      
       - Are still with Imagination but leaving as part of the MIPS business
         unit, as determined from an internal email address list.
      
       - Haven't already updated their email address (ie. JamesH) or expressed
         a desire to be excluded (ie. Maciej).
      
       - Acked v2 or earlier of this patch, which leaves Deng-Cheng, Matt &
         myself.
      
      New addresses are of the form firstname.lastname@mips.com, and all
      verified against an internal email address list.  An entry is added to
      .mailmap for each person such that get_maintainer.pl will report the new
      addresses rather than @imgtec.com addresses which will soon be dead.
      
      Instances of the affected addresses throughout the tree are then
      mechanically replaced with the new @mips.com address.
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
      Cc: Deng-Cheng Zhu <dengcheng.zhu@mips.com>
      Acked-by: default avatarDengcheng Zhu <dengcheng.zhu@mips.com>
      Cc: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: Matt Redfearn <matt.redfearn@mips.com>
      Acked-by: default avatarMatt Redfearn <matt.redfearn@mips.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: trivial@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fb615d61
    • Rafael J. Wysocki's avatar
      x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo · 941f5f0f
      Rafael J. Wysocki authored
      Commit 890da9cf (Revert "x86: do not use cpufreq_quick_get() for
      /proc/cpuinfo "cpu MHz"") is not sufficient to restore the previous
      behavior of "cpu MHz" in /proc/cpuinfo on x86 due to some changes
      made after the commit it has reverted.
      
      To address this, make the code in question use arch_freq_get_on_cpu()
      which also is used by cpufreq for reporting the current frequency of
      CPUs and since that function doesn't really depend on cpufreq in any
      way, drop the CONFIG_CPU_FREQ dependency for the object file
      containing it.
      
      Also refactor arch_freq_get_on_cpu() somewhat to avoid IPIs and
      return cached values right away if it is called very often over a
      short time (to prevent user space from triggering IPI storms through
      it).
      
      Fixes: 890da9cf (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"")
      Cc: stable@kernel.org   # 4.13 - together with 890da9cfSigned-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      941f5f0f
    • Huang Ying's avatar
      mm, swap: fix race between swap count continuation operations · 2628bd6f
      Huang Ying authored
      One page may store a set of entries of the sis->swap_map
      (swap_info_struct->swap_map) in multiple swap clusters.
      
      If some of the entries has sis->swap_map[offset] > SWAP_MAP_MAX,
      multiple pages will be used to store the set of entries of the
      sis->swap_map.  And the pages are linked with page->lru.  This is called
      swap count continuation.  To access the pages which store the set of
      entries of the sis->swap_map simultaneously, previously, sis->lock is
      used.  But to improve the scalability of __swap_duplicate(), swap
      cluster lock may be used in swap_count_continued() now.  This may race
      with add_swap_count_continuation() which operates on a nearby swap
      cluster, in which the sis->swap_map entries are stored in the same page.
      
      The race can cause wrong swap count in practice, thus cause unfreeable
      swap entries or software lockup, etc.
      
      To fix the race, a new spin lock called cont_lock is added to struct
      swap_info_struct to protect the swap count continuation page list.  This
      is a lock at the swap device level, so the scalability isn't very well.
      But it is still much better than the original sis->lock, because it is
      only acquired/released when swap count continuation is used.  Which is
      considered rare in practice.  If it turns out that the scalability
      becomes an issue for some workloads, we can split the lock into some
      more fine grained locks.
      
      Link: http://lkml.kernel.org/r/20171017081320.28133-1-ying.huang@intel.com
      Fixes: 235b6217 ("mm/swap: add cluster lock")
      Signed-off-by: default avatar"Huang, Ying" <ying.huang@intel.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Tim Chen <tim.c.chen@intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>	[4.11+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2628bd6f
    • Zi Yan's avatar
      mm/huge_memory.c: deposit page table when copying a PMD migration entry · dd8a67f9
      Zi Yan authored
      We need to deposit pre-allocated PTE page table when a PMD migration
      entry is copied in copy_huge_pmd().  Otherwise, we will leak the
      pre-allocated page and cause a NULL pointer dereference later in
      zap_huge_pmd().
      
      The missing counters during PMD migration entry copy process are added
      as well.
      
      The bug report is here: https://lkml.org/lkml/2017/10/29/214
      
      Link: http://lkml.kernel.org/r/20171030144636.4836-1-zi.yan@sent.com
      Fixes: 84c3fc4e ("mm: thp: check pmd migration entry in common path")
      Signed-off-by: default avatarZi Yan <zi.yan@cs.rutgers.edu>
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dd8a67f9
    • Florian Fainelli's avatar
      initramfs: fix initramfs rebuilds w/ compression after disabling · e08b1877
      Florian Fainelli authored
      This is a follow-up to commit 57ddfdaa ("initramfs: fix disabling of
      initramfs (and its compression)").  This particular commit fixed the use
      case where we build the kernel with an initramfs with no compression,
      and then we build the kernel with no initramfs.
      
      Now this still left us with the same case as described here:
      
        http://lkml.kernel.org/r/20170521033337.6197-1-f.fainelli@gmail.com
      
      not working with initramfs compression.  This can be seen by the
      following steps/timestamps:
      
        https://www.spinics.net/lists/kernel/msg2598153.html
      
      .initramfs_data.cpio.gz.cmd is correct:
      
        cmd_usr/initramfs_data.cpio.gz := /bin/bash
        ./scripts/gen_initramfs_list.sh -o usr/initramfs_data.cpio.gz  -u 1000 -g 1000  /home/fainelli/work/uclinux-rootfs/romfs /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev
      
      and was generated the first time we did generate the gzip initramfs, so
      the command has not changed, nor its arguments, so we just don't call
      it, no initramfs cpio is re-generated as a consequence.
      
      The fix for this problem is just to properly keep track of the
      .initramfs_cpio_data.d file by suffixing it with the compression
      extension.  This takes care of properly tracking dependencies such that
      the initramfs get (re)generated any time files are added/deleted etc.
      
      Link: http://lkml.kernel.org/r/20170930033936.6722-1-f.fainelli@gmail.com
      Fixes: db2aa7fd ("initramfs: allow again choice of the embedded initramfs compression algorithm")
      Fixes: 9e3596b0 ("kbuild: initramfs cleanup, set target from Kconfig")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Cc: "Francisco Blas Izquierdo Riera (klondike)" <klondike@xiscosoft.net>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e08b1877
    • Mike Kravetz's avatar
      fs/hugetlbfs/inode.c: fix hwpoison reserve accounting · ab615a5b
      Mike Kravetz authored
      Calling madvise(MADV_HWPOISON) on a hugetlbfs page will result in bad
      (negative) reserved huge page counts.  This may not happen immediately,
      but may happen later when the underlying file is removed or filesystem
      unmounted.  For example:
      
        AnonHugePages:         0 kB
        ShmemHugePages:        0 kB
        HugePages_Total:       1
        HugePages_Free:        0
        HugePages_Rsvd:    18446744073709551615
        HugePages_Surp:        0
        Hugepagesize:       2048 kB
      
      In routine hugetlbfs_error_remove_page(), hugetlb_fix_reserve_counts is
      called after remove_huge_page.  hugetlb_fix_reserve_counts is designed
      to only be called/used only if a failure is returned from
      hugetlb_unreserve_pages.  Therefore, call hugetlb_unreserve_pages as
      required and only call hugetlb_fix_reserve_counts in the unlikely event
      that hugetlb_unreserve_pages returns an error.
      
      Link: http://lkml.kernel.org/r/20171019230007.17043-2-mike.kravetz@oracle.com
      Fixes: 78bb9203 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error")
      Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ab615a5b
    • Ashish Samant's avatar
      ocfs2: fstrim: Fix start offset of first cluster group during fstrim · 105ddc93
      Ashish Samant authored
      The first cluster group descriptor is not stored at the start of the
      group but at an offset from the start.  We need to take this into
      account while doing fstrim on the first cluster group.  Otherwise we
      will wrongly start fstrim a few blocks after the desired start block and
      the range can cross over into the next cluster group and zero out the
      group descriptor there.  This can cause filesytem corruption that cannot
      be fixed by fsck.
      
      Link: http://lkml.kernel.org/r/1507835579-7308-1-git-send-email-ashish.samant@oracle.comSigned-off-by: default avatarAshish Samant <ashish.samant@oracle.com>
      Reviewed-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: default avatarJoseph Qi <jiangqi903@gmail.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      105ddc93
    • Huang Ying's avatar
      mm, /proc/pid/pagemap: fix soft dirty marking for PMD migration entry · b83d7e43
      Huang Ying authored
      When the pagetable is walked in the implementation of /proc/<pid>/pagemap,
      pmd_soft_dirty() is used for both the PMD huge page map and the PMD
      migration entries.  That is wrong, pmd_swp_soft_dirty() should be used
      for the PMD migration entries instead because the different page table
      entry flag is used.
      
      As a result, /proc/pid/pagemap may report incorrect soft dirty information
      for PMD migration entries.
      
      Link: http://lkml.kernel.org/r/20171017081818.31795-1-ying.huang@intel.com
      Fixes: 84c3fc4e ("mm: thp: check pmd migration entry in common path")
      Signed-off-by: default avatar"Huang, Ying" <ying.huang@intel.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b83d7e43
    • Andrea Arcangeli's avatar
      userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size · 1e392147
      Andrea Arcangeli authored
      This oops:
      
        kernel BUG at fs/hugetlbfs/inode.c:484!
        RIP: remove_inode_hugepages+0x3d0/0x410
        Call Trace:
          hugetlbfs_setattr+0xd9/0x130
          notify_change+0x292/0x410
          do_truncate+0x65/0xa0
          do_sys_ftruncate.constprop.3+0x11a/0x180
          SyS_ftruncate+0xe/0x10
          tracesys+0xd9/0xde
      
      was caused by the lack of i_size check in hugetlb_mcopy_atomic_pte.
      
      mmap() can still succeed beyond the end of the i_size after vmtruncate
      zapped vmas in those ranges, but the faults must not succeed, and that
      includes UFFDIO_COPY.
      
      We could differentiate the retval to userland to represent a SIGBUS like
      a page fault would do (vs SIGSEGV), but it doesn't seem very useful and
      we'd need to pick a random retval as there's no meaningful syscall
      retval that would differentiate from SIGSEGV and SIGBUS, there's just
      -EFAULT.
      
      Link: http://lkml.kernel.org/r/20171016223914.2421-2-aarcange@redhat.comSigned-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1e392147
    • Florian Fainelli's avatar
      net: systemport: Correct IPG length settings · 93824c80
      Florian Fainelli authored
      Due to a documentation mistake, the IPG length was set to 0x12 while it
      should have been 12 (decimal). This would affect short packet (64B
      typically) performance since the IPG was bigger than necessary.
      
      Fixes: 44a4524c ("net: systemport: Add support for SYSTEMPORT Lite")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      93824c80
    • Eric Dumazet's avatar
      tcp: do not mangle skb->cb[] in tcp_make_synack() · 3b117750
      Eric Dumazet authored
      Christoph Paasch sent a patch to address the following issue :
      
      tcp_make_synack() is leaving some TCP private info in skb->cb[],
      then send the packet by other means than tcp_transmit_skb()
      
      tcp_transmit_skb() makes sure to clear skb->cb[] to not confuse
      IPv4/IPV6 stacks, but we have no such cleanup for SYNACK.
      
      tcp_make_synack() should not use tcp_init_nondata_skb() :
      
      tcp_init_nondata_skb() really should be limited to skbs put in write/rtx
      queues (the ones that are only sent via tcp_transmit_skb())
      
      This patch fixes the issue and should even save few cpu cycles ;)
      
      Fixes: 971f10ec ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Reviewed-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b117750
    • Florian Westphal's avatar
      fib: fib_dump_info can no longer use __in_dev_get_rtnl · 25dd169a
      Florian Westphal authored
      syzbot reported yet another regression added with DOIT_UNLOCKED.
      When nexthop is marked as dead, fib_dump_info uses __in_dev_get_rtnl():
      
      ./include/linux/inetdevice.h:230 suspicious rcu_dereference_protected() usage!
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by syz-executor2/23859:
       #0:  (rcu_read_lock){....}, at: [<ffffffff840283f0>]
      inet_rtm_getroute+0xaa0/0x2d70 net/ipv4/route.c:2738
      [..]
        lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4665
        __in_dev_get_rtnl include/linux/inetdevice.h:230 [inline]
        fib_dump_info+0x1136/0x13d0 net/ipv4/fib_semantics.c:1377
        inet_rtm_getroute+0xf97/0x2d70 net/ipv4/route.c:2785
      ..
      
      This isn't safe anymore, callers either hold RTNL mutex or rcu read lock,
      so these spots must use rcu_dereference_rtnl() or plain rcu_derefence()
      (plus unconditional rcu read lock).
      
      This does the latter.
      
      Fixes: 394f51ab ("ipv4: route: set ipv4 RTM_GETROUTE to not use rtnl")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      25dd169a
    • Bhadram Varka's avatar
      stmmac: use of_property_read_u32 instead of read_u8 · e73b49eb
      Bhadram Varka authored
      Numbers in DT are stored in “cells” which are 32-bits
      in size. of_property_read_u8 does not work properly
      because of endianness problem.
      
      This causes it to always return 0 with little-endian
      architectures.
      
      Fix it by using of_property_read_u32() OF API.
      Signed-off-by: default avatarBhadram Varka <vbhadram@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e73b49eb