1. 17 Sep, 2017 7 commits
  2. 16 Sep, 2017 13 commits
  3. 15 Sep, 2017 20 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 9db59599
      Linus Torvalds authored
      Pull more KVM updates from Paolo Bonzini:
       - PPC bugfixes
       - RCU splat fix
       - swait races fix
       - pointless userspace-triggerable BUG() fix
       - misc fixes for KVM_RUN corner cases
       - nested virt correctness fixes + one host DoS
       - some cleanups
       - clang build fix
       - fix AMD AVIC with default QEMU command line options
       - x86 bugfixes
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
        kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly
        kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly
        kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry
        kvm,mips: Fix potential swait_active() races
        kvm,powerpc: Serialize wq active checks in ops->vcpu_kick
        kvm: Serialize wq active checks in kvm_vcpu_wake_up()
        kvm,x86: Fix apf_task_wake_one() wq serialization
        kvm,lapic: Justify use of swait_active()
        kvm,async_pf: Use swq_has_sleeper()
        sched/wait: Add swq_has_sleeper()
        KVM: VMX: Do not BUG() on out-of-bounds guest IRQ
        KVM: Don't accept obviously wrong gsi values via KVM_IRQFD
        kvm: nVMX: Don't allow L2 to access the hardware CR8
        KVM: trace events: update list of exit reasons
        KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously
        KVM: X86: Don't block vCPU if there is pending exception
        KVM: SVM: Add irqchip_split() checks before enabling AVIC
        KVM: Add struct kvm_vcpu pointer parameter to get_enable_apicv()
        KVM: SVM: Refactor AVIC vcpu initialization into avic_init_vcpu()
        KVM: x86: fix clang build
        ...
      9db59599
    • Edward Cree's avatar
      bpf/verifier: reject BPF_ALU64|BPF_END · e67b8a68
      Edward Cree authored
      Neither ___bpf_prog_run nor the JITs accept it.
      Also adds a new test case.
      
      Fixes: 17a52670
      
       ("bpf: verifier (add verifier core)")
      Signed-off-by: default avatarEdward Cree <ecree@solarflare.com>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e67b8a68
    • Xin Long's avatar
      sctp: do not mark sk dumped when inet_sctp_diag_fill returns err · 8c7c19a5
      Xin Long authored
      sctp_diag would not actually dump out sk/asoc if inet_sctp_diag_fill
      returns err, in which case it shouldn't mark sk dumped by setting
      cb->args[3] as 1 in sctp_sock_dump().
      
      Otherwise, it could cause some asocs to have no parent's sk dumped
      in 'ss --sctp'.
      
      So this patch is to not set cb->args[3] when inet_sctp_diag_fill()
      returns err in sctp_sock_dump().
      
      Fixes: 8f840e47
      
       ("sctp: add the sctp_diag.c file")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8c7c19a5
    • Xin Long's avatar
      sctp: fix an use-after-free issue in sctp_sock_dump · d25adbeb
      Xin Long authored
      Commit 86fdb344 ("sctp: ensure ep is not destroyed before doing the
      dump") tried to fix an use-after-free issue by checking !sctp_sk(sk)->ep
      with holding sock and sock lock.
      
      But Paolo noticed that endpoint could be destroyed in sctp_rcv without
      sock lock protection. It means the use-after-free issue still could be
      triggered when sctp_rcv put and destroy ep after sctp_sock_dump checks
      !ep, although it's pretty hard to reproduce.
      
      I could reproduce it by mdelay in sctp_rcv while msleep in sctp_close
      and sctp_sock_dump long time.
      
      This patch is to add another param cb_done to sctp_for_each_transport
      and dump ep->assocs with holding tsp after jumping out of transport's
      traversal in it to avoid this issue.
      
      It can also improve sctp diag dump to make it run faster, as no need
      to save sk into cb->args[5] and keep calling sctp_for_each_transport
      any more.
      
      This patch is also to use int * instead of int for the pos argument
      in sctp_for_each_transport, which could make postion increment only
      in sctp_for_each_transport and no need to keep changing cb->args[2]
      in sctp_sock_filter and sctp_sock_dump any more.
      
      Fixes: 86fdb344
      
       ("sctp: ensure ep is not destroyed before doing the dump")
      Reported-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d25adbeb
    • Stephen Hemminger's avatar
      netvsc: increase default receive buffer size · 5023a6db
      Stephen Hemminger authored
      The default receive buffer size was reduced by recent change
      to a value which was appropriate for 10G and Windows Server 2016.
      But the value is too small for full performance with 40G on Azure.
      Increase the default back to maximum supported by host.
      
      Fixes: 8b532797
      
       ("netvsc: allow controlling send/recv buffer size")
      Signed-off-by: default avatarStephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5023a6db
    • Eric Dumazet's avatar
      tcp: update skb->skb_mstamp more carefully · 8c72c65b
      Eric Dumazet authored
      liujian reported a problem in TCP_USER_TIMEOUT processing with a patch
      in tcp_probe_timer() :
            https://www.spinics.net/lists/netdev/msg454496.html
      
      
      
      After investigations, the root cause of the problem is that we update
      skb->skb_mstamp of skbs in write queue, even if the attempt to send a
      clone or copy of it failed. One reason being a routing problem.
      
      This patch prevents this, solving liujian issue.
      
      It also removes a potential RTT miscalculation, since
      __tcp_retransmit_skb() is not OR-ing TCP_SKB_CB(skb)->sacked with
      TCPCB_EVER_RETRANS if a failure happens, but skb->skb_mstamp has
      been changed.
      
      A future ACK would then lead to a very small RTT sample and min_rtt
      would then be lowered to this too small value.
      
      Tested:
      
      # cat user_timeout.pkt
      --local_ip=192.168.102.64
      
          0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
         +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
         +0 bind(3, ..., ...) = 0
         +0 listen(3, 1) = 0
      
         +0 `ifconfig tun0 192.168.102.64/16; ip ro add 192.0.2.1 dev tun0`
      
         +0 < S 0:0(0) win 0 <mss 1460>
         +0 > S. 0:0(0) ack 1 <mss 1460>
      
        +.1 < . 1:1(0) ack 1 win 65530
         +0 accept(3, ..., ...) = 4
      
         +0 setsockopt(4, SOL_TCP, TCP_USER_TIMEOUT, [3000], 4) = 0
         +0 write(4, ..., 24) = 24
         +0 > P. 1:25(24) ack 1 win 29200
         +.1 < . 1:1(0) ack 25 win 65530
      
      //change the ipaddress
         +1 `ifconfig tun0 192.168.0.10/16`
      
         +1 write(4, ..., 24) = 24
         +1 write(4, ..., 24) = 24
         +1 write(4, ..., 24) = 24
         +1 write(4, ..., 24) = 24
      
         +0 `ifconfig tun0 192.168.102.64/16`
         +0 < . 1:2(1) ack 25 win 65530
         +0 `ifconfig tun0 192.168.0.10/16`
      
         +3 write(4, ..., 24) = -1
      
      # ./packetdrill user_timeout.pkt
      Signed-off-by: default avatarEric Dumazet <edumazet@googl.com>
      Reported-by: default avatarliujian <liujian56@huawei.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8c72c65b
    • David Ahern's avatar
      net: ipv4: fix l3slave check for index returned in IP_PKTINFO · cbea8f02
      David Ahern authored
      rt_iif is only set to the actual egress device for the output path. The
      recent change to consider the l3slave flag when returning IP_PKTINFO
      works for local traffic (the correct device index is returned), but it
      broke the more typical use case of packets received from a remote host
      always returning the VRF index rather than the original ingress device.
      Update the fixup to consider l3slave and rt_iif actually getting set.
      
      Fixes: 1dfa7639
      
       ("net: ipv4: add check for l3slave for index returned in IP_PKTINFO")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cbea8f02
    • Geert Uytterhoeven's avatar
      net: smsc911x: Quieten netif during suspend · 2aa70f86
      Geert Uytterhoeven authored
      
      If the network interface is kept running during suspend, the net core
      may call net_device_ops.ndo_start_xmit() while the Ethernet device is
      still suspended, which may lead to a system crash.
      
      E.g. on sh73a0/kzm9g and r8a73a4/ape6evm, the external Ethernet chip is
      driven by a PM controlled clock.  If the Ethernet registers are accessed
      while the clock is not running, the system will crash with an imprecise
      external abort.
      
      As this is a race condition with a small time window, it is not so easy
      to trigger at will.  Using pm_test may increase your chances:
      
          # echo 0 > /sys/module/printk/parameters/console_suspend
          # echo platform > /sys/power/pm_test
          # echo mem > /sys/power/state
      
      To fix this, make sure the network interface is quietened during
      suspend.
      Signed-off-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2aa70f86
    • Florian Fainelli's avatar
      net: systemport: Fix 64-bit stats deadlock · 7095c973
      Florian Fainelli authored
      We can enter a deadlock situation because there is no sufficient protection
      when ndo_get_stats64() runs in process context to guard against RX or TX NAPI
      contexts running in softirq, this can lead to the following lockdep splat and
      actual deadlock was experienced as well with an iperf session in the background
      and a while loop doing ifconfig + ethtool.
      
      [    5.780350] ================================
      [    5.784679] WARNING: inconsistent lock state
      [    5.789011] 4.13.0-rc7-02179-g32fae27c725d #70 Not tainted
      [    5.794561] --------------------------------
      [    5.798890] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [    5.804971] swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
      [    5.810175]  (&syncp->seq#2){+.?...}, at: [<c0768a28>] bcm_sysport_tx_reclaim+0x30/0x54
      [    5.818327] {SOFTIRQ-ON-W} state was registered at:
      [    5.823278]   bcm_sysport_get_stats64+0x17c/0x258
      [    5.828053]   dev_get_stats+0x38/0xac
      [    5.831776]   rtnl_fill_stats+0x30/0x118
      [    5.835761]   rtnl_fill_ifinfo+0x538/0xe24
      [    5.839921]   rtmsg_ifinfo_build_skb+0x6c/0xd8
      [    5.844430]   rtmsg_ifinfo_event.part.5+0x14/0x44
      [    5.849201]   rtmsg_ifinfo+0x20/0x28
      [    5.852837]   register_netdevice+0x628/0x6b8
      [    5.857171]   register_netdev+0x14/0x24
      [    5.861051]   bcm_sysport_probe+0x30c/0x438
      [    5.865280]   platform_drv_probe+0x50/0xb0
      [    5.869418]   driver_probe_device+0x2e8/0x450
      [    5.873817]   __driver_attach+0x104/0x120
      [    5.877871]   bus_for_each_dev+0x7c/0xc0
      [    5.881834]   bus_add_driver+0x1b0/0x270
      [    5.885797]   driver_register+0x78/0xf4
      [    5.889675]   do_one_initcall+0x54/0x190
      [    5.893646]   kernel_init_freeable+0x144/0x1d0
      [    5.898135]   kernel_init+0x8/0x110
      [    5.901665]   ret_from_fork+0x14/0x2c
      [    5.905363] irq event stamp: 24263
      [    5.908804] hardirqs last  enabled at (24262): [<c08eecf0>] net_rx_action+0xc4/0x4e4
      [    5.916624] hardirqs last disabled at (24263): [<c0a7da00>] _raw_spin_lock_irqsave+0x1c/0x98
      [    5.925143] softirqs last  enabled at (24258): [<c022a7fc>] irq_enter+0x84/0x98
      [    5.932524] softirqs last disabled at (24259): [<c022a918>] irq_exit+0x108/0x16c
      [    5.939985]
      [    5.939985] other info that might help us debug this:
      [    5.946576]  Possible unsafe locking scenario:
      [    5.946576]
      [    5.952556]        CPU0
      [    5.955031]        ----
      [    5.957506]   lock(&syncp->seq#2);
      [    5.960955]   <Interrupt>
      [    5.963604]     lock(&syncp->seq#2);
      [    5.967227]
      [    5.967227]  *** DEADLOCK ***
      [    5.967227]
      [    5.973222] 1 lock held by swapper/0/0:
      [    5.977092]  #0:  (&(&ring->lock)->rlock){..-...}, at: [<c0768a18>] bcm_sysport_tx_reclaim+0x20/0x54
      
      So just remove the u64_stats_update_begin()/end() pair in ndo_get_stats64()
      since it does not appear to be useful for anything. No inconsistency was
      observed with either ifconfig or ethtool, global TX counts equal the sum of
      per-queue TX counts on a 32-bit architecture.
      
      Fixes: 10377ba7
      
       ("net: systemport: Support 64bit statistics")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7095c973
    • Arnd Bergmann's avatar
      net: vrf: avoid gcc-4.6 warning · ecf09117
      Arnd Bergmann authored
      When building an allmodconfig kernel with gcc-4.6, we get a rather
      odd warning:
      
      drivers/net/vrf.c: In function ‘vrf_ip6_input_dst’:
      drivers/net/vrf.c:964:3: error: initialized field with side-effects overwritten [-Werror]
      drivers/net/vrf.c:964:3: error: (near initialization for ‘fl6’) [-Werror]
      
      I have no idea what this warning is even trying to say, but it does
      seem like a false positive. Reordering the initialization in to match
      the structure definition gets rid of the warning, and might also avoid
      whatever gcc thinks is wrong here.
      
      Fixes: 9ff74384
      
       ("net: vrf: Handle ipv6 multicast and link-local addresses")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ecf09117
    • Himanshu Jha's avatar
      qed: remove unnecessary call to memset · 4739df62
      Himanshu Jha authored
      
      call to memset to assign 0 value immediately after allocating
      memory with kzalloc is unnecesaary as kzalloc allocates the memory
      filled with 0 value.
      
      Semantic patch used to resolve this issue:
      
      @@
      expression e,e2; constant c;
      statement S;
      @@
      
        e = kzalloc(e2, c);
        if(e == NULL) S
      - memset(e, 0, e2);
      Signed-off-by: default avatarHimanshu Jha <himanshujha199640@gmail.com>
      Signed-off-by: default avatarHimanshu Jha <himanshujha199640@gmail.com>
      Acked-by: default avatarSudarsana Kalluru <sudarsana.kalluru@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4739df62
    • Linus Torvalds's avatar
      Merge tag 'firmware_removal-4.14-rc1' of... · b38923a0
      Linus Torvalds authored
      Merge tag 'firmware_removal-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull firmware removal from Greg KH:
       "Many many years ago (at the kernel summit in Boston), we all came to
        the agreement that the firmware/ tree should be dropped from the
        kernel, and everyone use the linux-firmware package instead. For some
        minor reason, David Woodhouse didn't send the pull request at that
        point in time, and everyone forgot about this.
      
        The topic came up in the hallway track at the Plumbers conference this
        week, so here's a single patch that drops the whole firmware tree. The
        last firmware update was back in 2013, and all distros have been using
        linux-firmware instead since at least that year, if not before. The
        only commits to that directory since 2013 was some kbuild fixups for
        various build tool issues.
      
        So lets finally drop this, we don't need to lug them around in the
        kernel source tree anymore, especially as no one wants or uses them.
      
        This has passed build testing with 0-day, I don't think it made it
        into linux-next this week, but I figured it was good to get in before
        4.14-rc1 was out"
      
      * tag 'firmware_removal-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        firmware: delete in-kernel firmware
      b38923a0
    • Linus Torvalds's avatar
      Merge tag 'nios2-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 · 866a30ef
      Linus Torvalds authored
      Pull arch/nios2 update from Ley Foon Tan.
      
      * tag 'nios2-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
        nios2: time: Read timer in get_cycles only if initialized
        nios2: add earlycon support to 3c120 devboard DTS
      866a30ef
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 418702b9
      Linus Torvalds authored
      Pull powerpc fix from Michael Ellerman:
       "Just one fix, for the handling of alignment interrupts on dcbz
        instructions.
      
        Thanks to Paul Mackerras, Christian Zigotzky, Michal Sojka"
      
      * tag 'powerpc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc: Fix handling of alignment interrupt on dcbz instruction
      418702b9
    • Linus Torvalds's avatar
      Merge tag 'for-linus-4.14-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · 30db202e
      Linus Torvalds authored
      Pull orangefs updates from Mike Marshall:
       "Some cleanups and a big bug fix for ACLs.
      
        When I was reviewing Jan Kara's ACL patch, I realized that Orangefs
        ACL code was busted, not just in the kernel module, but in the server
        as well. I've been working on the code in the server mostly, but
        here's one kernel patch, there will be more"
      
      * tag 'for-linus-4.14-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
        orangefs: Adjust three checks for null pointers
        orangefs: Use kcalloc() in orangefs_prepare_cdm_array()
        orangefs: Delete error messages for a failed memory allocation in five functions
        orangefs: constify xattr_handler structure
        orangefs: don't call filemap_write_and_wait from fsync
        orangefs: off by ones in xattr size checks
        orangefs: documentation clean up
        orangefs: react properly to posix_acl_update_mode's aftermath.
        orangefs: Don't clear SGID when inheriting ACLs
      30db202e
    • Dmitry Torokhov's avatar
      Merge branch 'next' into for-linus · bbc86087
      Dmitry Torokhov authored
      Prepare second round of input updates for 4.14 merge window.
      bbc86087
    • Kai-Heng Feng's avatar
      Input: i8042 - add Gigabyte P57 to the keyboard reset table · 697c5d8a
      Kai-Heng Feng authored
      Similar to other Gigabyte laptops, the touchpad on P57 requires a
      keyboard reset to detect Elantech touchpad correctly.
      
      BugLink: https://bugs.launchpad.net/bugs/1594214
      
      Signed-off-by: default avatarKai-Heng Feng <kai.heng.feng@canonical.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      697c5d8a
    • Jim Mattson's avatar
      kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly · 4f350c6d
      Jim Mattson authored
      
      When emulating a nested VM-entry from L1 to L2, several control field
      validation checks are deferred to the hardware. Should one of these
      validation checks fail, vcpu_vmx_run will set the vmx->fail flag. When
      this happens, the L2 guest state is not loaded (even in part), and
      execution should continue in L1 with the next instruction after the
      VMLAUNCH/VMRESUME.
      
      The VMCS12 is not modified (except for the VM-instruction error
      field), the VMCS12 MSR save/load lists are not processed, and the CPU
      state is not loaded from the VMCS12 host area. Moreover, the vmcs02
      exit reason is stale, so it should not be consulted for any reason.
      Signed-off-by: default avatarJim Mattson <jmattson@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      4f350c6d
    • Jim Mattson's avatar
      kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly · b060ca3b
      Jim Mattson authored
      
      On an early VMLAUNCH/VMRESUME failure (i.e. one which sets the
      VM-instruction error field of the current VMCS), the launch state of
      the current VMCS is not set to "launched," and the VM-exit information
      fields of the current VMCS (including IDT-vectoring information and
      exit reason) are stale.
      
      On a late VMLAUNCH/VMRESUME failure (i.e. one which sets the high bit
      of the exit reason field), the launch state of the current VMCS is not
      set to "launched," and only two of the VM-exit information fields of
      the current VMCS are modified (exit reason and exit
      qualification). The remaining VM-exit information fields of the
      current VMCS (including IDT-vectoring information, in particular) are
      stale.
      Signed-off-by: default avatarJim Mattson <jmattson@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      b060ca3b
    • Jim Mattson's avatar
      kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry · 7881f96c
      Jim Mattson authored
      
      After a successful VM-entry, RFLAGS is cleared, with the exception of
      bit 1, which is always set. This is handled by load_vmcs12_host_state.
      Signed-off-by: default avatarJim Mattson <jmattson@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      7881f96c