1. 14 Sep, 2019 3 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 36024fcf
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Don't corrupt xfrm_interface parms before validation, from Nicolas
          Dichtel.
      
       2) Revert use of usb-wakeup in btusb, from Mario Limonciello.
      
       3) Block ipv6 packets in bridge netfilter if ipv6 is disabled, from
          Leonardo Bras.
      
       4) IPS_OFFLOAD not honored in ctnetlink, from Pablo Neira Ayuso.
      
       5) Missing ULP check in sock_map, from John Fastabend.
      
       6) Fix receive statistic handling in forcedeth, from Zhu Yanjun.
      
       7) Fix length of SKB allocated in 6pack driver, from Christophe
          JAILLET.
      
       8) ip6_route_info_create() returns an error pointer, not NULL. From
          Maciej Żenczykowski.
      
       9) Only add RDS sock to the hashes after rs_transport is set, from
          Ka-Cheong Poon.
      
      10) Don't double clean TX descriptors in ixgbe, from Ilya Maximets.
      
      11) Presence of transmit IPSEC offload in an SKB is not tested for
          correctly in ixgbe and ixgbevf. From Steffen Klassert and Jeff
          Kirsher.
      
      12) Need rcu_barrier() when register_netdevice() takes one of the
          notifier based failure paths, from Subash Abhinov Kasiviswanathan.
      
      13) Fix leak in sctp_do_bind(), from Mao Wenan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
        cdc_ether: fix rndis support for Mediatek based smartphones
        sctp: destroy bucket if failed to bind addr
        sctp: remove redundant assignment when call sctp_get_port_local
        sctp: change return type of sctp_get_port_local
        ixgbevf: Fix secpath usage for IPsec Tx offload
        sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()'
        ixgbe: Fix secpath usage for IPsec TX offload.
        net: qrtr: fix memort leak in qrtr_tun_write_iter
        net: Fix null de-reference of device refcount
        ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()'
        tun: fix use-after-free when register netdev failed
        tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR
        ixgbe: fix double clean of Tx descriptors with xdp
        ixgbe: Prevent u8 wrapping of ITR value to something less than 10us
        mlx4: fix spelling mistake "veify" -> "verify"
        net: hns3: fix spelling mistake "undeflow" -> "underflow"
        net: lmc: fix spelling mistake "runnin" -> "running"
        NFC: st95hf: fix spelling mistake "receieve" -> "receive"
        net/rds: An rds_sock is added too early to the hash table
        mac80211: Do not send Layer 2 Update frame before authorization
        ...
      36024fcf
    • Linus Torvalds's avatar
      Merge tag 'mmc-v5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 1c4c5e25
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
      
       - tmio: Fixup runtime PM management during probe and remove
      
       - sdhci-pci-o2micro: Fix eMMC initialization for an AMD SoC
      
       - bcm2835: Prevent lockups when terminating work
      
      * tag 'mmc-v5.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: tmio: Fixup runtime PM management during remove
        mmc: tmio: Fixup runtime PM management during probe
        Revert "mmc: tmio: move runtime PM enablement to the driver implementations"
        Revert "mmc: sdhci: Remove unneeded quirk2 flag of O2 SD host controller"
        Revert "mmc: bcm2835: Terminate timeout work synchronously"
      1c4c5e25
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2019-09-13' of git://anongit.freedesktop.org/drm/drm · 592b8d87
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "From the maintainer summit, just some last minute fixes for final:
      
        lima:
         - fix gem_wait ioctl
      
        core:
         - constify modes list
      
        i915:
         - DP MST high color depth regression
         - GPU hangs on vulkan compute workloads"
      
      * tag 'drm-fixes-2019-09-13' of git://anongit.freedesktop.org/drm/drm:
        drm/lima: fix lima_gem_wait() return value
        drm/i915: Restore relaxed padding (OCL_OOB_SUPPRES_ENABLE) for skl+
        drm/i915: Limit MST to <= 8bpc once again
        drm/modes: Make the whitelist more const
      592b8d87
  2. 13 Sep, 2019 11 commits
    • Bjørn Mork's avatar
      cdc_ether: fix rndis support for Mediatek based smartphones · 4d7ffcf3
      Bjørn Mork authored
      A Mediatek based smartphone owner reports problems with USB
      tethering in Linux.  The verbose USB listing shows a rndis_host
      interface pair (e0/01/03 + 10/00/00), but the driver fails to
      bind with
      
      [  355.960428] usb 1-4: bad CDC descriptors
      
      The problem is a failsafe test intended to filter out ACM serial
      functions using the same 02/02/ff class/subclass/protocol as RNDIS.
      The serial functions are recognized by their non-zero bmCapabilities.
      
      No RNDIS function with non-zero bmCapabilities were known at the time
      this failsafe was added. But it turns out that some Wireless class
      RNDIS functions are using the bmCapabilities field. These functions
      are uniquely identified as RNDIS by their class/subclass/protocol, so
      the failing test can safely be disabled.  The same applies to the two
      types of Misc class RNDIS functions.
      
      Applying the failsafe to Communication class functions only retains
      the original functionality, and fixes the problem for the Mediatek based
      smartphone.
      
      Tow examples of CDC functional descriptors with non-zero bmCapabilities
      from Wireless class RNDIS functions are:
      
      0e8d:000a  Mediatek Crosscall Spider X5 3G Phone
      
            CDC Header:
              bcdCDC               1.10
            CDC ACM:
              bmCapabilities       0x0f
                connection notifications
                sends break
                line coding and serial state
                get/set/clear comm features
            CDC Union:
              bMasterInterface        0
              bSlaveInterface         1
            CDC Call Management:
              bmCapabilities       0x03
                call management
                use DataInterface
              bDataInterface          1
      
      and
      
      19d2:1023  ZTE K4201-z
      
            CDC Header:
              bcdCDC               1.10
            CDC ACM:
              bmCapabilities       0x02
                line coding and serial state
            CDC Call Management:
              bmCapabilities       0x03
                call management
                use DataInterface
              bDataInterface          1
            CDC Union:
              bMasterInterface        0
              bSlaveInterface         1
      
      The Mediatek example is believed to apply to most smartphones with
      Mediatek firmware.  The ZTE example is most likely also part of a larger
      family of devices/firmwares.
      Suggested-by: default avatarLars Melin <larsm17@gmail.com>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4d7ffcf3
    • David S. Miller's avatar
      Merge branch 'sctp_do_bind-leak' · ae3b06ed
      David S. Miller authored
      Mao Wenan says:
      
      ====================
      fix memory leak for sctp_do_bind
      
      First two patches are to do cleanup, remove redundant assignment,
      and change return type of sctp_get_port_local.
      Third patch is to fix memory leak for sctp_do_bind if failed
      to bind address.
      
      v2: add one patch to change return type of sctp_get_port_local.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ae3b06ed
    • Mao Wenan's avatar
      sctp: destroy bucket if failed to bind addr · 29b99f54
      Mao Wenan authored
      There is one memory leak bug report:
      BUG: memory leak
      unreferenced object 0xffff8881dc4c5ec0 (size 40):
        comm "syz-executor.0", pid 5673, jiffies 4298198457 (age 27.578s)
        hex dump (first 32 bytes):
          02 00 00 00 81 88 ff ff 00 00 00 00 00 00 00 00  ................
          f8 63 3d c1 81 88 ff ff 00 00 00 00 00 00 00 00  .c=.............
        backtrace:
          [<0000000072006339>] sctp_get_port_local+0x2a1/0xa00 [sctp]
          [<00000000c7b379ec>] sctp_do_bind+0x176/0x2c0 [sctp]
          [<000000005be274a2>] sctp_bind+0x5a/0x80 [sctp]
          [<00000000b66b4044>] inet6_bind+0x59/0xd0 [ipv6]
          [<00000000c68c7f42>] __sys_bind+0x120/0x1f0 net/socket.c:1647
          [<000000004513635b>] __do_sys_bind net/socket.c:1658 [inline]
          [<000000004513635b>] __se_sys_bind net/socket.c:1656 [inline]
          [<000000004513635b>] __x64_sys_bind+0x3e/0x50 net/socket.c:1656
          [<0000000061f2501e>] do_syscall_64+0x72/0x2e0 arch/x86/entry/common.c:296
          [<0000000003d1e05e>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      This is because in sctp_do_bind, if sctp_get_port_local is to
      create hash bucket successfully, and sctp_add_bind_addr failed
      to bind address, e.g return -ENOMEM, so memory leak found, it
      needs to destroy allocated bucket.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarMao Wenan <maowenan@huawei.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      29b99f54
    • Mao Wenan's avatar
      sctp: remove redundant assignment when call sctp_get_port_local · e0e4b8de
      Mao Wenan authored
      There are more parentheses in if clause when call sctp_get_port_local
      in sctp_do_bind, and redundant assignment to 'ret'. This patch is to
      do cleanup.
      Signed-off-by: default avatarMao Wenan <maowenan@huawei.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e0e4b8de
    • Mao Wenan's avatar
      sctp: change return type of sctp_get_port_local · 8e2ef6ab
      Mao Wenan authored
      Currently sctp_get_port_local() returns a long
      which is either 0,1 or a pointer casted to long.
      It's neither of the callers use the return value since
      commit 62208f12 ("net: sctp: simplify sctp_get_port").
      Now two callers are sctp_get_port and sctp_do_bind,
      they actually assumend a casted to an int was the same as
      a pointer casted to a long, and they don't save the return
      value just check whether it is zero or non-zero, so
      it would better change return type from long to int for
      sctp_get_port_local.
      Signed-off-by: default avatarMao Wenan <maowenan@huawei.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8e2ef6ab
    • Jeff Kirsher's avatar
      ixgbevf: Fix secpath usage for IPsec Tx offload · 8f6617ba
      Jeff Kirsher authored
      Port the same fix for ixgbe to ixgbevf.
      
      The ixgbevf driver currently does IPsec Tx offloading
      based on an existing secpath. However, the secpath
      can also come from the Rx side, in this case it is
      misinterpreted for Tx offload and the packets are
      dropped with a "bad sa_idx" error. Fix this by using
      the xfrm_offload() function to test for Tx offload.
      
      CC: Shannon Nelson <snelson@pensando.io>
      Fixes: 7f68d430 ("ixgbevf: enable VF IPsec offload operations")
      Reported-by: default avatarJonathan Tooker <jonathan@reliablehosting.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Acked-by: default avatarShannon Nelson <snelson@pensando.io>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8f6617ba
    • Ulf Hansson's avatar
      mmc: tmio: Fixup runtime PM management during remove · 87b5d602
      Ulf Hansson authored
      Accessing the device when it may be runtime suspended is a bug, which is
      the case in tmio_mmc_host_remove(). Let's fix the behaviour.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Tested-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      87b5d602
    • Ulf Hansson's avatar
      mmc: tmio: Fixup runtime PM management during probe · aa86f1a3
      Ulf Hansson authored
      The tmio_mmc_host_probe() calls pm_runtime_set_active() to update the
      runtime PM status of the device, as to make it reflect the current status
      of the HW. This works fine for most cases, but unfortunate not for all.
      Especially, there is a generic problem when the device has a genpd attached
      and that genpd have the ->start|stop() callbacks assigned.
      
      More precisely, if the driver calls pm_runtime_set_active() during
      ->probe(), genpd does not get to invoke the ->start() callback for it,
      which means the HW isn't really fully powered on. Furthermore, in the next
      phase, when the device becomes runtime suspended, genpd will invoke the
      ->stop() callback for it, potentially leading to usage count imbalance
      problems, depending on what's implemented behind the callbacks of course.
      
      To fix this problem, convert to call pm_runtime_get_sync() from
      tmio_mmc_host_probe() rather than pm_runtime_set_active(). Additionally, to
      avoid bumping usage counters and unnecessary re-initializing the HW the
      first time the tmio driver's ->runtime_resume() callback is called,
      introduce a state flag to keeping track of this.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Tested-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      aa86f1a3
    • Ulf Hansson's avatar
      Revert "mmc: tmio: move runtime PM enablement to the driver implementations" · 8861474a
      Ulf Hansson authored
      This reverts commit 7ff21319.
      
      It turns out that the above commit introduces other problems. For example,
      calling pm_runtime_set_active() must not be done prior calling
      pm_runtime_enable() as that makes it fail. This leads to additional
      problems, such as clock enables being wrongly balanced.
      
      Rather than fixing the problem on top, let's start over by doing a revert.
      
      Fixes: 7ff21319 ("mmc: tmio: move runtime PM enablement to the driver implementations")
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Tested-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      8861474a
    • Linus Torvalds's avatar
      Merge branch 'for-5.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · a7f89616
      Linus Torvalds authored
      Pull cgroup fix from Tejun Heo:
       "Roman found and fixed a bug in the cgroup2 freezer which allows new
        child cgroup to escape frozen state"
      
      * 'for-5.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: freezer: fix frozen state inheritance
        kselftests: cgroup: add freezer mkdir test
      a7f89616
    • Linus Torvalds's avatar
      Merge tag 'for-5.3-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 1b304a1a
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Here are two fixes, one of them urgent fixing a bug introduced in 5.2
        and reported by many users. It took time to identify the root cause,
        catching the 5.3 release is higly desired also to push the fix to 5.2
        stable tree.
      
        The bug is a mess up of return values after adding proper error
        handling and honestly the kind of bug that can cause sleeping
        disorders until it's caught. My appologies to everybody who was
        affected.
      
        Summary of what could happen:
      
        1) either a hang when committing a transaction, if this happens
           there's no risk of corruption, still the hang is very inconvenient
           and can't be resolved without a reboot
      
        2) writeback for some btree nodes may never be started and we end up
           committing a transaction without noticing that, this is really
           serious and that will lead to the "parent transid verify failed"
           messages"
      
      * tag 'for-5.3-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: fix unwritten extent buffers and hangs on future writeback attempts
        Btrfs: fix assertion failure during fsync and use of stale transaction
      1b304a1a
  3. 12 Sep, 2019 21 commits
    • Roman Gushchin's avatar
      cgroup: freezer: fix frozen state inheritance · 97a61369
      Roman Gushchin authored
      If a new child cgroup is created in the frozen cgroup hierarchy
      (one or more of ancestor cgroups is frozen), the CGRP_FREEZE cgroup
      flag should be set. Otherwise if a process will be attached to the
      child cgroup, it won't become frozen.
      
      The problem can be reproduced with the test_cgfreezer_mkdir test.
      
      This is the output before this patch:
        ~/test_freezer
        ok 1 test_cgfreezer_simple
        ok 2 test_cgfreezer_tree
        ok 3 test_cgfreezer_forkbomb
        Cgroup /sys/fs/cgroup/cg_test_mkdir_A/cg_test_mkdir_B isn't frozen
        not ok 4 test_cgfreezer_mkdir
        ok 5 test_cgfreezer_rmdir
        ok 6 test_cgfreezer_migrate
        ok 7 test_cgfreezer_ptrace
        ok 8 test_cgfreezer_stopped
        ok 9 test_cgfreezer_ptraced
        ok 10 test_cgfreezer_vfork
      
      And with this patch:
        ~/test_freezer
        ok 1 test_cgfreezer_simple
        ok 2 test_cgfreezer_tree
        ok 3 test_cgfreezer_forkbomb
        ok 4 test_cgfreezer_mkdir
        ok 5 test_cgfreezer_rmdir
        ok 6 test_cgfreezer_migrate
        ok 7 test_cgfreezer_ptrace
        ok 8 test_cgfreezer_stopped
        ok 9 test_cgfreezer_ptraced
        ok 10 test_cgfreezer_vfork
      Reported-by: default avatarMark Crossen <mcrossen@fb.com>
      Signed-off-by: default avatarRoman Gushchin <guro@fb.com>
      Fixes: 76f969e8 ("cgroup: cgroup v2 freezer")
      Cc: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org # v5.2+
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      97a61369
    • Roman Gushchin's avatar
      kselftests: cgroup: add freezer mkdir test · 44e9d308
      Roman Gushchin authored
      Add a new cgroup freezer selftest, which checks that if a cgroup is
      frozen, their new child cgroups will properly inherit the frozen
      state.
      
      It creates a parent cgroup, freezes it, creates a child cgroup
      and populates it with a dummy process. Then it checks that both
      parent and child cgroup are frozen.
      Signed-off-by: default avatarRoman Gushchin <guro@fb.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      44e9d308
    • Chris Wilson's avatar
      Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()" · 505a8ec7
      Chris Wilson authored
      The userptr put_pages can be called from inside try_to_unmap, and so
      enters with the page lock held on one of the object's backing pages. We
      cannot take the page lock ourselves for fear of recursion.
      Reported-by: default avatarLionel Landwerlin <lionel.g.landwerlin@intel.com>
      Reported-by: default avatarMartin Wilck <Martin.Wilck@suse.com>
      Reported-by: default avatarLeo Kraav <leho@kraav.com>
      Fixes: aa56a292 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
      References: https://bugzilla.kernel.org/show_bug.cgi?id=203317Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      505a8ec7
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20190912' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux · 98dcb386
      Linus Torvalds authored
      Pull clone3 fix from Christian Brauner:
       "This is a last-minute bugfix for clone3() that should go in before we
        release 5.3 with clone3().
      
        clone3() did not verify that the exit_signal argument was set to a
        valid signal. This can be used to cause a crash by specifying a signal
        greater than NSIG. e.g. -1.
      
        The commit from Eugene adds a check to copy_clone_args_from_user() to
        verify that the exit signal is limited by CSIGNAL as with legacy
        clone() and that the signal is valid. With this we don't get the
        legacy clone behavior were an invalid signal could be handed down and
        would only be detected and then ignored in do_notify_parent(). Users
        of clone3() will now get a proper error right when they pass an
        invalid exit signal. Note, that this is not a change in user-visible
        behavior since no kernel with clone3() has been released yet"
      
      * tag 'for-linus-20190912' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
        fork: block invalid exit signals with clone3()
      98dcb386
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 95217783
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "A KVM guest fix, and a kdump kernel relocation errors fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/timer: Force PIT initialization when !X86_FEATURE_ARAT
        x86/purgatory: Change compiler flags from -mcmodel=kernel to -mcmodel=large to fix kexec relocation errors
      95217783
    • Dave Airlie's avatar
      Merge tag 'drm-misc-fixes-2019-09-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · e6bb7116
      Dave Airlie authored
      drm-misc-fixes for v5.3 final:
      - Constify modes whitelist harder.
      - Fix lima driver gem_wait ioctl.
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      
      From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/99e52e7a-d4ce-6a2c-0501-bc559a710955@linux.intel.com
      e6bb7116
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2019-09-11' of... · 911ad0b6
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2019-09-11' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      Final drm/i915 fixes for v5.3:
      - Fox DP MST high color depth regression
      - Fix GPU hangs on Vulkan compute workloads
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      From: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/877e6e27qm.fsf@intel.com
      911ad0b6
    • Eugene Syromiatnikov's avatar
      fork: block invalid exit signals with clone3() · a0eb9abd
      Eugene Syromiatnikov authored
      Previously, higher 32 bits of exit_signal fields were lost when copied
      to the kernel args structure (that uses int as a type for the respective
      field). Moreover, as Oleg has noted, exit_signal is used unchecked, so
      it has to be checked for sanity before use; for the legacy syscalls,
      applying CSIGNAL mask guarantees that it is at least non-negative;
      however, there's no such thing is done in clone3() code path, and that
      can break at least thread_group_leader.
      
      This commit adds a check to copy_clone_args_from_user() to verify that
      the exit signal is limited by CSIGNAL as with legacy clone() and that
      the signal is valid. With this we don't get the legacy clone behavior
      were an invalid signal could be handed down and would only be detected
      and ignored in do_notify_parent(). Users of clone3() will now get a
      proper error when they pass an invalid exit signal. Note, that this is
      not user-visible behavior since no kernel with clone3() has been
      released yet.
      
      The following program will cause a splat on a non-fixed clone3() version
      and will fail correctly on a fixed version:
      
       #define _GNU_SOURCE
       #include <linux/sched.h>
       #include <linux/types.h>
       #include <sched.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <sys/syscall.h>
       #include <sys/wait.h>
       #include <unistd.h>
      
       int main(int argc, char *argv[])
       {
              pid_t pid = -1;
              struct clone_args args = {0};
              args.exit_signal = -1;
      
              pid = syscall(__NR_clone3, &args, sizeof(struct clone_args));
              if (pid < 0)
                      exit(EXIT_FAILURE);
      
              if (pid == 0)
                      exit(EXIT_SUCCESS);
      
              wait(NULL);
      
              exit(EXIT_SUCCESS);
       }
      
      Fixes: 7f192e3c ("fork: add clone3")
      Reported-by: default avatarOleg Nesterov <oleg@redhat.com>
      Suggested-by: default avatarOleg Nesterov <oleg@redhat.com>
      Suggested-by: default avatarDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: default avatarEugene Syromiatnikov <esyr@redhat.com>
      Link: https://lore.kernel.org/r/4b38fa4ce420b119a4c6345f42fe3cec2de9b0b5.1568223594.git.esyr@redhat.com
      [christian.brauner@ubuntu.com: simplify check and rework commit message]
      Signed-off-by: default avatarChristian Brauner <christian.brauner@ubuntu.com>
      a0eb9abd
    • Christophe JAILLET's avatar
      sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()' · b456d724
      Christophe JAILLET authored
      The '.exit' functions from 'pernet_operations' structure should be marked
      as __net_exit, not __net_init.
      
      Fixes: 8e2d61e0 ("sctp: fix race on protocol/netns initialization")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b456d724
    • Steffen Klassert's avatar
      ixgbe: Fix secpath usage for IPsec TX offload. · f39b683d
      Steffen Klassert authored
      The ixgbe driver currently does IPsec TX offloading
      based on an existing secpath. However, the secpath
      can also come from the RX side, in this case it is
      misinterpreted for TX offload and the packets are
      dropped with a "bad sa_idx" error. Fix this by using
      the xfrm_offload() function to test for TX offload.
      
      Fixes: 59259470 ("ixgbe: process the Tx ipsec offload")
      Reported-by: default avatarMichael Marley <michael@michaelmarley.com>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f39b683d
    • Filipe Manana's avatar
      Btrfs: fix unwritten extent buffers and hangs on future writeback attempts · 18dfa711
      Filipe Manana authored
      The lock_extent_buffer_io() returns 1 to the caller to tell it everything
      went fine and the callers needs to start writeback for the extent buffer
      (submit a bio, etc), 0 to tell the caller everything went fine but it does
      not need to start writeback for the extent buffer, and a negative value if
      some error happened.
      
      When it's about to return 1 it tries to lock all pages, and if a try lock
      on a page fails, and we didn't flush any existing bio in our "epd", it
      calls flush_write_bio(epd) and overwrites the return value of 1 to 0 or
      an error. The page might have been locked elsewhere, not with the goal
      of starting writeback of the extent buffer, and even by some code other
      than btrfs, like page migration for example, so it does not mean the
      writeback of the extent buffer was already started by some other task,
      so returning a 0 tells the caller (btree_write_cache_pages()) to not
      start writeback for the extent buffer. Note that epd might currently have
      either no bio, so flush_write_bio() returns 0 (success) or it might have
      a bio for another extent buffer with a lower index (logical address).
      
      Since we return 0 with the EXTENT_BUFFER_WRITEBACK bit set on the
      extent buffer and writeback is never started for the extent buffer,
      future attempts to writeback the extent buffer will hang forever waiting
      on that bit to be cleared, since it can only be cleared after writeback
      completes. Such hang is reported with a trace like the following:
      
        [49887.347053] INFO: task btrfs-transacti:1752 blocked for more than 122 seconds.
        [49887.347059]       Not tainted 5.2.13-gentoo #2
        [49887.347060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        [49887.347062] btrfs-transacti D    0  1752      2 0x80004000
        [49887.347064] Call Trace:
        [49887.347069]  ? __schedule+0x265/0x830
        [49887.347071]  ? bit_wait+0x50/0x50
        [49887.347072]  ? bit_wait+0x50/0x50
        [49887.347074]  schedule+0x24/0x90
        [49887.347075]  io_schedule+0x3c/0x60
        [49887.347077]  bit_wait_io+0x8/0x50
        [49887.347079]  __wait_on_bit+0x6c/0x80
        [49887.347081]  ? __lock_release.isra.29+0x155/0x2d0
        [49887.347083]  out_of_line_wait_on_bit+0x7b/0x80
        [49887.347084]  ? var_wake_function+0x20/0x20
        [49887.347087]  lock_extent_buffer_for_io+0x28c/0x390
        [49887.347089]  btree_write_cache_pages+0x18e/0x340
        [49887.347091]  do_writepages+0x29/0xb0
        [49887.347093]  ? kmem_cache_free+0x132/0x160
        [49887.347095]  ? convert_extent_bit+0x544/0x680
        [49887.347097]  filemap_fdatawrite_range+0x70/0x90
        [49887.347099]  btrfs_write_marked_extents+0x53/0x120
        [49887.347100]  btrfs_write_and_wait_transaction.isra.4+0x38/0xa0
        [49887.347102]  btrfs_commit_transaction+0x6bb/0x990
        [49887.347103]  ? start_transaction+0x33e/0x500
        [49887.347105]  transaction_kthread+0x139/0x15c
      
      So fix this by not overwriting the return value (ret) with the result
      from flush_write_bio(). We also need to clear the EXTENT_BUFFER_WRITEBACK
      bit in case flush_write_bio() returns an error, otherwise it will hang
      any future attempts to writeback the extent buffer, and undo all work
      done before (set back EXTENT_BUFFER_DIRTY, etc).
      
      This is a regression introduced in the 5.2 kernel.
      
      Fixes: 2e3c2513 ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()")
      Fixes: f4340622 ("btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up")
      Reported-by: default avatarZdenek Sojka <zsojka@seznam.cz>
      Link: https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#uReported-by: default avatarStefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Link: https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#tReported-by: default avatarDrazen Kacar <drazen.kacar@oradian.com>
      Link: https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      18dfa711
    • Filipe Manana's avatar
      Btrfs: fix assertion failure during fsync and use of stale transaction · 410f954c
      Filipe Manana authored
      Sometimes when fsync'ing a file we need to log that other inodes exist and
      when we need to do that we acquire a reference on the inodes and then drop
      that reference using iput() after logging them.
      
      That generally is not a problem except if we end up doing the final iput()
      (dropping the last reference) on the inode and that inode has a link count
      of 0, which can happen in a very short time window if the logging path
      gets a reference on the inode while it's being unlinked.
      
      In that case we end up getting the eviction callback, btrfs_evict_inode(),
      invoked through the iput() call chain which needs to drop all of the
      inode's items from its subvolume btree, and in order to do that, it needs
      to join a transaction at the helper function evict_refill_and_join().
      However because the task previously started a transaction at the fsync
      handler, btrfs_sync_file(), it has current->journal_info already pointing
      to a transaction handle and therefore evict_refill_and_join() will get
      that transaction handle from btrfs_join_transaction(). From this point on,
      two different problems can happen:
      
      1) evict_refill_and_join() will often change the transaction handle's
         block reserve (->block_rsv) and set its ->bytes_reserved field to a
         value greater than 0. If evict_refill_and_join() never commits the
         transaction, the eviction handler ends up decreasing the reference
         count (->use_count) of the transaction handle through the call to
         btrfs_end_transaction(), and after that point we have a transaction
         handle with a NULL ->block_rsv (which is the value prior to the
         transaction join from evict_refill_and_join()) and a ->bytes_reserved
         value greater than 0. If after the eviction/iput completes the inode
         logging path hits an error or it decides that it must fallback to a
         transaction commit, the btrfs fsync handle, btrfs_sync_file(), gets a
         non-zero value from btrfs_log_dentry_safe(), and because of that
         non-zero value it tries to commit the transaction using a handle with
         a NULL ->block_rsv and a non-zero ->bytes_reserved value. This makes
         the transaction commit hit an assertion failure at
         btrfs_trans_release_metadata() because ->bytes_reserved is not zero but
         the ->block_rsv is NULL. The produced stack trace for that is like the
         following:
      
         [192922.917158] assertion failed: !trans->bytes_reserved, file: fs/btrfs/transaction.c, line: 816
         [192922.917553] ------------[ cut here ]------------
         [192922.917922] kernel BUG at fs/btrfs/ctree.h:3532!
         [192922.918310] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
         [192922.918666] CPU: 2 PID: 883 Comm: fsstress Tainted: G        W         5.1.4-btrfs-next-47 #1
         [192922.919035] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
         [192922.919801] RIP: 0010:assfail.constprop.25+0x18/0x1a [btrfs]
         (...)
         [192922.920925] RSP: 0018:ffffaebdc8a27da8 EFLAGS: 00010286
         [192922.921315] RAX: 0000000000000051 RBX: ffff95c9c16a41c0 RCX: 0000000000000000
         [192922.921692] RDX: 0000000000000000 RSI: ffff95cab6b16838 RDI: ffff95cab6b16838
         [192922.922066] RBP: ffff95c9c16a41c0 R08: 0000000000000000 R09: 0000000000000000
         [192922.922442] R10: ffffaebdc8a27e70 R11: 0000000000000000 R12: ffff95ca731a0980
         [192922.922820] R13: 0000000000000000 R14: ffff95ca84c73338 R15: ffff95ca731a0ea8
         [192922.923200] FS:  00007f337eda4e80(0000) GS:ffff95cab6b00000(0000) knlGS:0000000000000000
         [192922.923579] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
         [192922.923948] CR2: 00007f337edad000 CR3: 00000001e00f6002 CR4: 00000000003606e0
         [192922.924329] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
         [192922.924711] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
         [192922.925105] Call Trace:
         [192922.925505]  btrfs_trans_release_metadata+0x10c/0x170 [btrfs]
         [192922.925911]  btrfs_commit_transaction+0x3e/0xaf0 [btrfs]
         [192922.926324]  btrfs_sync_file+0x44c/0x490 [btrfs]
         [192922.926731]  do_fsync+0x38/0x60
         [192922.927138]  __x64_sys_fdatasync+0x13/0x20
         [192922.927543]  do_syscall_64+0x60/0x1c0
         [192922.927939]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
         (...)
         [192922.934077] ---[ end trace f00808b12068168f ]---
      
      2) If evict_refill_and_join() decides to commit the transaction, it will
         be able to do it, since the nested transaction join only increments the
         transaction handle's ->use_count reference counter and it does not
         prevent the transaction from getting committed. This means that after
         eviction completes, the fsync logging path will be using a transaction
         handle that refers to an already committed transaction. What happens
         when using such a stale transaction can be unpredictable, we are at
         least having a use-after-free on the transaction handle itself, since
         the transaction commit will call kmem_cache_free() against the handle
         regardless of its ->use_count value, or we can end up silently losing
         all the updates to the log tree after that iput() in the logging path,
         or using a transaction handle that in the meanwhile was allocated to
         another task for a new transaction, etc, pretty much unpredictable
         what can happen.
      
      In order to fix both of them, instead of using iput() during logging, use
      btrfs_add_delayed_iput(), so that the logging path of fsync never drops
      the last reference on an inode, that step is offloaded to a safe context
      (usually the cleaner kthread).
      
      The assertion failure issue was sporadically triggered by the test case
      generic/475 from fstests, which loads the dm error target while fsstress
      is running, which lead to fsync failing while logging inodes with -EIO
      errors and then trying later to commit the transaction, triggering the
      assertion failure.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      410f954c
    • Navid Emamdoost's avatar
      net: qrtr: fix memort leak in qrtr_tun_write_iter · a21b7f0c
      Navid Emamdoost authored
      In qrtr_tun_write_iter the allocated kbuf should be release in case of
      error or success return.
      
      v2 Update: Thanks to David Miller for pointing out the release on success
      path as well.
      Signed-off-by: default avatarNavid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a21b7f0c
    • Subash Abhinov Kasiviswanathan's avatar
      net: Fix null de-reference of device refcount · 10cc514f
      Subash Abhinov Kasiviswanathan authored
      In event of failure during register_netdevice, free_netdev is
      invoked immediately. free_netdev assumes that all the netdevice
      refcounts have been dropped prior to it being called and as a
      result frees and clears out the refcount pointer.
      
      However, this is not necessarily true as some of the operations
      in the NETDEV_UNREGISTER notifier handlers queue RCU callbacks for
      invocation after a grace period. The IPv4 callback in_dev_rcu_put
      tries to access the refcount after free_netdev is called which
      leads to a null de-reference-
      
      44837.761523:   <6> Unable to handle kernel paging request at
                          virtual address 0000004a88287000
      44837.761651:   <2> pc : in_dev_finish_destroy+0x4c/0xc8
      44837.761654:   <2> lr : in_dev_finish_destroy+0x2c/0xc8
      44837.762393:   <2> Call trace:
      44837.762398:   <2>  in_dev_finish_destroy+0x4c/0xc8
      44837.762404:   <2>  in_dev_rcu_put+0x24/0x30
      44837.762412:   <2>  rcu_nocb_kthread+0x43c/0x468
      44837.762418:   <2>  kthread+0x118/0x128
      44837.762424:   <2>  ret_from_fork+0x10/0x1c
      
      Fix this by waiting for the completion of the call_rcu() in
      case of register_netdevice errors.
      
      Fixes: 93ee31f1 ("[NET]: Fix free_netdev on register_netdev failure.")
      Cc: Sean Tranchetti <stranche@codeaurora.org>
      Signed-off-by: default avatarSubash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10cc514f
    • Christophe JAILLET's avatar
      ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()' · d23dbc47
      Christophe JAILLET authored
      The '.exit' functions from 'pernet_operations' structure should be marked
      as __net_exit, not __net_init.
      
      Fixes: d862e546 ("net: ipv6: Implement /proc/net/icmp6.")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d23dbc47
    • Yang Yingliang's avatar
      tun: fix use-after-free when register netdev failed · 77f22f92
      Yang Yingliang authored
      I got a UAF repport in tun driver when doing fuzzy test:
      
      [  466.269490] ==================================================================
      [  466.271792] BUG: KASAN: use-after-free in tun_chr_read_iter+0x2ca/0x2d0
      [  466.271806] Read of size 8 at addr ffff888372139250 by task tun-test/2699
      [  466.271810]
      [  466.271824] CPU: 1 PID: 2699 Comm: tun-test Not tainted 5.3.0-rc1-00001-g5a9433db2614-dirty #427
      [  466.271833] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [  466.271838] Call Trace:
      [  466.271858]  dump_stack+0xca/0x13e
      [  466.271871]  ? tun_chr_read_iter+0x2ca/0x2d0
      [  466.271890]  print_address_description+0x79/0x440
      [  466.271906]  ? vprintk_func+0x5e/0xf0
      [  466.271920]  ? tun_chr_read_iter+0x2ca/0x2d0
      [  466.271935]  __kasan_report+0x15c/0x1df
      [  466.271958]  ? tun_chr_read_iter+0x2ca/0x2d0
      [  466.271976]  kasan_report+0xe/0x20
      [  466.271987]  tun_chr_read_iter+0x2ca/0x2d0
      [  466.272013]  do_iter_readv_writev+0x4b7/0x740
      [  466.272032]  ? default_llseek+0x2d0/0x2d0
      [  466.272072]  do_iter_read+0x1c5/0x5e0
      [  466.272110]  vfs_readv+0x108/0x180
      [  466.299007]  ? compat_rw_copy_check_uvector+0x440/0x440
      [  466.299020]  ? fsnotify+0x888/0xd50
      [  466.299040]  ? __fsnotify_parent+0xd0/0x350
      [  466.299064]  ? fsnotify_first_mark+0x1e0/0x1e0
      [  466.304548]  ? vfs_write+0x264/0x510
      [  466.304569]  ? ksys_write+0x101/0x210
      [  466.304591]  ? do_preadv+0x116/0x1a0
      [  466.304609]  do_preadv+0x116/0x1a0
      [  466.309829]  do_syscall_64+0xc8/0x600
      [  466.309849]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  466.309861] RIP: 0033:0x4560f9
      [  466.309875] Code: 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      [  466.309889] RSP: 002b:00007ffffa5166e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000127
      [  466.322992] RAX: ffffffffffffffda RBX: 0000000000400460 RCX: 00000000004560f9
      [  466.322999] RDX: 0000000000000003 RSI: 00000000200008c0 RDI: 0000000000000003
      [  466.323007] RBP: 00007ffffa516700 R08: 0000000000000004 R09: 0000000000000000
      [  466.323014] R10: 0000000000000000 R11: 0000000000000206 R12: 000000000040cb10
      [  466.323021] R13: 0000000000000000 R14: 00000000006d7018 R15: 0000000000000000
      [  466.323057]
      [  466.323064] Allocated by task 2605:
      [  466.335165]  save_stack+0x19/0x80
      [  466.336240]  __kasan_kmalloc.constprop.8+0xa0/0xd0
      [  466.337755]  kmem_cache_alloc+0xe8/0x320
      [  466.339050]  getname_flags+0xca/0x560
      [  466.340229]  user_path_at_empty+0x2c/0x50
      [  466.341508]  vfs_statx+0xe6/0x190
      [  466.342619]  __do_sys_newstat+0x81/0x100
      [  466.343908]  do_syscall_64+0xc8/0x600
      [  466.345303]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  466.347034]
      [  466.347517] Freed by task 2605:
      [  466.348471]  save_stack+0x19/0x80
      [  466.349476]  __kasan_slab_free+0x12e/0x180
      [  466.350726]  kmem_cache_free+0xc8/0x430
      [  466.351874]  putname+0xe2/0x120
      [  466.352921]  filename_lookup+0x257/0x3e0
      [  466.354319]  vfs_statx+0xe6/0x190
      [  466.355498]  __do_sys_newstat+0x81/0x100
      [  466.356889]  do_syscall_64+0xc8/0x600
      [  466.358037]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  466.359567]
      [  466.360050] The buggy address belongs to the object at ffff888372139100
      [  466.360050]  which belongs to the cache names_cache of size 4096
      [  466.363735] The buggy address is located 336 bytes inside of
      [  466.363735]  4096-byte region [ffff888372139100, ffff88837213a100)
      [  466.367179] The buggy address belongs to the page:
      [  466.368604] page:ffffea000dc84e00 refcount:1 mapcount:0 mapping:ffff8883df1b4f00 index:0x0 compound_mapcount: 0
      [  466.371582] flags: 0x2fffff80010200(slab|head)
      [  466.372910] raw: 002fffff80010200 dead000000000100 dead000000000122 ffff8883df1b4f00
      [  466.375209] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
      [  466.377778] page dumped because: kasan: bad access detected
      [  466.379730]
      [  466.380288] Memory state around the buggy address:
      [  466.381844]  ffff888372139100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.384009]  ffff888372139180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.386131] >ffff888372139200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.388257]                                                  ^
      [  466.390234]  ffff888372139280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.392512]  ffff888372139300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  466.394667] ==================================================================
      
      tun_chr_read_iter() accessed the memory which freed by free_netdev()
      called by tun_set_iff():
      
              CPUA                                           CPUB
        tun_set_iff()
          alloc_netdev_mqs()
          tun_attach()
                                                        tun_chr_read_iter()
                                                          tun_get()
                                                          tun_do_read()
                                                            tun_ring_recv()
          register_netdevice() <-- inject error
          goto err_detach
          tun_detach_all() <-- set RCV_SHUTDOWN
          free_netdev() <-- called from
                           err_free_dev path
            netdev_freemem() <-- free the memory
                              without check refcount
            (In this path, the refcount cannot prevent
             freeing the memory of dev, and the memory
             will be used by dev_put() called by
             tun_chr_read_iter() on CPUB.)
                                                           (Break from tun_ring_recv(),
                                                           because RCV_SHUTDOWN is set)
                                                         tun_put()
                                                           dev_put() <-- use the memory
                                                                         freed by netdev_freemem()
      
      Put the publishing of tfile->tun after register_netdevice(),
      so tun_get() won't get the tun pointer that freed by
      err_detach path if register_netdevice() failed.
      
      Fixes: eb0fb363 ("tuntap: attach queue 0 before registering netdevice")
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Suggested-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      77f22f92
    • Linus Torvalds's avatar
      Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · ad32b480
      Linus Torvalds authored
      Pull virtio fixes from Michael Tsirkin:
       "Last minute bugfixes.
      
        A couple of security things.
      
        And an error handling bugfix that is never encountered by most people,
        but that also makes it kind of safe to push at the last minute, and it
        helps push the fix to stable a bit sooner"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vhost: make sure log_num < in_num
        vhost: block speculation of translated descriptors
        virtio_ring: fix unmap of indirect descriptors
      ad32b480
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6dcf6a4e
      Linus Torvalds authored
      Pull perf fix from Ingo Molnar:
       "Fix an initialization bug in the hw-breakpoints, which triggered on
        the ARM platform"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/hw_breakpoint: Fix arch_hw_breakpoint use-before-initialization
      6dcf6a4e
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 95779fe8
      Linus Torvalds authored
      Pull irq fix from Ingo Molnar:
       "Fix a race in the IRQ resend mechanism, which can result in a NULL
        dereference crash"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Prevent NULL pointer dereference in resend_irqs()
      95779fe8
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 840ce8f8
      Linus Torvalds authored
      Pull pin control fix from Linus Walleij:
       "Hopefully last pin control fix: a single patch for some Aspeed
        problems. The BMCs are much happier now"
      
      * tag 'pinctrl-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: aspeed: Fix spurious mux failures on the AST2500
      840ce8f8
    • Linus Torvalds's avatar
      Merge tag 'gpio-v5.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 9c09f623
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
       "I don't really like to send so many fixes at the very last minute, but
        the bug-sport activity is unpredictable.
      
        Four fixes, three are -stable material that will go everywhere, one is
        for the current cycle:
      
         - An ACPI DSDT error fixup of the type we always see and Hans
           invariably gets to fix.
      
         - A OF quirk fix for the current release (v5.3)
      
         - Some consistency checks on the userspace ABI.
      
         - A memory leak"
      
      * tag 'gpio-v5.3-6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpiolib: acpi: Add gpiolib_acpi_run_edge_events_on_boot option and blacklist
        gpiolib: of: fix fallback quirks handling
        gpio: fix line flag validation in lineevent_create
        gpio: fix line flag validation in linehandle_create
        gpio: mockup: add missing single_release()
      9c09f623
  4. 11 Sep, 2019 5 commits
    • Andrew Jeffery's avatar
      pinctrl: aspeed: Fix spurious mux failures on the AST2500 · c1432423
      Andrew Jeffery authored
      Commit 674fa8da ("pinctrl: aspeed-g5: Delay acquisition of regmaps")
      was determined to be a partial fix to the problem of acquiring the LPC
      Host Controller and GFX regmaps: The AST2500 pin controller may need to
      fetch syscon regmaps during expression evaluation as well as when
      setting mux state. For example, this case is hit by attempting to export
      pins exposing the LPC Host Controller as GPIOs.
      
      An optional eval() hook is added to the Aspeed pinmux operation struct
      and called from aspeed_sig_expr_eval() if the pointer is set by the
      SoC-specific driver. This enables the AST2500 to perform the custom
      action of acquiring its regmap dependencies as required.
      
      John Wang tested the fix on an Inspur FP5280G2 machine (AST2500-based)
      where the issue was found, and I've booted the fix on Witherspoon
      (AST2500) and Palmetto (AST2400) machines, and poked at relevant pins
      under QEMU by forcing mux configurations via devmem before exporting
      GPIOs to exercise the driver.
      
      Fixes: 7d29ed88 ("pinctrl: aspeed: Read and write bits in LPC and GFX controllers")
      Fixes: 674fa8da ("pinctrl: aspeed-g5: Delay acquisition of regmaps")
      Reported-by: default avatarJohn Wang <wangzqbj@inspur.com>
      Tested-by: default avatarJohn Wang <wangzqbj@inspur.com>
      Signed-off-by: default avatarAndrew Jeffery <andrew@aj.id.au>
      
      Link: https://lore.kernel.org/r/20190829071738.2523-1-andrew@aj.id.auSigned-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      c1432423
    • David S. Miller's avatar
      Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · 13d5231c
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2019-09-11
      
      This series contains fixes to ixgbe.
      
      Alex fixes up the adaptive ITR scheme for ixgbe which could result in a
      value that was either 0 or something less than 10 which was causing
      issues with hardware features, like RSC, that do not function well with
      ITR values that low.
      
      Ilya Maximets fixes the ixgbe driver to limit the number of transmit
      descriptors to clean by the number of transmit descriptors used in the
      transmit ring, so that the driver does not try to "double" clean the
      same descriptors.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      13d5231c
    • Neal Cardwell's avatar
      tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR · af38d07e
      Neal Cardwell authored
      Fix tcp_ecn_withdraw_cwr() to clear the correct bit:
      TCP_ECN_QUEUE_CWR.
      
      Rationale: basically, TCP_ECN_DEMAND_CWR is a bit that is purely about
      the behavior of data receivers, and deciding whether to reflect
      incoming IP ECN CE marks as outgoing TCP th->ece marks. The
      TCP_ECN_QUEUE_CWR bit is purely about the behavior of data senders,
      and deciding whether to send CWR. The tcp_ecn_withdraw_cwr() function
      is only called from tcp_undo_cwnd_reduction() by data senders during
      an undo, so it should zero the sender-side state,
      TCP_ECN_QUEUE_CWR. It does not make sense to stop the reflection of
      incoming CE bits on incoming data packets just because outgoing
      packets were spuriously retransmitted.
      
      The bug has been reproduced with packetdrill to manifest in a scenario
      with RFC3168 ECN, with an incoming data packet with CE bit set and
      carrying a TCP timestamp value that causes cwnd undo. Before this fix,
      the IP CE bit was ignored and not reflected in the TCP ECE header bit,
      and sender sent a TCP CWR ('W') bit on the next outgoing data packet,
      even though the cwnd reduction had been undone.  After this fix, the
      sender properly reflects the CE bit and does not set the W bit.
      
      Note: the bug actually predates 2005 git history; this Fixes footer is
      chosen to be the oldest SHA1 I have tested (from Sep 2007) for which
      the patch applies cleanly (since before this commit the code was in a
      .h file).
      
      Fixes: bdf1ee5d ("[TCP]: Move code from tcp_ecn.h to tcp*.c and tcp.h & remove it")
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      af38d07e
    • yongduan's avatar
      vhost: make sure log_num < in_num · 060423bf
      yongduan authored
      The code assumes log_num < in_num everywhere, and that is true as long as
      in_num is incremented by descriptor iov count, and log_num by 1. However
      this breaks if there's a zero sized descriptor.
      
      As a result, if a malicious guest creates a vring desc with desc.len = 0,
      it may cause the host kernel to crash by overflowing the log array. This
      bug can be triggered during the VM migration.
      
      There's no need to log when desc.len = 0, so just don't increment log_num
      in this case.
      
      Fixes: 3a4d5c94 ("vhost_net: a kernel-level virtio server")
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarLidong Chen <lidongchen@tencent.com>
      Signed-off-by: default avatarruippan <ruippan@tencent.com>
      Signed-off-by: default avataryongduan <yongduan@tencent.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      060423bf
    • Michael S. Tsirkin's avatar
      vhost: block speculation of translated descriptors · a89db445
      Michael S. Tsirkin authored
      iovec addresses coming from vhost are assumed to be
      pre-validated, but in fact can be speculated to a value
      out of range.
      
      Userspace address are later validated with array_index_nospec so we can
      be sure kernel info does not leak through these addresses, but vhost
      must also not leak userspace info outside the allowed memory table to
      guests.
      
      Following the defence in depth principle, make sure
      the address is not validated out of node range.
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Cc: stable@vger.kernel.org
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Tested-by: default avatarJason Wang <jasowang@redhat.com>
      a89db445