1. 05 Apr, 2019 20 commits
    • Peng Fan's avatar
      mm/cma.c: cma_declare_contiguous: correct err handling · a5329016
      Peng Fan authored
      [ Upstream commit 0d3bd18a ]
      
      In case cma_init_reserved_mem failed, need to free the memblock
      allocated by memblock_reserve or memblock_alloc_range.
      
      Quote Catalin's comments:
        https://lkml.org/lkml/2019/2/26/482
      
      Kmemleak is supposed to work with the memblock_{alloc,free} pair and it
      ignores the memblock_reserve() as a memblock_alloc() implementation
      detail. It is, however, tolerant to memblock_free() being called on
      a sub-range or just a different range from a previous memblock_alloc().
      So the original patch looks fine to me. FWIW:
      
      Link: http://lkml.kernel.org/r/20190227144631.16708-1-peng.fan@nxp.comSigned-off-by: default avatarPeng Fan <peng.fan@nxp.com>
      Reviewed-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a5329016
    • Jiri Olsa's avatar
      perf c2c: Fix c2c report for empty numa node · 02241b9c
      Jiri Olsa authored
      [ Upstream commit e34c9402 ]
      
      Ravi Bangoria reported that we fail with an empty NUMA node with the
      following message:
      
        $ lscpu
        NUMA node0 CPU(s):
        NUMA node1 CPU(s):   0-4
      
        $ sudo ./perf c2c report
        node/cpu topology bugFailed setup nodes
      
      Fix this by detecting the empty node and keeping its CPU set empty.
      Reported-by: default avatarNageswara R Sastry <nasastry@in.ibm.com>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190305152536.21035-2-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      02241b9c
    • Linus Torvalds's avatar
      iio: adc: fix warning in Qualcomm PM8xxx HK/XOADC driver · a5c4c090
      Linus Torvalds authored
      [ Upstream commit e0f0ae83 ]
      
      The pm8xxx_get_channel() implementation is unclear, and causes gcc to
      suddenly generate odd warnings.  The trigger for the warning (at least
      for me) was the entirely unrelated commit 79a4e91d ("device.h: Add
      __cold to dev_<level> logging functions"), which apparently changes gcc
      code generation in the caller function enough to cause this:
      
        drivers/iio/adc/qcom-pm8xxx-xoadc.c: In function ‘pm8xxx_xoadc_probe’:
        drivers/iio/adc/qcom-pm8xxx-xoadc.c:633:8: warning: ‘ch’ may be used uninitialized in this function [-Wmaybe-uninitialized]
          ret = pm8xxx_read_channel_rsv(adc, ch, AMUX_RSV4,
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                   &read_nomux_rsv4, true);
                   ~~~~~~~~~~~~~~~~~~~~~~~
        drivers/iio/adc/qcom-pm8xxx-xoadc.c:426:27: note: ‘ch’ was declared here
          struct pm8xxx_chan_info *ch;
                                   ^~
      
      because gcc for some reason then isn't able to see that the termination
      condition for the "for( )" loop in that function is also the condition
      for returning NULL.
      
      So it's not _actually_ uninitialized, but the function is admittedly
      just unnecessarily oddly written.
      
      Simplify and clarify the function, making gcc also see that it always
      returns a valid initialized value.
      
      Cc: Joe Perches <joe@perches.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Andy Gross <andy.gross@linaro.org>
      Cc: David Brown <david.brown@linaro.org>
      Cc: Jonathan Cameron <jic23@kernel.org>
      Cc: Hartmut Knaack <knaack.h@gmx.de>
      Cc: Lars-Peter Clausen <lars@metafoo.de>
      Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a5c4c090
    • John Garry's avatar
      scsi: hisi_sas: Set PHY linkrate when disconnected · e1b85487
      John Garry authored
      [ Upstream commit efdcad62 ]
      
      When the PHY comes down, we currently do not set the negotiated linkrate:
      
      root@(none)$ pwd
      /sys/class/sas_phy/phy-0:0
      root@(none)$ more enable
      1
      root@(none)$ more negotiated_linkrate
      12.0 Gbit
      root@(none)$ echo 0 > enable
      root@(none)$ more negotiated_linkrate
      12.0 Gbit
      root@(none)$
      
      This patch fixes the driver code to set it properly when the PHY comes
      down.
      
      If the PHY had been enabled, then set unknown; otherwise, flag as disabled.
      
      The logical place to set the negotiated linkrate for this scenario is PHY
      down routine, which is called from the PHY down ISR.
      
      However, it is not possible to know if the PHY comes down due to PHY
      disable or loss of link, as sas_phy.enabled member is not set until after
      the transport disable routine is complete, which races with the PHY down
      ISR.
      
      As an imperfect solution, use sas_phy_data.enable as the flag to know if
      the PHY is down due to disable. It's imperfect, as sas_phy_data is internal
      to libsas.
      
      I can't see another way without adding a new field to hisi_sas_phy and
      managing it, or changing SCSI SAS transport.
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e1b85487
    • Arnd Bergmann's avatar
      enic: fix build warning without CONFIG_CPUMASK_OFFSTACK · c59f7575
      Arnd Bergmann authored
      [ Upstream commit 43d28166 ]
      
      The enic driver relies on the CONFIG_CPUMASK_OFFSTACK feature to
      dynamically allocate a struct member, but this is normally intended for
      local variables.
      
      Building with clang, I get a warning for a few locations that check the
      address of the cpumask_var_t:
      
      drivers/net/ethernet/cisco/enic/enic_main.c:122:22: error: address of array 'enic->msix[i].affinity_mask' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
      
      As far as I can tell, the code is still correct, as the truth value of
      the pointer is what we need in this configuration. To get rid of
      the warning, use cpumask_available() instead of checking the
      pointer directly.
      
      Fixes: 322cf7e3 ("enic: assign affinity hint to interrupts")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c59f7575
    • Christian Brauner's avatar
      sysctl: handle overflow for file-max · 1306ff8b
      Christian Brauner authored
      [ Upstream commit 32a5ad9c ]
      
      Currently, when writing
      
        echo 18446744073709551616 > /proc/sys/fs/file-max
      
      /proc/sys/fs/file-max will overflow and be set to 0.  That quickly
      crashes the system.
      
      This commit sets the max and min value for file-max.  The max value is
      set to long int.  Any higher value cannot currently be used as the
      percpu counters are long ints and not unsigned integers.
      
      Note that the file-max value is ultimately parsed via
      __do_proc_doulongvec_minmax().  This function does not report error when
      min or max are exceeded.  Which means if a value largen that long int is
      written userspace will not receive an error instead the old value will be
      kept.  There is an argument to be made that this should be changed and
      __do_proc_doulongvec_minmax() should return an error when a dedicated min
      or max value are exceeded.  However this has the potential to break
      userspace so let's defer this to an RFC patch.
      
      Link: http://lkml.kernel.org/r/20190107222700.15954-3-christian@brauner.ioSigned-off-by: default avatarChristian Brauner <christian@brauner.io>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Joe Lawrence <joe.lawrence@redhat.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Waiman Long <longman@redhat.com>
      [christian@brauner.io: v4]
        Link: http://lkml.kernel.org/r/20190210203943.8227-3-christian@brauner.ioSigned-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1306ff8b
    • Luc Van Oostenryck's avatar
      include/linux/relay.h: fix percpu annotation in struct rchan · 7d1be2d6
      Luc Van Oostenryck authored
      [ Upstream commit 62461ac2 ]
      
      The percpu member of this structure is declared as:
      	struct ... ** __percpu member;
      So its type is:
      	__percpu pointer to pointer to struct ...
      
      But looking at how it's used, its type should be:
      	pointer to __percpu pointer to struct ...
      and it should thus be declared as:
      	struct ... * __percpu *member;
      
      So fix the placement of '__percpu' in the definition of this
      structures.
      
      This silents a few Sparse's warnings like:
      	warning: incorrect type in initializer (different address spaces)
      	  expected void const [noderef] <asn:3> *__vpp_verify
      	  got struct sched_domain **
      
      Link: http://lkml.kernel.org/r/20190118144902.79065-1-luc.vanoostenryck@gmail.com
      Fixes: 017c59c0 ("relay: Use per CPU constructs for the relay channel buffer pointers")
      Signed-off-by: default avatarLuc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Cc: Jens Axboe <axboe@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7d1be2d6
    • Russell King's avatar
      gpio: gpio-omap: fix level interrupt idling · 2e9f7de9
      Russell King authored
      [ Upstream commit d01849f7 ]
      
      Tony notes that the GPIO module does not idle when level interrupts are
      in use, as the wakeup appears to get stuck.
      
      After extensive investigation, it appears that the wakeup will only be
      cleared if the interrupt status register is cleared while the interrupt
      is enabled. However, we are currently clearing it with the interrupt
      disabled for level-based interrupts.
      
      It is acknowledged that this observed behaviour conflicts with a
      statement in the TRM:
      
      CAUTION
        After servicing the interrupt, the status bit in the interrupt status
        register (GPIOi.GPIO_IRQSTATUS_0 or GPIOi.GPIO_IRQSTATUS_1) must be
        reset and the interrupt line released (by setting the corresponding
        bit of the interrupt status register to 1) before enabling an
        interrupt for the GPIO channel in the interrupt-enable register
        (GPIOi.GPIO_IRQSTATUS_SET_0 or GPIOi.GPIO_IRQSTATUS_SET_1) to prevent
        the occurrence of unexpected interrupts when enabling an interrupt
        for the GPIO channel.
      
      However, this does not appear to be a practical problem.
      
      Further, as reported by Grygorii Strashko <grygorii.strashko@ti.com>,
      the TI Android kernel tree has an earlier similar patch as "GPIO: OMAP:
      Fix the sequence to clear the IRQ status" saying:
      
       if the status is cleared after disabling the IRQ then sWAKEUP will not
       be cleared and gates the module transition
      
      When we unmask the level interrupt after the interrupt has been handled,
      enable the interrupt and only then clear the interrupt. If the interrupt
      is still pending, the hardware will re-assert the interrupt status.
      
      Should the caution note in the TRM prove to be a problem, we could
      use a clear-enable-clear sequence instead.
      
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: Keerthy <j-keerthy@ti.com>
      Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      [tony@atomide.com: updated comments based on an earlier TI patch]
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Acked-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2e9f7de9
    • Tonghao Zhang's avatar
      net/mlx5: Avoid panic when setting vport mac, getting vport config · 1ece0994
      Tonghao Zhang authored
      [ Upstream commit 6e77c413 ]
      
      If we try to set VFs mac address on a VF (not PF) net device,
      the kernel will be crash. The commands are show as below:
      
      $ echo 2 > /sys/class/net/$MLX_PF0/device/sriov_numvfs
      $ ip link set $MLX_VF0 vf 0 mac 00:11:22:33:44:00
      
      [exception RIP: mlx5_eswitch_set_vport_mac+41]
      [ffffb8b7079e3688] do_setlink at ffffffff8f67f85b
      [ffffb8b7079e37a8] __rtnl_newlink at ffffffff8f683778
      [ffffb8b7079e3b68] rtnl_newlink at ffffffff8f683a63
      [ffffb8b7079e3b90] rtnetlink_rcv_msg at ffffffff8f67d812
      [ffffb8b7079e3c10] netlink_rcv_skb at ffffffff8f6b88ab
      [ffffb8b7079e3c60] netlink_unicast at ffffffff8f6b808f
      [ffffb8b7079e3ca0] netlink_sendmsg at ffffffff8f6b8412
      [ffffb8b7079e3d18] sock_sendmsg at ffffffff8f6452f6
      [ffffb8b7079e3d30] ___sys_sendmsg at ffffffff8f645860
      [ffffb8b7079e3eb0] __sys_sendmsg at ffffffff8f647a38
      [ffffb8b7079e3f38] do_syscall_64 at ffffffff8f00401b
      [ffffb8b7079e3f50] entry_SYSCALL_64_after_hwframe at ffffffff8f80008c
      
      and
      
      [exception RIP: mlx5_eswitch_get_vport_config+12]
      [ffffa70607e57678] mlx5e_get_vf_config at ffffffffc03c7f8f [mlx5_core]
      [ffffa70607e57688] do_setlink at ffffffffbc67fa59
      [ffffa70607e577a8] __rtnl_newlink at ffffffffbc683778
      [ffffa70607e57b68] rtnl_newlink at ffffffffbc683a63
      [ffffa70607e57b90] rtnetlink_rcv_msg at ffffffffbc67d812
      [ffffa70607e57c10] netlink_rcv_skb at ffffffffbc6b88ab
      [ffffa70607e57c60] netlink_unicast at ffffffffbc6b808f
      [ffffa70607e57ca0] netlink_sendmsg at ffffffffbc6b8412
      [ffffa70607e57d18] sock_sendmsg at ffffffffbc6452f6
      [ffffa70607e57d30] ___sys_sendmsg at ffffffffbc645860
      [ffffa70607e57eb0] __sys_sendmsg at ffffffffbc647a38
      [ffffa70607e57f38] do_syscall_64 at ffffffffbc00401b
      [ffffa70607e57f50] entry_SYSCALL_64_after_hwframe at ffffffffbc80008c
      
      Fixes: a8d70a05 ("net/mlx5: E-Switch, Disallow vlan/spoofcheck setup if not being esw manager")
      Cc: Eli Cohen <eli@mellanox.com>
      Signed-off-by: default avatarTonghao Zhang <xiangxia.m.yue@gmail.com>
      Reviewed-by: default avatarRoi Dayan <roid@mellanox.com>
      Acked-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1ece0994
    • Tonghao Zhang's avatar
      net/mlx5: Avoid panic when setting vport rate · 729035b5
      Tonghao Zhang authored
      [ Upstream commit 24319258 ]
      
      If we try to set VFs rate on a VF (not PF) net device, the kernel
      will be crash. The commands are show as below:
      
      $ echo 2 > /sys/class/net/$MLX_PF0/device/sriov_numvfs
      $ ip link set $MLX_VF0 vf 0 max_tx_rate 2 min_tx_rate 1
      
      If not applied the first patch ("net/mlx5: Avoid panic when setting
      vport mac, getting vport config"), the command:
      
      $ ip link set $MLX_VF0 vf 0 rate 100
      
      can also crash the kernel.
      
      [ 1650.006388] RIP: 0010:mlx5_eswitch_set_vport_rate+0x1f/0x260 [mlx5_core]
      [ 1650.007092]  do_setlink+0x982/0xd20
      [ 1650.007129]  __rtnl_newlink+0x528/0x7d0
      [ 1650.007374]  rtnl_newlink+0x43/0x60
      [ 1650.007407]  rtnetlink_rcv_msg+0x2a2/0x320
      [ 1650.007484]  netlink_rcv_skb+0xcb/0x100
      [ 1650.007519]  netlink_unicast+0x17f/0x230
      [ 1650.007554]  netlink_sendmsg+0x2d2/0x3d0
      [ 1650.007592]  sock_sendmsg+0x36/0x50
      [ 1650.007625]  ___sys_sendmsg+0x280/0x2a0
      [ 1650.007963]  __sys_sendmsg+0x58/0xa0
      [ 1650.007998]  do_syscall_64+0x5b/0x180
      [ 1650.009438]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: c9497c98 ("net/mlx5: Add support for setting VF min rate")
      Cc: Mohamad Haj Yahia <mohamad@mellanox.com>
      Signed-off-by: default avatarTonghao Zhang <xiangxia.m.yue@gmail.com>
      Reviewed-by: default avatarRoi Dayan <roid@mellanox.com>
      Acked-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      729035b5
    • Douglas Anderson's avatar
      tracing: kdb: Fix ftdump to not sleep · 2d412eb3
      Douglas Anderson authored
      [ Upstream commit 31b265b3 ]
      
      As reported back in 2016-11 [1], the "ftdump" kdb command triggers a
      BUG for "sleeping function called from invalid context".
      
      kdb's "ftdump" command wants to call ring_buffer_read_prepare() in
      atomic context.  A very simple solution for this is to add allocation
      flags to ring_buffer_read_prepare() so kdb can call it without
      triggering the allocation error.  This patch does that.
      
      Note that in the original email thread about this, it was suggested
      that perhaps the solution for kdb was to either preallocate the buffer
      ahead of time or create our own iterator.  I'm hoping that this
      alternative of adding allocation flags to ring_buffer_read_prepare()
      can be considered since it means I don't need to duplicate more of the
      core trace code into "trace_kdb.c" (for either creating my own
      iterator or re-preparing a ring allocator whose memory was already
      allocated).
      
      NOTE: another option for kdb is to actually figure out how to make it
      reuse the existing ftrace_dump() function and totally eliminate the
      duplication.  This sounds very appealing and actually works (the "sr
      z" command can be seen to properly dump the ftrace buffer).  The
      downside here is that ftrace_dump() fully consumes the trace buffer.
      Unless that is changed I'd rather not use it because it means "ftdump
      | grep xyz" won't be very useful to search the ftrace buffer since it
      will throw away the whole trace on the first grep.  A future patch to
      dump only the last few lines of the buffer will also be hard to
      implement.
      
      [1] https://lkml.kernel.org/r/20161117191605.GA21459@google.com
      
      Link: http://lkml.kernel.org/r/20190308193205.213659-1-dianders@chromium.orgReported-by: default avatarBrian Norris <briannorris@chromium.org>
      Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2d412eb3
    • Chao Yu's avatar
      f2fs: fix to avoid deadlock in f2fs_read_inline_dir() · 3d07209b
      Chao Yu authored
      [ Upstream commit aadcef64 ]
      
      As Jiqun Li reported in bugzilla:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=202883
      
      sometimes, dead lock when make system call SYS_getdents64 with fsync() is
      called by another process.
      
      monkey running on android9.0
      
      1.  task 9785 held sbi->cp_rwsem and waiting lock_page()
      2.  task 10349 held mm_sem and waiting sbi->cp_rwsem
      3. task 9709 held lock_page() and waiting mm_sem
      
      so this is a dead lock scenario.
      
      task stack is show by crash tools as following
      
      crash_arm64> bt ffffffc03c354080
      PID: 9785   TASK: ffffffc03c354080  CPU: 1   COMMAND: "RxIoScheduler-3"
      >> #7 [ffffffc01b50fac0] __lock_page at ffffff80081b11e8
      
      crash-arm64> bt 10349
      PID: 10349  TASK: ffffffc018b83080  CPU: 1   COMMAND: "BUGLY_ASYNC_UPL"
      >> #3 [ffffffc01f8cfa40] rwsem_down_read_failed at ffffff8008a93afc
           PC: 00000033  LR: 00000000  SP: 00000000  PSTATE: ffffffffffffffff
      
      crash-arm64> bt 9709
      PID: 9709   TASK: ffffffc03e7f3080  CPU: 1   COMMAND: "IntentService[A"
      >> #3 [ffffffc001e67850] rwsem_down_read_failed at ffffff8008a93afc
      >> #8 [ffffffc001e67b80] el1_ia at ffffff8008084fc4
           PC: ffffff8008274114  [compat_filldir64+120]
           LR: ffffff80083584d4  [f2fs_fill_dentries+448]
           SP: ffffffc001e67b80  PSTATE: 80400145
          X29: ffffffc001e67b80  X28: 0000000000000000  X27: 000000000000001a
          X26: 00000000000093d7  X25: ffffffc070d52480  X24: 0000000000000008
          X23: 0000000000000028  X22: 00000000d43dfd60  X21: ffffffc001e67e90
          X20: 0000000000000011  X19: ffffff80093a4000  X18: 0000000000000000
          X17: 0000000000000000  X16: 0000000000000000  X15: 0000000000000000
          X14: ffffffffffffffff  X13: 0000000000000008  X12: 0101010101010101
          X11: 7f7f7f7f7f7f7f7f  X10: 6a6a6a6a6a6a6a6a   X9: 7f7f7f7f7f7f7f7f
           X8: 0000000080808000   X7: ffffff800827409c   X6: 0000000080808000
           X5: 0000000000000008   X4: 00000000000093d7   X3: 000000000000001a
           X2: 0000000000000011   X1: ffffffc070d52480   X0: 0000000000800238
      >> #9 [ffffffc001e67be0] f2fs_fill_dentries at ffffff80083584d0
           PC: 0000003c  LR: 00000000  SP: 00000000  PSTATE: 000000d9
          X12: f48a02ff X11: d4678960 X10: d43dfc00  X9: d4678ae4
           X8: 00000058  X7: d4678994  X6: d43de800  X5: 000000d9
           X4: d43dfc0c  X3: d43dfc10  X2: d46799c8  X1: 00000000
           X0: 00001068
      
      Below potential deadlock will happen between three threads:
      Thread A		Thread B		Thread C
      - f2fs_do_sync_file
       - f2fs_write_checkpoint
        - down_write(&sbi->node_change) -- 1)
      			- do_page_fault
      			 - down_write(&mm->mmap_sem) -- 2)
      			  - do_wp_page
      			   - f2fs_vm_page_mkwrite
      						- getdents64
      						 - f2fs_read_inline_dir
      						  - lock_page -- 3)
        - f2fs_sync_node_pages
         - lock_page -- 3)
      			    - __do_map_lock
      			     - down_read(&sbi->node_change) -- 1)
      						  - f2fs_fill_dentries
      						   - dir_emit
      						    - compat_filldir64
      						     - do_page_fault
      						      - down_read(&mm->mmap_sem) -- 2)
      
      Since f2fs_readdir is protected by inode.i_rwsem, there should not be
      any updates in inode page, we're safe to lookup dents in inode page
      without its lock held, so taking off the lock to improve concurrency
      of readdir and avoid potential deadlock.
      Reported-by: default avatarJiqun Li <jiqun.li@unisoc.com>
      Signed-off-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3d07209b
    • Masahiro Yamada's avatar
      h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux- · d61ebf3d
      Masahiro Yamada authored
      [ Upstream commit fc2b47b5 ]
      
      It believe it is a bad idea to hardcode a specific compiler prefix
      that may or may not be installed on a user's system. It is annoying
      when testing features that should not require compilers at all.
      
      For example, mrproper, headers_install, etc. should work without
      any compiler.
      
      They look like follows on my machine.
      
      $ make ARCH=h8300 mrproper
      ./scripts/gcc-version.sh: line 26: h8300-unknown-linux-gcc: command not found
      ./scripts/gcc-version.sh: line 27: h8300-unknown-linux-gcc: command not found
      make: h8300-unknown-linux-gcc: Command not found
      make: h8300-unknown-linux-gcc: Command not found
        [ a bunch of the same error messages continue ]
      
      $ make ARCH=h8300 headers_install
      ./scripts/gcc-version.sh: line 26: h8300-unknown-linux-gcc: command not found
      ./scripts/gcc-version.sh: line 27: h8300-unknown-linux-gcc: command not found
      make: h8300-unknown-linux-gcc: Command not found
        HOSTCC  scripts/basic/fixdep
      make: h8300-unknown-linux-gcc: Command not found
        WRAP    arch/h8300/include/generated/uapi/asm/kvm_para.h
        [ snip ]
      
      The solution is to delete this line, or to use cc-cross-prefix like
      some architectures do. I chose the latter as a moderate fixup.
      
      I added an alternative 'h8300-linux-' because it is available at:
      
      https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/8.1.0/Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d61ebf3d
    • Aurelien Aptel's avatar
      CIFS: fix POSIX lock leak and invalid ptr deref · 3a721553
      Aurelien Aptel authored
      [ Upstream commit bc31d0cd ]
      
      We have a customer reporting crashes in lock_get_status() with many
      "Leaked POSIX lock" messages preceeding the crash.
      
       Leaked POSIX lock on dev=0x0:0x56 ...
       Leaked POSIX lock on dev=0x0:0x56 ...
       Leaked POSIX lock on dev=0x0:0x56 ...
       Leaked POSIX lock on dev=0x0:0x53 ...
       Leaked POSIX lock on dev=0x0:0x53 ...
       Leaked POSIX lock on dev=0x0:0x53 ...
       Leaked POSIX lock on dev=0x0:0x53 ...
       POSIX: fl_owner=ffff8900e7b79380 fl_flags=0x1 fl_type=0x1 fl_pid=20709
       Leaked POSIX lock on dev=0x0:0x4b ino...
       Leaked locks on dev=0x0:0x4b ino=0xf911400000029:
       POSIX: fl_owner=ffff89f41c870e00 fl_flags=0x1 fl_type=0x1 fl_pid=19592
       stack segment: 0000 [#1] SMP
       Modules linked in: binfmt_misc msr tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcsec_gss_krb5 arc4 ecb auth_rpcgss nfsv4 md4 nfs nls_utf8 lockd grace cifs sunrpc ccm dns_resolver fscache af_packet iscsi_ibft iscsi_boot_sysfs vmw_vsock_vmci_transport vsock xfs libcrc32c sb_edac edac_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drbg ansi_cprng vmw_balloon aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd joydev pcspkr vmxnet3 i2c_piix4 vmw_vmci shpchp fjes processor button ac btrfs xor raid6_pq sr_mod cdrom ata_generic sd_mod ata_piix vmwgfx crc32c_intel drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm serio_raw ahci libahci drm libata vmw_pvscsi sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
      
       Supported: Yes
       CPU: 6 PID: 28250 Comm: lsof Not tainted 4.4.156-94.64-default #1
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
       task: ffff88a345f28740 ti: ffff88c74005c000 task.ti: ffff88c74005c000
       RIP: 0010:[<ffffffff8125dcab>]  [<ffffffff8125dcab>] lock_get_status+0x9b/0x3b0
       RSP: 0018:ffff88c74005fd90  EFLAGS: 00010202
       RAX: ffff89bde83e20ae RBX: ffff89e870003d18 RCX: 0000000049534f50
       RDX: ffffffff81a3541f RSI: ffffffff81a3544e RDI: ffff89bde83e20ae
       RBP: 0026252423222120 R08: 0000000020584953 R09: 000000000000ffff
       R10: 0000000000000000 R11: ffff88c74005fc70 R12: ffff89e5ca7b1340
       R13: 00000000000050e5 R14: ffff89e870003d30 R15: ffff89e5ca7b1340
       FS:  00007fafd64be800(0000) GS:ffff89f41fd00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000001c80018 CR3: 000000a522048000 CR4: 0000000000360670
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Stack:
        0000000000000208 ffffffff81a3d6b6 ffff89e870003d30 ffff89e870003d18
        ffff89e5ca7b1340 ffff89f41738d7c0 ffff89e870003d30 ffff89e5ca7b1340
        ffffffff8125e08f 0000000000000000 ffff89bc22b67d00 ffff88c74005ff28
       Call Trace:
        [<ffffffff8125e08f>] locks_show+0x2f/0x70
        [<ffffffff81230ad1>] seq_read+0x251/0x3a0
        [<ffffffff81275bbc>] proc_reg_read+0x3c/0x70
        [<ffffffff8120e456>] __vfs_read+0x26/0x140
        [<ffffffff8120e9da>] vfs_read+0x7a/0x120
        [<ffffffff8120faf2>] SyS_read+0x42/0xa0
        [<ffffffff8161cbc3>] entry_SYSCALL_64_fastpath+0x1e/0xb7
      
      When Linux closes a FD (close(), close-on-exec, dup2(), ...) it calls
      filp_close() which also removes all posix locks.
      
      The lock struct is initialized like so in filp_close() and passed
      down to cifs
      
      	...
              lock.fl_type = F_UNLCK;
              lock.fl_flags = FL_POSIX | FL_CLOSE;
              lock.fl_start = 0;
              lock.fl_end = OFFSET_MAX;
      	...
      
      Note the FL_CLOSE flag, which hints the VFS code that this unlocking
      is done for closing the fd.
      
      filp_close()
        locks_remove_posix(filp, id);
          vfs_lock_file(filp, F_SETLK, &lock, NULL);
            return filp->f_op->lock(filp, cmd, fl) => cifs_lock()
              rc = cifs_setlk(file, flock, type, wait_flag, posix_lck, lock, unlock, xid);
                rc = server->ops->mand_unlock_range(cfile, flock, xid);
                if (flock->fl_flags & FL_POSIX && !rc)
                        rc = locks_lock_file_wait(file, flock)
      
      Notice how we don't call locks_lock_file_wait() which does the
      generic VFS lock/unlock/wait work on the inode if rc != 0.
      
      If we are closing the handle, the SMB server is supposed to remove any
      locks associated with it. Similarly, cifs.ko frees and wakes up any
      lock and lock waiter when closing the file:
      
      cifs_close()
        cifsFileInfo_put(file->private_data)
      	/*
      	 * Delete any outstanding lock records. We'll lose them when the file
      	 * is closed anyway.
      	 */
      	down_write(&cifsi->lock_sem);
      	list_for_each_entry_safe(li, tmp, &cifs_file->llist->locks, llist) {
      		list_del(&li->llist);
      		cifs_del_lock_waiters(li);
      		kfree(li);
      	}
      	list_del(&cifs_file->llist->llist);
      	kfree(cifs_file->llist);
      	up_write(&cifsi->lock_sem);
      
      So we can safely ignore unlocking failures in cifs_lock() if they
      happen with the FL_CLOSE flag hint set as both the server and the
      client take care of it during the actual closing.
      
      This is not a proper fix for the unlocking failure but it's safe and
      it seems to prevent the lock leakages and crashes the customer
      experiences.
      Signed-off-by: default avatarAurelien Aptel <aaptel@suse.com>
      Signed-off-by: default avatarNeilBrown <neil@brown.name>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      Acked-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3a721553
    • Yang Shi's avatar
      mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified · b3830d2f
      Yang Shi authored
      commit a7f40cfe upstream.
      
      When MPOL_MF_STRICT was specified and an existing page was already on a
      node that does not follow the policy, mbind() should return -EIO.  But
      commit 6f4576e3 ("mempolicy: apply page table walker on
      queue_pages_range()") broke the rule.
      
      And commit c8633798 ("mm: mempolicy: mbind and migrate_pages support
      thp migration") didn't return the correct value for THP mbind() too.
      
      If MPOL_MF_STRICT is set, ignore vma_migratable() to make sure it
      reaches queue_pages_to_pte_range() or queue_pages_pmd() to check if an
      existing page was already on a node that does not follow the policy.
      And, non-migratable vma may be used, return -EIO too if MPOL_MF_MOVE or
      MPOL_MF_MOVE_ALL was specified.
      
      Tested with https://github.com/metan-ucw/ltp/blob/master/testcases/kernel/syscalls/mbind/mbind02.c
      
      [akpm@linux-foundation.org: tweak code comment]
      Link: http://lkml.kernel.org/r/1553020556-38583-1-git-send-email-yang.shi@linux.alibaba.com
      Fixes: 6f4576e3 ("mempolicy: apply page table walker on queue_pages_range()")
      Signed-off-by: default avatarYang Shi <yang.shi@linux.alibaba.com>
      Signed-off-by: default avatarOscar Salvador <osalvador@suse.de>
      Reported-by: default avatarCyril Hrubis <chrubis@suse.cz>
      Suggested-by: default avatarKirill A. Shutemov <kirill@shutemov.name>
      Acked-by: default avatarRafael Aquini <aquini@redhat.com>
      Reviewed-by: default avatarOscar Salvador <osalvador@suse.de>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      b3830d2f
    • Razvan Stefanescu's avatar
      tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped · a93b5eef
      Razvan Stefanescu authored
      commit 69646d7a upstream.
      
      In half-duplex operation, RX should be started after TX completes.
      
      If DMA is used, there is a case when the DMA transfer completes but the
      TX FIFO is not emptied, so the RX cannot be restarted just yet.
      
      Use a boolean variable to store this state and rearm TX interrupt mask
      to be signaled again that the transfer finished. In interrupt transmit
      handler this variable is used to start RX. A warning message is generated
      if RX is activated before TX fifo is cleared.
      
      Fixes: b389f173 ("tty/serial: atmel: RS485 half duplex w/DMA: enable
      RX after TX is done")
      Signed-off-by: default avatarRazvan Stefanescu <razvan.stefanescu@microchip.com>
      Acked-by: default avatarRichard Genoud <richard.genoud@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a93b5eef
    • Razvan Stefanescu's avatar
      tty/serial: atmel: Add is_half_duplex helper · 6952a095
      Razvan Stefanescu authored
      commit f3040983 upstream.
      
      Use a helper function to check that a port needs to use half duplex
      communication, replacing several occurrences of multi-line bit checking.
      
      Fixes: b389f173 ("tty/serial: atmel: RS485 half duplex w/DMA: enable RX after TX is done")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarRazvan Stefanescu <razvan.stefanescu@microchip.com>
      Acked-by: default avatarRichard Genoud <richard.genoud@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      6952a095
    • Peter Zijlstra's avatar
      lib/int_sqrt: optimize initial value compute · 083aa6a5
      Peter Zijlstra authored
      commit f8ae107e upstream.
      
      The initial value (@M) compute is:
      
      	m = 1UL << (BITS_PER_LONG - 2);
      	while (m > x)
      		m >>= 2;
      
      Which is a linear search for the highest even bit smaller or equal to @x
      We can implement this using a binary search using __fls() (or better when
      its hardware implemented).
      
      	m = 1UL << (__fls(x) & ~1UL);
      
      Especially for small values of @x; which are the more common arguments
      when doing a CDF on idle times; the linear search is near to worst case,
      while the binary search of __fls() is a constant 6 (or 5 on 32bit)
      branches.
      
            cycles:                 branches:              branch-misses:
      
      PRE:
      
      hot:   43.633557 +- 0.034373  45.333132 +- 0.002277  0.023529 +- 0.000681
      cold: 207.438411 +- 0.125840  45.333132 +- 0.002277  6.976486 +- 0.004219
      
      SOFTWARE FLS:
      
      hot:   29.576176 +- 0.028850  26.666730 +- 0.004511  0.019463 +- 0.000663
      cold: 165.947136 +- 0.188406  26.666746 +- 0.004511  6.133897 +- 0.004386
      
      HARDWARE FLS:
      
      hot:   24.720922 +- 0.025161  20.666784 +- 0.004509  0.020836 +- 0.000677
      cold: 132.777197 +- 0.127471  20.666776 +- 0.004509  5.080285 +- 0.003874
      
      Averages computed over all values <128k using a LFSR to generate order.
      Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
      each int_sqrt() invocation.
      
      Link: http://lkml.kernel.org/r/20171020164644.936577234@infradead.orgSigned-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Suggested-by: default avatarJoe Perches <joe@perches.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Anshul Garg <aksgarg1989@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Michael Davidson <md@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      083aa6a5
    • zhangyi (F)'s avatar
      ext4: cleanup bh release code in ext4_ind_remove_space() · d245143c
      zhangyi (F) authored
      commit 5e86bdda upstream.
      
      Currently, we are releasing the indirect buffer where we are done with
      it in ext4_ind_remove_space(), so we can see the brelse() and
      BUFFER_TRACE() everywhere.  It seems fragile and hard to read, and we
      may probably forget to release the buffer some day.  This patch cleans
      up the code by putting of the code which releases the buffers to the
      end of the function.
      Signed-off-by: default avatarzhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Cc: Jari Ruusu <jari.ruusu@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d245143c
    • Will Deacon's avatar
      arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals · 34cc1662
      Will Deacon authored
      commit b9a4b9d0 upstream.
      
      FAR_EL1 is UNKNOWN for all debug exceptions other than those caused by
      taking a hardware watchpoint. Unfortunately, if a debug handler returns
      a non-zero value, then we will propagate the UNKNOWN FAR value to
      userspace via the si_addr field of the SIGTRAP siginfo_t.
      
      Instead, let's set si_addr to take on the PC of the faulting instruction,
      which we have available in the current pt_regs.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      34cc1662
  2. 03 Apr, 2019 20 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.110 · 80bf6c64
      Greg Kroah-Hartman authored
      80bf6c64
    • Cornelia Huck's avatar
      vfio: ccw: only free cp on final interrupt · 2cca5be8
      Cornelia Huck authored
      commit 50b7f1b7 upstream.
      
      When we get an interrupt for a channel program, it is not
      necessarily the final interrupt; for example, the issuing
      guest may request an intermediate interrupt by specifying
      the program-controlled-interrupt flag on a ccw.
      
      We must not switch the state to idle if the interrupt is not
      yet final; even more importantly, we must not free the translated
      channel program if the interrupt is not yet final, or the host
      can crash during cp rewind.
      
      Fixes: e5f84dba ("vfio: ccw: return I/O results asynchronously")
      Cc: stable@vger.kernel.org # v4.12+
      Reviewed-by: default avatarEric Farman <farman@linux.ibm.com>
      Signed-off-by: default avatarCornelia Huck <cohuck@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2cca5be8
    • Greg Kroah-Hartman's avatar
      Revert "USB: core: only clean up what we allocated" · 1cb3e7f1
      Greg Kroah-Hartman authored
      commit cf4df407 upstream.
      
      This reverts commit 32fd87b3.
      
      Alan wrote a better fix for this...
      
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1cb3e7f1
    • Sean Christopherson's avatar
      KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts · 0c60bc18
      Sean Christopherson authored
      commit 0cf9135b upstream.
      
      The CPUID flag ARCH_CAPABILITIES is unconditioinally exposed to host
      userspace for all x86 hosts, i.e. KVM advertises ARCH_CAPABILITIES
      regardless of hardware support under the pretense that KVM fully
      emulates MSR_IA32_ARCH_CAPABILITIES.  Unfortunately, only VMX hosts
      handle accesses to MSR_IA32_ARCH_CAPABILITIES (despite KVM_GET_MSRS
      also reporting MSR_IA32_ARCH_CAPABILITIES for all hosts).
      
      Move the MSR_IA32_ARCH_CAPABILITIES handling to common x86 code so
      that it's emulated on AMD hosts.
      
      Fixes: 1eaafe91 ("kvm: x86: IA32_ARCH_CAPABILITIES is always supported")
      Cc: stable@vger.kernel.org
      Reported-by: default avatarXiaoyao Li <xiaoyao.li@linux.intel.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c60bc18
    • Sean Christopherson's avatar
      KVM: Reject device ioctls from processes other than the VM's creator · 9badc854
      Sean Christopherson authored
      commit ddba9180 upstream.
      
      KVM's API requires thats ioctls must be issued from the same process
      that created the VM.  In other words, userspace can play games with a
      VM's file descriptors, e.g. fork(), SCM_RIGHTS, etc..., but only the
      creator can do anything useful.  Explicitly reject device ioctls that
      are issued by a process other than the VM's creator, and update KVM's
      API documentation to extend its requirements to device ioctls.
      
      Fixes: 852b6d57 ("kvm: add device control API")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9badc854
    • Thomas Gleixner's avatar
      x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y · a6c42842
      Thomas Gleixner authored
      commit bebd024e upstream.
      
      The SMT disable 'nosmt' command line argument is not working properly when
      CONFIG_HOTPLUG_CPU is disabled. The teardown of the sibling CPUs which are
      required to be brought up due to the MCE issues, cannot work. The CPUs are
      then kept in a half dead state.
      
      As the 'nosmt' functionality has become popular due to the speculative
      hardware vulnerabilities, the half torn down state is not a proper solution
      to the problem.
      
      Enforce CONFIG_HOTPLUG_CPU=y when SMP is enabled so the full operation is
      possible.
      Reported-by: default avatarTianyu Lan <Tianyu.Lan@microsoft.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Konrad Wilk <konrad.wilk@oracle.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Mukesh Ojha <mojha@codeaurora.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Micheal Kelley <michael.h.kelley@microsoft.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190326163811.598166056@linutronix.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a6c42842
    • Thomas Gleixner's avatar
      cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n · 5f6b5b8b
      Thomas Gleixner authored
      commit 206b9235 upstream.
      
      Tianyu reported a crash in a CPU hotplug teardown callback when booting a
      kernel which has CONFIG_HOTPLUG_CPU disabled with the 'nosmt' boot
      parameter.
      
      It turns out that the SMP=y CONFIG_HOTPLUG_CPU=n case has been broken
      forever in case that a bringup callback fails. Unfortunately this issue was
      not recognized when the CPU hotplug code was reworked, so the shortcoming
      just stayed in place.
      
      When a bringup callback fails, the CPU hotplug code rolls back the
      operation and takes the CPU offline.
      
      The 'nosmt' command line argument uses a bringup failure to abort the
      bringup of SMT sibling CPUs. This partial bringup is required due to the
      MCE misdesign on Intel CPUs.
      
      With CONFIG_HOTPLUG_CPU=y the rollback works perfectly fine, but
      CONFIG_HOTPLUG_CPU=n lacks essential mechanisms to exercise the low level
      teardown of a CPU including the synchronizations in various facilities like
      RCU, NOHZ and others.
      
      As a consequence the teardown callbacks which must be executed on the
      outgoing CPU within stop machine with interrupts disabled are executed on
      the control CPU in interrupt enabled and preemptible context causing the
      kernel to crash and burn. The pre state machine code has a different
      failure mode which is more subtle and resulting in a less obvious use after
      free crash because the control side frees resources which are still in use
      by the undead CPU.
      
      But this is not a x86 only problem. Any architecture which supports the
      SMP=y HOTPLUG_CPU=n combination suffers from the same issue. It's just less
      likely to be triggered because in 99.99999% of the cases all bringup
      callbacks succeed.
      
      The easy solution of making HOTPLUG_CPU mandatory for SMP is not working on
      all architectures as the following architectures have either no hotplug
      support at all or not all subarchitectures support it:
      
       alpha, arc, hexagon, openrisc, riscv, sparc (32bit), mips (partial).
      
      Crashing the kernel in such a situation is not an acceptable state
      either.
      
      Implement a minimal rollback variant by limiting the teardown to the point
      where all regular teardown callbacks have been invoked and leave the CPU in
      the 'dead' idle state. This has the following consequences:
      
       - the CPU is brought down to the point where the stop_machine takedown
         would happen.
      
       - the CPU stays there forever and is idle
      
       - The CPU is cleared in the CPU active mask, but not in the CPU online
         mask which is a legit state.
      
       - Interrupts are not forced away from the CPU
      
       - All facilities which only look at online mask would still see it, but
         that is the case during normal hotplug/unplug operations as well. It's
         just a (way) longer time frame.
      
      This will expose issues, which haven't been exposed before or only seldom,
      because now the normally transient state of being non active but online is
      a permanent state. In testing this exposed already an issue vs. work queues
      where the vmstat code schedules work on the almost dead CPU which ends up
      in an unbound workqueue and triggers 'preemtible context' warnings. This is
      not a problem of this change, it merily exposes an already existing issue.
      Still this is better than crashing fully without a chance to debug it.
      
      This is mainly thought as workaround for those architectures which do not
      support HOTPLUG_CPU. All others should enforce HOTPLUG_CPU for SMP.
      
      Fixes: 2e1a3483 ("cpu/hotplug: Split out the state walk into functions")
      Reported-by: default avatarTianyu Lan <Tianyu.Lan@microsoft.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarTianyu Lan <Tianyu.Lan@microsoft.com>
      Acked-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Konrad Wilk <konrad.wilk@oracle.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Mukesh Ojha <mojha@codeaurora.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Micheal Kelley <michael.h.kelley@microsoft.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190326163811.503390616@linutronix.deSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5f6b5b8b
    • Adrian Hunter's avatar
      perf intel-pt: Fix TSC slip · 25e902d6
      Adrian Hunter authored
      commit f3b4e06b upstream.
      
      A TSC packet can slip past MTC packets so that the timestamp appears to
      go backwards. One estimate is that can be up to about 40 CPU cycles,
      which is certainly less than 0x1000 TSC ticks, but accept slippage an
      order of magnitude more to be on the safe side.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 79b58424 ("perf tools: Add Intel PT support for decoding MTC packets")
      Link: http://lkml.kernel.org/r/20190325135135.18348-1-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      25e902d6
    • Lars Persson's avatar
      mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate · 4774a369
      Lars Persson authored
      commit d2b2c6dd upstream.
      
      Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
      and SIGSEGV that could not be traced back to a userspace code bug.  They
      had all the magic signs of an I/D cache coherency issue.
      
      Now recently we noticed that the /proc/sys/vm/compact_memory interface
      was quite efficient at provoking this class of userspace crashes.
      
      Studying the code in mm/migrate.c there is a distinction made between
      migrating a page that is mapped at the instant of migration and one that
      is not mapped.  Our problem turned out to be the non-mapped pages.
      
      For the non-mapped page the code performs a copy of the page content and
      all relevant meta-data of the page without doing the required D-cache
      maintenance.  This leaves dirty data in the D-cache of the CPU and on
      the 1004K cores this data is not visible to the I-cache.  A subsequent
      page-fault that triggers a mapping of the page will happily serve the
      process with potentially stale code.
      
      What about ARM then, this bug should have seen greater exposure? Well
      ARM became immune to this flaw back in 2010, see commit c0177800
      ("ARM: 6379/1: Assume new page cache pages have dirty D-cache").
      
      My proposed fix moves the D-cache maintenance inside move_to_new_page to
      make it common for both cases.
      
      Link: http://lkml.kernel.org/r/20190315083502.11849-1-larper@axis.com
      Fixes: 97ee0524 ("flush cache before installing new page at migraton")
      Signed-off-by: default avatarLars Persson <larper@axis.com>
      Reviewed-by: default avatarPaul Burton <paul.burton@mips.com>
      Acked-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4774a369
    • Romain Izard's avatar
      usb: cdc-acm: fix race during wakeup blocking TX traffic · 35b7c12b
      Romain Izard authored
      commit 93e1c8a6 upstream.
      
      When the kernel is compiled with preemption enabled, the URB completion
      handler can run in parallel with the work responsible for waking up the
      tty layer. If the URB handler sets the EVENT_TTY_WAKEUP bit during the
      call to tty_port_tty_wakeup() to signal that there is room for additional
      input, it will be cleared at the end of this call. As a result, TX traffic
      on the upper layer will be blocked.
      
      This can be seen with a kernel configured with CONFIG_PREEMPT, and a fast
      modem connected with PPP running over a USB CDC-ACM port.
      
      Use test_and_clear_bit() instead, which ensures that each wakeup requested
      by the URB completion code will trigger a call to tty_port_tty_wakeup().
      
      Fixes: 1aba579f cdc-acm: handle read pipe errors
      Signed-off-by: default avatarRomain Izard <romain.izard.pro@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Acked-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35b7c12b
    • Mathias Nyman's avatar
      xhci: Fix port resume done detection for SS ports with LPM enabled · 09fa576a
      Mathias Nyman authored
      commit 6cbcf596 upstream.
      
      A suspended SS port in U3 link state will go to U0 when resumed, but
      can almost immediately after that enter U1 or U2 link power save
      states before host controller driver reads the port status.
      
      Host controller driver only checks for U0 state, and might miss
      the finished resume, leaving flags unclear and skip notifying usb
      code of the wake.
      
      Add U1 and U2 to the possible link states when checking for finished
      port resume.
      
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09fa576a
    • Yasushi Asano's avatar
      usb: host: xhci-rcar: Add XHCI_TRUST_TX_LENGTH quirk · dc4267dc
      Yasushi Asano authored
      commit 40fc1653 upstream.
      
      When plugging BUFFALO LUA4-U3-AGT USB3.0 to Gigabit Ethernet LAN
      Adapter, warning messages filled up dmesg.
      
      [  101.098287] xhci-hcd ee000000.usb: WARN Successful completion on short TX for slot 1 ep 4: needs XHCI_TRUST_TX_LENGTH quirk?
      [  101.117463] xhci-hcd ee000000.usb: WARN Successful completion on short TX for slot 1 ep 4: needs XHCI_TRUST_TX_LENGTH quirk?
      [  101.136513] xhci-hcd ee000000.usb: WARN Successful completion on short TX for slot 1 ep 4: needs XHCI_TRUST_TX_LENGTH quirk?
      
      Adding the XHCI_TRUST_TX_LENGTH quirk resolves the issue.
      Signed-off-by: default avatarYasushi Asano <yasano@jp.adit-jv.com>
      Signed-off-by: default avatarSpyridon Papageorgiou <spapageorgiou@de.adit-jv.com>
      Acked-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc4267dc
    • Fabrizio Castro's avatar
      usb: common: Consider only available nodes for dr_mode · 60397a4e
      Fabrizio Castro authored
      commit 238e0268 upstream.
      
      There are cases where multiple device tree nodes point to the
      same phy node by means of the "phys" property, but we should
      only consider those nodes that are marked as available rather
      than just any node.
      
      Fixes: 98bfb394 ("usb: of: add an api to get dr_mode by the phy node")
      Cc: stable@vger.kernel.org # v4.4+
      Signed-off-by: default avatarFabrizio Castro <fabrizio.castro@bp.renesas.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      60397a4e
    • Radoslav Gerganov's avatar
      USB: gadget: f_hid: fix deadlock in f_hidg_write() · 3e043e5d
      Radoslav Gerganov authored
      commit 072684e8 upstream.
      
      In f_hidg_write() the write_spinlock is acquired before calling
      usb_ep_queue() which causes a deadlock when dummy_hcd is being used.
      This is because dummy_queue() callbacks into f_hidg_req_complete() which
      tries to acquire the same spinlock. This is (part of) the backtrace when
      the deadlock occurs:
      
        0xffffffffc06b1410 in f_hidg_req_complete
        0xffffffffc06a590a in usb_gadget_giveback_request
        0xffffffffc06cfff2 in dummy_queue
        0xffffffffc06a4b96 in usb_ep_queue
        0xffffffffc06b1eb6 in f_hidg_write
        0xffffffff8127730b in __vfs_write
        0xffffffff812774d1 in vfs_write
        0xffffffff81277725 in SYSC_write
      
      Fix this by releasing the write_spinlock before calling usb_ep_queue()
      Reviewed-by: default avatarJames Bottomley <James.Bottomley@HansenPartnership.com>
      Tested-by: default avatarJames Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: stable@vger.kernel.org # 4.11+
      Fixes: 749494b6 ("usb: gadget: f_hid: fix: Move IN request allocation to set_alt()")
      Signed-off-by: default avatarRadoslav Gerganov <rgerganov@vmware.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e043e5d
    • Arnd Bergmann's avatar
      usb: mtu3: fix EXTCON dependency · 54333dcc
      Arnd Bergmann authored
      commit 3d54d10c upstream.
      
      When EXTCON is a loadable module, mtu3 fails to link as built-in:
      
      drivers/usb/mtu3/mtu3_plat.o: In function `mtu3_probe':
      mtu3_plat.c:(.text+0x690): undefined reference to `extcon_get_edev_by_phandle'
      
      Add a Kconfig dependency to force mtu3 also to be a loadable module
      if extconn is, but still allow it to be built without extcon.
      
      Fixes: d0ed062a ("usb: mtu3: dual-role mode support")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      54333dcc
    • Chen-Yu Tsai's avatar
      phy: sun4i-usb: Support set_mode to USB_HOST for non-OTG PHYs · d05885ad
      Chen-Yu Tsai authored
      commit 1396929e upstream.
      
      While only the first PHY supports mode switching, the remaining PHYs
      work in USB host mode. They should support set_mode with mode=USB_HOST
      instead of failing. This is especially needed now that the USB core does
      set_mode for all USB ports, which was added in commit b97a3134 ("usb:
      core: comply to PHY framework").
      
      Make set_mode with mode=USB_HOST a no-op instead of failing for the
      non-OTG USB PHYs.
      
      Fixes: 6ba43c29 ("phy-sun4i-usb: Add support for phy_set_mode")
      Signed-off-by: default avatarChen-Yu Tsai <wens@csie.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d05885ad
    • Axel Lin's avatar
      gpio: adnp: Fix testing wrong value in adnp_gpio_direction_input · 5d4cc25d
      Axel Lin authored
      commit c5bc6e52 upstream.
      
      Current code test wrong value so it does not verify if the written
      data is correctly read back. Fix it.
      Also make it return -EPERM if read value does not match written bit,
      just like it done for adnp_gpio_direction_output().
      
      Fixes: 5e969a40 ("gpio: Add Avionic Design N-bit GPIO expander support")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAxel Lin <axel.lin@ingics.com>
      Reviewed-by: default avatarThierry Reding <thierry.reding@gmail.com>
      Signed-off-by: default avatarBartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d4cc25d
    • Kangjie Lu's avatar
      gpio: exar: add a check for the return value of ida_simple_get fails · e708e5db
      Kangjie Lu authored
      commit 7ecced09 upstream.
      
      ida_simple_get may fail and return a negative error number.
      The fix checks its return value; if it fails, go to err_destroy.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarKangjie Lu <kjlu@umn.edu>
      Signed-off-by: default avatarBartosz Golaszewski <bgolaszewski@baylibre.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e708e5db
    • Eric Biggers's avatar
      drm/vgem: fix use-after-free when drm_gem_handle_create() fails · 2ea1c197
      Eric Biggers authored
      commit 21d2b122 upstream.
      
      If drm_gem_handle_create() fails in vgem_gem_create(), then the
      drm_vgem_gem_object is freed twice: once when the reference is dropped
      by drm_gem_object_put_unlocked(), and again by __vgem_gem_destroy().
      
      This was hit by syzkaller using fault injection.
      
      Fix it by skipping the second free.
      
      Reported-by: syzbot+e73f2fb5ed5a5df36d33@syzkaller.appspotmail.com
      Fixes: af33a919 ("drm/vgem: Enable dmabuf import interfaces")
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Acked-by: default avatarLaura Abbott <labbott@redhat.com>
      Signed-off-by: default avatarRodrigo Siqueira <rodrigosiqueiramelo@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190226214451.195123-1-ebiggers@kernel.orgSigned-off-by: default avatarMaxime Ripard <maxime.ripard@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ea1c197
    • YueHaibing's avatar
      fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links · 0d9ef3f5
      YueHaibing authored
      commit 23da9588 upstream.
      
      Syzkaller reports:
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN PTI
      CPU: 1 PID: 5373 Comm: syz-executor.0 Not tainted 5.0.0-rc8+ #3
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      RIP: 0010:put_links+0x101/0x440 fs/proc/proc_sysctl.c:1599
      Code: 00 0f 85 3a 03 00 00 48 8b 43 38 48 89 44 24 20 48 83 c0 38 48 89 c2 48 89 44 24 28 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 fe 02 00 00 48 8b 74 24 20 48 c7 c7 60 2a 9d 91
      RSP: 0018:ffff8881d828f238 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: ffff8881e01b1140 RCX: ffffffff8ee98267
      RDX: 0000000000000007 RSI: ffffc90001479000 RDI: ffff8881e01b1178
      RBP: dffffc0000000000 R08: ffffed103ee27259 R09: ffffed103ee27259
      R10: 0000000000000001 R11: ffffed103ee27258 R12: fffffffffffffff4
      R13: 0000000000000006 R14: ffff8881f59838c0 R15: dffffc0000000000
      FS:  00007f072254f700(0000) GS:ffff8881f7100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fff8b286668 CR3: 00000001f0542002 CR4: 00000000007606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       drop_sysctl_table+0x152/0x9f0 fs/proc/proc_sysctl.c:1629
       get_subdir fs/proc/proc_sysctl.c:1022 [inline]
       __register_sysctl_table+0xd65/0x1090 fs/proc/proc_sysctl.c:1335
       br_netfilter_init+0xbc/0x1000 [br_netfilter]
       do_one_initcall+0xfa/0x5ca init/main.c:887
       do_init_module+0x204/0x5f6 kernel/module.c:3460
       load_module+0x66b2/0x8570 kernel/module.c:3808
       __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
       do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462e99
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f072254ec58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
      RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000003
      RBP: 00007f072254ec70 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f072254f6bc
      R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004
      Modules linked in: br_netfilter(+) dvb_usb_dibusb_mc_common dib3000mc dibx000_common dvb_usb_dibusb_common dvb_usb_dw2102 dvb_usb classmate_laptop palmas_regulator cn videobuf2_v4l2 v4l2_common snd_soc_bd28623 mptbase snd_usb_usx2y snd_usbmidi_lib snd_rawmidi wmi libnvdimm lockd sunrpc grace rc_kworld_pc150u rc_core rtc_da9063 sha1_ssse3 i2c_cros_ec_tunnel adxl34x_spi adxl34x nfnetlink lib80211 i5500_temp dvb_as102 dvb_core videobuf2_common videodev media videobuf2_vmalloc videobuf2_memops udc_core lnbp22 leds_lp3952 hid_roccat_ryos s1d13xxxfb mtd vport_geneve openvswitch nf_conncount nf_nat_ipv6 nsh geneve udp_tunnel ip6_udp_tunnel snd_soc_mt6351 sis_agp phylink snd_soc_adau1761_spi snd_soc_adau1761 snd_soc_adau17x1 snd_soc_core snd_pcm_dmaengine ac97_bus snd_compress snd_soc_adau_utils snd_soc_sigmadsp_regmap snd_soc_sigmadsp raid_class hid_roccat_konepure hid_roccat_common hid_roccat c2port_duramar2150 core mdio_bcm_unimac iptable_security iptable_raw iptable_mangle
       iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel hsr veth netdevsim devlink vxcan batman_adv cfg80211 rfkill chnl_net caif nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel joydev mousedev ide_pci_generic piix aesni_intel aes_x86_64 ide_core crypto_simd atkbd cryptd glue_helper serio_raw ata_generic pata_acpi i2c_piix4 floppy sch_fq_codel ip_tables x_tables ipv6 [last unloaded: lm73]
      Dumping ftrace buffer:
         (ftrace buffer empty)
      ---[ end trace 770020de38961fd0 ]---
      
      A new dir entry can be created in get_subdir and its 'header->parent' is
      set to NULL.  Only after insert_header success, it will be set to 'dir',
      otherwise 'header->parent' is set to NULL and drop_sysctl_table is called.
      However in err handling path of get_subdir, drop_sysctl_table also be
      called on 'new->header' regardless its value of parent pointer.  Then
      put_links is called, which triggers NULL-ptr deref when access member of
      header->parent.
      
      In fact we have multiple error paths which call drop_sysctl_table() there,
      upon failure on insert_links() we also call drop_sysctl_table().And even
      in the successful case on __register_sysctl_table() we still always call
      drop_sysctl_table().This patch fix it.
      
      Link: http://lkml.kernel.org/r/20190314085527.13244-1-yuehaibing@huawei.com
      Fixes: 0e47c99d ("sysctl: Replace root_list with links between sysctl_table_sets")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Acked-by: default avatarLuis Chamberlain <mcgrof@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>    [3.4+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0d9ef3f5