1. 30 Sep, 2016 40 commits
    • power: supply: max17042_battery: fix model download bug. · 97f9aa7c
      Sven Van Asbroeck authored
      commit 5381cfb6 upstream.
      
      The device's model download function returns the model data as
      an array of u32s, which is later compared to the reference
      model data. However, since the latter is an array of u16s,
      the comparison does not happen correctly, and model verification
      fails. This in turn breaks the POR initialization sequence.
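The width mismatch described above can be reproduced in user space. The following is a minimal sketch, not the driver's code: the function names are hypothetical, and `memcmp()` is assumed as the byte-wise comparison. A 32-bit buffer holding the same values as a 16-bit reference does not compare equal byte for byte, while an element-wise comparison does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Buggy pattern: compares raw bytes although the element widths differ. */
static bool model_matches_bytewise(const uint16_t *ref, const uint32_t *read,
                                   size_t n)
{
    return memcmp(ref, read, n * sizeof(*ref)) == 0;
}

/* Safer pattern: compare value by value, narrowing each 32-bit read. */
static bool model_matches_elementwise(const uint16_t *ref, const uint32_t *read,
                                      size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (ref[i] != (uint16_t)read[i])
            return false;
    return true;
}
```

With identical non-zero values in both arrays, the byte-wise variant reports a mismatch on either endianness, because the padding bytes of each 32-bit element shift the data.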
      
      Fixes: 39e7213e ("max17042_battery: Support regmap to access device's registers")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Sven Van Asbroeck <TheSven73@googlemail.com>
      Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Sebastian Reichel <sre@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • power_supply: tps65217-charger: fix missing platform_set_drvdata() · a596ebc5
      Wei Yongjun authored
      commit 33e7664a upstream.
      
      Add missing platform_set_drvdata() in tps65217_charger_probe(), otherwise
      calling platform_get_drvdata() in remove returns NULL.
      
      This is detected by Coccinelle semantic patch.
      
      Fixes: 3636859b ("power_supply: Add support for tps65217-charger")
      Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
      Signed-off-by: Sebastian Reichel <sre@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PM / hibernate: Fix rtree_next_node() to avoid walking off list ends · 116fcd88
      James Morse authored
      commit 924d8696 upstream.
      
      rtree_next_node() walks the linked list of leaf nodes to find the next
      block of pages in the struct memory_bitmap. If it walks off the end of
      the list of nodes, it walks the list of memory zones to find the next
      region of memory. If it walks off the end of the list of zones, it
      returns false.
      
      This leaves the struct bm_position's node and zone pointers pointing
      at their respective struct list_heads in struct mem_zone_bm_rtree.
      
      memory_bm_find_bit() uses struct bm_position's node and zone pointers
      to avoid walking lists and trees if the next bit appears in the same
      node/zone. It handles these values being stale.
      
      Swap rtree_next_node()'s 'step then test' for 'test-next then step';
      this means that if we reach the end of memory, we return false and
      leave the node and zone pointers as they were.
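The difference between the two orderings can be sketched on a small circular list with a sentinel head. This is an illustrative user-space model only: `struct node`, `struct cursor` and both function names are hypothetical and much simpler than the kernel's `struct bm_position` and `list_head` machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *next;
    int value;              /* unused for the sentinel head */
};

struct cursor {
    struct node *head;      /* sentinel: not a real element */
    struct node *cur;       /* should always point at a real element */
};

/* Step then test: at the end, cur is left pointing at the sentinel. */
static bool next_step_then_test(struct cursor *c)
{
    c->cur = c->cur->next;
    return c->cur != c->head;
}

/* Test-next then step: at the end, cur is left unchanged. */
static bool next_test_then_step(struct cursor *c)
{
    if (c->cur->next == c->head)
        return false;
    c->cur = c->cur->next;
    return true;
}
```

In the first variant, a caller that keeps using `cur` after a `false` return would interpret the sentinel as an element; in the second, `cur` still refers to the last real element.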
      
      This fixes a panic on resume using AMD Seattle with 64K pages:
      [    6.868732] Freezing user space processes ... (elapsed 0.000 seconds) done.
      [    6.875753] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
      [    6.896453] PM: Using 3 thread(s) for decompression.
      [    6.896453] PM: Loading and decompressing image data (5339 pages)...
      [    7.318890] PM: Image loading progress:   0%
      [    7.323395] Unable to handle kernel paging request at virtual address 00800040
      [    7.330611] pgd = ffff000008df0000
      [    7.334003] [00800040] *pgd=00000083fffe0003, *pud=00000083fffe0003, *pmd=00000083fffd0003, *pte=0000000000000000
      [    7.344266] Internal error: Oops: 96000005 [#1] PREEMPT SMP
      [    7.349825] Modules linked in:
      [    7.352871] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G        W I     4.8.0-rc1 #4737
      [    7.360512] Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1002C 04/08/2016
      [    7.369109] task: ffff8003c0220000 task.stack: ffff8003c0280000
      [    7.375020] PC is at set_bit+0x18/0x30
      [    7.378758] LR is at memory_bm_set_bit+0x24/0x30
      [    7.383362] pc : [<ffff00000835bbc8>] lr : [<ffff0000080faf18>] pstate: 60000045
      [    7.390743] sp : ffff8003c0283b00
      [    7.473551]
      [    7.475031] Process swapper/0 (pid: 1, stack limit = 0xffff8003c0280020)
      [    7.481718] Stack: (0xffff8003c0283b00 to 0xffff8003c0284000)
      [    7.800075] Call trace:
      [    7.887097] [<ffff00000835bbc8>] set_bit+0x18/0x30
      [    7.891876] [<ffff0000080fb038>] duplicate_memory_bitmap.constprop.38+0x54/0x70
      [    7.899172] [<ffff0000080fcc40>] snapshot_write_next+0x22c/0x47c
      [    7.905166] [<ffff0000080fe1b4>] load_image_lzo+0x754/0xa88
      [    7.910725] [<ffff0000080ff0a8>] swsusp_read+0x144/0x230
      [    7.916025] [<ffff0000080fa338>] load_image_and_restore+0x58/0x90
      [    7.922105] [<ffff0000080fa660>] software_resume+0x2f0/0x338
      [    7.927752] [<ffff000008083350>] do_one_initcall+0x38/0x11c
      [    7.933314] [<ffff000008b40cc0>] kernel_init_freeable+0x14c/0x1ec
      [    7.939395] [<ffff0000087ce564>] kernel_init+0x10/0xfc
      [    7.944520] [<ffff000008082e90>] ret_from_fork+0x10/0x40
      [    7.949820] Code: d2800022 8b400c21 f9800031 9ac32043 (c85f7c22)
      [    7.955909] ---[ end trace 0024a5986e6ff323 ]---
      [    7.960529] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
      
      Here struct mem_zone_bm_rtree's start_pfn has been returned instead of
      struct rtree_node's addr as the node/zone pointers are corrupt after
      we walked off the end of the lists during mark_unsafe_pages().
      
      This behaviour was exposed by commit 6dbecfd3 ("PM / hibernate:
      Simplify mark_unsafe_pages()"), which caused mark_unsafe_pages() to call
      duplicate_memory_bitmap(), which uses memory_bm_find_bit() after walking
      off the end of the memory bitmap.
      
      Fixes: 3a20cb17 (PM / Hibernate: Implement position keeping in radix tree)
      Signed-off-by: James Morse <james.morse@arm.com>
      [ rjw: Subject ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PM / hibernate: Restore processor state before using per-CPU variables · 119c348a
      Thomas Garnier authored
      commit 62822e2e upstream.
      
      Restore the processor state before calling any other functions to
      ensure per-CPU variables can be used with KASLR memory randomization.
      
      Tracing functions use per-CPU variables (GS based on x86) and one was
      called just before restoring the processor state fully. It resulted
      in a double fault when both the tracing & the exception handler
      functions tried to use a per-CPU variable.
      
      Fixes: bb3632c6 (PM / sleep: trace events for suspend/resume)
      Reported-and-tested-by: Borislav Petkov <bp@suse.de>
      Reported-by: Jiri Kosina <jikos@kernel.org>
      Tested-by: Rafael J. Wysocki <rafael@kernel.org>
      Tested-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Thomas Garnier <thgarnie@google.com>
      Acked-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: paravirt: Fix undefined reference to smp_bootstrap · da40055e
      Matt Redfearn authored
      commit 951c39cd upstream.
      
      If the paravirt machine is compiled without CONFIG_SMP, the following
      linker error occurs:
      
      arch/mips/kernel/head.o: In function `kernel_entry':
      (.ref.text+0x10): undefined reference to `smp_bootstrap'
      
      due to the kernel entry macro always including SMP startup code.
      Wrap this code in CONFIG_SMP to fix the error.

      Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/14212/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Add a missing ".set pop" in an early commit · aebd3358
      Huacai Chen authored
      commit 3cbc6fc9 upstream.
      
      Commit 842dfc11 ("MIPS: Fix build with binutils 2.24.51+") is missing
      a ".set pop" in the macro fpu_restore_16even, so add it.

      Signed-off-by: Huacai Chen <chenhc@lemote.com>
      Acked-by: Manuel Lauss <manuel.lauss@gmail.com>
      Cc: Steven J. Hill <Steven.Hill@caviumnetworks.com>
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14210/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...) · 49cded2a
      Marcin Nowakowski authored
      commit b244614a upstream.
      
      The cpu_has_fpu macro uses smp_processor_id() and is currently executed
      with preemption enabled, which triggers the warning at runtime.
      
      It is assumed throughout the kernel that if any CPU has an FPU, then all
      CPUs would have an FPU as well, so it is safe to perform the check with
      preemption enabled - change the code to use raw_ variant of the check to
      avoid the warning.

      Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14125/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Remove compact branch policy Kconfig entries · 9b30cac4
      Paul Burton authored
      commit b03c1e3b upstream.
      
      Commit c1a0e9bc ("MIPS: Allow compact branch policy to be changed")
      added Kconfig entries allowing for the compact branch policy used by the
      compiler for MIPSr6 kernels to be specified. This can be useful for
      debugging, particularly in systems where compact branches have recently
      been introduced.
      
      Unfortunately mainline gcc 5.x supports MIPSr6 but not the
      -mcompact-branches compiler flag, leading to MIPSr6 kernels failing to
      build with gcc 5.x with errors such as:
      
        mipsel-linux-gnu-gcc: error: unrecognized command line option '-mcompact-branches=optimal'
        make[2]: *** [kernel/bounds.s] Error 1
      
      Fixing this by hiding the Kconfig entry behind another seems to be more
      hassle than it's worth, as MIPSr6 & compact branches have been around
      for a while now and if policy does need to be set for debug it can be
      done easily enough with KCFLAGS. Therefore remove the compact branch
      policy Kconfig entries & their handling in the Makefile.
      
      This reverts commit c1a0e9bc ("MIPS: Allow compact branch policy to
      be changed").

      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Fixes: c1a0e9bc ("MIPS: Allow compact branch policy to be changed")
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14241/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs · 450abddb
      James Hogan authored
      commit 554af0c3 upstream.
      
      The page structures associated with the vDSO pages in the kernel image
      are calculated using virt_to_page(), which uses __pa() under the hood to
      find the pfn associated with the virtual address. The vDSO data pointers
      however point to kernel symbols, so __pa_symbol() should really be used
      instead.
      
      Since there is no equivalent to virt_to_page() which uses __pa_symbol(),
      fix init_vdso_image() to work directly with pfns, calculated with
      __phys_to_pfn(__pa_symbol(...)).
      
      This issue broke the Malta Enhanced Virtual Addressing (EVA)
      configuration which has a non-default implementation of __pa_symbol().
      This is because it uses a physical alias so that the kernel executes
      from KSeg0 (VA 0x80000000 -> PA 0x00000000), while RAM is provided to
      the kernel in the KUSeg range (VA 0x00000000 -> PA 0x80000000) which
      uses the same underlying RAM.
      
      Since there are no page structures associated with the low physical
      address region, some arbitrary kernel memory would be interpreted as a
      page structure for the vDSO pages and badness ensues.
      
      Fixes: ebb5e78c ("MIPS: Initial implementation of a VDSO")
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14229/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: SMP: Fix possibility of deadlock when bringing CPUs online · b8a84b84
      Matt Redfearn authored
      commit 8f46cca1 upstream.
      
      This patch fixes the possibility of a deadlock when bringing up
      secondary CPUs.
      The deadlock occurs because set_cpu_online() is called before
      synchronise_count_slave(). This can cause a deadlock if the boot CPU,
      having scheduled another thread, attempts to send an IPI to the
      secondary CPU, which it sees has been marked online. The secondary is
      blocked in synchronise_count_slave() waiting for the boot CPU to enter
      synchronise_count_master(), but the boot CPU is blocked in
      smp_call_function_many() waiting for the secondary to respond to its
      IPI request.
      
      Fix this by marking the CPU online in cpu_callin_map and synchronising
      counters before declaring the CPU online and calculating the maps for
      IPIs.

      Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
      Reported-by: Justin Chen <justinpopo6@gmail.com>
      Tested-by: Justin Chen <justinpopo6@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14302/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: Fix pre-r6 emulation FPU initialisation · 18720392
      Paul Burton authored
      commit 7e956304 upstream.
      
      In the mipsr2_decoder() function, used to emulate pre-MIPSr6
      instructions that were removed in MIPSr6, the init_fpu() function is
      called if a removed pre-MIPSr6 floating point instruction is the first
      floating point instruction used by the task. However, init_fpu()
      performs various actions that rely upon not being migrated. For example,
      in the most basic case it sets the coprocessor 0 Status.CU1 bit to
      enable the FPU & then loads FP register context into the FPU registers.
      If the task were to migrate during this time, it may end up attempting
      to load FP register context on a different CPU where it hasn't set the
      CU1 bit, leading to errors such as:
      
          do_cpu invoked from kernel context![#2]:
          CPU: 2 PID: 7338 Comm: fp-prctl Tainted: G      D         4.7.0-00424-g49b0c82 #2
          task: 838e4000 ti: 88d38000 task.ti: 88d38000
          $ 0   : 00000000 00000001 ffffffff 88d3fef8
          $ 4   : 838e4000 88d38004 00000000 00000001
          $ 8   : 3400fc01 801f8020 808e9100 24000000
          $12   : dbffffff 807b69d8 807b0000 00000000
          $16   : 00000000 80786150 00400fc4 809c0398
          $20   : 809c0338 0040273c 88d3ff28 808e9d30
          $24   : 808e9d30 00400fb4
          $28   : 88d38000 88d3fe88 00000000 8011a2ac
          Hi    : 0040273c
          Lo    : 88d3ff28
          epc   : 80114178 _restore_fp+0x10/0xa0
          ra    : 8011a2ac mipsr2_decoder+0xd5c/0x1660
          Status: 1400fc03	KERNEL EXL IE
          Cause : 1080002c (ExcCode 0b)
          PrId  : 0001a920 (MIPS I6400)
          Modules linked in:
          Process fp-prctl (pid: 7338, threadinfo=88d38000, task=838e4000, tls=766527d0)
          Stack : 00000000 00000000 00000000 88d3fe98 00000000 00000000 809c0398 809c0338
          	  808e9100 00000000 88d3ff28 00400fc4 00400fc4 0040273c 7fb69e18 004a0000
          	  004a0000 004a0000 7664add0 8010de18 00000000 00000000 88d3fef8 88d3ff28
          	  808e9100 00000000 766527d0 8010e534 000c0000 85755000 8181d580 00000000
          	  00000000 00000000 004a0000 00000000 766527d0 7fb69e18 004a0000 80105c20
          	  ...
          Call Trace:
          [<80114178>] _restore_fp+0x10/0xa0
          [<8011a2ac>] mipsr2_decoder+0xd5c/0x1660
          [<8010de18>] do_ri+0x90/0x6b8
          [<80105c20>] ret_from_exception+0x0/0x10
      
      Fix this by disabling preemption around the call to init_fpu(), ensuring
      that it starts & completes on one CPU.

      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Fixes: b0a668fb ("MIPS: kernel: mips-r2-to-r6-emul: Add R2 emulator for MIPS R6")
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/14305/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended · 05e5e963
      Sudeep Holla authored
      commit 331dcf42 upstream.
      
      If the i2c device is already runtime suspended, executing
      qup_i2c_suspend during suspend-to-idle or suspend-to-ram results in the
      following splat:
      
      WARNING: CPU: 3 PID: 1593 at drivers/clk/clk.c:476 clk_core_unprepare+0x80/0x90
      Modules linked in:
      
      CPU: 3 PID: 1593 Comm: bash Tainted: G        W       4.8.0-rc3 #14
      Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
      PC is at clk_core_unprepare+0x80/0x90
      LR is at clk_unprepare+0x28/0x40
      pc : [<ffff0000086eecf0>] lr : [<ffff0000086f0c58>] pstate: 60000145
      Call trace:
       clk_core_unprepare+0x80/0x90
       qup_i2c_disable_clocks+0x2c/0x68
       qup_i2c_suspend+0x10/0x20
       platform_pm_suspend+0x24/0x68
       ...
      
      This patch fixes the issue by executing qup_i2c_pm_suspend_runtime
      conditionally in qup_i2c_suspend.

      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Reviewed-by: Andy Gross <andy.gross@linaro.org>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • i2c-eg20t: fix race between i2c init and interrupt enable · bbf1c4d2
      Yadi.hu authored
      commit 371a0153 upstream.
      
      The eg20t driver calls request_irq() before pch_base_address, the base
      address of the i2c controller's registers, is assigned a valid value.

      When the i2c controller shares an interrupt number with other devices,
      an interrupt that does not originate from the eg20t can arrive
      immediately after request_irq() is executed. Since the interrupt handler
      pch_i2c_handler() is already active as a shared action, it will be called
      and will read its own register to determine whether the interrupt is
      its own.

      At that moment, the base address of the i2c registers has not yet been
      remapped into kernel space, so the interrupt handler accesses an illegal
      address and an error occurs.

      Signed-off-by: Yadi.hu <yadi.hu@windriver.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • btrfs: ensure that file descriptor used with subvol ioctls is a dir · e944e698
      Jeff Mahoney authored
      commit 325c50e3 upstream.
      
      If the subvol/snapshot create/destroy ioctls are passed a regular file
      with execute permissions set, we'll eventually Oops while trying to do
      inode->i_op->lookup via lookup_one_len.
      
      This patch ensures that the file descriptor refers to a directory.
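A user-space analogue of this kind of check can be sketched with `fstat()` and `S_ISDIR()`. This is not the btrfs code: the function names are hypothetical, and returning `-ENOTDIR` is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 0 if fd refers to a directory, -ENOTDIR if it refers to
 * something else, or a negative errno value if fstat() fails.
 */
static int require_directory_fd(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -errno;
    return S_ISDIR(st.st_mode) ? 0 : -ENOTDIR;
}

/* Hypothetical helper: open a path read-only and run the check on the fd. */
static int check_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    int ret;

    if (fd < 0)
        return -errno;
    ret = require_directory_fd(fd);
    close(fd);
    return ret;
}
```

Rejecting a non-directory descriptor up front means later code that assumes directory operations are available is never reached with a regular file.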
      
      Fixes: cb8e7090 (Btrfs: Fix subvolume creation locking rules)
      Fixes: 76dda93c (Btrfs: add snapshot/subvolume destroy ioctl)
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nl80211: validate number of probe response CSA counters · f1b01a34
      Johannes Berg authored
      commit ad5987b4 upstream.
      
      Due to an apparent copy/paste bug, the number of counters for the
      beacon configuration was checked twice, instead of checking the
      number of probe response counters. Fix this to check the number of
      probe response counters before parsing those.
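The class of bug described here, where two similar fields need the same bound check and one check is accidentally repeated, can be sketched as follows. The struct, field names and limit are hypothetical and do not reflect nl80211's actual data structures.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_MAX_COUNTERS 2u   /* hypothetical limit for the sketch */

struct sketch_csa_params {
    unsigned int n_beacon_counters;
    unsigned int n_presp_counters;   /* probe response counters */
};

/* Each count gets its own bound check before its data would be parsed. */
static int validate_csa_counts(const struct sketch_csa_params *p)
{
    if (p->n_beacon_counters > SKETCH_MAX_COUNTERS)
        return -EINVAL;
    /* The copy/paste pattern would test n_beacon_counters again here. */
    if (p->n_presp_counters > SKETCH_MAX_COUNTERS)
        return -EINVAL;
    return 0;
}
```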
      
      Fixes: 9a774c78 ("cfg80211: Support multiple CSA counters")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • can: flexcan: fix resume function · a68022d9
      Fabio Estevam authored
      commit 4de349e7 upstream.
      
      On an imx6ul-pico board the following error is seen during system suspend:
      
      dpm_run_callback(): platform_pm_resume+0x0/0x54 returns -110
      PM: Device 2090000.flexcan failed to resume: error -110
      
      This suspend error occurs because the clocks are disabled when the CAN
      interface is not active, so flexcan_chip_enable() always fails with a
      timeout error.
      
      In order to fix this issue, only call flexcan_chip_enable/disable()
      when the CAN interface is active.
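The guard can be modelled in a few lines. This is a simplified sketch under stated assumptions: `struct sketch_can_dev` and all function names are hypothetical, and the hardware enable is reduced to "times out unless clocks are on".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct sketch_can_dev {
    bool interface_up;   /* is the network interface running? */
    bool clocks_on;      /* in this model, clocks are on only while it is up */
};

/* Stand-in for a hardware enable that times out without clocks. */
static int sketch_chip_enable(const struct sketch_can_dev *dev)
{
    return dev->clocks_on ? 0 : -ETIMEDOUT;
}

/* Unguarded resume: always touches the hardware. */
static int resume_unconditional(const struct sketch_can_dev *dev)
{
    return sketch_chip_enable(dev);
}

/* Guarded resume: skips the hardware when the interface is not active. */
static int resume_guarded(const struct sketch_can_dev *dev)
{
    if (!dev->interface_up)
        return 0;
    return sketch_chip_enable(dev);
}
```

On Linux, `ETIMEDOUT` is the errno value behind the `-110` in the log above.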
      
      Based on a patch from Dong Aisheng in the NXP kernel.

      Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm: delete unnecessary and unsafe init_tlb_ubc() · 7af3e4e1
      Hugh Dickins authored
      commit b385d21f upstream.
      
      init_tlb_ubc() looked unnecessary to me: tlb_ubc is statically
      initialized with zeroes in the init_task, and copied from parent to
      child while it is quiescent in arch_dup_task_struct(); so I went to
      delete it.
      
      But inserted temporary debug WARN_ONs in place of init_tlb_ubc() to
      check that it was always empty at that point, and found them firing:
      because memcg reclaim can recurse into global reclaim (when allocating
      biosets for swapout in my case), and arrive back at the init_tlb_ubc()
      in shrink_node_memcg().
      
      Resetting tlb_ubc.flush_required at that point is wrong: if the upper
      level needs a deferred TLB flush, but the lower level turns out not to,
      we miss a TLB flush.  But fortunately, that's the only part of the
      protocol that does not nest: with the initialization removed, cpumask
      collects bits from upper and lower levels, and flushes TLB when needed.
      
      Fixes: 72b252ae ("mm: send one IPI per CPU to TLB flush all entries after unmapping pages")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Move mutex to protect against resetting of seq data · 8b275b45
      Steven Rostedt (Red Hat) authored
      commit 1245800c upstream.
      
      The iter->seq can be reset outside the protection of the mutex. So can
      reading of user data. Move the mutex up to the beginning of the function.
      
      Fixes: d7350c3f ("tracing/core: make the read callbacks reentrants")
      Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fix memory leaks in tracing_buffers_splice_read() · 369796a8
      Al Viro authored
      commit 1ae2293d upstream.

      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • power: reset: hisi-reboot: Unmap region obtained by of_iomap · 3ee1b560
      Arvind Yadav authored
      commit bae170ef upstream.
      
      Free the memory mapping if probe is not successful.
      
      Fixes: 4a9b3737 ("power: reset: move hisilicon reboot code")
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Signed-off-by: Sebastian Reichel <sre@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: pmcmsp-flash: Allocating too much in init_msp_flash() · a52c63ad
      Dan Carpenter authored
      commit 79ad07d4 upstream.
      
      There is a cut and paste issue here.  The bug is that we are allocating
      more memory than necessary for msp_maps.  We should be allocating enough
      space for a map_info struct (144 bytes) but we instead allocate enough
      for an mtd_info struct (1840 bytes).  It's a small waste.
      
      The other part of this is not harmful: when we allocated msp_flash,
      we allocated enough space for a map_info pointer instead of an
      mtd_info pointer.  But since pointers are the same size it works out
      fine.
      
      Anyway, I decided to clean up all three allocations a bit to make them
      a bit more consistent and clear.
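One common way to keep an allocation size tied to the variable it is assigned to is to take `sizeof` of the dereferenced pointer rather than naming a type. The sketch below uses user-space `calloc()` and hypothetical struct names; it illustrates the idiom and is not the driver's code.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_map { char name[16]; };       /* small, intended element type */
struct sketch_mtd { char payload[256]; };   /* larger, unrelated type */

/*
 * The element size comes from the pointer being assigned, so a
 * copy-pasted type name cannot silently change the allocation size.
 */
static struct sketch_map *alloc_maps(size_t count)
{
    struct sketch_map *maps = calloc(count, sizeof(*maps));

    return maps;
}
```

Writing `sizeof(struct sketch_mtd)` in place of `sizeof(*maps)` would still compile and run, which is why this kind of over-allocation tends to go unnoticed.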
      
      Fixes: 68aa0fa8 ('[MTD] PMC MSP71xx flash/rootfs mappings')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Brian Norris <computersforpeace@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mtd: maps: sa1100-flash: potential NULL dereference · 45987838
      Dan Carpenter authored
      commit dc01a28d upstream.
      
      We check for NULL but then dereference "info->mtd" on the next line.
      
      Fixes: 72169755 ('mtd: maps: sa1100-flash: show parent device in sysfs')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Brian Norris <computersforpeace@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fix fault_in_multipages_...() on architectures with no-op access_ok() · 3f5d8326
      Al Viro authored
      commit e23d4159 upstream.
      
      Switching iov_iter fault-in to multipages variants has exposed an old
      bug in underlying fault_in_multipages_...(); they break if the range
      passed to them wraps around.  Normally access_ok() done by callers will
      prevent such (and it's a guaranteed EFAULT - ERR_PTR() values fall into
      such a range and they should not point to any valid objects).
      
      However, on architectures where userland and kernel live in different
      MMU contexts (e.g. s390) access_ok() is a no-op and on those a range
      with a wraparound can reach fault_in_multipages_...().
      
      Since any wraparound means EFAULT there, the fix is trivial - turn
      those
      
          while (uaddr <= end)
      	    ...
      into
      
          if (unlikely(uaddr > end))
      	    return -EFAULT;
          do
      	    ...
          while (uaddr <= end);
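The effect of the up-front wraparound test can be tried in user space with plain integer arithmetic. This sketch is an assumption-laden model, not the kernel helper: the function name and page size are hypothetical, no memory is touched, and an extra guard is added so the loop cannot overflow near the top of the address space.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)   /* hypothetical page size */

/*
 * Returns how many loop steps a page-stride walk over [uaddr, uaddr+size)
 * would take, or -EFAULT if the range wraps around the address space.
 */
static long sketch_fault_in_steps(uintptr_t uaddr, uintptr_t size)
{
    uintptr_t end;
    long steps = 0;

    if (size == 0)
        return 0;

    end = uaddr + size - 1;          /* unsigned arithmetic, may wrap */
    if (uaddr > end)
        return -EFAULT;

    do {
        steps++;                     /* stand-in for touching one byte */
        if (end - uaddr < SKETCH_PAGE_SIZE)
            break;                   /* the next step would pass end */
        uaddr += SKETCH_PAGE_SIZE;
    } while (uaddr <= end);

    return steps;
}
```

A range that starts 16 bytes below the top of the address space with a length of 32 wraps, so the sketch reports `-EFAULT` instead of entering the loop.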

      Reported-by: Jan Stancek <jstancek@redhat.com>
      Tested-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fanotify: fix list corruption in fanotify_get_response() · 6e67de39
      Jan Kara authored
      commit 96d41019 upstream.
      
      fanotify_get_response() calls fsnotify_remove_event() when it finds that
      group is being released from fanotify_release() (bypass_perm is set).
      
      However the event it removes need not be only in the group's notification
      queue but it can have already moved to access_list (userspace read the
      event before closing the fanotify instance fd) which is protected by a
      different lock.  Thus when fsnotify_remove_event() races with
      fanotify_release() operating on access_list, the list can get corrupted.
      
      Fix the problem by moving all the logic removing permission events from
      the lists to one place - fanotify_release().
      
      Fixes: 5838d444 ("fanotify: fix double free of pending permission events")
      Link: http://lkml.kernel.org/r/1473797711-14111-3-git-send-email-jack@suse.cz
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reported-by: Miklos Szeredi <mszeredi@redhat.com>
      Tested-by: Miklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fsnotify: add a way to stop queueing events on group shutdown · af426ec1
      Jan Kara authored
      commit 12703dbf upstream.
      
      Implement a function that can be called when a group is being shut down
      to stop queueing new events to the group.  Fanotify will use this.
      
      Fixes: 5838d444 ("fanotify: fix double free of pending permission events")
      Link: http://lkml.kernel.org/r/1473797711-14111-2-git-send-email-jack@suse.cz
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfs: prevent dropping ioend completions during buftarg wait · fc4edddc
      Brian Foster authored
      commit 800b2694 upstream.
      
      xfs_wait_buftarg() waits for all pending I/O, drains the ioend
      completion workqueue and walks the LRU until all buffers in the cache
      have been released. This is traditionally an unmount operation, but the
      mechanism is also reused during filesystem freeze.
      
      xfs_wait_buftarg() invokes drain_workqueue() as part of the quiesce,
      which is intended more for a shutdown sequence in that it indicates to
      the queue that new operations are not expected once the drain has begun.
      New work jobs after this point result in a WARN_ON_ONCE() and are
      otherwise dropped.
      
      With filesystem freeze, however, read operations are allowed and can
      proceed during or after the workqueue drain. If such a read occurs
      during the drain sequence, the workqueue infrastructure complains about
      the queued ioend completion work item and drops it on the floor. As a
      result, the buffer remains on the LRU and the freeze never completes.
      
      Despite the fact that the overall buffer cache cleanup is not necessary
      during freeze, fix up this operation such that it is safe to invoke
      during non-unmount quiesce operations. Replace the drain_workqueue()
      call with flush_workqueue(), which runs a similar serialization on
      pending workqueue jobs without causing new jobs to be dropped. This is
      safe for unmount as unmount independently locks out new operations by
      the time xfs_wait_buftarg() is invoked.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc4edddc
    • Ian Kent's avatar
      autofs: use dentry flags to block walks during expire · 2ccb99b2
      Ian Kent authored
      commit 7cbdb4a2 upstream.
      
      Somewhere along the way the autofs expire operation has changed to hold
      a spin lock over expired dentry selection.  The autofs indirect mount
      expired dentry selection is complicated and quite lengthy so it isn't
      appropriate to hold a spin lock over the operation.
      
      Commit 47be6184 ("fs/dcache.c: avoid soft-lockup in dput()") added a
      might_sleep() to dput() causing a WARN_ONCE() about this usage to be
      issued.
      
      But the spin lock doesn't need to be held over this check; the autofs
      dentry info flags are enough to block walks into dentries during the
      expire.
      
      I've left the direct mount expire as it is (for now) because it is much
      simpler and quicker than the indirect mount expire and adding spin lock
      release and re-acquires would do nothing more than add overhead.
      
      Fixes: 47be6184 ("fs/dcache.c: avoid soft-lockup in dput()")
      Link: http://lkml.kernel.org/r/20160912014017.1773.73060.stgit@pluto.themaw.net
      Signed-off-by: Ian Kent <raven@themaw.net>
      Reported-by: Takashi Iwai <tiwai@suse.de>
      Tested-by: Takashi Iwai <tiwai@suse.de>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: NeilBrown <neilb@suse.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ccb99b2
    • Al Viro's avatar
      autofs races · 30b54a26
      Al Viro authored
      commit ea01a184 upstream.
      
      * make autofs4_expire_indirect() skip the dentries that are in the
      process of expiry
      * do *not* mess with list_move(); making sure that dentries with
      AUTOFS_INF_EXPIRING are not picked for expiry is enough.
      * do not remove NO_RCU when we set EXPIRING, don't bother with smp_mb()
      there.  Clear it at the same time we clear EXPIRING.  Makes a bunch of
      tests simpler.
      * rename NO_RCU to WANT_EXPIRE, which is what it really is.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Ian Kent <raven@themaw.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      30b54a26
    • Thierry Reding's avatar
      pwm: Mark all devices as "might sleep" · 9aea5e0d
      Thierry Reding authored
      commit ff01c944 upstream.
      
      Commit d1cd2142 ("pwm: Set enable state properly on failed call to
      enable") introduced a mutex that is needed to protect internal state of
      PWM devices. Since that mutex is acquired in pwm_set_polarity() and in
      pwm_enable() and might potentially block, all PWM devices effectively
      become "might sleep".
      
      It's rather pointless to keep the .can_sleep field around, but given
      that there are external users let's postpone the removal for the next
      release cycle.
      Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Fixes: d1cd2142 ("pwm: Set enable state properly on failed call to enable")
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      
      9aea5e0d
    • Davide Caratti's avatar
      bridge: re-introduce 'fix parsing of MLDv2 reports' · fd2e3102
      Davide Caratti authored
      [ Upstream commit 9264251e ]
      
      commit bc8c20ac ("bridge: multicast: treat igmpv3 report with
      INCLUDE and no sources as a leave") seems to have accidentally reverted
      commit 47cc84ce ("bridge: fix parsing of MLDv2 reports"). This
      commit brings back a change to br_ip6_multicast_mld2_report() where
      parsing of MLDv2 reports stops when the first group is successfully
      added to the MDB cache.
      
      Fixes: bc8c20ac ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fd2e3102
    • Russell King's avatar
      net: smc91x: fix SMC accesses · 8c945f5a
      Russell King authored
      [ Upstream commit 2fb04fdf ]
      
      Commit b70661c7 ("net: smc91x: use run-time configuration on all ARM
      machines") broke some ARM platforms through several mistakes.  Firstly,
      the access size must correspond to the following rule:
      
      (a) at least one of 16-bit or 8-bit access size must be supported
      (b) 32-bit accesses are optional, and may be enabled in addition to
          the above.
      
      Secondly, it provides no emulation of 16-bit accesses, instead blindly
      making 16-bit accesses even when the platform specifies that only 8-bit
      is supported.
      
      Reorganise smc91x.h so we can make use of the existing 16-bit access
      emulation already provided - if 16-bit accesses are supported, use
      16-bit accesses directly, otherwise if 8-bit accesses are supported,
      use the provided 16-bit access emulation.  If neither, BUG().  This
      exactly reflects the driver behaviour prior to the commit being fixed.
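
      The selection rule described above can be sketched as follows. This
      is an illustrative userspace model, not the driver's code: the flag
      names ACCESS_8BIT, ACCESS_16BIT and ACCESS_32BIT, the fake register
      array and the helper names are all assumptions made for the example,
      and assert() stands in for BUG().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability flags; not the driver's real macros. */
#define ACCESS_8BIT  (1u << 0)
#define ACCESS_16BIT (1u << 1)
#define ACCESS_32BIT (1u << 2)

/* Rule (a)/(b): 8-bit or 16-bit is mandatory, 32-bit is optional. */
bool access_flags_valid(unsigned int flags)
{
	return (flags & (ACCESS_8BIT | ACCESS_16BIT)) != 0;
}

/* Fake little-endian register file standing in for the device. */
static uint8_t regs[4] = { 0x34, 0x12, 0x78, 0x56 };

static uint8_t read8(unsigned int off)
{
	return regs[off];
}

static uint16_t read16(unsigned int off)
{
	return (uint16_t)(regs[off] | (regs[off + 1] << 8));
}

/*
 * Read a 16-bit register: directly if 16-bit accesses are supported,
 * otherwise emulated with two 8-bit reads. A platform that offers
 * neither is rejected up front.
 */
uint16_t reg_read16(unsigned int flags, unsigned int off)
{
	assert(access_flags_valid(flags));

	if (flags & ACCESS_16BIT)
		return read16(off);
	return (uint16_t)(read8(off) | (read8(off + 1) << 8));
}
```

      A platform that declares only ACCESS_8BIT takes the emulated path,
      while one that declares all three sizes uses the direct 16-bit read.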
      
      Since the conversion incorrectly cut down the available access sizes on
      several platforms, we also need to go through every platform and fix up
      the overly-restrictive access size: Arnd assumed that if a platform can
      perform 32-bit, 16-bit and 8-bit accesses, then only a 32-bit access
      size needed to be specified - not so, all available access sizes must
      be specified.
      
      This likely fixes some performance regressions in doing this: if a
      platform does not support 8-bit accesses, 8-bit accesses have been
      emulated by performing a 16-bit read-modify-write access.
      
      Tested on the Intel Assabet/Neponset platform, which supports only 8-bit
      accesses, which was broken by the original commit.
      
      Fixes: b70661c7 ("net: smc91x: use run-time configuration on all ARM machines")
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8c945f5a
    • Xander Huff's avatar
      Revert "phy: IRQ cannot be shared" · 339d61ab
      Xander Huff authored
      [ Upstream commit c3e70edd ]
      
      This reverts:
        commit 33c133cc ("phy: IRQ cannot be shared")
      
      On hardware with multiple PHY devices hooked up to the same IRQ line, allow
      them to share it.
      
      Sergei Shtylyov says:
        "I'm not sure now what was the reason I concluded that the IRQ sharing
        was impossible... most probably I thought that the kernel IRQ handling
        code exited the loop over the IRQ actions once IRQ_HANDLED was returned
        -- which is obviously not so in reality..."
      Signed-off-by: Xander Huff <xander.huff@ni.com>
      Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      339d61ab
    • Florian Fainelli's avatar
      net: dsa: bcm_sf2: Fix race condition while unmasking interrupts · a3fb2b3b
      Florian Fainelli authored
      [ Upstream commit 4f101c47 ]
      
      We kept shadow copies of which interrupt sources we have enabled and
      disabled, but due to an order bug in how intrl2_mask_clear was defined,
      we could run into the following scenario:
      
      CPU0					CPU1
      intrl2_1_mask_clear(..)
      sets INTRL2_CPU_MASK_CLEAR
      					bcm_sf2_switch_1_isr
      					read INTRL2_CPU_STATUS and masks with stale
      					irq1_mask value
      updates irq1_mask value
      
      Which would make us loop again and again trying to process an interrupt
      we are not clearing since our copy of whether it was enabled before
      still indicates it was not. Fix this by updating the shadow copy first,
      and then unmasking at the HW level.
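
      The ordering can be illustrated with a small single-threaded model.
      This is a sketch under stated assumptions, not the driver's code:
      struct intc and its helpers are hypothetical, a set bit in either
      mask means "masked", and real concurrent code would additionally
      depend on the driver's register accessors for ordering.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the interrupt controller state. */
struct intc {
	uint32_t hw_mask;      /* hardware mask register: 1 = masked */
	uint32_t shadow_mask;  /* software copy consulted by the ISR */
	uint32_t status;       /* pending interrupt sources */
};

/* Unmask sources: update the shadow copy first, then the hardware. */
void intc_unmask(struct intc *c, uint32_t bits)
{
	c->shadow_mask &= ~bits;  /* 1: the ISR will now accept these sources */
	c->hw_mask &= ~bits;      /* 2: only now can the hardware raise them */
}

/* What the ISR treats as work: pending and not masked in software. */
uint32_t intc_isr_pending(const struct intc *c)
{
	return c->status & ~c->shadow_mask;
}

void intc_demo(void)
{
	/* Buggy interleaving: HW already unmasked, shadow copy still stale. */
	struct intc c = { .hw_mask = 0x0, .shadow_mask = 0x1, .status = 0x1 };

	assert(intc_isr_pending(&c) == 0);  /* ISR ignores a live interrupt */

	/* Fixed ordering, applied from a fully masked state. */
	c.hw_mask = 0x1;
	intc_unmask(&c, 0x1);
	assert(intc_isr_pending(&c) == 0x1);
}
```

      With the shadow copy updated first, there is no point at which the
      hardware can deliver an interrupt that the software mask still hides.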
      
      Fixes: 246d7f77 ("net: dsa: add Broadcom SF2 switch driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3fb2b3b
    • Paul Blakey's avatar
      net/mlx5: Added missing check of msg length in verifying its signature · c03c024f
      Paul Blakey authored
      [ Upstream commit 2c0f8ce1 ]
      
      Set and verify signature calculates the signature for each of the
      mailbox nodes, even for those that are unused (from cache). Added
      a missing length check to set and verify only those which are used.
      
      While here, also moved the setting of msg's nodes token to where we
      already go over them. This saves a pass because checksum is disabled,
      and the only useful thing remaining that set signature does is setting
      the token.
      
      Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters')
      Signed-off-by: Paul Blakey <paulb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c03c024f
    • Vegard Nossum's avatar
      tipc: fix NULL pointer dereference in shutdown() · 4be4511a
      Vegard Nossum authored
      [ Upstream commit d2fbdf76 ]
      
      tipc_msg_create() can return a NULL skb and if so, we shouldn't try to
      call tipc_node_xmit_skb() on it.
      
          general protection fault: 0000 [#1] PREEMPT SMP KASAN
          CPU: 3 PID: 30298 Comm: trinity-c0 Not tainted 4.7.0-rc7+ #19
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
          task: ffff8800baf09980 ti: ffff8800595b8000 task.ti: ffff8800595b8000
          RIP: 0010:[<ffffffff830bb46b>]  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
          RSP: 0018:ffff8800595bfce8  EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003023b0e0
          RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff83d12580
          RBP: ffff8800595bfd78 R08: ffffed000b2b7f32 R09: 0000000000000000
          R10: fffffbfff0759725 R11: 0000000000000000 R12: 1ffff1000b2b7f9f
          R13: ffff8800595bfd58 R14: ffffffff83d12580 R15: dffffc0000000000
          FS:  00007fcdde242700(0000) GS:ffff88011af80000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fcddde1db10 CR3: 000000006874b000 CR4: 00000000000006e0
          DR0: 00007fcdde248000 DR1: 00007fcddd73d000 DR2: 00007fcdde248000
          DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
          Stack:
           0000000000000018 0000000000000018 0000000041b58ab3 ffffffff83954208
           ffffffff830bb400 ffff8800595bfd30 ffffffff8309d767 0000000000000018
           0000000000000018 ffff8800595bfd78 ffffffff8309da1a 00000000810ee611
          Call Trace:
           [<ffffffff830c84a3>] tipc_shutdown+0x553/0x880
           [<ffffffff825b4a3b>] SyS_shutdown+0x14b/0x170
           [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
           [<ffffffff83295ca5>] entry_SYSCALL64_slow_path+0x25/0x25
          Code: 90 00 b4 0b 83 c7 00 f1 f1 f1 f1 4c 8d 6d e0 c7 40 04 00 00 00 f4 c7 40 08 f3 f3 f3 f3 48 89 d8 48 c1 e8 03 c7 45 b4 00 00 00 00 <80> 3c 30 00 75 78 48 8d 7b 08 49 8d 75 c0 48 b8 00 00 00 00 00
          RIP  [<ffffffff830bb46b>] tipc_node_xmit_skb+0x6b/0x140
           RSP <ffff8800595bfce8>
          ---[ end trace 57b0484e351e71f1 ]---
      
      I feel like we should maybe return -ENOMEM or -ENOBUFS, but I'm not sure
      userspace is equipped to handle that. Anyway, this is better than a GPF
      and looks somewhat consistent with other tipc_msg_create() callers.
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4be4511a
    • Vegard Nossum's avatar
      net/irda: handle iriap_register_lsap() allocation failure · 8d0d2ce6
      Vegard Nossum authored
      [ Upstream commit 5ba092ef ]
      
      If iriap_register_lsap() fails to allocate memory, self->lsap is
      set to NULL. However, none of the callers handle the failure and
      irlmp_connect_request() will happily dereference it:
      
          iriap_register_lsap: Unable to allocated LSAP!
          ================================================================================
          UBSAN: Undefined behaviour in net/irda/irlmp.c:378:2
          member access within null pointer of type 'struct lsap_cb'
          CPU: 1 PID: 15403 Comm: trinity-c0 Not tainted 4.8.0-rc1+ #81
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org
          04/01/2014
           0000000000000000 ffff88010c7e78a8 ffffffff82344f40 0000000041b58ab3
           ffffffff84f98000 ffffffff82344e94 ffff88010c7e78d0 ffff88010c7e7880
           ffff88010630ad00 ffffffff84a5fae0 ffffffff84d3f5c0 000000000000017a
          Call Trace:
           [<ffffffff82344f40>] dump_stack+0xac/0xfc
           [<ffffffff8242f5a8>] ubsan_epilogue+0xd/0x8a
           [<ffffffff824302bf>] __ubsan_handle_type_mismatch+0x157/0x411
           [<ffffffff83b7bdbc>] irlmp_connect_request+0x7ac/0x970
           [<ffffffff83b77cc0>] iriap_connect_request+0xa0/0x160
           [<ffffffff83b77f48>] state_s_disconnect+0x88/0xd0
           [<ffffffff83b78904>] iriap_do_client_event+0x94/0x120
           [<ffffffff83b77710>] iriap_getvaluebyclass_request+0x3e0/0x6d0
           [<ffffffff83ba6ebb>] irda_find_lsap_sel+0x1eb/0x630
           [<ffffffff83ba90c8>] irda_connect+0x828/0x12d0
           [<ffffffff833c0dfb>] SYSC_connect+0x22b/0x340
           [<ffffffff833c7e09>] SyS_connect+0x9/0x10
           [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
           [<ffffffff845f946a>] entry_SYSCALL64_slow_path+0x25/0x25
          ================================================================================
      
      The bug seems to have been around since forever.
      
      There's more problems with missing error checks in iriap_init() (and
      indeed all of irda_init()), but that's a bigger problem that needs
      very careful review and testing. This patch will fix the most serious
      bug (as it's easily reached from unprivileged userspace).
      
      I have tested my patch with a reproducer.
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d0d2ce6
    • Lance Richardson's avatar
      vti: flush x-netns xfrm cache when vti interface is removed · 0bb225a0
      Lance Richardson authored
      [ Upstream commit a5d0dc81 ]
      
      When executing the script included below, the netns delete operation
      hangs with the following message (repeated at 10 second intervals):
      
        kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
      
      This occurs because a reference to the lo interface in the "secure" netns
      is still held by a dst entry in the xfrm bundle cache in the init netns.
      
      Address this problem by garbage collecting the tunnel netns flow cache
      when a cross-namespace vti interface receives a NETDEV_DOWN notification.
      
      A more detailed description of the problem scenario (referencing commands
      in the script below):
      
      (1) ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
      
        The vti_test interface is created in the init namespace. vti_tunnel_init()
        attaches a struct ip_tunnel to the vti interface's netdev_priv(dev),
        setting the tunnel net to &init_net.
      
      (2) ip link set vti_test netns secure
      
        The vti_test interface is moved to the "secure" netns. Note that
        the associated struct ip_tunnel still has tunnel->net set to &init_net.
      
      (3) ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1
      
        The first packet sent using the vti device causes xfrm_lookup() to be
        called as follows:
      
            dst = xfrm_lookup(tunnel->net, skb_dst(skb), fl, NULL, 0);
      
        Note that tunnel->net is the init namespace, while skb_dst(skb) references
        the vti_test interface in the "secure" namespace. The returned dst
        references an interface in the init namespace.
      
        Also note that the first parameter to xfrm_lookup() determines which flow
        cache is used to store the computed xfrm bundle, so after xfrm_lookup()
        returns there will be a cached bundle in the init namespace flow cache
        with a dst referencing a device in the "secure" namespace.
      
      (4) ip netns del secure
      
        Kernel begins to delete the "secure" namespace.  At some point the
        vti_test interface is deleted, at which point dst_ifdown() changes
        the dst->dev in the cached xfrm bundle flow from vti_test to lo (still
        in the "secure" namespace however).
        Since nothing has happened to cause the init namespace's flow cache
        to be garbage collected, this dst remains attached to the flow cache,
        so the kernel loops waiting for the last reference to lo to go away.
      
      <Begin script>
      ip link add br1 type bridge
      ip link set dev br1 up
      ip addr add dev br1 1.1.1.1/8
      
      ip netns add secure
      ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
      ip link set vti_test netns secure
      ip netns exec secure ip link set vti_test up
      ip netns exec secure ip link s lo up
      ip netns exec secure ip addr add dev lo 192.168.100.1/24
      ip netns exec secure ip route add 192.168.200.0/24 dev vti_test
      ip xfrm policy flush
      ip xfrm state flush
      ip xfrm policy add dir out tmpl src 1.1.1.1 dst 1.1.1.2 \
         proto esp mode tunnel mark 1
      ip xfrm policy add dir in tmpl src 1.1.1.2 dst 1.1.1.1 \
         proto esp mode tunnel mark 1
      ip xfrm state add src 1.1.1.1 dst 1.1.1.2 proto esp spi 1 \
         mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
      ip xfrm state add src 1.1.1.2 dst 1.1.1.1 proto esp spi 1 \
         mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
      
      ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1
      
      ip netns del secure
      <End script>
      Reported-by: Hangbin Liu <haliu@redhat.com>
      Reported-by: Jan Tluka <jtluka@redhat.com>
      Signed-off-by: Lance Richardson <lrichard@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0bb225a0
    • Linus Torvalds's avatar
      af_unix: split 'u->readlock' into two: 'iolock' and 'bindlock' · 9b5390d7
      Linus Torvalds authored
      commit 6e1ce3c3 upstream.
      
      Right now we use the 'readlock' both for protecting some of the af_unix
      IO path and for making the bind be single-threaded.
      
      The two are independent, but using the same lock makes for a nasty
      deadlock due to ordering with regard to filesystem locking.  The bind
      locking would want to nest outside the VFS pathname locking, but the IO
      locking wants to nest inside some of those same locks.
      
      We tried to fix this earlier with commit c845acb3 ("af_unix: Fix
      splice-bind deadlock") which moved the readlock inside the vfs locks,
      but that caused problems with overlayfs that will then call back into
      filesystem routines that take the lock in the wrong order anyway.
      
      Splitting the locks means that we can go back to having the bind lock be
      the outermost lock, and we don't have any deadlocks with lock ordering.
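
      A minimal userspace sketch of the resulting lock order, assuming a
      single vfs_lock as a stand-in for the filesystem locks. The
      struct sock_model type and the model_*() helpers are hypothetical;
      only the nesting order (bind lock outside, IO lock inside) is taken
      from the description above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the socket; not the kernel's structure. */
struct sock_model {
	pthread_mutex_t bindlock;  /* serialises bind; outermost lock */
	pthread_mutex_t iolock;    /* protects the IO path; innermost lock */
	int bound;
	int bytes_read;
};

/* Stand-in for the filesystem locks that sit between the two. */
static pthread_mutex_t vfs_lock = PTHREAD_MUTEX_INITIALIZER;

void sock_model_init(struct sock_model *s)
{
	pthread_mutex_init(&s->bindlock, NULL);
	pthread_mutex_init(&s->iolock, NULL);
	s->bound = 0;
	s->bytes_read = 0;
}

/* Bind path: bindlock, then vfs_lock. */
void model_bind(struct sock_model *s)
{
	pthread_mutex_lock(&s->bindlock);
	pthread_mutex_lock(&vfs_lock);
	s->bound = 1;
	pthread_mutex_unlock(&vfs_lock);
	pthread_mutex_unlock(&s->bindlock);
}

/* IO path entered from filesystem code: vfs_lock, then iolock. */
void model_splice_read(struct sock_model *s, int n)
{
	pthread_mutex_lock(&vfs_lock);
	pthread_mutex_lock(&s->iolock);
	s->bytes_read += n;
	pthread_mutex_unlock(&s->iolock);
	pthread_mutex_unlock(&vfs_lock);
}

void sock_model_demo(void)
{
	struct sock_model s;

	sock_model_init(&s);
	model_bind(&s);
	model_splice_read(&s, 5);
	assert(s.bound == 1 && s.bytes_read == 5);
}
```

      Both paths now respect one global order (bindlock, vfs_lock, iolock).
      With a single readlock, the bind path would take it before vfs_lock
      and the IO path after it, which is the inversion that could deadlock.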
      Acked-by: Rainer Weikusat <rweikusat@cyberadapt.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b5390d7
    • Linus Torvalds's avatar
      Revert "af_unix: Fix splice-bind deadlock" · 941f6995
      Linus Torvalds authored
      commit 38f7bd94 upstream.
      
      This reverts commit c845acb3.
      
      It turns out that it just replaces one deadlock with another one: we can
      still get the wrong lock ordering with the readlock due to overlayfs
      calling back into the filesystem layer and still taking the vfs locks
      after the readlock.
      
      The proper solution ends up being to just split the readlock into two
      pieces: the bind lock (taken *outside* the vfs locks) and the IO lock
      (taken *inside* the filesystem locks).  The two locks are independent
      anyway.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      941f6995
    • Mahesh Bandewar's avatar
      bonding: Fix bonding crash · f357a798
      Mahesh Bandewar authored
      [ Upstream commit 24b27fc4 ]
      
      The following steps will crash the kernel:
      
        (a) Create bonding master
            > modprobe bonding miimon=50
        (b) Create macvlan bridge on eth2
            > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
      	   type macvlan
        (c) Now try adding eth2 into the bond
            > echo +eth2 > /sys/class/net/bond0/bonding/slaves
            <crash>
      
      Bonding does lots of things before checking if the device enslaved is
      busy or not.
      
      In this case when the notifier call-chain sends notifications, the
      bond_netdev_event() assumes that the rx_handler/rx_handler_data is
      registered while the bond_enslave() hasn't progressed far enough to
      register rx_handler for the new slave.
      
      This patch adds a rx_handler check that can be performed right at the
      beginning of the enslave code to avoid getting into this situation.
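
      The shape of such an early check can be sketched as below. This is
      an illustrative model only: struct dev_model and model_enslave() are
      hypothetical, and returning -EBUSY is an assumption of the sketch
      rather than something stated in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified device; not the kernel's struct net_device. */
struct dev_model {
	void *rx_handler;  /* non-NULL once an upper device has claimed it */
	int setup_steps;   /* counts enslave work already performed */
};

/* Enslave with the busy check done first, before any other work. */
int model_enslave(struct dev_model *slave, void *bond_handler)
{
	if (slave->rx_handler != NULL)
		return -EBUSY;     /* e.g. a macvlan already owns the device */

	slave->setup_steps++;      /* stands in for the preparatory work */
	slave->rx_handler = bond_handler;
	return 0;
}

void enslave_demo(void)
{
	int macvlan_cookie, bond_cookie;
	struct dev_model busy = { .rx_handler = &macvlan_cookie, .setup_steps = 0 };
	struct dev_model idle = { .rx_handler = NULL, .setup_steps = 0 };

	/* A busy device is rejected before any setup has happened. */
	assert(model_enslave(&busy, &bond_cookie) == -EBUSY);
	assert(busy.setup_steps == 0);

	/* A free device is enslaved normally. */
	assert(model_enslave(&idle, &bond_cookie) == 0);
	assert(idle.rx_handler == &bond_cookie);
}
```

      Because the check runs before any preparatory work, no notifier can
      observe a half-enslaved device whose rx_handler belongs to someone
      else.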
      Signed-off-by: Mahesh Bandewar <maheshb@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f357a798