1. 15 Feb, 2016 18 commits
  2. 10 Feb, 2016 22 commits
    • Nicholas Bellinger's avatar
      iscsi-target: Fix potential dead-lock during node acl delete · 2a0c1fd1
      Nicholas Bellinger authored
      [ Upstream commit 26a99c19 ]
      
      This patch is a iscsi-target specific bug-fix for a dead-lock
      that can occur during explicit struct se_node_acl->acl_group
      se_session deletion via configfs rmdir(2), when iscsi-target
      time2retain timer is still active.
      
      It changes iscsi-target to obtain se_portal_group->session_lock
      internally using spin_in_locked() to check for the specific
      se_node_acl configfs shutdown rmdir(2) case.
      
      Note this patch is intended for stable, and the subsequent
      v4.5-rc patch converts target_core_tpg.c to use proper
      se_sess->sess_kref reference counting for both se_node_acl
      deletion + se_node_acl->queue_depth se_session restart.
      Reported-by: default avatar: Sagi Grimberg <sagig@mellanox.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Andy Grover <agrover@redhat.com>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: stable@vger.kernel.org # 3.10+
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      2a0c1fd1
    • Josh Boyer's avatar
      ideapad-laptop: Add Lenovo ideapad Y700-17ISK to no_hw_rfkill dmi list · 0d7b1b83
      Josh Boyer authored
      [ Upstream commit edde316a ]
      
      One of the newest ideapad models also lacks a physical hw rfkill switch,
      and trying to read the hw rfkill switch through the ideapad module
      causes it to always reported blocking breaking wifi.
      
      Fix it by adding this model to the DMI list.
      
      BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1286293
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJosh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: default avatarDarren Hart <dvhart@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      0d7b1b83
    • Vinit Agnihotri's avatar
      IB/qib: Support creating qps with GFP_NOIO flag · 15ff220a
      Vinit Agnihotri authored
      [ Upstream commit fbbeb863 ]
      
      The current code is problematic when the QP creation and ipoib is used to
      support NFS and NFS desires to do IO for paging purposes. In that case, the
      GFP_KERNEL allocation in qib_qp.c causes a deadlock in tight memory
      situations.
      
      This fix adds support to create queue pair with GFP_NOIO flag for connected
      mode only to cleanly fail the create queue pair in those situations.
      
      Cc: <stable@vger.kernel.org> # 3.16+
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarVinit Agnihotri <vinit.abhay.agnihotri@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      15ff220a
    • Mike Marciniszyn's avatar
      IB/qib: fix mcast detach when qp not attached · 383de379
      Mike Marciniszyn authored
      [ Upstream commit 09dc9cd6 ]
      
      The code produces the following trace:
      
      [1750924.419007] general protection fault: 0000 [#3] SMP
      [1750924.420364] Modules linked in: nfnetlink autofs4 rpcsec_gss_krb5 nfsv4
      dcdbas rfcomm bnep bluetooth nfsd auth_rpcgss nfs_acl dm_multipath nfs lockd
      scsi_dh sunrpc fscache radeon ttm drm_kms_helper drm serio_raw parport_pc
      ppdev i2c_algo_bit lpc_ich ipmi_si ib_mthca ib_qib dca lp parport ib_ipoib
      mac_hid ib_cm i3000_edac ib_sa ib_uverbs edac_core ib_umad ib_mad ib_core
      ib_addr tg3 ptp dm_mirror dm_region_hash dm_log psmouse pps_core
      [1750924.420364] CPU: 1 PID: 8401 Comm: python Tainted: G D
      3.13.0-39-generic #66-Ubuntu
      [1750924.420364] Hardware name: Dell Computer Corporation PowerEdge
      860/0XM089, BIOS A04 07/24/2007
      [1750924.420364] task: ffff8800366a9800 ti: ffff88007af1c000 task.ti:
      ffff88007af1c000
      [1750924.420364] RIP: 0010:[<ffffffffa0131d51>] [<ffffffffa0131d51>]
      qib_mcast_qp_free+0x11/0x50 [ib_qib]
      [1750924.420364] RSP: 0018:ffff88007af1dd70  EFLAGS: 00010246
      [1750924.420364] RAX: 0000000000000001 RBX: ffff88007b822688 RCX:
      000000000000000f
      [1750924.420364] RDX: ffff88007b822688 RSI: ffff8800366c15a0 RDI:
      6764697200000000
      [1750924.420364] RBP: ffff88007af1dd78 R08: 0000000000000001 R09:
      0000000000000000
      [1750924.420364] R10: 0000000000000011 R11: 0000000000000246 R12:
      ffff88007baa1d98
      [1750924.420364] R13: ffff88003ecab000 R14: ffff88007b822660 R15:
      0000000000000000
      [1750924.420364] FS:  00007ffff7fd8740(0000) GS:ffff88007fc80000(0000)
      knlGS:0000000000000000
      [1750924.420364] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [1750924.420364] CR2: 00007ffff597c750 CR3: 000000006860b000 CR4:
      00000000000007e0
      [1750924.420364] Stack:
      [1750924.420364]  ffff88007b822688 ffff88007af1ddf0 ffffffffa0132429
      000000007af1de20
      [1750924.420364]  ffff88007baa1dc8 ffff88007baa0000 ffff88007af1de70
      ffffffffa00cb313
      [1750924.420364]  00007fffffffde88 0000000000000000 0000000000000008
      ffff88003ecab000
      [1750924.420364] Call Trace:
      [1750924.420364]  [<ffffffffa0132429>] qib_multicast_detach+0x1e9/0x350
      [ib_qib]
      [1750924.568035]  [<ffffffffa00cb313>] ? ib_uverbs_modify_qp+0x323/0x3d0
      [ib_uverbs]
      [1750924.568035]  [<ffffffffa0092d61>] ib_detach_mcast+0x31/0x50 [ib_core]
      [1750924.568035]  [<ffffffffa00cc213>] ib_uverbs_detach_mcast+0x93/0x170
      [ib_uverbs]
      [1750924.568035]  [<ffffffffa00c61f6>] ib_uverbs_write+0xc6/0x2c0 [ib_uverbs]
      [1750924.568035]  [<ffffffff81312e68>] ? apparmor_file_permission+0x18/0x20
      [1750924.568035]  [<ffffffff812d4cd3>] ? security_file_permission+0x23/0xa0
      [1750924.568035]  [<ffffffff811bd214>] vfs_write+0xb4/0x1f0
      [1750924.568035]  [<ffffffff811bdc49>] SyS_write+0x49/0xa0
      [1750924.568035]  [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f
      [1750924.568035] Code: 66 2e 0f 1f 84 00 00 00 00 00 31 c0 5d c3 66 2e 0f 1f
      84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 8b 7f 10
      <f0> ff 8f 40 01 00 00 74 0e 48 89 df e8 8e f8 06 e1 5b 5d c3 0f
      [1750924.568035] RIP  [<ffffffffa0131d51>] qib_mcast_qp_free+0x11/0x50
      [ib_qib]
      [1750924.568035]  RSP <ffff88007af1dd70>
      [1750924.650439] ---[ end trace 73d5d4b3f8ad4851 ]
      
      The fix is to note the qib_mcast_qp that was found.   If none is found, then
      return EINVAL indicating the error.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Reported-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      383de379
    • Jean Delvare's avatar
      crypto: crc32c - Fix crc32c soft dependency · 9b826648
      Jean Delvare authored
      [ Upstream commit fd7f6727 ]
      
      I don't think it makes sense for a module to have a soft dependency
      on itself. This seems quite cyclic by nature and I can't see what
      purpose it could serve.
      
      OTOH libcrc32c calls crypto_alloc_shash("crc32c", 0, 0) so it pretty
      much assumes that some incarnation of the "crc32c" hash algorithm has
      been loaded. Therefore it makes sense to have the soft dependency
      there (as crc-t10dif does.)
      
      Cc: stable@vger.kernel.org
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: default avatarJean Delvare <jdelvare@suse.de>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      9b826648
    • Herbert Xu's avatar
      crypto: algif_hash - Fix race condition in hash_check_key · 2b12ef1b
      Herbert Xu authored
      [ Upstream commit ad46d7e3 ]
      
      We need to lock the child socket in hash_check_key as otherwise
      two simultaneous calls can cause the parent socket to be freed.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      2b12ef1b
    • Herbert Xu's avatar
      crypto: af_alg - Forbid bind(2) when nokey child sockets are present · bcfa8411
      Herbert Xu authored
      [ Upstream commit a6a48c56 ]
      
      This patch forbids the calling of bind(2) when there are child
      sockets created by accept(2) in existence, even if they are created
      on the nokey path.
      
      This is needed as those child sockets have references to the tfm
      object which bind(2) will destroy.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      bcfa8411
    • Herbert Xu's avatar
      crypto: algif_hash - Remove custom release parent function · 96507c59
      Herbert Xu authored
      [ Upstream commit f1d84af1 ]
      
      This patch removes the custom release parent function as the
      generic af_alg_release_parent now works for nokey sockets too.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      96507c59
    • Herbert Xu's avatar
      crypto: af_alg - Allow af_af_alg_release_parent to be called on nokey path · eb1ab004
      Herbert Xu authored
      [ Upstream commit 6a935170 ]
      
      This patch allows af_alg_release_parent to be called even for
      nokey sockets.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      eb1ab004
    • Herbert Xu's avatar
      crypto: algif_hash - Require setkey before accept(2) · cec8983e
      Herbert Xu authored
      [ Upstream commit 6de62f15 ]
      
      Hash implementations that require a key may crash if you use
      them without setting a key.  This patch adds the necessary checks
      so that if you do attempt to use them without a key that we return
      -ENOKEY instead of proceeding.
      
      This patch also adds a compatibility path to support old applications
      that do acept(2) before setkey.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      cec8983e
    • Herbert Xu's avatar
      crypto: hash - Add crypto_ahash_has_setkey · 8c781796
      Herbert Xu authored
      [ Upstream commit a5596d63 ]
      
      This patch adds a way for ahash users to determine whether a key
      is required by a crypto_ahash transform.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      8c781796
    • Alexander Aring's avatar
      mac802154: fix typo IEEE802515 to IEEE802154 · cfc8c439
      Alexander Aring authored
      [ Upstream commit 57205c14 ]
      
      This patch fixs a typo in address filter defines from IEEE802515 to
      IEEE802154.
      Signed-off-by: default avatarAlexander Aring <alex.aring@gmail.com>
      Cc: Alan Ott <alan@signal11.us>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      cfc8c439
    • Herbert Xu's avatar
      crypto: af_alg - Add nokey compatibility path · 3db68eba
      Herbert Xu authored
      [ Upstream commit 37766586 ]
      
      This patch adds a compatibility path to support old applications
      that do acept(2) before setkey.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      3db68eba
    • Herbert Xu's avatar
      crypto: af_alg - Fix socket double-free when accept fails · ac9c75f8
      Herbert Xu authored
      [ Upstream commit a383292c ]
      
      When we fail an accept(2) call we will end up freeing the socket
      twice, once due to the direct sk_free call and once again through
      newsock.
      
      This patch fixes this by removing the sk_free call.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      ac9c75f8
    • Herbert Xu's avatar
      crypto: af_alg - Disallow bind/setkey/... after accept(2) · f857638d
      Herbert Xu authored
      [ Upstream commit c840ac6a ]
      
      Each af_alg parent socket obtained by socket(2) corresponds to a
      tfm object once bind(2) has succeeded.  An accept(2) call on that
      parent socket creates a context which then uses the tfm object.
      
      Therefore as long as any child sockets created by accept(2) exist
      the parent socket must not be modified or freed.
      
      This patch guarantees this by using locks and a reference count
      on the parent socket.  Any attempt to modify the parent socket will
      fail with EBUSY.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      f857638d
    • Tejun Heo's avatar
      printk: do cond_resched() between lines while outputting to consoles · 9044efa9
      Tejun Heo authored
      [ Upstream commit 8d91f8b1 ]
      
      @console_may_schedule tracks whether console_sem was acquired through
      lock or trylock.  If the former, we're inside a sleepable context and
      console_conditional_schedule() performs cond_resched().  This allows
      console drivers which use console_lock for synchronization to yield
      while performing time-consuming operations such as scrolling.
      
      However, the actual console outputting is performed while holding
      irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule
      before starting outputting lines.  Also, only a few drivers call
      console_conditional_schedule() to begin with.  This means that when a
      lot of lines need to be output by console_unlock(), for example on a
      console registration, the task doing console_unlock() may not yield for
      a long time on a non-preemptible kernel.
      
      If this happens with a slow console devices, for example a serial
      console, the outputting task may occupy the cpu for a very long time.
      Long enough to trigger softlockup and/or RCU stall warnings, which in
      turn pile more messages, sometimes enough to trigger the next cycle of
      warnings incapacitating the system.
      
      Fix it by making console_unlock() insert cond_resched() between lines if
      @console_may_schedule.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarCalvin Owens <calvinowens@fb.com>
      Acked-by: default avatarJan Kara <jack@suse.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Kyle McMartin <kyle@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      9044efa9
    • Vitaly Kuznetsov's avatar
      kernel/panic.c: turn off locks debug before releasing console lock · 6db3dd4b
      Vitaly Kuznetsov authored
      [ Upstream commit 7625b3a0 ]
      
      Commit 08d78658 ("panic: release stale console lock to always get the
      logbuf printed out") introduced an unwanted bad unlock balance report when
      panic() is called directly and not from OOPS (e.g.  from out_of_memory()).
      The difference is that in case of OOPS we disable locks debug in
      oops_enter() and on direct panic call nobody does that.
      
      Fixes: 08d78658 ("panic: release stale console lock to always get the logbuf printed out")
      Reported-by: default avatarkernel test robot <ying.huang@linux.intel.com>
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Petr Mladek <pmladek@suse.cz>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      6db3dd4b
    • Vitaly Kuznetsov's avatar
      panic: release stale console lock to always get the logbuf printed out · e3191b65
      Vitaly Kuznetsov authored
      [ Upstream commit 08d78658 ]
      
      In some cases we may end up killing the CPU holding the console lock
      while still having valuable data in logbuf. E.g. I'm observing the
      following:
      
      - A crash is happening on one CPU and console_unlock() is being called on
        some other.
      
      - console_unlock() tries to print out the buffer before releasing the lock
        and on slow console it takes time.
      
      - in the meanwhile crashing CPU does lots of printk()-s with valuable data
        (which go to the logbuf) and sends IPIs to all other CPUs.
      
      - console_unlock() finishes printing previous chunk and enables interrupts
        before trying to print out the rest, the CPU catches the IPI and never
        releases console lock.
      
      This is not the only possible case: in VT/fb subsystems we have many other
      console_lock()/console_unlock() users.  Non-masked interrupts (or
      receiving NMI in case of extreme slowness) will have the same result.
      Getting the whole console buffer printed out on crash should be top
      priority.
      
      [akpm@linux-foundation.org: tweak comment text]
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      e3191b65
    • Martijn Coenen's avatar
      memcg: only free spare array when readers are done · ab0d430c
      Martijn Coenen authored
      [ Upstream commit 6611d8d7 ]
      
      A spare array holding mem cgroup threshold events is kept around to make
      sure we can always safely deregister an event and have an array to store
      the new set of events in.
      
      In the scenario where we're going from 1 to 0 registered events, the
      pointer to the primary array containing 1 event is copied to the spare
      slot, and then the spare slot is freed because no events are left.
      However, it is freed before calling synchronize_rcu(), which means
      readers may still be accessing threshold->primary after it is freed.
      
      Fixed by only freeing after synchronize_rcu().
      Signed-off-by: default avatarMartijn Coenen <maco@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      ab0d430c
    • Naoya Horiguchi's avatar
      mm: soft-offline: check return value in second __get_any_page() call · a30cd72c
      Naoya Horiguchi authored
      [ Upstream commit d96b339f ]
      
      I saw the following BUG_ON triggered in a testcase where a process calls
      madvise(MADV_SOFT_OFFLINE) on thps, along with a background process that
      calls migratepages command repeatedly (doing ping-pong among different
      NUMA nodes) for the first process:
      
         Soft offlining page 0x60000 at 0x700000600000
         __get_any_page: 0x60000 free buddy page
         page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
         flags: 0x1fffc0000000000()
         page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
         ------------[ cut here ]------------
         kernel BUG at /src/linux-dev/include/linux/mm.h:342!
         invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
         Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
         CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
         Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
         task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
         RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
         RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
         Call Trace:
           put_hwpoison_page+0x4e/0x80
           soft_offline_page+0x501/0x520
           SyS_madvise+0x6bc/0x6f0
           entry_SYSCALL_64_fastpath+0x12/0x6a
         Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
         RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
          RSP <ffff88007c213e00>
      
      The root cause resides in get_any_page() which retries to get a refcount
      of the page to be soft-offlined.  This function calls
      put_hwpoison_page(), expecting that the target page is putback to LRU
      list.  But it can be also freed to buddy.  So the second check need to
      care about such case.
      
      Fixes: af8fae7c ("mm/memory-failure.c: clean up soft_offline_page()")
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[3.9+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      a30cd72c
    • Kyeongdon Kim's avatar
      zram: try vmalloc() after kmalloc() · ce516bcf
      Kyeongdon Kim authored
      [ Upstream commit d913897a ]
      
      When we're using LZ4 multi compression streams for zram swap, we found
      out page allocation failure message in system running test.  That was
      not only once, but a few(2 - 5 times per test).  Also, some failure
      cases were continually occurring to try allocation order 3.
      
      In order to make parallel compression private data, we should call
      kzalloc() with order 2/3 in runtime(lzo/lz4).  But if there is no order
      2/3 size memory to allocate in that time, page allocation fails.  This
      patch makes to use vmalloc() as fallback of kmalloc(), this prevents
      page alloc failure warning.
      
      After using this, we never found warning message in running test, also
      It could reduce process startup latency about 60-120ms in each case.
      
      For reference a call trace :
      
          Binder_1: page allocation failure: order:3, mode:0x10c0d0
          CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20
          Call trace:
            dump_backtrace+0x0/0x270
            show_stack+0x10/0x1c
            dump_stack+0x1c/0x28
            warn_alloc_failed+0xfc/0x11c
            __alloc_pages_nodemask+0x724/0x7f0
            __get_free_pages+0x14/0x5c
            kmalloc_order_trace+0x38/0xd8
            zcomp_lz4_create+0x2c/0x38
            zcomp_strm_alloc+0x34/0x78
            zcomp_strm_multi_find+0x124/0x1ec
            zcomp_strm_find+0xc/0x18
            zram_bvec_rw+0x2fc/0x780
            zram_make_request+0x25c/0x2d4
            generic_make_request+0x80/0xbc
            submit_bio+0xa4/0x15c
            __swap_writepage+0x218/0x230
            swap_writepage+0x3c/0x4c
            shrink_page_list+0x51c/0x8d0
            shrink_inactive_list+0x3f8/0x60c
            shrink_lruvec+0x33c/0x4cc
            shrink_zone+0x3c/0x100
            try_to_free_pages+0x2b8/0x54c
            __alloc_pages_nodemask+0x514/0x7f0
            __get_free_pages+0x14/0x5c
            proc_info_read+0x50/0xe4
            vfs_read+0xa0/0x12c
            SyS_read+0x44/0x74
          DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB
               0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB
      
      [minchan@kernel.org: change vmalloc gfp and adding comment about gfp]
      [sergey.senozhatsky@gmail.com: tweak comments and styles]
      Signed-off-by: default avatarKyeongdon Kim <kyeongdon.kim@lge.com>
      Signed-off-by: default avatarMinchan Kim <minchan@kernel.org>
      Acked-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      ce516bcf
    • Sergey Senozhatsky's avatar
      zram/zcomp: use GFP_NOIO to allocate streams · 40942069
      Sergey Senozhatsky authored
      [ Upstream commit 3d5fe03a ]
      
      We can end up allocating a new compression stream with GFP_KERNEL from
      within the IO path, which may result is nested (recursive) IO
      operations.  That can introduce problems if the IO path in question is a
      reclaimer, holding some locks that will deadlock nested IOs.
      
      Allocate streams and working memory using GFP_NOIO flag, forbidding
      recursive IO and FS operations.
      
      An example:
      
        inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
        git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes:
         (jbd2_handle){+.+.?.}, at:  start_this_handle+0x4ca/0x555
        {IN-RECLAIM_FS-W} state was registered at:
           __lock_acquire+0x8da/0x117b
           lock_acquire+0x10c/0x1a7
           start_this_handle+0x52d/0x555
           jbd2__journal_start+0xb4/0x237
           __ext4_journal_start_sb+0x108/0x17e
           ext4_dirty_inode+0x32/0x61
           __mark_inode_dirty+0x16b/0x60c
           iput+0x11e/0x274
           __dentry_kill+0x148/0x1b8
           shrink_dentry_list+0x274/0x44a
           prune_dcache_sb+0x4a/0x55
           super_cache_scan+0xfc/0x176
           shrink_slab.part.14.constprop.25+0x2a2/0x4d3
           shrink_zone+0x74/0x140
           kswapd+0x6b7/0x930
           kthread+0x107/0x10f
           ret_from_fork+0x3f/0x70
        irq event stamp: 138297
        hardirqs last  enabled at (138297):  debug_check_no_locks_freed+0x113/0x12f
        hardirqs last disabled at (138296):  debug_check_no_locks_freed+0x33/0x12f
        softirqs last  enabled at (137818):  __do_softirq+0x2d3/0x3e9
        softirqs last disabled at (137813):  irq_exit+0x41/0x95
      
                     other info that might help us debug this:
         Possible unsafe locking scenario:
               CPU0
               ----
          lock(jbd2_handle);
          <Interrupt>
            lock(jbd2_handle);
      
                      *** DEADLOCK ***
        5 locks held by git/20158:
         #0:  (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b
         #1:  (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3
         #2:  (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b
         #3:  (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b
         #4:  (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555
      
                     stack backtrace:
        CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211
        Call Trace:
          dump_stack+0x4c/0x6e
          mark_lock+0x384/0x56d
          mark_held_locks+0x5f/0x76
          lockdep_trace_alloc+0xb2/0xb5
          kmem_cache_alloc_trace+0x32/0x1e2
          zcomp_strm_alloc+0x25/0x73 [zram]
          zcomp_strm_multi_find+0xe7/0x173 [zram]
          zcomp_strm_find+0xc/0xe [zram]
          zram_bvec_rw+0x2ca/0x7e0 [zram]
          zram_make_request+0x1fa/0x301 [zram]
          generic_make_request+0x9c/0xdb
          submit_bio+0xf7/0x120
          ext4_io_submit+0x2e/0x43
          ext4_bio_write_page+0x1b7/0x300
          mpage_submit_page+0x60/0x77
          mpage_map_and_submit_buffers+0x10f/0x21d
          ext4_writepages+0xc8c/0xe1b
          do_writepages+0x23/0x2c
          __filemap_fdatawrite_range+0x84/0x8b
          filemap_flush+0x1c/0x1e
          ext4_alloc_da_blocks+0xb8/0x117
          ext4_rename+0x132/0x6dc
          ? mark_held_locks+0x5f/0x76
          ext4_rename2+0x29/0x2b
          vfs_rename+0x540/0x636
          SyS_renameat2+0x359/0x44d
          SyS_rename+0x1e/0x20
          entry_SYSCALL_64_fastpath+0x12/0x6f
      
      [minchan@kernel.org: add stable mark]
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: default avatarMinchan Kim <minchan@kernel.org>
      Cc: Kyeongdon Kim <kyeongdon.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      40942069