1. 06 Mar, 2015 40 commits
    • smack: fix possible use after frees in task_security() callers · 13f431fa
      Andrey Ryabinin authored
      commit 6d1cff2a upstream.
      
      We hit a use-after-free when dereferencing the pointer to the task_smack
      struct in smk_of_task(), called from smack_task_to_inode().
      
      The task_security() macro uses task_cred_xxx() to get a pointer to the
      task_smack. task_cred_xxx() may be used only for non-pointer members of a
      task's credentials. It cannot be used for pointer members, since what they
      point to may disappear after the RCU read lock is dropped.
      
      task_security() was mainly used this way:
      	smk_of_task(task_security(p))
      
      Instead of this, introduce the function smk_of_task_struct(), which
      takes a task_struct as its argument and returns a pointer to the smk_known
      struct, doing the lookup under the RCU read lock.
      The bogus task_security() macro is no longer used, so remove it.
      
      KASan's report for this:
      
      	AddressSanitizer: use after free in smack_task_to_inode+0x50/0x70 at addr c4635600
      	=============================================================================
      	BUG kmalloc-64 (Tainted: PO): kasan error
      	-----------------------------------------------------------------------------
      
      	Disabling lock debugging due to kernel taint
      	INFO: Allocated in new_task_smack+0x44/0xd8 age=39 cpu=0 pid=1866
      		kmem_cache_alloc_trace+0x88/0x1bc
      		new_task_smack+0x44/0xd8
      		smack_cred_prepare+0x48/0x21c
      		security_prepare_creds+0x44/0x4c
      		prepare_creds+0xdc/0x110
      		smack_setprocattr+0x104/0x150
      		security_setprocattr+0x4c/0x54
      		proc_pid_attr_write+0x12c/0x194
      		vfs_write+0x1b0/0x370
      		SyS_write+0x5c/0x94
      		ret_fast_syscall+0x0/0x48
      	INFO: Freed in smack_cred_free+0xc4/0xd0 age=27 cpu=0 pid=1564
      		kfree+0x270/0x290
      		smack_cred_free+0xc4/0xd0
      		security_cred_free+0x34/0x3c
      		put_cred_rcu+0x58/0xcc
      		rcu_process_callbacks+0x738/0x998
      		__do_softirq+0x264/0x4cc
      		do_softirq+0x94/0xf4
      		irq_exit+0xbc/0x120
      		handle_IRQ+0x104/0x134
      		gic_handle_irq+0x70/0xac
      		__irq_svc+0x44/0x78
      		_raw_spin_unlock+0x18/0x48
      		sync_inodes_sb+0x17c/0x1d8
      		sync_filesystem+0xac/0xfc
      		vdfs_file_fsync+0x90/0xc0
      		vfs_fsync_range+0x74/0x7c
      	INFO: Slab 0xd3b23f50 objects=32 used=31 fp=0xc4635600 flags=0x4080
      	INFO: Object 0xc4635600 @offset=5632 fp=0x  (null)
      
      	Bytes b4 c46355f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
      	Object c4635600: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      	Object c4635610: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      	Object c4635620: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      	Object c4635630: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
      	Redzone c4635640: bb bb bb bb                                      ....
      	Padding c46356e8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
      	Padding c46356f8: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
      	CPU: 5 PID: 834 Comm: launchpad_prelo Tainted: PBO 3.10.30 #1
      	Backtrace:
      	[<c00233a4>] (dump_backtrace+0x0/0x158) from [<c0023dec>] (show_stack+0x20/0x24)
      	 r7:c4634010 r6:d3b23f50 r5:c4635600 r4:d1002140
      	[<c0023dcc>] (show_stack+0x0/0x24) from [<c06d6d7c>] (dump_stack+0x20/0x28)
      	[<c06d6d5c>] (dump_stack+0x0/0x28) from [<c01c1d50>] (print_trailer+0x124/0x144)
      	[<c01c1c2c>] (print_trailer+0x0/0x144) from [<c01c1e88>] (object_err+0x3c/0x44)
      	 r7:c4635600 r6:d1002140 r5:d3b23f50 r4:c4635600
      	[<c01c1e4c>] (object_err+0x0/0x44) from [<c01cac18>] (kasan_report_error+0x2b8/0x538)
      	 r6:d1002140 r5:d3b23f50 r4:c6429cf8 r3:c09e1aa7
      	[<c01ca960>] (kasan_report_error+0x0/0x538) from [<c01c9430>] (__asan_load4+0xd4/0xf8)
      	[<c01c935c>] (__asan_load4+0x0/0xf8) from [<c031e168>] (smack_task_to_inode+0x50/0x70)
      	 r5:c4635600 r4:ca9da000
      	[<c031e118>] (smack_task_to_inode+0x0/0x70) from [<c031af64>] (security_task_to_inode+0x3c/0x44)
      	 r5:cca25e80 r4:c0ba9780
      	[<c031af28>] (security_task_to_inode+0x0/0x44) from [<c023d614>] (pid_revalidate+0x124/0x178)
      	 r6:00000000 r5:cca25e80 r4:cbabe3c0 r3:00008124
      	[<c023d4f0>] (pid_revalidate+0x0/0x178) from [<c01db98c>] (lookup_fast+0x35c/0x434)
      	 r9:c6429efc r8:00000101 r7:c079d940 r6:c6429e90 r5:c6429ed8 r4:c83c4148
      	[<c01db630>] (lookup_fast+0x0/0x434) from [<c01deec8>] (do_last.isra.24+0x1c0/0x1108)
      	[<c01ded08>] (do_last.isra.24+0x0/0x1108) from [<c01dff04>] (path_openat.isra.25+0xf4/0x648)
      	[<c01dfe10>] (path_openat.isra.25+0x0/0x648) from [<c01e1458>] (do_filp_open+0x3c/0x88)
      	[<c01e141c>] (do_filp_open+0x0/0x88) from [<c01ccb28>] (do_sys_open+0xf0/0x198)
      	 r7:00000001 r6:c0ea2180 r5:0000000b r4:00000000
      	[<c01cca38>] (do_sys_open+0x0/0x198) from [<c01ccc00>] (SyS_open+0x30/0x34)
      	[<c01ccbd0>] (SyS_open+0x0/0x34) from [<c001db80>] (ret_fast_syscall+0x0/0x48)
      	Read of size 4 by thread T834:
      	Memory state around the buggy address:
      	 c4635380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      	 c4635400: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
      	 c4635480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      	 c4635500: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
      	 c4635580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      	>c4635600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      	           ^
      	 c4635680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      	 c4635700: 00 00 00 00 04 fc fc fc fc fc fc fc fc fc fc fc
      	 c4635780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      	 c4635800: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc
      	 c4635880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      	==================================================================
      Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ring-buffer: Do not wake up a splice waiter when page is not full · 29a44468
      Steven Rostedt (Red Hat) authored
      commit 1e0d6714 upstream.
      
      When an application connects to the ring buffer via splice, it can only
      read full pages. Splice does not work with partial pages. If there is
      not enough data to fill a page, the splice command will either block
      or return -EAGAIN (if set to nonblock).
      
      Code was added to simply sleep again if the page is not full.
      The problem is, it will get woken up again on the next event. That
      is, when something is written into the ring buffer, if there is a waiter
      it will wake it up. The waiter would then check the buffer, see that
      it still does not have enough data to fill a page and go back to sleep.
      To make matters worse, when the waiter goes back to sleep, it could
      cause another event, which would wake it back up again to see it
      doesn't have enough data and sleep again. This produces a tremendous
      overhead and fills the ring buffer with noise.
      
      For example, recording sched_switch on an idle system for 10 seconds
      produces 25,350,475 events!!!
      
      Create another wait queue for those waiters wanting full pages.
      When an event is written, it only wakes up waiters if there's a full
      page of data. It does not wake up the waiter if the page is not yet
      full.
      
      After this change, recording sched_switch on an idle system for 10
      seconds produces only 800 events. Getting rid of 25,349,675 useless
      events (99.9969% of events!!), is something to take seriously.
      
      Cc: Rabin Vincent <rabin@rab.in>
      Fixes: e30f53aa "tracing: Do not busy wait in buffer splice"
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
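The difference between the two wakeup policies in the ring-buffer commit above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `PAGE_BYTES`, `enum wake_policy` and `count_wakeups()` are hypothetical names, and the model ignores the extra events that each wakeup itself generates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page size for the model, not the real ring buffer page size. */
#define PAGE_BYTES 4096u

enum wake_policy {
	WAKE_ON_EVERY_EVENT,	/* old behaviour: any write wakes the waiter */
	WAKE_ON_FULL_PAGE	/* new behaviour: wake only once a page has filled */
};

/*
 * Count how often a splice-style waiter is woken while n_events events of
 * event_bytes each are written. A woken waiter consumes whole pages only.
 */
unsigned long count_wakeups(enum wake_policy policy,
			    unsigned long n_events, size_t event_bytes)
{
	unsigned long wakeups = 0;
	size_t pending = 0;	/* bytes written but not yet consumed */
	unsigned long i;

	/* Events larger than a page are outside this model. */
	assert(event_bytes > 0 && event_bytes <= PAGE_BYTES);

	for (i = 0; i < n_events; i++) {
		pending += event_bytes;
		if (policy == WAKE_ON_EVERY_EVENT || pending >= PAGE_BYTES) {
			wakeups++;
			pending %= PAGE_BYTES;	/* partial page stays behind */
		}
	}
	return wakeups;
}
```

With 64-byte events, 1000 writes wake the waiter 1000 times under the old policy but only once per filled page under the new one.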
    • cipso: don't use IPCB() to locate the CIPSO IP option · dd8ef93c
      Paul Moore authored
      commit 04f81f01 upstream.
      
      Using the IPCB() macro to get the IPv4 options is convenient, but
      unfortunately NetLabel often needs to examine the CIPSO option outside
      of the scope of the IP layer in the stack.  While historically IPCB()
      worked above the IP layer, due to the inclusion of the inet_skb_parm
      struct at the head of the {tcp,udp}_skb_cb structs, recent commit
      971f10ec ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
      reordered the tcp_skb_cb struct and invalidated this IPCB() trick.
      
      This patch fixes the problem by creating a new function,
      cipso_v4_optptr(), which locates the CIPSO option inside the IP header
      without calling IPCB().  Unfortunately, this isn't as fast as a simple
      lookup so some additional tweaks were made to limit the use of this
      new function.
      Reported-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Tested-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
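Locating an option by walking the IP header, as the cipso commit above describes, might look roughly like the following user-space sketch. It is not the kernel's cipso_v4_optptr(): the function name `find_cipso_option()`, the `*_MODEL` constants and the raw byte-buffer interface are assumptions made for illustration. It relies only on the standard IPv4 option layout (type byte, then a length byte for everything except the single-byte END and NOOP options) and on CIPSO being option type 134.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPOPT_END_MODEL    0	/* end of option list */
#define IPOPT_NOOP_MODEL   1	/* single-byte padding option */
#define IPOPT_CIPSO_MODEL  134	/* CIPSO option type */

/*
 * Return a pointer to the CIPSO option inside a raw IPv4 header, or NULL.
 * hdr points at the first header byte; hdr_len is the number of valid bytes.
 */
const uint8_t *find_cipso_option(const uint8_t *hdr, size_t hdr_len)
{
	size_t ihl_bytes, off;

	assert(hdr != NULL);
	if (hdr_len < 20)
		return NULL;
	ihl_bytes = (size_t)(hdr[0] & 0x0f) * 4;	/* IHL counts 32-bit words */
	if (ihl_bytes < 20 || ihl_bytes > hdr_len)
		return NULL;

	for (off = 20; off < ihl_bytes; ) {
		uint8_t type = hdr[off];
		uint8_t len;

		if (type == IPOPT_END_MODEL)
			break;
		if (type == IPOPT_NOOP_MODEL) {
			off++;
			continue;
		}
		if (off + 1 >= ihl_bytes)
			break;		/* no room for a length byte */
		len = hdr[off + 1];
		if (len < 2 || off + len > ihl_bytes)
			break;		/* malformed option */
		if (type == IPOPT_CIPSO_MODEL)
			return hdr + off;
		off += len;
	}
	return NULL;
}
```

A linear scan like this costs more than a cached offset, which is consistent with the commit's remark that use of the new function was kept limited.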
    • cfq-iosched: fix incorrect filing of rt async cfqq · 416f74c6
      Jeff Moyer authored
      commit c6ce1943 upstream.
      
      Hi,
      
      If you can manage to submit an async write as the first async I/O from
      the context of a process with realtime scheduling priority, then a
      cfq_queue is allocated, but filed into the wrong async_cfqq bucket.  It
      ends up in the best effort array, but actually has realtime I/O
      scheduling priority set in cfqq->ioprio.
      
      The reason is that cfq_get_queue assumes the default scheduling class and
      priority when there is no information present (i.e. when the async cfqq
      is created):
      
      static struct cfq_queue *
      cfq_get_queue(struct cfq_data *cfqd, bool is_sync, struct cfq_io_cq *cic,
      	      struct bio *bio, gfp_t gfp_mask)
      {
      	const int ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
      	const int ioprio = IOPRIO_PRIO_DATA(cic->ioprio);
      
      cic->ioprio starts out as 0, which is "invalid".  So, class of 0
      (IOPRIO_CLASS_NONE) is passed to cfq_async_queue_prio like so:
      
      		async_cfqq = cfq_async_queue_prio(cfqd, ioprio_class, ioprio);
      
      static struct cfq_queue **
      cfq_async_queue_prio(struct cfq_data *cfqd, int ioprio_class, int ioprio)
      {
              switch (ioprio_class) {
              case IOPRIO_CLASS_RT:
                      return &cfqd->async_cfqq[0][ioprio];
              case IOPRIO_CLASS_NONE:
                      ioprio = IOPRIO_NORM;
                      /* fall through */
              case IOPRIO_CLASS_BE:
                      return &cfqd->async_cfqq[1][ioprio];
              case IOPRIO_CLASS_IDLE:
                      return &cfqd->async_idle_cfqq;
              default:
                      BUG();
              }
      }
      
      Here, instead of returning a class mapped from the process' scheduling
      priority, we get back the bucket associated with IOPRIO_CLASS_BE.
      
      Now, there is no queue allocated there yet, so we create it:
      
      		cfqq = cfq_find_alloc_queue(cfqd, is_sync, cic, bio, gfp_mask);
      
      That function ends up doing this:
      
      			cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
      			cfq_init_prio_data(cfqq, cic);
      
      cfq_init_cfqq marks the priority as having changed.  Then,
      cfq_init_prio_data does this:
      
      	ioprio_class = IOPRIO_PRIO_CLASS(cic->ioprio);
      	switch (ioprio_class) {
      	default:
      		printk(KERN_ERR "cfq: bad prio %x\n", ioprio_class);
      	case IOPRIO_CLASS_NONE:
      		/*
      		 * no prio set, inherit CPU scheduling settings
      		 */
      		cfqq->ioprio = task_nice_ioprio(tsk);
      		cfqq->ioprio_class = task_nice_ioclass(tsk);
      		break;
      
      So we basically have two code paths that treat IOPRIO_CLASS_NONE
      differently, which results in an RT async cfqq filed into a best effort
      bucket.
      
      Attached is a patch which fixes the problem.  I'm not sure how to make
      it cleaner.  Suggestions would be welcome.
      Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
      Tested-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
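The mismatch described in the cfq-iosched commit above, two code paths that interpret "no class set" differently, can be reproduced in a few lines. The sketch below is a simplified model with hypothetical names (`enum io_class`, `enum bucket`, `bucket_assuming_default()`, `resolve_class()`, `bucket_after_resolving()`); it shows one way to make the paths agree and is not necessarily how the upstream patch is structured.

```c
#include <assert.h>

enum io_class { IO_CLASS_NONE, IO_CLASS_RT, IO_CLASS_BE, IO_CLASS_IDLE };
enum bucket { BUCKET_RT, BUCKET_BE, BUCKET_IDLE };

/* Lookup path from the commit message: NONE silently becomes best effort. */
enum bucket bucket_assuming_default(enum io_class cls)
{
	switch (cls) {
	case IO_CLASS_RT:
		return BUCKET_RT;
	case IO_CLASS_NONE:	/* no information yet: assume best effort */
	case IO_CLASS_BE:
		return BUCKET_BE;
	case IO_CLASS_IDLE:
		return BUCKET_IDLE;
	}
	assert(0 && "unknown I/O class");
	return BUCKET_BE;
}

/* Init path: NONE inherits the class derived from the task's CPU scheduling. */
enum io_class resolve_class(enum io_class cls, enum io_class task_class)
{
	return cls == IO_CLASS_NONE ? task_class : cls;
}

/* Resolve NONE first, so the bucket matches the class the queue ends up with. */
enum bucket bucket_after_resolving(enum io_class cls, enum io_class task_class)
{
	return bucket_assuming_default(resolve_class(cls, task_class));
}
```

For a realtime task with no explicit I/O priority, the first function picks the best-effort bucket while the queue itself is later marked realtime; the last function picks the realtime bucket.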
    • cfq-iosched: handle failure of cfq group allocation · 3608406b
      Konstantin Khlebnikov authored
      commit 69abaffe upstream.
      
      cfq_lookup_create_cfqg() allocates struct blkcg_gq using GFP_ATOMIC.
      In cfq_find_alloc_queue() a possible allocation failure is not handled.
      As a result the kernel oopses on a NULL pointer dereference when
      cfq_link_cfqq_cfqg() calls cfqg_get() for a NULL pointer.
      
      The bug was introduced in v3.5 by commit cd1604fa ("blkcg: factor
      out blkio_group creation"). Prior to that commit, cfq group lookup
      returned a pointer to the root group as a fallback.
      
      This patch handles this error using the existing fallback oom_cfqq.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Fixes: cd1604fa ("blkcg: factor out blkio_group creation")
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
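The fallback pattern in the commit above, returning a preallocated object when an atomic allocation fails instead of dereferencing NULL, is sketched below in user space. The types and names (`struct group`, `struct queue`, `fallback_queue`, `link_queue()`, `find_alloc_queue()`) are hypothetical stand-ins, with `fallback_queue` playing the role the commit assigns to oom_cfqq.

```c
#include <assert.h>
#include <stddef.h>

struct group { int refs; };
struct queue { struct group *grp; };

/* Statically allocated fallback, standing in for the kernel's oom_cfqq. */
struct queue fallback_queue;

/* Attach q to grp and take a reference. grp must not be NULL. */
void link_queue(struct queue *q, struct group *grp)
{
	assert(grp != NULL);
	q->grp = grp;
	grp->refs++;
}

/*
 * grp is the result of a group lookup that may have failed (NULL).
 * On failure, hand back the fallback queue instead of linking.
 */
struct queue *find_alloc_queue(struct group *grp, struct queue *fresh)
{
	if (grp == NULL)
		return &fallback_queue;
	link_queue(fresh, grp);
	return fresh;
}
```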
    • iscsi-target: Drop problematic active_ts_list usage · f623bcdd
      Nicholas Bellinger authored
      commit 3fd7b60f upstream.
      
      This patch drops legacy active_ts_list usage within iscsi_target_tq.c
      code.  It was originally used to track the active thread sets during
      iscsi-target shutdown, and is no longer used by modern upstream code.
      
      Two people have reported list corruption using traditional iscsi-target
      and iser-target with the following backtrace, that appears to be related
      to iscsi_thread_set->ts_list being used across both active_ts_list and
      inactive_ts_list.
      
      [   60.782534] ------------[ cut here ]------------
      [   60.782543] WARNING: CPU: 0 PID: 9430 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0()
      [   60.782545] list_del corruption, ffff88045b00d180->next is LIST_POISON1 (dead000000100100)
      [   60.782546] Modules linked in: ib_srpt tcm_qla2xxx qla2xxx tcm_loop tcm_fc libfc scsi_transport_fc scsi_tgt ib_isert rdma_cm iw_cm ib_addr iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod configfs ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter ip_tables bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ib_ipoib ib_cm ib_uverbs ib_umad mlx4_en mlx4_ib ib_sa ib_mad ib_core mlx4_core dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm uinput iTCO_wdt iTCO_vendor_support microcode serio_raw pcspkr sb_edac edac_core sg i2c_i801 lpc_ich mfd_core mtip32xx igb i2c_algo_bit i2c_core ptp pps_core ioatdma dca wmi ext3(F) jbd(F) mbcache(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) ahci(F) libahci(F) isci(F) libsas(F) scsi_transport_sas(F) [last unloaded: speedstep_lib]
      [   60.782597] CPU: 0 PID: 9430 Comm: iscsi_ttx Tainted: GF 3.12.19+ #2
      [   60.782598] Hardware name: Supermicro X9DRX+-F/X9DRX+-F, BIOS 3.00 07/09/2013
      [   60.782599]  0000000000000035 ffff88044de31d08 ffffffff81553ae7 0000000000000035
      [   60.782602]  ffff88044de31d58 ffff88044de31d48 ffffffff8104d1cc 0000000000000002
      [   60.782605]  ffff88045b00d180 ffff88045b00d0c0 ffff88045b00d0c0 ffff88044de31e58
      [   60.782607] Call Trace:
      [   60.782611]  [<ffffffff81553ae7>] dump_stack+0x49/0x62
      [   60.782615]  [<ffffffff8104d1cc>] warn_slowpath_common+0x8c/0xc0
      [   60.782618]  [<ffffffff8104d2b6>] warn_slowpath_fmt+0x46/0x50
      [   60.782620]  [<ffffffff81280933>] __list_del_entry+0x63/0xd0
      [   60.782622]  [<ffffffff812809b1>] list_del+0x11/0x40
      [   60.782630]  [<ffffffffa06e7cf9>] iscsi_del_ts_from_active_list+0x29/0x50 [iscsi_target_mod]
      [   60.782635]  [<ffffffffa06e87b1>] iscsi_tx_thread_pre_handler+0xa1/0x180 [iscsi_target_mod]
      [   60.782642]  [<ffffffffa06fb9ae>] iscsi_target_tx_thread+0x4e/0x220 [iscsi_target_mod]
      [   60.782647]  [<ffffffffa06fb960>] ? iscsit_handle_snack+0x190/0x190 [iscsi_target_mod]
      [   60.782652]  [<ffffffffa06fb960>] ? iscsit_handle_snack+0x190/0x190 [iscsi_target_mod]
      [   60.782655]  [<ffffffff8106f99e>] kthread+0xce/0xe0
      [   60.782657]  [<ffffffff8106f8d0>] ? kthread_freezable_should_stop+0x70/0x70
      [   60.782660]  [<ffffffff8156026c>] ret_from_fork+0x7c/0xb0
      [   60.782662]  [<ffffffff8106f8d0>] ? kthread_freezable_should_stop+0x70/0x70
      [   60.782663] ---[ end trace 9662f4a661d33965 ]---
      
      Since this code is no longer used, go ahead and drop the problematic usage
      altogether.
      Reported-by: Gavin Guo <gavin.guo@canonical.com>
      Reported-by: Moussa Ba <moussaba@micron.com>
      Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sg: fix EWOULDBLOCK errors with scsi-mq · e1ed769d
      Tony Battersby authored
      commit 7772855a upstream.
      
      With scsi-mq enabled, userspace programs can get unexpected EWOULDBLOCK
      (a.k.a. EAGAIN) errors when submitting commands to the SCSI generic
      driver.  Fix by calling blk_get_request() with GFP_KERNEL instead of
      GFP_ATOMIC.
      
      Note: to avoid introducing a potential deadlock, this patch should be
      applied after the patch titled "sg: fix unkillable I/O wait deadlock
      with scsi-mq".
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Tested-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sg: fix unkillable I/O wait deadlock with scsi-mq · 09c2814b
      Tony Battersby authored
      commit 7568615c upstream.
      
      When using the write()/read() interface for submitting commands, the
      SCSI generic driver does not call blk_put_request() on a completed SCSI
      command until userspace calls read() to get the command completion.
      Since scsi-mq uses a fixed number of preallocated requests, this makes
      it possible for userspace to exhaust the entire preallocated supply of
      requests.  For places in the kernel that call blk_get_request() with
      GFP_KERNEL, this can cause the calling process to deadlock in a
      permanent unkillable I/O wait in blk_get_request() -> ... -> bt_get().
      For places in the kernel that call blk_get_request() with GFP_ATOMIC,
      this can cause blk_get_request() always to return -EWOULDBLOCK.  Note
      that these problems happen only if scsi-mq is enabled.  Prevent the
      problems by calling blk_put_request() as soon as the SCSI command
      completes instead of waiting for userspace to call read().
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Tested-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
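Why a fixed pool of preallocated requests runs dry when requests are held until userspace calls read() can be shown with a toy model. Everything here is hypothetical (`POOL_SIZE`, `struct req_pool`, `pool_get()`, `pool_put()`, `submit_without_read()`); it only mirrors the accounting described in the two sg commits above, not the block layer's real tag allocator.

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_SIZE 4	/* hypothetical; scsi-mq sizes its pool differently */

struct req_pool { int in_use; };

/* Take a request. A non-sleeping caller fails when the pool is empty. */
bool pool_get(struct req_pool *p)
{
	if (p->in_use == POOL_SIZE)
		return false;	/* a sleeping caller would block here instead */
	p->in_use++;
	return true;
}

/* Give a request back to the pool. */
void pool_put(struct req_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}

/*
 * Submit n commands that all complete while userspace never calls read().
 * Returns how many submissions managed to obtain a request.
 */
int submit_without_read(struct req_pool *p, int n, bool put_on_completion)
{
	int ok = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (!pool_get(p))
			continue;
		ok++;
		if (put_on_completion)
			pool_put(p);	/* fixed behaviour: release at completion */
		/* old behaviour: the request stays held until read() */
	}
	return ok;
}
```

With the old behaviour only the first `POOL_SIZE` submissions succeed and the pool stays exhausted; releasing at completion lets every submission through.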
    • NFSv4.1: Fix a kfree() of uninitialised pointers in decode_cb_sequence_args · 1b4853b6
      Trond Myklebust authored
      commit d8ba1f97 upstream.
      
      If the call to decode_rc_list() fails due to a memory allocation error,
      then we need to truncate the array size to ensure that we only call
      kfree() on those pointers that were allocated.
      Reported-by: David Ramos <daramos@stanford.edu>
      Fixes: 4aece6a1 ("nfs41: cb_sequence xdr implementation")
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
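The idea of truncating the element count on a partial failure, so that cleanup only frees pointers that were actually set, is sketched below in user-space C. The names (`struct rc_list`, `struct seq_args`, `decode_one()`, `decode_all()`, `free_all()`) and the `fail_at` knob used to simulate an allocation failure are all hypothetical; the real decoder is structured differently.

```c
#include <assert.h>
#include <stdlib.h>

struct rc_list { int *refcalls; };

struct seq_args {
	unsigned int nrclists;		/* number of valid entries in rclists */
	struct rc_list *rclists;
};

/* Hypothetical per-entry decoder: fails on request, otherwise allocates. */
int decode_one(struct rc_list *rc, int fail)
{
	if (fail)
		return -1;
	rc->refcalls = malloc(sizeof(int));
	return rc->refcalls ? 0 : -1;
}

/* Decode n entries. On failure, truncate the count to the entries decoded. */
int decode_all(struct seq_args *args, unsigned int n, unsigned int fail_at)
{
	unsigned int i;

	assert(n > 0);
	args->nrclists = 0;
	args->rclists = malloc(n * sizeof(*args->rclists));	/* uninitialised */
	if (args->rclists == NULL)
		return -1;
	args->nrclists = n;
	for (i = 0; i < n; i++) {
		if (decode_one(&args->rclists[i], i == fail_at) != 0) {
			args->nrclists = i;	/* entries i..n-1 were never set */
			return -1;
		}
	}
	return 0;
}

/* Free only the entries counted as valid, then the array itself. */
void free_all(struct seq_args *args)
{
	unsigned int i;

	for (i = 0; i < args->nrclists; i++)
		free(args->rclists[i].refcalls);
	free(args->rclists);
	args->rclists = NULL;
	args->nrclists = 0;
}
```

Without the truncation, `free_all()` would pass uninitialised pointers from the tail of the array to `free()`.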
    • NFSv4: Ensure we reference the inode for return-on-close in delegreturn · d77b2f38
      Trond Myklebust authored
      commit ea7c38fe upstream.
      
      If we have to do a return-on-close in the delegreturn code, then
      we must ensure that the inode and super block remain referenced.
      
      Cc: Peng Tao <tao.peng@primarydata.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Reviewed-by: Peng Tao <tao.peng@primarydata.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • SUNRPC: NULL utsname dereference on NFS umount during namespace cleanup · 1a3f8e6f
      Trond Myklebust authored
      commit 03a9a42a upstream.
      
      Fix an Oopsable condition when nsm_mon_unmon is called as part of the
      namespace cleanup, which now apparently happens after the utsname
      has been freed.
      
      Link: http://lkml.kernel.org/r/20150125220604.090121ae@neptune.home
      Reported-by: Bruno Prémont <bonbons@linux-vserver.org>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nfs41: .init_read and .init_write can be called with valid pg_lseg · d52ff195
      Peng Tao authored
      commit cb5d04bc upstream.
      
      With the pgio refactoring in v3.15, .init_read and .init_write can be
      called with a valid pgio->pg_lseg. File layout was fixed at that time
      by commit c6194271 (pnfs: filelayout: support non page aligned
      layouts). But the generic helper still needs to be fixed.
      Signed-off-by: Peng Tao <tao.peng@primarydata.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Added Little Endian support to vtpm module · e1f02461
      honclo authored
      commit eb71f8a5 upstream.
      
      The tpm_ibmvtpm module is affected by an unaligned access problem.
      ibmvtpm_crq_get_version failed with rc=-4 during boot when vTPM is
      enabled in a Power partition, which supports both little endian and
      big endian modes.
      
      We added little endian support to fix this problem:
      1) added cpu_to_be64 calls to ensure BE data is sent from an LE OS.
      2) added be16_to_cpu and be32_to_cpu calls to make sure data received
         is in LE format on an LE OS.
      Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com>
      Signed-off-by: Joy Latten <jmlatten@linux.vnet.ibm.com>
      [phuewe: manually applied the patch :( ]
      Reviewed-by: Ashley Lai <ashley@ahsleylai.com>
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
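What the byte-order conversions in the vtpm commit above achieve can be shown with portable user-space equivalents. The helpers `put_be64()`, `get_be16()` and `get_be32()` are hypothetical names for this sketch, not the kernel's cpu_to_be64(), be16_to_cpu() and be32_to_cpu(). They produce or consume big-endian byte sequences regardless of the host's own byte order, which is the property a driver needs when the device expects big-endian data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v into out[0..7], most significant byte first, on any host. */
void put_be64(uint8_t out[8], uint64_t v)
{
	int i;

	assert(out != NULL);
	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Read a big-endian 16-bit value into host order. */
uint16_t get_be16(const uint8_t in[2])
{
	return (uint16_t)((in[0] << 8) | in[1]);
}

/* Read a big-endian 32-bit value into host order. */
uint32_t get_be32(const uint8_t in[4])
{
	return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
	       ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}
```

On a big-endian host these conversions leave the value unchanged, which is why the missing calls went unnoticed until the driver ran on a little-endian OS.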
    • tpm/tpm_i2c_stm_st33: Fix potential bug in tpm_stm_i2c_send · 0d786782
      Christophe Ricard authored
      commit 1ba3b0b6 upstream.
      
      When sending data in tpm_stm_i2c_send, each loop iteration sends buf.
      Send buf + i instead, as the goal of this for loop is to send a number
      of bytes from buf that fits in burstcnt. Once those bytes are sent, we are
      supposed to send the next ones.
      
      The driver was working because the burstcount value always returns the
      maximum size for a TPM command or response (0x800 for a command and
      0x400 for a response).
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
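The chunked-send loop the tpm_stm_i2c_send commit above describes can be sketched as follows. `struct sink`, `sink_write()` and `send_in_bursts()` are hypothetical stand-ins for the driver's I2C write path; the point is only that each iteration must start at `buf + i`, not at `buf`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sink standing in for the I2C write; it just records bytes. */
struct sink {
	unsigned char data[64];
	size_t len;
};

void sink_write(struct sink *s, const unsigned char *p, size_t n)
{
	assert(s->len + n <= sizeof(s->data));
	memcpy(s->data + s->len, p, n);
	s->len += n;
}

/* Send len bytes from buf in chunks of at most burstcnt bytes. */
void send_in_bursts(struct sink *s, const unsigned char *buf,
		    size_t len, size_t burstcnt)
{
	size_t i = 0;

	assert(burstcnt > 0);
	while (i < len) {
		size_t size = len - i < burstcnt ? len - i : burstcnt;

		sink_write(s, buf + i, size);	/* buf + i, not buf */
		i += size;
	}
}
```

If `burstcnt` is always at least `len`, the loop runs once and `buf + i` equals `buf`, which matches the commit's explanation of why the bug stayed hidden.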
    • tpm: Fix NULL return in tpm_ibmvtpm_get_desired_dma · f6dfa6a2
      Hon Ching (Vicky) Lo authored
      commit 84eb186b upstream.
      
      There was an oops in tpm_ibmvtpm_get_desired_dma, which caused a
      kernel panic during boot when vTPM is enabled in a Power partition
      configured in AMS mode.
      
      vio_bus_probe calls vio_cmo_bus_probe, which calls
      tpm_ibmvtpm_get_desired_dma to get the size needed for DMA allocation.
      The problem is that vio_cmo_bus_probe is called before probe, which
      for vtpm is tpm_ibmvtpm_probe, and it is this function that initializes
      and sets up vtpm's CRQ and gets the required data values.  Therefore,
      since this has not yet been done, NULL is returned in the attempt to get
      the size for DMA allocation.
      
      We added a NULL check.  In addition, a default buffer size will
      be set when NULL is returned.
      Signed-off-by: Hon Ching (Vicky) Lo <honclo@linux.vnet.ibm.com>
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • char: tpm: Add missing error check for devm_kzalloc · d9b736e0
      Kiran Padwal authored
      commit bb95cd34 upstream.
      
      Currently these drivers are missing a check on the return value of
      devm_kzalloc, which would cause a NULL pointer dereference in an OOM
      situation.
      
      This patch adds the missing check to tpm_i2c_atmel.c and tpm_i2c_nuvoton.c.
      Signed-off-by: Kiran Padwal <kiran.padwal@smartplayin.com>
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • TPM: Add new TPMs to the tail of the list to prevent inadvertent change of dev · 770ab65c
      David Howells authored
      commit 398a1e71 upstream.
      
      Add newly registered TPMs to the tail of the list, not the beginning, so that
      things that are specifying TPM_ANY_NUM don't find that the device they're
      using has inadvertently changed.  Adding a second device would break IMA, for
      instance.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
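Why insertion order matters for a "give me any device" lookup, as in the TPM commit above, can be seen in a small model. The array-backed `struct chip_list`, the wildcard `ANY_NUM`, and the functions `add_head()`, `add_tail()` and `find_chip()` are hypothetical; the kernel uses its own list primitives and its own TPM_ANY_NUM constant.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHIPS 8
#define ANY_NUM 0xFFFF	/* hypothetical wildcard, in the role of TPM_ANY_NUM */

struct chip_list {
	int dev_num[MAX_CHIPS];	/* chips in the order a lookup walks them */
	size_t count;
};

/* Old behaviour: a newly registered chip goes to the front. */
int add_head(struct chip_list *l, int dev_num)
{
	size_t i;

	assert(l != NULL);
	if (l->count == MAX_CHIPS)
		return -1;
	for (i = l->count; i > 0; i--)
		l->dev_num[i] = l->dev_num[i - 1];
	l->dev_num[0] = dev_num;
	l->count++;
	return 0;
}

/* New behaviour: a newly registered chip goes to the end. */
int add_tail(struct chip_list *l, int dev_num)
{
	assert(l != NULL);
	if (l->count == MAX_CHIPS)
		return -1;
	l->dev_num[l->count++] = dev_num;
	return 0;
}

/* Return the first chip matching wanted (ANY_NUM matches all), or -1. */
int find_chip(const struct chip_list *l, int wanted)
{
	size_t i;

	for (i = 0; i < l->count; i++)
		if (wanted == ANY_NUM || l->dev_num[i] == wanted)
			return l->dev_num[i];
	return -1;
}
```

With head insertion, registering a second chip changes the answer to a wildcard lookup from chip 0 to chip 1; with tail insertion the answer stays at chip 0.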
    • tpm_tis: verify interrupt during init · bef49dc1
      Scot Doyle authored
      commit 448e9c55 upstream.
      
      Some machines, such as the Acer C720 and Toshiba CB35, have TPMs that do
      not send IRQs while also having an ACPI TPM entry indicating that they
      will be sent. These machines freeze on resume while the tpm_tis module
      waits for an IRQ, eventually timing out.
      
      When in interrupt mode, the tpm_tis module should receive an IRQ during
      module init. Fall back to polling mode if none is received when expected.
      Signed-off-by: Scot Doyle <lkml14@scotdoyle.com>
      Tested-by: Michael Mullin <masmullin@gmail.com>
      Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      [phuewe: minor checkpatch fixed]
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: dts: BCM63xx: fix L2 cache properties · ede1c87d
      Florian Fainelli authored
      commit 9df11828 upstream.
      
      The L2 cache properties were completely off with respect to what the
      hardware is configured for. Fix the cache-size, cache-line-size and
      cache-sets to reflect the L2 cache controller we have: 512KB, 16 ways
      and 32 bytes per cache-line.
      
      Fixes: 46d4bca0 ("ARM: BCM63XX: add BCM63138 minimal Device Tree")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: dts: am335x-bone*: usb0 is hardwired for peripheral · 42b3f837
      Robert Nelson authored
      commit 67fd14b3 upstream.
      
      Fixes: http://bugs.elinux.org/issues/127
      
      The bb.org community was seeing random reboots before this change.
      Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
      Reviewed-by: Felipe Balbi <balbi@ti.com>
      Acked-by: Felipe Balbi <balbi@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: dts: tegra20: fix GR3D, DSI unit and reg base addresses · 6fda79f8
      Dmitry Osipenko authored
      commit de47699d upstream.
      
      Commit 58ecb23f ("ARM: tegra: add missing unit addresses to DT") added
      unit addresses and changed the reg base for the GR3D and DSI host1x
      modules, but these addresses belong to the GR2D and TVO modules
      respectively. Fix it by changing the modules' unit and reg base addresses
      to the proper ones.
      Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
      Fixes: 58ecb23f (ARM: tegra: add missing unit addresses to DT)
      Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: DRA7: hwmod: Fix boot crash with DEBUG_LL enabled on UART3 · 7c062231
      Lokesh Vutla authored
      commit 1c7e36bf upstream.
      
      With commit '7dedd346: ARM: OMAP2+: hwmod: Fix a crash in _setup_reset()
      with DEBUG_LL' we moved from parsing cmdline to identify the UART used
      for earlycon to using the requisite hwmod CONFIG_DEBUG_OMAPxUARTy FLAGS.
      
      On DRA7, the UART3 hwmod doesn't have this flag enabled, and at least on
      BeagleBoard-X15, where we use UART3 for console, boot fails with
      DEBUG_LL enabled. Enable DEBUG_OMAP4UART3_FLAGS for UART3 hwmod.
      
      For using DEBUG_LL, enable CONFIG_DEBUG_OMAP4UART3 in menuconfig.
      
      Fixes: 90020c7b ("ARM: OMAP: DRA7: hwmod: Create initial DRA7XX SoC data")
      Reviewed-by: Felipe Balbi <balbi@ti.com>
      Acked-by: Felipe Balbi <balbi@ti.com>
      Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c062231
    • Dmitry Eremin-Solenikov's avatar
      ARM: 8284/1: sa1100: clear RCSR_SMR on resume · 33ceb68c
      Dmitry Eremin-Solenikov authored
      commit e461894d upstream.
      
      StrongARM core uses RCSR SMR bit to tell the bootloader that it was reset
      by entering the sleep mode. After we have resumed, there is little point
      in having that bit enabled. Moreover, if this bit is set before reboot,
      the bootloader can become confused. Thus clear the SMR bit on resume
      just before clearing the scratchpad (resume address) register.
      Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      33ceb68c
    • Tony Battersby's avatar
      blk-mq: fix double-free in error path · 238ddf7a
      Tony Battersby authored
      commit 564e559f upstream.
      
      If the allocation of bt->bs fails, then bt->map can be freed twice, once
      in blk_mq_init_bitmap_tags() -> bt_alloc(), and once in
      blk_mq_init_bitmap_tags() -> bt_free().  Fix by setting the pointer to
      NULL after the first free.
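
The pattern described above can be sketched in userspace C. This is a minimal illustration, not the blk-mq code: `sketch_bt`, `sketch_bt_alloc()`, `sketch_bt_free()` and `sketch_init_tags()` are hypothetical names, and `malloc()`/`free()` stand in for the kernel allocators. It relies on `free(NULL)` being a no-op, as `kfree(NULL)` also is.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the tag bitmap; field names follow the commit text. */
struct sketch_bt {
    void *map;
    void *bs;
};

/* Common cleanup: release both buffers. free(NULL) is a no-op. */
void sketch_bt_free(struct sketch_bt *bt)
{
    free(bt->map);
    bt->map = NULL;
    free(bt->bs);
    bt->bs = NULL;
}

/*
 * Allocate map, then bs. fail_bs simulates the second allocation failing.
 * Returns 0 on success, -1 on failure.
 */
int sketch_bt_alloc(struct sketch_bt *bt, int fail_bs)
{
    bt->bs = NULL;
    bt->map = malloc(64);
    if (!bt->map)
        return -1;

    bt->bs = fail_bs ? NULL : malloc(64);
    if (!bt->bs) {
        free(bt->map);
        bt->map = NULL; /* without this, the caller's cleanup frees map again */
        return -1;
    }
    return 0;
}

/* Mirrors the init path: on failure the caller runs the common cleanup. */
int sketch_init_tags(struct sketch_bt *bt, int fail_bs)
{
    if (sketch_bt_alloc(bt, fail_bs) != 0) {
        sketch_bt_free(bt);
        return -1;
    }
    return 0;
}
```

Clearing the pointer in the error path keeps the common cleanup routine unconditional, which is usually simpler than tracking which allocations succeeded.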
      Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      238ddf7a
    • Vikram Mulukutla's avatar
      tracing: Fix unmapping loop in tracing_mark_write · a0a645bf
      Vikram Mulukutla authored
      commit 7215853e upstream.
      
      Commit 6edb2a8a introduced
      an array map_pages that contains the addresses returned by
      kmap_atomic. However, when unmapping those pages, map_pages[0]
      is unmapped before map_pages[1], breaking the nesting requirement
      as specified in the documentation for kmap_atomic/kunmap_atomic.
      
      This was caught by the highmem debug code present in kunmap_atomic.
      Fix the loop to do the unmapping properly.
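
The nesting requirement can be illustrated with a toy stack model. This is a sketch under stated assumptions, not the kernel's highmem code: `sketch_kmap_stack`, `sketch_map()`, `sketch_unmap()` and `sketch_unmap_all()` are hypothetical names, and the stack only models the rule that the most recently created mapping must be released first.

```c
#include <stddef.h>

#define SKETCH_MAX_NESTED 8

/* Toy model of a stack of nested atomic mappings. */
struct sketch_kmap_stack {
    int slots[SKETCH_MAX_NESTED];
    size_t depth;
};

/* Push a mapping. Returns 0 on success, -1 if the stack is full. */
int sketch_map(struct sketch_kmap_stack *s, int page)
{
    if (s->depth == SKETCH_MAX_NESTED)
        return -1;
    s->slots[s->depth++] = page;
    return 0;
}

/* Pop a mapping. Returns -1 if page is not the most recent mapping. */
int sketch_unmap(struct sketch_kmap_stack *s, int page)
{
    if (s->depth == 0 || s->slots[s->depth - 1] != page)
        return -1;
    s->depth--;
    return 0;
}

/* Unmap in reverse order of mapping: the last page mapped goes first. */
int sketch_unmap_all(struct sketch_kmap_stack *s, const int *pages, size_t n)
{
    for (size_t i = n; i-- > 0; ) {
        if (sketch_unmap(s, pages[i]) != 0)
            return -1;
    }
    return 0;
}
```

In this model, unmapping the first page while the second is still mapped fails, which corresponds to the ordering violation the highmem debug code reported.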
      
      Link: http://lkml.kernel.org/r/1418871056-6614-1-git-send-email-markivx@codeaurora.org
      Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
      Reported-by: Lime Yang <limey@codeaurora.org>
      Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a0a645bf
    • Naoya Horiguchi's avatar
      mm/hugetlb: pmd_huge() returns true for non-present hugepage · 9cdaa7e2
      Naoya Horiguchi authored
      commit cbef8478 upstream.
      
      Migrating hugepages and hwpoisoned hugepages are considered as non-present
      hugepages, and they are referenced via migration entries and hwpoison
      entries in their page table slots.
      
      This behavior causes race condition because pmd_huge() doesn't tell
      non-huge pages from migrating/hwpoisoned hugepages.  follow_page_mask() is
      one example where the kernel would call follow_page_pte() for such
      hugepage while this function is supposed to handle only normal pages.
      
      To avoid this, this patch makes pmd_huge() return true when pmd_none() is
      false *and* pmd_present() is false.  We don't have to worry about mixing up
      a non-present pmd entry with a normal pmd (pointing to a leaf-level pte entry)
      because pmd_present() is true for a normal pmd.
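
A simplified model of such a predicate is sketched below. It is an illustration only, not the kernel implementation: the `SKETCH_` flag values and function names are hypothetical, and the model treats any non-empty entry that is not a plain present pointer to a pte table as hugepage-related.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits, chosen for this sketch only. */
#define SKETCH_PRESENT 0x001u /* entry is present */
#define SKETCH_PSE     0x080u /* entry maps a huge page directly */

/* An all-zero entry is empty. */
bool sketch_pmd_none(uint64_t pmd)
{
    return pmd == 0;
}

/*
 * True for a present huge mapping and for a non-present (migration or
 * hwpoison) entry; false for an empty entry and for a normal pmd that is
 * present and points to a pte table.
 */
bool sketch_pmd_huge(uint64_t pmd)
{
    return !sketch_pmd_none(pmd) &&
           (pmd & (SKETCH_PRESENT | SKETCH_PSE)) != SKETCH_PRESENT;
}
```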
      
      The same race condition could happen in (x86-specific) gup_pmd_range(),
      where this patch simply adds pmd_present() check instead of pmd_huge().
      This is because gup_pmd_range() is fast path.  If we have non-present
      hugepage in this function, we will go into gup_huge_pmd(), then return 0
      at flag mask check, and finally fall back to the slow path.
      
      Fixes: 290408d4 ("hugetlb: hugepage migration core")
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Luiz Capitulino <lcapitulino@redhat.com>
      Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9cdaa7e2
    • James Hogan's avatar
      MIPS: Export MSA functions used by lose_fpu(1) for KVM · a818d2ae
      James Hogan authored
      commit ca5d2564 upstream.
      
      Export the _save_msa asm function used by the lose_fpu(1) macro to GPL
      modules so that KVM can make use of it when it is built as a module.
      
      This fixes the following build error when CONFIG_KVM=m and
      CONFIG_CPU_HAS_MSA=y due to commit f798217d ("KVM: MIPS: Don't leak
      FPU/DSP to guest"):
      
      ERROR: "_save_msa" [arch/mips/kvm/kvm.ko] undefined!
      
      Fixes: f798217d (KVM: MIPS: Don't leak FPU/DSP to guest)
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9261/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a818d2ae
    • James Hogan's avatar
      MIPS: Export FP functions used by lose_fpu(1) for KVM · 2bfca500
      James Hogan authored
      commit 3ce465e0 upstream.
      
      Export the _save_fp asm function used by the lose_fpu(1) macro to GPL
      modules so that KVM can make use of it when it is built as a module.
      
      This fixes the following build error when CONFIG_KVM=m due to commit
      f798217d ("KVM: MIPS: Don't leak FPU/DSP to guest"):
      
      ERROR: "_save_fp" [arch/mips/kvm/kvm.ko] undefined!
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Fixes: f798217d (KVM: MIPS: Don't leak FPU/DSP to guest)
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9260/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2bfca500
    • Markos Chandras's avatar
      MIPS: asm: pgtable: Prevent HTW race when updating PTEs · dea33896
      Markos Chandras authored
      commit fde3538a upstream.
      
      Whenever we modify a page table entry, we need to ensure that the HTW
      will not fetch a stale entry. For that to happen, we need to ensure
      that the HTW is stopped before we modify said entry; otherwise the HTW
      may already be in the process of reading that entry and fetching the
      old information. As a result, we replace the htw_reset() calls
      with htw_{stop,start} in more appropriate places. This also removes the
      remaining users of htw_reset(), so we drop that macro.
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9116/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dea33896
    • Markos Chandras's avatar
      MIPS: asm: pgtable: Add c0 hazards on HTW start/stop sequences · de0a5f5e
      Markos Chandras authored
      commit 461d1597 upstream.
      
      When we use htw_{start,stop}() outside of htw_reset(), we need
      to ensure that c0 changes have been propagated properly before
      we attempt to continue with subsequent memory operations.
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9114/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      de0a5f5e
    • Markos Chandras's avatar
      MIPS: asm: asmmacro: Replace "add" instructions with "addu" · e800e504
      Markos Chandras authored
      commit 98a833c1 upstream.
      
      The "add" instruction is actually a macro in binutils and depending on
      the size of the immediate it can expand to an "addi" instruction.
      However, the "addi" instruction traps on overflows which is not
      something we want on address calculation.
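
The difference between the two behaviours can be modelled in portable C. This is a sketch only: `sketch_add_would_trap()` and `sketch_addu()` are hypothetical helpers, and `__builtin_add_overflow` is a GCC/Clang builtin, so that compiler family is assumed. The first function reports the cases in which a trapping signed add would raise an overflow exception; the second performs the wrapping add that address arithmetic wants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models a trapping add: reports whether the signed 32-bit sum overflows. */
bool sketch_add_would_trap(int32_t base, int16_t imm)
{
    int32_t sum;
    return __builtin_add_overflow(base, (int32_t)imm, &sum);
}

/* Models a non-trapping add: unsigned arithmetic wraps modulo 2^32. */
uint32_t sketch_addu(uint32_t base, int16_t imm)
{
    /* Sign-extend the 16-bit immediate, then add with wraparound. */
    return base + (uint32_t)(int32_t)imm;
}
```

An address near the signed boundary, such as 0x7fffffff plus 1, is the kind of case where the trapping form would fault while the wrapping form simply yields 0x80000000.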
      
      Link: http://www.linux-mips.org/archives/linux-mips/2015-01/msg00121.html
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e800e504
    • Markos Chandras's avatar
      MIPS: kernel: cps-vec: Replace "addi" with "addiu" · 905fd850
      Markos Chandras authored
      commit acac4108 upstream.
      
      The "addi" instruction will trap on overflows which is not something
      we need in this code, so we replace that with "addiu".
      
      Link: http://www.linux-mips.org/archives/linux-mips/2015-01/msg00430.html
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      905fd850
    • Manuel Lauss's avatar
      MIPS: Alchemy: Fix cpu clock calculation · c1f002d6
      Manuel Lauss authored
      commit 69e4e63e upstream.
      
      The current code uses bits 0-6 of the sys_cpupll register to calculate
      core clock speed.  However this is only valid on Au1300, on all earlier
      models the hardware only uses bits 0-5 to generate core clock.
      
      This fixes clock calculation on the MTX1 (Au1500), where bit 6 of cpupll
      is set as well, which ultimately led the code to calculate a bogus cpu
      core clock and also uart base clock down the line.
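
The effect of the wider mask can be sketched as follows. This is an illustration under assumptions, not the Alchemy clock code: the function and macro names are hypothetical, and the core clock is assumed to be the masked register value multiplied by a reference frequency passed in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CPUPLL_MASK_AU1300 0x7fu /* bits 0-6 */
#define SKETCH_CPUPLL_MASK_LEGACY 0x3fu /* bits 0-5 */

/* Extract the PLL multiplier using the mask that matches the CPU model. */
uint32_t sketch_cpupll_multiplier(uint32_t sys_cpupll, bool is_au1300)
{
    uint32_t mask = is_au1300 ? SKETCH_CPUPLL_MASK_AU1300
                              : SKETCH_CPUPLL_MASK_LEGACY;
    return sys_cpupll & mask;
}

/* Assumed relation for this sketch: core clock = multiplier * reference. */
uint64_t sketch_core_clock_hz(uint32_t sys_cpupll, bool is_au1300,
                              uint64_t ref_hz)
{
    return (uint64_t)sketch_cpupll_multiplier(sys_cpupll, is_au1300) * ref_hz;
}
```

With a register value that has bit 6 set, the two masks give different multipliers, so applying the 7-bit mask to an older part inflates the computed clock.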
      Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
      Reported-by: John Crispin <blogic@openwrt.org>
      Tested-by: Bruno Randolf <br1@einfach.org>
      Cc: Linux-MIPS <linux-mips@linux-mips.org>
      Patchwork: https://patchwork.linux-mips.org/patch/9279/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c1f002d6
    • James Hogan's avatar
      KVM: MIPS: Don't leak FPU/DSP to guest · 1835ecd5
      James Hogan authored
      commit f798217d upstream.
      
      The FPU and DSP are enabled via the CP0 Status CU1 and MX bits by
      kvm_mips_set_c0_status() on a guest exit, presumably in case there is
      active state that needs saving if pre-emption occurs. However neither of
      these bits are cleared again when returning to the guest.
      
      This effectively gives the guest access to the FPU/DSP hardware after
      the first guest exit even though it is not aware of its presence,
      allowing FP instructions in guest user code to intermittently actually
      execute instead of trapping into the guest OS for emulation. It will
      then read & manipulate the hardware FP registers which technically
      belong to the user process (e.g. QEMU), or are stale from another user
      process. It can also crash the guest OS by causing an FP exception, for
      which a guest exception handler won't have been registered.
      
      First let's save and disable the FPU (and MSA) state with lose_fpu(1)
      before entering the guest. This simplifies the problem, especially for
      when guest FPU/MSA support is added in the future, and prevents FR=1 FPU
      state being live when the FR bit gets cleared for the guest, which
      according to the architecture causes the contents of the FPU and vector
      registers to become UNPREDICTABLE.
      
      We can then safely remove the enabling of the FPU in
      kvm_mips_set_c0_status(), since there should never be any active FPU or
      MSA state to save at pre-emption, which should plug the FPU leak.
      
      DSP state is always live rather than being lazily restored, so for that
      it is simpler to just clear the MX bit again when re-entering the guest.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Sanjay Lal <sanjayl@kymasys.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1835ecd5
    • James Hogan's avatar
      KVM: MIPS: Disable HTW while in guest · 32a69915
      James Hogan authored
      commit c4c6f2ca upstream.
      
      Ensure any hardware page table walker (HTW) is disabled while in KVM
      guest mode, as KVM doesn't yet set up hardware page table walking for
      guest mappings so the wrong mappings would get loaded, resulting in the
      guest hanging or crashing once it reaches userland.
      
      The HTW is disabled and re-enabled around the call to
      __kvm_mips_vcpu_run() which does the initial switch into guest mode and
      the final switch out of guest context. Additionally it is enabled for
      the duration of guest exits (i.e. kvm_mips_handle_exit()), getting
      disabled again before returning back to guest or host.
      
      In all cases the HTW is only disabled in normal kernel mode while
      interrupts are disabled, so that the HTW doesn't get left disabled if
      the process is preempted.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      32a69915
    • Trond Myklebust's avatar
      NFS: struct nfs_commit_info.lock must always point to inode->i_lock · 5fb43ea9
      Trond Myklebust authored
      commit f4086a3d upstream.
      
      Commit 411a99ad (nfs: clear_request_commit while holding i_lock)
      assumes that the nfs_commit_info always points to the inode->i_lock.
      For historical reasons, that is not the case for O_DIRECT writes.
      
      Cc: Weston Andros Adamson <dros@primarydata.com>
      Fixes: 411a99ad ("nfs: clear_request_commit while holding i_lock")
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5fb43ea9
    • Jeff Layton's avatar
      nfs: don't call blocking operations while !TASK_RUNNING · 0c0f2544
      Jeff Layton authored
      commit 6ffa30d3 upstream.
      
      Bruce reported seeing this warning pop when mounting using v4.1:
      
           ------------[ cut here ]------------
           WARNING: CPU: 1 PID: 1121 at kernel/sched/core.c:7300 __might_sleep+0xbd/0xd0()
          do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810ff58f>] prepare_to_wait+0x2f/0x90
          Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer ppdev joydev snd virtio_console virtio_balloon pcspkr serio_raw parport_pc parport pvpanic floppy soundcore i2c_piix4 virtio_blk virtio_net qxl drm_kms_helper ttm drm virtio_pci virtio_ring ata_generic virtio pata_acpi
          CPU: 1 PID: 1121 Comm: nfsv4.1-svc Not tainted 3.19.0-rc4+ #25
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
           0000000000000000 000000004e5e3f73 ffff8800b998fb48 ffffffff8186ac78
           0000000000000000 ffff8800b998fba0 ffff8800b998fb88 ffffffff810ac9da
           ffff8800b998fb68 ffffffff81c923e7 00000000000004d9 0000000000000000
          Call Trace:
           [<ffffffff8186ac78>] dump_stack+0x4c/0x65
           [<ffffffff810ac9da>] warn_slowpath_common+0x8a/0xc0
           [<ffffffff810aca65>] warn_slowpath_fmt+0x55/0x70
           [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90
           [<ffffffff810ff58f>] ? prepare_to_wait+0x2f/0x90
           [<ffffffff810dd2ad>] __might_sleep+0xbd/0xd0
           [<ffffffff8124c973>] kmem_cache_alloc_trace+0x243/0x430
           [<ffffffff810d941e>] ? groups_alloc+0x3e/0x130
           [<ffffffff810d941e>] groups_alloc+0x3e/0x130
           [<ffffffffa0301b1e>] svcauth_unix_accept+0x16e/0x290 [sunrpc]
           [<ffffffffa0300571>] svc_authenticate+0xe1/0xf0 [sunrpc]
           [<ffffffffa02fc564>] svc_process_common+0x244/0x6a0 [sunrpc]
           [<ffffffffa02fd044>] bc_svc_process+0x1c4/0x260 [sunrpc]
           [<ffffffffa03d5478>] nfs41_callback_svc+0x128/0x1f0 [nfsv4]
           [<ffffffff810ff970>] ? wait_woken+0xc0/0xc0
           [<ffffffffa03d5350>] ? nfs4_callback_svc+0x60/0x60 [nfsv4]
           [<ffffffff810d45bf>] kthread+0x11f/0x140
           [<ffffffff810ea815>] ? local_clock+0x15/0x30
           [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250
           [<ffffffff81874bfc>] ret_from_fork+0x7c/0xb0
           [<ffffffff810d44a0>] ? kthread_create_on_node+0x250/0x250
          ---[ end trace 675220a11e30f4f2 ]---
      
      nfs41_callback_svc does most of its work while in TASK_INTERRUPTIBLE,
      which is just wrong. Fix that by finishing the wait immediately if we've
      found that the list has something on it.
      
      Also, we don't expect this kthread to accept signals, so we should be
      using a TASK_UNINTERRUPTIBLE sleep instead. That, however, opens us up
      to hung task warnings from the watchdog, so have the schedule_timeout
      wake up every 60s if there's no callback activity.
      Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c0f2544
    • Konstantin Khlebnikov's avatar
      proc/pagemap: walk page tables under pte lock · 66f6db71
      Konstantin Khlebnikov authored
      commit 05fbf357 upstream.
      
      Lockless access to pte in pagemap_pte_range() might race with page
      migration and trigger BUG_ON(!PageLocked()) in migration_entry_to_page():
      
      CPU A (pagemap)                           CPU B (migration)
                                                lock_page()
                                                try_to_unmap(page, TTU_MIGRATION...)
                                                     make_migration_entry()
                                                     set_pte_at()
      <read *pte>
      pte_to_pagemap_entry()
                                                remove_migration_ptes()
                                                unlock_page()
          if(is_migration_entry())
              migration_entry_to_page()
                  BUG_ON(!PageLocked(page))
      
      Also lockless read might be non-atomic if pte is larger than wordsize.
      Other pte walkers (smaps, numa_maps, clear_refs) already lock ptes.
      
      Fixes: 052fb0d6 ("proc: report file/anon bit in /proc/pid/pagemap")
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Reported-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      66f6db71
    • Marcin Wojtas's avatar
      mmc: sdhci-pxav3: Fix Armada 38x controller's caps according to erratum ERR-7878951 · 0c98e7c5
      Marcin Wojtas authored
      commit a39128bc upstream.
      
      According to erratum 'ERR-7878951' Armada 38x SDHCI controller has
      different capabilities than the ones shown in its registers:
      
      - it doesn't support the voltage switching: it can work either with
        3.3V or 1.8V supply
      - it doesn't support the SDR104 mode
      - SDR50 mode doesn't need tuning
      
      The SDHCI_QUIRK_MISSING_CAPS quirk is used for updating the
      capabilities accordingly.
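
The idea of overriding what the registers advertise can be sketched as below. This is a minimal model, not the sdhci-pxav3 driver: the struct, the function, and the `SKETCH_` bit positions are hypothetical, and only the SDR104 and SDR50-tuning points from the list above are modelled (voltage switching is left out).

```c
#include <stdint.h>

/* Illustrative capability bits, chosen for this sketch only. */
#define SKETCH_SUPPORT_SDR50    (1u << 0)
#define SKETCH_SUPPORT_SDR104   (1u << 1)
#define SKETCH_USE_SDR50_TUNING (1u << 13)

struct sketch_host {
    uint32_t caps1;        /* capabilities the driver will actually use */
    int use_override_caps; /* stands in for a "missing caps" style quirk */
};

/*
 * Start from the value read from the capability register, then clear the
 * features the erratum says the controller does not really provide.
 */
void sketch_apply_erratum(struct sketch_host *host, uint32_t hw_caps1)
{
    host->use_override_caps = 1;
    host->caps1 = hw_caps1 &
                  ~(SKETCH_SUPPORT_SDR104 | SKETCH_USE_SDR50_TUNING);
}
```

Keeping the hardware value as the starting point means every capability not named by the erratum is still taken from the register.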
      
      [gregory.clement@free-electrons.com: port from 3.10]
      
      Fixes: 5491ce3f ("mmc: sdhci-pxav3: add support for the Armada 38x SDHCI controller")
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c98e7c5
    • Gregory CLEMENT's avatar
      mmc: sdhci-pxav3: Fix SDR50 and DDR50 capabilities for the Armada 38x flavor · 56073d46
      Gregory CLEMENT authored
      commit d4b803c5 upstream.
      
      According to erratum 'FE-2946959' both SDR50 and DDR50 modes require
      specific clock adjustments in SDIO3 Configuration register. However,
      this register was not part of the device tree binding. Even if the
      binding can (and will) be extended, we still need to handle the case
      where this register is not available. In this case we use the
      SDHCI_QUIRK_MISSING_CAPS quirk to remove them from the capabilities.
      
      This commit is based on the work done by Marcin Wojtas <mw@semihalf.com>.
      
      Fixes: 5491ce3f ("mmc: sdhci-pxav3: add support for the Armada 38x SDHCI controller")
      Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      56073d46