1. 11 Sep, 2013 40 commits
    • block: replace strict_strtoul() with kstrtoul() · bb8e0e84
      Jingoo Han authored
      The use of strict_strtoul() is not preferred, because strict_strtoul() is
      obsolete.  Thus, kstrtoul() should be used.
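A rough user-space sketch of the strict parsing style that kstrtoul() provides: the whole string must be a number, a single trailing newline is tolerated, and errors come back as negative errno values. The helper name `parse_ulong_strict` is hypothetical and is built on the C library's strtoul(); it is not the kernel implementation.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical user-space helper (not kernel code): parse the whole string
 * as an unsigned long. Returns 0 on success, -EINVAL on malformed input
 * and -ERANGE on overflow.
 */
int parse_ulong_strict(const char *s, unsigned int base, unsigned long *res)
{
    char *end;
    unsigned long val;

    /* strtoul() would skip leading blanks and accept '-'; reject both. */
    if (*s == '\0' || *s == '-' || isspace((unsigned char)*s))
        return -EINVAL;

    errno = 0;
    val = strtoul(s, &end, (int)base);
    if (errno == ERANGE)
        return -ERANGE;
    if (end == s)               /* no digits were consumed */
        return -EINVAL;
    if (*end == '\n')           /* tolerate one trailing newline */
        end++;
    if (*end != '\0')           /* trailing garbage */
        return -EINVAL;

    *res = val;
    return 0;
}
```

Callers check the return value first and only then use the result, which is the usage pattern the conversion encourages.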
      Signed-off-by: Jingoo Han <jg1.han@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/sched.h: don't use task->pid/tgid in same_thread_group/has_group_leader_pid · e1403b8e
      Oleg Nesterov authored
      task_struct->pid/tgid should go away.
      
      1. Change same_thread_group() to use task->signal for comparison.
      
      2. Change has_group_leader_pid(task) to compare task_pid(task) with
         signal->leader_pid.
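The two changes can be pictured with a reduced user-space model. The struct and function names below are hypothetical stand-ins, and plain integers replace the kernel's struct pid pointers; this is a sketch of the comparison logic only.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct signal_sketch {
    int leader_pid;                 /* pid of the thread-group leader */
};

struct task_sketch {
    int pid;                        /* per-thread id */
    struct signal_sketch *signal;   /* shared by all threads of one group */
};

/* Two tasks are in the same thread group iff they share one signal struct. */
bool same_thread_group_sketch(const struct task_sketch *a,
                              const struct task_sketch *b)
{
    return a->signal == b->signal;
}

/* A task carries the leader pid iff its own pid matches the recorded one. */
bool has_group_leader_pid_sketch(const struct task_sketch *t)
{
    return t->pid == t->signal->leader_pid;
}
```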
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Sergey Dyasly <dserrg@gmail.com>
      Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix the end cluster offset of FIEMAP · 28e8be31
      Jie Liu authored
      Calling the fiemap ioctl(2) with a given start offset and a desired
      mapping range should show extents if possible.  However, we compute the
      end offset of the mapping via 'mapping_end -= cpos' before iterating the
      extent records, which causes problems if the given fiemap length is
      small relative to the cluster size, e.g.,
      
      Cluster size 4096:
      debugfs.ocfs2 1.6.3
              Block Size Bits: 12   Cluster Size Bits: 12
      
      The extended fiemap test utility from David:
      https://gist.github.com/anonymous/6172331
      
      # dd if=/dev/urandom of=/ocfs2/test_file bs=1M count=1000
      # ./fiemap /ocfs2/test_file 4096 10
      start: 4096, length: 10
      File /ocfs2/test_file has 0 extents:
      #	Logical          Physical         Length           Flags
      	^^^^^ <-- No extent is shown
      
      In this case, at ocfs2_fiemap(): cpos == mapping_end == 1. Hence the
      loop of searching extent records was not executed at all.
      
      This patch removes the 'mapping_end -= cpos' in question, and loops
      until cpos is larger than mapping_end as usual.
      
      # ./fiemap /ocfs2/test_file 4096 10
      start: 4096, length: 10
      File /ocfs2/test_file has 1 extents:
      #	Logical          Physical         Length           Flags
      0:	0000000000000000 0000000056a01000 0000000006a00000 0000
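The arithmetic behind the report can be replayed in a small sketch. The function names and the one-cluster-per-iteration loop are simplifying assumptions (the real loop advances by extent length); only the 4096-byte cluster size and the 'mapping_end -= cpos' subtraction come from the message above.

```c
#include <stdint.h>

#define CLUSTER_BITS 12u   /* 4096-byte clusters, as in the report above */

/* Number of clusters needed to hold `bytes` bytes (rounds up). */
uint32_t clusters_for_bytes_sketch(uint64_t bytes)
{
    return (uint32_t)((bytes + (1u << CLUSTER_BITS) - 1) >> CLUSTER_BITS);
}

/* Count how many clusters the extent lookup loop would visit. */
uint32_t clusters_visited_sketch(uint64_t start, uint64_t len, int buggy)
{
    uint32_t cpos = (uint32_t)(start >> CLUSTER_BITS);
    uint32_t mapping_end = clusters_for_bytes_sketch(start + len);
    uint32_t visited = 0;

    if (buggy)
        mapping_end -= cpos;    /* the subtraction the patch removes */

    while (cpos < mapping_end) {
        visited++;
        cpos++;                 /* simplification: one cluster per step */
    }
    return visited;
}
```

With start 4096 and length 10, the buggy variant ends up with cpos == mapping_end == 1 and never enters the loop, while the fixed variant visits one cluster.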
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reported-by: David Weber <wb@munzinger.de>
      Tested-by: David Weber <wb@munzinger.de>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: remove unused variable ip in dlmfs_get_root_inode() · a72e27d3
      Joseph Qi authored
      Variable ip in dlmfs_get_root_inode() is defined but not used.  So clean
      it up.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix a tiny race case when firing callbacks · 6f8648e8
      Joyce authored
      In o2hb_shutdown_slot() and o2hb_check_slot(), since event is defined as
      a local variable, it is only valid while the function is on the call
      stack.  So the following tiny race may happen in an environment with
      multiple mounted volumes:
      
      o2hb-vol1                         o2hb-vol2
      1) o2hb_shutdown_slot
      allocate local event1
      2) queue_node_event
      add event1 to global o2hb_node_events
                                        3) o2hb_shutdown_slot
                                        allocate local event2
                                        4) queue_node_event
                                        add event2 to global o2hb_node_events
                                        5) o2hb_run_event_list
                                        delete event1 from o2hb_node_events
      6) o2hb_run_event_list
      event1 empty, return
      7) o2hb_shutdown_slot
      event1 lifecycle ends
                                        8) o2hb_fire_callbacks
                                        event1 is already *invalid*
      
      This patch lets it wait on o2hb_callback_sem when another thread is firing
      callbacks.  And for performance consideration, we only call
      o2hb_run_event_list when there is an event queued.
      Signed-off-by: Joyce <xuejiufei@huawei.com>
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: avoid possible NULL pointer dereference in o2net_accept_one() · 03dbe88a
      Joseph Qi authored
      Since o2nm_get_node_by_num() may return NULL, we add this check in
      o2net_accept_one() to avoid possible NULL pointer dereference.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: adjust code style for o2net_handler_tree_lookup() · 9a239e4c
      Joseph Qi authored
      Code in o2net_handler_tree_lookup() may be corrupted by mistake.  So
      adjust it to promote readability.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: free path in ocfs2_remove_inode_range() · 7aebff18
      Younger Liu authored
      In ocfs2_remove_inode_range(), there is a memory leak.  The variable path
      is allocated with ocfs2_new_path_from_et(), but it is never freed.
      Signed-off-by: Younger Liu <younger.liu@huawei.com>
      Reviewed-by: Jie Liu <jeff.liu@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix possible double free in ocfs2_reflink_xattr_rec · 6cae6d31
      Joseph Qi authored
      In ocfs2_reflink_xattr_rec(), meta_ac and data_ac are allocated by calling
      ocfs2_lock_reflink_xattr_rec_allocators().
      
      Once an error occurs when allocating *data_ac, it frees *meta_ac which is
      allocated before.  Here it mistakenly sets meta_ac to NULL instead of *meta_ac.
      Then ocfs2_reflink_xattr_rec() will try to free meta_ac again which is
      already invalid.
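The difference between clearing the parameter and clearing the caller's pointer is easy to miss, so here is a minimal user-space sketch of the corrected pattern. `struct alloc_ctx_sketch` and `lock_allocators_sketch` are hypothetical names, and calloc()/free() stand in for the ocfs2 allocation-context helpers.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an ocfs2 allocation context. */
struct alloc_ctx_sketch {
    int unused;
};

/*
 * Allocate both contexts through out-parameters. If the second allocation
 * fails, release the first and clear the CALLER's pointer, so that the
 * caller's own cleanup path cannot free it a second time.
 */
int lock_allocators_sketch(struct alloc_ctx_sketch **meta_ac,
                           struct alloc_ctx_sketch **data_ac,
                           int simulate_data_failure)
{
    *meta_ac = calloc(1, sizeof(**meta_ac));
    if (!*meta_ac)
        return -1;

    *data_ac = simulate_data_failure ? NULL : calloc(1, sizeof(**data_ac));
    if (!*data_ac) {
        free(*meta_ac);
        /* Writing "meta_ac = NULL" here would only clear the local copy. */
        *meta_ac = NULL;
        return -1;
    }
    return 0;
}
```

After a failure the caller sees a NULL pointer and its unconditional free becomes a harmless no-op.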
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Jie Liu <jeff.liu@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2/dlm: force clean refmap when doing local cleanup · 69b2bd16
      Xue jiufei authored
      dlm_do_local_recovery_cleanup() should force clean refmap if the owner of
      lockres is UNKNOWN.  Otherwise node may hang when umounting filesystems.
      Here's the situation:
      
      	Node1                                    Node2
      dlmlock()
        -> dlm_get_lock_resource()
      send DLM_MASTER_REQUEST_MSG to
      other nodes.
      
                                             trying to master this lockres,
                                             return MAYBE.
      
      selected as the master of lockresA,
      set mle->master to Node1,
      and do assert_master,
      send DLM_ASSERT_MASTER_MSG to Node2.
                                             Node 2 has interest on lockresA
                                             and return
                                             DLM_ASSERT_RESPONSE_MASTERY_REF
                                             then something happened and
                                             Node2 crashed.
      
      Receiving DLM_ASSERT_RESPONSE_MASTERY_REF, set Node2 into refmap, and keep
      sending DLM_ASSERT_MASTER_MSG to other nodes
      
      o2hb found node2 down, calling dlm_hb_node_down() -->
      dlm_do_local_recovery_cleanup() the master of lockresA is still UNKNOWN,
      no need to call dlm_free_dead_locks().
      
      Set the master of lockresA to Node1, but Node2 still remains in refmap.

      When Node1 umounts, it finds that the refmap of lockresA is not empty
      and attempts to migrate it to Node2.  But Node2 is already down, so
      umount hangs, trying to migrate lockresA again and again.
      Signed-off-by: joyce <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: free meta_ac and data_ac when ocfs2_start_trans fails in ocfs2_xattr_set() · 6ea437a3
      Younger Liu authored
      In ocfs2_xattr_set(), if ocfs2_start_trans() fails, meta_ac and data_ac
      should be freed.  Otherwise, it would lead to a memory leak.
      Signed-off-by: Younger Liu <younger.liu@huawei.com>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Jie Liu <jeff.liu@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: add the missing return value check of ocfs2_xattr_get_clusters · 17caf955
      Joseph Qi authored
      In ocfs2_xattr_value_attach_refcount(), if an error occurs when calling
      ocfs2_xattr_get_clusters(), execution continues with unpredictable
      behavior because the local variables p_cluster, num_clusters and
      ext_flags are declared without initialization.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Acked-by: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix a memory leak in __ocfs2_move_extents() · 4704aa30
      Jie Liu authored
      The ocfs2 path is not properly freed which leads to a memory leak at
      __ocfs2_move_extents().
      
      This patch stops the leaks of the ocfs2_path structure.
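Leaks of this kind are usually closed by routing every exit through one cleanup label. The sketch below shows that pattern in user space; `path_sketch`, `live_paths` and the function names are hypothetical, and the counter exists only to make a leak observable.

```c
#include <stdlib.h>

/* Counts outstanding allocations so that a leak would be observable. */
int live_paths;

/* Hypothetical stand-in for struct ocfs2_path. */
struct path_sketch {
    int depth;
};

struct path_sketch *path_new_sketch(void)
{
    struct path_sketch *p = calloc(1, sizeof(*p));

    if (p)
        live_paths++;
    return p;
}

void path_free_sketch(struct path_sketch *p)
{
    if (p) {
        live_paths--;
        free(p);
    }
}

/* Every return after the allocation goes through "out", which frees path. */
int move_extents_sketch(int fail_midway)
{
    int ret = 0;
    struct path_sketch *path = path_new_sketch();

    if (!path)
        return -1;

    if (fail_midway) {
        ret = -1;
        goto out;               /* error path still reaches the free */
    }

    path->depth = 1;            /* placeholder for the real work */
out:
    path_free_sketch(path);
    return ret;
}
```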
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Younger Liu <younger.liu@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: add missing return value check of ocfs2_get_clusters() · 2b0f6eae
      Joseph Qi authored
      In ocfs2_attach_refcount_tree() and ocfs2_duplicate_extent_list(), if an
      error occurs when calling ocfs2_get_clusters(), execution continues with
      unpredictable behavior because the local variables p_cluster, num_clusters and
      ext_flags are declared without initialization.
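A small sketch of why the check matters: the lookup fills its out-parameters only on success, so the caller has to bail out before touching them. Both function names and the fake lookup rule (positions of 100 and above fail) are hypothetical.

```c
#include <errno.h>

/* Hypothetical lookup: fills the outputs only when it succeeds. */
int get_clusters_sketch(unsigned int cpos, unsigned int *p_cluster,
                        unsigned int *num_clusters)
{
    if (cpos >= 100)
        return -EIO;            /* outputs are left untouched on failure */
    *p_cluster = 1000 + cpos;
    *num_clusters = 1;
    return 0;
}

/* Caller with the added check: never reads the locals after a failure. */
int attach_refcount_sketch(unsigned int cpos, unsigned int *last_cluster)
{
    unsigned int p_cluster, num_clusters;   /* deliberately uninitialized */
    int ret;

    ret = get_clusters_sketch(cpos, &p_cluster, &num_clusters);
    if (ret)                    /* the missing check the patch adds */
        return ret;

    *last_cluster = p_cluster + num_clusters - 1;
    return 0;
}
```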
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Jie Liu <jeff.liu@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: clean up dead code in ocfs2_acl_from_xattr() · 3d94ea51
      Joseph Qi authored
      In ocfs2_acl_from_xattr(), if size is less than sizeof(struct
      posix_acl_entry), it returns ERR_PTR(-EINVAL) directly.  It then assigns
      (size / sizeof(struct posix_acl_entry)) to count, which will be at least
      1, so the following branches (count < 0) and (count == 0) will never be
      taken.
      Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Acked-by: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: use list_for_each_entry() instead of list_for_each() · df53cd3b
      Dong Fang authored
      [dan.carpenter@oracle.com: fix up some NULL dereference bugs]
      Signed-off-by: Dong Fang <yp.fangdong@gmail.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Jeff Liu <jeff.liu@oracle.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/ocfs2/cluster/tcp.c: fix possible null pointer dereferences · 8dd7903e
      Sunil Mushran authored
      Fix some possible null pointer dereferences that were detected by the
      static code analyser, smatch.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Reported-by: Dan Carpenter <error27@gmail.com>
      Reported-by: Guozhonghua <guozhonghua@h3c.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Cc: Joseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: ac_bits_wanted should be local_alloc_bits when returns -ENOSPC · 7e9b7937
      Younger Liu authored
      There is an issue in reserving and claiming space for localalloc.  When
      localalloc space is not enough, it claims space from global_bitmap.  If
      there is not enough free space in global_bitmap, the claim size is set
      to half of the original size and the claim is retried.

      The issue is as follows: osb->local_alloc_bits is set to half of the
      original size in ocfs2_recalc_la_window(), but ac->ac_bits_wanted is set
      to osb->local_alloc_default_bits, which is not changed.  localalloc
      therefore always reserves and claims local_alloc_default_bits of space
      and returns ENOSPC.
      
      So, ac->ac_bits_wanted should be osb->local_alloc_bits which would be
      changed.
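The retry behaviour can be simulated in a few lines. This is a sketch under stated assumptions: `struct osb_sketch`, `reserve_window_sketch` and the `use_current_window` switch are hypothetical, the global bitmap is reduced to a single free-bit count, and the real window recalculation has limits that are not modelled here.

```c
#include <errno.h>

/* Hypothetical reduction of the superblock fields named above. */
struct osb_sketch {
    unsigned int local_alloc_default_bits;  /* never changes */
    unsigned int local_alloc_bits;          /* halved after each ENOSPC */
};

/*
 * Try to claim a window from a global bitmap with `free_bits` free bits.
 * use_current_window != 0 models the fix (want local_alloc_bits);
 * use_current_window == 0 models the old code (want the default size).
 * Returns the number of bits claimed, or -ENOSPC.
 */
int reserve_window_sketch(struct osb_sketch *osb, unsigned int free_bits,
                          int use_current_window)
{
    for (;;) {
        unsigned int wanted = use_current_window ?
                              osb->local_alloc_bits :
                              osb->local_alloc_default_bits;

        if (wanted <= free_bits)
            return (int)wanted;
        if (osb->local_alloc_bits <= 1)
            return -ENOSPC;             /* window cannot shrink further */
        osb->local_alloc_bits /= 2;     /* shrink the window and retry */
    }
}
```

With a default of 256 bits and 100 free bits, the fixed variant settles on a 64-bit window, while the old variant keeps asking for 256 bits and ends in -ENOSPC.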
      Signed-off-by: Younger Liu <younger.liu@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Jeff Liu <jeff.liu@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: dlm_request_all_locks() should deal with the status sent from target node · 98ac9125
      Xue jiufei authored
      dlm_request_all_locks() should deal with the status sent from the target
      node if DLM_LOCK_REQUEST_MSG is sent successfully, or the recovery master
      will fall into an endless loop, waiting for other nodes to send locks and
      DLM_RECO_DATA_DONE_MSG to it.
      
              NodeA                                  NodeB
                                           selected as recovery master
                                           dlm_remaster_locks()
                                           ->dlm_request_all_locks()
                                           send DLM_LOCK_REQUEST_MSG to nodeA
      
      It happened that NodeA cannot allocate memory when it processes this
      message.  dlm_request_all_locks_handler() does not queue
      dlm_request_all_locks_worker and returns -ENOMEM.  It will never send
      locks and DLM_RECO_DATA_DONE_MSG to NodeB.

                                          NodeB does not deal with the status
                                          sent from nodeA, and will fall into
                                          an endless loop waiting for the
                                          recovery state of NodeA to be
                                          changed.
      Signed-off-by: joyce <xuejiufei@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Jeff Liu <jeff.liu@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: use i_size_read() to access i_size · f17c20dd
      Junxiao Bi authored
      Though ocfs2 uses inode->i_mutex to protect i_size, there are both
      i_size_read/write() and direct accesses.  Clean up all direct access to
      eliminate confusion.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: lighten up allocate transaction · 2b1e55c3
      Younger Liu authored
      The issue scenario is as follows:

      When fallocating a very large disk space for a small file,
      __ocfs2_extend_allocation attempts to get a very large transaction.  For
      some journal sizes, there may not be enough room for this transaction,
      and the fallocate will fail.

      The patch below extends and restarts the transaction as necessary while
      allocating space, and should work with even the smallest journal.  This
      patch follows the approach used by ext4 resize.
      
      Test:
      # mkfs.ocfs2 -b 4K -C 32K -T datafiles /dev/sdc
      ...(journal size is 32M)
      # mount.ocfs2 /dev/sdc /mnt/ocfs2/
      # touch /mnt/ocfs2/1.log
      # fallocate -o 0 -l 400G /mnt/ocfs2/1.log
      fallocate: /mnt/ocfs2/1.log: fallocate failed: Cannot allocate memory
      # tail -f /var/log/messages
      [ 7372.278591] JBD: fallocate wants too many credits (2051 > 2048)
      [ 7372.278597] (fallocate,6438,0):__ocfs2_extend_allocation:709 ERROR: status = -12
      [ 7372.278603] (fallocate,6438,0):ocfs2_allocate_unwritten_extents:1504 ERROR: status = -12
      [ 7372.278607] (fallocate,6438,0):__ocfs2_change_file_space:1955 ERROR: status = -12
      ^C
      With this patch, the test works well.
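The idea of extending and restarting the transaction can be sketched as a loop that works within a fixed credit budget instead of asking for everything up front. All names below are hypothetical, the cost model (a fixed number of credits per cluster) is an assumption, and no journal API is modelled; only the 2048-credit limit is taken from the log above.

```c
#define MAX_CREDITS_SKETCH 2048u   /* journal limit seen in the log above */

struct txn_stats_sketch {
    unsigned int restarts;         /* how often the transaction restarted */
    unsigned int clusters_done;    /* clusters allocated so far */
};

/*
 * Allocate `clusters` clusters, each costing `credits_per_cluster` credits.
 * When the remaining credits cannot cover the next cluster, the transaction
 * is restarted with a fresh budget rather than failing outright.
 */
int extend_allocation_sketch(unsigned int clusters,
                             unsigned int credits_per_cluster,
                             struct txn_stats_sketch *st)
{
    unsigned int credits_left = MAX_CREDITS_SKETCH;

    st->restarts = 0;
    st->clusters_done = 0;

    if (credits_per_cluster == 0 || credits_per_cluster > MAX_CREDITS_SKETCH)
        return -1;                 /* a single step can never fit */

    while (st->clusters_done < clusters) {
        if (credits_left < credits_per_cluster) {
            credits_left = MAX_CREDITS_SKETCH;   /* restart transaction */
            st->restarts++;
        }
        credits_left -= credits_per_cluster;
        st->clusters_done++;
    }
    return 0;
}
```

In this model a request that needs 2051 credits completes with a single restart instead of being rejected for exceeding 2048.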
      Signed-off-by: Younger Liu <younger.liu@huawei.com>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/iommu: remove unnecessary platform_set_drvdata() · 5e42781c
      Jingoo Han authored
      The driver core clears the driver data to NULL after device_release or
      on probe failure.  Thus, it is not needed to manually clear the device
      driver data to NULL.
      Signed-off-by: Jingoo Han <jg1.han@samsung.com>
      Cc: David Brown <davidb@codeaurora.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Suman Anna <s-anna@ti.com>
      Acked-by: Libo Chen <libo.chen@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/video/acornfb.c: remove dead code · ffd29195
      Paul Bolle authored
      acornfb checks for HAS_VIDC while support for that macro was removed in
      v2.6.23 (when the arm26 port was removed).  So we can remove a bit of
      dead code.
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
      Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fork: unify and tighten up CLONE_NEWUSER/CLONE_NEWPID checks · 40a0d32d
      Oleg Nesterov authored
      do_fork() denies CLONE_THREAD | CLONE_PARENT if NEWUSER | NEWPID.
      
      Then later copy_process() denies CLONE_SIGHAND if the new process will
      be in a different pid namespace (task_active_pid_ns() doesn't match
      current->nsproxy->pid_ns).
      
      This looks confusing and inconsistent.  CLONE_NEWPID is very similar to
      the case when ->pid_ns was already unshared, we want the same
      restrictions so copy_process() should also nack CLONE_PARENT.
      
      And it would be better to deny CLONE_NEWUSER && CLONE_SIGHAND as well
      just for consistency.
      
      Kill the "CLONE_NEWUSER | CLONE_NEWPID" check in do_fork() and change
      copy_process() to do the same check along with ->pid_ns check we already
      have.
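The unified rule described above can be written as one predicate. This is a user-space sketch, not the kernel source: the function name and the `pid_ns_unshared` parameter are hypothetical, and the flag constants are redefined with an `SK_` prefix (using the values from the Linux UAPI headers) so the block compiles anywhere.

```c
#include <errno.h>

/* Values as in the Linux UAPI headers; prefixed to avoid name clashes. */
#define SK_CLONE_VM       0x00000100UL
#define SK_CLONE_SIGHAND  0x00000800UL
#define SK_CLONE_PARENT   0x00008000UL
#define SK_CLONE_NEWUSER  0x10000000UL
#define SK_CLONE_NEWPID   0x20000000UL

/*
 * Deny CLONE_SIGHAND or CLONE_PARENT whenever the child ends up in a new
 * user or pid namespace, whether that is requested by this call or results
 * from an earlier unshare (modelled by pid_ns_unshared != 0).
 */
int check_clone_flags_sketch(unsigned long flags, int pid_ns_unshared)
{
    if (flags & (SK_CLONE_SIGHAND | SK_CLONE_PARENT)) {
        if ((flags & (SK_CLONE_NEWUSER | SK_CLONE_NEWPID)) || pid_ns_unshared)
            return -EINVAL;
    }
    return 0;
}
```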
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Colin Walters <walters@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pidns: kill the unnecessary CLONE_NEWPID in copy_process() · 5167246a
      Oleg Nesterov authored
      Commit 8382fcac ("pidns: Outlaw thread creation after
      unshare(CLONE_NEWPID)") nacks CLONE_NEWPID if the forking process
      unshared pid_ns.  This is correct but unnecessary, copy_pid_ns() does
      the same check.
      
      Remove the CLONE_NEWPID check to cleanup the code and prepare for the
      next change.
      
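      Test-case:

      	#define _GNU_SOURCE
      	#include <assert.h>
      	#include <errno.h>
      	#include <sched.h>
      	#include <signal.h>
      	#include <sys/types.h>

      	static int child(void *arg)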
      	{
      		return 0;
      	}
      
      	static char stack[16 * 1024];
      
      	int main(void)
      	{
      		pid_t pid;
      
      		assert(unshare(CLONE_NEWUSER | CLONE_NEWPID) == 0);
      
      		pid = clone(child, stack + sizeof(stack) / 2,
      				CLONE_NEWPID | SIGCHLD, NULL);
      		assert(pid < 0 && errno == EINVAL);
      
      		return 0;
      	}
      
      clone(CLONE_NEWPID) correctly fails with or without this change.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Colin Walters <walters@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • pidns: fix vfork() after unshare(CLONE_NEWPID) · e79f525e
      Oleg Nesterov authored
      Commit 8382fcac ("pidns: Outlaw thread creation after
      unshare(CLONE_NEWPID)") nacks CLONE_VM if the forking process unshared
      pid_ns, this obviously breaks vfork:
      
      	#define _GNU_SOURCE
      	#include <assert.h>
      	#include <sched.h>
      	#include <unistd.h>

      	int main(void)
      	{
      		assert(unshare(CLONE_NEWUSER | CLONE_NEWPID) == 0);
      		assert(vfork() >= 0);
      		_exit(0);
      		return 0;
      	}
      
      fails without this patch.
      
      Change this check to use CLONE_SIGHAND instead.  This also forbids
      CLONE_THREAD automatically, and this is what the comment implies.
      
      We could probably even drop CLONE_SIGHAND and use CLONE_THREAD, but it
      would be safer to not do this.  The current check denies CLONE_SIGHAND
      implicitly and there is no reason to change this.
      
      Eric said "CLONE_SIGHAND is fine.  CLONE_THREAD would be even better.
      Having shared signal handling between two different pid namespaces is
      the case that we are fundamentally guarding against."
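The effect of switching the check from CLONE_VM to CLONE_SIGHAND can be compared side by side. The two predicates below are hypothetical user-space models of the old and new condition, with flag values copied from the Linux UAPI headers under a `VQ_` prefix; vfork() is modelled as CLONE_VFORK | CLONE_VM.

```c
#include <errno.h>

/* Values as in the Linux UAPI headers; prefixed to avoid name clashes. */
#define VQ_CLONE_VM       0x00000100UL
#define VQ_CLONE_SIGHAND  0x00000800UL
#define VQ_CLONE_VFORK    0x00004000UL
#define VQ_CLONE_THREAD   0x00010000UL

/* Old rule: after unshare(CLONE_NEWPID), anything sharing the VM is denied. */
int old_pidns_check_sketch(unsigned long flags, int pid_ns_unshared)
{
    return (pid_ns_unshared && (flags & VQ_CLONE_VM)) ? -EINVAL : 0;
}

/* New rule: only shared signal handling is denied in that situation. */
int new_pidns_check_sketch(unsigned long flags, int pid_ns_unshared)
{
    return (pid_ns_unshared && (flags & VQ_CLONE_SIGHAND)) ? -EINVAL : 0;
}
```

Because thread creation passes CLONE_SIGHAND together with CLONE_THREAD, the new rule still rejects threads while letting vfork() through.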
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reported-by: Colin Walters <walters@redhat.com>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/smp.h:on_each_cpu(): switch back to a C function · 3b8967d7
      Andrew Morton authored
      Revert commit c846ef7d ("include/linux/smp.h:on_each_cpu(): switch
      back to a macro").  It turns out that the problematic linux/irqflags.h
      include was fixed within ia64 and mn10300.
      
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: David Daney <david.daney@cavium.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · e831cbfc
      Linus Torvalds authored
      Pull more s390 updates from Heiko Carstens:
       "This includes one bpf/jit bug fix where the jit compiler could
        sometimes write generated code out of bounds of the allocated memory
        area.
      
        The rest of the patches are only cleanups and minor improvements"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/irq: reduce size of external interrupt handler hash array
        s390/compat,uid16: use current_cred()
        s390/ap_bus: use and-mask instead of a cast
        s390/ftrace: avoid pointer arithmetics with function pointers
        s390: make various functions static, add declarations to header files
        s390/compat signal: add couple of __force annotations
        s390/mm: add __releases()/__acquires() annotations to gmap_alloc_table()
        s390: keep Kconfig sorted
        s390/irq: rework irq subclass handling
        s390/irq: use hlists for external interrupt handler array
        s390/dumpstack: convert print_symbol to %pSR
        s390/perf: Remove print_hex_dump_bytes() debug output
        s390: update defconfig
        s390/bpf,jit: fix address randomization
    • Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 5b419784
      Linus Torvalds authored
      Pull kconfig updates from Michal Marek:
       "This is the kconfig part of kbuild for v3.12-rc1:
         - post-3.11 search code fixes and micro-optimizations
         - CONFIG_MODULES is no longer a special case; this is needed to
           eventually fix the bug that using KCONFIG_ALLCONFIG breaks
           allmodconfig
         - long long is used to store hex and int values
         - make silentoldconfig no longer warns when a symbol changes from
           tristate to bool (it's a job for make oldconfig)
         - scripts/diffconfig updated to work with newer Pythons
         - scripts/config does not rely on GNU sed extensions"
      
      * 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        kconfig: do not allow more than one symbol to have 'option modules'
        kconfig: regenerate bison parser
        kconfig: do not special-case 'MODULES' symbol
        diffconfig: Update script to support python versions 2.5 through 3.3
        diffconfig: Gracefully exit if the default config files are not present
        modules: do not depend on kconfig to set 'modules' option to symbol MODULES
        kconfig: silence warning when parsing auto.conf when a symbol has changed type
        scripts/config: use sed's POSIX interface
        kconfig: switch to "long long" for sanity
        kconfig: simplify symbol-search code
        kconfig: don't allocate n+1 elements in temporary array
        kconfig: minor style fixes in symbol-search code
        kconfig/[mn]conf: shorten title in search-box
        kconfig: avoid multiple calls to strlen
        Documentation/kconfig: more concise and straightforward search explanation
    • Merge tag 'for-v3.12' of git://git.infradead.org/battery-2.6 · a22a0fdb
      Linus Torvalds authored
      Pull battery/power supply driver updates from Anton Vorontsov:
       "New drivers:
      
         - APM X-Gene system reboot driver by Feng Kan and Loc Ho (APM).
      
         - Qualcomm MSM reboot/poweroff driver by Abhimanyu Kapur (Codeaurora).
      
         - Texas Instruments BQ24190 charger driver by Mark A.  Greer (Animal
           Creek Technologies).
      
         - Texas Instruments TWL4030 MADC battery driver by Lukas Märdian and
           Marek Belisko (Golden Delicious Computers).  The driver is used on
           Freerunner GTA04 phones.
      
        Highlighted fixes and improvements:
      
         - Suspend/wakeup logic improvements: power supply objects will block
           system suspend until all power supply events are processed.  Thanks
           to Zoran Markovic (Linaro), Arve Hjonnevag and Todd Poynor (Google)"
      
      * tag 'for-v3.12' of git://git.infradead.org/battery-2.6:
        rx51_battery: Fix channel number when reading adc value
        power: Add twl4030_madc battery driver.
        bq24190_charger: Workaround SS definition problem on i386 builds
        power_supply: Prevent suspend until power supply events are processed
        vexpress-poweroff: Should depend on the required infrastructure
        twl4030-charger: Fix compiler warning with regulator_enable()
        rx51_battery: Replace hardcoded channels values.
        bq24190_charger: Add support for TI BQ24190 Battery Charger
        ab8500-charger: We print an unintended error message
        max8925_power: Fix missing of_node_put
        power_supply: Replace strict_strtol() with kstrtol()
        power: Add APM X-Gene system reboot driver
        power_supply: tosa_battery: Get rid of irq_to_gpio usage
        power supply: collie_battery: Convert to use dev_pm_ops
        power_supply: Make goldfish_battery depend on GOLDFISH || COMPILE_TEST
        power: reset: Add msm restart support
        MAINTAINERS: drivers/power: add entry for SmartReflex AVS drivers
      a22a0fdb
    • Linus Torvalds's avatar
      Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · bf83e614
      Linus Torvalds authored
      Pull powerpc fixes from Ben Herrenschmidt:
       "Here are a handful of small powerpc fixes.
      
        A couple of section mismatches (always worth fixing), a missing export
        of a new symbol causing build failures of modules, a page fault
        deadlock fix (interestingly that bug has been around for a LONG time,
        though it seems to be more easily triggered by KVM) and fixing pseries
        default idle loop in the absence of the cpuidle drivers (such as
        during boot)"
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc: Default arch idle could cede processor on pseries
        fbdev/ps3fb: Fix section mismatch warning for ps3fb_probe
        powerpc: Fix section mismatch warning for prom_rtas_call
        powerpc: Fix possible deadlock on page fault
        powerpc: Export cpu_to_chip_id() to fix build error
      bf83e614
    • Linus Torvalds's avatar
      Merge tag 'stable/for-linus-3.12-rc0-tag-two' of... · a60d4b98
      Linus Torvalds authored
      Merge tag 'stable/for-linus-3.12-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
      
      Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
       "This pull I usually do after rc1 is out but because we have a nice
        amount of fixes, some bootup related fixes for ARM, and it is early in
        the cycle we figured to do it now to help with tracking of potential
        regressions.
      
        The simple ones are the ARM ones - one of the patches fell through the
        cracks, the other fixes a bootup issue (unconditionally using Xen
        functions).  Then a fix for a regression causing preempt count being
        off (patch causing this went in v3.12).
      
        Lastly are the fixes to make Xen PVHVM guests use PV ticketlocks (Xen
        PV already does).
      
        The enablement of that was supposed to be part of the x86 spinlock
        merge in commit 816434ec ("The biggest change here are
        paravirtualized ticket spinlocks (PV spinlocks), which bring a nice
        speedup on various benchmarks...") but unfortunately it would cause a
        hang when booting Xen PVHVM guests.  Yours truly got all of the bugs
        fixed last week and they (six of them) are included in this pull.
      
        Bug-fixes:
         - Boot on ARM without using Xen unconditionally
         - On Xen ARM don't run cpuidle/cpufreq
         - Fix regression in balloon driver, preempt count warnings
         - Fixes to make PVHVM able to use pv ticketlock.
         - Revert Xen PVHVM disabling pv ticketlock (aka, re-enable pv ticketlocks)"
      
      * tag 'stable/for-linus-3.12-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/spinlock: Don't use __initdate for xen_pv_spin
        Revert "xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM"
        xen/spinlock: Don't setup xen spinlock IPI kicker if disabled.
        xen/smp: Update pv_lock_ops functions before alternative code starts under PVHVM
        xen/spinlock: We don't need the old structure anymore
        xen/spinlock: Fix locking path engaging too soon under PVHVM.
        xen/arm: disable cpuidle and cpufreq when linux is running as dom0
        xen/p2m: Don't call get_balloon_scratch_page() twice, keep interrupts disabled for multicalls
        ARM: xen: only set pm function ptrs for Xen guests
      a60d4b98
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · fa1586a7
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Daniel had some fixes queued up, that were delayed, the stolen memory
        ones and vga arbiter ones are quite useful, along with his usual bunch
        of stuff, nothing for HSW outputs yet.
      
        The one nouveau fix is for a regression I caused with the poweroff stuff"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (30 commits)
        drm/nouveau: fix oops on runtime suspend/resume
        drm/i915: Delay disabling of VGA memory until vgacon->fbcon handoff is done
        drm/i915: try not to lose backlight CBLV precision
        drm/i915: Confine page flips to BCS on Valleyview
        drm/i915: Skip stolen region initialisation if none is reserved
        drm/i915: fix gpu hang vs. flip stall deadlocks
        drm/i915: Hold an object reference whilst we shrink it
        drm/i915: fix i9xx_crtc_clock_get for multiplied pixels
        drm/i915: handle sdvo input pixel multiplier correctly again
        drm/i915: fix hpd work vs. flush_work in the pageflip code deadlock
        drm/i915: fix up the relocate_entry refactoring
        drm/i915: Fix pipe config warnings when dealing with LVDS fixed mode
        drm/i915: Don't call sg_free_table() if sg_alloc_table() fails
        i915: Update VGA arbiter support for newer devices
        vgaarb: Fix VGA decodes changes
        vgaarb: Don't disable resources that are not owned
        drm/i915: Pin pages whilst mapping the dma-buf
        drm/i915: enable trickle feed on Haswell
        x86: add early quirk for reserving Intel graphics stolen memory v5
        drm/i915: split PCI IDs out into i915_drm.h v4
        ...
      fa1586a7
    • Linus Torvalds's avatar
      Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux · cf596766
      Linus Torvalds authored
      Pull nfsd updates from Bruce Fields:
       "This was a very quiet cycle! Just a few bugfixes and some cleanup"
      
      * 'nfsd-next' of git://linux-nfs.org/~bfields/linux:
        rpc: let xdr layer allocate gssproxy receieve pages
        rpc: fix huge kmalloc's in gss-proxy
        rpc: comment on linux_cred encoding, treat all as unsigned
        rpc: clean up decoding of gssproxy linux creds
        svcrpc: remove unused rq_resused
        nfsd4: nfsd4_create_clid_dir prints uninitialized data
        nfsd4: fix leak of inode reference on delegation failure
        Revert "nfsd: nfs4_file_get_access: need to be more careful with O_RDWR"
        sunrpc: prepare NFS for 2038
        nfsd4: fix setlease error return
        nfsd: nfs4_file_get_access: need to be more careful with O_RDWR
      cf596766
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 516f7b3f
      Linus Torvalds authored
      Pull hwmon cleanups from Guenter Roeck:
       "Minor cleanup in ina2xx and hwmon-vid drivers; no functional changes"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (ina2xx) Remove casting the return value which is a void pointer
        hwmon: (hwmon-vid) Add __maybe_unused attribute to dummy variable
      516f7b3f
    • Linus Torvalds's avatar
      Merge branch 'x86/jumplabel' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 442e0973
      Linus Torvalds authored
      Pull x86 jumplabel changes from Peter Anvin:
       "One more x86 tree for this merge window.  This tree improves the
        handling of jump labels, so that most of the time we don't have to do
        a massive initial patching run.
      
        Furthermore, we will error out if the jump label is not what is
        expected, e.g. if it has been corrupted or tampered with"
      
      * 'x86/jumplabel' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/jump-label: Show where and what was wrong on errors
        x86/jump-label: Add safety checks to jump label conversions
        x86/jump-label: Do not bother updating nops if they are correct
        x86/jump-label: Use best default nops for inital jump label calls
      442e0973
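The safety check described in this pull can be pictured with a small user-space sketch. It is not the kernel's code: `patch_site`, the `patch_result` codes and `SITE_LEN` are hypothetical names chosen for illustration. The idea is to compare the bytes currently at a patch site with what the patcher expects, to refuse to write when they differ, and to leave alone a site that already holds the wanted bytes.

```c
#include <stddef.h>
#include <string.h>

#define SITE_LEN 5  /* length of a 5-byte x86 NOP or a JMP rel32 */

/* Hypothetical result codes for this sketch. */
enum patch_result {
    PATCH_DONE = 0,      /* site was rewritten */
    PATCH_SKIPPED = 1,   /* site already held the wanted bytes */
    PATCH_BAD_SITE = -1  /* site held unexpected bytes; nothing written */
};

/*
 * Rewrite `site` from `expected` to `wanted`, each SITE_LEN bytes long.
 * The site is modified only when its current contents match `expected`.
 */
static enum patch_result patch_site(unsigned char *site,
                                    const unsigned char *expected,
                                    const unsigned char *wanted)
{
    if (memcmp(site, wanted, SITE_LEN) == 0)
        return PATCH_SKIPPED;
    if (memcmp(site, expected, SITE_LEN) != 0)
        return PATCH_BAD_SITE;
    memcpy(site, wanted, SITE_LEN);
    return PATCH_DONE;
}
```

A caller would pass, for example, the 5-byte NOP `0f 1f 44 00 00` as `expected` and a `e9`-prefixed relative jump as `wanted`. A `PATCH_BAD_SITE` result is the case the pull message describes as corrupted or tampered with.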
    • Vaidyanathan Srinivasan's avatar
      powerpc: Default arch idle could cede processor on pseries · 363edbe2
      Vaidyanathan Srinivasan authored
      When adding cpuidle support to pSeries, we introduced two
      regressions:
      
        - The new cpuidle backend driver only works under hypervisors
          supporting the "SPLPAR" option, which isn't the case of the
          old POWER4 hypervisor and the HV "light" used on js2x blades
      
        - The cpuidle driver registers fairly late, meaning that for
          a significant portion of the boot process, we end up having
          all threads spinning. This slows down the boot process and
          increases the overall resource usage if the hypervisor has
          shared processors.
      
      This fixes both by implementing a "default" idle that will cede
      to the hypervisor when possible, in a very simple way without
      all the bells and whistles of cpuidle.
      Reported-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org>
      363edbe2
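The fallback order this commit message describes can be sketched as a small decision function. This is an illustration only, not the pseries code: `pick_idle_action`, the `idle_action` values and the two boolean inputs are hypothetical, and the real conditions for when ceding is possible are not spelled out in the message.

```c
#include <stdbool.h>

/* Hypothetical outcomes for an idle CPU in this sketch. */
enum idle_action {
    IDLE_USE_CPUIDLE,  /* a cpuidle driver is registered and usable */
    IDLE_CEDE,         /* default idle: hand the CPU back to the hypervisor */
    IDLE_SPIN          /* last resort: busy-wait */
};

/*
 * Choose what an idle CPU should do.
 * cpuidle_ready: assumed flag, true once a cpuidle driver has registered.
 * can_cede:      assumed flag, true when the hypervisor accepts a cede.
 */
static enum idle_action pick_idle_action(bool cpuidle_ready, bool can_cede)
{
    if (cpuidle_ready)
        return IDLE_USE_CPUIDLE;
    if (can_cede)
        return IDLE_CEDE;
    return IDLE_SPIN;
}
```

Before the fix, the situation with no cpuidle driver (early boot, or a hypervisor the driver does not support) would correspond to the spinning branch even when ceding was available.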
    • Vladimir Murzin's avatar
      fbdev/ps3fb: Fix section mismatch warning for ps3fb_probe · 88c2d0b6
      Vladimir Murzin authored
      While cross-building for PPC64 I've got
      
      WARNING: drivers/video/built-in.o(.text+0x9f9ca): Section mismatch in
      reference from the function .ps3fb_probe() to the variable
      .init.data:ps3fb_fix The function .ps3fb_probe() references the
      variable __initdata ps3fb_fix.  This is often because .ps3fb_probe
      lacks a __initdata annotation or the annotation of ps3fb_fix is wrong.
      
      WARNING: drivers/video/built-in.o(.text+0x9f9d2): Section mismatch in
      reference from the function .ps3fb_probe() to the variable
      .init.data:ps3fb_fix The function .ps3fb_probe() references the
      variable __initdata ps3fb_fix.  This is often because .ps3fb_probe
      lacks a __initdata annotation or the annotation of ps3fb_fix is wrong.
      
      WARNING: drivers/built-in.o(.text+0xe222a): Section mismatch in
      reference from the function .ps3fb_probe() to the variable
      .init.data:ps3fb_fix The function .ps3fb_probe() references the
      variable __initdata ps3fb_fix.  This is often because .ps3fb_probe
      lacks a __initdata annotation or the annotation of ps3fb_fix is wrong.
      
      WARNING: drivers/built-in.o(.text+0xe2232): Section mismatch in
      reference from the function .ps3fb_probe() to the variable
      .init.data:ps3fb_fix The function .ps3fb_probe() references the
      variable __initdata ps3fb_fix.  This is often because .ps3fb_probe
      lacks a __initdata annotation or the annotation of ps3fb_fix is wrong.
      
      WARNING: vmlinux.o(.text+0x561d4a): Section mismatch in reference from
      the function .ps3fb_probe() to the variable .init.data:ps3fb_fix The
      function .ps3fb_probe() references the variable __initdata ps3fb_fix.
      This is often because .ps3fb_probe lacks a __initdata annotation or
      the annotation of ps3fb_fix is wrong.
      
      Mismatch was introduced with 48c68c4f "Drivers: video: remove __dev*
      attributes."
      
      Remove __initdata annotation from ps3fb_fix.
      Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      88c2d0b6
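Why such a mismatch matters can be shown with a user-space model. In the kernel, memory holding `__initdata` objects is released once boot-time initialisation finishes, so code that can run later must not refer to it. The sketch below imitates that with a heap buffer that is freed after "init". Every name in it (`demo_fix`, `boot_copy`, `demo_probe` and so on) is hypothetical, and none of it is taken from the ps3fb driver.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a driver's fixed screen info. */
struct demo_fix {
    char id[16];
};

/* Models an object in .init.data: valid only until init memory is freed. */
static struct demo_fix *boot_copy;

/* Models an object in ordinary .data: valid for the program's lifetime. */
static struct demo_fix persistent_fix = { "demo-fb" };

/* Models boot-time setup that fills the init-only object. */
static int boot_setup(void)
{
    boot_copy = malloc(sizeof(*boot_copy));
    if (!boot_copy)
        return -1;
    strcpy(boot_copy->id, "demo-fb");
    return 0;
}

/* Models the kernel discarding init memory after boot. */
static void free_init_memory(void)
{
    free(boot_copy);
    boot_copy = NULL;
}

/*
 * A probe routine may run after init memory is gone,
 * so it reads only the persistent object.
 */
static const char *demo_probe(void)
{
    return persistent_fix.id;
}
```

Had `demo_probe()` returned `boot_copy->id` instead, calling it after `free_init_memory()` would dereference memory that no longer exists, which is the kind of bug the section mismatch warning is meant to catch at build time.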
    • Vladimir Murzin's avatar
      powerpc: Fix section mismatch warning for prom_rtas_call · 620e5050
      Vladimir Murzin authored
      While cross-building for PPC64 I've got
      
      WARNING: vmlinux.o(.text.unlikely+0x1ba): Section mismatch in
      reference from the function .prom_rtas_call() to the variable
      .init.data:dt_string_start The function .prom_rtas_call() references
      the variable __initdata dt_string_start.  This is often because
      .prom_rtas_call lacks a __initdata annotation or the annotation of
      dt_string_start is wrong.
      
      WARNING: vmlinux.o(.meminit.text+0xeb0): Section mismatch in reference
      from the function .free_area_init_core.isra.47() to the function
      .init.text:.set_pageblock_order() The function __meminit
      .free_area_init_core.isra.47() references a function __init
      .set_pageblock_order().  If .set_pageblock_order is only used by
      .free_area_init_core.isra.47 then annotate .set_pageblock_order with a
      matching annotation.
      
      Fix it by proper annotation of prom_rtas_call.
      Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      620e5050
    • Aneesh Kumar K.V's avatar
      powerpc: Fix possible deadlock on page fault · 69e044dd
      Aneesh Kumar K.V authored
       stack_grow_into/14082 is trying to acquire lock:
        (&mm->mmap_sem){++++++}, at: [<c000000000206d28>] .might_fault+0x78/0xe0
      
       but task is already holding lock:
        (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&mm->mmap_sem);
         lock(&mm->mmap_sem);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by stack_grow_into/14082:
        #0:  (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910
      
       stack backtrace:
       CPU: 21 PID: 14082 Comm: stack_grow_into Not tainted 3.10.0-10.el7.ppc64.debug #1
       Call Trace:
       [c0000003d396b850] [c000000000016e7c] .show_stack+0x7c/0x1f0 (unreliable)
       [c0000003d396b920] [c000000000813fc8] .dump_stack+0x28/0x3c
       [c0000003d396b990] [c000000000124b90] .__lock_acquire+0x1640/0x1800
       [c0000003d396bab0] [c00000000012570c] .lock_acquire+0xac/0x250
       [c0000003d396bb80] [c000000000206d54] .might_fault+0xa4/0xe0
       [c0000003d396bbf0] [c0000000007ffe2c] .do_page_fault+0x2ec/0x910
       [c0000003d396be30] [c0000000000092e8] handle_page_fault+0x10/0x30
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      69e044dd
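The "possible unsafe locking scenario" in the report above is the same lock class being taken twice on one CPU. A much-simplified sketch of that check follows. It is not how lockdep is implemented: `held_locks`, `track_acquire` and `track_release` are hypothetical names, and real lockdep tracks far more state, including nesting annotations.

```c
#include <stddef.h>

#define MAX_HELD 8

/* Per-task list of lock classes currently held (hypothetical structure). */
struct held_locks {
    const void *cls[MAX_HELD];
    size_t depth;
};

/*
 * Record that the task is acquiring a lock of class `lock_class`.
 * Returns 0 on success, -1 if a lock of that class is already held
 * (possible recursive locking), or -2 if the table is full.
 */
static int track_acquire(struct held_locks *h, const void *lock_class)
{
    for (size_t i = 0; i < h->depth; i++) {
        if (h->cls[i] == lock_class)
            return -1;
    }
    if (h->depth == MAX_HELD)
        return -2;
    h->cls[h->depth++] = lock_class;
    return 0;
}

/* Forget a held lock class; does nothing if it was not recorded. */
static void track_release(struct held_locks *h, const void *lock_class)
{
    for (size_t i = 0; i < h->depth; i++) {
        if (h->cls[i] == lock_class) {
            h->cls[i] = h->cls[--h->depth];
            return;
        }
    }
}
```

In terms of the trace, the first `track_acquire()` corresponds to `.do_page_fault` taking `mmap_sem`, and the second, failing call corresponds to `.might_fault` being reached while that lock is still held.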