1. 21 Mar, 2016 40 commits
    • bingtian.ly@taobao.com's avatar
      net: avoid to hang up on sending due to sysctl configuration overflow. · d865a115
      bingtian.ly@taobao.com authored
      commit cdda8891 upstream.
      
          I found if we write a larger than 4GB value to some sysctl
      variables, the sending syscall will hang up forever, because these
      variables are 32 bits, such large values make them overflow to 0 or
      negative.
      
          This patch try to fix overflow or prevent from zero value setup
      of below sysctl variables:
      
      net.core.wmem_default
      net.core.rmem_default
      
      net.core.rmem_max
      net.core.wmem_max
      
      net.ipv4.udp_rmem_min
      net.ipv4.udp_wmem_min
      
      net.ipv4.tcp_wmem
      net.ipv4.tcp_rmem
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarLi Yu <raise.sail@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      d865a115
    • Linus Torvalds's avatar
      Initialize msg/shm IPC objects before doing ipc_addid() · 3cda8eb0
      Linus Torvalds authored
      commit b9a53227 upstream.
      
      As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before
      having initialized the IPC object state.  Yes, we initialize the IPC
      object in a locked state, but with all the lockless RCU lookup work,
      that IPC object lock no longer means that the state cannot be seen.
      
      We already did this for the IPC semaphore code (see commit e8577d1f:
      "ipc/sem.c: fully initialize sem_array before making it visible") but we
      clearly forgot about msg and shm.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      3cda8eb0
    • Al Viro's avatar
      get rid of s_files and files_lock · f074c267
      Al Viro authored
      commit eee5cc27 upstream.
      
      The only thing we need it for is alt-sysrq-r (emergency remount r/o)
      and these days we can do just as well without going through the
      list of files.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      f074c267
    • Paolo Bonzini's avatar
      KVM: svm: unconditionally intercept #DB · 8f452aa3
      Paolo Bonzini authored
      commit cbdb967a upstream.
      
      This is needed to avoid the possibility that the guest triggers
      an infinite stream of #DB exceptions (CVE-2015-8104).
      
      VMX is not affected: because it does not save DR6 in the VMCS,
      it already intercepts #DB unconditionally.
      Reported-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [bwh: Backported to 3.2, with thanks to Paolo:
       - update_db_bp_intercept() was called update_db_intercept()
       - The remaining call is in svm_guest_debug() rather than through svm_x86_ops]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      8f452aa3
    • Eric Northup's avatar
      KVM: x86: work around infinite loop in microcode when #AC is delivered · e1b6c3c9
      Eric Northup authored
      commit 54a20552 upstream.
      
      It was found that a guest can DoS a host by triggering an infinite
      stream of "alignment check" (#AC) exceptions.  This causes the
      microcode to enter an infinite loop where the core never receives
      another interrupt.  The host kernel panics pretty quickly due to the
      effects (CVE-2015-5307).
      Signed-off-by: default avatarEric Northup <digitaleric@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [lizf: Backported to 3.4:
       - adjust filename
       - adjust context
       - add definition of AC_VECTOR]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      e1b6c3c9
    • Marcelo Leitner's avatar
      ipv6: addrconf: validate new MTU before applying it · 59c8392f
      Marcelo Leitner authored
      commit 77751427 upstream.
      
      Currently we don't check if the new MTU is valid or not and this allows
      one to configure a smaller than minimum allowed by RFCs or even bigger
      than interface own MTU, which is a problem as it may lead to packet
      drops.
      
      If you have a daemon like NetworkManager running, this may be exploited
      by remote attackers by forging RA packets with an invalid MTU, possibly
      leading to a DoS. (NetworkManager currently only validates for values
      too small, but not for too big ones.)
      
      The fix is just to make sure the new value is valid. That is, between
      IPV6_MIN_MTU and interface's MTU.
      
      Note that similar check is already performed at
      ndisc_router_discovery(), for when kernel itself parses the RA.
      Signed-off-by: default avatarMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      59c8392f
    • David Howells's avatar
      KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring · 9793b7bc
      David Howells authored
      commit f05819df upstream.
      
      The following sequence of commands:
      
          i=`keyctl add user a a @s`
          keyctl request2 keyring foo bar @t
          keyctl unlink $i @s
      
      tries to invoke an upcall to instantiate a keyring if one doesn't already
      exist by that name within the user's keyring set.  However, if the upcall
      fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
      other error code.  When the key is garbage collected, the key destroy
      function is called unconditionally and keyring_destroy() uses list_empty()
      on keyring->type_data.link - which is in a union with reject_error.
      Subsequently, the kernel tries to unlink the keyring from the keyring names
      list - which oopses like this:
      
      	BUG: unable to handle kernel paging request at 00000000ffffff8a
      	IP: [<ffffffff8126e051>] keyring_destroy+0x3d/0x88
      	...
      	Workqueue: events key_garbage_collector
      	...
      	RIP: 0010:[<ffffffff8126e051>] keyring_destroy+0x3d/0x88
      	RSP: 0018:ffff88003e2f3d30  EFLAGS: 00010203
      	RAX: 00000000ffffff82 RBX: ffff88003bf1a900 RCX: 0000000000000000
      	RDX: 0000000000000000 RSI: 000000003bfc6901 RDI: ffffffff81a73a40
      	RBP: ffff88003e2f3d38 R08: 0000000000000152 R09: 0000000000000000
      	R10: ffff88003e2f3c18 R11: 000000000000865b R12: ffff88003bf1a900
      	R13: 0000000000000000 R14: ffff88003bf1a908 R15: ffff88003e2f4000
      	...
      	CR2: 00000000ffffff8a CR3: 000000003e3ec000 CR4: 00000000000006f0
      	...
      	Call Trace:
      	 [<ffffffff8126c756>] key_gc_unused_keys.constprop.1+0x5d/0x10f
      	 [<ffffffff8126ca71>] key_garbage_collector+0x1fa/0x351
      	 [<ffffffff8105ec9b>] process_one_work+0x28e/0x547
      	 [<ffffffff8105fd17>] worker_thread+0x26e/0x361
      	 [<ffffffff8105faa9>] ? rescuer_thread+0x2a8/0x2a8
      	 [<ffffffff810648ad>] kthread+0xf3/0xfb
      	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
      	 [<ffffffff815f2ccf>] ret_from_fork+0x3f/0x70
      	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
      
      Note the value in RAX.  This is a 32-bit representation of -ENOKEY.
      
      The solution is to only call ->destroy() if the key was successfully
      instantiated.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      [lizf: Backported to 3.4: adjust indentation]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      9793b7bc
    • David Howells's avatar
      KEYS: Fix race between key destruction and finding a keyring by name · 48ec02cc
      David Howells authored
      commit 94c4554b upstream.
      
      There appears to be a race between:
      
       (1) key_gc_unused_keys() which frees key->security and then calls
           keyring_destroy() to unlink the name from the name list
      
       (2) find_keyring_by_name() which calls key_permission(), thus accessing
           key->security, on a key before checking to see whether the key usage is 0
           (ie. the key is dead and might be cleaned up).
      
      Fix this by calling ->destroy() before cleaning up the core key data -
      including key->security.
      Reported-by: default avatarPetr Matousek <pmatouse@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      [lizf: Backported to 3.4: adjust indentation]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      48ec02cc
    • Johan Hovold's avatar
      USB: whiteheat: fix potential null-deref at probe · 04d6387f
      Johan Hovold authored
      commit cbb4be65 upstream.
      
      Fix potential null-pointer dereference at probe by making sure that the
      required endpoints are present.
      
      The whiteheat driver assumes there are at least five pairs of bulk
      endpoints, of which the final pair is used for the "command port". An
      attempt to bind to an interface with fewer bulk endpoints would
      currently lead to an oops.
      
      Fixes CVE-2015-5257.
      Reported-by: default avatarMoein Ghasemzadeh <moein@istuary.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      04d6387f
    • Ben Hutchings's avatar
      ppp, slip: Validate VJ compression slot parameters completely · a1c3860d
      Ben Hutchings authored
      commit 4ab42d78 upstream.
      
      Currently slhc_init() treats out-of-range values of rslots and tslots
      as equivalent to 0, except that if tslots is too large it will
      dereference a null pointer (CVE-2015-7799).
      
      Add a range-check at the top of the function and make it return an
      ERR_PTR() on error instead of NULL.  Change the callers accordingly.
      
      Compile-tested only.
      Reported-by: default avatar郭永刚 <guoyonggang@360.cn>
      References: http://article.gmane.org/gmane.comp.security.oss.general/17908Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      a1c3860d
    • Ben Hutchings's avatar
      isdn_ppp: Add checks for allocation failure in isdn_ppp_open() · 76e4831f
      Ben Hutchings authored
      commit 0baa57d8 upstream.
      
      Compile-tested only.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      76e4831f
    • Jason Wang's avatar
      virtio-net: drop NETIF_F_FRAGLIST · 35464982
      Jason Wang authored
      commit 48900cb6 upstream.
      
      virtio declares support for NETIF_F_FRAGLIST, but assumes
      that there are at most MAX_SKB_FRAGS + 2 fragments which isn't
      always true with a fraglist.
      
      A longer fraglist in the skb will make the call to skb_to_sgvec overflow
      the sg array, leading to memory corruption.
      
      Drop NETIF_F_FRAGLIST so we only get what we can handle.
      
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      35464982
    • Al Viro's avatar
      sg_start_req(): make sure that there's not too many elements in iovec · 26aa430b
      Al Viro authored
      commit 451a2886 upstream.
      
      unfortunately, allowing an arbitrary 16bit value means a possibility of
      overflow in the calculation of total number of pages in bio_map_user_iov() -
      we rely on there being no more than PAGE_SIZE members of sum in the
      first loop there.  If that sum wraps around, we end up allocating
      too small array of pointers to pages and it's easy to overflow it in
      the second loop.
      
      X-Coverup: TINC (and there's no lumber cartel either)
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      [lizf: Backported to 3.4: s/MAX_UIOVEC/UIO_MAXIOV]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      26aa430b
    • Quentin Casasnovas's avatar
      RDS: fix race condition when sending a message on unbound socket · 4ee85daf
      Quentin Casasnovas authored
      commit 8c7188b2 upstream.
      
      Sasha's found a NULL pointer dereference in the RDS connection code when
      sending a message to an apparently unbound socket.  The problem is caused
      by the code checking if the socket is bound in rds_sendmsg(), which checks
      the rs_bound_addr field without taking a lock on the socket.  This opens a
      race where rs_bound_addr is temporarily set but where the transport is not
      in rds_bind(), leading to a NULL pointer dereference when trying to
      dereference 'trans' in __rds_conn_create().
      
      Vegard wrote a reproducer for this issue, so kindly ask him to share if
      you're interested.
      
      I cannot reproduce the NULL pointer dereference using Vegard's reproducer
      with this patch, whereas I could without.
      
      Complete earlier incomplete fix to CVE-2015-6937:
      
        74e98eb0 ("RDS: verify the underlying transport exists before creating a connection")
      
      Cc: David S. Miller <davem@davemloft.net>
      Reviewed-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Reviewed-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarQuentin Casasnovas <quentin.casasnovas@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      4ee85daf
    • Sasha Levin's avatar
      RDS: verify the underlying transport exists before creating a connection · 473d7207
      Sasha Levin authored
      commit 74e98eb0 upstream.
      
      There was no verification that an underlying transport exists when creating
      a connection, this would cause dereferencing a NULL ptr.
      
      It might happen on sockets that weren't properly bound before attempting to
      send a message, which will cause a NULL ptr deref:
      
      [135546.047719] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
      [135546.051270] Modules linked in:
      [135546.051781] CPU: 4 PID: 15650 Comm: trinity-c4 Not tainted 4.2.0-next-20150902-sasha-00041-gbaa1222-dirty #2527
      [135546.053217] task: ffff8800835bc000 ti: ffff8800bc708000 task.ti: ffff8800bc708000
      [135546.054291] RIP: __rds_conn_create (net/rds/connection.c:194)
      [135546.055666] RSP: 0018:ffff8800bc70fab0  EFLAGS: 00010202
      [135546.056457] RAX: dffffc0000000000 RBX: 0000000000000f2c RCX: ffff8800835bc000
      [135546.057494] RDX: 0000000000000007 RSI: ffff8800835bccd8 RDI: 0000000000000038
      [135546.058530] RBP: ffff8800bc70fb18 R08: 0000000000000001 R09: 0000000000000000
      [135546.059556] R10: ffffed014d7a3a23 R11: ffffed014d7a3a21 R12: 0000000000000000
      [135546.060614] R13: 0000000000000001 R14: ffff8801ec3d0000 R15: 0000000000000000
      [135546.061668] FS:  00007faad4ffb700(0000) GS:ffff880252000000(0000) knlGS:0000000000000000
      [135546.062836] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [135546.063682] CR2: 000000000000846a CR3: 000000009d137000 CR4: 00000000000006a0
      [135546.064723] Stack:
      [135546.065048]  ffffffffafe2055c ffffffffafe23fc1 ffffed00493097bf ffff8801ec3d0008
      [135546.066247]  0000000000000000 00000000000000d0 0000000000000000 ac194a24c0586342
      [135546.067438]  1ffff100178e1f78 ffff880320581b00 ffff8800bc70fdd0 ffff880320581b00
      [135546.068629] Call Trace:
      [135546.069028] ? __rds_conn_create (include/linux/rcupdate.h:856 net/rds/connection.c:134)
      [135546.069989] ? rds_message_copy_from_user (net/rds/message.c:298)
      [135546.071021] rds_conn_create_outgoing (net/rds/connection.c:278)
      [135546.071981] rds_sendmsg (net/rds/send.c:1058)
      [135546.072858] ? perf_trace_lock (include/trace/events/lock.h:38)
      [135546.073744] ? lockdep_init (kernel/locking/lockdep.c:3298)
      [135546.074577] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.075508] ? __might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3795)
      [135546.076349] ? __might_fault (mm/memory.c:3795)
      [135546.077179] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.078114] sock_sendmsg (net/socket.c:611 net/socket.c:620)
      [135546.078856] SYSC_sendto (net/socket.c:1657)
      [135546.079596] ? SYSC_connect (net/socket.c:1628)
      [135546.080510] ? trace_dump_stack (kernel/trace/trace.c:1926)
      [135546.081397] ? ring_buffer_unlock_commit (kernel/trace/ring_buffer.c:2479 kernel/trace/ring_buffer.c:2558 kernel/trace/ring_buffer.c:2674)
      [135546.082390] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.083410] ? trace_event_raw_event_sys_enter (include/trace/events/syscalls.h:16)
      [135546.084481] ? do_audit_syscall_entry (include/trace/events/syscalls.h:16)
      [135546.085438] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.085515] rds_ib_laddr_check(): addr 36.74.25.172 ret -99 node type -1
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      473d7207
    • Hannes Frederic Sowa's avatar
      net: add validation for the socket syscall protocol argument · 39f79797
      Hannes Frederic Sowa authored
      commit 79462ad0 upstream.
      
      郭永刚 reported that one could simply crash the kernel as root by
      using a simple program:
      
      	int socket_fd;
      	struct sockaddr_in addr;
      	addr.sin_port = 0;
      	addr.sin_addr.s_addr = INADDR_ANY;
      	addr.sin_family = 10;
      
      	socket_fd = socket(10,3,0x40000000);
      	connect(socket_fd , &addr,16);
      
      AF_INET, AF_INET6 sockets actually only support 8-bit protocol
      identifiers. inet_sock's skc_protocol field thus is sized accordingly,
      thus larger protocol identifiers simply cut off the higher bits and
      store a zero in the protocol fields.
      
      This could lead to e.g. NULL function pointer because as a result of
      the cut off inet_num is zero and we call down to inet_autobind, which
      is NULL for raw sockets.
      
      kernel: Call Trace:
      kernel:  [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
      kernel:  [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
      kernel:  [<ffffffff81645069>] SYSC_connect+0xd9/0x110
      kernel:  [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
      kernel:  [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
      kernel:  [<ffffffff81645e0e>] SyS_connect+0xe/0x10
      kernel:  [<ffffffff81779515>] tracesys_phase2+0x84/0x89
      
      I found no particular commit which introduced this problem.
      
      CVE: CVE-2015-8543
      Cc: Cong Wang <cwang@twopensource.com>
      Reported-by: default avatar郭永刚 <guoyonggang@360.cn>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [lizf: Backported to 3.4: open-code U8_MAX]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      39f79797
    • WANG Cong's avatar
      0cf0ae36
    • Rainer Weikusat's avatar
      unix: avoid use-after-free in ep_remove_wait_queue · ec54d5ae
      Rainer Weikusat authored
      commit 7d267278 upstream.
      
      Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
      An AF_UNIX datagram socket being the client in an n:1 association with
      some server socket is only allowed to send messages to the server if the
      receive queue of this socket contains at most sk_max_ack_backlog
      datagrams. This implies that prospective writers might be forced to go
      to sleep despite none of the message presently enqueued on the server
      receive queue were sent by them. In order to ensure that these will be
      woken up once space becomes again available, the present unix_dgram_poll
      routine does a second sock_poll_wait call with the peer_wait wait queue
      of the server socket as queue argument (unix_dgram_recvmsg does a wake
      up on this queue after a datagram was received). This is inherently
      problematic because the server socket is only guaranteed to remain alive
      for as long as the client still holds a reference to it. In case the
      connection is dissolved via connect or by the dead peer detection logic
      in unix_dgram_sendmsg, the server socket may be freed despite "the
      polling mechanism" (in particular, epoll) still has a pointer to the
      corresponding peer_wait queue. There's no way to forcibly deregister a
      wait queue with epoll.
      
      Based on an idea by Jason Baron, the patch below changes the code such
      that a wait_queue_t belonging to the client socket is enqueued on the
      peer_wait queue of the server whenever the peer receive queue full
      condition is detected by either a sendmsg or a poll. A wake up on the
      peer queue is then relayed to the ordinary wait queue of the client
      socket via wake function. The connection to the peer wait queue is again
      dissolved if either a wake up is about to be relayed or the client
      socket reconnects or a dead peer is detected or the client socket is
      itself closed. This enables removing the second sock_poll_wait from
      unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
      that no blocked writer sleeps forever.
      Signed-off-by: default avatarRainer Weikusat <rweikusat@mobileactivedefense.com>
      Fixes: ec0d215f ("af_unix: fix 'poll for write'/connected DGRAM sockets")
      Reviewed-by: default avatarJason Baron <jbaron@akamai.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      ec54d5ae
    • Zefan Li's avatar
      Revert "usb: dwc3: Reset the transfer resource index on SET_INTERFACE" · 44b2ffc5
      Zefan Li authored
      It was applied to the wrong function.
      
      This reverts commit 15488de7.
      44b2ffc5
    • lucien's avatar
      sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state · 1f2a65a2
      lucien authored
      commit f648f807 upstream.
      
      Commit f8d96052 ("sctp: Enforce retransmission limit during shutdown")
      fixed a problem with excessive retransmissions in the SHUTDOWN_PENDING by not
      resetting the association overall_error_count.  This allowed the association
      to better enforce assoc.max_retrans limit.
      
      However, the same issue still exists when the association is in SHUTDOWN_RECEIVED
      state.  In this state, HB-ACKs will continue to reset the overall_error_count
      for the association would extend the lifetime of association unnecessarily.
      
      This patch solves this by resetting the overall_error_count whenever the current
      state is small then SCTP_STATE_SHUTDOWN_PENDING.  As a small side-effect, we
      end up also handling SCTP_STATE_SHUTDOWN_ACK_SENT and SCTP_STATE_SHUTDOWN_SENT
      states, but they are not really impacted because we disable Heartbeats in those
      states.
      
      Fixes: Commit f8d96052 ("sctp: Enforce retransmission limit during shutdown")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      1f2a65a2
    • David Ahern's avatar
      net: Fix RCU splat in af_key · 0ddb79b0
      David Ahern authored
      commit ba51b6be upstream.
      
      Hit the following splat testing VRF change for ipsec:
      
      [  113.475692] ===============================
      [  113.476194] [ INFO: suspicious RCU usage. ]
      [  113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
      [  113.477545] -------------------------------
      [  113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
      [  113.479288]
      [  113.479288] other info that might help us debug this:
      [  113.479288]
      [  113.480207]
      [  113.480207] rcu_scheduler_active = 1, debug_locks = 1
      [  113.480931] 2 locks held by setkey/6829:
      [  113.481371]  #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff814e9887>] pfkey_sendmsg+0xfb/0x213
      [  113.482509]  #1:  (rcu_read_lock){......}, at: [<ffffffff814e767f>] rcu_read_lock+0x0/0x6e
      [  113.483509]
      [  113.483509] stack backtrace:
      [  113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
      [  113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
      [  113.486845]  0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
      [  113.487732]  ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
      [  113.488628]  0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
      [  113.489525] Call Trace:
      [  113.489813]  [<ffffffff81518af2>] dump_stack+0x4c/0x65
      [  113.490389]  [<ffffffff81086962>] ? console_unlock+0x3d6/0x405
      [  113.491039]  [<ffffffff8107ae75>] lockdep_rcu_suspicious+0xfa/0x103
      [  113.491735]  [<ffffffff81064032>] rcu_preempt_sleep_check+0x45/0x47
      [  113.492442]  [<ffffffff8106404d>] ___might_sleep+0x19/0x1c8
      [  113.493077]  [<ffffffff81064268>] __might_sleep+0x6c/0x82
      [  113.493681]  [<ffffffff81133190>] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
      [  113.494508]  [<ffffffff81134876>] kmem_cache_alloc+0x31/0x18f
      [  113.495149]  [<ffffffff814012b5>] skb_clone+0x64/0x80
      [  113.495712]  [<ffffffff814e6f71>] pfkey_broadcast_one+0x3d/0xff
      [  113.496380]  [<ffffffff814e7b84>] pfkey_broadcast+0xb5/0x11e
      [  113.497024]  [<ffffffff814e82d1>] pfkey_register+0x191/0x1b1
      [  113.497653]  [<ffffffff814e9770>] pfkey_process+0x162/0x17e
      [  113.498274]  [<ffffffff814e9895>] pfkey_sendmsg+0x109/0x213
      
      In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
      the RCU lock.
      
      Since pfkey_broadcast takes the RCU lock the allocation argument is
      pointless since GFP_ATOMIC must be used between the rcu_read_{,un}lock.
      The one call outside of rcu can be done with GFP_KERNEL.
      
      Fixes: 7f6b9dbd ("af_key: locking change")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      0ddb79b0
    • Herton R. Krzesinski's avatar
      ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits · cc456922
      Herton R. Krzesinski authored
      commit 602b8593 upstream.
      
      The current semaphore code allows a potential use after free: in
      exit_sem we may free the task's sem_undo_list while there is still
      another task looping through the same semaphore set and cleaning the
      sem_undo list at freeary function (the task called IPC_RMID for the same
      semaphore set).
      
      For example, with a test program [1] running which keeps forking a lot
      of processes (which then do a semop call with SEM_UNDO flag), and with
      the parent right after removing the semaphore set with IPC_RMID, and a
      kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
      CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
      in the kernel log:
      
         Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
         000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
         010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
         Prev obj: start=ffff88003b45c180, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
         Next obj: start=ffff88003b45c200, len=64
         000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
         010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
         BUG: spinlock wrong CPU on CPU#2, test/18028
         general protection fault: 0000 [#1] SMP
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: spin_dump+0x53/0xc0
         Call Trace:
           spin_bug+0x30/0x40
           do_raw_spin_unlock+0x71/0xa0
           _raw_spin_unlock+0xe/0x10
           freeary+0x82/0x2a0
           ? _raw_spin_lock+0xe/0x10
           semctl_down.clone.0+0xce/0x160
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           SyS_semctl+0x236/0x2c0
           ? syscall_trace_leave+0xde/0x130
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
         RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
          RSP <ffff88003750fd68>
         ---[ end trace 783ebb76612867a0 ]---
         NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
         Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
         CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
         RIP: native_read_tsc+0x0/0x20
         Call Trace:
           ? delay_tsc+0x40/0x70
           __delay+0xf/0x20
           do_raw_spin_lock+0x96/0x140
           _raw_spin_lock+0xe/0x10
           sem_lock_and_putref+0x11/0x70
           SYSC_semtimedop+0x7bf/0x960
           ? handle_mm_fault+0xbf6/0x1880
           ? dequeue_task_fair+0x79/0x4a0
           ? __do_page_fault+0x19a/0x430
           ? kfree_debugcheck+0x16/0x40
           ? __do_page_fault+0x19a/0x430
           ? __audit_syscall_entry+0xa8/0x100
           ? do_audit_syscall_entry+0x66/0x70
           ? syscall_trace_enter_phase1+0x139/0x160
           SyS_semtimedop+0xe/0x10
           SyS_semop+0x10/0x20
           entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
         Kernel panic - not syncing: softlockup: hung tasks
      
      I wasn't able to trigger any badness on a recent kernel without the
      proper config debugs enabled, however I have softlockup reports on some
      kernel versions, in the semaphore code, which are similar as above (the
      scenario is seen on some servers running IBM DB2 which uses semaphore
      syscalls).
      
      The patch here fixes the race against freeary, by acquiring or waiting
      on the sem_undo_list lock as necessary (exit_sem can race with freeary,
      while freeary sets un->semid to -1 and removes the same sem_undo from
      list_proc or when it removes the last sem_undo).
      
      After the patch I'm unable to reproduce the problem using the test case
      [1].
      
      [1] Test case used below:
      
          #include <stdio.h>
          #include <sys/types.h>
          #include <sys/ipc.h>
          #include <sys/sem.h>
          #include <sys/wait.h>
          #include <stdlib.h>
          #include <time.h>
          #include <unistd.h>
          #include <errno.h>
      
          #define NSEM 1
          #define NSET 5
      
          int sid[NSET];
      
          void thread()
          {
                  struct sembuf op;
                  int s;
                  uid_t pid = getuid();
      
                  s = rand() % NSET;
                  op.sem_num = pid % NSEM;
                  op.sem_op = 1;
                  op.sem_flg = SEM_UNDO;
      
                  semop(sid[s], &op, 1);
                  exit(EXIT_SUCCESS);
          }
      
          void create_set()
          {
                  int i, j;
                  pid_t p;
                  union {
                          int val;
                          struct semid_ds *buf;
                          unsigned short int *array;
                          struct seminfo *__buf;
                  } un;
      
                  /* Create and initialize semaphore set */
                  for (i = 0; i < NSET; i++) {
                          sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                          if (sid[i] < 0) {
                                  perror("semget");
                                  exit(EXIT_FAILURE);
                          }
                  }
                  un.val = 0;
                  for (i = 0; i < NSET; i++) {
                          for (j = 0; j < NSEM; j++) {
                                  if (semctl(sid[i], j, SETVAL, un) < 0)
                                          perror("semctl");
                          }
                  }
      
                  /* Launch threads that operate on semaphore set */
                  for (i = 0; i < NSEM * NSET * NSET; i++) {
                          p = fork();
                          if (p < 0)
                                  perror("fork");
                          if (p == 0)
                                  thread();
                  }
      
                  /* Free semaphore set */
                  for (i = 0; i < NSET; i++) {
                          if (semctl(sid[i], NSEM, IPC_RMID))
                                  perror("IPC_RMID");
                  }
      
                  /* Wait for forked processes to exit */
                  while (wait(NULL)) {
                          if (errno == ECHILD)
                                  break;
                  };
          }
      
          int main(int argc, char **argv)
          {
                  pid_t p;
      
                  srand(time(NULL));
      
                  while (1) {
                          p = fork();
                          if (p < 0) {
                                  perror("fork");
                                  exit(EXIT_FAILURE);
                          }
                          if (p == 0) {
                                  create_set();
                                  goto end;
                          }
      
                          /* Wait for forked processes to exit */
                          while (wait(NULL)) {
                                  if (errno == ECHILD)
                                          break;
                          };
                  }
          end:
                  return 0;
          }
      
      [akpm@linux-foundation.org: use normal comment layout]
      Signed-off-by: default avatarHerton R. Krzesinski <herton@redhat.com>
      Acked-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Rafael Aquini <aquini@redhat.com>
      CC: Aristeu Rozanski <aris@redhat.com>
      Cc: David Jeffery <djeffery@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      cc456922
    • Michael Walle's avatar
      EDAC, ppc4xx: Access mci->csrows array elements properly · 1d275d96
      Michael Walle authored
      commit 5c16179b upstream.
      
      The commit
      
        de3910eb ("edac: change the mem allocation scheme to
      		 make Documentation/kobject.txt happy")
      
      changed the memory allocation for the csrows member. But ppc4xx_edac was
      forgotten in the patch. Fix it.
      Signed-off-by: default avatarMichael Walle <michael@walle.cc>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.ccSigned-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      1d275d96
    • Bart Van Assche's avatar
      libfc: Fix fc_fcp_cleanup_each_cmd() · 334b3bbf
      Bart Van Assche authored
      commit 8f2777f5 upstream.
      
      Since fc_fcp_cleanup_cmd() can sleep this function must not
      be called while holding a spinlock. This patch avoids that
      fc_fcp_cleanup_each_cmd() triggers the following bug:
      
      BUG: scheduling while atomic: sg_reset/1512/0x00000202
      1 lock held by sg_reset/1512:
       #0:  (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
      Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
      Call Trace:
       [<ffffffff816c612c>] dump_stack+0x4f/0x7b
       [<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
       [<ffffffff816c87aa>] __schedule+0x71a/0xa10
       [<ffffffff816c8ad2>] schedule+0x32/0x80
       [<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
       [<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
       [<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
       [<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
       [<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
       [<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
       [<ffffffff814a2650>] scsi_ioctl+0x150/0x440
       [<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
       [<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
       [<ffffffff811da608>] block_ioctl+0x38/0x40
       [<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
       [<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
       [<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: default avatarVasu Dev <vasu.dev@intel.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      334b3bbf
    • John Soni Jose's avatar
      libiscsi: Fix host busy blocking during connection teardown · 0adcec66
      John Soni Jose authored
      commit 660d0831 upstream.
      
      In case of hw iscsi offload, an host can have N-number of active
      connections. There can be IO's running on some connections which
      make host->host_busy always TRUE. Now if logout from a connection
      is tried then the code gets into an infinite loop as host->host_busy
      is always TRUE.
      
       iscsi_conn_teardown(....)
       {
         .........
          /*
           * Block until all in-progress commands for this connection
           * time out or fail.
           */
           for (;;) {
            spin_lock_irqsave(session->host->host_lock, flags);
            if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
      	      spin_unlock_irqrestore(session->host->host_lock, flags);
                    break;
            }
           spin_unlock_irqrestore(session->host->host_lock, flags);
           msleep_interruptible(500);
           iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
                       "host_busy %d host_failed %d\n",
      	          atomic_read(&session->host->host_busy),
      	          session->host->host_failed);
      
      	................
      	...............
           }
        }
      
      This is not an issue with software-iscsi/iser as each cxn is a separate
      host.
      
      Fix:
      Acquiring eh_mutex in iscsi_conn_teardown() before setting
      session->state = ISCSI_STATE_TERMINATE.
      Signed-off-by: default avatarJohn Soni Jose <sony.john@avagotech.com>
      Reviewed-by: default avatarMike Christie <michaelc@cs.wisc.edu>
      Reviewed-by: default avatarChris Leech <cleech@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      [lizf: Backported to 3.4: adjust context]
      Signed-of-by: default avatarZefan Li <lizefan@huawei.com>
      0adcec66
    • Joe Thornber's avatar
      dm btree: add ref counting ops for the leaves of top level btrees · f738306b
      Joe Thornber authored
      commit b0dc3c8b upstream.
      
      When using nested btrees, the top leaves of the top levels contain
      block addresses for the root of the next tree down.  If we shadow a
      shared leaf node the leaf values (sub tree roots) should be incremented
      accordingly.
      
      This is only an issue if there is metadata sharing in the top levels.
      Which only occurs if metadata snapshots are being used (as is possible
      with dm-thinp).  And could result in a block from the thinp metadata
      snap being reused early, thus corrupting the thinp metadata snap.
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      [lizf: Backported to 3.4:
       - drop const
       - drop changes to remove_one()]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      f738306b
    • Richard Weinberger's avatar
      localmodconfig: Use Kbuild files too · bd1bdbec
      Richard Weinberger authored
      commit c0ddc8c7 upstream.
      
      In kbuild it is allowed to define objects in files named "Makefile"
      and "Kbuild".
      Currently localmodconfig reads objects only from "Makefile"s and misses
      modules like nouveau.
      
      Link: http://lkml.kernel.org/r/1437948415-16290-1-git-send-email-richard@nod.atReported-and-tested-by: default avatarLeonidas Spyropoulos <artafinde@gmail.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      bd1bdbec
    • Juergen Gross's avatar
      x86/ldt: Correct FPU emulation access to LDT · 8a12aac6
      Juergen Gross authored
      commit 4809146b upstream.
      
      Commit 37868fe1 ("x86/ldt: Make modify_ldt synchronous")
      introduced a new struct ldt_struct anchored at mm->context.ldt.
      
      Adapt the x86 fpu emulation code to use that new structure.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: billm@melbpc.org.au
      Link: http://lkml.kernel.org/r/1438883674-1240-1-git-send-email-jgross@suse.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      8a12aac6
    • Juergen Gross's avatar
      x86/ldt: Correct LDT access in single stepping logic · fbe750bc
      Juergen Gross authored
      commit 136d9d83 upstream.
      
      Commit 37868fe1 ("x86/ldt: Make modify_ldt synchronous")
      introduced a new struct ldt_struct anchored at mm->context.ldt.
      
      convert_ip_to_linear() was changed to reflect this, but indexing
      into the ldt has to be changed as the pointer is no longer void *.
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@suse.de
      Link: http://lkml.kernel.org/r/1438848278-12906-1-git-send-email-jgross@suse.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      fbe750bc
    • Joseph Qi's avatar
      ocfs2: fix BUG in ocfs2_downconvert_thread_do_work() · bcbcff7f
      Joseph Qi authored
      commit 209f7512 upstream.
      
      The "BUG_ON(list_empty(&osb->blocked_lock_list))" in
      ocfs2_downconvert_thread_do_work can be triggered in the following case:
      
      ocfs2dc has firstly saved osb->blocked_lock_count to local varibale
      processed, and then processes the dentry lockres.  During the dentry
      put, it calls iput and then deletes rw, inode and open lockres from
      blocked list in ocfs2_mark_lockres_freeing.  And this causes the
      variable `processed' to not reflect the number of blocked lockres to be
      processed, which triggers the BUG.
      Signed-off-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      bcbcff7f
    • David Daney's avatar
      MIPS: Make set_pte() SMP safe. · 73619eaf
      David Daney authored
      commit 46011e6e upstream.
      
      On MIPS the GLOBAL bit of the PTE must have the same value in any
      aligned pair of PTEs.  These pairs of PTEs are referred to as
      "buddies".  In a SMP system is is possible for two CPUs to be calling
      set_pte() on adjacent PTEs at the same time.  There is a race between
      setting the PTE and a different CPU setting the GLOBAL bit in its
      buddy PTE.
      
      This race can be observed when multiple CPUs are executing
      vmap()/vfree() at the same time.
      
      Make setting the buddy PTE's GLOBAL bit an atomic operation to close
      the race condition.
      
      The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not*
      handled.
      Signed-off-by: default avatarDavid Daney <david.daney@cavium.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/10835/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      73619eaf
    • Peter Zijlstra's avatar
      perf: Fix fasync handling on inherited events · 40ba03f3
      Peter Zijlstra authored
      commit fed66e2c upstream.
      
      Vince reported that the fasync signal stuff doesn't work proper for
      inherited events. So fix that.
      
      Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
      which upon the demise of the file descriptor ensures the allocation is
      freed and state is updated.
      
      Now for perf, we can have the events stick around for a while after the
      original FD is dead because of references from child events. So we
      cannot copy the fasync pointer around. We can however consistently use
      the parent's fasync, as that will be updated.
      Reported-and-Tested-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho deMelo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: eranian@google.com
      Link: http://lkml.kernel.org/r/1434011521.1495.71.camel@twinsSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      40ba03f3
    • Dan Carpenter's avatar
      rds: fix an integer overflow test in rds_info_getsockopt() · d86129c5
      Dan Carpenter authored
      commit 468b732b upstream.
      
      "len" is a signed integer.  We check that len is not negative, so it
      goes from zero to INT_MAX.  PAGE_SIZE is unsigned long so the comparison
      is type promoted to unsigned long.  ULONG_MAX - 4095 is a higher than
      INT_MAX so the condition can never be true.
      
      I don't know if this is harmful but it seems safe to limit "len" to
      INT_MAX - 4095.
      
      Fixes: a8c879a7 ('RDS: Info and stats')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      d86129c5
    • Mathias Nyman's avatar
      xhci: fix off by one error in TRB DMA address boundary check · 24e5a859
      Mathias Nyman authored
      commit 7895086a upstream.
      
      We need to check that a TRB is part of the current segment
      before calculating its DMA address.
      
      Previously a ring segment didn't use a full memory page, and every
      new ring segment got a new memory page, so the off by one
      error in checking the upper bound was never seen.
      
      Now that we use a full memory page, 256 TRBs (4096 bytes), the off by one
      didn't catch the case when a TRB was the first element of the next segment.
      
      This is triggered if the virtual memory pages for a ring segment are
      next to each in increasing order where the ring buffer wraps around and
      causes errors like:
      
      [  106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
      [  106.398230] xhci_hcd 0000:00:14.0: Looking for event-dma fffd3000 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0
      
      The trb-end address is one outside the end-seg address.
      Tested-by: default avatarArkadiusz Miśkiewicz <arekm@maven.pl>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      24e5a859
    • Felix Fietkau's avatar
      MIPS: Fix sched_getaffinity with MT FPAFF enabled · b9917d02
      Felix Fietkau authored
      commit 1d62d737 upstream.
      
      p->thread.user_cpus_allowed is zero-initialized and is only filled on
      the first sched_setaffinity call.
      
      To avoid adding overhead in the task initialization codepath, simply OR
      the returned mask in sched_getaffinity with p->cpus_allowed.
      Signed-off-by: default avatarFelix Fietkau <nbd@openwrt.org>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/10740/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      b9917d02
    • NeilBrown's avatar
      md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies · 2411ca5d
      NeilBrown authored
      commit 423f04d6 upstream.
      
      raid1_end_read_request() assumes that the In_sync bits are consistent
      with the ->degaded count.
      raid1_spare_active updates the In_sync bit before the ->degraded count
      and so exposes an inconsistency, as does error()
      So extend the spinlock in raid1_spare_active() and error() to hide those
      inconsistencies.
      
      This should probably be part of
        Commit: 34cab6f4 ("md/raid1: fix test for 'was read error from
        last working device'.")
      as it addresses the same issue.  It fixes the same bug and should go
      to -stable for same reasons.
      
      Fixes: 76073054 ("md/raid1: clean up read_balance.")
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      2411ca5d
    • Andy Lutomirski's avatar
      x86/ldt: Make modify_ldt synchronous · 0444d9df
      Andy Lutomirski authored
      commit 37868fe1 upstream.
      
      modify_ldt() has questionable locking and does not synchronize
      threads.  Improve it: redesign the locking and synchronize all
      threads' LDTs using an IPI on all modifications.
      
      This will dramatically slow down modify_ldt in multithreaded
      programs, but there shouldn't be any multithreaded programs that
      care about modify_ldt's performance in the first place.
      
      This fixes some fallout from the CVE-2015-5157 fixes.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.2:
       - Adjust context
       - Drop comment changes in switch_mm()
       - Drop changes to get_segment_base() in arch/x86/kernel/cpu/perf_event.c
       - Open-code lockless_dereference(), smp_store_release(), on_each_cpu_mask()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      0444d9df
    • Andy Lutomirski's avatar
      x86/xen: Probe target addresses in set_aliased_prot() before the hypercall · 86462c4a
      Andy Lutomirski authored
      commit aa1acff3 upstream.
      
      The update_va_mapping hypercall can fail if the VA isn't present
      in the guest's page tables.  Under certain loads, this can
      result in an OOPS when the target address is in unpopulated vmap
      space.
      
      While we're at it, add comments to help explain what's going on.
      
      This isn't a great long-term fix.  This code should probably be
      changed to use something like set_memory_ro.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Andrew Cooper <andrew.cooper3@citrix.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Vrabel <dvrabel@cantab.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org <security@kernel.org>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      86462c4a
    • Alexei Potashnik's avatar
      target/iscsi: Fix double free of a TUR followed by a solicited NOPOUT · 874fc859
      Alexei Potashnik authored
      commit 9547308b upstream.
      
      Make sure all non-READ SCSI commands get targ_xfer_tag initialized
      to 0xffffffff, not just WRITEs.
      
      Double-free of a TUR cmd object occurs under the following scenario:
      
      1. TUR received (targ_xfer_tag is uninitialized and left at 0)
      2. TUR status sent
      3. First unsolicited NOPIN is sent to initiator (gets targ_xfer_tag of 0)
      4. NOPOUT for NOPIN (with TTT=0) arrives
       - its ExpStatSN acks TUR status, TUR is queued for removal
       - LIO tries to find NOPIN with TTT=0, but finds the same TUR instead,
         TUR is queued for removal for the 2nd time
      
      (Drop unbalanced conditional bracket usage - nab)
      Signed-off-by: default avatarAlexei Potashnik <alexei@purestorage.com>
      Signed-off-by: default avatarSpencer Baugh <sbaugh@catern.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      [lizf: Backported to 3.4:
       - adjust context
       - leave the braces as it is]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      874fc859
    • Alex Deucher's avatar
      drm/radeon/combios: add some validation of lvds values · 3c1a25d2
      Alex Deucher authored
      commit 0a90a0cf upstream.
      
      Fixes a broken hsync start value uncovered by:
      abc0b144
      (drm: Perform basic sanity checks on probed modes)
      
      The driver handled the bad hsync start elsewhere, but
      the above commit prevented it from getting added.
      
      bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=91401Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      3c1a25d2