1. 27 Feb, 2015 5 commits
    • Daniel Borkmann's avatar
      net: sctp: fix slab corruption from use after free on INIT collisions · 727ab4c0
      Daniel Borkmann authored
      [ Upstream commit 600ddd68 ]
      
      When hitting an INIT collision case during the 4WHS with AUTH enabled, as
      already described in detail in commit 1be9a950 ("net: sctp: inherit
      auth_capable on INIT collisions"), it can happen that we occasionally
      still remotely trigger the following panic on server side which seems to
      have been uncovered after the fix from commit 1be9a950 ...
      
      [  533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff
      [  533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230
      [  533.940559] PGD 5030f2067 PUD 0
      [  533.957104] Oops: 0000 [#1] SMP
      [  533.974283] Modules linked in: sctp mlx4_en [...]
      [  534.939704] Call Trace:
      [  534.951833]  [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0
      [  534.984213]  [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0
      [  535.015025]  [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170
      [  535.045661]  [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0
      [  535.074593]  [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50
      [  535.105239]  [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp]
      [  535.138606]  [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0
      [  535.166848]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b
      
      ... or depending on the the application, for example this one:
      
      [ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff
      [ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0
      [ 1370.054568] PGD 633c94067 PUD 0
      [ 1370.070446] Oops: 0000 [#1] SMP
      [ 1370.085010] Modules linked in: sctp kvm_amd kvm [...]
      [ 1370.963431] Call Trace:
      [ 1370.974632]  [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960
      [ 1371.000863]  [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960
      [ 1371.027154]  [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170
      [ 1371.054679]  [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130
      [ 1371.080183]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b
      
      With slab debugging enabled, we can see that the poison has been overwritten:
      
      [  669.826368] BUG kmalloc-128 (Tainted: G        W     ): Poison overwritten
      [  669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b
      [  669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494
      [  669.826424]  __slab_alloc+0x4bf/0x566
      [  669.826433]  __kmalloc+0x280/0x310
      [  669.826453]  sctp_auth_create_key+0x23/0x50 [sctp]
      [  669.826471]  sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp]
      [  669.826488]  sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp]
      [  669.826505]  sctp_do_sm+0x29d/0x17c0 [sctp] [...]
      [  669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494
      [  669.826635]  __slab_free+0x39/0x2a8
      [  669.826643]  kfree+0x1d6/0x230
      [  669.826650]  kzfree+0x31/0x40
      [  669.826666]  sctp_auth_key_put+0x19/0x20 [sctp]
      [  669.826681]  sctp_assoc_update+0x1ee/0x2d0 [sctp]
      [  669.826695]  sctp_do_sm+0x674/0x17c0 [sctp]
      
      Since this only triggers in some collision-cases with AUTH, the problem at
      heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice
      when having refcnt 1, once directly in sctp_assoc_update() and yet again
      from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on
      the already kzfree'd memory, which is also consistent with the observation
      of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected
      at a later point in time when poison is checked on new allocation).
      
      Reference counting of auth keys revisited:
      
      Shared keys for AUTH chunks are being stored in endpoints and associations
      in endpoint_shared_keys list. On endpoint creation, a null key is being
      added; on association creation, all endpoint shared keys are being cached
      and thus cloned over to the association. struct sctp_shared_key only holds
      a pointer to the actual key bytes, that is, struct sctp_auth_bytes which
      keeps track of users internally through refcounting. Naturally, on assoc
      or enpoint destruction, sctp_shared_key are being destroyed directly and
      the reference on sctp_auth_bytes dropped.
      
      User space can add keys to either list via setsockopt(2) through struct
      sctp_authkey and by passing that to sctp_auth_set_key() which replaces or
      adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes
      with refcount 1 and in case of replacement drops the reference on the old
      sctp_auth_bytes. A key can be set active from user space through setsockopt()
      on the id via sctp_auth_set_active_key(), which iterates through either
      endpoint_shared_keys and in case of an assoc, invokes (one of various places)
      sctp_auth_asoc_init_active_key().
      
      sctp_auth_asoc_init_active_key() computes the actual secret from local's
      and peer's random, hmac and shared key parameters and returns a new key
      directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops
      the reference if there was a previous one. The secret, which where we
      eventually double drop the ref comes from sctp_auth_asoc_set_secret() with
      intitial refcount of 1, which also stays unchanged eventually in
      sctp_assoc_update(). This key is later being used for crypto layer to
      set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac().
      
      To close the loop: asoc->asoc_shared_key is freshly allocated secret
      material and independant of the sctp_shared_key management keeping track
      of only shared keys in endpoints and assocs. Hence, also commit 4184b2a7
      ("net: sctp: fix memory leak in auth key management") is independant of
      this bug here since it concerns a different layer (though same structures
      being used eventually). asoc->asoc_shared_key is reference dropped correctly
      on assoc destruction in sctp_association_free() and when active keys are
      being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount
      of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is
      to remove that sctp_auth_key_put() from there which fixes these panics.
      
      Fixes: 730fc3d0 ("[SCTP]: Implete SCTP-AUTH parameter processing")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      727ab4c0
    • Eric Dumazet's avatar
      netxen: fix netxen_nic_poll() logic · e98d2751
      Eric Dumazet authored
      [ Upstream commit 6088beef ]
      
      NAPI poll logic now enforces that a poller returns exactly the budget
      when it wants to be called again.
      
      If a driver limits TX completion, it has to return budget as well when
      the limit is hit, not the number of received packets.
      Reported-and-tested-by: default avatarMike Galbraith <umgwanakikbuti@gmail.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Fixes: d75b1ade ("net: less interrupt masking in NAPI")
      Cc: Manish Chopra <manish.chopra@qlogic.com>
      Acked-by: default avatarManish Chopra <manish.chopra@qlogic.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e98d2751
    • Hagen Paul Pfeifer's avatar
      ipv6: stop sending PTB packets for MTU < 1280 · fa3f55df
      Hagen Paul Pfeifer authored
      [ Upstream commit 9d289715 ]
      
      Reduce the attack vector and stop generating IPv6 Fragment Header for
      paths with an MTU smaller than the minimum required IPv6 MTU
      size (1280 byte) - called atomic fragments.
      
      See IETF I-D "Deprecating the Generation of IPv6 Atomic Fragments" [1]
      for more information and how this "feature" can be misused.
      
      [1] https://tools.ietf.org/html/draft-ietf-6man-deprecate-atomfrag-generation-00Signed-off-by: default avatarFernando Gont <fgont@si6networks.com>
      Signed-off-by: default avatarHagen Paul Pfeifer <hagen@jauu.net>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa3f55df
    • Eric Dumazet's avatar
      net: rps: fix cpu unplug · 06b5ff9f
      Eric Dumazet authored
      [ Upstream commit ac64da0b ]
      
      softnet_data.input_pkt_queue is protected by a spinlock that
      we must hold when transferring packets from victim queue to an active
      one. This is because other cpus could still be trying to enqueue packets
      into victim queue.
      
      A second problem is that when we transfert the NAPI poll_list from
      victim to current cpu, we absolutely need to special case the percpu
      backlog, because we do not want to add complex locking to protect
      process_queue : Only owner cpu is allowed to manipulate it, unless cpu
      is offline.
      
      Based on initial patch from Prasad Sodagudi & Subash Abhinov
      Kasiviswanathan.
      
      This version is better because we do not slow down packet processing,
      only make migration safer.
      Reported-by: default avatarPrasad Sodagudi <psodagud@codeaurora.org>
      Reported-by: default avatarSubash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      06b5ff9f
    • Willem de Bruijn's avatar
      ip: zero sockaddr returned on error queue · 1d480edb
      Willem de Bruijn authored
      [ Upstream commit f812116b ]
      
      The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That
      structure is defined and allocated on the stack as
      
          struct {
                  struct sock_extended_err ee;
                  struct sockaddr_in(6)    offender;
          } errhdr;
      
      The second part is only initialized for certain SO_EE_ORIGIN values.
      Always initialize it completely.
      
      An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that
      would return uninitialized bytes.
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      
      ----
      
      Also verified that there is no padding between errhdr.ee and
      errhdr.offender that could leak additional kernel data.
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d480edb
  2. 11 Feb, 2015 19 commits
  3. 06 Feb, 2015 16 commits
    • Greg Kroah-Hartman's avatar
      Linux 3.10.68 · 87dc7c99
      Greg Kroah-Hartman authored
      87dc7c99
    • Nicholas Bellinger's avatar
      target: Drop arbitrary maximum I/O size limit · e2d88176
      Nicholas Bellinger authored
      commit 046ba642 upstream.
      
      This patch drops the arbitrary maximum I/O size limit in sbc_parse_cdb(),
      which currently for fabric_max_sectors is hardcoded to 8192 (4 MB for 512
      byte sector devices), and for hw_max_sectors is a backend driver dependent
      value.
      
      This limit is problematic because Linux initiators have only recently
      started to honor block limits MAXIMUM TRANSFER LENGTH, and other non-Linux
      based initiators (eg: MSFT Fibre Channel) can also generate I/Os larger
      than 4 MB in size.
      
      Currently when this happens, the following message will appear on the
      target resulting in I/Os being returned with non recoverable status:
      
        SCSI OP 28h with too big sectors 16384 exceeds fabric_max_sectors: 8192
      
      Instead, drop both [fabric,hw]_max_sector checks in sbc_parse_cdb(),
      and convert the existing hw_max_sectors into a purely informational
      attribute used to represent the granuality that backend driver and/or
      subsystem code is splitting I/Os upon.
      
      Also, update FILEIO with an explicit FD_MAX_BYTES check in fd_execute_rw()
      to deal with the one special iovec limitiation case.
      
      v2 changes:
        - Drop hw_max_sectors check in sbc_parse_cdb()
      Reported-by: default avatarLance Gropper <lance.gropper@qosserver.com>
      Reported-by: default avatarStefan Priebe <s.priebe@profihost.ag>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Roland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e2d88176
    • Sagi Grimberg's avatar
      iser-target: Fix implicit termination of connections · a3ecefb6
      Sagi Grimberg authored
      commit b02efbfc upstream.
      
      In situations such as bond failover, The new session establishment
      implicitly invokes the termination of the old connection.
      
      So, we don't want to wait for the old connection wait_conn to completely
      terminate before we accept the new connection and post a login response.
      
      The solution is to deffer the comp_wait completion and the conn_put to
      a work so wait_conn will effectively be non-blocking (flush errors are
      assumed to come very fast).
      
      We allocate isert_release_wq with WQ_UNBOUND and WQ_UNBOUND_MAX_ACTIVE
      to spread the concurrency of release works.
      Reported-by: default avatarSlava Shwartsman <valyushash@gmail.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3ecefb6
    • Sagi Grimberg's avatar
      iser-target: Handle ADDR_CHANGE event for listener cm_id · b80e6c5a
      Sagi Grimberg authored
      commit ca6c1d82 upstream.
      
      The np listener cm_id will also get ADDR_CHANGE event
      upcall (in case it is bound to a specific IP). Handle
      it correctly by creating a new cm_id and implicitly
      destroy the old one.
      
      Since this is the second event a listener np cm_id may
      encounter, we move the np cm_id event handling to a
      routine.
      
      Squashed:
      
      iser-target: Move cma_id setup to a function
      Reported-by: default avatarSlava Shwartsman <valyushash@gmail.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b80e6c5a
    • Sagi Grimberg's avatar
      iser-target: Fix connected_handler + teardown flow race · b524a882
      Sagi Grimberg authored
      commit 19e2090f upstream.
      
      Take isert_conn pointer from cm_id->qp->qp_context. This
      will allow us to know that the cm_id context is always
      the network portal. This will make the cm_id event check
      (connection or network portal) more reliable.
      
      In order to avoid a NULL dereference in cma_id->qp->qp_context
      we destroy the qp after we destroy the cm_id (and make the
      dereference safe). session stablishment/teardown sequences
      can happen in parallel, we should take into account that
      connected_handler might race with connection teardown flow.
      
      Also, protect isert_conn->conn_device->active_qps decrement
      within the error patch during QP creation failure and the
      normal teardown path in isert_connect_release().
      
      Squashed:
      
      iser-target: Decrement completion context active_qps in error flow
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b524a882
    • Sagi Grimberg's avatar
      iser-target: Parallelize CM connection establishment · 022ff2f5
      Sagi Grimberg authored
      commit 2371e5da upstream.
      
      There is no point in accepting a new CM request only
      when we are completely done with the last iscsi login.
      Instead we accept immediately, this will also cause the
      CM connection to reach connected state and the initiator
      is allowed to send the first login. We mark that we got
      the initial login and let iscsi layer pick it up when it
      gets there.
      
      This reduces the parallel login sequence by a factor of
      more then 4 (and more for multi-login) and also prevents
      the initiator (who does all logins in parallel) from
      giving up on login timeout expiration.
      
      In order to support multiple login requests sequence (CHAP)
      we call isert_rx_login_req from isert_rx_completion insead
      of letting isert_get_login_rx call it.
      
      Squashed:
      
      iser-target: Use kref_get_unless_zero in connected_handler
      iser-target: Acquire conn_mutex when changing connection state
      iser-target: Reject connect request in failure path
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      022ff2f5
    • Sagi Grimberg's avatar
      iser-target: Fix flush + disconnect completion handling · dc0672f1
      Sagi Grimberg authored
      commit 128e9cc8 upstream.
      
      ISER_CONN_UP state is not sufficient to know if
      we should wait for completion of flush errors and
      disconnected_handler event.
      
      Instead, split it to 2 states:
      - ISER_CONN_UP: Got to CM connected phase, This state
      indicates that we need to wait for a CM disconnect
      event before going to teardown.
      
      - ISER_CONN_FULL_FEATURE: Got to full feature phase
      after we posted login response, This state indicates
      that we posted recv buffers and we need to wait for
      flush completions before going to teardown.
      
      Also avoid deffering disconnected handler to a work,
      and handle it within disconnected handler.
      More work here is needed to handle DEVICE_REMOVAL event
      correctly (cleanup all resources).
      
      Squashed:
      
      iser-target: Don't deffer disconnected handler to a work
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc0672f1
    • Sagi Grimberg's avatar
      iscsi,iser-target: Initiate termination only once · 839eac57
      Sagi Grimberg authored
      commit 954f2372 upstream.
      
      Since commit 0fc4ea70 ("Target/iser: Don't put isert_conn inside
      disconnected handler") we put the conn kref in isert_wait_conn, so we
      need .wait_conn to be invoked also in the error path.
      
      Introduce call to isert_conn_terminate (called under lock)
      which transitions the connection state to TERMINATING and calls
      rdma_disconnect. If the state is already teminating, just bail
      out back (temination started).
      
      Also, make sure to destroy the connection when getting a connect
      error event if didn't get to connected (state UP). Same for the
      handling of REJECTED and UNREACHABLE cma events.
      
      Squashed:
      
      iscsi-target: Add call to wait_conn in establishment error flow
      Reported-by: default avatarSlava Shwartsman <valyushash@gmail.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      839eac57
    • Nicholas Bellinger's avatar
      vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion · 92c6741b
      Nicholas Bellinger authored
      commit 46243860 upstream.
      
      While looking at hch's recent conversion to drop the MSG_*_TAG
      definitions, I noticed a long standing bug in vhost-scsi where
      the VIRTIO_SCSI_S_* attribute definitions where incorrectly
      being passed directly into target_submit_cmd_map_sgls().
      
      This patch adds the missing virtio-scsi to TCM/SAM task attribute
      conversion.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      92c6741b
    • Hannes Reinecke's avatar
      tcm_loop: Fix wrong I_T nexus association · 9ed763fc
      Hannes Reinecke authored
      commit 506787a2 upstream.
      
      tcm_loop has the I_T nexus associated with the HBA. This causes
      commands to become misdirected if the HBA has more than one
      target portal group; any command is then being sent to the
      first target portal group instead of the correct one.
      
      The nexus needs to be associated with the target portal group
      instead.
      Signed-off-by: default avatarHannes Reinecke <hare@suse.de>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ed763fc
    • Nicholas Bellinger's avatar
      vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT · 3ce3a861
      Nicholas Bellinger authored
      commit ab8edab1 upstream.
      
      This patch addresses a bug where individual vhost-scsi configfs endpoint
      groups can be removed from below while active exports to QEMU userspace
      still exist, resulting in an OOPs.
      
      It adds a configfs_depend_item() in vhost_scsi_set_endpoint() to obtain
      an explicit dependency on se_tpg->tpg_group in order to prevent individual
      vhost-scsi WWPN endpoints from being released via normal configfs methods
      while an QEMU ioctl reference still exists.
      
      Also, add matching configfs_undepend_item() in vhost_scsi_clear_endpoint()
      to release the dependency, once QEMU's reference to the individual group
      at /sys/kernel/config/target/vhost/$WWPN/$TPGT is released.
      
      (Fix up vhost_scsi_clear_endpoint() error path - DanC)
      
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3ce3a861
    • Or Gerlitz's avatar
      ib_isert: Add max_send_sge=2 minimum for control PDU responses · f2da3c24
      Or Gerlitz authored
      commit f57915cf upstream.
      
      This patch adds a max_send_sge=2 minimum in isert_conn_setup_qp()
      to ensure outgoing control PDU responses with tx_desc->num_sge=2
      are able to function correctly.
      
      This addresses a bug with RDMA hardware using dev_attr.max_sge=3,
      that in the original code with the ConnectX-2 work-around would
      result in isert_conn->max_sge=1 being negotiated.
      
      Originally reported by Chris with ocrdma driver.
      Reported-by: default avatarChris Moore <Chris.Moore@emulex.com>
      Tested-by: default avatarChris Moore <Chris.Moore@emulex.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f2da3c24
    • Chris Moore's avatar
      IB/isert: Adjust CQ size to HW limits · c50ad63a
      Chris Moore authored
      commit b1a5ad00 upstream.
      
      isert has an issue of trying to create a CQ with more CQEs than are
      supported by the hardware, that currently results in failures during
      isert_device creation during first session login.
      
      This is the isert version of the patch that Minh Tran submitted for
      iser, and is simple a workaround required to function with existing
      ocrdma hardware.
      Signed-off-by: default avatarChris Moore <chris.moore@emulex.com>
      Reviewied-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c50ad63a
    • Tejun Heo's avatar
      workqueue: fix subtle pool management issue which can stall whole worker_pool · 85be16ba
      Tejun Heo authored
      commit 29187a9e upstream.
      
      A worker_pool's forward progress is guaranteed by the fact that the
      last idle worker assumes the manager role to create more workers and
      summon the rescuers if creating workers doesn't succeed in timely
      manner before proceeding to execute work items.
      
      This manager role is implemented in manage_workers(), which indicates
      whether the worker may proceed to work item execution with its return
      value.  This is necessary because multiple workers may contend for the
      manager role, and, if there already is a manager, others should
      proceed to work item execution.
      
      Unfortunately, the function also indicates that the worker may proceed
      to work item execution if need_to_create_worker() is false at the head
      of the function.  need_to_create_worker() tests the following
      conditions.
      
      	pending work items && !nr_running && !nr_idle
      
      The first and third conditions are protected by pool->lock and thus
      won't change while holding pool->lock; however, nr_running can change
      asynchronously as other workers block and resume and while it's likely
      to be zero, as someone woke this worker up in the first place, some
      other workers could have become runnable inbetween making it non-zero.
      
      If this happens, manage_worker() could return false even with zero
      nr_idle making the worker, the last idle one, proceed to execute work
      items.  If then all workers of the pool end up blocking on a resource
      which can only be released by a work item which is pending on that
      pool, the whole pool can deadlock as there's no one to create more
      workers or summon the rescuers.
      
      This patch fixes the problem by removing the early exit condition from
      maybe_create_worker() and making manage_workers() return false iff
      there's already another manager, which ensures that the last worker
      doesn't start executing work items.
      
      We can leave the early exit condition alone and just ignore the return
      value but the only reason it was put there is because the
      manage_workers() used to perform both creations and destructions of
      workers and thus the function may be invoked while the pool is trying
      to reduce the number of workers.  Now that manage_workers() is called
      only when more workers are needed, the only case this early exit
      condition is triggered is rare race conditions rendering it pointless.
      
      Tested with simulated workload and modified workqueue code which
      trigger the pool deadlock reliably without this patch.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarEric Sandeen <sandeen@sandeen.net>
      Link: http://lkml.kernel.org/g/54B019F4.8030009@sandeen.net
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      85be16ba
    • Martin Kaiser's avatar
      gpio: squelch a compiler warning · 60e353b1
      Martin Kaiser authored
      drivers/gpio/gpiolib-of.c: In function 'of_gpiochip_find_and_xlate':
      drivers/gpio/gpiolib-of.c:51:21: warning: assignment makes integer from
      pointer without a cast [enabled by default]
         gg_data->out_gpio = ERR_PTR(ret);
                           ^
      this was introduced in d1c34491
      
      the upstream kernel changed the type of out_gpio from int to struct gpio_desc *
      as part of a larger refactoring that wasn't backported
      Signed-off-by: default avatarMartin Kaiser <martin@kaiser.cx>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      60e353b1
    • Madper Xie's avatar
      efi-pstore: Make efi-pstore return a unique id · 45006972
      Madper Xie authored
      commit fdeadb43 upstream.
      
      Pstore fs expects that backends provide a unique id which could avoid
      pstore making entries as duplication or denominating entries the same
      name. So I combine the timestamp, part and count into id.
      Signed-off-by: default avatarMadper Xie <cxie@redhat.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      [hkp: Backported to 3.10: adjust context]
      Signed-off-by: default avatarHu Keping <hukeping@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      45006972