1. 03 Jul, 2018 40 commits
    • Steffen Maier's avatar
      scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED · 60ed2673
      Steffen Maier authored
      commit d70aab55 upstream.
      
      For problem determination we always want to see when we were invoked on the
      terminate_rport_io callback whether we perform something or not.
      
      Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec:
      
      loose remote port
      
      t   workqueue
      [s] zfcp_q_<dev>       IRQ                 zfcperp<dev>
      
      === ================== =================== ============================
      
        0                    recv RSCN
                             q p.test_link_work
          block rport
           start fast_io_fail_tmo
          send ADISC ELS
        4                    recv ADISC fail
                             block zfcp_port
                                                 port forced reopen
                                                 send open port
       12                    recv open port fail
                                                 q p.gid_pn_work
                                                 zfcp_erp_wakeup
                                                 (zfcp_erp_wait would return)
          GID_PN fail
      
      Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail,
      e.g. with the typical 5 sec setting.
      
          port.status |= ERP_FAILED
      
      If fast_io_fail_tmo triggers after this point, we missed a SCSI trace.
      
          workqueue
          fc_dl_<host>
          ==================
       27 fc_timeout_fail_rport_io
          fc_terminate_rport_io
          zfcp_scsi_terminate_rport_io
          zfcp_erp_port_forced_reopen
          _zfcp_erp_port_forced_reopen
           if (port.status & ERP_FAILED)
            return;
      
      Therefore, write a trace before above early return.
      
      Example trace record formatted with zfcpdbf from s390-tools:
      
      Timestamp      : ...
      Area           : REC
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1                      ZFCP_DBF_REC_TRIG
      Tag            : sctrpi1                SCSI terminate rport I/O
      LUN            : 0xffffffffffffffff                     none (invalid)
      WWPN           : 0x<wwpn>
      D_ID           : 0x<n_port_id>
      Adapter status : 0x...
      Port status    : 0x...
      LUN status     : 0x00000000                             none (invalid)
      Ready count    : 0x...
      Running count  : 0x...
      ERP want       : 0x03                   ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
      ERP need       : 0xe0                   ZFCP_ERP_ACTION_FAILED
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      60ed2673
    • Steffen Maier's avatar
      scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return · 071f2326
      Steffen Maier authored
      commit 96d92704 upstream.
      
      get_device() and its internally used kobject_get() only return NULL if they
      get passed NULL as argument. zfcp_get_port_by_wwpn() loops over
      adapter->port_list so the iteration variable port is always non-NULL.
      Struct device is embedded in struct zfcp_port so &port->dev is always
      non-NULL. This is the argument to get_device().  However, if we get an
      fc_rport in terminate_rport_io() for which we cannot find a match within
      zfcp_get_port_by_wwpn(), the latter can return NULL.  v2.6.30 commit
      70932935 ("[SCSI] zfcp: Fix oops when port disappears") introduced an
      early return without adding a trace record for this case.  Even if we don't
      need recovery in this case, for debugging we should still see that our
      callback was invoked originally by scsi_transport_fc.
      
      Example trace record formatted with zfcpdbf from s390-tools:
      
      Timestamp      : ...
      Area           : REC
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : sctrpin        SCSI terminate rport I/O, no zfcp port
      LUN            : 0xffffffffffffffff                     none (invalid)
      WWPN           : 0x<wwpn>               WWPN
      D_ID           : 0x<n_port_id>          N_Port-ID
      Adapter status : 0x...
      Port status    : 0xffffffff             unknown (-1)
      LUN status     : 0x00000000                             none (invalid)
      Ready count    : 0x...
      Running count  : 0x...
      ERP want       : 0x03                   ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
      ERP need       : 0xc0                   ZFCP_ERP_ACTION_NONE
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Fixes: 70932935 ("[SCSI] zfcp: Fix oops when port disappears")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      071f2326
    • Steffen Maier's avatar
      scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed · 3d0d31e5
      Steffen Maier authored
      commit 512857a7 upstream.
      
      If a SCSI device is deleted during scsi_eh host reset, we cannot get a
      reference to the SCSI device anymore since scsi_device_get returns !=0 by
      design. Assuming the recovery of adapter and port(s) was successful,
      zfcp_erp_strategy_followup_success() attempts to trigger a LUN reset for the
      half-gone SCSI device. Unfortunately, it causes the following confusing
      trace record which states that zfcp will do a LUN recovery as "ERP need" is
      ZFCP_ERP_ACTION_REOPEN_LUN == 1 and equals "ERP want".
      
      Old example trace record formatted with zfcpdbf from s390-tools:
      
      Tag:           : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
      LUN            : 0x<FCP_LUN>
      WWPN           : 0x<WWPN>
      D_ID           : 0x<N_Port-ID>
      Adapter status : 0x5400050b
      Port status    : 0x54000001
      LUN status     : 0x40000000     ZFCP_STATUS_COMMON_RUNNING
                                      but not ZFCP_STATUS_COMMON_UNBLOCKED as it
                                      was closed on close part of adapter reopen
      ERP want       : 0x01
      ERP need       : 0x01           misleading
      
      However, zfcp_erp_setup_act() returns NULL as it cannot get the reference.
      Hence, zfcp_erp_action_enqueue() takes an early goto out and _NO_ recovery
      actually happens.
      
      We always do want the recovery trigger trace record even if no erp_action
      could be enqueued as in this case. For other cases where we did not enqueue
      an erp_action, 'need' has always been zero to indicate this. In order to
      indicate above goto out, introduce an eyecatcher "flag" to mark the "ERP
      need" as 'not needed' but still keep the information which erp_action type,
      that zfcp_erp_required_act() had decided upon, is needed.  0xc_ is chosen to
      be visibly different from 0x0_ in "ERP want".
      
      New example trace record formatted with zfcpdbf from s390-tools:
      
      Tag:           : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
      LUN            : 0x<FCP_LUN>
      WWPN           : 0x<WWPN>
      D_ID           : 0x<N_Port-ID>
      Adapter status : 0x5400050b
      Port status    : 0x54000001
      LUN status     : 0x40000000
      ERP want       : 0x01
      ERP need       : 0xc1           would need LUN ERP, but no action set up
                         ^
      
      Before v2.6.38 commit ae0904f6 ("[SCSI] zfcp: Redesign of the debug
      tracing for recovery actions.") we could detect this case because the
      "erp_action" field in the trace was NULL. The rework removed erp_action as
      argument and field from the trace.
      
      This patch here is for tracing. A fix to allow LUN recovery in the case at
      hand is a topic for a separate patch.
      
      See also commit fdbd1c5e ("[SCSI] zfcp: Allow running unit/LUN shutdown
      without acquiring reference") for a similar case and background info.
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Fixes: ae0904f6 ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d0d31e5
    • Steffen Maier's avatar
      scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF · 941e8bee
      Steffen Maier authored
      commit 81979ae6 upstream.
      
      We already have a SCSI trace for the end of abort and scsi_eh TMF. Due to
      zfcp_erp_wait() and fc_block_scsi_eh() time can pass between the start of
      our eh callback and an actual send/recv of an abort / TMF request.  In order
      to see the temporal sequence including any abort / TMF send retries, add a
      trace before the above two blocking functions.  This supports problem
      determination with scsi_eh and parallel zfcp ERP.
      
      No need to explicitly trace the beginning of our eh callback, since we
      typically can send an abort / TMF and see its HBA response (in the worst
      case, it's a pseudo response on dismiss all of adapter recovery, e.g. due to
      an FSF request timeout [fsrth_1] of the abort / TMF). If we cannot send, we
      now get a trace record for the first "abrt_wt" or "[lt]r_wait" which denotes
      almost the beginning of the callback.
      
      No need to explicitly trace the wakeup after the above two blocking
      functions because the next retry loop causes another trace in any case and
      that is sufficient.
      
      Example trace records formatted with zfcpdbf from s390-tools:
      
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : abrt_wt        abort, before zfcp_erp_wait()
      Request ID     : 0x0000000000000000                     none (invalid)
      SCSI ID        : 0x<scsi_id>
      SCSI LUN       : 0x<scsi_lun>
      SCSI LUN high  : 0x<scsi_lun_high>
      SCSI result    : 0x<scsi_result_of_cmd_to_be_aborted>
      SCSI retries   : 0x<retries_of_cmd_to_be_aborted>
      SCSI allowed   : 0x<allowed_retries_of_cmd_to_be_aborted>
      SCSI scribble  : 0x<req_id_of_cmd_to_be_aborted>
      SCSI opcode    : <CDB_of_cmd_to_be_aborted>
      FCP rsp inf cod: 0x..                                   none (invalid)
      FCP rsp IU     : ...                                    none (invalid)
      
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : lr_wait        LUN reset, before zfcp_erp_wait()
      Request ID     : 0x0000000000000000                     none (invalid)
      SCSI ID        : 0x<scsi_id>
      SCSI LUN       : 0x<scsi_lun>
      SCSI LUN high  : 0x<scsi_lun_high>
      SCSI result    : 0x...                                  unrelated
      SCSI retries   : 0x..                                   unrelated
      SCSI allowed   : 0x..                                   unrelated
      SCSI scribble  : 0x...                                  unrelated
      SCSI opcode    : ...                                    unrelated
      FCP rsp inf cod: 0x..                                   none (invalid)
      FCP rsp IU     : ...                                    none (invalid)
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Fixes: 63caf367 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
      Fixes: af4de36d ("[SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      941e8bee
    • Steffen Maier's avatar
      scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler · 74da693a
      Steffen Maier authored
      commit df307816 upstream.
      
      For problem determination we need to see whether and why we were successful
      or not. This allows deduction of scsi_eh escalation.
      
      Example trace record formatted with zfcpdbf from s390-tools:
      
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : schrh_r        SCSI host reset handler result
      Request ID     : 0x0000000000000000                     none (invalid)
      SCSI ID        : 0xffffffff                             none (invalid)
      SCSI LUN       : 0xffffffff                             none (invalid)
      SCSI LUN high  : 0xffffffff                             none (invalid)
      SCSI result    : 0x00002002     field re-used for midlayer value: SUCCESS
                                      or in other cases: 0x2009 == FAST_IO_FAIL
      SCSI retries   : 0xff                                   none (invalid)
      SCSI allowed   : 0xff                                   none (invalid)
      SCSI scribble  : 0xffffffffffffffff                     none (invalid)
      SCSI opcode    : ffffffff ffffffff ffffffff ffffffff    none (invalid)
      FCP rsp inf cod: 0xff                                   none (invalid)
      FCP rsp IU     : 00000000 00000000 00000000 00000000    none (invalid)
                       00000000 00000000
      
      v2.6.35 commit a1dbfddd ("[SCSI] zfcp: Pass return code from
      fc_block_scsi_eh to scsi eh") introduced the first return with something
      other than the previously hardcoded single SUCCESS return path.
      Signed-off-by: default avatarSteffen Maier <maier@linux.ibm.com>
      Fixes: a1dbfddd ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: default avatarJens Remus <jremus@linux.ibm.com>
      Reviewed-by: default avatarBenjamin Block <bblock@linux.ibm.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74da693a
    • Anil Gurumurthy's avatar
      scsi: qla2xxx: Mask off Scope bits in retry delay · 9db2ad79
      Anil Gurumurthy authored
      commit 3cedc879 upstream.
      
      Some newer target uses "Status Qualifier" response in a returned "Busy
      Status". This new response code of 0x4001, which is "Scope" bits,
      translates to "Affects all units accessible by target".  Due to this new
      value returned in the Scope bits, driver was using that value as timeout
      value which resulted into driver waiting for 27min timeout.
      
      This patch masks off this Scope bits so that driver does not use this
      value as retry delay time.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAnil Gurumurthy <anil.gurumurthy@cavium.com>
      Signed-off-by: default avatarGiridhar Malavali <giridhar.malavali@cavium.com>
      Signed-off-by: default avatarHimanshu Madhani <himanshu.madhani@cavium.com>
      Reviewed-by: default avatarEwan D. Milne <emilne@redhat.com>
      Reviewed-by: default avatarMartin Wilck <mwilck@suse.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9db2ad79
    • Himanshu Madhani's avatar
      scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails · 9224583a
      Himanshu Madhani authored
      commit 413c2f33 upstream.
      
      This patch prevents driver from setting lower default speed of 1 GB/sec,
      if the switch does not support Get Port Speed Capabilities (GPSC)
      command. Setting this default speed results into much lower write
      performance for large sequential WRITE.  This patch modifies driver to
      check for gpsc_supported flags and prevents driver from issuing
      MBC_SET_PORT_PARAM (001Ah) to set default speed of 1 GB/sec. If driver
      does not send this mailbox command, firmware assumes maximum supported
      link speed and will operate at the max speed.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHimanshu Madhani <himanshu.madhani@cavium.com>
      Reported-by: default avatarEda Zhou <ezhou@redhat.com>
      Reviewed-by: default avatarEwan D. Milne <emilne@redhat.com>
      Tested-by: default avatarEwan D. Milne <emilne@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9224583a
    • Sinan Kaya's avatar
      scsi: hpsa: disable device during shutdown · 2829829c
      Sinan Kaya authored
      commit 0d98ba8d upstream.
      
      'Commit cc27b735 ("PCI/portdrv: Turn off PCIe services during
      shutdown")' has been added to kernel to shutdown pending PCIe port service
      interrupts during reboot so that a newly started kexec kernel wouldn't
      observe pending interrupts.
      
      pcie_port_device_remove() is disabling the root port and switches by
      calling pci_disable_device() after all PCIe service drivers are shutdown.
      
      This has been found to cause crashes on HP DL360 Gen9 machines during
      reboot due to hpsa driver not clearing the bus master bit during the
      shutdown procedure by calling pci_disable_device().
      
      Disable device as part of the shutdown sequence.
      Signed-off-by: default avatarSinan Kaya <okaya@codeaurora.org>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=199779
      Fixes: cc27b735 ("PCI/portdrv: Turn off PCIe services during shutdown")
      Cc: stable@vger.kernel.org
      Reported-by: default avatarRyan Finnie <ryan@finnie.org>
      Tested-by: default avatarDon Brace <don.brace@microsemi.com>
      Acked-by: default avatarDon Brace <don.brace@microsemi.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2829829c
    • Dan Williams's avatar
      mm: fix __gup_device_huge vs unmap · 2d329968
      Dan Williams authored
      commit a9b6de77 upstream.
      
      get_user_pages_fast() for device pages is missing the typical validation
      that all page references have been taken while the mapping was valid.
      Without this validation truncate operations can not reliably coordinate
      against new page reference events like O_DIRECT.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 3565fce3 ("mm, x86: get_user_pages() for dax mappings")
      Reported-by: default avatarJan Kara <jack@suse.cz>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d329968
    • Christophe JAILLET's avatar
      iio: sca3000: Fix an error handling path in 'sca3000_probe()' · 5d6ad5a0
      Christophe JAILLET authored
      commit 4a5b4538 upstream.
      
      Use 'devm_iio_kfifo_allocate()' instead of 'iio_kfifo_allocate()' in order
      to simplify code and avoid a memory leak in an error path in
      'sca3000_probe()'. A call to 'sca3000_unconfigure_ring()' was missing.
      
      Sent via the next merge window as unimportant bug and there are
      other patches dependent on it.
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d6ad5a0
    • Alexandru Ardelean's avatar
      iio: adc: ad7791: remove sample freq sysfs attributes · d55209ee
      Alexandru Ardelean authored
      commit 7eb6b35d upstream.
      
      In the current state, these attributes are broken, because they are
      registered already, and the kernel throws a warning.
      The first registration happens via the `IIO_CHAN_INFO_SAMP_FREQ` flag from
      the `ad_sigma_delta` driver.
      
      In this commit these attrs are removed, and in the following the
      IIO_CHAN_INFO_SAMP_FREQ behavior will be implemented, which replaces these
      hooks.
      
      This is done to make things a bit easier to review as there is a bit of
      overlap in the patch if it's done all at once.
      
      Fixes: a13e831f ("staging: iio: ad7192: implement IIO_CHAN_INFO_SAMP_FREQ")
      Signed-off-by: default avatarAlexandru Ardelean <alexandru.ardelean@analog.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d55209ee
    • Filipe Manana's avatar
      Btrfs: fix return value on rename exchange failure · 6101eea4
      Filipe Manana authored
      commit c5b4a50b upstream.
      
      If we failed during a rename exchange operation after starting/joining a
      transaction, we would end up replacing the return value, stored in the
      local 'ret' variable, with the return value from btrfs_end_transaction().
      So this could end up returning 0 (success) to user space despite the
      operation having failed and aborted the transaction, because if there are
      multiple tasks having a reference on the transaction at the time
      btrfs_end_transaction() is called by the rename exchange, that function
      returns 0 (otherwise it returns -EIO and not the original error value).
      So fix this by not overwriting the return value on error after getting
      a transaction handle.
      
      Fixes: cdd1fedf ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT")
      CC: stable@vger.kernel.org # 4.9+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6101eea4
    • Maciej S. Szmigiero's avatar
      X.509: unpack RSA signatureValue field from BIT STRING · af20e4ec
      Maciej S. Szmigiero authored
      commit b65c32ec upstream.
      
      The signatureValue field of a X.509 certificate is encoded as a BIT STRING.
      For RSA signatures this BIT STRING is of so-called primitive subtype, which
      contains a u8 prefix indicating a count of unused bits in the encoding.
      
      We have to strip this prefix from signature data, just as we already do for
      key data in x509_extract_key_data() function.
      
      This wasn't noticed earlier because this prefix byte is zero for RSA key
      sizes divisible by 8. Since BIT STRING is a big-endian encoding adding zero
      prefixes has no bearing on its value.
      
      The signature length, however was incorrect, which is a problem for RSA
      implementations that need it to be exactly correct (like AMD CCP).
      Signed-off-by: default avatarMaciej S. Szmigiero <mail@maciej.szmigiero.name>
      Fixes: c26fd69f ("X.509: Add a crypto key parser for binary (DER) X.509 certificates")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJames Morris <james.morris@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      af20e4ec
    • Yang Yingliang's avatar
      irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node · 7dfc8199
      Yang Yingliang authored
      commit c1797b11 upstream.
      
      On a NUMA system, if an ITS is local to an offline node, the ITS driver may
      pick an offline CPU to bind the LPI.  In this case, pick an online CPU (and
      the first one will do).
      
      But on some systems, binding an LPI to non-local node CPU may cause
      deadlock (see Cavium erratum 23144).  In this case, just fail the activate
      and return an error code.
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Sumit Garg <sumit.garg@linaro.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180622095254.5906-5-marc.zyngier@arm.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7dfc8199
    • Geert Uytterhoeven's avatar
      time: Make sure jiffies_to_msecs() preserves non-zero time periods · 88c4318d
      Geert Uytterhoeven authored
      commit abcbcb80 upstream.
      
      For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of
      1000, jiffies_to_msecs() never returns zero when passed a non-zero time
      period.
      
      However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or
      1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero
      for small non-zero time periods.  This may break code that relies on
      receiving back a non-zero value.
      
      jiffies_to_usecs() does not need such a fix: one jiffy can only be less
      than one µs if HZ > 1000000, and such large values of HZ are already
      rejected at build time, twice:
      
        - include/linux/jiffies.h does #error if HZ >= 12288,
        - kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC).
      
      Broken since forever.
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarArnd Bergmann <arnd@arndb.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Stephen Boyd <sboyd@kernel.org>
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.orgSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      88c4318d
    • Huacai Chen's avatar
      MIPS: io: Add barrier after register read in inX() · 0fe95015
      Huacai Chen authored
      commit 18f3e95b upstream.
      
      While a barrier is present in the outX() functions before the register
      write, a similar barrier is missing in the inX() functions after the
      register read. This could allow memory accesses following inX() to
      observe stale data.
      
      This patch is very similar to commit a1cc7034 ("MIPS: io: Add
      barrier after register read in readX()"). Because war_io_reorder_wmb()
      is both used by writeX() and outX(), if readX() need a barrier then so
      does inX().
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Patchwork: https://patchwork.linux-mips.org/patch/19516/Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Cc: James Hogan <james.hogan@mips.com>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: Huacai Chen <chenhuacai@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0fe95015
    • Srinivas Pandruvada's avatar
      cpufreq: intel_pstate: Fix scaling max/min limits with Turbo 3.0 · 93e1297f
      Srinivas Pandruvada authored
      commit ff7c9917 upstream.
      
      When scaling max/min settings are changed, internally they are converted
      to a ratio using the max turbo 1 core turbo frequency. This works fine
      when 1 core max is same irrespective of the core. But under Turbo 3.0,
      this will not be the case. For example:
      Core 0: max turbo pstate: 43 (4.3GHz)
      Core 1: max turbo pstate: 45 (4.5GHz)
      In this case 1 core turbo ratio will be maximum of all, so it will be
      45 (4.5GHz). Suppose scaling max is set to 4GHz (ratio 40) for all cores
      ,then on core one it will be
       = max_state * policy->max / max_freq;
       = 43 * (4000000/4500000) = 38 (3.8GHz)
       = 38
      which is 200MHz less than the desired.
      On core2, it will be correctly set to ratio 40 (4GHz). Same holds true
      for scaling min frequency limit. So this requires usage of correct turbo
      max frequency for core one, which in this case is 4.3GHz. So we need to
      adjust per CPU cpu->pstate.turbo_freq using the maximum HWP ratio of that
      core.
      
      This change uses the HWP capability of a core to adjust max turbo
      frequency. But since Broadwell HWP doesn't use ratios in the HWP
      capabilities, we have to use legacy max 1 core turbo ratio. This is not
      a problem as the HWP capabilities don't differ among cores in Broadwell.
      We need to check for non Broadwell CPU model for applying this change,
      though.
      Signed-off-by: default avatarSrinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93e1297f
    • Fabio Estevam's avatar
      pinctrl: devicetree: Fix pctldev pointer overwrite · 55be2e6f
      Fabio Estevam authored
      commit bc3322bc upstream.
      
      Commit b89405b6 ("pinctrl: devicetree: Fix dt_to_map_one_config
      handling of hogs") causes the pinctrl hog pins to not get initialized
      on i.MX platforms leaving them with the IOMUX settings untouched.
      
      This causes several regressions on i.MX such as:
      
      - OV5640 camera driver can not be probed anymore on imx6qdl-sabresd
      because the camera clock pin is in a pinctrl_hog group and since
      its pinctrl initialization is skipped, the camera clock is kept
      in GPIO functionality instead of CLK_CKO function.
      
      - Audio stopped working on imx6qdl-wandboard and imx53-qsb for
      the same reason.
      
      Richard Fitzgerald explains the problem:
      
      "I see the bug. If the hog node isn't a 1st level child of the pinctrl
      parent node it will go around the for(;;) loop again but on the first
      pass I overwrite pctldev with the result of
      get_pinctrl_dev_from_of_node() so it doesn't point to the pinctrl driver
      any more."
      
      Fix the issue by stashing the original pctldev so it doesn't
      get overwritten.
      
      Fixes:  b89405b6 ("pinctrl: devicetree: Fix dt_to_map_one_config handling of hogs")
      Cc: <stable@vger.kernel.org>
      Reported-by: default avatarMika Penttilä <mika.penttila@nextfour.com>
      Reported-by: default avatarSteve Longerbeam <slongerbeam@gmail.com>
      Suggested-by: default avatarRichard Fitzgerald <rf@opensource.cirrus.com>
      Signed-off-by: default avatarFabio Estevam <fabio.estevam@nxp.com>
      Reviewed-by: default avatarDong Aisheng <aisheng.dong@nxp.com>
      Reviewed-by: default avatarRichard Fitzgerald <rf@opensource.cirrus.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55be2e6f
    • Paweł Chmiel's avatar
      pinctrl: samsung: Correct EINTG banks order · 7cc7ae5c
      Paweł Chmiel authored
      commit 5cf9a338 upstream.
      
      All banks with GPIO interrupts should be at beginning of bank array and
      without any other types of banks between them.  This order is expected
      by exynos_eint_gpio_irq, when doing interrupt group to bank translation.
      Otherwise, kernel NULL pointer dereference would happen when trying to
      handle interrupt, due to wrong bank being looked up.  Observed on
      s5pv210, when trying to handle gpj0 interrupt, where kernel was mapping
      it to gpi bank.
      
      Cc: stable@vger.kernel.org
      Fixes: 023e06df ("pinctrl: exynos: add exynos5410 SoC specific data")
      Fixes: 608a26a7 ("pinctrl: Add s5pv210 support to pinctrl-exynos)
      Signed-off-by: default avatarPaweł Chmiel <pawel.mikolaj.chmiel@gmail.com>
      Reviewed-by: default avatarTomasz Figa <tomasz.figa@gmail.com>
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7cc7ae5c
    • Randy Dunlap's avatar
      auxdisplay: fix broken menu · 9e838b2e
      Randy Dunlap authored
      commit b5b903fb upstream.
      
      Having the CHARLCD Kconfig symbol between "menuconfig AUXDISPLAY"
      and "if AUXDISPLAY" breaks the AUXDISPLAY submenus, so move the
      CHARLCD Kconfig symbol near the end of the file so that the menu
      display is continuous.
      
      Also include ARM_CHARLCD inside of the if AUXDISPLAY/endif block.
      Geert says that it should be there.
      
      Fixes: 39f8ea46 ("auxdisplay: charlcd: Extract character LCD core from misc/panel")
      
      Cc: stable@vger.kernel.org # v4.12
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: default avatarAndy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9e838b2e
    • Mika Westerberg's avatar
      PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume · 226ffbf6
      Mika Westerberg authored
      commit 13c65840 upstream.
      
      After a suspend/resume cycle the Presence Detect or Data Link Layer Status
      Changed bits might be set.  If we don't clear them those events will not
      fire anymore and nothing happens for instance when a device is now
      hot-unplugged.
      
      Fix this by clearing those bits in a newly introduced function
      pcie_reenable_notification().  This should be fine because immediately
      after, we check if the adapter is still present by reading directly from
      the status register.
      Signed-off-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      226ffbf6
    • Mika Westerberg's avatar
      PCI: Add ACS quirk for Intel 300 series · fc0096bc
      Mika Westerberg authored
      commit f154a718 upstream.
      
      Intel 300 series chipset still has the same ACS issue as the previous
      generations so extend the ACS quirk to cover it as well.
      Signed-off-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc0096bc
    • Alex Williamson's avatar
      PCI: Add ACS quirk for Intel 7th & 8th Gen mobile · 78923ba9
      Alex Williamson authored
      commit e8440f4b upstream.
      
      The specification update indicates these have the same errata for
      implementing non-standard ACS capabilities.
      Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      78923ba9
    • Sridhar Pitchai's avatar
      PCI: hv: Make sure the bus domain is really unique · e4a424c5
      Sridhar Pitchai authored
      commit 29927dfb upstream.
      
      When Linux runs as a guest VM in Hyper-V and Hyper-V adds the virtual PCI
      bus to the guest, Hyper-V always provides unique PCI domain.
      
      commit 4a9b0933 ("PCI: hv: Use device serial number as PCI domain")
      overrode unique domain with the serial number of the first device added to
      the virtual PCI bus.
      
      The reason for that patch was to have a consistent and short name for the
      device, but Hyper-V doesn't provide unique serial numbers. Using non-unique
      serial numbers as domain IDs leads to duplicate device addresses, which
      causes PCI bus registration to fail.
      
      commit 0c195567 ("netvsc: transparent VF management") avoids the need
      for commit 4a9b0933 ("PCI: hv: Use device serial number as PCI
      domain").  When scripts were used to configure VF devices, the name of
      the VF needed to be consistent and short, but with commit 0c195567
      ("netvsc: transparent VF management") all the setup is done in the kernel,
      and we do not need to maintain consistent name.
      
      Revert commit 4a9b0933 ("PCI: hv: Use device serial number as PCI
      domain") so we can reliably support multiple devices being assigned to
      a guest.
      
      Tag the patch for stable kernels containing commit 0c195567
      ("netvsc: transparent VF management").
      
      Fixes: 4a9b0933 ("PCI: hv: Use device serial number as PCI domain")
      Signed-off-by: default avatarSridhar Pitchai <sridhar.pitchai@microsoft.com>
      [lorenzo.pieralisi@arm.com: trimmed commit log]
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: stable@vger.kernel.org # v4.14+
      Reviewed-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e4a424c5
    • Tokunori Ikegami's avatar
      MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum · 43f6a09c
      Tokunori Ikegami authored
      commit 2a027b47 upstream.
      
      The erratum and workaround are described by BCM5300X-ES300-RDS.pdf as
      below.
      
        R10: PCIe Transactions Periodically Fail
      
          Description: The BCM5300X PCIe does not maintain transaction ordering.
                       This may cause PCIe transaction failure.
          Fix Comment: Add a dummy PCIe configuration read after a PCIe
                       configuration write to ensure PCIe configuration access
                       ordering. Set ES bit of CP0 configu7 register to enable
                       sync function so that the sync instruction is functional.
          Resolution:  hndpci.c: extpci_write_config()
                       hndmips.c: si_mips_init()
                       mipsinc.h CONF7_ES
      
      This is fixed by the CFE MIPS bcmsi chipset driver also for BCM47XX.
      Also the dummy PCIe configuration read is already implemented in the
      Linux BCMA driver.
      
      Enable ExternalSync in Config7 when CONFIG_BCMA_DRIVER_PCI_HOSTMODE=y
      too so that the sync instruction is externalised.
      Signed-off-by: default avatarTokunori Ikegami <ikegami@allied-telesis.co.jp>
      Reviewed-by: default avatarPaul Burton <paul.burton@mips.com>
      Acked-by: default avatarHauke Mehrtens <hauke@hauke-m.de>
      Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Cc: Rafał Miłecki <zajec5@gmail.com>
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/19461/Signed-off-by: default avatarJames Hogan <jhogan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      43f6a09c
    • Joakim Tjernlund's avatar
      mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking. · c375d0bd
      Joakim Tjernlund authored
      commit f1ce87f6 upstream.
      
      cfi_ppb_unlock() walks all flash chips when unlocking sectors,
      avoid walking chips unaffected by the unlock operation.
      
      Fixes: 1648eaaa ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJoakim Tjernlund <joakim.tjernlund@infinera.com>
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c375d0bd
    • Joakim Tjernlund's avatar
      mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary · fbbde934
      Joakim Tjernlund authored
      commit 0cd8116f upstream.
      
      The "sector is in requested range" test used to determine whether
      sectors should be re-locked or not is done on a variable that is reset
      everytime we cross a chip boundary, which can lead to some blocks being
      re-locked while the caller expect them to be unlocked.
      Fix the check to make sure this cannot happen.
      
      Fixes: 1648eaaa ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJoakim Tjernlund <joakim.tjernlund@infinera.com>
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fbbde934
    • Joakim Tjernlund's avatar
      mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips · 2f11a0c8
      Joakim Tjernlund authored
      commit 5fdfc3db upstream.
      
      cfi_ppb_unlock() tries to relock all sectors that were locked before
      unlocking the whole chip.
      This locking used the chip start address + the FULL offset from the
      first flash chip, thereby forming an illegal address. Fix that by using
      the chip offset(adr).
      
      Fixes: 1648eaaa ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJoakim Tjernlund <joakim.tjernlund@infinera.com>
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f11a0c8
    • Joakim Tjernlund's avatar
      mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock() · 80349943
      Joakim Tjernlund authored
      commit f93aa8c4 upstream.
      
      do_ppb_xxlock() fails to add chip->start when querying for lock status
      (and chip_ready test), which caused false status reports.
      Fix that by adding adr += chip->start and adjust call sites
      accordingly.
      
      Fixes: 1648eaaa ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJoakim Tjernlund <joakim.tjernlund@infinera.com>
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      80349943
    • Tokunori Ikegami's avatar
      mtd: cfi_cmdset_0002: Change write buffer to check correct value · 746c1362
      Tokunori Ikegami authored
      commit dfeae107 upstream.
      
      For the word write it is checked if the chip has the correct value.
      But it is not checked for the write buffer as only checked if ready.
      To make sure for the write buffer change to check the value.
      
      It is enough as this patch is only checking the last written word.
      Since it is described by data sheets to check the operation status.
      Signed-off-by: default avatarTokunori Ikegami <ikegami@allied-telesis.co.jp>
      Reviewed-by: default avatarJoakim Tjernlund <Joakim.Tjernlund@infinera.com>
      Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>
      Cc: linux-mtd@lists.infradead.org
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      746c1362
    • Chuck Lever's avatar
      xprtrdma: Return -ENOBUFS when no pages are available · d097e5b5
      Chuck Lever authored
      commit a8f688ec upstream.
      
      The use of -EAGAIN in rpcrdma_convert_iovs() is a latent bug: the
      transport never calls xprt_write_space() when more pages become
      available. -ENOBUFS will trigger the correct "delay briefly and call
      again" logic.
      
      Fixes: 7a89f9c6 ("xprtrdma: Honor ->send_request API contract")
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Cc: stable@vger.kernel.org # 4.8+
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d097e5b5
    • Leon Romanovsky's avatar
      RDMA/mlx4: Discard unknown SQP work requests · 786c8d79
      Leon Romanovsky authored
      commit 6b1ca7ec upstream.
      
      There is no need to crash the machine if unknown work request was
      received in SQP MAD.
      
      Cc: <stable@vger.kernel.org> # 3.6
      Fixes: 37bfc7c1 ("IB/mlx4: SR-IOV multiplex and demultiplex MADs")
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      786c8d79
    • Mike Marciniszyn's avatar
      IB/hfi1: Fix user context tail allocation for DMA_RTAIL · a3369992
      Mike Marciniszyn authored
      commit 1bc0299d upstream.
      
      The following code fails to allocate a buffer for the
      tail address that the hardware DMAs into when the user
      context DMA_RTAIL is set.
      
      if (HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL)) {
      	rcd->rcvhdrtail_kvaddr = dma_zalloc_coherent(
      		&dd->pcidev->dev, PAGE_SIZE, &dma_hdrqtail,
                      gfp_flags);
      	if (!rcd->rcvhdrtail_kvaddr)
      		goto bail_free;
      	rcd->rcvhdrqtailaddr_dma = dma_hdrqtail;
      }
      
      So the rcvhdrtail_kvaddr would then be NULL.
      
      The mmap logic fails to check for a NULL rcvhdrtail_kvaddr.
      
      The fix is to test for both user and kernel DMA_TAIL options
      during the allocation as well as testing for a NULL
      rcvhdrtail_kvaddr during the mmap processing.
      
      Additionally, all downstream testing of the capmask for DMA_RTAIL
      have been eliminated in favor of testing rcvhdrtail_kvaddr.
      
      Cc: <stable@vger.kernel.org> # 4.9.x
      Reviewed-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3369992
    • Sebastian Sanchez's avatar
      IB/hfi1: Optimize kthread pointer locking when queuing CQ entries · 964705c4
      Sebastian Sanchez authored
      commit af8aab71 upstream.
      
      All threads queuing CQ entries on different CQs are unnecessarily
      synchronized by a spin lock to check if the CQ kthread worker hasn't
      been destroyed before queuing an CQ entry.
      
      The lock used in 6efaf10f ("IB/rdmavt: Avoid queuing work into a
      destroyed cq kthread worker") is a device global lock and will have
      poor performance at scale as completions are entered from a large
      number of CPUs.
      
      Convert to use RCU where the read side of RCU is rvt_cq_enter() to
      determine that the worker is alive prior to triggering the
      completion event.
      Apply write side RCU semantics in rvt_driver_cq_init() and
      rvt_cq_exit().
      
      Fixes: 6efaf10f ("IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker")
      Cc: <stable@vger.kernel.org> # 4.14.x
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarSebastian Sanchez <sebastian.sanchez@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      964705c4
    • Michael J. Ruhl's avatar
      IB/hfi1: Reorder incorrect send context disable · 2bd28cba
      Michael J. Ruhl authored
      commit a93a0a31 upstream.
      
      User send context integrity bits are cleared before the context is
      disabled.  If the send context is still processing data, any packets
      that need those integrity bits will cause an error and halt the send
      context.
      
      During the disable handling, the driver waits for the context to drain.
      If the context is halted, the driver will eventually timeout because
      the context won't drain and then incorrectly bounce the link.
      
      Reorder the bit clearing and the context disable.
      
      Examine the software state and send context status as well as the
      egress status to determine if a send context is in the halted state.
      
      Promote the check macros to static functions for consistency with the
      new check and to follow kernel style.
      
      Remove an unused define that refers to the egress timeout.
      
      Cc: <stable@vger.kernel.org> # 4.9.x
      Reviewed-by: default avatarMitko Haralanov <mitko.haralanov@intel.com>
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2bd28cba
    • Mike Marciniszyn's avatar
      IB/hfi1: Fix fault injection init/exit issues · 9e81f9a2
      Mike Marciniszyn authored
      commit 8c79d822 upstream.
      
      There are config dependent code paths that expose panics in unload
      paths both in this file and in debugfs_remove_recursive() because
      CONFIG_FAULT_INJECTION and CONFIG_FAULT_INJECTION_DEBUG_FS can be
      set independently.
      
      Having CONFIG_FAULT_INJECTION set and CONFIG_FAULT_INJECTION_DEBUG_FS
      reset causes fault_create_debugfs_attr() to return an error.
      
      The debugfs.c routines tolerate failures, but the module unload panics
      dereferencing a NULL in the two exit routines.  If that is fixed, the
      dir passed to debugfs_remove_recursive comes from a memory location
      that was freed and potentially reused causing a segfault or corrupting
      memory.
      
      Here is an example of the NULL deref panic:
      
      [66866.286829] BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
      [66866.295602] IP: hfi1_dbg_ibdev_exit+0x2a/0x80 [hfi1]
      [66866.301138] PGD 858496067 P4D 858496067 PUD 8433a7067 PMD 0
      [66866.307452] Oops: 0000 [#1] SMP
      [66866.310953] Modules linked in: hfi1(-) rdmavt rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm iw_cm ib_cm ib_core rpcsec_gss_krb5 nfsv4 dns_resolver nfsv3 nfs fscache sb_edac x86_pkg_temp_thermal intel_powerclamp vfat fat coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel iTCO_wdt iTCO_vendor_support crypto_simd mei_me glue_helper cryptd mxm_wmi ipmi_si pcspkr lpc_ich sg mei ioatdma ipmi_devintf i2c_i801 mfd_core shpchp ipmi_msghandler wmi acpi_power_meter acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt igb fb_sys_fops ttm ahci ptp crc32c_intel libahci pps_core drm dca libata i2c_algo_bit i2c_core [last unloaded: opa_vnic]
      [66866.385551] CPU: 8 PID: 7470 Comm: rmmod Not tainted 4.14.0-mam-tid-rdma #2
      [66866.393317] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
      [66866.405252] task: ffff88084f28c380 task.stack: ffffc90008454000
      [66866.411866] RIP: 0010:hfi1_dbg_ibdev_exit+0x2a/0x80 [hfi1]
      [66866.417984] RSP: 0018:ffffc90008457da0 EFLAGS: 00010202
      [66866.423812] RAX: 0000000000000000 RBX: ffff880857de0000 RCX: 0000000180040001
      [66866.431773] RDX: 0000000180040002 RSI: ffffea0021088200 RDI: 0000000040000000
      [66866.439734] RBP: ffffc90008457da8 R08: ffff88084220e000 R09: 0000000180040001
      [66866.447696] R10: 000000004220e001 R11: ffff88084220e000 R12: ffff88085a31c000
      [66866.455657] R13: ffffffffa07c9820 R14: ffffffffa07c9890 R15: ffff881059d78100
      [66866.463618] FS:  00007f6876047740(0000) GS:ffff88085f800000(0000) knlGS:0000000000000000
      [66866.472644] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [66866.479053] CR2: 0000000000000088 CR3: 0000000856357006 CR4: 00000000001606e0
      [66866.487013] Call Trace:
      [66866.489747]  remove_one+0x1f/0x220 [hfi1]
      [66866.494221]  pci_device_remove+0x39/0xc0
      [66866.498596]  device_release_driver_internal+0x141/0x210
      [66866.504424]  driver_detach+0x3f/0x80
      [66866.508409]  bus_remove_driver+0x55/0xd0
      [66866.512784]  driver_unregister+0x2c/0x50
      [66866.517164]  pci_unregister_driver+0x2a/0xa0
      [66866.521934]  hfi1_mod_cleanup+0x10/0xaa2 [hfi1]
      [66866.526988]  SyS_delete_module+0x171/0x250
      [66866.531558]  do_syscall_64+0x67/0x1b0
      [66866.535644]  entry_SYSCALL64_slow_path+0x25/0x25
      [66866.540792] RIP: 0033:0x7f6875525c27
      [66866.544777] RSP: 002b:00007ffd48528e78 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
      [66866.553224] RAX: ffffffffffffffda RBX: 0000000001cc01d0 RCX: 00007f6875525c27
      [66866.561185] RDX: 00007f6875596000 RSI: 0000000000000800 RDI: 0000000001cc0238
      [66866.569146] RBP: 0000000000000000 R08: 00007f68757e9060 R09: 00007f6875596000
      [66866.577120] R10: 00007ffd48528c00 R11: 0000000000000206 R12: 00007ffd48529db4
      [66866.585080] R13: 0000000000000000 R14: 0000000001cc01d0 R15: 0000000001cc0010
      [66866.593040] Code: 90 0f 1f 44 00 00 48 83 3d a3 8b 03 00 00 55 48 89 e5 53 48 89 fb 74 4e 48 8d bf 18 0c 00 00 e8 9d f2 ff ff 48 8b 83 20 0c 00 00 <48> 8b b8 88 00 00 00 e8 2a 21 b3 e0 48 8b bb 20 0c 00 00 e8 0e
      [66866.614127] RIP: hfi1_dbg_ibdev_exit+0x2a/0x80 [hfi1] RSP: ffffc90008457da0
      [66866.621885] CR2: 0000000000000088
      [66866.625618] ---[ end trace c4817425783fb092 ]---
      
      Fix by insuring that upon failure from fault_create_debugfs_attr() the
      parent pointer for the routines is always set to NULL and guards added
      in the exit routines to insure that debugfs_remove_recursive() is not
      called when when the parent pointer is NULL.
      
      Fixes: 0181ce31 ("IB/hfi1: Add receive fault injection feature")
      Cc: <stable@vger.kernel.org> # 4.14.x
      Reviewed-by: default avatarMichael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9e81f9a2
    • Max Gurtovoy's avatar
      IB/isert: fix T10-pi check mask setting · c3295186
      Max Gurtovoy authored
      commit 0e12af84 upstream.
      
      A copy/paste bug (probably) caused setting of an app_tag check mask
      in case where a ref_tag check was needed.
      
      Fixes: 38a2d0d4 ("IB/isert: convert to the generic RDMA READ/WRITE API")
      Fixes: 9e961ae7 ("IB/isert: Support T10-PI protected transactions")
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Reviewed-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c3295186
    • Alex Estrin's avatar
      IB/isert: Fix for lib/dma_debug check_sync warning · 7d4aaca8
      Alex Estrin authored
      commit 763b6965 upstream.
      
      The following error message occurs on a target host in a debug build
      during session login:
      
      [ 3524.411874] WARNING: CPU: 5 PID: 12063 at lib/dma-debug.c:1207 check_sync+0x4ec/0x5b0
      [ 3524.421057] infiniband hfi1_0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x0000000000000000] [size=76 bytes]
      ......snip .....
      
      [ 3524.535846] CPU: 5 PID: 12063 Comm: iscsi_np Kdump: loaded Not tainted 3.10.0-862.el7.x86_64.debug #1
      [ 3524.546764] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015
      [ 3524.555740] Call Trace:
      [ 3524.559102]  [<ffffffffa5fe915b>] dump_stack+0x19/0x1b
      [ 3524.565477]  [<ffffffffa58a2f58>] __warn+0xd8/0x100
      [ 3524.571557]  [<ffffffffa58a2fdf>] warn_slowpath_fmt+0x5f/0x80
      [ 3524.578610]  [<ffffffffa5bf5b8c>] check_sync+0x4ec/0x5b0
      [ 3524.585177]  [<ffffffffa58efc3f>] ? set_cpus_allowed_ptr+0x5f/0x1c0
      [ 3524.592812]  [<ffffffffa5bf5cd0>] debug_dma_sync_single_for_cpu+0x80/0x90
      [ 3524.601029]  [<ffffffffa586add3>] ? x2apic_send_IPI_mask+0x13/0x20
      [ 3524.608574]  [<ffffffffa585ee1b>] ? native_smp_send_reschedule+0x5b/0x80
      [ 3524.616699]  [<ffffffffa58e9b76>] ? resched_curr+0xf6/0x140
      [ 3524.623567]  [<ffffffffc0879af0>] isert_create_send_desc.isra.26+0xe0/0x110 [ib_isert]
      [ 3524.633060]  [<ffffffffc087af95>] isert_put_login_tx+0x55/0x8b0 [ib_isert]
      [ 3524.641383]  [<ffffffffa58ef114>] ? try_to_wake_up+0x1a4/0x430
      [ 3524.648561]  [<ffffffffc098cfed>] iscsi_target_do_tx_login_io+0xdd/0x230 [iscsi_target_mod]
      [ 3524.658557]  [<ffffffffc098d827>] iscsi_target_do_login+0x1a7/0x600 [iscsi_target_mod]
      [ 3524.668084]  [<ffffffffa59f9bc9>] ? kstrdup+0x49/0x60
      [ 3524.674420]  [<ffffffffc098e976>] iscsi_target_start_negotiation+0x56/0xc0 [iscsi_target_mod]
      [ 3524.684656]  [<ffffffffc098c2ee>] __iscsi_target_login_thread+0x90e/0x1070 [iscsi_target_mod]
      [ 3524.694901]  [<ffffffffc098ca50>] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod]
      [ 3524.705446]  [<ffffffffc098ca50>] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod]
      [ 3524.715976]  [<ffffffffc098ca78>] iscsi_target_login_thread+0x28/0x60 [iscsi_target_mod]
      [ 3524.725739]  [<ffffffffa58d60ff>] kthread+0xef/0x100
      [ 3524.732007]  [<ffffffffa58d6010>] ? insert_kthread_work+0x80/0x80
      [ 3524.739540]  [<ffffffffa5fff1b7>] ret_from_fork_nospec_begin+0x21/0x21
      [ 3524.747558]  [<ffffffffa58d6010>] ? insert_kthread_work+0x80/0x80
      [ 3524.755088] ---[ end trace 23f8bf9238bd1ed8 ]---
      [ 3595.510822] iSCSI/iqn.1994-05.com.redhat:537fa56299: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
      
      The code calls dma_sync on login_tx_desc->dma_addr prior to initializing it
      with dma-mapped address.
      login_tx_desc is a part of iser_conn structure and is used only once
      during login negotiation, so the issue is fixed by eliminating
      dma_sync call for this buffer using a special case routine.
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Reviewed-by: default avatarDon Dutile <ddutile@redhat.com>
      Signed-off-by: default avatarAlex Estrin <alex.estrin@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d4aaca8
    • Erez Shitrit's avatar
      IB/mlx5: Fetch soft WQE's on fatal error state · c06f8c21
      Erez Shitrit authored
      commit 7b74a83c upstream.
      
      On fatal error the driver simulates CQE's for ULPs that rely on
      completion of all their posted work-request.
      
      For the GSI traffic, the mlx5 has its own mechanism that sends the
      completions via software CQE's directly to the relevant CQ.
      
      This should be kept in fatal error too, so the driver should simulate
      such CQE's with the specified error state in order to complete GSI QP
      work requests.
      
      Without the fix the next deadlock might appears:
              schedule_timeout+0x274/0x350
              wait_for_common+0xec/0x240
              mcast_remove_one+0xd0/0x120 [ib_core]
              ib_unregister_device+0x12c/0x230 [ib_core]
              mlx5_ib_remove+0xc4/0x270 [mlx5_ib]
              mlx5_detach_device+0x184/0x1a0 [mlx5_core]
              mlx5_unload_one+0x308/0x340 [mlx5_core]
              mlx5_pci_err_detected+0x74/0xe0 [mlx5_core]
      
      Cc: <stable@vger.kernel.org> # 4.7
      Fixes: 89ea94a7 ("IB/mlx5: Reset flow support for IB kernel ULPs")
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c06f8c21
    • Jack Morgenstein's avatar
      IB/core: Make testing MR flags for writability a static inline function · 96fb9b88
      Jack Morgenstein authored
      commit 08bb558a upstream.
      
      Make the MR writability flags check, which is performed in umem.c,
      a static inline function in file ib_verbs.h
      
      This allows the function to be used by low-level infiniband drivers.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      96fb9b88