Commits · 8fd03fd17ff903abf91583344aaea2043cbccdad · nexedi / linux

04 Jan, 2018 5 commits

scsi: lpfc: fix a couple of minor indentation issues · 8fd03fd1

Colin Ian King authored Dec 22, 2017

Several statements are indented too far; fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: don't dereference localport before it has been null checked · 5c665aeb

Colin Ian King authored Dec 22, 2017

localport is being dereferenced to assign lport and then immediately
afterwards localport is being sanity checked to see if it is null.  Fix
this by dereferencing localport only after it has been null
checked.

Detected by CoverityScan, CID#1463038 ("Dereference before null check")

Fixes: 3a8cefbfc5ee ("scsi: lpfc: Beef up stat counters for debug")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
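
The check-then-dereference ordering described in this commit can be sketched in plain userspace C. The struct and function names below are hypothetical stand-ins chosen for illustration; they are not the real lpfc types or the actual patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the real lpfc structs. */
struct demo_lport {
	int stat_counter;
};

struct demo_localport {
	struct demo_lport *private;
};

/*
 * Check the pointer first, dereference it afterwards.
 * Returns -1 when there is no port to read from.
 */
int demo_read_counter(struct demo_localport *localport)
{
	struct demo_lport *lport;

	if (localport == NULL)
		return -1;

	/* Safe: localport is known to be non-NULL at this point. */
	lport = localport->private;
	if (lport == NULL)
		return -1;

	return lport->stat_counter;
}
```

The buggy shape would assign `lport = localport->private` in the declaration, ahead of the NULL test, which is the pattern a "dereference before null check" report typically points at.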

scsi: scsi_transport_fc: fix typos on 64/128 GBit define names · cc019a5a

James Smart authored Dec 21, 2017

The define names specified 64Bit/128Bit, not 64GBIT/128GBIT.  Correct
the names.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libsas: remove private hex2bin() implementation · 9ea4e076

Andy Shevchenko authored Dec 19, 2017

The function sas_parse_addr() could be easily substituted by hex2bin()
which is in kernel library code.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
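
For readers unfamiliar with the helper, the following is a userspace sketch of what a hex2bin()-style routine does: it converts `count` pairs of hex digits into `count` bytes and reports failure on a non-hex character. The name `demo_hex2bin` and the -1 error value are assumptions made for this sketch; the kernel's own hex2bin() is the authoritative implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
static int demo_hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Userspace illustration only (hypothetical name): decode 2 * count hex
 * characters from src into count bytes at dst. Returns 0 on success and
 * -1 as soon as an invalid character is met.
 */
int demo_hex2bin(uint8_t *dst, const char *src, size_t count)
{
	while (count--) {
		int hi = demo_hex_digit(*src++);
		int lo;

		if (hi < 0)
			return -1;
		lo = demo_hex_digit(*src++);
		if (lo < 0)
			return -1;
		*dst++ = (uint8_t)((hi << 4) | lo);
	}
	return 0;
}
```

An 8-byte SAS address written as 16 hex characters would be decoded with `count` set to 8.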

scsi: libiscsi: Allow sd_shutdown on bad transport · d7549412

Rafael David Tinoco authored Dec 07, 2017

If, for any reason, userland shuts down iscsi transport interfaces
before proper logouts - like when logging in to LUNs manually, without
logging out on server shutdown, or when automated scripts can't
umount/logout from logged LUNs - kernel will hang forever on its
sd_sync_cache() logic, after issuing the SYNCHRONIZE_CACHE cmd to all
still existent paths.

PID: 1 TASK: ffff8801a69b8000 CPU: 1 COMMAND: "systemd-shutdow"
 #0 [ffff8801a69c3a30] __schedule at ffffffff8183e9ee
 #1 [ffff8801a69c3a80] schedule at ffffffff8183f0d5
 #2 [ffff8801a69c3a98] schedule_timeout at ffffffff81842199
 #3 [ffff8801a69c3b40] io_schedule_timeout at ffffffff8183e604
 #4 [ffff8801a69c3b70] wait_for_completion_io_timeout at ffffffff8183fc6c
 #5 [ffff8801a69c3bd0] blk_execute_rq at ffffffff813cfe10
 #6 [ffff8801a69c3c88] scsi_execute at ffffffff815c3fc7
 #7 [ffff8801a69c3cc8] scsi_execute_req_flags at ffffffff815c60fe
 #8 [ffff8801a69c3d30] sd_sync_cache at ffffffff815d37d7
 #9 [ffff8801a69c3da8] sd_shutdown at ffffffff815d3c3c

This happens because iscsi_eh_cmd_timed_out(), the transport layer
timeout helper, would tell the queue timeout function (scsi_times_out)
to reset the request timer over and over, until the session state is
back to logged in state. Unfortunately, during server shutdown, this
might never happen again.

Another option would be "not to handle" the issue in the transport
layer. That would trigger the error handler logic, which would also need
the session state to be logged in again.

The best option in such a case is to tell upper layers that the command was
handled during the transport layer error handler helper, marking it as
DID_NO_CONNECT, which will allow completion and inform about the
problem.

After the session was marked as ISCSI_STATE_FAILED, due to the first
timeout during the server shutdown phase, all subsequent cmds will fail
to be queued, allowing upper logic to fail faster.
Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
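
The three options weighed in this commit message can be sketched as a small decision function. Everything here is an assumption made for illustration: the enum, the `system_running` flag used to detect shutdown, and the `DEMO_` constants are hypothetical and do not reproduce the actual iscsi_eh_cmd_timed_out() code.

```c
/* Hypothetical return codes for a timeout helper. */
enum demo_eh_action {
	DEMO_EH_NOT_HANDLED,	/* fall through to the error handler */
	DEMO_EH_HANDLED,	/* command is completed by the helper */
	DEMO_EH_RESET_TIMER	/* re-arm the request timer and wait */
};

/* Illustrative host-byte value; assumed for this sketch. */
#define DEMO_DID_NO_CONNECT 0x01

struct demo_cmd {
	int result;
};

/*
 * While the session is not logged in, resetting the timer only makes
 * sense if the session can still come back. During shutdown it cannot,
 * so the command is completed with a "no connect" result instead.
 */
enum demo_eh_action demo_cmd_timed_out(struct demo_cmd *cmd,
				       int session_logged_in,
				       int system_running)
{
	if (!session_logged_in) {
		if (!system_running) {
			cmd->result = DEMO_DID_NO_CONNECT << 16;
			return DEMO_EH_HANDLED;
		}
		return DEMO_EH_RESET_TIMER;
	}

	/* Logged-in handling is out of scope for this sketch. */
	return DEMO_EH_NOT_HANDLED;
}
```

Completing the command lets the waiter in the sd_sync_cache() path return with an error rather than sleeping on a timer that keeps being reset.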

21 Dec, 2017 20 commits

scsi: lpfc: correct sg_seg_cnt attribute min vs default · b996ce39

James Smart authored Dec 19, 2017

Prior patch mixed up what argument in the macro was what, so min value
was placed as the "default" argument, and the default value was placed
as the "min" argument. Thus, when the default was applied, it looked
like the default was smaller than the allowed min.

Swap argument positions to correct.

[mkp: fixed checkpatch warning]
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
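
A positional-macro mix-up of this kind is easy to reproduce. The sketch below uses a hypothetical `DEMO_ATTR(default, min, max)` macro and made-up values; it is not the lpfc attribute macro, only an illustration of how swapping two arguments yields a default that falls below the allowed minimum.

```c
/* Hypothetical attribute descriptor; not the lpfc definition. */
struct demo_attr {
	int def;
	int min;
	int max;
};

/* Positional macro: argument order is (default, min, max). */
#define DEMO_ATTR(defval, minval, maxval) \
	{ .def = (defval), .min = (minval), .max = (maxval) }

/* Swapped: the intended minimum (1) lands in the default slot. */
const struct demo_attr demo_swapped = DEMO_ATTR(1, 64, 4096);

/* Corrected: default 64, minimum 1. */
const struct demo_attr demo_fixed = DEMO_ATTR(64, 1, 4096);

/* A default is only usable if it lies inside [min, max]. */
int demo_attr_default_valid(const struct demo_attr *a)
{
	return a->def >= a->min && a->def <= a->max;
}
```

A range check like `demo_attr_default_valid()` rejects the swapped variant, which matches the symptom described: the default looked smaller than the allowed minimum.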

scsi: qla2xxx: Fix smatch warning in qla25xx_delete_{rsp|req}_que · 62aa2814

Himanshu Madhani authored Dec 16, 2017

This patch fixes the following warnings reported by smatch:

drivers/scsi/qla2xxx/qla_mid.c:586 qla25xx_delete_req_que()
error: we previously assumed 'req' could be null (see line 580)

drivers/scsi/qla2xxx/qla_mid.c:602 qla25xx_delete_rsp_que()
error: we previously assumed 'rsp' could be null (see line 596)

Fixes: 7867b98d ("scsi: qla2xxx: Fix memory leak in dual/target mode")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedi: Fix a possible sleep-in-atomic bug in qedi_process_tmf_resp · b1284588

Jia-Ju Bai authored Dec 13, 2017

The driver may sleep under a spinlock.
The function call path is:
qedi_cpu_offline (acquire the spinlock)
  qedi_fp_process_cqes
    qedi_mtask_completion
      qedi_process_tmf_resp
        kzalloc(GFP_KERNEL) --> may sleep

To fix it, GFP_KERNEL is replaced with GFP_ATOMIC.

This bug was found by my static analysis tool (DSAC) and checked by my
code review.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Acked-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: arcmsr: simplify arcmsr_request_device_map routine · 6ae9abe0

Ching Huang authored Dec 13, 2017

Simplify arcmsr_request_device_map routine.
Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: arcmsr: simplify all arcmsr_hbaX_get_config routines by calling a new get_adapter_config function · 1e9c8108

Ching Huang authored Dec 12, 2017

Simplify all arcmsr_hbaX_get_config routines by calling a new
get_adapter_config function.
Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: arcmsr: simplify arcmsr_hbaE_get_config function · 22c4ae5b

Ching Huang authored Dec 12, 2017

Simplify arcmsr_hbaE_get_config function.
Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: arcmsr: wait for iop firmware ready before issuing get_config command to iop · b6b3084a

Ching Huang authored Dec 12, 2017

Wait for the iop firmware to be ready before issuing the get_config command
to the iop for adapter types A and D.
Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: arcmsr: simplify arcmsr_hbaC_get_config function · df9f0ee9

Ching Huang authored Dec 12, 2017

Simplify arcmsr_hbaC_get_config function.
Signed-off-by: Ching Huang <ching2048@areca.com.tw>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: update driver version to 11.4.0.6 · 2f7005de

James Smart authored Dec 08, 2017

Update the driver version to 11.4.0.6
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Beef up stat counters for debug · 4b056682

James Smart authored Dec 08, 2017

If log verbose is not turned on, it's hard to tell when certain error
paths get hit. Add stats counters and corresponding logic to
debugfs/sysfs to aid understanding what paths were traversed.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix infinite wait when driver unregisters a remote NVME port. · 3fd78355

James Smart authored Dec 08, 2017

When unregistering a remote port the lpfc driver would eventually wait
for the remoteport_unreg done callback. But the driver never completed
the io aborts that would allow the connections to terminate thus the
unreg done callback was never issued. Turns out the coding style of the
driver allowed for the wait to occur on the same cpu that the deferred
isr is called on. The blocking for the wait, blocked the isr, and as the
isr didn't run, the io aborts wouldn't finish.

Turns out there was never a good reason to block waiting for the unreg
done in the first place. The driver can continue execution and the ref
counting within the driver will do the right thing.

Resolve by removing the wait and patching up a few cases where the ref
counting didn't look right - mainly cases where the remote port comes
back before the aborts had completed and the unreg done had been
called. Additionally, a few places which used pointer values to guide
driver actions weren't protected by lock, so correct those.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix issues connecting with nvme initiator · e06351a0

James Smart authored Dec 08, 2017

In the lpfc discovery engine, when acting as an nvme target, the driver
may be performing mailbox io with the adapter for port login when an NVME
PRLI is received from the host. Rather than queue and eventually get
back to sending a response after the mailbox traffic, the driver
rejected the io with an error response.

Turns out this particular initiator didn't like the rejection values
(unable to process command/command in progress) so it never attempted a
retry of the PRLI. Thus the host never established nvme connectivity
with the lpfc target.

By changing the rejection values (to Logical Busy/nothing more), the
initiator accepted the response and would retry the PRLI, resulting in
nvme connectivity.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix SCSI LUN discovery when SCSI and NVME enabled · 9de416ac

James Smart authored Dec 08, 2017

When enabled for both SCSI and NVME support, and connected pt2pt to a
SCSI only target, the driver nodelist entry for the remote port is left
in PRLI_ISSUE state and no SCSI LUNs are discovered. Works fine if only
configured for SCSI support.

Error was due to some of the prli points still reflecting the need to
send only 1 PRLI. On a lot of fabric configs, targets were NVME only,
which meant the fabric-reported protocol attributes were only telling
the driver one protocol or the other. Thus things worked fine. With
pt2pt, the driver must send a PRLI for both protocols as there are no
hints on what the target supports. Thus pt2pt targets were hitting the
multiple PRLI issues.

Complete the dual PRLI support. Track explicitly whether scsi (fcp) or
nvme prli's have been sent. Accurately track protocol support detected
on each node as reported by the fabric or probed by PRLI traffic.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Increase SCSI CQ and WQ sizes. · a51e41b6

James Smart authored Dec 08, 2017

Increased the sizes of the SCSI WQ's and CQ's so that SCSI operation is
similar to that used by NVME. However, size increase restricted only to
those newer adapters that can support the larger WQE size, thus bigger
queue sizes.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix receive PRLI handling · b95e29b7

James Smart authored Dec 08, 2017

Handling a rcv'ed PRLI incorrectly can cause the ndlp to end up in the
wrong state or the driver to ACC and PRLI when it should send LS_RJT.

The cause was due to the driver not properly looking at the PRLI type
and taking the multiple protocol support into consideration.

Resolved by adding checks in the various PRLI receive points to validate
PRLI type and reject if not valid for the enabled protocols and mode
(host vs target).
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix -EOVERFLOW behavior for NVMET and defer_rcv · cbc5de1b

James Smart authored Dec 08, 2017

The driver is all set to handle the defer_rcv api for the nvmet_fc
transport, yet didn't properly recognize the return status when the
defer_rcv occurred. The driver treated it simply as an error and aborted
the io. Several residual issues occurred at that point.

Finish the defer_rcv support: recognize the return status when the io
request is being handled in a deferred style. This stops the rogue
aborts. Also replenish the async cmd rcv buffer in the deferred receive if
needed.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix random heartbeat timeouts during heavy IO · cf1a1d3e

James Smart authored Dec 08, 2017

NVME targets appear to randomly disconnect from the initiator when
running heavy IO.

The error occurs because the host's aggregate (across all controllers) io load
was beyond the maximum exchange count for nvme on the adapter. The
driver was properly returning a resource busy status, but the io load
was so great heartbeat commands would be bounced and not have a
successful retry within the fuzz amount for the nvme heartbeat (yes, a
very high io load!). Thus the target was terminating the controller due
to a keep alive failure.

Resolve by reserving a few exchanges (by counters) which can be used
when the adapter is out of normal exchanges and the command is a NVME
heartbeat command. As counters are used, while the reserved command is
outstanding, as soon as any other exchange completes, the counters are
adjusted and the reserved count is replenished. The heartbeat completes
execution in a normal fashion.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
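
The counter-based reservation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `demo_xchg_pool` structure, its field names and the replenish-first policy are hypothetical and are not taken from the lpfc source.

```c
/* Hypothetical exchange accounting; counters only, no real resources. */
struct demo_xchg_pool {
	int normal_free;	/* exchanges available to any command */
	int reserved_free;	/* exchanges held back for heartbeats */
	int reserved_max;	/* size of the heartbeat reserve */
};

/*
 * Try to take an exchange. Returns 1 on success, 0 for "resource busy".
 * Only heartbeat commands may dip into the reserve.
 */
int demo_xchg_get(struct demo_xchg_pool *p, int is_heartbeat)
{
	if (p->normal_free > 0) {
		p->normal_free--;
		return 1;
	}
	if (is_heartbeat && p->reserved_free > 0) {
		p->reserved_free--;
		return 1;
	}
	return 0;
}

/*
 * Any completing exchange first tops the reserve back up; only when the
 * reserve is full does the exchange return to the normal pool.
 */
void demo_xchg_put(struct demo_xchg_pool *p)
{
	if (p->reserved_free < p->reserved_max)
		p->reserved_free++;
	else
		p->normal_free++;
}
```

Because only counters move, it does not matter which command finishes first: the reserve is refilled by the next completion, and the heartbeat itself completes like any other exchange.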

scsi: hisi_sas: add v3 hw suspend and resume · 4d0951ee

Xiang Chen authored Dec 09, 2017

v3 hw SAS supports changing the power state from D0 to D3 to enter
Low Power status, and from D3 to D0 to exit Low Power status.

When the power state changes from D0 to D3, HW will send FLR to clear the
registers of ECAM and BAR space, and when it changes from D3 to D0, it will
clear the registers of ECAM space only.

So on suspend, we need to do the same as a controller reset (including
disabling interrupts/DQ/PHY/BUS), and also release slots after FLR. On resume,
re-config the registers of BAR space.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: re-add the lldd_port_deformed() · 336bd78b

Xiang Chen authored Dec 09, 2017

Function sas_suspend_devices() requires callback lldd_port_deformed to be
implemented if lldd_port_formed is implemented.

So add a stub for lldd_port_deformed.

Callback lldd_port_deformed was not required as the port deformation is done
elsewhere in the LLDD.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: fix SAS_QUEUE_FULL problem while running IO · 9960a24a

Xiang Chen authored Dec 09, 2017

This patch fixes a SAS_QUEUE_FULL problem. The test situation is closing a port
while running IO.

In sas_eh_handle_sas_errors(), SCSI EH will free sas_task of the device if
lldd_I_T_nexus_reset() returns TMF_RESP_FUNC_COMPLETE or -ENODEV.  But in our
SAS driver, we only free slots of the device when the return value is
TMF_RESP_FUNC_COMPLETE. So if the return value is -ENODEV, the slot resource
will never be freed.

As a solution, we should also free slots of the device in
lldd_I_T_nexus_reset() if the return value is -ENODEV.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
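
The return-code condition at the heart of this fix can be sketched as a predicate. The function name and the `DEMO_` constant are hypothetical and used only for illustration; the driver's real code works on its own slot lists.

```c
#include <errno.h>

/* Illustrative value for a "function complete" TMF response (assumed). */
#define DEMO_TMF_RESP_FUNC_COMPLETE 0

/*
 * Slots must be released in both cases where the upper layer frees the
 * task: a completed reset, and a device that is already gone.
 */
int demo_should_release_slots(int nexus_reset_rc)
{
	return nexus_reset_rc == DEMO_TMF_RESP_FUNC_COMPLETE ||
	       nexus_reset_rc == -ENODEV;
}
```

Before the fix, the equivalent condition covered only the first case, so slots belonging to a vanished device stayed allocated and the queue eventually filled up.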

15 Dec, 2017 15 commits

scsi: hisi_sas: add internal abort dev in some places · 2a038131

Xiaofei Tan authored Dec 09, 2017

We should do internal abort dev before TMF_ABORT_TASK_SET and TMF_LU_RESET,
because we may only have done internal abort for a single IO in the earlier part
of the SCSI EH process. Even for the internal abort of that single IO, we don't
know whether it was successful.

Besides, we should release slots of the device in hisi_sas_abort_task_set() if
the abort is successful.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: judge result of internal abort · 813709f2

Xiaofei Tan authored Dec 09, 2017

Normally, hardware should ensure that an internal abort timeout will never
happen. If it does happen, it would be an SoC failure. What's more, HW will not
process any other commands if an internal abort hasn't returned a CQ, and they
will time out as well.

So, we should judge the result of the internal abort in SCSI EH; if it failed,
we should give up doing TMF/softreset and return failure to the upper layer
directly.

This patch does the following things to achieve this:

1. When an internal abort timeout happens, we set the return value to -EIO in
   hisi_sas_internal_task_abort().

2. If prep_abort() is not supported, let hisi_sas_internal_task_abort() return
   TMF_RESP_FUNC_FAILED.

3. If hisi_sas_internal_task_abort() returns a negative number, it can be
   assumed that it was not executed properly or that the internal abort timed
   out. Then we won't do the subsequent TMF or softreset, and return failure
   directly.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
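
The three rules listed in this commit can be sketched as two small functions. All names and the `DEMO_` constants are hypothetical; the sketch only mirrors the return-value convention described in the message, not the driver's actual control flow.

```c
#include <errno.h>

/* Illustrative TMF response codes (assumed values for this sketch). */
#define DEMO_TMF_RESP_FUNC_COMPLETE 0
#define DEMO_TMF_RESP_FUNC_FAILED   5

/*
 * Model of the internal-abort result:
 *  - no prep_abort() support -> a positive "failed" response (rule 2)
 *  - the abort timed out     -> -EIO (rule 1)
 *  - otherwise               -> complete
 */
int demo_internal_task_abort(int prep_abort_supported, int timed_out)
{
	if (!prep_abort_supported)
		return DEMO_TMF_RESP_FUNC_FAILED;
	if (timed_out)
		return -EIO;
	return DEMO_TMF_RESP_FUNC_COMPLETE;
}

/*
 * Rule 3: a negative result means the abort did not run properly or
 * timed out, so error handling stops and reports failure instead of
 * moving on to a TMF or soft reset. Returns 1 to continue, 0 to give up.
 */
int demo_eh_should_continue(int abort_rc)
{
	return abort_rc >= 0;
}
```

Keeping the timeout as a negative errno and the unsupported case as a positive TMF response lets callers tell "hardware is stuck" apart from "try the next recovery step" with a single sign check.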

scsi: hisi_sas: do link reset for some CHL_INT2 ints · 057c3d1f

Xiaofei Tan authored Dec 09, 2017

We should do a link reset of the PHY on an identify timeout or an STP link
timeout. They are internal events of the SoC and are notified to the driver
through interrupts of CHL_INT2.

Besides, we should add a delayed work to do the link reset as it needs to sleep.
So, this patch adds a new PHY event HISI_PHYE_LINK_RESET for this.

Note: v2 HW doesn't report the event of STP link timeout.  So, we only need
to handle the event of identify timeout for v2 HW.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: use a general way to delay PHY work · e537b62b

Xiaofei Tan authored Dec 09, 2017

Use a general way to do delayed work for a PHY. Then it will be easier to add
new delayed work for a PHY in future.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: add v2 hw port AXI error handling support · 72f7fc30

Xiaofei Tan authored Dec 09, 2017

Add port AXI error handling for v2 hw. We do a host controller reset for such
errors.

Besides, change port multi-bit ECC error handling, as we should also do a host
reset for such errors. So, this patch puts them in the same struct as the port
AXI errors.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: improve int_chnl_int_v2_hw() consistency with v3 hw · f64715d2

Xiaofei Tan authored Dec 09, 2017

Change the code format of int_chnl_int_v2_hw() to be consistent with v3 hw and
to remove one level of indentation.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: add some print to enhance debugging · f1c88211

Xiang Chen authored Dec 09, 2017

Add some prints in some places, such as error info and cq of exception IO,
device found etc, and also adjust some log levels.

All this is to assist debugging.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: add RAS feature for v3 hw · 1aaf81e0

Xiaofei Tan authored Dec 09, 2017

We use PCIe AER to support the RAS feature for v3 hw.  This driver should do
the following two things to support this:

1. Enable RAS interrupts, so that errors can be reported to the RAS module.

2. Implement err_handler for sas_v3_pci_driver. Then if a non-fatal error is
   detected, print the error source and try to recover the SAS controller.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: change ncq process for v3 hw · 9f347b2f

Xiang Chen authored Dec 09, 2017

For v3 hw, each NCQ will return a CQ, so there is no need to acquire IPTT from
ITCT; just acquire it from the IPTT field of the CQ.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: add a mechanism to do reset work synchronously · e402acdb

Xiaofei Tan authored Dec 09, 2017

Sometimes it is required to know when the controller reset has completed and
also if it has completed successfully.  For such places, we previously called
hisi_sas_controller_reset() directly. That may lead to multiple calls
to this function.

This patch creates a per-reset structure which contains a completion structure
and status flag to know when the reset completes and also the status. The reset
work is also done in hisi_hba.wq.

As all host reset works are done in hisi_hba.wq, we need not worry about
multiple calls to hisi_sas_controller_reset().
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: modify hisi_sas_dev_gone() for reset · f8e45ec2

Xiang Chen authored Dec 09, 2017

Make a couple of changes for when HISI_SAS_RESET_BIT is set for HBA:

 - Clearing ITCT is not necessary

 - Remove internal abort as it will fail during reset

Flag sas_dev->dev_type is kept as SAS_PHY_UNUSED.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: some optimizations of host controller reset · fb51e7a8

Xiaofei Tan authored Dec 09, 2017

This patch makes the following optimizations to host controller reset:

1. Unblock scsi requests before rescanning topology, as SCSI commands need to
   be used if a new device is found during rescanning topology.

2. Remove drain_workqueue(hisi_hba->wq) and drain_workqueue(shost->work_q), as
   there is no need to ensure that all PHY events are done before exiting host
   reset.

3. Improve message print level of host reset. Host reset is an important and
   rarely occurring event. We should know its progress even when not
   debugging.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: optimise port id refresh function · a669bdbf

Xiaofei Tan authored Dec 09, 2017

Currently refreshing the PHY port id after reset is done in the rescan
topology function, which is quite late in the reset process. It could be moved
earlier in the process, as the port id can be refreshed once the PHYs become
ready.

In addition to this, we should set the hisi_sas_dev port id to 0xff (invalid
port id) if all PHYs of this port remain down for the same device.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: relocate clearing ITCT and freeing device · 0258141a

Xiaofei Tan authored Dec 09, 2017

In certain scenarios we may just want to clear the ITCT for a device, and not
free other resources like the SATA bitmap used in v2 hw.

To facilitate this, this patch relocates the code of clearing ITCT from
free_device() to a new hw interface clear_itct().  Then for some hw, we need
not implement free_device() if there's nothing left to do for it.

[mkp: typo]
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: fix dma_unmap_sg() parameter · dc1e4730

Xiang Chen authored Dec 09, 2017

For function dma_unmap_sg(), the <nents> parameter should be number of
elements in the scatterlist prior to the mapping, not after the mapping.

Fix this usage.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
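
The rule about the `nents` parameter can be illustrated with a userspace mock. The `demo_` functions below are hypothetical and touch no DMA API; they only model the fact that mapping may return fewer entries than were passed in (for example when adjacent entries are merged), while unmapping expects the original count.

```c
/* Mock scatterlist bookkeeping; illustration only, no real DMA involved. */
struct demo_sgl {
	int orig_nents;		/* entries handed to the map call */
	int mapped_nents;	/* entries the map call reported back */
};

/*
 * Mock map: pretend adjacent entries are merged pairwise, so the
 * returned count can be smaller than the count passed in.
 */
int demo_map_sg(struct demo_sgl *s, int nents)
{
	s->orig_nents = nents;
	s->mapped_nents = (nents + 1) / 2;
	return s->mapped_nents;
}

/*
 * Mock unmap: succeeds (0) only when given the pre-mapping count, and
 * returns -1 when handed the post-mapping count instead.
 */
int demo_unmap_sg(const struct demo_sgl *s, int nents)
{
	return nents == s->orig_nents ? 0 : -1;
}
```

A driver therefore has to keep the original element count around for the unmap call, and use the returned count only for programming the hardware.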
