    scsi: qla2xxx: Reserve extra IRQ vectors · f02d4086
    Roman Bolshakov authored
    Commit a6dcfe08 ("scsi: qla2xxx: Limit interrupt vectors to number of
    CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs.
    
    That breaks vector allocation assumptions in qla83xx_iospace_config(),
    qla24xx_enable_msix() and qla2x00_iospace_config(). Each of these functions
    computes the maximum number of qpairs as:
    
      ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default
                       response queue) - 1 (ATIO, in dual or pure target mode)
    
    With two CPUs and initiator mode, msix_count is 2 and max_qpairs is set to
    zero. The number is then used to allocate ha->queue_pair_map inside
    qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is
    left NULL, but the driver still thinks there are queue pairs available.
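
    The arithmetic above can be sketched as follows. This is an illustration
    only: max_qpairs_for() and its parameters are hypothetical and not part of
    the driver, and clamping negative results to zero is an assumption made to
    keep the sketch well-defined.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a qla2xxx function): mirrors the formula quoted
 * in the commit message for the number of usable queue pairs.
 */
static int max_qpairs_for(int msix_count, bool target_mode)
{
	int reserved = 2;	/* MB interrupt + default response queue */
	int n;

	if (target_mode)
		reserved += 1;	/* ATIO queue, dual or pure target mode */

	n = msix_count - reserved;

	/* Assumption: treat a negative result as "no queue pairs". */
	return n > 0 ? n : 0;
}
```

    With msix_count limited to two (two CPUs) and initiator mode, the helper
    returns 0, which is the case that leaves ha->queue_pair_map unallocated.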
    
    qla2xxx_queuecommand() tries to find a qpair in the map and crashes:
    
      if (ha->mqenable) {
              uint32_t tag;
              uint16_t hwq;
              struct qla_qpair *qpair = NULL;
    
              tag = blk_mq_unique_tag(cmd->request);
              hwq = blk_mq_unique_tag_to_hwq(tag);
              qpair = ha->queue_pair_map[hwq]; /* <- HERE */
    
              if (qpair)
                      return qla2xxx_mqueuecommand(host, cmd, qpair);
      }
    
      BUG: kernel NULL pointer dereference, address: 0000000000000000
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not-present page
      PGD 0 P4D 0
      Oops: 0000 [#1] SMP PTI
      CPU: 0 PID: 72 Comm: kworker/u4:3 Tainted: G        W         5.10.0-rc1+ #25
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
      Workqueue: scsi_wq_7 fc_scsi_scan_rport [scsi_transport_fc]
      RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx]
      Call Trace:
       scsi_queue_rq+0x58c/0xa60
       blk_mq_dispatch_rq_list+0x2b7/0x6f0
       ? __sbitmap_get_word+0x2a/0x80
       __blk_mq_sched_dispatch_requests+0xb8/0x170
       blk_mq_sched_dispatch_requests+0x2b/0x50
       __blk_mq_run_hw_queue+0x49/0xb0
       __blk_mq_delay_run_hw_queue+0xfb/0x150
       blk_mq_sched_insert_request+0xbe/0x110
       blk_execute_rq+0x45/0x70
       __scsi_execute+0x10e/0x250
       scsi_probe_and_add_lun+0x228/0xda0
       __scsi_scan_target+0xf4/0x620
       ? __pm_runtime_resume+0x4f/0x70
       scsi_scan_target+0x100/0x110
       fc_scsi_scan_rport+0xa1/0xb0 [scsi_transport_fc]
       process_one_work+0x1ea/0x3b0
       worker_thread+0x28/0x3b0
       ? process_one_work+0x3b0/0x3b0
       kthread+0x112/0x130
       ? kthread_park+0x80/0x80
       ret_from_fork+0x22/0x30
    
    The driver should allocate enough vectors to give every CPU its own HW
    queue and still handle the reserved (MB, RSP, ATIO) interrupts.
    
    The change fixes the crash on a dual-core VM and prevents unbalanced QP
    allocation where nr_hw_queues is two less than the number of CPUs.
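
    A minimal sketch of the intended vector budget, assuming the fix amounts
    to requesting one vector per online CPU plus the reserved ones, capped by
    what the adapter supports. vectors_to_request() and its parameters are
    hypothetical names used for illustration, not the driver's code.

```c
/*
 * Hypothetical helper (not a qla2xxx function): how many MSI-X vectors to
 * request so that every CPU can get its own HW queue while the reserved
 * interrupts (MB, RSP and, in target mode, ATIO) are still covered.
 */
static unsigned int vectors_to_request(unsigned int hw_msix_count,
				       unsigned int online_cpus,
				       unsigned int reserved_vecs)
{
	unsigned int wanted = online_cpus + reserved_vecs;

	/* Never ask for more vectors than the adapter exposes. */
	return wanted < hw_msix_count ? wanted : hw_msix_count;
}
```

    For two CPUs, two reserved vectors and an adapter exposing 32 vectors, the
    helper asks for 4, which leaves two vectors for queue pairs instead of
    zero.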
    
    Link: https://lore.kernel.org/r/20210412165740.39318-1-r.bolshakov@yadro.com
    Fixes: a6dcfe08 ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs")
    Cc: Daniel Wagner <daniel.wagner@suse.com>
    Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
    Cc: Quinn Tran <qutran@marvell.com>
    Cc: Nilesh Javali <njavali@marvell.com>
    Cc: Martin K. Petersen <martin.petersen@oracle.com>
    Cc: stable@vger.kernel.org # 5.11+
    Reported-by: Aleksandr Volkov <a.y.volkov@yadro.com>
    Reported-by: Aleksandr Miloserdov <a.miloserdov@yadro.com>
    Reviewed-by: Daniel Wagner <dwagner@suse.de>
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
qla_isr.c 114 KB