    mlx5: wrong page mask if CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for 32Bit architectures · 59d2d18c
    Honggang LI authored
    If CONFIG_ARCH_DMA_ADDR_T_64BIT is enabled on a 32-bit x86 system with
    more than 4GB of physical memory, dma_map_page() may return a valid
    address that is greater than 0xffffffff. As a result, the mlx5 device
    page allocator RB tree will hold valid addresses greater than
    0xffffffff.
    
    However, (addr & PAGE_MASK) clears the upper four bytes of the address,
    because PAGE_MASK is only 32 bits wide on these architectures. The
    function free_4k() therefore can never find, and never releases, pages
    whose addresses are above 4GB, so memory leaks. The mlx5_ib module also
    cannot release these pages when the user tries to remove the module,
    and the system hangs.
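
The truncation described above can be reproduced in user space. The following is a minimal sketch, not the kernel code: `SKETCH_PAGE_SHIFT`, `sketch_page_mask_32`, `sketch_page_mask_64`, `mask_narrow()` and `mask_wide()` are hypothetical names, a 4KB page size is assumed, and `uint32_t` stands in for the 32-bit `unsigned long` that PAGE_MASK has on a 32-bit architecture. The 64-bit-wide mask is shown only as one way to keep the upper bits.

```c
#include <stdint.h>

/* Assumption: 4KB pages, as on x86. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for PAGE_MASK on a 32-bit architecture:
 * the mask is only 32 bits wide (0xfffff000).
 */
static const uint32_t sketch_page_mask_32 =
	~((UINT32_C(1) << SKETCH_PAGE_SHIFT) - 1);

/* A mask built from a 64-bit value keeps the upper 32 bits set. */
static const uint64_t sketch_page_mask_64 =
	~UINT64_C(0) << SKETCH_PAGE_SHIFT;

/*
 * The 32-bit mask is zero-extended to 64 bits before the AND,
 * so bits 32..63 of the address are cleared.
 */
static uint64_t mask_narrow(uint64_t addr)
{
	return addr & sketch_page_mask_32;
}

/* The 64-bit mask clears only the in-page offset bits. */
static uint64_t mask_wide(uint64_t addr)
{
	return addr & sketch_page_mask_64;
}
```

With the address from the log below, `mask_narrow(0x3fe384000)` yields `0xfe384000`, which matches the truncated value printed by dmesg, while `mask_wide(0x3fe384000)` leaves the address unchanged.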
    
    [root@rdma05 root]# dmesg  | grep addr | head
    addr             = 3fe384000
    addr & PAGE_MASK =  fe384000
    [root@rdma05 root]# rmmod mlx5_ib   <---- hangs here
    
    ---------------------- console log -----------------
    mlx5_ib 0000:04:00.0: irq 138 for MSI/MSI-X
      alloc irq_desc for 139 on node -1
      alloc kstat_irqs on node -1
    mlx5_ib 0000:04:00.0: irq 139 for MSI/MSI-X
    0000:04:00.0:free_4k:221:(pid 1519): page not found
    0000:04:00.0:free_4k:221:(pid 1519): page not found
    0000:04:00.0:free_4k:221:(pid 1519): page not found
    0000:04:00.0:free_4k:221:(pid 1519): page not found
    ---------------------- console log -----------------
    
    Fixes: bf0bf77f ('mlx5: Support communicating arbitrary host page size to firmware')
    Signed-off-by: Honggang Li <honli@redhat.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>
pagealloc.c 12.3 KB