• Anthony DeRossi's avatar
    drm/ttm: Fix TTM page pool accounting · ca63d76f
    Anthony DeRossi authored
    Freed pages are not subtracted from the allocated_pages counter in
    ttm_pool_type_fini(), causing a leak in the count on device removal.
    The next shrinker invocation loops forever trying to free pages that are
    no longer in the pool:
    
      rcu: INFO: rcu_sched self-detected stall on CPU
      rcu:  3-....: (9998 ticks this GP) idle=54e/1/0x4000000000000000 softirq=434857/434857 fqs=2237
        (t=10001 jiffies g=2194533 q=49211)
      NMI backtrace for cpu 3
      CPU: 3 PID: 1034 Comm: kswapd0 Tainted: P           O      5.11.0-com #1
      Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019
      Call Trace:
       <IRQ>
       ...
       </IRQ>
       sysvec_apic_timer_interrupt+0x77/0x80
       asm_sysvec_apic_timer_interrupt+0x12/0x20
      RIP: 0010:mutex_unlock+0x16/0x20
      Code: e7 48 8b 70 10 e8 7a 53 77 ff eb aa e8 43 6c ff ff 0f 1f 00 65 48 8b 14 25 00 6d 01 00 31 c9 48 89 d0 f0 48 0f b1 0f 48 39 c2 <74> 05 e9 e3 fe ff ff c3 66 90 48 8b 47 20 48 85 c0 74 0f 8b 50 10
      RSP: 0018:ffffbdb840797be8 EFLAGS: 00000246
      RAX: ffff9ff445a41c00 RBX: ffffffffc02a9ef8 RCX: 0000000000000000
      RDX: ffff9ff445a41c00 RSI: ffffbdb840797c78 RDI: ffffffffc02a9ac0
      RBP: 0000000000000080 R08: 0000000000000000 R09: ffffbdb840797c80
      R10: 0000000000000000 R11: fffffffffffffff5 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000084 R15: ffffffffc02a9a60
       ttm_pool_shrink+0x7d/0x90 [ttm]
       ttm_pool_shrinker_scan+0x5/0x20 [ttm]
       do_shrink_slab+0x13a/0x1a0
    ...
    
    debugfs shows the incorrect total:
    
      $ cat /sys/kernel/debug/dri/0/ttm_page_pool
                --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10---
      wc      :        0        0        0        0        0        0        0        0        0        0        0
      uc      :        0        0        0        0        0        0        0        0        0        0        0
      wc 32   :        0        0        0        0        0        0        0        0        0        0        0
      uc 32   :        0        0        0        0        0        0        0        0        0        0        0
      DMA uc  :        0        0        0        0        0        0        0        0        0        0        0
      DMA wc  :        0        0        0        0        0        0        0        0        0        0        0
      DMA     :        0        0        0        0        0        0        0        0        0        0        0
    
      total   :     3029 of  8244261
    
    Using ttm_pool_type_take() to remove pages from the pool before freeing
    them correctly accounts for the freed pages.
    
    Fixes: d099fc8f ("drm/ttm: new TT backend allocation pool v3")
    Signed-off-by: default avatarAnthony DeRossi <ajderossi@gmail.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20210303011723.22512-1-ajderossi@gmail.comReviewed-by: default avatarChristian König <christian.koenig@amd.com>
    Signed-off-by: default avatarChristian König <christian.koenig@amd.com>
    Signed-off-by: default avatarMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
    ca63d76f
ttm_pool.c 17.1 KB