    ehea: Fix memory hook reference counting crashes · 3051f392
    Michael Ellerman authored
    The recent commit to only register the EHEA memory hotplug hooks on
    adapter probe has a few problems.
    
    Firstly, the reference counting is wrong for multiple adapters, in that
    the hooks are registered multiple times. Secondly, the check in the
    teardown path is backward. Finally, the error path doesn't decrement the
    count.
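
    The three problems above can be illustrated with a small userspace
    sketch. This is not the actual patch: all names (hooks_get, hooks_put,
    adapter_probe, adapter_remove, and the counters) are hypothetical
    stand-ins, and the sketch is single-threaded, so it ignores whatever
    locking or atomics a real driver would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's state; not the real ehea code. */
static int hooks_registered; /* how many times the hooks are currently registered */
static int hook_users;       /* number of adapters holding a reference */

static int register_hooks(void)    { hooks_registered++; return 0; }
static void unregister_hooks(void) { hooks_registered--; }

/* Take a reference; register the hooks only on the 0 -> 1 transition. */
static int hooks_get(void)
{
    int ret;

    if (hook_users++ > 0)
        return 0;               /* already registered by an earlier adapter */

    ret = register_hooks();
    if (ret)
        hook_users--;           /* registration failed: undo the increment */
    return ret;
}

/* Drop a reference; unregister only on the 1 -> 0 transition. */
static void hooks_put(void)
{
    assert(hook_users > 0);
    if (--hook_users == 0)
        unregister_hooks();
}

/* Probe one adapter; setup_fails simulates a later failure in probe. */
static int adapter_probe(bool setup_fails)
{
    int ret = hooks_get();

    if (ret)
        return ret;
    if (setup_fails) {
        hooks_put();            /* error path must drop the reference */
        return -1;
    }
    return 0;
}

/* Remove one adapter and release its reference on the hooks. */
static void adapter_remove(void)
{
    hooks_put();
}
```

    With this shape, probing two adapters registers the hooks once, a
    failed probe leaves the count unchanged, and the hooks are only
    unregistered when the last adapter is removed.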
    
    The multiple registration of the hooks is the biggest problem, as it
    leads to oopses when the system is rebooted, and/or errors during memory
    hotplug, e.g.:
    
      $ ./mem-on-off-test.sh -r 2
      ...
      ehea: memory is going offline
      ehea: LPAR memory changed - re-initializing driver
      ehea: re-initializing driver complete
      ehea: memory is going offline
      ehea: LPAR memory changed - re-initializing driver
      ehea: opcode=26c ret=fffffffffffffffc arg1=8000000003000003 arg2=0 arg3=700000060000d600 arg4=3fded0000 arg5=200 arg6=0 arg7=0
      ehea: register_rpage_mr failed
      ehea: registering mr failed
      ehea: register MR failed - driver inoperable!
      ehea: memory is going offline
    
    Fixes: aa183323 ("ehea: Register memory hotplug, reboot and crash hooks on adapter probe")
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
ehea_main.c 84.7 KB