    powerpc/powernv/elog: Fix race while processing OPAL error log event. · aea948bb
    Mahesh Salgaonkar authored
    Every error log reported by OPAL is exported to userspace through a
    sysfs interface and notified using kobject_uevent(). The userspace
    daemon (opal_errd) then reads the error log and acknowledges that it
    has been saved safely to disk. Once it is acknowledged, the kernel
    removes the corresponding sysfs file entry, which releases the
    associated resources, including the kobject.
    
    However, the userspace daemon may already be scanning elog entries
    when the kernel creates a new sysfs elog entry. The daemon may read
    this new entry and ack it before the kernel can notify userspace
    about it through the kobject_uevent() call. If that happens, there is
    a potential race between elog_ack_store->kobject_put() and
    kobject_uevent(), which can lead to a use-after-free of a kernfs
    object, resulting in a kernel crash, e.g.:
    
      BUG: Unable to handle kernel data access on read at 0x6b6b6b6b6b6b6bfb
      Faulting instruction address: 0xc0000000008ff2a0
      Oops: Kernel access of bad area, sig: 11 [#1]
      LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
      CPU: 27 PID: 805 Comm: irq/29-opal-elo Not tainted 5.9.0-rc2-gcc-8.2.0-00214-g6f56a67bcbb5-dirty #363
      ...
      NIP kobject_uevent_env+0xa0/0x910
      LR  elog_event+0x1f4/0x2d0
      Call Trace:
        0x5deadbeef0000122 (unreliable)
        elog_event+0x1f4/0x2d0
        irq_thread_fn+0x4c/0xc0
        irq_thread+0x1c0/0x2b0
        kthread+0x1c4/0x1d0
        ret_from_kernel_thread+0x5c/0x6c
    
    This patch fixes the race by holding a reference on the kobject
    across sysfs file creation and notification, until kobject_uevent()
    has been sent safely.
    
    The function create_elog_obj() returns the elog object, which, if
    used by the caller, would reintroduce the use-after-free. However,
    the return value of create_elog_obj() is not used today and there is
    no need for it. Hence change the function to return void to make
    this fix complete.
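
    The ordering problem and the fix can be illustrated with a small
    userspace model. This is only a sketch, not the kernel code: struct
    toy_obj and the toy_*() and create_*() helpers are hypothetical
    stand-ins for the kobject, kobject_get()/kobject_put(), the ack path
    and kobject_uevent(), and the race is modelled by a flag rather than
    by real concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a refcounted kobject (hypothetical, not struct kobject). */
struct toy_obj {
	int refcount;
	bool freed;    /* set instead of freeing memory, so it can be inspected */
	bool notified;
};

void toy_init(struct toy_obj *o)
{
	o->refcount = 1;   /* the creator's reference */
	o->freed = false;
	o->notified = false;
}

void toy_get(struct toy_obj *o)
{
	o->refcount++;
}

void toy_put(struct toy_obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount == 0)
		o->freed = true;
}

/* Models the userspace ack: drops the reference that keeps the file alive. */
void toy_ack(struct toy_obj *o)
{
	toy_put(o);
}

/* Models the notification; returns false if the object was already freed. */
bool toy_notify(struct toy_obj *o)
{
	if (o->freed)
		return false;   /* this is where a use-after-free would occur */
	o->notified = true;
	return true;
}

/* Old ordering: one reference, so an early ack frees the object. */
bool create_buggy(struct toy_obj *o, bool ack_races_in)
{
	toy_init(o);
	/* the file becomes visible to userspace here */
	if (ack_races_in)
		toy_ack(o);
	return toy_notify(o);
}

/* Fixed ordering: an extra reference is held until notification is done. */
bool create_fixed(struct toy_obj *o, bool ack_races_in)
{
	bool ok;

	toy_init(o);
	toy_get(o);             /* reference taken on behalf of the file */
	/* the file becomes visible to userspace here */
	if (ack_races_in)
		toy_ack(o);     /* drops only the file's reference */
	ok = toy_notify(o);     /* creator's reference is still held */
	toy_put(o);             /* drop the creator's reference */
	return ok;
}
```

    In this model the early ack in create_fixed() cannot free the
    object, because the creator still holds its own reference while
    notifying; the last put, whichever side performs it, releases the
    object.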
    
    Fixes: 774fea1a ("powerpc/powernv: Read OPAL error log and export it through sysfs")
    Cc: stable@vger.kernel.org # v3.15+
    Reported-by: Oliver O'Halloran <oohall@gmail.com>
    Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
    Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
    [mpe: Rework the logic to use a single return, reword comments, add oops]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20201006122051.190176-1-mpe@ellerman.id.au
opal-elog.c 7.68 KB