• H. Nikolaus Schaller's avatar
    w1: omap-hdq: fix interrupt handling which did show spurious timeouts · 13db4c40
    H. Nikolaus Schaller authored
    Since
    
    commit 27d13da8 ("w1: omap-hdq: Simplify driver with PM runtime autosuspend")
    
    was applied,
    
    I did see timeouts and wrong values when reading a bq27000 connected
    to hdq of the omap3. This occurred mainly after boot but remained and
    only sometimes settled down after several reads.
    
    root@letux:~# time cat /sys/class/power_supply/bq27000-battery/uevent
    POWER_SUPPLY_NAME=bq27000-battery
    POWER_SUPPLY_STATUS=Discharging
    POWER_SUPPLY_PRESENT=1
    POWER_SUPPLY_VOLTAGE_NOW=0
    POWER_SUPPLY_CURRENT_NOW=0
    POWER_SUPPLY_CAPACITY=0
    POWER_SUPPLY_CAPACITY_LEVEL=Normal
    POWER_SUPPLY_TEMP=-2731
    POWER_SUPPLY_TIME_TO_EMPTY_NOW=0
    POWER_SUPPLY_TIME_TO_EMPTY_AVG=0
    POWER_SUPPLY_TIME_TO_FULL_NOW=0
    POWER_SUPPLY_TECHNOLOGY=Li-ion
    POWER_SUPPLY_CHARGE_FULL=0
    POWER_SUPPLY_CHARGE_NOW=0
    POWER_SUPPLY_CHARGE_FULL_DESIGN=0
    POWER_SUPPLY_CYCLE_COUNT=0
    POWER_SUPPLY_ENERGY_NOW=0
    POWER_SUPPLY_POWER_AVG=0
    POWER_SUPPLY_HEALTH=Good
    POWER_SUPPLY_MANUFACTURER=Texas Instruments
    
    real    0m15.761s
    user    0m0.001s
    sys     0m0.025s
    root@letux:~#
    
    Sometimes the effect did disappear after accessing
    the device multiple times, speed went up and results
    became correct.
    
    All this indicates that some interrupts from the hdq
    controller are lost by the driver.
    
    Enabling debugging revealed that there were spurious tx
    and rx timeouts, i.e. the driver does not always recognise
    interrupts. The main problem is that rx and tx interrupts
    share a single variable which was sometimes reset to
    0 wiping out other interrupts. And it was overwritten
    by a second interrupt, independent of whether the
    previous interrupt was already processed or not.
    
    This patch improves interrupt handling to avoid such
    races and loss of interrupt flags.
    
    The ideas are:
    * only the hdq_isr() sets bits in hdq_status
    * it does not reset any bits
    * it does wake_up() if any interrupt is pending
    * bits are only reset by the read/write/break functions
      if they were waited for
    * this makes sure that no interrupts can be lost
    * rx/tx/timeout bits are completely decoupled from each
      other (and not reset all after waiting for any of them)
    * which bits to reset is now specified by a new parameter
      to hdq_reset_irqstatus()
    * hdq_reset_irqstatus() also returns the state before
      resetting so that we can encapsulate the spinlock
    * this should now handle the case that the write and read
      are both already finished quickly before the hdq_write_byte()
      ends.
    * Or that two interrupts occur in succession before
      they are processed by the driver.
      Old code may have reset all status bits making the next
      hdq_read_byte() timeout.
    * the spinlock now always protects changing of bits in function
      hdq_reset_irqstatus() which could become a read-write-modify
      problem if the interrupt handler tries to read-modify-write
      exactly at the same moment
    * we add mutex protection also for hdq_write_byte() just to
      be safe to not to disturb a hdq_read_byte() triggered by
      some other thread/process.
    
    This patch was tested on a GTA04 and results in no
    boot problems any more. And first read after boot is now ok:
    
    root@letux:~# time cat /sys/class/power_supply/bq27000-battery/uevent
    POWER_SUPPLY_NAME=bq27000-battery
    POWER_SUPPLY_STATUS=Discharging
    POWER_SUPPLY_PRESENT=1
    POWER_SUPPLY_VOLTAGE_NOW=3970000
    POWER_SUPPLY_CURRENT_NOW=354144
    POWER_SUPPLY_CAPACITY=82
    POWER_SUPPLY_CAPACITY_LEVEL=Normal
    POWER_SUPPLY_TEMP=266
    POWER_SUPPLY_TIME_TO_EMPTY_NOW=7680
    POWER_SUPPLY_TIME_TO_EMPTY_AVG=7380
    POWER_SUPPLY_TECHNOLOGY=Li-ion
    POWER_SUPPLY_CHARGE_FULL=934856
    POWER_SUPPLY_CHARGE_NOW=763976
    POWER_SUPPLY_CHARGE_FULL_DESIGN=1233792
    POWER_SUPPLY_CYCLE_COUNT=82
    POWER_SUPPLY_ENERGY_NOW=2852840
    POWER_SUPPLY_POWER_AVG=1392840
    POWER_SUPPLY_HEALTH=Good
    POWER_SUPPLY_MANUFACTURER=Texas Instruments
    
    real    0m0.233s
    user    0m0.000s
    sys     0m0.025s
    root@letux:~#
    
    It was also tested with dev_dbg enabled and more
    printk that all activities behave correctly, especially
    hdq_write_byte(), hdq_read_byte(), omap_hdq_break().
    
    Not tested is omap_w1_triplet().
    
    Fixes: 27d13da8 ("w1: omap-hdq: Simplify driver with PM runtime autosuspend")
    Cc: stable@vger.kernel.org # v5.6+
    Signed-off-by: default avatarH. Nikolaus Schaller <hns@goldelico.com>
    Link: https://lore.kernel.org/r/68fc8623ae741878beef049273696d2377526165.1590255176.git.hns@goldelico.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    13db4c40
omap_hdq.c 17 KB