    drm/radeon: Handle irqs only based on irq ring, not irq status regs. · 07f18f0b
    Mario Kleiner authored
    Trying to resolve issues with missed vblanks and impossible
    values inside delivered kms pageflip completion events showed
    that radeon's irq handling sometimes doesn't handle valid irqs,
    but silently skips them. This was observed for vblank interrupts.
    
    Although those irqs have corresponding events queued in the gpu's
    irq ring at time of interrupt, and therefore the corresponding
    handling code gets triggered by these events, the handling code
    sometimes silently skipped processing the irq. The reason for those
    skips is that the handling code double-checks for each irq event if
    the corresponding irq status bits in the irq status registers
    are set. Sometimes those bits are not set at time of check
    for valid irqs, maybe due to some hardware race on some setups?
    
    The problem only seems to happen on some machine + card combos,
    and only sometimes. E.g., it never happened during my testing of
    different PC cards of the DCE-2/3/4 generation a year ago, but it
    happens consistently now on two different Apple Mac cards (RV730,
    DCE-3, in an Apple iMac and Evergreen JUNIPER, DCE-4, in an Apple
    MacPro). It also doesn't happen at each interrupt, but only
    occasionally, every couple of hundred or thousand vblank interrupts.
    
    This results in Xorg warning messages like
    
    "[  7084.472] (WW) RADEON(0): radeon_dri2_flip_event_handler:
    Pageflip completion event has impossible msc 420120 < target_msc 420121"
    
    as well as skipped frames and problems for applications that
    use kms pageflip events or vblank events, e.g., users of DRI2 and
    DRI3/Present, Wayland's Weston compositor, etc. See also
    
    https://bugs.freedesktop.org/show_bug.cgi?id=85203
    
    After some talking to Alex and Michel, we decided to fix this
    by turning the double-check for asserted irq status bits into a
    warning. Whenever an irq event is queued in the IH ring, always
    execute the corresponding interrupt handler. Still check the irq
    status bits, but only to log a DRM_DEBUG message on a mismatch.
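
    The change in behaviour can be illustrated with a minimal,
    self-contained sketch. This is not the driver code: struct
    fake_crtc, VBLANK_STATUS_BIT and both handler functions are
    hypothetical stand-ins, and fprintf() stands in for the
    DRM_DEBUG message mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a vblank bit in an irq status register. */
#define VBLANK_STATUS_BIT 0x1u

/* Hypothetical per-crtc state; not a real radeon structure. */
struct fake_crtc {
	uint32_t irq_status;   /* snapshot of the irq status register */
	unsigned vblank_count; /* bumped once per handled vblank irq */
	unsigned mismatches;   /* IH events seen without the status bit */
};

/*
 * Old scheme: an event from the IH ring is only handled if the
 * status bit is also set. A valid event whose bit reads as 0 is
 * silently dropped.
 */
static bool handle_vblank_old(struct fake_crtc *c)
{
	assert(c != NULL);
	if (!(c->irq_status & VBLANK_STATUS_BIT))
		return false; /* irq skipped, vblank count goes stale */

	c->irq_status &= ~VBLANK_STATUS_BIT;
	c->vblank_count++;
	return true;
}

/*
 * New scheme: the IH ring event alone decides. The status bit is
 * still checked, but a mismatch only produces a debug message.
 */
static bool handle_vblank_new(struct fake_crtc *c)
{
	assert(c != NULL);
	if (!(c->irq_status & VBLANK_STATUS_BIT)) {
		c->mismatches++;
		fprintf(stderr, "IH event w/o asserted irq bit?\n");
	}

	c->irq_status &= ~VBLANK_STATUS_BIT;
	c->vblank_count++;
	return true;
}
```

    With irq_status == 0, handle_vblank_old() returns false and
    leaves vblank_count untouched, while handle_vblank_new() logs
    the mismatch and still counts the vblank.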
    
    This fixed the problems reliably on both previously failing
    cards: a dual-head RV730 tested on both crtcs (pipes D1 and D2)
    and a triple-output Juniper HD-5770 tested on all three
    available crtcs (D1/D2/D3). The r600 and evergreen irq handling
    is therefore tested, but the cik and si handling is only
    compile-tested due to lack of hw.
    
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
    CC: Michel Dänzer <michel.daenzer@amd.com>
    CC: Alex Deucher <alexander.deucher@amd.com>
    CC: <stable@vger.kernel.org> # v3.16+
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>