    net: fec: Fix NAPI race · 94191fd6
    Nimrod Andy authored
    Run a camera capture test on an i.MX6Q SabreSD board, saving the captured
    data to an NFS rootfs. The command is:
    gst-launch-1.0 -e imxv4l2src device=/dev/video1 num-buffers=2592000 ! tee name=t !
    queue ! imxv4l2sink sync=false t. ! queue ! vpuenc ! queue ! mux. pulsesrc num-buffers=3720937
    blocksize=4096 ! 'audio/x-raw, rate=44100, channels=2' ! queue ! imxmp3enc ! mpegaudioparse !
    queue ! mux. qtmux name=mux ! filesink location=video_recording_long.mov
    
    After about 10 hours of running, the kernel dumps a net watchdog timeout:
    ...
    WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x2b4/0x2d8()
    NETDEV WATCHDOG: eth0 (fec): transmit queue 0 timed out
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.24-01051-gdb840b7 #440
    [<80014e6c>] (unwind_backtrace) from [<800118ac>] (show_stack+0x10/0x14)
    [<800118ac>] (show_stack) from [<806ae3f0>] (dump_stack+0x78/0xc0)
    [<806ae3f0>] (dump_stack) from [<8002b504>] (warn_slowpath_common+0x68/0x8c)
    [<8002b504>] (warn_slowpath_common) from [<8002b558>] (warn_slowpath_fmt+0x30/0x40)
    [<8002b558>] (warn_slowpath_fmt) from [<8055e0d4>] (dev_watchdog+0x2b4/0x2d8)
    [<8055e0d4>] (dev_watchdog) from [<800352d8>] (call_timer_fn.isra.33+0x24/0x8c)
    [<800352d8>] (call_timer_fn.isra.33) from [<800354c4>] (run_timer_softirq+0x184/0x220)
    [<800354c4>] (run_timer_softirq) from [<8002f420>] (__do_softirq+0xc0/0x22c)
    [<8002f420>] (__do_softirq) from [<8002f804>] (irq_exit+0xa8/0xf4)
    [<8002f804>] (irq_exit) from [<8000ee5c>] (handle_IRQ+0x54/0xb4)
    [<8000ee5c>] (handle_IRQ) from [<80008598>] (gic_handle_irq+0x28/0x5c)
    [<80008598>] (gic_handle_irq) from [<800123c0>] (__irq_svc+0x40/0x74)
    Exception stack(0x80d27f18 to 0x80d27f60)
    7f00:                                                       80d27f60 0000014c
    7f20: 8858c60e 0000004d 884e4540 0000004d ab7250d0 80d34348 00000000 00000000
    7f40: 00000001 00000000 00000017 80d27f60 800702a4 80476e6c 600f0013 ffffffff
    [<800123c0>] (__irq_svc) from [<80476e6c>] (cpuidle_enter_state+0x50/0xe0)
    [<80476e6c>] (cpuidle_enter_state) from [<80476fa8>] (cpuidle_idle_call+0xac/0x154)
    [<80476fa8>] (cpuidle_idle_call) from [<8000f174>] (arch_cpu_idle+0x8/0x44)
    [<8000f174>] (arch_cpu_idle) from [<80064c54>] (cpu_startup_entry+0x100/0x158)
    [<80064c54>] (cpu_startup_entry) from [<80cd8a9c>] (start_kernel+0x304/0x368)
    ---[ end trace 09ebd32fb032f86d ]---
    ...
    
    There might be a race in napi_schedule() that leaves interrupts disabled forever.
    With this patch, the test case still works after more than 40 hours of running.
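
The message does not spell out the interleaving or show the change itself. As a rough illustration only, the user-space toy model below contrasts two interrupt-handler patterns: masking the interrupt sources unconditionally before attempting to schedule a poll, and masking only when the schedule attempt succeeds. Every name in it (`toy_dev`, `toy_schedule_prep`, `toy_irq_unconditional`, `toy_irq_guarded`) is hypothetical and is not taken from fec_main.c. It assumes that a poll unmasks the interrupts when it finishes, and that a schedule attempt fails while a poll is already scheduled.

```c
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, not driver code. */
struct toy_dev {
    bool napi_scheduled; /* a poll already owns the device */
    bool irq_masked;     /* RX/TX interrupt sources are masked */
};

/* Models a schedule attempt: succeeds only if no poll is scheduled yet. */
static bool toy_schedule_prep(struct toy_dev *d)
{
    if (d->napi_scheduled)
        return false;
    d->napi_scheduled = true;
    return true;
}

/* Unconditional pattern: mask first, then attempt to schedule.
 * Returns true if this call scheduled a new poll. */
static bool toy_irq_unconditional(struct toy_dev *d)
{
    d->irq_masked = true;
    return toy_schedule_prep(d);
}

/* Guarded pattern: mask only when this call really scheduled a poll,
 * so a poll that unmasks again is guaranteed to follow the mask. */
static bool toy_irq_guarded(struct toy_dev *d)
{
    if (!toy_schedule_prep(d))
        return false;
    d->irq_masked = true;
    return true;
}
```

In this model, when a poll is already scheduled, the unconditional handler still sets the mask even though it schedules nothing new, so the mask is cleared only if the running poll's unmask step happens afterwards. The guarded handler leaves the mask untouched in that case. In kernel terms the guarded shape corresponds to calling napi_schedule_prep() and then __napi_schedule(); whether that matches this patch exactly is an assumption.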
    Signed-off-by: Fugang Duan <B38611@freescale.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
fec_main.c 84.6 KB