• Jeff Layton's avatar
    nfs: take extra reference to fl->fl_file when running a setlk · bdece860
    Jeff Layton authored
    [ Upstream commit feaff8e5 ]
    
    We had a report of a crash while stress testing the NFS client:
    
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
        IP: [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
        PGD 0
        Oops: 0000 [#1] SMP
        Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables ip6table_security ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_filter ip6_tables iptable_security iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw coretemp crct10dif_pclmul ppdev crc32_pclmul crc32c_intel ghash_clmulni_intel vmw_balloon serio_raw vmw_vmci i2c_piix4 shpchp parport_pc acpi_cpufreq parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih mptbase e1000 ata_generic pata_acpi
        CPU: 1 PID: 399 Comm: kworker/1:1H Not tainted 4.1.0-0.rc1.git0.1.fc23.x86_64 #1
        Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
        Workqueue: rpciod rpc_async_schedule [sunrpc]
        task: ffff880036aea7c0 ti: ffff8800791f4000 task.ti: ffff8800791f4000
        RIP: 0010:[<ffffffff8127b698>]  [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
        RSP: 0018:ffff8800791f7c00  EFLAGS: 00010293
        RAX: ffff8800791f7c40 RBX: ffff88001f2ad8c0 RCX: ffffe8ffffc80305
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
        RBP: ffff8800791f7c88 R08: ffff88007fc971d8 R09: 279656d600000000
        R10: 0000034a01000000 R11: 279656d600000000 R12: ffff88001f2ad918
        R13: ffff88001f2ad8c0 R14: 0000000000000000 R15: 0000000100e73040
        FS:  0000000000000000(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000150 CR3: 0000000001c0b000 CR4: 00000000000407e0
        Stack:
         ffffffff8127c5b0 ffff8800791f7c18 ffffffffa0171e29 ffff8800791f7c58
         ffffffffa0171ef8 ffff8800791f7c78 0000000000000246 ffff88001ea0ba00
         ffff8800791f7c40 ffff8800791f7c40 00000000ff5d86a3 ffff8800791f7ca8
        Call Trace:
         [<ffffffff8127c5b0>] ? __posix_lock_file+0x40/0x760
         [<ffffffffa0171e29>] ? rpc_make_runnable+0x99/0xa0 [sunrpc]
         [<ffffffffa0171ef8>] ? rpc_wake_up_task_queue_locked.part.35+0xc8/0x250 [sunrpc]
         [<ffffffff8127cd3a>] posix_lock_file_wait+0x4a/0x120
         [<ffffffffa03e4f12>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
         [<ffffffffa03bf108>] ? nfs41_sequence_done+0xd8/0x2d0 [nfsv4]
         [<ffffffffa03c116d>] do_vfs_lock+0x2d/0x30 [nfsv4]
         [<ffffffffa03c251d>] nfs4_lock_done+0x1ad/0x210 [nfsv4]
         [<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
         [<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
         [<ffffffffa0171a5c>] rpc_exit_task+0x2c/0xa0 [sunrpc]
         [<ffffffffa0167450>] ? call_refreshresult+0x150/0x150 [sunrpc]
         [<ffffffffa0172640>] __rpc_execute+0x90/0x460 [sunrpc]
         [<ffffffffa0172a25>] rpc_async_schedule+0x15/0x20 [sunrpc]
         [<ffffffff810baa1b>] process_one_work+0x1bb/0x410
         [<ffffffff810bacc3>] worker_thread+0x53/0x480
         [<ffffffff810bac70>] ? process_one_work+0x410/0x410
         [<ffffffff810bac70>] ? process_one_work+0x410/0x410
         [<ffffffff810c0b38>] kthread+0xd8/0xf0
         [<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180
         [<ffffffff817a1aa2>] ret_from_fork+0x42/0x70
         [<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180
    
    Jean says:
    
    "Running locktests with a large number of iterations resulted in a
     client crash.  The test run took a while and hasn't finished after close
     to 2 hours. The crash happened right after I gave up and killed the test
     (after 107m) with Ctrl+C."
    
    The crash happened because a NULL inode pointer got passed into
    locks_get_lock_context. The call chain indicates that file_inode(filp)
    returned NULL, which means that f_inode was NULL. Since that's zeroed
    out in __fput, that suggests that this filp pointer outlived the last
    reference.
    
    Looking at the code, that seems possible. We copy the struct file_lock
    that's passed in, but if the task is signalled at an inopportune time we
    can end up trying to use that file_lock in rpciod context after the process
    that requested it has already returned (and possibly put its filp
    reference).
    
    Fix this by taking an extra reference to the filp when we allocate the
    lock info, and put it in nfs4_lock_release.
    Reported-by: default avatarJean Spector <jean@primarydata.com>
    Signed-off-by: default avatarJeff Layton <jeff.layton@primarydata.com>
    Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
    Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
    bdece860
nfs4proc.c 230 KB