• Eric Dumazet's avatar
    tcp: fix a race in inet_diag_dump_icsk() · f0c928d8
    Eric Dumazet authored
    Alexei reported use after frees in inet_diag_dump_icsk() [1]
    
    Because we use refcount_set() when various sockets are setup and
    inserted into ehash, we also need to make sure inet_diag_dump_icsk()
    wont race with the refcount_set() operations.
    
    Jonathan Lemon sent a patch changing net_twsk_hashdance() but
    other spots would need risky changes.
    
    Instead, fix inet_diag_dump_icsk() as this bug came with
    linux-4.10 only.
    
    [1] Quoting Alexei :
    
    First something iterating over sockets finds already freed tw socket:
    
    refcount_t: increment on 0; use-after-free.
    WARNING: CPU: 2 PID: 2738 at lib/refcount.c:153 refcount_inc+0x26/0x30
    RIP: 0010:refcount_inc+0x26/0x30
    RSP: 0018:ffffc90004c8fbc0 EFLAGS: 00010282
    RAX: 000000000000002b RBX: 0000000000000000 RCX: 0000000000000000
    RDX: ffff88085ee9d680 RSI: ffff88085ee954c8 RDI: ffff88085ee954c8
    RBP: ffff88010ecbd2c0 R08: 0000000000000000 R09: 000000000000174c
    R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: 0000000000000000
    R13: ffff8806ba9bf210 R14: ffffffff82304600 R15: ffff88010ecbd328
    FS:  00007f81f5a7d700(0000) GS:ffff88085ee80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f81e2a95000 CR3: 000000069b2eb006 CR4: 00000000003606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     inet_diag_dump_icsk+0x2b3/0x4e0 [inet_diag]  // sock_hold(sk); in net/ipv4/inet_diag.c:1002
     ? kmalloc_large_node+0x37/0x70
     ? __kmalloc_node_track_caller+0x1cb/0x260
     ? __alloc_skb+0x72/0x1b0
     ? __kmalloc_reserve.isra.40+0x2e/0x80
     __inet_diag_dump+0x3b/0x80 [inet_diag]
     netlink_dump+0x116/0x2a0
     netlink_recvmsg+0x205/0x3c0
     sock_read_iter+0x89/0xd0
     __vfs_read+0xf7/0x140
     vfs_read+0x8a/0x140
     SyS_read+0x3f/0xa0
     do_syscall_64+0x5a/0x100
    
    then a minute later twsk timer fires and hits two bad refcnts
    for this freed socket:
    
    refcount_t: decrement hit 0; leaking memory.
    WARNING: CPU: 31 PID: 0 at lib/refcount.c:228 refcount_dec+0x2e/0x40
    Modules linked in:
    RIP: 0010:refcount_dec+0x2e/0x40
    RSP: 0018:ffff88085f5c3ea8 EFLAGS: 00010296
    RAX: 000000000000002c RBX: ffff88010ecbd2c0 RCX: 000000000000083f
    RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000003f
    RBP: ffffc90003c77280 R08: 0000000000000000 R09: 00000000000017d3
    R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: ffffffff82ad2d80
    R13: ffffffff8182de00 R14: ffff88085f5c3ef8 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fbe42685250 CR3: 0000000002209001 CR4: 00000000003606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <IRQ>
     inet_twsk_kill+0x9d/0xc0  // inet_twsk_bind_unhash(tw, hashinfo);
     call_timer_fn+0x29/0x110
     run_timer_softirq+0x36b/0x3a0
    
    refcount_t: underflow; use-after-free.
    WARNING: CPU: 31 PID: 0 at lib/refcount.c:187 refcount_sub_and_test+0x46/0x50
    RIP: 0010:refcount_sub_and_test+0x46/0x50
    RSP: 0018:ffff88085f5c3eb8 EFLAGS: 00010296
    RAX: 0000000000000026 RBX: ffff88010ecbd2c0 RCX: 000000000000083f
    RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000003f
    RBP: ffff88010ecbd358 R08: 0000000000000000 R09: 000000000000185b
    R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: ffff88010ecbd358
    R13: ffffffff8182de00 R14: ffff88085f5c3ef8 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fbe42685250 CR3: 0000000002209001 CR4: 00000000003606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <IRQ>
     inet_twsk_put+0x12/0x20  // inet_twsk_put(tw);
     call_timer_fn+0x29/0x110
     run_timer_softirq+0x36b/0x3a0
    
    Fixes: 67db3e4b ("tcp: no longer hold ehash lock while calling tcp_get_info()")
    Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
    Reported-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
    Acked-by: default avatarJonathan Lemon <jonathan.lemon@gmail.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    f0c928d8
inet_diag.c 31.2 KB