    propagate_mnt: Handle the first propagated copy being a slave · 5ec0811d
    Eric W. Biederman authored
    When the first propagated copy was a slave, the following oops would result:
    > BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    > IP: [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
    > PGD bacd4067 PUD bac66067 PMD 0
    > Oops: 0000 [#1] SMP
    > Modules linked in:
    > CPU: 1 PID: 824 Comm: mount Not tainted 4.6.0-rc5userns+ #1523
    > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
    > task: ffff8800bb0a8000 ti: ffff8800bac3c000 task.ti: ffff8800bac3c000
    > RIP: 0010:[<ffffffff811fba4e>]  [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
    > RSP: 0018:ffff8800bac3fd38  EFLAGS: 00010283
    > RAX: 0000000000000000 RBX: ffff8800bb77ec00 RCX: 0000000000000010
    > RDX: 0000000000000000 RSI: ffff8800bb58c000 RDI: ffff8800bb58c480
    > RBP: ffff8800bac3fd48 R08: 0000000000000001 R09: 0000000000000000
    > R10: 0000000000001ca1 R11: 0000000000001c9d R12: 0000000000000000
    > R13: ffff8800ba713800 R14: ffff8800bac3fda0 R15: ffff8800bb77ec00
    > FS:  00007f3c0cd9b7e0(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
    > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    > CR2: 0000000000000010 CR3: 00000000bb79d000 CR4: 00000000000006e0
    > Stack:
    >  ffff8800bb77ec00 0000000000000000 ffff8800bac3fd88 ffffffff811fbf85
    >  ffff8800bac3fd98 ffff8800bb77f080 ffff8800ba713800 ffff8800bb262b40
    >  0000000000000000 0000000000000000 ffff8800bac3fdd8 ffffffff811f1da0
    > Call Trace:
    >  [<ffffffff811fbf85>] propagate_mnt+0x105/0x140
    >  [<ffffffff811f1da0>] attach_recursive_mnt+0x120/0x1e0
    >  [<ffffffff811f1ec3>] graft_tree+0x63/0x70
    >  [<ffffffff811f1f6b>] do_add_mount+0x9b/0x100
    >  [<ffffffff811f2c1a>] do_mount+0x2aa/0xdf0
    >  [<ffffffff8117efbe>] ? strndup_user+0x4e/0x70
    >  [<ffffffff811f3a45>] SyS_mount+0x75/0xc0
    >  [<ffffffff8100242b>] do_syscall_64+0x4b/0xa0
    >  [<ffffffff81988f3c>] entry_SYSCALL64_slow_path+0x25/0x25
    > Code: 00 00 75 ec 48 89 0d 02 22 22 01 8b 89 10 01 00 00 48 89 05 fd 21 22 01 39 8e 10 01 00 00 0f 84 e0 00 00 00 48 8b 80 d8 00 00 00 <48> 8b 50 10 48 89 05 df 21 22 01 48 89 15 d0 21 22 01 8b 53 30
    > RIP  [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
    >  RSP <ffff8800bac3fd38>
    > CR2: 0000000000000010
    > ---[ end trace 2725ecd95164f217 ]---
    
    This oops happens with the namespace_sem held and can be triggered by
    non-root users.  An all-around unpleasant experience.
    
    To avoid this scenario, when finding the appropriate source mount to
    copy, stop the walk up the mnt_master chain when the first source mount
    is encountered.
    
    Further, rewrite the walk up the last_source mnt_master chain so that
    it is clear what is going on.
    
    The reason the first source mount is special is that its
    mnt_parent is not a mount in the dest_mnt propagation tree, and as
    such, termination conditions based upon the dest_mnt mount propagation
    tree do not make sense.
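
The termination rule described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the code in pnode.c: `struct mount` is reduced to two links, the names `pick_source`, `stop_master`, and `demo_falls_back_to_first_source` are hypothetical, and the peer-group check that the real walk also performs is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct mount: only the two
 * links that the walk follows are modelled. */
struct mount {
	struct mount *mnt_parent;
	struct mount *mnt_master;
};

/*
 * Hypothetical helper: walk up the mnt_master chain from last_source.
 * stop_master stands for the master located on the destination side.
 * The first_source test comes before any look at mnt_parent, because
 * first_source's parent lies outside the destination propagation tree.
 */
static struct mount *pick_source(struct mount *last_source,
				 const struct mount *first_source,
				 const struct mount *stop_master)
{
	while (last_source != first_source) {
		const struct mount *parent = last_source->mnt_parent;

		if (parent->mnt_master == stop_master)
			break;	/* this copy already sits under the right master */
		last_source = last_source->mnt_master;
	}
	return last_source;
}

/* Hypothetical demo: the first propagated copy is a slave of first_source.
 * Returns true when the walk ends at first_source. */
static bool demo_falls_back_to_first_source(bool master_matches)
{
	struct mount master_x = { NULL, NULL };
	struct mount master_y = { NULL, NULL };
	struct mount outside = { NULL, NULL };	/* not in the dest tree */
	struct mount first_source = { &outside, NULL };
	struct mount dest_a = { NULL, &master_x };
	struct mount copy = { &dest_a, &first_source };
	const struct mount *stop = master_matches ? &master_x : &master_y;

	assert(copy.mnt_master == &first_source);
	return pick_source(&copy, &first_source, stop) == &first_source;
}
```

In this model, when no copy's parent has the wanted master, the walk ends at first_source rather than following first_source's own mnt_master link, which is NULL here. When the copy's parent does have the wanted master, the walk stops at the copy.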
    
    To avoid other kinds of confusion, last_dest is not changed when
    computing last_source.  last_dest is only used once in propagate_one
    and that is above the point of the code being modified, so changing
    the global variable is meaningless and confusing.
    
    Cc: stable@vger.kernel.org
    Fixes: f2ebb3a9 ("smarter propagate_mnt()")
    Reported-by: Tycho Andersen <tycho.andersen@canonical.com>
    Reviewed-by: Seth Forshee <seth.forshee@canonical.com>
    Tested-by: Seth Forshee <seth.forshee@canonical.com>
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
pnode.c 11.3 KB