• Daniel Borkmann's avatar
    net: vxlan: do not use vxlan_net before checking event type · fd27e0d4
    Daniel Borkmann authored
    Jesse Brandeburg reported that commit acaf4e70 caused a panic
    when adding a network namespace while vxlan module was present in
    the system:
    
    [<ffffffff814d0865>] vxlan_lowerdev_event+0xf5/0x100
    [<ffffffff816e9e5d>] notifier_call_chain+0x4d/0x70
    [<ffffffff810912be>] __raw_notifier_call_chain+0xe/0x10
    [<ffffffff810912d6>] raw_notifier_call_chain+0x16/0x20
    [<ffffffff815d9610>] call_netdevice_notifiers_info+0x40/0x70
    [<ffffffff815d9656>] call_netdevice_notifiers+0x16/0x20
    [<ffffffff815e1bce>] register_netdevice+0x1be/0x3a0
    [<ffffffff815e1dce>] register_netdev+0x1e/0x30
    [<ffffffff814cb94a>] loopback_net_init+0x4a/0xb0
    [<ffffffffa016ed6e>] ? lockd_init_net+0x6e/0xb0 [lockd]
    [<ffffffff815d6bac>] ops_init+0x4c/0x150
    [<ffffffff815d6d23>] setup_net+0x73/0x110
    [<ffffffff815d725b>] copy_net_ns+0x7b/0x100
    [<ffffffff81090e11>] create_new_namespaces+0x101/0x1b0
    [<ffffffff81090f45>] copy_namespaces+0x85/0xb0
    [<ffffffff810693d5>] copy_process.part.26+0x935/0x1500
    [<ffffffff811d5186>] ? mntput+0x26/0x40
    [<ffffffff8106a15c>] do_fork+0xbc/0x2e0
    [<ffffffff811b7f2e>] ? ____fput+0xe/0x10
    [<ffffffff81089c5c>] ? task_work_run+0xac/0xe0
    [<ffffffff8106a406>] SyS_clone+0x16/0x20
    [<ffffffff816ee689>] stub_clone+0x69/0x90
    [<ffffffff816ee329>] ? system_call_fastpath+0x16/0x1b
    
    Apparently loopback device is being registered first and thus we
    receive an event notification when vxlan_net is not ready. Hence,
    when we call net_generic() and request vxlan_net_id, we seem to
    access garbage at that point in time. In setup_net() where we set
    up a newly allocated network namespace, we traverse the list of
    pernet ops ...
    
    list_for_each_entry(ops, &pernet_list, list) {
    	error = ops_init(ops, net);
    	if (error < 0)
    		goto out_undo;
    }
    
    ... and loopback_net_init() is invoked first here, so in the middle
    of setup_net() we get this notification in vxlan. As currently we
    only care about devices that unregister, move access through
    net_generic() there. Fix is based on Cong Wang's proposal, but
    only changes what is needed here. It sucks a bit as we only work
    around the actual cure: right now it seems the only way to check if
    a netns actually finished traversing all init ops would be to check
    if it's part of net_namespace_list. But that I find quite expensive
    each time we go through a notifier callback. Anyway, did a couple
    of tests and it seems good for now.
    
    Fixes: acaf4e70 ("net: vxlan: when lower dev unregisters remove vxlan dev as well")
    Reported-by: default avatarJesse Brandeburg <jesse.brandeburg@intel.com>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
    Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
    Tested-by: default avatarJesse Brandeburg <jesse.brandeburg@intel.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    fd27e0d4
vxlan.c 68.7 KB