    [PATCH] handle_initrd() and request_module() · 09589177
    Alexander Viro authored
    There are 4 different scenarios of late boot:
    
    1.	no initrd or ROOT_DEV is ram0.  That's the simplest one - we want
    	whatever is on ROOT_DEV as final root.
    
    2.	initrd is there, ROOT_DEV is not ram0, /linuxrc on initrd doesn't
    	exit.   We want initrd mounted, /linuxrc launched and /linuxrc
    	will mount whatever it wants, maybe do pivot_root and exec init
    	itself.  Task with PID 1 (parent of linuxrc) will sit there reaping
    	zombies, never leaving kernel mode.
    
    3.	initrd is there, ROOT_DEV is not ram0, /linuxrc on initrd does exit
    	and sets real-root-dev to 256 (1:0, aka. ram0).   We want initrd
    	mounted, /linuxrc launched and we expect linuxrc to mount all stuff
    	we need, maybe do pivot_root and exit.  Parent of /linuxrc (PID 1)
    	will proceed to exec init once /linuxrc is done.
    
    4.	initrd is there, ROOT_DEV is not ram0, /linuxrc on initrd might have
    	done something or not, but when it exits real-root-dev is not ram0.
    	We want initrd mounted, /linuxrc launched and when it exits we are
    	going to mount final root according to real-root-dev.  If there is
    	/initrd on the final root, initrd will be moved there.  Otherwise
    	initrd will be unmounted and its memory (if possible) freed.  Then
    	we exec init from final root.
    
    Note that we want the parent of linuxrc chrooted to initrd while linuxrc
    runs - otherwise things like request_module() will be rather unhappy.  That
    goes for all variants that run linuxrc.
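The four cases can be told apart from a handful of facts known by the time linuxrc is done (or not).  A minimal sketch, assuming the old 8:8 major:minor encoding that makes 1:0 come out as 256; OLD_MKDEV, RAM0 and classify_boot() are made-up names for illustration, not kernel symbols:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Old-style 8:8 device number: major in the high byte, minor in the low.
 * Assumption: this is the encoding behind "256 (1:0, aka. ram0)".
 */
#define OLD_MKDEV(major, minor)	(((major) << 8) | (minor))
#define RAM0			OLD_MKDEV(1, 0)

enum boot_scenario {
	SCENARIO_1 = 1,	/* linuxrc is not run: ROOT_DEV is the final root */
	SCENARIO_2,	/* linuxrc never exits */
	SCENARIO_3,	/* linuxrc exits, real-root-dev is ram0 */
	SCENARIO_4,	/* linuxrc exits, real-root-dev is not ram0 */
};

/* Hypothetical helper, for illustration only. */
enum boot_scenario classify_boot(bool have_initrd, unsigned int root_dev,
				 bool linuxrc_exited,
				 unsigned int real_root_dev)
{
	if (!have_initrd || root_dev == RAM0)
		return SCENARIO_1;
	if (!linuxrc_exited)
		return SCENARIO_2;
	return real_root_dev == RAM0 ? SCENARIO_3 : SCENARIO_4;
}
```

With root_dev set to 0x301 (3:1, picked arbitrarily) and linuxrc exiting with real-root-dev still at 0x301, classify_boot() returns SCENARIO_4.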
    
    Scenarios above go in order of increasing complexity.  Let's start with #4:
    
    	we have loaded initrd
    	we mount initrd on /root
    	we open / and /old (on rootfs)
    	chdir /root
    	mount --move . /
    	chroot .
    
    Now we have initrd mounted on /, we are chrooted into it but keep open
    descriptors of / and /old on rootfs, so we'll be able to break out of jail later.
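Why the open descriptors matter can be seen without any privileges.  The sketch below is only a stand-in: it uses plain chdir() where the boot code uses chroot, and leave_and_return() is a hypothetical helper.  The point is that a descriptor refers to the directory itself, so fchdir() gets back to it without any path lookup:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: remember the current directory in a descriptor,
 * go to `elsewhere`, then come back with fchdir().
 * Returns 0 if we ended up where we started, -1 on any failure.
 */
int leave_and_return(const char *elsewhere)
{
	char before[PATH_MAX], after[PATH_MAX];
	int ret = -1;
	int fd;

	if (!getcwd(before, sizeof(before)))
		return -1;

	/* The descriptor pins the directory, not its name. */
	fd = open(".", O_RDONLY);
	if (fd < 0)
		return -1;

	if (chdir(elsewhere) == 0 &&	/* stand-in for entering the jail */
	    fchdir(fd) == 0 &&		/* the way back needs no path */
	    getcwd(after, sizeof(after)))
		ret = strcmp(before, after) == 0 ? 0 : -1;

	close(fd);
	return ret;
}
```

Inside a chroot the path back to the old root no longer resolves, which is exactly why a descriptor is kept instead of a name.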
    
    	we fork a child that will be linuxrc
    	child closes opened descriptors, opens /dev/console, dups it to stdout
    and stderr, does setsid and execs /linuxrc.
    
    	parent sits there reaping zombies until child is finished.
    
    Note that both parent and linuxrc are chrooted into initrd and if linuxrc
    calls pivot_root, the parent will also have its root/cwd switched.
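A userspace analogue of "fork linuxrc and reap until it is done" might look like the following.  This is a sketch under assumptions, not the kernel code: run_and_reap() is a hypothetical name, and closing the saved descriptors and setting up /dev/console are left out so that it runs unprivileged:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper.  Returns the exit code of argv[0], or -1 if it
 * could not be forked or did not exit normally.
 */
int run_and_reap(char *const argv[])
{
	pid_t child, done;
	int status;

	child = fork();
	if (child < 0)
		return -1;

	if (child == 0) {
		/* Child: new session, then become the program. */
		setsid();
		execv(argv[0], argv);
		_exit(127);	/* exec failed */
	}

	/*
	 * Parent: wait for *any* child, so that whatever else ends up
	 * as our child gets reaped too, and stop only when our own
	 * child shows up.
	 */
	for (;;) {
		done = wait(&status);
		if (done == child)
			break;
		if (done < 0 && errno != EINTR)
			return -1;
	}

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with { "/bin/sh", "-c", "exit 3", NULL } it returns 3, assuming /bin/sh exists on the system.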
    
    	OK, child is finished and after checking real_root_dev we see that
    it's not MKDEV(1,0).  Now we know that it's scenario #4.
    We break out of jail, doing the following:
    
    	fchdir to /old on rootfs
    	mount --move / .
    	fchdir to / on rootfs
    	chroot to .
    
    That will move initrd to /old and leave us with root and cwd in / of rootfs.
    We can close these two descriptors now - they've done their job.
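In terms of system calls, the four steps might look like this.  A sketch only, assuming that mount(2) with MS_MOVE is what "mount --move" stands for and that old_fd and root_fd are the descriptors opened before the chroot; break_out_of_jail() is a hypothetical name, and the sequence needs root privileges to succeed:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Hypothetical helper.  old_fd refers to /old on rootfs, root_fd to /
 * on rootfs.  Returns 0 on success, -1 on the first failing step.
 */
int break_out_of_jail(int old_fd, int root_fd)
{
	if (fchdir(old_fd) < 0)			/* cwd = /old on rootfs */
		return -1;
	if (mount("/", ".", NULL, MS_MOVE, NULL) < 0)	/* mount --move / . */
		return -1;
	if (fchdir(root_fd) < 0)		/* cwd = / on rootfs */
		return -1;
	if (chroot(".") < 0)			/* root = / on rootfs */
		return -1;

	/* The descriptors are not needed any more. */
	close(old_fd);
	close(root_fd);
	return 0;
}
```

With invalid descriptors the very first fchdir() fails and nothing is touched.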
    
    	We mount final root to /root
    	We attempt to mount --move /old /root/initrd; if we are successful -
    we chdir to /root, mount --move . / and chroot to .  That will leave us with
    	* final root on /
    	* initrd on /initrd of final root
    	* cwd and root on final root.
    
    At that point we simply exec init.
    
    	Now, if mount --move has failed, we have to clean up the mess.  We
    unmount (with MNT_DETACH) initrd from /old and do BLKFLSBUF on ram0.  After
    that we have final root on /root, initrd maybe still alive, but not mounted
    anywhere and our root/cwd in / of rootfs.  Again,
    
    	chdir /root
    	mount --move . /
    	chroot to .
    
    and we have final root mounted on /, we are chrooted into it and it's time
    for exec init.
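The "keep initrd if there is a place for it, otherwise get rid of it" decision could be sketched as below.  Assumptions: park_or_drop_initrd() and its parameters are hypothetical, the paths are passed in rather than hardcoded, and flushing the ramdisk is treated as best effort, in line with "its memory (if possible) freed":

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Hypothetical helper.  Returns 0 if initrd was moved to `keep_at`,
 * 1 if it was detached instead, -1 if even that failed.
 */
int park_or_drop_initrd(const char *old, const char *keep_at,
			const char *ramdev)
{
	int fd;

	/* mount --move old keep_at */
	if (mount(old, keep_at, NULL, MS_MOVE, NULL) == 0)
		return 0;

	/* No place to keep it: unmount lazily ... */
	if (umount2(old, MNT_DETACH) < 0)
		return -1;

	/* ... and ask the ramdisk to drop its buffers, best effort. */
	fd = open(ramdev, O_RDWR);
	if (fd >= 0) {
		ioctl(fd, BLKFLSBUF, 0);
		close(fd);
	}
	return 1;
}
```

In the flow described above the call would be park_or_drop_initrd("/old", "/root/initrd", "/dev/ram0"), where /dev/ram0 as the device node name is itself an assumption.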
    
    	That's it for scenario 4.  The rest will be simpler - there's less
    work to do.
    
    #3 diverges from #4 after linuxrc has finished and we have already broken out
    of jail.  Whatever we got from linuxrc is mounted on /old now, so we move it
    back to /, get chrooted there and exec init.   We could've left earlier
    (skipping the move to /old and the move back), but that would lead to
    even messier logic in prepare_namespace() ;-/
    
    #2 means that parent of /linuxrc never gets past waiting for its child to finish.
    End of story.
    
    #1 is the simplest variant - it mounts final root on /root and then does the usual
    "chdir there, mount --move . /, chroot to ." and execs init.
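The recurring "chdir there, mount --move . /, chroot to ." idiom maps onto three system calls.  A minimal userspace sketch, assuming mount(2) with MS_MOVE as the equivalent of "mount --move"; switch_root_to() is a hypothetical name and the call needs root privileges and a real mount point to succeed:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Hypothetical helper.  `newroot` is where the final root is mounted
 * (/root in the text).  Returns 0 on success, -1 on the first failure.
 */
int switch_root_to(const char *newroot)
{
	if (chdir(newroot) < 0)				/* chdir there */
		return -1;
	if (mount(".", "/", NULL, MS_MOVE, NULL) < 0)	/* mount --move . / */
		return -1;
	return chroot(".");				/* chroot to . */
}
```

Doing the chdir first is what makes "." name the new root both for the move and for the chroot that follows.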
    
    Relevant code is in prepare_namespace()/handle_initrd() and yes, it's messy.
    It used to be even worse... ;-/