Alexander Viro authored
There are 4 different scenarios of late boot:

1. no initrd or ROOT_DEV is ram0.  That's the simplest one - we want
   whatever is on ROOT_DEV as final root.

2. initrd is there, ROOT_DEV is not ram0, /linuxrc on initrd doesn't exit.
   We want initrd mounted, /linuxrc launched and /linuxrc will mount
   whatever it wants, maybe do pivot_root and exec init itself.  Task with
   PID 1 (parent of linuxrc) will sit there reaping zombies, never leaving
   the kernel mode.

3. initrd is there, ROOT_DEV is not ram0, /linuxrc on initrd does exit and
   sets real-root-dev to 256 (1:0, a.k.a. ram0).  We want initrd mounted,
   /linuxrc launched and we expect linuxrc to mount all stuff we need,
   maybe do pivot_root and exit.  Parent of /linuxrc (PID 1) will proceed
   to exec init once /linuxrc is done.

4. initrd is there, ROOT_DEV is not ram0, /linuxrc on initrd might have
   done something or not, but when it exits real-root-dev is not ram0.
   We want initrd mounted, /linuxrc launched and when it exits we are
   going to mount final root according to real-root-dev.  If there is
   /initrd on the final root, initrd will be moved there.  Otherwise
   initrd will be unmounted and its memory (if possible) freed.  Then we
   exec init from final root.

Note that we want the parent of linuxrc chrooted to initrd while linuxrc
runs - otherwise things like request_module() will be rather unhappy.  That
goes for all variants that run linuxrc.

Scenarios above go in order of increasing complexity.  Let's start with #4:
    we had loaded initrd
    we mount initrd on /root
    we open / and /old (on rootfs)
    chdir /root
    mount --move . /
    chroot .
Now we have initrd mounted on /, we are chrooted into it but keep open
descriptors of / and /old, so we'll be able to break out of jail later.
    we fork a child that will be linuxrc
    child closes opened descriptors, opens /dev/console, dups it to
        stdout and stderr, does setsid and execs /linuxrc.
    parent sits there reaping zombies until child is finished.
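The child/parent split above can be sketched in userspace C. This is only an analogue under stated assumptions, not the kernel code: run_helper() and its parameters are hypothetical names, the console path is passed in so the sketch can run without access to /dev/console, and ordinary libc calls stand in for the in-kernel syscalls.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical userspace analogue of launching linuxrc: the child rebuilds
 * fds 0-2 from the console, starts a new session and execs the helper; the
 * parent reaps until that child is finished.
 * Returns the helper's exit status, or -1 on error or abnormal exit.
 */
int run_helper(const char *console, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: close inherited stdio, then reopen it on the console. */
        close(0);
        close(1);
        close(2);
        /* open() returns the lowest free fd (0); the dups give 1 and 2. */
        if (open(console, O_RDWR) != 0 || dup(0) != 1 || dup(0) != 2)
            _exit(126);
        setsid();
        execv(argv[0], argv);
        _exit(127);             /* exec failed */
    }

    /* Parent: keep reaping until our own child has been collected. */
    int status = 0;
    for (;;) {
        pid_t w = wait(&status);
        if (w == pid)
            break;
        if (w < 0 && errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling run_helper("/dev/null", argv) with argv set to { "/bin/sh", "-c", "exit 3", NULL } should return 3. In scenario #2 the equivalent wait loop simply never terminates.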
Note that both parent and linuxrc are chrooted into initrd, and if linuxrc
calls pivot_root, the parent will also have its root/cwd switched.

OK, child is finished and after checking real_root_dev we see that it's not
MKDEV(1,0).  Now we know that it's scenario #4.  We break out of jail, doing
the following:
    fchdir to /old on rootfs
    mount --move / .
    fchdir to / on rootfs
    chroot to .
That will move initrd to /old and leave us with root and cwd in / of rootfs.
We can close these two descriptors now - they've done their job.

We mount final root on /root.

We attempt to mount --move /old /root/initrd; if we are successful, we
chdir to /root, mount --move . / and chroot to .  That will leave us with
    * final root on /
    * initrd on /initrd of final root
    * cwd and root on final root.
At that point we simply exec init.

Now, if mount --move had failed, we have to clean up the mess.  We unmount
(with MNT_DETACH) initrd from /old and do BLKFLSBUF on ram0.  After that we
have final root on /root, initrd maybe still alive, but not mounted anywhere,
and our root/cwd in / of rootfs.  Again,
    chdir /root
    mount --move . /
    chroot to .
and we have final root mounted on /, we are chrooted into it and it's time
for exec init.

That's it for scenario 4.  The rest will be simpler - there's less work to do.

#3 diverges from #4 after linuxrc had finished and we had already broken out
of jail.  Whatever we got from linuxrc is mounted on /old now, so we move it
back to /, get chrooted there and exec init.  We could've left earlier
(skipping the move to /old and move back parts), but that would lead to even
messier logic in prepare_namespace() ;-/

#2 means that parent of /linuxrc never gets past waiting for its child to
finish.  End of story.

#1 is the simplest variant - it mounts final root on /root and then does the
usual "chdir there, mount --move . /, chroot to ." and execs init.

Relevant code is in prepare_namespace()/handle_initrd() and yes, it's messy.
Had been even worse... ;-/
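As a summary of the four scenarios, the decision can be written as one small function. This is a sketch for illustration, not code from prepare_namespace(): the enum, the macro and the function name are hypothetical, and the 8:8 device encoding is assumed only because the text gives ram0 (1:0) as 256. The kernel of course cannot know in advance whether linuxrc exits; the flag is there only to separate #2 from the rest.

```c
#include <stdbool.h>

/* Assumed old-style 8:8 device number, so that ram0 (1:0) is 256. */
#define DEV_8_8(major, minor) (((major) << 8) | (minor))
#define RAM0 DEV_8_8(1u, 0u)

/* Hypothetical labels; the values match the scenario numbers in the text. */
enum late_boot {
    BOOT_PLAIN_ROOT          = 1, /* mount ROOT_DEV as final root */
    BOOT_LINUXRC_STAYS       = 2, /* linuxrc never exits, PID 1 keeps reaping */
    BOOT_LINUXRC_SET_UP_ROOT = 3, /* linuxrc exits, real-root-dev is ram0 */
    BOOT_REAL_ROOT_DEV       = 4, /* linuxrc exits, mount real-root-dev */
};

enum late_boot classify_late_boot(bool have_initrd, unsigned root_dev,
                                  bool linuxrc_exited, unsigned real_root_dev)
{
    if (!have_initrd || root_dev == RAM0)
        return BOOT_PLAIN_ROOT;
    if (!linuxrc_exited)
        return BOOT_LINUXRC_STAYS;
    if (real_root_dev == RAM0)
        return BOOT_LINUXRC_SET_UP_ROOT;
    return BOOT_REAL_ROOT_DEV;
}
```

Scenarios #2 to #4 all share the same setup (initrd mounted on /, parent chrooted into it, linuxrc forked), which is why only the last two checks depend on what linuxrc did.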
09589177