Introduction
I was recently looking through HN and saw the post about run0, a new sudo alternative by the creator of systemd. Along with a lot of skepticism about the tool, there was the usual bashing of systemd and its creator. This got me wondering: if there is so much hate for systemd, why is it the most common init system today? And then I got to thinking, how does an init system even work?
This will be a very quick overview of how a system goes from the bootloader, to the kernel, to running systemd. This series uses systemd to explain the process, but most of it should apply to other init systems as well.
Where is /init?
My vague understanding of how a Linux system works was the sequence
bootloader -> kernel -> init process (systemd) -> everything else
where the init process is executed by the kernel from /init.
Looking in the root of your filesystem, you will see no binary called /init.
┬─[smc@legion:~]─[12:11:01]
╰─>$ ls -a /
bin home opt srv var
boot lib proc swapfile
dev lib64 root sys
efi lost+found run tmp
etc mnt sbin usr
So where is the init process?
Well, it's in a different filesystem. The file structure you see in the / directory of your system does not exist as a single tree on disk. It is actually a collection of mount points.
You may have heard that in Linux it's possible to put different parts of the system on different disks; this is how that works.
The contents of /usr could be on a different physical disk than /sbin, but they are still mounted into the same tree.
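To make the "collection of mount points" idea concrete, here is a minimal sketch that lists what is mounted where. It assumes a Linux system that exposes /proc/self/mounts; the function name list_mounts is made up for this example.

```shell
# list_mounts [MOUNTS_FILE]
# Print "mount point <- source (fstype)" for every mounted filesystem.
# MOUNTS_FILE defaults to /proc/self/mounts, whose whitespace-separated
# fields are: source, mount point, filesystem type, options, dump, pass.
# (list_mounts is a hypothetical helper, not a standard tool.)
list_mounts() {
    awk '{ printf "%s <- %s (%s)\n", $2, $1, $3 }' "${1:-/proc/self/mounts}"
}
```

Running list_mounts with no arguments on your own machine prints one line per mount, which makes it easy to spot directories that live on a different device than /.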
The next question is: what is mounting these paths? Let's explore.
The bootloader
To work out how these filesystems get mounted, we are going to dissect a Linux distro and follow how it works. I'm going to use the Arch Linux ISO.
Download the Arch ISO, mount it onto the system and navigate to the mounted directory.
┬─[smc@legion:~/scratch]─[14:12:23]
╰─>$ sudo mkdir /mnt/arch-iso
┬─[smc@legion:~/scratch]─[14:12:38]
╰─>$ sudo mount -o loop arch.iso /mnt/arch-iso
mount: /mnt/arch-iso: WARNING: source write-protected, mounted read-only.
┬─[smc@legion:~/scratch]─[14:12:42]
╰─>$ cd /mnt/arch-iso
┬─[smc@legion:/mnt/arch-iso]─[14:21:25]
╰─>$ ls -la
dr-xr-xr-x - root 2 May 05:05 arch
dr-xr-xr-x - root 2 May 05:05 boot
dr-xr-xr-x - root 2 May 05:05 EFI
dr-xr-xr-x - root 2 May 05:05 loader
.r--r--r-- 941k root 2 May 05:05 shellia32.efi
.r--r--r-- 1.0M root 2 May 05:05 shellx64.efi
The loader directory is where the bootloader configuration files live. (This loader/entries layout is the one systemd-boot uses, rather than GRUB.)
Let's have a look at the default Arch boot entry and see what we can find.
┬─[smc@legion:/mnt/arch-iso]─[14:23:37]
╰─>$ cat loader/entries/01-archiso-x86_64-linux.conf
title Arch Linux install medium (x86_64, UEFI)
sort-key 01
linux /arch/boot/x86_64/vmlinuz-linux
initrd /arch/boot/x86_64/initramfs-linux.img
options archisobasedir=arch archisosearchuuid=2024-05-01-17-04-31-00
- linux is the path to the Linux kernel
- initrd is the path to the initrd, or initial RAM disk
- options are the command line options passed to the kernel
The setting we are interested in is initrd.
It points to the initial filesystem that is loaded before the filesystem containing Arch.
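Since the entry file is plain text with one "key value" pair per line, pulling a single setting out of it is easy to script. The following is a rough sketch only; entry_field is a hypothetical helper name, and it assumes keys and values are separated by spaces or tabs.

```shell
# entry_field KEY FILE
# Print the value of every line in FILE whose first word is KEY.
# Assumes the simple "key<whitespace>value" layout seen in the entry above.
# (entry_field is a made-up name for this sketch.)
entry_field() {
    awk -v key="$1" '
        $1 == key {
            # Drop the key and the whitespace after it, keep the rest.
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")
            print
        }' "$2"
}
```

For the entry shown above, entry_field initrd loader/entries/01-archiso-x86_64-linux.conf should print /arch/boot/x86_64/initramfs-linux.img.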
Initramfs
Let's copy the initrd out of the ISO so we can have a closer look.
┬─[smc@legion:~/scratch]─[14:32:55]
╰─>$ cp /mnt/arch-iso/arch/boot/x86_64/initramfs-linux.img initramfs-linux.img
┬─[smc@legion:~/scratch]─[14:34:25]
╰─>$ file initramfs-linux.img
initramfs-linux.img: ASCII cpio archive (SVR4 with no CRC)
As you can see, it is a cpio archive. Let's have a look at what is inside.
┬─[smc@legion:~/scratch]─[14:39:09]
╰─>$ cat initramfs-linux.img | cpio -idtv
-rw-r--r-- 0 root root 2 Jan 1 1970 early_cpio
drwxr-xr-x 0 root root 0 Jan 1 1970 kernel/
drwxr-xr-x 0 root root 0 Jan 1 1970 kernel/x86/
drwxr-xr-x 0 root root 0 Jan 1 1970 kernel/x86/microcode/
-rw-r--r-- 0 root root 76166 Jan 1 1970 kernel/x86/microcode/AuthenticAMD.bin
-rw-r--r-- 0 root root 12897280 Jan 1 1970 kernel/x86/microcode/GenuineIntel.bin
25341 blocks
Still no /init?
This is because the image is not a single cpio archive. What we just listed is only the first archive, which holds the CPU microcode; the rest of the initramfs is appended after it as a further (usually compressed) archive, and cpio stops at the end of the first one.
Extracting it like this only shows us a small section of what it contains, so we need a tool that understands initramfs images.
I'm going to use mkinitcpio, but there are other tools, such as initramfs-tools, that will also work.
Now, using lsinitcpio (part of mkinitcpio), we can see far more files in the initramfs, one of them being /init!
┬─[smc@legion:~/scratch]─[14:49:10]
╰─>$ lsinitcpio initramfs-linux.img | wc --lines
3648
┬─[smc@legion:~/scratch]─[14:50:19]
╰─>$ lsinitcpio initramfs-linux.img | grep "^init$"
init
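If you are curious why plain cpio stopped early: a cpio archive ends at an entry named TRAILER!!!, and an initramfs image may carry further archives after that marker. The sketch below looks for where such a following archive would begin. It is only an illustration, not how lsinitcpio works; it assumes GNU grep (for -a -b -o), that the marker string never appears inside file data, and the function name next_archive_offset is made up.

```shell
# next_archive_offset IMAGE
# Print the byte offset of the first non-zero byte after the first
# "TRAILER!!!" marker, i.e. where a second, appended archive would start.
# Returns non-zero if there is no marker or nothing but padding after it.
# Assumption: GNU grep, and the marker does not occur inside file contents.
next_archive_offset() {
    img=$1
    # Byte offset of the first end-of-archive marker.
    marker=$(LC_ALL=C grep -abo 'TRAILER!!!' "$img" | head -n 1 | cut -d: -f1)
    [ -n "$marker" ] || return 1
    end=$((marker + 10))    # the marker itself is 10 bytes long
    # Count the NUL padding bytes that follow the marker.
    pad=$(tail -c +"$((end + 1))" "$img" | od -An -v -tu1 |
        awk 'BEGIN { n = 0 }
             { for (i = 1; i <= NF; i++) { if ($i != 0) { print n; exit } n++ } }')
    [ -n "$pad" ] || return 1
    echo $((end + pad))
}
```

With the offset in hand, something like tail -c +$((offset + 1)) initramfs-linux.img | file - could be used to see how the remaining data is packed. In practice lsinitcpio handles all of this for you.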
Now let's extract the initramfs so we can have a look at what init is doing.
┬─[smc@legion:~/scratch]─[14:50:21]
╰─>$ mkdir extracted_initframs
┬─[smc@legion:~/scratch]─[15:00:18]
╰─>$ cd extracted_initframs
┬─[smc@legion:~/scratch/extracted_initframs]─[15:00:29]
╰─>$ lsinitcpio -x ../initramfs-linux.img
┬─[smc@legion:~/scratch/extracted_initframs]─[15:00:34]
╰─>$ ls
bin config etc init_functions lib64 run tmp VERSION
buildconfig dev hooks kernel new_root sbin usr
codesign.crt early_cpio init lib proc sys var
/init
In the case of Arch (and many other distros), the initramfs is a very small busybox OS with the init process written as a shell script, executed with ash.
lrwxrwxrwx - smc 9 May 15:00 sbin/sh -> busybox
┬─[smc@legion:~/scratch/extracted_initframs]─[15:08:37]
╰─>$ head -1 init
#!/usr/bin/ash
So what does it do?
First off, it sources the init_functions script, which does a lot of the work.
Inside init_functions we find a function called mount_setup that is responsible for setting up a minimal filesystem for the rest of the init.
It mounts the /proc, /sys, /dev and /run directories, and creates the /dev/std{in,out,err} symlinks.
┬─[smc@legion:~/scratch/extracted_initframs]─[15:19:01]
╰─>$ sed -n '523,542 p' init_functions
mount_setup() {
    mount -t proc proc /proc -o nosuid,noexec,nodev
    mount -t sysfs sys /sys -o nosuid,noexec,nodev
    mount -t devtmpfs dev /dev -o mode=0755,nosuid
    mount -t tmpfs run /run -o nosuid,nodev,mode=0755
    mkdir -m755 /run/initramfs
    if [ -e /sys/firmware/efi ]; then
        mount -t efivarfs efivarfs /sys/firmware/efi/efivars -o nosuid,nodev,noexec
    fi
    # Setup /dev symlinks
    if [ -e /proc/kcore ]; then
        ln -sfT /proc/kcore /dev/core
    fi
    ln -sfT /proc/self/fd /dev/fd
    ln -sfT /proc/self/fd/0 /dev/stdin
    ln -sfT /proc/self/fd/1 /dev/stdout
    ln -sfT /proc/self/fd/2 /dev/stderr
}
It then calls default_mount_handler /new_root to mount the “real” root filesystem onto /new_root, using the $root variable, which is taken from the root= option on the kernel command line.
The bootloader (GRUB, for example) sets root= to the real root filesystem (you can read more about it here).
┬─[smc@legion:~/scratch/extracted_initframs]─[16:24:42]
╰─>$ sed -n '419,430 p' init_functions
default_mount_handler() {
    msg ":: mounting '$root' on real root"
    if ! mount -t "${rootfstype:-auto}" -o "${rwopt:-ro}${rootflags:+,$rootflags}" "$root" "$1"; then
        # shellcheck disable=SC2086
        run_hookfunctions 'run_emergencyhook' 'emergency hook' $EMERGENCYHOOKS
        err "Failed to mount '$root' on real root"
        echo "You are now being dropped into an emergency shell."
        # shellcheck disable=SC2119
        launch_interactive_shell
        msg "Trying to continue (this will most likely fail) ..."
    fi
}
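The handler above relies on $root already being set from the kernel command line. As a rough illustration of that step (this is not mkinitcpio's actual parser), here is a small sketch that pulls a key=value word out of /proc/cmdline. The name cmdline_value is invented for this example, and values containing quoted spaces are not handled.

```shell
# cmdline_value KEY [CMDLINE_FILE]
# Print the value of the last KEY=value word on the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline. Returns non-zero if KEY is absent.
# Simplification: words are split on spaces, so quoted values with spaces
# are not supported. (cmdline_value is a hypothetical helper.)
cmdline_value() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" |
        awk -v key="$1" '
            index($0, key "=") == 1 { val = substr($0, length(key) + 2); found = 1 }
            END { if (!found) exit 1; print val }'
}
```

On a running system, cmdline_value root prints whatever the bootloader passed as root=, for example a device path or a UUID=... reference.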
Finally, it calls switch_root, which moves the existing mount points (/dev etc.) into /new_root, makes /new_root the new /, and executes /sbin/init.
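The handover can be sketched as a dry run that only prints the steps instead of performing them (the real thing needs root and runs as PID 1). The sequence is modelled on what the util-linux switch_root man page describes; the function name switch_root_plan is made up, and the real tool also cleans up the old initramfs contents, which is left out here.

```shell
# switch_root_plan NEW_ROOT INIT
# Print (do not execute) the steps of a switch_root-style handover.
# This is an illustrative dry run, not the real implementation.
switch_root_plan() {
    new_root=$1
    init=$2
    # Carry the already mounted API filesystems over to the new root.
    for fs in /dev /proc /sys /run; do
        echo "mount --move $fs $new_root$fs"
    done
    # Make the new root the root of the mount tree, then hand over.
    echo "cd $new_root"
    echo "mount --move . /"
    echo "chroot ."
    echo "exec $init"
}
```

Calling switch_root_plan /new_root /sbin/init prints the eight steps in order, ending with the exec of the real init.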
And now we have our filesystem!
Checking /sbin/init on my current machine shows that this is indeed systemd.
┬─[smc@legion:/mnt/arch-iso]─[16:30:55]
╰─>$ ls -la /sbin/init
lrwxrwxrwx - root 2 May 07:35 /sbin/init -> ../lib/systemd/systemd
Conclusion
This post is part of a series. The next part will explain how to make a minimal Linux init system/distro. You can navigate this series at the top of the page.