How The Linux Kernel Starts Systemd

After the bootloader has run and the Linux kernel is loaded, what happens next? How does systemd get started?



I was recently having a look through HN and saw the post about run0, a new sudo alternative by the creator of systemd. Along with a lot of skepticism for the tool, there was a lot of the usual bashing of systemd and its creator. This got me wondering: if there is so much hate for systemd, why is it the most common init system today? And then I got to thinking, how does an init system even work?

This will be a very quick overview of how a system goes from the bootloader, to the kernel, to running systemd. This series uses systemd to explain the process, but it should be applicable to every init system.

Where is /init?

My vague understanding of how a Linux system worked was a sequence of
bootloader -> kernel -> init process (systemd) -> everything else, where the init process is executed by the kernel from /init.

Looking in the root of your filesystem, you will see no file called /init.

╰─>$ ls -a /
bin   home        opt   srv       var
boot  lib         proc  swapfile
dev   lib64       root  sys
efi   lost+found  run   tmp
etc   mnt         sbin  usr

So where is the init process?

Well, it's in a different filesystem. The directory tree you see under / is not stored on disk as a single unit. It is actually a collection of mount points. You may have heard that in Linux it's possible to put different parts of the system on different disks; well, this is how that works. The contents of /usr could be on a different physical disk than /sbin, but they will still appear in the same tree.
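
To see this on a running system, the kernel exposes the current mount table in /proc/self/mounts. Here is a minimal sketch that prints it in a compact form. The helper name list_mounts is made up for illustration, and the optional file argument is only there so the function can be tried against sample data.

```shell
#!/bin/sh
# list_mounts [FILE]: print "mount point <TAB> filesystem type <TAB> source"
# for every entry in a mount table. FILE defaults to /proc/self/mounts.
# (list_mounts is a hypothetical helper, not a standard command.)
list_mounts() {
    # Fields in the mount table: source, mount point, fs type, options, ...
    awk '{ printf "%s\t%s\t%s\n", $2, $3, $1 }' "${1:-/proc/self/mounts}"
}

# Only run against the live table where it exists (i.e. on Linux).
if [ -r /proc/self/mounts ]; then
    list_mounts
fi
```

findmnt (from util-linux) shows the same information as a tree, which makes the nesting of mount points easier to see.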

The next question is: what is mounting these paths? Let's explore.

The bootloader

To work out how these filesystems get mounted, we are going to dissect a Linux distro and follow how it boots. I'm going to be using the Arch Linux ISO.

Download the Arch ISO, mount it, and navigate to the mounted directory.

╰─>$ sudo mkdir /mnt/arch-iso
╰─>$ sudo mount -o loop arch.iso /mnt/arch-iso
mount: /mnt/arch-iso: WARNING: source write-protected, mounted read-only.
╰─>$ cd /mnt/arch-iso
╰─>$ ls -la
dr-xr-xr-x    - root  2 May 05:05 arch
dr-xr-xr-x    - root  2 May 05:05 boot
dr-xr-xr-x    - root  2 May 05:05 EFI
dr-xr-xr-x    - root  2 May 05:05 loader
.r--r--r-- 941k root  2 May 05:05 shellia32.efi
.r--r--r-- 1.0M root  2 May 05:05 shellx64.efi

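The loader directory is where the bootloader configuration files are. The loader/entries layout and the key/value format below are the ones used by systemd-boot (Boot Loader Specification entries), not GRUB. Let's have a look at the default Arch entry and see what we can find.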

╰─>$ cat loader/entries/01-archiso-x86_64-linux.conf
title    Arch Linux install medium (x86_64, UEFI)
sort-key 01
linux    /arch/boot/x86_64/vmlinuz-linux
initrd   /arch/boot/x86_64/initramfs-linux.img
options  archisobasedir=arch archisosearchuuid=2024-05-01-17-04-31-00

- linux is the path to the Linux kernel image.

- initrd is the path to the initrd, or initial RAM disk.

- options are the command-line options passed to the kernel.
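
Since each line of the entry is just a key followed by whitespace and a value, pulling a field out of it takes very little code. This is a small sketch, not how the bootloader itself parses the file, and entry_field is a made-up helper name.

```shell
#!/bin/sh
# entry_field KEY FILE: print the value(s) of KEY from a boot loader entry
# file, where each line has the form "key<whitespace>value".
# (entry_field is a hypothetical helper written for this post.)
entry_field() {
    awk -v key="$1" '
        $1 == key {
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # drop the key and the gap after it
            print
        }
    ' "$2"
}
```

For example, entry_field initrd loader/entries/01-archiso-x86_64-linux.conf, run from the mounted ISO, should print the initramfs path shown above.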

The line we are interested in is initrd. It points to the initial filesystem that is loaded before the filesystem containing Arch.


Let's copy the initrd out of the ISO so we can have a closer look.

╰─>$ cp /mnt/arch-iso/arch/boot/x86_64/initramfs-linux.img initramfs-linux.img
╰─>$ file initramfs-linux.img
initramfs-linux.img: ASCII cpio archive (SVR4 with no CRC)

As you can see, it is a cpio archive. Let's have a look at what is inside.

╰─>$ cat initramfs-linux.img | cpio -idtv
-rw-r--r--   0 root     root            2 Jan  1  1970 early_cpio
drwxr-xr-x   0 root     root            0 Jan  1  1970 kernel/
drwxr-xr-x   0 root     root            0 Jan  1  1970 kernel/x86/
drwxr-xr-x   0 root     root            0 Jan  1  1970 kernel/x86/microcode/
-rw-r--r--   0 root     root        76166 Jan  1  1970 kernel/x86/microcode/AuthenticAMD.bin
-rw-r--r--   0 root     root     12897280 Jan  1  1970 kernel/x86/microcode/GenuineIntel.bin
25341 blocks

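Still no /init? This is because the image is not a single cpio archive. An initramfs image can be several archives concatenated together: here, a small uncompressed archive holding CPU microcode comes first, followed by the main archive, which is usually compressed. cpio stops at the end of the first archive, so it only shows us a small section of what the image contains, and we need a different tool for reading initramfs files.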
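
If you want a rough idea of how much of the file cpio actually read, one check is to look for the end-of-archive marker. Every cpio archive ends with an entry named TRAILER!!!, and cpio stops reading there. The sketch below prints the byte offset of the first marker. The helper name first_trailer_offset is made up, and it assumes GNU grep.

```shell
#!/bin/sh
# first_trailer_offset FILE: print the byte offset of the first cpio
# end-of-archive marker ("TRAILER!!!") in FILE.
# Hypothetical helper; relies on GNU grep's -a (treat binary as text),
# -b (print byte offset) and -o (print only the match) options.
first_trailer_offset() {
    grep -a -b -o 'TRAILER!!!' "$1" | head -n 1 | cut -d: -f1
}
```

Comparing that offset with the total size of the file (from ls -l, for example) gives a feel for how much data sits after the point where cpio stopped.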

I'm going to use mkinitcpio, but other tools such as initramfs-tools will also work.

Now, using lsinitcpio (part of mkinitcpio), we can see many more files in the initramfs, one of them being /init!

╰─>$ lsinitcpio initramfs-linux.img | wc --lines
╰─>$ lsinitcpio initramfs-linux.img | grep "^init$"

Now let's extract the initramfs so we can have a look at what init is doing.

╰─>$ mkdir extracted_initramfs
╰─>$ cd extracted_initramfs
╰─>$ lsinitcpio -x ../initramfs-linux.img
╰─>$ ls
bin           config      etc    init_functions  lib64     run   tmp  VERSION
buildconfig   dev         hooks  kernel          new_root  sbin  usr
codesign.crt  early_cpio  init   lib             proc      sys   var


In the case of Arch (and many other distros), the initramfs is a very small BusyBox system, with the init process written as a shell script and executed with ash.

lrwxrwxrwx - smc  9 May 15:00 sbin/sh -> busybox
╰─>$ head -1 init

So what does it do?

First off, it sources the init_functions script, which does a lot of the work. Inside init_functions we find a function called mount_setup, which is responsible for setting up a minimal filesystem for the rest of init.

It mounts /proc, /sys, /dev and /run, and creates symlinks for /dev/std{in,out,err}.

╰─>$ sed -n '523,542 p' init_functions
mount_setup() {
    mount -t proc proc /proc -o nosuid,noexec,nodev
    mount -t sysfs sys /sys -o nosuid,noexec,nodev
    mount -t devtmpfs dev /dev -o mode=0755,nosuid
    mount -t tmpfs run /run -o nosuid,nodev,mode=0755
    mkdir -m755 /run/initramfs

    if [ -e /sys/firmware/efi ]; then
        mount -t efivarfs efivarfs /sys/firmware/efi/efivars -o nosuid,nodev,noexec
    fi

    # Setup /dev symlinks
    if [ -e /proc/kcore ]; then
        ln -sfT /proc/kcore /dev/core
    fi
    ln -sfT /proc/self/fd /dev/fd
    ln -sfT /proc/self/fd/0 /dev/stdin
    ln -sfT /proc/self/fd/1 /dev/stdout
    ln -sfT /proc/self/fd/2 /dev/stderr
}

It then calls default_mount_handler /new_root to mount the “real” root filesystem onto /new_root, using the root option from the Linux kernel command line.

The bootloader passes root= on the kernel command line, pointing at the real root filesystem, and the init script makes it available as $root (you can read more about it here).
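
To make the idea concrete, here is a simplified sketch of pulling one key=value option out of a kernel command line. It is not the parser mkinitcpio uses, it does not handle quoted values containing spaces, and cmdline_value is a made-up helper name.

```shell
#!/bin/sh
# cmdline_value KEY [CMDLINE]: print the value of the first KEY=VALUE word
# on a kernel command line. CMDLINE defaults to the contents of /proc/cmdline.
# Simplified: quoted values containing spaces are not handled.
# (cmdline_value is a hypothetical helper, not mkinitcpio's own parser.)
cmdline_value() {
    key=$1
    line=${2-$(cat /proc/cmdline)}
    # Put one word per line, then print whatever follows "KEY=" on the
    # first word that starts with it.
    printf '%s\n' "$line" | tr ' ' '\n' |
        awk -v k="$key" 'index($0, k "=") == 1 { print substr($0, length(k) + 2); exit }'
}
```

On an installed system, cmdline_value root should print whatever the bootloader passed as root=, for example a UUID= or /dev/... value.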

╰─>$ sed -n '419,430 p' init_functions

default_mount_handler() {
    msg ":: mounting '$root' on real root"
    if ! mount -t "${rootfstype:-auto}" -o "${rwopt:-ro}${rootflags:+,$rootflags}" "$root" "$1"; then
        # shellcheck disable=SC2086
        run_hookfunctions 'run_emergencyhook' 'emergency hook' $EMERGENCYHOOKS
        err "Failed to mount '$root' on real root"
        echo "You are now being dropped into an emergency shell."
        # shellcheck disable=SC2119
        launch_interactive_shell
        msg "Trying to continue (this will most likely fail) ..."
    fi
}
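
The -o argument on the mount line above leans on two shell parameter expansions that are easy to misread. This small sketch isolates them so you can see what string gets built. build_mount_opts is a made-up name used only for this illustration.

```shell
#!/bin/sh
# build_mount_opts: assemble the -o argument the same way the mount line
# above does, from the (possibly unset) variables rwopt and rootflags.
# (build_mount_opts is a hypothetical name used only for this illustration.)
build_mount_opts() {
    # ${rwopt:-ro}              -> value of rwopt, or "ro" if unset or empty
    # ${rootflags:+,$rootflags} -> ",<flags>" only if rootflags is non-empty
    printf '%s\n' "${rwopt:-ro}${rootflags:+,$rootflags}"
}

unset rwopt rootflags
build_mount_opts                          # prints: ro
rootflags=noatime
build_mount_opts                          # prints: ro,noatime
rwopt=rw
build_mount_opts                          # prints: rw,noatime
```

So with nothing set on the kernel command line, the real root is mounted read-only with the filesystem type detected automatically.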

Finally, it calls switch_root, which moves the existing mount points (/dev etc.) into /new_root, makes /new_root the new /, and executes /sbin/init.
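
The handover is easier to follow when written out step by step. The sketch below only prints a description of the rough sequence and changes nothing on the system. It is a conceptual dry run based on how the util-linux switch_root tool is documented to behave, not its real implementation, and describe_switch_root is a made-up name.

```shell
#!/bin/sh
# describe_switch_root NEWROOT INIT: print, without running anything, the
# rough sequence of steps that a switch_root-style handover performs.
# Conceptual dry run only. The real tool also deletes the initramfs
# contents to free memory, which is omitted here.
describe_switch_root() {
    newroot=$1
    init=$2
    for dir in /dev /proc /sys /run; do
        echo "move mount $dir -> $newroot$dir"
    done
    echo "chdir to $newroot"
    echo "move mount $newroot -> /"
    echo "chroot into the new /"
    echo "exec $init as PID 1"
}

describe_switch_root /new_root /sbin/init
```

The last step is an exec rather than a fork, which is why the new init keeps process ID 1.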

And now we have our filesystem!

Checking /sbin/init on my current machine will show that this is indeed systemd.

╰─>$ ls -la /sbin/init
lrwxrwxrwx - root  2 May 07:35 /sbin/init -> ../lib/systemd/systemd


This post is part of a series. The next part will explain how to make a minimal Linux init system/distro. You can navigate this series at the top of the page.