---
layout: post
title: Booting the Bootloader
date: 2024-08-11
---

a.k.a. "How I wrote a FreeBSD Bootloader"

A few days ago I wrote [a FreeBSD bootloader](https://git.mildlyfunctional.gay/artemist/freeloader/). There are a few reasons why I did so (I want a better bootloader for [NixBSD](https://github.com/nixos-bsd/nixbsd) and I want something like [lanzaboote](https://github.com/nix-community/lanzaboote) for FreeBSD), but mostly I wrote it because I could. It runs as a UEFI application, reads a kernel from the filesystem, loads it into memory, sets arguments, and executes it, all without using any of the upstream FreeBSD "stand" code. The code sucks, changing any settings requires recompiling, it's missing features, and it certainly won't be portable. But it works. As far as I can tell, the only other project that has done that is GRUB.

After I posted about it [on fedi](https://social.mildlyfunctional.gay/@artemist/112907665770518105) someone asked if I was going to write about my project. That seemed like a fun idea, but I'm not sure how useful a rant about weird FreeBSD design decisions would be to anybody, so instead I'll talk more about my thought process for reimplementing

## 1. Setting the Scope

When I program I try to take everything into account. I'll constantly be trying to answer questions like:

* "What happens if the firmware is buggy?"
* "What if I want to port this to ARM later?"
* "What if meow?"

This can be useful when I'm trying to write secure fault-tolerant production code, but it's mostly a hindrance when I'm trying to just get something to work.

Therefore, the first step is setting the smallest scope where I've still accomplished something. This can be a bit flexible, but for this project I wanted: "load a FreeBSD kernel with serial or graphical output from a fixed path in an x86_64 VM". I didn't even put "boot from a root filesystem" in scope, but it turned out that was trivial.
This gives me a sense of accomplishment early in the process and helps banish the "what if" demons [^demon] in my head.

## 2. Understanding the Problem

Before starting any programming I like to have a good idea of what I'm interfacing with. This tends to mean first learning more general "How do I use $thing" information, then moving on to "How does $thing work". It's no use knowing how to encode the kernel environment if you have no idea what the kernel environment is.

While reading documentation sometimes gives me a starting point, it's rarely enough, so I quickly end up experimenting, trying debug features, tracing, and reading the code. A lot of these suggestions apply whether or not you have source code. You can try a bunch of inputs, `strace`, dump memory, find important functions, and sometimes enable debug logging whether or not you have the code; code is just easier to search than binaries.

In this case I already had a good idea of the user-visible parts of the boot process [^user-visible] from working on the NixBSD bootloader, so it was immediately time to figure out how the process works.

I spent around 2 days for this project just reading code and writing notes. My notes skip general concepts I already know and just include reminders and lists of information I might forget. They're probably not useful to anyone but me, but could be useful in the future if I want to write documentation. It would probably behoove me to add important code references to my notes, but I mostly end up looking through my search history trying to find what I was looking at. Please don't do this.
(have a sample of my [notes](https://git.mildlyfunctional.gay/artemist/freeloader/src/commit/fb7dcf0f401cad2fb124044df8104747c008a2ed/notes.md) to get an idea of what they include)

```markdown
## Modinfo

Loader must provide modinfo to kernel, a TLV structure

* Dump from normal FreeBSD with `sysctl debug.dump_modinfo`
* Tag is `MODINFO_*` or `MODINFO_METADATA | MODINFOMD_*`
* Tag and length are 4 bytes native endian
* Value is padded to align to `sizeof(size_t)`
* Strings are null-terminated
* Encodes multiple modules in sequence, separated by `MODINFO_NAME` string

### Fields

* `MODINFO_NAME`: string with path to file if available
```

FreeBSD keeps the loader ("stand") and kernel ("sys") code mostly separate, so I simultaneously reverse engineered the loader serialization and kernel deserialization code. Before I could do much of anything though, I needed to know where to look. The easiest starting points are often the beginning or end of a program, in this case the kernel's entry point and the part of the loader that jumps to it.

The kernel's entry point (`btext`) was relatively easy to find with `readelf -Wa kernel`. The readelf command gave me the address of the entry point. Since I was using a kernel with debug symbols, the address is linked to the function name later in the readelf output, so a quick search gave me the name, and from there [the function](https://cgit.freebsd.org/src/tree/sys/amd64/amd64/locore.S?h=release/14.1.0#n63). [^script]

The loader's exit point was also easy to find. In the standard ELF header the entry point is called `e_entry`, so I used [ripgrep](https://github.com/BurntSushi/ripgrep) with `rg e_entry` and immediately found [the function](https://cgit.freebsd.org/src/tree/stand/efi/loader/arch/amd64/elf64_freebsd.c?h=release/14.1.0#n91). From there I traced where important variables are changed, which quickly led me to the

TODO

## 3. Writing the code

My goal when writing the code is to get something that superficially produces the right output that I can fix later.

TODO

## 4. Debugging

TODO

## 5. Cleaning up the code

At this point, I generally have code that works, but is terrible. It might use hardcoded constants, have tons of unnecessary debug statements, have no configuration, or just barely work. From here I have 3 options:

* Don't clean up the code, because I have no plans to use it anymore
* Iteratively clean up the code
* Rewrite the code from scratch with more foresight, maybe copying some parts over

A lot of my projects end up in the first category because they were just experiments to see if I could. However, if I have any plans to use it in the future, the best option is normally to take a break. A few days or weeks of thinking it over and talking normally helps me figure out how to rewrite or improve the code.

This is not always advice that I follow myself. The day after I got freeloader working, I tried to refactor the `Serialize` trait, but ended up spending hours just making the code worse and threw my work away. A few days later I realized there was a much better way and could have avoided all that trouble.

## The Boot Process

With all that out of the way, here's what I discovered about the boot process:

The loader stuffs the kernel and all its dependencies into a contiguous block of physical memory, which it calls several things including `modulep` or just `addr`. I call it the "staging buffer" since it's as good a name as any. On x86 [^x86] it must be aligned on a 2MiB boundary. [^buffer]

### The kernel

The first thing the loader puts in the staging buffer is the kernel. Conveniently, the kernel is an [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) file, the same format used for programs on Linux and FreeBSD, so there's plenty of existing code for parsing it. [^interp]

Like other ELF programs, the kernel specifies the location of metadata and code in its Program Headers.
[^phdr] Although there are a few types here, the loader only cares about `LOAD` headers, each representing a segment of memory to copy.

Readelf's interpretation of my kernel's program header table:

```
Program Headers:
  Type           Offset    VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040  0xffffffff80200040 0x0000000000200040 0x000268 0x000268 R   0x8
  INTERP         0x0002a8  0xffffffff802002a8 0x00000000002002a8 0x00000d 0x00000d R   0x1
      [Requesting program interpreter: /red/herring]
  LOAD           0x000000  0xffffffff80200000 0x0000000000200000 0x17baa0 0x17baa0 R   0x200000
  LOAD           0x17baa0  0xffffffff8037baa0 0x000000000037baa0 0xd5efd8 0xd5efd8 R E 0x200000
  LOAD           0xedaa80  0xffffffff810daa80 0x00000000010daa80 0x425e1c 0x425e1c R   0x200000
  LOAD           0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x001000 RW  0x200000
  LOAD           0x1600000 0xffffffff81800000 0x0000000001800000 0x1868b0 0x600000 RW  0x200000
  DYNAMIC        0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x000180 RW  0x8
  GNU_RELRO      0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x001000 R   0x1
  GNU_STACK      0x000000  0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
  NOTE           0x1300648 0xffffffff81500648 0x0000000001500648 0x0001c0 0x0001c0 R   0x4
```

Before it can copy though, the loader takes the `VirtAddr` of the first `LOAD` segment and keeps it as an offset. That offset lets the loader place the first segment at the beginning of the staging buffer but keep the other segments at the correct relative positions. For example, if the staging buffer was at `0xacab_0000_0000`, then the loader would put the first segment of my kernel at `0xacab_0000_0000` and the second at `0xacab_0017_baa0`.

With that offset, the loader looks at each `LOAD` segment and copies from the kernel file (from `Offset` to `Offset + FileSiz` bytes in) to the staging buffer (from `VirtAddr - offset` bytes in). Note that in some cases `MemSiz > FileSiz`.
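
A minimal Python sketch of the copy step described above. This is neither the FreeBSD loader's code nor freeloader's. The names `LoadSegment` and `stage_segments` are made up for illustration, and the staging buffer is modelled as a `bytearray` indexed from zero instead of real physical memory.

```python
from dataclasses import dataclass


@dataclass
class LoadSegment:
    """One LOAD entry, using the columns readelf prints (hypothetical type)."""
    offset: int  # Offset: where the segment starts in the kernel file
    vaddr: int   # VirtAddr: where the kernel expects the segment
    filesz: int  # FileSiz: bytes actually stored in the file
    memsz: int   # MemSiz: bytes the segment occupies in memory


def stage_segments(kernel: bytes, segments: list[LoadSegment]) -> bytearray:
    """Lay out LOAD segments relative to the first segment's VirtAddr."""
    base = segments[0].vaddr  # the "offset" the loader remembers
    size = max(s.vaddr + s.memsz for s in segments) - base
    staging = bytearray(size)  # bytearray() starts zero-filled
    for seg in segments:
        if seg.offset + seg.filesz > len(kernel):
            raise ValueError("segment extends past the end of the file")
        dest = seg.vaddr - base
        # Copy FileSiz bytes; the remaining MemSiz - FileSiz bytes stay zero.
        staging[dest:dest + seg.filesz] = kernel[seg.offset:seg.offset + seg.filesz]
    return staging
```

With the table above, the second segment's destination works out to `0xffffffff8037baa0 - 0xffffffff80200000 = 0x17baa0` bytes into the buffer, which matches the `0xacab_0017_baa0` example. A real loader would also have to allocate the buffer at a 2MiB-aligned physical address, which this sketch ignores.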
The loader zeroes the excess `MemSiz - FileSiz` bytes in the staging buffer, and the kernel uses them for uninitialized global variables (placed in the section `.bss`).

### The Kernel 2: Electric Boogaloo

At this point all the kernel's code is in RAM, but it's missing the `.symtab` and `.strtab` sections [^symtab] that the kernel will need later to load modules.

The loader finds these sections by looking at the aptly-named Section Header Table. Sections include info about the purpose of different parts of the file (e.g. `.text` for code, `.rodata` for constants) that are useful for linkers but not normally needed to run a program.

Readelf's interpretation of my kernel's section header table:

```
Section Headers:
  [Nr] Name              Type            Addr             Off     Size   ES Flg Lk Inf   Al
  [ 0]                   NULL            0000000000000000 000000  000000 00      0 0     0
  [ 1] .interp           PROGBITS        ffffffff802002a8 0002a8  00000d 00   A  0 0     1
  ...
  [ 9] .text             PROGBITS        ffffffff8037c000 17c000  d5ea78 00  AX  0 0     4096
  ...
  [58] .SUNW_ctf         PROGBITS        0000000000000000 1abdde8 105984 00     59 0     4
  [59] .symtab           SYMTAB          0000000000000000 17869a8 189d38 18     60 43442 8
  [60] .strtab           STRTAB          0000000000000000 1910a49 1ad39c 00      0 0     1
```

The loader only needs to give the kernel `.strtab` (which lists the names of functions, global variables, and other "symbols") and `.symtab` (which provides the address and type of those symbols). The two sections are only useful with one another, so `.symtab` includes a link to its `.strtab`. Readelf shows this with the `Lk` field, as in the table above.

Readelf's interpretation of my kernel's symtab and strtab:

```
Symbol table '.symtab' contains 67213 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: ffffffff8037c05a     0 NOTYPE  LOCAL  DEFAULT    9 l1
     2: ffffffff8037c080     0 NOTYPE  LOCAL  DEFAULT    9 l2
     3: ffffffff8037c570    10 FUNC    LOCAL  DEFAULT    9 camstatusentrycomp
     4: ffffffff81808000   112 OBJECT  LOCAL  DEFAULT   48 sysctl___kern_features_scbus
     ...
```

With the "why" out of the way, the "how" is relatively simple.
The loader:

* Searches the section header table for an entry with type `SYMTAB`
* Writes the length of the symtab section immediately after the kernel
* Copies the symtab section immediately after its length
* Writes the length of the linked strtab section after the symtab
* Copies the strtab section immediately after its length

This leaves the following structure immediately after the kernel (lower addresses at the bottom):

* strtab contents
* strtab length
* symtab contents
* symtab length
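
The layout above can be sketched in a few lines of Python. This is an illustration, not the loader's actual code. The function name `append_symbols` is hypothetical, and the 8-byte little-endian length field is an assumption of mine for amd64. The text above does not state the width, and the real loader may add alignment padding that is not modelled here.

```python
import struct


def append_symbols(staging: bytearray, symtab: bytes, strtab: bytes) -> tuple[int, int]:
    """Append length-prefixed symtab and strtab to the staging buffer.

    Returns the (start, end) offsets of the appended structure.
    Assumption: each length is an 8-byte little-endian integer, no padding.
    """
    start = len(staging)
    for section in (symtab, strtab):
        staging.extend(struct.pack("<Q", len(section)))  # length first
        staging.extend(section)                          # then the contents
    return start, len(staging)
```

Calling `append_symbols(staging, symtab_bytes, strtab_bytes)` on a buffer that already holds the kernel puts the symtab length at the lowest address and the strtab contents at the highest, and hands back the two offsets that have to be remembered.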
The loader then remembers the start and end address of this structure for later.

TODO

### Environment

TODO

### Modinfo

TODO

### Booting

TODO

#### Footnotes

[^demon]: Wait, this is BSD, it's named "beastie" and I want to load it, not banish it

[^user-visible]: Things that a knowledgeable system administrator might know about, like kernel environment, module loading, and memdisks

[^script]: I think I did this, but it's also possible that I used the [linker script](https://cgit.freebsd.org/src/tree/sys/conf/ldscript.amd64?h=release/14.1.0#n3)

[^x86]: I think the 2MiB alignment limitation is x86-specific because of the [horrible code](https://cgit.freebsd.org/src/tree/sys/amd64/amd64/machdep.c?h=release/14.1.0#n1273) that causes it, but I haven't actually tried any other architectures

[^buffer]: Historically on x86 this would start at 2MiB (physical address `0x20_0000`) but this isn't possible on modern systems where part could be reserved by EFI.

[^interp]: In fact it's so similar that users could accidentally run it as a program and get confused. To stop this, the kernel's interpreter is set to `/red/herring`.

[^elf]: or is it "elves"?

[^phdr]: Confusingly each entry is called a "Program Header" and is in the "Program Header Table"

[^symtab]: Technically `.symtab` and `.strtab` could be copied as part of a `LOAD` segment and the kernel will know where to look, but I haven't seen it.