diff --git a/_drafts/draft-freebsd-bootloader.md b/_drafts/draft-freebsd-bootloader.md
new file mode 100644
index 0000000..c83ccfa
--- /dev/null
+++ b/_drafts/draft-freebsd-bootloader.md
@@ -0,0 +1,241 @@
+---
+layout: post
+title: Booting the Bootloader
+date: 2024-08-11
+---
+
+a.k.a. "How I wrote a FreeBSD Bootloader"
+
+A few days ago I wrote [a FreeBSD bootloader](https://git.mildlyfunctional.gay/artemist/freeloader/).
+There are a few reasons why I did so (I want a better bootloader for [NixBSD](https://github.com/nixos-bsd/nixbsd) and
+I want something like [lanzaboote](https://github.com/nix-community/lanzaboote) for FreeBSD),
+but mostly I wrote it because I could.
+
+It runs as a UEFI application, reads a kernel from the filesystem, loads it into memory, sets arguments,
+and executes it, all without using any of the upstream FreeBSD "stand" code. The code sucks, changing any
+settings requires recompiling, it's missing features, and it certainly won't be portable. But it works.
+As far as I can tell, the only other project that has done that is GRUB.
+
+After I posted about it [on fedi](https://social.mildlyfunctional.gay/@artemist/112907665770518105) someone asked
+if I was going to write about my project. That seemed like a fun idea, but I'm not sure how useful
+a rant about weird FreeBSD design decisions would be to anybody, so instead I'll talk more about my thought
+process for reimplementing
+
+## 1. Setting the Scope
+When I program I try to take everything into account. I'll constantly be trying to answer questions like:
+* "What happens if the firmware is buggy?"
+* "What if I want to port this to ARM later?"
+* "What if meow?"
+
+This can be useful when I'm trying to write secure, fault-tolerant production code,
+but it's mostly a hindrance when I'm trying to just get something to work.
+
+Therefore, the first step is setting the smallest scope where I've still accomplished something.
+This can be a bit flexible, but for this project I wanted: "load a FreeBSD kernel with serial or graphical output
+from a fixed path in an x86_64 VM". I didn't even put "boot from a root filesystem" in scope, but it turned out
+that was trivial.
+
+This gives me a sense of accomplishment early in the process and helps banish the "what if" demons [^demon] in my head.
+
+## 2. Understanding the Problem
+Before starting any programming I like to have a good idea of what I'm interfacing with.
+This tends to mean first learning more general "How do I use $thing" information
+then moving on to "How does $thing work".
+It's no use knowing how to encode kernel environment if you have no idea what kernel environment is.
+
+While reading documentation sometimes gives me a starting point, it's rarely enough so I quickly
+end up experimenting, trying debug features, tracing, and reading the code.
+
+A lot of these suggestions apply whether or not you have source code.
+You can try a bunch of inputs, `strace`, dump memory, find important functions, and sometimes enable debug logging
+whether or not you have the code; code is just easier to search than binaries.
+
+In this case I already had a good idea of the user-visible parts of the boot process [^user-visible]
+from working on the NixBSD bootloader so it was immediately time to figure out how the process works.
+
+I spent around 2 days on this project just reading code and writing notes.
+My notes skip general concepts I already know and just include reminders and lists of
+information I might forget. They're probably not useful to anyone but me, but
+could be useful in the future if I want to write documentation.
+
+It would probably behoove me to add important code references to my notes,
+but I mostly end up looking through my search history trying to find what I was looking at.
+Please don't do this.
+
+(have a sample of my [notes](https://git.mildlyfunctional.gay/artemist/freeloader/src/commit/fb7dcf0f401cad2fb124044df8104747c008a2ed/notes.md) to get an idea of what they include)
+```markdown
+## Modinfo
+Loader must provide modinfo to kernel, a TLV structure
+
+* Dump from normal FreeBSD with `sysctl debug.dump_modinfo`
+* Tag is `MODINFO_*` or `MODINFO_METADATA | MODINFOMD_*`
+* Tag and length are 4 bytes native endian
+* Value is padded to align to `sizeof(size_t)`
+* Strings are null-terminated
+* Encodes multiple modules in sequence, separated by `MODINFO_NAME` string
+
+### Fields
+* `MODINFO_NAME`: string with path to file if available
+```
+
+FreeBSD keeps the loader ("stand") and kernel ("sys") code mostly separate, so I simultaneously reverse engineered the
+loader serialization and kernel deserialization code.
+
+Before I could do much of anything though, I needed to know where to look.
+The easiest starting points are often the beginning or end of a program,
+in this case the kernel's entry point and the part of the loader that jumps to it.
+
+The kernel's entry point (`btext`) was relatively easy to find with `readelf -Wa kernel`.
+The readelf command gave me the address of the entry point. Since I was using a kernel with
+debug symbols, the address is linked to the function name later in the ELF output,
+so a quick search gave me the name, and from there [the function](https://cgit.freebsd.org/src/tree/sys/amd64/amd64/locore.S?h=release/14.1.0#n63). [^script]
+
+The loader's exit point was also easy to find. In the standard ELF header the entry point is called
+`e_entry`, so I used [ripgrep](https://github.com/BurntSushi/ripgrep) with `rg e_entry`
+and immediately found [the function](https://cgit.freebsd.org/src/tree/stand/efi/loader/arch/amd64/elf64_freebsd.c?h=release/14.1.0#n91).
+
+From there I traced where important variables are changed, which quickly led me to the
+
+## 3. Writing the code
+My goal when writing the code is to get something that superficially produces the right output
+that I can fix later.
+
+## 4. Debugging
+
+## 5. Cleaning up the code
+At this point, I generally have code that works, but is terrible. It might use hardcoded constants, have tons of unnecessary debug statements, have no configuration, or just barely work.
+
+From here I have 3 options:
+* Don't clean up the code, because I have no plans to use it anymore
+* Iteratively clean up the code
+* Rewrite the code from scratch with more foresight, maybe copying some parts over
+
+A lot of my projects end up in the first category because they were just experiments to see if I could.
+
+However, if I have any plans to use it in the future, the best option is normally to take a break.
+A few days or weeks of thinking it over and talking it through normally help me figure out how to rewrite or improve
+the code.
+
+This is not always advice that I follow myself. The day after I got freeloader working,
+I tried to refactor the `Serialize` trait, but ended up spending hours just making the code worse
+and threw my work away.
+
+A few days later I realized there was a much better way and could have avoided all that trouble.
+
+
+## The Boot Process
+With all that out of the way, here's what I discovered about the boot process:
+
+The loader stuffs the kernel and all its dependencies into a contiguous block of physical memory,
+which it calls several things including `modulep` or just `addr`.
+I call it the "staging buffer" since it's as good a name as any.
+On x86 [^x86] it must be aligned on a 2MiB boundary. [^buffer]
+
+### The kernel
+The first thing the loader puts in the staging buffer is the kernel.
+Conveniently, the kernel is an [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format),
+which is also used for programs on Linux and FreeBSD, so there's plenty of existing code for parsing it. [^interp]
+
+Like other ELF programs, the kernel specifies the location of metadata and code in its
+Program Headers. [^phdr] Although there are a few types here, the loader only cares about `LOAD`
+headers, each representing a segment of memory to copy.
+
+Readelf's interpretation of my kernel's program header table:
+```
+Program Headers:
+  Type           Offset    VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
+  PHDR           0x000040  0xffffffff80200040 0x0000000000200040 0x000268 0x000268 R   0x8
+  INTERP         0x0002a8  0xffffffff802002a8 0x00000000002002a8 0x00000d 0x00000d R   0x1
+      [Requesting program interpreter: /red/herring]
+  LOAD           0x000000  0xffffffff80200000 0x0000000000200000 0x17baa0 0x17baa0 R   0x200000
+  LOAD           0x17baa0  0xffffffff8037baa0 0x000000000037baa0 0xd5efd8 0xd5efd8 R E 0x200000
+  LOAD           0xedaa80  0xffffffff810daa80 0x00000000010daa80 0x425e1c 0x425e1c R   0x200000
+  LOAD           0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x001000 RW  0x200000
+  LOAD           0x1600000 0xffffffff81800000 0x0000000001800000 0x1868b0 0x600000 RW  0x200000
+  DYNAMIC        0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x000180 RW  0x8
+  GNU_RELRO      0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x001000 R   0x1
+  GNU_STACK      0x000000  0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
+  NOTE           0x1300648 0xffffffff81500648 0x0000000001500648 0x0001c0 0x0001c0 R   0x4
+```
+
+Before it can copy though, the loader takes the `VirtAddr` of the first `LOAD` segment and
+keeps it as an offset. That offset lets the loader place the first segment at the beginning
+of the staging buffer but keep the other segments at the correct relative positions. For example, if the staging buffer was at `0xacab_0000_0000`, then the loader would put the first segment of my kernel at `0xacab_0000_0000` and the second at `0xacab_0017_baa0`.
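
Here's a minimal Python sketch of that address arithmetic, using the `LOAD` rows from the readelf output above. The names `LOAD_SEGMENTS`, `STAGING_BASE`, and `staging_addresses` are made up for illustration and don't come from the FreeBSD loader or freeloader. It also assumes the first `LOAD` segment is the one with the lowest `VirtAddr`, which holds for my kernel.

```python
# Hypothetical sketch: where each LOAD segment lands in the staging buffer.
# Each tuple is (VirtAddr, FileSiz, MemSiz), copied from the readelf output.
LOAD_SEGMENTS = [
    (0xFFFFFFFF80200000, 0x17BAA0, 0x17BAA0),
    (0xFFFFFFFF8037BAA0, 0xD5EFD8, 0xD5EFD8),
    (0xFFFFFFFF810DAA80, 0x425E1C, 0x425E1C),
    (0xFFFFFFFF81600000, 0x000180, 0x001000),
    (0xFFFFFFFF81800000, 0x1868B0, 0x600000),
]

# Example staging buffer address from the text (2 MiB aligned).
STAGING_BASE = 0xACAB_0000_0000


def staging_addresses(staging_base, segments):
    """Return the address of each segment inside the staging buffer."""
    # The VirtAddr of the first LOAD segment is the offset to subtract.
    offset = segments[0][0]
    return [staging_base + (virt_addr - offset) for virt_addr, _, _ in segments]


addresses = staging_addresses(STAGING_BASE, LOAD_SEGMENTS)
for address in addresses:
    print(hex(address))
```

The first two printed values are `0xacab00000000` and `0xacab0017baa0`, matching the example above.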
+
+With that offset, the loader looks at each `LOAD` segment and copies from the kernel file (from `Offset` to `Offset + FileSiz` bytes in) to the staging buffer (starting `VirtAddr - offset` bytes in).
+
+Note that in some cases `MemSiz > FileSiz`. The loader zeroes the excess amount in the staging buffer,
+and the kernel uses it for uninitialized global variables (placed in the section `.bss`).
+
+### The Kernel 2: Electric Boogaloo
+At this point all the kernel's code is in RAM,
+but it's missing the `.symtab` and `.strtab` sections [^symtab] that the kernel will need later to load modules.
+
+The loader finds these sections by looking at the aptly-named Section Header Table.
+Sections include info about the purpose of different parts of the file (e.g. `.text` for code, `.rodata` for constants)
+that are useful for linkers but not normally needed to run a program.
+
+Readelf's interpretation of my kernel's section header table:
+```
+Section Headers:
+  [Nr] Name              Type            Addr             Off     Size   ES Flg Lk Inf   Al
+  [ 0]                   NULL            0000000000000000 000000  000000 00      0 0     0
+  [ 1] .interp           PROGBITS        ffffffff802002a8 0002a8  00000d 00   A  0 0     1
+...
+  [ 9] .text             PROGBITS        ffffffff8037c000 17c000  d5ea78 00  AX  0 0     4096
+...
+  [58] .SUNW_ctf         PROGBITS        0000000000000000 1abdde8 105984 00     59 0     4
+  [59] .symtab           SYMTAB          0000000000000000 17869a8 189d38 18     60 43442 8
+  [60] .strtab           STRTAB          0000000000000000 1910a49 1ad39c 00      0 0     1
+```
+
+The loader only needs to give the kernel `.strtab`, which lists the names of functions, global variables, and other "symbols", and `.symtab`, which provides the address and type of those symbols.
+The two sections are only useful with one another, so `.symtab` includes a link to its `.strtab`.
+Readelf shows this with the `Lk` field, as in the table above.
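
As a rough illustration of following that link, here's a small Python sketch over three rows of the section header table above. The `Section` tuple and `find_symtab_and_strtab` are hypothetical names for this example only; the type numbers are the standard ELF values for `PROGBITS`, `SYMTAB`, and `STRTAB`.

```python
# Hypothetical sketch: follow .symtab's link field to find its string table.
from collections import namedtuple

SHT_PROGBITS = 1
SHT_SYMTAB = 2
SHT_STRTAB = 3

Section = namedtuple("Section", "name type offset size entsize link")

# Rows 58..60 from the readelf output; the other sections are left out,
# so the table is keyed by section index instead of being a plain list.
sections = {
    58: Section(".SUNW_ctf", SHT_PROGBITS, 0x1ABDDE8, 0x105984, 0x00, 59),
    59: Section(".symtab", SHT_SYMTAB, 0x17869A8, 0x189D38, 0x18, 60),
    60: Section(".strtab", SHT_STRTAB, 0x1910A49, 0x1AD39C, 0x00, 0),
}


def find_symtab_and_strtab(sections):
    """Return the first SYMTAB section and the section its link points at."""
    for section in sections.values():
        if section.type == SHT_SYMTAB:
            return section, sections[section.link]
    raise ValueError("no SYMTAB section found")


symtab, strtab = find_symtab_and_strtab(sections)
entry_count = symtab.size // symtab.entsize
print(symtab.name, strtab.name, entry_count)
```

Dividing the symtab size by its entry size (`ES`) gives 67213, the entry count readelf reports for `.symtab`.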
+
+Readelf's interpretation of my kernel's symtab and strtab:
+```
+Symbol table '.symtab' contains 67213 entries:
+   Num:    Value          Size Type    Bind   Vis      Ndx Name
+     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
+     1: ffffffff8037c05a     0 NOTYPE  LOCAL  DEFAULT    9 l1
+     2: ffffffff8037c080     0 NOTYPE  LOCAL  DEFAULT    9 l2
+     3: ffffffff8037c570    10 FUNC    LOCAL  DEFAULT    9 camstatusentrycomp
+     4: ffffffff81808000   112 OBJECT  LOCAL  DEFAULT   48 sysctl___kern_features_scbus
+...
+```
+
+With the "why" out of the way, the "how" is relatively simple. The loader:
+* Searches the section header table for an entry with type `SYMTAB`
+* Copies the length of the symtab section immediately after the kernel
+* Copies the symtab section immediately after its length
+* Copies the length of the linked strtab section after the symtab
+* Copies the strtab section immediately after its length
+
+This leaves the following structure immediately after the kernel (lower addresses on the bottom):
+
+<table>
+  <tr><td>strtab contents</td></tr>
+  <tr><td>strtab length</td></tr>
+  <tr><td>symtab contents</td></tr>
+  <tr><td>symtab length</td></tr>
+</table>
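
That layout can be sketched in a few lines of Python. This is an illustration, not the loader's code: `pack_symbol_sections` is a made-up name, and I'm assuming each length is an 8-byte little-endian integer with no padding between fields, which may not match what the real loader writes.

```python
# Hypothetical sketch: length-prefixed symtab and strtab, lowest address first.
import struct


def pack_symbol_sections(symtab: bytes, strtab: bytes) -> bytes:
    """Concatenate symtab length, symtab, strtab length, strtab."""
    # Assumption: lengths are 8-byte little-endian values with no padding
    # between fields; the real loader may use a different width or alignment.
    return b"".join([
        struct.pack("<Q", len(symtab)),
        symtab,
        struct.pack("<Q", len(strtab)),
        strtab,
    ])


# Dummy data: one 24-byte symbol entry and a tiny string table.
fake_symtab = b"\x01" * 24
fake_strtab = b"\0btext\0"
blob = pack_symbol_sections(fake_symtab, fake_strtab)
print(len(blob))
```

Reading the result from offset 0 upward gives the table above from bottom to top.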
+
+The loader then remembers the start and end address of this structure for later.
+
+### Environment
+
+### Modinfo
+
+### Booting
+
+
+#### Footnotes
+[^demon]: Wait, this is BSD, it's named "beastie" and I want to load it, not banish it
+[^user-visible]: Things that a knowledgeable system administrator might know about, like kernel environment, module loading, and memdisks
+[^script]: I think I did this, but it's also possible that I used the [linker script](https://cgit.freebsd.org/src/tree/sys/conf/ldscript.amd64?h=release/14.1.0#n3)
+[^x86]: I think the 2MiB alignment limitation is x86-specific because of the [horrible code](https://cgit.freebsd.org/src/tree/sys/amd64/amd64/machdep.c?h=release/14.1.0#n1273) that causes it, but I haven't actually tried any other architectures
+[^buffer]: Historically on x86 this would start at 2MiB (physical address `0x20_0000`) but this isn't possible on modern systems where part could be reserved by EFI.
+[^interp]: In fact it's so similar that users could accidentally run it as a program and get confused. To stop this, the kernel's interpreter is set to `/red/herring`.
+[^elf]: or is it "elves"?
+[^phdr]: Confusingly each entry is called a "Program Header" and is in the "Program Header Table"
+[^symtab]: Technically `.symtab` and `.strtab` could be copied as part of a `LOAD` segment and the kernel will know where to look, but I haven't seen it.