Add FreeBSD bootloader draft

2024-08-12 04:27:02 +00:00 · 2024-08-12 04:27:02 +00:00 · 54abcdc177
parent 763efc908a
commit 54abcdc177
1 changed files with 247 additions and 0 deletions
--- a/_drafts/draft-freebsd-bootloader.md
+++ b/_drafts/draft-freebsd-bootloader.md
@@ -0,0 +1,247 @@
+---
+layout: post
+title: Booting the Bootloader
+date: 2024-08-11
+---
+
+a.k.a. "How I wrote a FreeBSD Bootloader"
+
+A few days ago I wrote [a FreeBSD bootloader](https://git.mildlyfunctional.gay/artemist/freeloader/).
+There are a few reasons why I did so (I want a better bootloader for [NixBSD](https://github.com/nixos-bsd/nixbsd) and
+I want something like [lanzaboote](https://github.com/nix-community/lanzaboote) for FreeBSD),
+but mostly I wrote it because I could.
+
+It runs as a UEFI application, reads a kernel from the filesystem, loads it into memory, sets arguments,
+and executes it, all without using any of the upstream FreeBSD "stand" code. The code sucks, changing any
+settings requires recompiling, it's missing features, and it certainly won't be portable. But it works.
+As far as I can tell, the only other project that has done that is GRUB.
+
+After I posted about it [on fedi](https://social.mildlyfunctional.gay/@artemist/112907665770518105) someone asked
+if I was going to write about my project. That seemed like a fun idea, but I'm not sure how useful
+a rant about weird FreeBSD design decisions would be to anybody, so instead I'll talk more about my thought
+process for reimplementing 
+
+## 1. Setting the Scope
+When I program I try to take everything into account. I'll constantly be trying to answer questions like:
+* "What happens if the firmware is buggy?"
+* "What if I want to port this to ARM later?"
+* "What if meow?"
+
+This can be useful when I'm trying to write secure, fault-tolerant production code,
+but it's mostly a hindrance when I'm just trying to get something to work.
+
+Therefore, the first step is setting the smallest scope where I've still accomplished something.
+This can be a bit flexible, but for this project I wanted: "load a FreeBSD kernel with serial or graphical output
+from a fixed path in an x86_64 VM". I didn't even put "boot from a root filesystem" in scope, but it turned out
+that was trivial.
+
+This gives me a sense of accomplishment early in the process and helps banish the "what if" demons [^demon] in my head.
+
+## 2. Understanding the Problem
+Before starting any programming I like to have a good idea of what I'm interfacing with.
+This tends to mean first learning the more general "How do I use $thing" information,
+then moving on to "How does $thing work".
+It's no use knowing how to encode the kernel environment if you have no idea what the kernel environment is.
+
+While reading documentation sometimes gives me a starting point, it's rarely enough so I quickly
+end up experimenting, trying debug features, tracing, and reading the code.
+
+A lot of these suggestions apply whether or not you have source code.
+You can try a bunch of inputs, run `strace`, dump memory, find important functions, and sometimes enable debug logging
+either way; source code is just easier to search than binaries.
+
+In this case I already had a good idea of the user-visible parts of the boot process [^user-visible]
+from working on the NixBSD bootloader so it was immediately time to figure out how the process works.
+
+I spent around 2 days on this project just reading code and writing notes.
+My notes skip general concepts I already know and just include reminders and lists of
+information I might forget. They're probably not useful to anyone but me, but
+could be useful in the future if I want to write documentation.
+
+It would probably behoove me to add important code references to my notes,
+but I mostly end up looking through my search history trying to find what I was looking at.
+Please don't do this.
+
+(have a sample of my [notes](https://git.mildlyfunctional.gay/artemist/freeloader/src/commit/fb7dcf0f401cad2fb124044df8104747c008a2ed/notes.md) to get an idea of what they include)
+```markdown
+## Modinfo
+Loader must provide modinfo to kernel, a TLV structure
+
+* Dump from normal FreeBSD with `sysctl debug.dump_modinfo`
+* Tag is `MODINFO_*` or `MODINFO_METADATA | MODINFOMD_*`
+* Tag and length are 4 bytes native endian
+* Value is padded to align to `sizeof(size_t)`
+* Strings are null-terminated
+* Encodes multiple modules in sequence, separated by `MODINFO_NAME` string
+
+### Fields
+* `MODINFO_NAME`: string with path to file if available
+```
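+
+As a rough illustration of the TLV layout in those notes, here is a minimal Python sketch that encodes a single record.
+The tag value is what I recall from `sys/sys/linker.h`, and the choice to store the unpadded length in the length field
+is an assumption; `encode_record`, `encode_string`, and the example path are hypothetical names picked for this sketch.
+
+```python
+import struct
+
+# Tag value as recalled from sys/sys/linker.h; treat it as an assumption.
+MODINFO_NAME = 0x0001
+
+SIZE_T = 8  # sizeof(size_t) on amd64
+
+
+def encode_record(tag: int, value: bytes) -> bytes:
+    """Encode one record: 4-byte tag, 4-byte length, value padded to SIZE_T."""
+    padding = -len(value) % SIZE_T
+    # "=" selects native byte order without implicit struct padding.
+    # Assumption: the length field holds the unpadded length of the value.
+    return struct.pack("=II", tag, len(value)) + value + b"\0" * padding
+
+
+def encode_string(tag: int, text: str) -> bytes:
+    """Strings are stored null-terminated."""
+    return encode_record(tag, text.encode() + b"\0")
+
+
+# Example path only; any module path would do.
+record = encode_string(MODINFO_NAME, "/boot/kernel/kernel")
+print(record.hex(" ", 4))
+```
+
+Multiple records written back to back would then form the sequence the notes describe, with each `MODINFO_NAME` record starting a new module.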
+
+FreeBSD keeps the loader ("stand") and kernel ("sys") code mostly separate, so I simultaneously reverse engineered the
+loader serialization and kernel deserialization code.
+
+Before I could do much of anything though, I needed to know where to look.
+The easiest starting points are often the beginning or end of a program,
+in this case the kernel's entry point and the part of the loader that jumps to it.
+
+The kernel's entry point (`btext`) was relatively easy to find with `readelf -Wa kernel`.
+The readelf command gave me the address of the entry point. Since I was using a kernel with
+debug symbols, the address is linked to the function name later in the ELF output,
+so a quick search gave me the name, and from there [the function](https://cgit.freebsd.org/src/tree/sys/amd64/amd64/locore.S?h=release/14.1.0#n63). [^script]
+
+The loader's exit point was also easy to find. In the standard ELF header, the entry point field is called
+`e_entry`, so I searched with [ripgrep](https://github.com/BurntSushi/ripgrep) using `rg e_entry`
+and immediately found [the function](https://cgit.freebsd.org/src/tree/stand/efi/loader/arch/amd64/elf64_freebsd.c?h=release/14.1.0#n91).
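+
+For reference, `e_entry` sits at a fixed position in the ELF64 header, so it can also be read directly.
+The Python sketch below is only an illustration and not how either the loader or `readelf` does it;
+`read_entry` is a hypothetical helper, and the header it parses is synthetic with a placeholder entry address.
+
+```python
+import struct
+
+# ELF64 header up to e_entry: 16-byte e_ident, e_type (u16),
+# e_machine (u16), e_version (u32), e_entry (u64)
+ELF64_HEAD = struct.Struct("<16sHHIQ")
+
+
+def read_entry(header: bytes) -> int:
+    """Return e_entry from the start of a 64-bit little-endian ELF file."""
+    e_ident, _type, _machine, _version, e_entry = ELF64_HEAD.unpack_from(header)
+    if e_ident[:4] != b"\x7fELF":
+        raise ValueError("not an ELF file")
+    if e_ident[4] != 2 or e_ident[5] != 1:
+        raise ValueError("expected a 64-bit little-endian ELF")
+    return e_entry
+
+
+# Synthetic header for demonstration; with a real kernel you would pass
+# the first 32 bytes of the file. The entry address is a placeholder.
+ident = b"\x7fELF\x02\x01\x01" + b"\0" * 9
+fake = ELF64_HEAD.pack(ident, 2, 62, 1, 0xFFFFFFFF8037C000)
+print(hex(read_entry(fake)))
+```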
+
+From there I traced where important variables are changed, which quickly led me to the 
+
+TODO
+
+## 3. Writing the code
+My goal when writing the code is to get something that superficially produces the right output
+that I can fix later. 
+
+TODO
+
+## 4. Debugging
+TODO
+
+## 5. Cleaning up the code
+At this point, I generally have code that works, but is terrible. It might use hardcoded constants, have tons of unnecessary debug statements, have no configuration, or just barely work.
+
+From here I have 3 options:
+* Don't clean up the code, because I have no plans to use it anymore
+* Iteratively clean up the code
+* Rewrite the code from scratch with more foresight, maybe copying some parts over
+
+A lot of my projects end up in the first category because they were just experiments to see if I could.
+
+However, if I have any plans to use it in the future, the best option is normally to take a break.
+A few days or weeks of thinking it over and talking normally help me figure out how to rewrite or improve
+the code.
+
+This is not always advice that I follow myself. The day after I got freeloader working,
+I tried to refactor the `Serialize` trait, but ended up spending hours just making the code worse
+and threw my work away.
+
+A few days later I realized there was a much better way and could have avoided all that trouble.
+
+
+## The Boot Process
+With all that out of the way, here's what I discovered about the boot process:
+
+The loader stuffs the kernel and all its dependencies into a contiguous block of physical memory,
+which it calls several things including `modulep` or just `addr`.
+I call it the "staging buffer" since it's as good a name as any.
+On x86 [^x86] it must be aligned on a 2MiB boundary. [^buffer]
+
+### The kernel
+The first thing the loader puts in the staging buffer is the kernel.
+Conveniently, the kernel is an [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) file,
+the same format used for programs on Linux and FreeBSD, so there's plenty of existing code for parsing it. [^interp]
+
+Like other ELF programs, the kernel specifies the location of metadata and code in its
+Program Headers. [^phdr] Although there are a few types here, the loader only cares about `LOAD`
+headers, each representing a segment of memory to copy.
+
+Readelf's interpretation of my kernel's program header table:
+```
+Program Headers:
+  Type           Offset    VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
+  PHDR           0x000040  0xffffffff80200040 0x0000000000200040 0x000268 0x000268 R   0x8
+  INTERP         0x0002a8  0xffffffff802002a8 0x00000000002002a8 0x00000d 0x00000d R   0x1
+      [Requesting program interpreter: /red/herring]
+  LOAD           0x000000  0xffffffff80200000 0x0000000000200000 0x17baa0 0x17baa0 R   0x200000
+  LOAD           0x17baa0  0xffffffff8037baa0 0x000000000037baa0 0xd5efd8 0xd5efd8 R E 0x200000
+  LOAD           0xedaa80  0xffffffff810daa80 0x00000000010daa80 0x425e1c 0x425e1c R   0x200000
+  LOAD           0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x001000 RW  0x200000
+  LOAD           0x1600000 0xffffffff81800000 0x0000000001800000 0x1868b0 0x600000 RW  0x200000
+  DYNAMIC        0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x000180 RW  0x8
+  GNU_RELRO      0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x001000 R   0x1
+  GNU_STACK      0x000000  0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
+  NOTE           0x1300648 0xffffffff81500648 0x0000000001500648 0x0001c0 0x0001c0 R   0x4
+```
+
+Before it can copy though, the loader takes the `VirtAddr` of the first `LOAD` segment and
+keeps it as an offset. That offset lets the loader place the first segment at the beginning
+of the staging buffer but keep the other segments at the correct relative positions. For example, if the staging buffer was at `0xacab_0000_0000`, then the loader would put the first segment of my kernel at `0xacab_0000_0000` and the second at `0xacab_0017_baa0`.
+
+With that offset, the loader looks at each `LOAD` segment and copies from the kernel file (from `Offset` to `Offset + FileSiz` bytes in) to the staging buffer (from `VirtAddr - <load offset>` bytes in).
+
+Note that in some cases `MemSiz > FileSiz`. The loader zeroes the excess amount in the staging buffer,
+and the kernel uses it for uninitialized global variables (placed in the section `.bss`).
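+
+The copying rule above can be sketched in a few lines of Python. This is a simplified model rather than the loader's real code:
+`Load` and `stage` are hypothetical names, the staging buffer is a plain `bytearray` instead of physical memory,
+and the demo uses tiny made-up segments so it runs instantly.
+
+```python
+from typing import NamedTuple
+
+
+class Load(NamedTuple):
+    offset: int   # Offset: position in the kernel file
+    vaddr: int    # VirtAddr from the program header
+    filesz: int   # FileSiz: bytes present in the file
+    memsz: int    # MemSiz: bytes occupied in memory
+
+
+def stage(kernel: bytes, loads: list[Load]) -> bytearray:
+    """Copy each LOAD segment to (VirtAddr - load offset) in a new buffer."""
+    base = loads[0].vaddr  # the "load offset" described above
+    end = max(seg.vaddr - base + seg.memsz for seg in loads)
+    staging = bytearray(end)  # zero-filled, which covers MemSiz > FileSiz
+    for seg in loads:
+        start = seg.vaddr - base
+        staging[start:start + seg.filesz] = kernel[seg.offset:seg.offset + seg.filesz]
+    return staging
+
+
+# For the kernel in the table:
+# 0xffffffff8037baa0 - 0xffffffff80200000 == 0x17baa0
+
+# Made-up miniature "kernel" with three segments; the last one has
+# MemSiz > FileSiz, so its tail stays zeroed like .bss.
+kernel = b"AAAABBBBCC"
+loads = [Load(0, 0x1000, 4, 4), Load(4, 0x1008, 4, 4), Load(8, 0x1010, 2, 6)]
+staging = stage(kernel, loads)
+print(staging)
+```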
+
+### The Kernel 2: Electric Boogaloo
+At this point all the kernel's code is in RAM,
+but it's missing the `.symtab` and `.strtab` sections [^symtab] that the kernel will need later to load modules.
+
+The loader finds these sections by looking at the aptly-named Section Header Table.
+Sections include info about the purpose of different parts of the file (e.g. `.text` for code, `.rodata` for constants)
+that is useful for linkers but not normally needed to run a program.
+
+Readelf's interpretation of my kernel's section header table:
+```
+Section Headers:
+  [Nr] Name              Type            Addr             Off     Size   ES Flg Lk Inf Al
+  [ 0]                   NULL            0000000000000000 000000  000000 00      0   0  0
+  [ 1] .interp           PROGBITS        ffffffff802002a8 0002a8  00000d 00   A  0   0  1
+...
+  [ 9] .text             PROGBITS        ffffffff8037c000 17c000  d5ea78 00  AX  0   0 4096
+...
+  [58] .SUNW_ctf         PROGBITS        0000000000000000 1abdde8 105984 00     59   0  4
+  [59] .symtab           SYMTAB          0000000000000000 17869a8 189d38 18     60 43442  8
+  [60] .strtab           STRTAB          0000000000000000 1910a49 1ad39c 00      0   0  1
+```
+
+The loader only needs to give the kernel `.strtab` (which lists the names of functions, global variables, and other "symbols") and `.symtab` (which provides the address and type of those symbols).
+The two sections are only useful with one another, so `.symtab` includes a link to its `.strtab`.
+Readelf shows this with the `Lk` field, as in the table above.
+
+Readelf's interpretation of my kernel's symtab and strtab:
+```
+Symbol table '.symtab' contains 67213 entries:
+   Num:    Value          Size Type    Bind   Vis      Ndx Name
+     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
+     1: ffffffff8037c05a     0 NOTYPE  LOCAL  DEFAULT    9 l1
+     2: ffffffff8037c080     0 NOTYPE  LOCAL  DEFAULT    9 l2
+     3: ffffffff8037c570    10 FUNC    LOCAL  DEFAULT    9 camstatusentrycomp
+     4: ffffffff81808000   112 OBJECT  LOCAL  DEFAULT   48 sysctl___kern_features_scbus
+...
+```
+
+With the "why" out of the way, the "how" is relatively simple. The loader:
+* Searches the section header table for an entry with type `SYMTAB`
+* Copies the length of the symtab section immediately after the kernel
+* Copies the symtab section immediately after its length
+* Copies the length of the linked strtab section after the symtab
+* Copies the strtab section immediately after its length
+
+This leaves the following structure immediately after the kernel (lower addresses at the bottom):
+<table style="max-width: fit-content;">
+<tr><td>strtab contents</td></tr>
+<tr><td>strtab length</td></tr>
+<tr><td>symtab contents</td></tr>
+<tr><td>symtab length</td></tr>
+</table>
+
+The loader then remembers the start and end address of this structure for later.
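+
+The steps above can be sketched in Python as follows. This is a simplified model under stated assumptions:
+lengths are written as 8-byte little-endian integers and no alignment padding is added, neither of which is
+confirmed here; `Section` and `append_symbols` are hypothetical names, and the section contents are dummy bytes.
+
+```python
+import struct
+from typing import NamedTuple
+
+SHT_SYMTAB = 2  # section type value from the ELF specification
+
+
+class Section(NamedTuple):
+    sh_type: int
+    link: int     # readelf's "Lk" column
+    data: bytes   # section contents as read from the file
+
+
+def append_symbols(staging: bytearray, sections: list[Section]) -> tuple[int, int]:
+    """Append length+symtab, then length+strtab; return (start, end) offsets."""
+    symtab = next(s for s in sections if s.sh_type == SHT_SYMTAB)
+    strtab = sections[symtab.link]  # follow the link to the string table
+    start = len(staging)
+    for section in (symtab, strtab):
+        # Assumption: 8-byte little-endian length, no extra padding
+        staging += struct.pack("<Q", len(section.data))
+        staging += section.data
+    return start, len(staging)
+
+
+# Dummy data: a null section, a 24-byte "symtab" linked to section 2,
+# and a tiny string table.
+sections = [
+    Section(0, 0, b""),
+    Section(SHT_SYMTAB, 2, b"S" * 24),
+    Section(3, 0, b"\0l1\0l2\0"),
+]
+staging = bytearray(b"KERNEL")
+start, end = append_symbols(staging, sections)
+print(start, end)
+```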
+
+TODO
+### Environment
+TODO
+### Modinfo
+TODO
+### Booting
+TODO
+
+#### Footnotes
+[^demon]: Wait, this is BSD, it's named "beastie" and I want to load it, not banish it
+[^user-visible]: Things that a knowledgeable system administrator might know about, like kernel environment, module loading, and memdisks
+[^script]: I think I did this, but it's also possible that I used the [linker script](https://cgit.freebsd.org/src/tree/sys/conf/ldscript.amd64?h=release/14.1.0#n3)
+[^x86]: I think the 2MiB alignment limitation is x86-specific because of the [horrible code](https://cgit.freebsd.org/src/tree/sys/amd64/amd64/machdep.c?h=release/14.1.0#n1273) that causes it, but I haven't actually tried any other architectures
+[^buffer]: Historically on x86 this would start at 2MiB (physical address `0x20_0000`) but this isn't possible on modern systems where part could be reserved by EFI.
+[^interp]: In fact it's so similar that users could accidentally run it as a program and get confused. To stop this, the kernel's interpreter is set to `/red/herring`.
+[^elf]: or is it "elves"?
+[^phdr]: Confusingly each entry is called a "Program Header" and is in the "Program Header Table"
+[^symtab]: Technically `.symtab` and `.strtab` could be copied as part of a `LOAD` segment and the kernel will know where to look, but I haven't seen it.