layout: post
title: Booting the Bootloader
date: 2024-08-11

a.k.a. "How I wrote a FreeBSD Bootloader"

A few days ago I wrote a FreeBSD bootloader. There are a few reasons why I did so (I want a better bootloader for NixBSD and I want something like lanzaboote for FreeBSD), but mostly I wrote it because I could.

It runs as a UEFI application, reads a kernel from the filesystem, loads it into memory, sets arguments, and executes it, all without using any of the upstream FreeBSD "stand" code. The code sucks, changing any settings requires recompiling, it's missing features, and it certainly won't be portable. But it works. As far as I can tell, the only other project that has done that is grub.

After I posted about it on fedi, someone asked if I was going to write about my project. That seemed like a fun idea, but I'm not sure how useful a rant about weird FreeBSD design decisions would be to anybody, so instead I'll talk more about my thought process for reimplementing the loader.

1. Setting the Scope

When I program I try to take everything into account. I'll constantly be trying to answer questions like:

  • "What happens if the firmware is buggy?"
  • "What if I want to port this to ARM later?"
  • "What if meow?"

This can be useful when I'm trying to write secure fault-tolerant production code, but it's mostly a hindrance when I'm trying to just get something to work.

Therefore, the first step is setting the smallest scope where I've still accomplished something. This can be a bit flexible, but for this project I wanted: "load a FreeBSD kernel with serial or graphical output from a fixed path in an x86_64 VM". I didn't even put "boot from a root filesystem" in scope, but it turned out that was trivial.

This gives me a sense of accomplishment early in the process and helps banish the "what if" demons[1] in my head.

2. Understanding the Problem

Before starting any programming I like to have a good idea of what I'm interfacing with. This tends to mean first learning the more general "How do I use $thing" information, then moving on to "How does $thing work". It's no use knowing how to encode the kernel environment if you have no idea what the kernel environment is.

While reading documentation sometimes gives me a starting point, it's rarely enough, so I quickly end up experimenting, trying debug features, tracing, and reading the code.

A lot of these suggestions apply whether or not you have source code. You can try a bunch of inputs, strace, dump memory, find important functions, and sometimes enable debug logging either way. Code is just easier to search than binaries.

In this case I already had a good idea of the user-visible parts of the boot process[2] from working on the NixBSD bootloader, so it was immediately time to figure out how the process works.

I spent around 2 days on this project just reading code and writing notes. My notes skip general concepts I already know and just include reminders and lists of information I might forget. They're probably not useful to anyone but me, but could be useful in the future if I want to write documentation.

It would probably behoove me to add important code references to my notes, but I mostly end up looking through my search history trying to find what I was looking at. Please don't do this.

(have a sample of my notes to get an idea of what they include)

## Modinfo
Loader must provide modinfo to kernel, a TLV structure

* Dump from normal FreeBSD with `sysctl debug.dump_modinfo`
* Tag is `MODINFO_*` or `MODINFO_METADATA | MODINFOMD_*`
* Tag and length are 4 bytes native endian
* Value is padded to align to `sizeof(size_t)`
* Strings are null-terminated
* Encodes multiple modules in sequence, separated by `MODINFO_NAME` string

### Fields
* `MODINFO_NAME`: string with path to file if available
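
To make those notes a bit more concrete, here is a minimal Rust sketch of encoding one string record the way they describe it. The function names are made up for illustration, the tag value is what I believe `sys/linker.h` defines (treat it as an assumption), and counting the trailing NUL in the length field is also an assumption.

```rust
use std::mem::size_of;

// Assumed value of the name tag; check sys/linker.h before relying on it.
const MODINFO_NAME: u32 = 0x0001;

/// Append one tag-length-value record, then pad with zeroes so the next
/// record starts on a size_t boundary (relative to the start of `buf`).
fn push_record(buf: &mut Vec<u8>, tag: u32, value: &[u8]) {
    buf.extend_from_slice(&tag.to_ne_bytes());
    buf.extend_from_slice(&(value.len() as u32).to_ne_bytes());
    buf.extend_from_slice(value);
    while buf.len() % size_of::<usize>() != 0 {
        buf.push(0);
    }
}

/// Append a NUL-terminated string record. The terminator is counted in
/// the length field here, which is an assumption of this sketch.
fn push_str(buf: &mut Vec<u8>, tag: u32, s: &str) {
    let mut bytes = s.as_bytes().to_vec();
    bytes.push(0);
    push_record(buf, tag, &bytes);
}
```

Calling `push_str(&mut buf, MODINFO_NAME, "/boot/kernel/kernel")` would start a new module entry; further records for the same module would follow it in the same buffer.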

FreeBSD keeps the loader ("stand") and kernel ("sys") code mostly separate, so I simultaneously reverse engineered the loader serialization and kernel deserialization code.

Before I could do much of anything though, I needed to know where to look. The easiest starting points are often the beginning or end of a program, in this case the kernel's entry point and the part of the loader that jumps to it.

The kernel's entry point (btext) was relatively easy to find with readelf -Wa kernel. The readelf command gave me the address of the entry point. Since I was using a kernel with debug symbols, that address shows up next to its function name later in the readelf output, so a quick search gave me the name, and from there the function.[3]

The loader's exit point was also easy to find. In the standard ELF header the entry point field is called e_entry, so I ran ripgrep with rg e_entry and immediately found the function.
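
For reference, e_entry lives at byte offset 24 of an ELF64 header, so pulling it out by hand is short. This is only a sketch: the function name is hypothetical, and it only handles 64-bit little-endian files such as an x86_64 kernel.

```rust
/// Read e_entry from the start of an ELF64 little-endian file.
/// Returns None if the buffer is too short or is not that kind of ELF.
fn elf64_entry(header: &[u8]) -> Option<u64> {
    // Need the magic, EI_CLASS == 2 (64-bit), EI_DATA == 1 (little-endian),
    // and enough bytes to reach the end of e_entry.
    if header.len() < 32
        || !header.starts_with(b"\x7fELF")
        || header[4] != 2
        || header[5] != 1
    {
        return None;
    }
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&header[24..32]);
    Some(u64::from_le_bytes(raw))
}
```

The value this returns is the same address readelf prints as the entry point.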

From there I traced where important variables are changed, which quickly led me to the

TODO

3. Writing the code

My goal when writing the code is to get something that superficially produces the right output that I can fix later.

TODO

4. Debugging

TODO

5. Cleaning up the code

At this point, I generally have code that works, but is terrible. It might use hardcoded constants, have tons of unnecessary debug statements, have no configuration, or just barely work.

From here I have 3 options:

  • Don't clean up the code, because I have no plans to use it anymore
  • Iteratively clean up the code
  • Rewrite the code from scratch with more foresight, maybe copying some parts over

A lot of my projects end up in the first category because they were just experiments to see if I could.

However, if I have any plans to use it in the future, the best option is normally to take a break. A few days or weeks of thinking and talking it over normally helps me figure out how to rewrite or improve the code.

This is not always advice that I follow myself. The day after I got freeloader working, I tried to refactor the Serialize trait, but ended up spending hours just making the code worse and threw my work away.

A few days later I realized there was a much better way and could have avoided all that trouble.

The Boot Process

With all that out of the way, here's what I discovered about the boot process:

The loader stuffs the kernel and all its dependencies into a contiguous block of physical memory, which it calls several things including modulep or just addr. I call it the "staging buffer" since it's as good a name as any. On x86[4] it must be aligned on a 2MiB boundary.[5]
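
One way to satisfy that alignment is to allocate a bit more memory than needed and round the start address up. A small sketch of the rounding, with hypothetical names:

```rust
const STAGING_ALIGN: u64 = 2 * 1024 * 1024; // 2MiB

/// Round `addr` up to the next multiple of `align`, which must be a power
/// of two. Does not guard against overflow near u64::MAX.
fn align_up(addr: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (addr + align - 1) & !(align - 1)
}
```

An address that is already aligned comes back unchanged, so the worst case wastes just under 2MiB.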

The kernel

The first thing the loader puts in the staging buffer is the kernel. Conveniently, the kernel is an ELF file, the same format used for programs on Linux and FreeBSD, so there's plenty of existing code for parsing it.[6]

Like other ELF programs, the kernel specifies the location of metadata and code in its Program Headers.[7] Although there are a few types here, the loader only cares about LOAD headers, each representing a segment of memory to copy.

Readelf's interpretation of my kernel's program header table:

Program Headers:
  Type           Offset    VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040  0xffffffff80200040 0x0000000000200040 0x000268 0x000268 R   0x8
  INTERP         0x0002a8  0xffffffff802002a8 0x00000000002002a8 0x00000d 0x00000d R   0x1
      [Requesting program interpreter: /red/herring]
  LOAD           0x000000  0xffffffff80200000 0x0000000000200000 0x17baa0 0x17baa0 R   0x200000
  LOAD           0x17baa0  0xffffffff8037baa0 0x000000000037baa0 0xd5efd8 0xd5efd8 R E 0x200000
  LOAD           0xedaa80  0xffffffff810daa80 0x00000000010daa80 0x425e1c 0x425e1c R   0x200000
  LOAD           0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x001000 RW  0x200000
  LOAD           0x1600000 0xffffffff81800000 0x0000000001800000 0x1868b0 0x600000 RW  0x200000
  DYNAMIC        0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x000180 RW  0x8
  GNU_RELRO      0x1400000 0xffffffff81600000 0x0000000001600000 0x000180 0x001000 R   0x1
  GNU_STACK      0x000000  0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
  NOTE           0x1300648 0xffffffff81500648 0x0000000001500648 0x0001c0 0x0001c0 R   0x4
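
In ELF64 each row of that table is a fixed 56-byte record, which matches the PHDR row above: 0x268 bytes / 56 = 11 entries. A rough sketch of pulling out just the LOAD rows, assuming a little-endian file and with struct and function names invented for the example:

```rust
const PT_LOAD: u32 = 1;
const PHDR_SIZE: usize = 56; // size of one ELF64 program header

#[derive(Debug, PartialEq)]
struct LoadSegment {
    offset: u64, // readelf's Offset
    vaddr: u64,  // readelf's VirtAddr
    filesz: u64, // readelf's FileSiz
    memsz: u64,  // readelf's MemSiz
}

/// Read a little-endian u64 at byte offset `at`.
fn le_u64(raw: &[u8], at: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&raw[at..at + 8]);
    u64::from_le_bytes(b)
}

/// Walk a raw ELF64 program header table and keep only the LOAD entries.
fn load_segments(table: &[u8]) -> Vec<LoadSegment> {
    table
        .chunks_exact(PHDR_SIZE)
        .filter(|e| u32::from_le_bytes([e[0], e[1], e[2], e[3]]) == PT_LOAD)
        .map(|e| LoadSegment {
            offset: le_u64(e, 8),
            vaddr: le_u64(e, 16),
            filesz: le_u64(e, 32),
            memsz: le_u64(e, 40),
        })
        .collect()
}
```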

Before it can copy though, the loader takes the VirtAddr of the first LOAD segment and keeps it as an offset. That offset lets the loader place the first segment at the beginning of the staging buffer but keep the other segments at the correct relative positions. For example, if the staging buffer was at 0xacab_0000_0000, then the loader would put the first segment of my kernel at 0xacab_0000_0000 and the second at 0xacab_0017_baa0.

With that offset, the loader looks at each LOAD segment and copies FileSiz bytes from the kernel file, starting at Offset, into the staging buffer, starting VirtAddr - <load offset> bytes in.

Note that in some cases MemSiz > FileSiz. The loader zeroes the excess in the staging buffer, and the kernel uses it for uninitialized global variables (placed in the .bss section).
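
The copy step is small enough to sketch in full. This is an illustration, not the real loader: a Vec stands in for the physical staging buffer, the names are hypothetical, and it assumes the LOAD segments are sorted by VirtAddr as they are in the table above.

```rust
/// The four program header fields the copy step needs.
struct Load {
    offset: usize, // where the segment starts in the file
    vaddr: u64,    // where the kernel expects it in virtual memory
    filesz: usize, // bytes to copy from the file
    memsz: usize,  // bytes the segment occupies in memory
}

/// Build a staging buffer from the LOAD segments of `file`.
/// Assumes `loads` is non-empty and sorted by vaddr.
fn stage_kernel(file: &[u8], loads: &[Load]) -> Vec<u8> {
    let base = loads[0].vaddr; // the "load offset"
    let last = &loads[loads.len() - 1];
    let total = (last.vaddr - base) as usize + last.memsz;
    // Starting from zeroes covers both the MemSiz > FileSiz tail and any
    // gaps between segments.
    let mut staging = vec![0u8; total];
    for seg in loads {
        let dst = (seg.vaddr - base) as usize;
        staging[dst..dst + seg.filesz]
            .copy_from_slice(&file[seg.offset..seg.offset + seg.filesz]);
    }
    staging
}
```

Because every destination is computed relative to the first segment's VirtAddr, the result can be placed at any suitably aligned physical address without changing the relative layout.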

The Kernel 2: Electric Boogaloo

At this point all the kernel's code is in RAM, but it's missing the .symtab and .strtab sections[8] that the kernel will need later to load modules.

The loader finds these sections by looking at the aptly-named Section Header Table. Sections include info about the purpose of different parts of the file (e.g. .text for code, .rodata for constants) that are useful for linkers but not normally needed to run a program.

Readelf's interpretation of my kernel's section header table:

Section Headers:
  [Nr] Name              Type            Addr             Off     Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000  000000 00      0   0  0
  [ 1] .interp           PROGBITS        ffffffff802002a8 0002a8  00000d 00   A  0   0  1
...
  [ 9] .text             PROGBITS        ffffffff8037c000 17c000  d5ea78 00  AX  0   0 4096
...
  [58] .SUNW_ctf         PROGBITS        0000000000000000 1abdde8 105984 00     59   0  4
  [59] .symtab           SYMTAB          0000000000000000 17869a8 189d38 18     60 43442  8
  [60] .strtab           STRTAB          0000000000000000 1910a49 1ad39c 00      0   0  1

The loader only needs to give the kernel two of these: .strtab, which lists the names of functions, global variables, and other "symbols", and .symtab, which provides the address and type of those symbols. The two sections are only useful with one another, so .symtab includes a link to its .strtab. Readelf shows this in the Lk field: in the table above, .symtab's Lk is 60, the index of .strtab.
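
Following that link is a lookup by index. A minimal sketch, using a cut-down section header with only the two fields involved (the struct and function names are made up; 2 is the standard ELF value for the SYMTAB type):

```rust
const SHT_SYMTAB: u32 = 2;

/// The two section header fields this step looks at.
struct Section {
    sh_type: u32,
    sh_link: u32, // readelf's Lk column
}

/// Return (symtab index, strtab index) for the first SYMTAB section.
fn find_symtab(sections: &[Section]) -> Option<(usize, usize)> {
    let symtab = sections.iter().position(|s| s.sh_type == SHT_SYMTAB)?;
    let strtab = sections[symtab].sh_link as usize;
    // A link pointing outside the table means a malformed file.
    if strtab < sections.len() {
        Some((symtab, strtab))
    } else {
        None
    }
}
```

Run against my kernel's full section header table, this would be expected to return (59, 60).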

Readelf's interpretation of my kernel's symtab and strtab:

Symbol table '.symtab' contains 67213 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: ffffffff8037c05a     0 NOTYPE  LOCAL  DEFAULT    9 l1
     2: ffffffff8037c080     0 NOTYPE  LOCAL  DEFAULT    9 l2
     3: ffffffff8037c570    10 FUNC    LOCAL  DEFAULT    9 camstatusentrycomp
     4: ffffffff81808000   112 OBJECT  LOCAL  DEFAULT   48 sysctl___kern_features_scbus
...
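
Each symtab entry is a fixed 24-byte record in ELF64 (the ES column of 18 hex in the section table), and 0x189d38 / 0x18 = 67213, which matches the entry count readelf reports. Here is a sketch of decoding one entry and looking up its name, assuming a little-endian file; the struct and function names are hypothetical.

```rust
const SYM_SIZE: usize = 24; // size of one ELF64 symbol table entry

#[derive(Debug, PartialEq)]
struct Symbol {
    name_offset: u32, // byte offset of the name within the linked strtab
    kind: u8,         // low 4 bits of st_info: 1 = OBJECT, 2 = FUNC
    section: u16,     // readelf's Ndx column
    value: u64,
    size: u64,
}

/// Decode one little-endian ELF64 symbol entry from at least 24 bytes.
fn parse_symbol(raw: &[u8]) -> Symbol {
    let mut value = [0u8; 8];
    let mut size = [0u8; 8];
    value.copy_from_slice(&raw[8..16]);
    size.copy_from_slice(&raw[16..24]);
    Symbol {
        name_offset: u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]),
        kind: raw[4] & 0xf,
        section: u16::from_le_bytes([raw[6], raw[7]]),
        value: u64::from_le_bytes(value),
        size: u64::from_le_bytes(size),
    }
}

/// Return the NUL-terminated name starting at `offset` in the strtab.
fn symbol_name(strtab: &[u8], offset: u32) -> &[u8] {
    let rest = &strtab[offset as usize..];
    let end = rest.iter().position(|&b| b == 0).unwrap_or(rest.len());
    &rest[..end]
}
```

This is why the two sections only make sense together: the symbol entry holds nothing but an offset, and the name itself lives in the strtab.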

With the "why" out of the way, the "how" is relatively simple. The loader:

  • Searches the section header table for an entry with type SYMTAB
  • Copies the length of the symtab section immediately after the kernel
  • Copies the symtab section immediately after its length
  • Copies the length of the linked strtab section after the symtab
  • Copies the strtab section immediately after its length

This leaves the following structure immediately after the kernel (lower addresses on the bottom):

strtab contents
strtab length
symtab contents
symtab length

The loader then remembers the start and end address of this structure for later.
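
The steps above can be sketched as a single append. Two things here are assumptions rather than something stated above: that each length is stored as a native-endian 8-byte integer, and that no padding is inserted between the pieces. A Vec again stands in for the staging buffer, and the function name is hypothetical.

```rust
/// Append symtab and strtab, each preceded by its length, after the
/// kernel image already in `staging`. Returns the (start, end) offsets of
/// the appended region.
/// Assumes 8-byte native-endian lengths and no padding.
fn append_symbols(staging: &mut Vec<u8>, symtab: &[u8], strtab: &[u8]) -> (usize, usize) {
    let start = staging.len();
    for section in [symtab, strtab].iter() {
        staging.extend_from_slice(&(section.len() as u64).to_ne_bytes());
        staging.extend_from_slice(section);
    }
    (start, staging.len())
}
```

The returned pair corresponds to the start and end addresses the loader keeps around for later.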

TODO

Environment

TODO

Modinfo

TODO

Booting

TODO

Footnotes


  1. Wait, this is BSD, it's named "beastie" and I want to load it, not banish it

  2. Things that a knowledgeable system administrator might know about, like kernel environment, module loading, and memdisks

  3. I think I did this, but it's also possible that I used the linker script

  4. I think the 2MiB alignment limitation is x86-specific because of the horrible code that causes it, but I haven't actually tried any other architectures

  5. Historically on x86 this would start at 2MiB (physical address 0x20_0000), but this isn't possible on modern systems where part of that range could be reserved by EFI.

  6. In fact it's so similar that users could accidentally run it as a program and get confused. To stop this, the kernel's interpreter is set to /red/herring.

  7. Confusingly each entry is called a "Program Header" and is in the "Program Header Table"

  8. Technically .symtab and .strtab could be copied as part of a LOAD segment and the kernel would know where to look, but I haven't seen it.