---
layout: post
title: What is nixos-rebuild anyway?
date: 2024-04-12
---
If you've used NixOS, you've almost certainly used the `nixos-rebuild` program.
With one `nixos-rebuild switch` command you can build your updated system configuration,
add it to your bootloader as the default entry, stop all old services, and start any new services.
What you may not know is that `nixos-rebuild` is a bash script and you can do everything (relatively) easily without it.
The [full source code](https://github.com/NixOS/nixpkgs/blob/c074160dcfa338f8424c440ccb0f0a5412de0dbf/pkgs/os-specific/linux/nixos-rebuild/nixos-rebuild.sh) is quite long and includes many special cases, but most of these aren't necessary if you're building manually.
Unfortunately, there's one important question we have to answer first:
# What is a NixOS?
NixOS is a very complicated way of defining options and setting them to a value.
For example, in your configuration you could set:
```nix
environment.systemPackages = [ pkgs.git ];
```
The value is matched by an "option" of the same name which describes the default value and the type [^type].
This is how NixOS describes `environment.systemPackages`:
```nix
options.environment.systemPackages = mkOption {
  type = types.listOf types.package;
  default = [];
  example = literalExpression "[ pkgs.firefox pkgs.thunderbird ]";
  description = lib.mdDoc ''
    ...
  '';
};
```
The idea that makes this useful is that you can set values based off of each other.
For example, the `htop` program sets:
```nix
with lib;
let
  cfg = config.programs.htop;
  ...
in {
  ...
  config = mkIf cfg.enable {
    environment.systemPackages = [ cfg.package ];
    environment.etc."htoprc".text = ''
      # Global htop configuration
      # To change set: programs.htop.settings.KEY = VALUE;
    '' + concatStringsSep "\n" (mapAttrsToList (key: value: "${key}=${fmt value}") cfg.settings);
  };
}
```
This means that when `programs.htop.enable` is set, `programs.htop.package` is added to `environment.systemPackages` (to add htop to your path)
and `environment.etc."htoprc".text` is set to an autogenerated configuration file (to create the configuration file in `/etc/htoprc`).
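If you want to see what an option ended up as after all the modules have had their say, you can ask nix to evaluate it. This is only a sketch: `option_value` is a made-up helper name, and it assumes the same default locations used in the rest of this post.

```shell
# Hypothetical helper: print the evaluated value of a NixOS option.
# Usage: option_value programs.htop.enable [flake-dir]
option_value() {
    local option=$1 flake=${2:-}
    if [ -n "$flake" ]; then
        # Flake configurations live under nixosConfigurations.<hostname>
        nix eval "$flake#nixosConfigurations.$(hostname).config.$option"
    else
        # Non-flake configurations are evaluated through <nixpkgs/nixos>;
        # the quotes stop the shell from reading < and > as redirections
        nix-instantiate --eval '<nixpkgs/nixos>' -A "config.$option"
    fi
}
```

For example, `option_value programs.htop.enable` should tell you whether the htop module's `mkIf` will apply at all.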
## Top-Level
So you have a big set of options with values, but that doesn't make it an operating system.
The piece that ties this all together is the "toplevel derivation",
which you can access through the `system.build.toplevel` option.
The [full definition](https://github.com/NixOS/nixpkgs/blob/c074160dcfa338f8424c440ccb0f0a5412de0dbf/nixos/modules/system/activation/top-level.nix#L48) is a bit obtuse, but in short
it's a package that links to every generated file that you need for your operating system.
For example, when you set a kernel, the toplevel derivation sees that and puts it in `kernel`,
when a module wants to put a configuration file in `/etc` it creates a file in `etc`, and when you add a package with `environment.systemPackages` it gets stuffed into `sw`. Here's what's in mine:
```
artemis@starlight ~> tree -L 1 /run/current-system/
/run/current-system/
├── activate
├── append-initrd-secrets -> /nix/store/b9179a3c206iid1z0fkr9d53kd76hm8q-append-initrd-secrets/bin/append-initrd-secrets
├── bin
├── boot.json
├── dry-activate
├── etc -> /nix/store/s7g253q8pf9lzw80cc20xfpbc2x9w6dv-etc/etc
├── extra-dependencies
├── firmware -> /nix/store/pw1mlhjxsg8b8id9g9n503h989k5gw6g-firmware/lib/firmware
├── init
├── init-interface-version
├── initrd -> /nix/store/pk1kck0lknn4ap9d46ivnl049142xkpq-initrd-linux-6.8.3/initrd
├── kernel -> /nix/store/0ir2cc8bjfp1idpqvyf9vphwxw0rj6g7-linux-6.8.3/bzImage
├── kernel-modules -> /nix/store/rvk7dwjzy2090l58a8057r311il83s1m-linux-6.8.3-modules
├── kernel-params
├── nixos-version
├── specialisation
├── sw -> /nix/store/bdcvja8kfwkrz35ilb33pn89ls9mymkx-system-path
├── system
└── systemd -> /nix/store/4npvfi1zh3igsgglxqzwg0w7m2h7sr9b-systemd-255.4
```
If you want to see yours, then go to `/run/current-system` on a NixOS machine,
which will always have the version you're currently running [^booted-system].
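Because `/run/booted-system` stays put until the next boot, comparing it with `/run/current-system` is one way to tell whether a switch brought in a kernel you aren't running yet. A minimal sketch: `kernel_changed` is a hypothetical helper name, and the two directory arguments exist only so you can point it somewhere else.

```shell
# Hypothetical helper: succeeds when the booted kernel is not the one
# in the current system, i.e. a reboot is needed to pick up the new kernel.
# Usage: kernel_changed [booted-dir] [current-dir]
kernel_changed() {
    local booted=${1:-/run/booted-system} current=${2:-/run/current-system}
    # Resolve both symlinks to their final store paths and compare them
    [ "$(readlink -f "$booted/kernel")" != "$(readlink -f "$current/kernel")" ]
}
```

You could use it as `kernel_changed && echo "reboot to use the new kernel"`.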
# An actual rebuild
With all of that out of the way, let's step through an actual `nixos-rebuild` call:
## Step 0: Build nix
The `nixos-rebuild` script tries not to make too many assumptions about the build host. It must have a nix store,
but that doesn't necessarily mean it has a new enough nix to build your configuration, or that your configuration
is defined using only settings that your build nix can understand. Therefore, it downloads a newer nix if possible,
or builds one using your configuration. I'm not _entirely_ sure what cursed setup you'd need to make this useful,
but it does happen.
## Step 1: Build your system
Now that it has a nix to use, it's time to build your toplevel derivation. If you're using [flakes](https://zero-to-nix.com/concepts/flakes) that means it runs [^gcroot]
```shell
nix build /etc/nixos#nixosConfigurations.$(hostname).config.system.build.toplevel
```
which builds the toplevel derivation based on your hostname from the flake in `/etc/nixos`. If you don't like that default you can pass `--flake /your/flake#system-name` to `nixos-rebuild` and it will build `/your/flake#nixosConfigurations.system-name.config.system.build.toplevel` instead.
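The mapping from a `--flake` argument to the attribute that gets built can be sketched in a few lines of shell. This is an illustration of the rule described above, not the code `nixos-rebuild` uses; `flake_installable` is a hypothetical name.

```shell
# Hypothetical helper: turn a `--flake` style argument into the attribute to build.
# Usage: flake_installable /your/flake#system-name
flake_installable() {
    local arg=$1 flake name
    case $arg in
        *'#'*)
            flake=${arg%%'#'*}   # everything before the first '#'
            name=${arg#*'#'}     # everything after it
            ;;
        *)
            flake=$arg
            name=$(hostname)     # no name given: fall back to the hostname
            ;;
    esac
    printf '%s#nixosConfigurations.%s.config.system.build.toplevel\n' "$flake" "$name"
}
```

The output can be handed straight to `nix build`, e.g. `nix build "$(flake_installable /your/flake#system-name)"`.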
If you're not using flakes, that means [^no-link]
```shell
nix-build '<nixpkgs/nixos>' -A system
```
which builds the toplevel derivation based on `/etc/nixos/configuration.nix` using the nixpkgs in its [channel](https://zero-to-nix.com/concepts/channels). The path is somewhat obscured here though, even in [the source](https://github.com/NixOS/nixpkgs/blob/c074160dcfa338f8424c440ccb0f0a5412de0dbf/nixos/default.nix#L1): by default the configuration file is loaded from the `nixos-config` entry of the nix search path,
which NixOS sets to `/etc/nixos/configuration.nix`.
If you want to build from some other path, you can set the `NIXOS_CONFIG` environment variable or pass `-I nixos-config=/your/path/to/whatever.nix` to `nixos-rebuild`, which will get passed through to `nix-build`.
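If you're calling `nix-build` yourself, the same override works directly. A small sketch, where `build_config` is a hypothetical helper name and `--no-out-link` is there so the store path is printed instead of leaving a `result` symlink behind:

```shell
# Hypothetical helper: build the toplevel from an arbitrary configuration file.
# Usage: build_config /your/path/to/whatever.nix
build_config() {
    # The quotes keep the shell from treating < and > as redirections
    nix-build '<nixpkgs/nixos>' -A system --no-out-link -I nixos-config="$1"
}
```

Setting `NIXOS_CONFIG=/your/path/to/whatever.nix` in the environment of a plain `nix-build '<nixpkgs/nixos>' -A system` should have the same effect.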
## Step 2: Add a profile
While your configuration lists everything that should be installed and running when it's active,
it has no way of referencing previous configurations.
NixOS handles this by creating a "profile", a fancy way of saying "create a symlink to each version".
Profiles serve a dual purpose of being a "garbage collector root" (telling nix that it shouldn't delete these paths) and creating a list of versions for you to choose from, in case you want to roll back.
The symlinks are named `/nix/var/nix/profiles/system-{n}-link` for the version history
and `/nix/var/nix/profiles/system` for the default.
Nix has an easy command to set this profile and create a new numbered version if necessary:
`nix-env -p /nix/var/nix/profiles/system --set $(readlink result)`
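To make the symlink layout concrete, here is a toy imitation of what that command does to the profile directory, using nothing but `ln`. It is a sketch for experimenting in a scratch directory: `set_profile` is a hypothetical name, and the real `nix-env` does more than this, so don't point it at `/nix/var/nix/profiles`.

```shell
# Toy imitation of `nix-env -p PROFILE --set TARGET` using plain symlinks.
# Usage: set_profile /some/scratch/dir/system /nix/store/...-nixos-system
set_profile() {
    local profile=$1 target=$2
    local dir name current base n link last=0
    dir=$(dirname "$profile")
    name=$(basename "$profile")

    # Nothing to do if the profile already points at this target
    if [ -L "$profile" ]; then
        current=$(readlink "$dir/$(readlink "$profile")")
        if [ "$current" = "$target" ]; then
            return 0
        fi
    fi

    # Find the highest existing generation number
    for link in "$dir/$name"-*-link; do
        [ -L "$link" ] || continue
        base=${link##*/}      # e.g. system-3-link
        n=${base#"$name"-}    # e.g. 3-link
        n=${n%-link}          # e.g. 3
        if [ "$n" -gt "$last" ]; then
            last=$n
        fi
    done

    # Create the next numbered link, then repoint the default at it
    n=$((last + 1))
    ln -s "$target" "$dir/$name-$n-link"
    ln -sfn "$name-$n-link" "$profile"
}
```

Calling it twice with different targets leaves `system-1-link` and `system-2-link` pointing at the two targets and `system` pointing at `system-2-link`. Calling it again with the same target changes nothing.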
## Step 3: Activate
The final step, activation, sets a lot of things in motion.
It uses two important scripts inside the toplevel derivation:
`activate` sets up the most important configuration files that you need both during boot and while switching. Its responsibilities include:
- Linking static configuration files from the toplevel derivation to `/etc`
- Creating users and groups
- Creating the impure `/bin/sh` and `/usr/bin/env` programs
- Linking the toplevel to `/run/current-system`
`bin/switch-to-configuration` sets up files that only make sense when you've just made a new toplevel. It does things like:
- Install the bootloader
- Create bootloader entries for each of the system versions in the profile
- Run `activate`
- Figure out which services need to be restarted and restart them
- Tell systemd to restart itself if needed
If you want more detail, the [NixOS Manual](https://nixos.org/manual/nixos/stable/#sec-switching-systems) has a
reasonable description of how `bin/switch-to-configuration` works.
`nixos-rebuild` doesn't have to deal with any of that though.
It just has to run [^systemd-run]
```shell
env -i LOCALE_ARCHIVE=$LOCALE_ARCHIVE NIXOS_INSTALL_BOOTLOADER= \
$(readlink result)/bin/switch-to-configuration switch
```
Clearing the environment with `env -i` helps prevent weird impurities due to e.g. a strange PATH setting,
though it's not technically necessary.
The `LOCALE_ARCHIVE` variable is to fix programs complaining if they can't find internationalization metadata.
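If you want to convince yourself of what `env -i` lets through, you can wrap the same invocation in a function and run `env` under it. A small sketch, with `run_clean` as a hypothetical helper name:

```shell
# Hypothetical helper: run a command with an otherwise empty environment,
# keeping only the two variables from the command above.
# Usage: run_clean some-command [args...]
run_clean() {
    env -i LOCALE_ARCHIVE="${LOCALE_ARCHIVE:-}" NIXOS_INSTALL_BOOTLOADER= "$@"
}
```

Running `run_clean /usr/bin/env` should print only `LOCALE_ARCHIVE` and `NIXOS_INSTALL_BOOTLOADER`, no matter what else is exported in your shell.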
# Doing it yourself
You can put all this together to switch yourself without `nixos-rebuild`:
```shell
# (flakes)
nix build /etc/nixos#nixosConfigurations.$(hostname).config.system.build.toplevel
# (not flakes)
nix-build '<nixpkgs/nixos>' -A system
sudo nix-env -p /nix/var/nix/profiles/system --set $(readlink result)
sudo result/bin/switch-to-configuration switch
```
A lot less code than the 800 lines `nixos-rebuild` needs.
#### Footnotes
[^booted-system]: ... for a certain definition of "running". Software is loaded from here but the kernel and modules will be in the version in `/run/booted-system` because Linux can't load modules from other kernel versions. This is only set up at boot and won't be changed by a `nixos-rebuild switch`.
[^type]: In NixOS `type` defines not just "can I set this to a string or only a list" but also what happens when the same option is set in multiple places. If you set `environment.systemPackages = [ pkgs.git ];` in one file and `environment.systemPackages = [ pkgs.mercurial ];` in another, then the result will be `[ pkgs.git pkgs.mercurial ]` because `listOf` says to merge them.
[^gcroot]: `nixos-rebuild` also passes `--out-link ${tmpDir}/result` for the flake builder, which creates a [garbage collector](https://nixos.org/manual/nix/stable/package-management/garbage-collection) root in a temporary directory so nix won't delete the toplevel derivation between building it and the next step. I'm not sure why this only happens for flakes.
[^no-link]: The `nix-build` command for non-flakes is also run with `--no-out-link` so it won't create a result symlink cluttering your current directory. `nixos-rebuild` reads the path of toplevel from `nix-build`'s standard out.
[^systemd-run]: Although it might run this command sometimes, `nixos-rebuild` prefers a much longer `systemd-run` command. This runs the switch in the background, so it won't get killed if you lose your session because of networking issues.