From b17cda0956abb46d2aba9869ac672f3b789daff0 Mon Sep 17 00:00:00 2001
From: Artemis Tosini
Date: Thu, 3 Jun 2021 23:14:10 +0000
Subject: [PATCH] Typo fixes and add all namespace descriptions

---
 _drafts/journald-1.md | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/_drafts/journald-1.md b/_drafts/journald-1.md
index 869374f..fbc4a94 100644
--- a/_drafts/journald-1.md
+++ b/_drafts/journald-1.md
@@ -56,14 +56,14 @@ While working on my prototype for a system-journald replacement, [rjournald](htt
 These fields help you figure out if the error is coming from this computer, OS install, or boot.
 - **_BOOT_ID** is a unique ID (UUID in this case) generated at every startup. The kernel creates it and you can access the current one at `/proc/sys/kernel/random/boot_id`. I've found it useful to help figure out if I've rebooted my system since an error occurred.
 - **_MACHINE_ID** is a unique ID to the system which you can find in `/etc/machine-id`. This should be set on the first boot of your system by systemd and helps you figure out if the logs could be from before a system was reinstalled.
-- **_HOSTNAME** is the name of the system. You probably set this in install. There's a few places to get this but journald uses `/proc/sys/kernel/hostname` (what the kernel thinks your hostname is). You can also get the hostname from systemd-hostnamed (which lets you set non-ascii hostnames for some programs) or `/etc/hostname` (which is where systemd will read your hostname and tell it to the kernel), but these might be different.
+- **_HOSTNAME** is the name of the system. You probably set this in install. There's a few places to get this but journald uses `/proc/sys/kernel/hostname` (what the kernel thinks your hostname is). You can also get the hostname from systemd-hostnamed (which lets you set non-ASCII hostnames for some programs) or `/etc/hostname` (which is where systemd will read your hostname and tell it to the kernel), but these might be different.
 ### Process permissions
 These help you understand what kind of access a process has. You might get errors if a process has insufficient permission or runs as the wrong user
 - **_UID** tells you what user executed the process, as seen by journald. You can get this for your user with the command `id`.
 - **_GID** tells you which group the process was using, as seen by journald. While a user can have several groups, a process executes under one primary group ID
 - **_CAP_EFFECTIVE** provides what [capabilities](https://linux.die.net/man/7/capabilities) a process can use. Capabilities give fine-grained privileged access to processes without requiring them to be the root user. For example, binding to port 80 or 443 requires the CAP_NET_BIND_SERVICE capability. If _CAP_EFFECTIVE=0 then you know you've missed that capability.
-- **_SELINUX_CONTEXT** is an additional set of permissions when using the [SELinux LSM](https://www.redhat.com/en/topics/linux/what-is-selinux) (Linux Security Module). I don't use SELinux on this sytem so it just shows up as "kernel", meaning SELinux will not limit permissions. Fedora, CentOS, and RHEL use SELinux by default.
+- **_SELINUX_CONTEXT** is an additional set of permissions when using the [SELinux LSM](https://www.redhat.com/en/topics/linux/what-is-selinux) (Linux Security Module). I don't use SELinux on this system so it just shows up as "kernel", meaning SELinux will not limit permissions. Fedora, CentOS, and RHEL use SELinux by default.
 ### systemd context
 If you're using journald you're almost certainly using systemd to start all your processes. Systemd organizes processes into "units", such as OS services and user sessions, and "slices", a set of similar units.
@@ -75,13 +75,15 @@ These are represented to the rest of the OS as a hierarchical set of "cgroups" w
 - **_SYSTEMD_USER_SLICE** and **_SYSTEMD_USER_UNIT** are similar to the same fields without USER, but assigned by the user systemd
 ### Process context
+These fields give you some information about how the process was executed. The program will pass its PID to journald in a transport-specific way, then journald will retrieve the other values from `/proc/[pid]/[field name]`.
+
 - **_PID** is the process ID (a numeric ID from 1 to 4194304) as seen by journald. PIDs are not unique over a system boot but should not be reused at the same time.
 - **_EXE** is the location of the executable. This is the result of canonicalizing the symlinks at executable start (i.e. if originally a → b → c, you execute a, then a → b → d, then `_EXE` will still contain a). On your system this will probably end up being something in `/usr/bin` but I use [NixOS](https://nixos.org/) which uses extremely long executable path names.
 - **_CMDLINE** is the full command with arguments as you might see in `char **argv`. Note that a program can change this. The most high-profile example I've seen of this is nginx, where you will see logs from `nginx: worker process`.
 - **_COMM** is the command name. This will normally be the final part of the path in `_EXE` but can be different, especially when running programs like [busybox](https://www.busybox.net/) where multiple programs are in file. This is also the pthread name of the sending thread. For example when logging from Firefox this might be `Web Content`.
 ### Time
-Time, as it turns out, is extremely complicated. You'll get 3 separate time fields. Two of them are the "wall clock" time in unix time (nominally microseconds since midnight at the beginning of 1 January 1970 UTC, though leap seconds make this a bit more complicated). Unfortunately, wall clock time can jump forwards or backwards if your computer's clock is too slow or fast, respectively. Therefore, systemd also includes the "monotonic time", a number of seconds since some point in the past. This is guaranteed to always move forward so this is what you'll want to discern ordering.
+Time, as it turns out, is extremely complicated. You'll get 3 separate time fields. Two of them are the "wall clock" time in Unix time (nominally microseconds since midnight at the beginning of 1 January 1970 UTC, though leap seconds make this a bit more complicated). Unfortunately, wall clock time can jump forwards or backwards if your computer's clock is too slow or fast, respectively. Therefore, systemd also includes the "monotonic time", a number of seconds since some point in the past. This is guaranteed to always move forward so this is what you'll want to discern ordering.
 Unfortunately wall clock time is also more complicated than you might expect. Linux has 4 separate monotonic timers:
 - **CLOCK_MONOTONIC_RAW** counts the amount of time that Linux has spent not asleep since last boot.
@@ -105,17 +107,18 @@ Finally there's the untrusted message sent by the process.
 - **SYSLOG_IDENTIFIER** is the program identifier and is what you would get as the program source if you were using syslogd (as you would before systemd)
 ## A Note on Namespaces
-Linux has the concept of a "namespace" mostly seen with containers which allows different processes to see the system differently. A few types make things interesting when logging to a journald outside the namespace (e.g. if you pass through the host journald socket to a container)
+Linux has the concept of a "namespace" mostly seen with containers which allows different processes to see the system differently. A few types make things interesting when logging to a journald outside the namespace (e.g. if you pass through the host journald socket to a container). Running journald in a container then logging from outside is possible but that will cause some problems when journald doesn't have access to process data and I won't talk about it.
+
 - **PID** namespaces allow different processes to see a different list of processes. For example, I'm currently running a container to run games on my laptop. If I list processes in the container I only see 34 while there are 472 running on my system overall. Additionally, within a PID namespace PIDs will be remapped. My games container is running systemd at PID 1 but that same process appears to the rest of my system as PID 2454372. The Linux kernel remaps PIDs to make sense in the receiver's PID namespace when sending the sender's credentials, so journald will record the PID as seen from the host if you pass it through to a container.
-- **User** namespaces remap user and group IDs. This can be useful so that Linux doesn't assume everything running as root in your container has full root permissions on the the host with respect to e.g. loading kernel modules. When you make a user namespace you specify a UID map for reading and setting UIDs. For example, I use UID 1000000 instead of UID 0 in my games container. Journald will see the PID and GID as seen from the host namespace
-- **Mount**
+- **User** namespaces remap user and group IDs. This can be useful so that Linux doesn't assume everything running as root in your container has full root permissions on the host with respect to e.g. loading kernel modules. When you make a user namespace you specify a UID map for reading and setting UIDs. For example, I use UID 1000000 instead of UID 0 in my games container. Journald will see the PID and GID as seen from the host namespace
+- **Mount** namespaces can cause some problems when you're looking through the log. If the executable was a symlink then it is canonicalized in the mount namespace before it becomes `_EXE`. This means you can end up with an executable path that doesn't exist in your current namespace. If a program changes namespaces, like `unshare` will, then `_EXE` will be from the namespace where it starts.
 - **UTS** namespaces allow programs to see a different hostname. Journald deals with these by not caring. Journald sets the `_HOSTNAME` field by asking for the hostname from the OS once then caching it. Messages from containers will show up in the log as using the hostname where journald receives it.
-- **Time**
+- **Time** namespaces allow programs to have a different time offset for both realtime and monotonic timers. Journald ignores these and just reads the time from its own namespace, which will probably be the main system time.
 ## Transports
 There's still one field I haven't described: **_TRANSPORT**. This requires a little more context.
-Journald can get messages from one of 6 separate sources: **journal** (using the native journald protocol), **stdout** (a process's standard output or error redirected to systemd), **syslog** (the legacy linux logging system), **kernel** (kernel messages you can get through the `dmesg` command), **audit** (logs the kernel generates about programs' activities), and **driver** (error messages from within journald). Each has their own peculiarities from both the journald side and the client side but I'll mostly be talking about journald, stdout, and syslog.
+Journald can get messages from one of 6 separate sources: **journal** (using the native journald protocol), **stdout** (a process's standard output or error redirected to systemd), **syslog** (the legacy Linux logging system), **kernel** (kernel messages you can get through the `dmesg` command), **audit** (logs the kernel generates about programs' activities), and **driver** (error messages from within journald). Each has their own peculiarities from both the journald side and the client side but I'll mostly be talking about journald, stdout, and syslog.
 ### Native (journal)
@@ -125,4 +128,4 @@ Journald can get messages from one of 6 separate sources: **journal** (using the
 ## Overview
 Anything related to Linux quickly turns into a huge rabbit hole. I could certainly write articles on many of the
-This is mostly from my own experimentation. If you have more inforomation and noticed an error, please contact me. I'd be happy to correct anything.
+This is mostly from my own experimentation. If you have more information and noticed an error, please contact me. I'd be happy to correct anything.