NixOS is a good server OS, except when it isn’t
Ever since I built my first NixOS system (building a custom image to upload on DigitalOcean), I’ve been bothered by one thing: the default installation size is too large. To give you an idea, this simple system (using flakes):
nixpkgs.lib.nixosSystem {
  system = "x86_64-linux";
  modules = [
    (nixpkgs.outPath + "/nixos/modules/profiles/minimal.nix")
    (nixpkgs.outPath + "/nixos/modules/profiles/headless.nix")
    {
      fileSystems."/".device = "/dev/sda1";
      boot.loader.systemd-boot.enable = true;
    }
  ];
}
ends up taking ~900MB of disk space on my system [1]. And it’s using the minimal and headless profiles!
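To make the measurement repeatable, the configuration can be wrapped in a flake that exposes the top-level derivation as a package. This is a minimal sketch rather than the flake from the post’s repository: the nixos-unstable input, the configuration name minimal, and the default package output are all assumptions made for illustration.

```nix
{
  # Assumption: the branch is illustrative; pin whichever Nixpkgs revision you use.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }: {
    # "minimal" is a hypothetical configuration name.
    nixosConfigurations.minimal = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        (nixpkgs.outPath + "/nixos/modules/profiles/minimal.nix")
        (nixpkgs.outPath + "/nixos/modules/profiles/headless.nix")
        {
          fileSystems."/".device = "/dev/sda1";
          boot.loader.systemd-boot.enable = true;
        }
      ];
    };

    # Expose the top-level system derivation so that `nix build` links it to ./result.
    packages.x86_64-linux.default =
      self.nixosConfigurations.minimal.config.system.build.toplevel;
  };
}
```

With this in place, running nix build and then nix path-info -Sh ./result should report the closure size of the whole system, which is the kind of figure quoted above.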
When I started working on improving this, I expected the eventual blog post to be very different than what it became, but you can’t win everything in life. There’s a bit of pain ahead.
Some context
I really like Nix and NixOS (I wouldn’t be spending time helping their documentation otherwise). After getting some experience managing NixOS servers, I really can’t see myself going back to other systems unless required by some external factor.
I’m also working on a system that has worker machines which will spin up a bunch of microVMs. Naturally, I want to use NixOS both for the worker machines and the microVMs themselves. The system on the microVMs is currently taking ~210MB (including kernel) of disk space, but it’s based on Alpine. The worker machines are already using NixOS, but I’d also like them to be as lean as possible.
NixOS makes it very simple to manage a server from the outside. You can push an entirely new system configuration without the server changing its behaviour, and then almost atomically switch the server to the new configuration. You can easily configure the whole thing deterministically, deploy the same configuration to multiple servers, and even deploy the same configuration under a VM too so you can locally test things if you wish to.
I envisioned a world where all my worker machines ran the bare minimum software required for things to work, which would help to lock the system down and prevent any escalations in case some piece of software was broken into, and would also make deployments and tests faster.
And if I could achieve something like that on those machines, why not extend this to the OS running on the microVMs too? This would help cut boot times as much as possible, short of using a unikernel.
I knew from my previous experience with NixOS that it didn’t generate lean images by default, so a couple days ago I started looking into this to see if I could fix things, or at least significantly improve them.
Figuring out package dependencies and their sizes
A curious thing about the Nix ecosystem is that it has some powerful tools, but they’re severely underdocumented, their functionality is sometimes hidden behind obscure names, and some of them make very specific assumptions that make them harder to use more generally.
One such tool is nix-store --query, which is one of the more well-known tools in this category.
nix-store --query --tree will give you a tree of packages [2], starting from a package you specify, and show you the dependencies of that package, and their dependencies, and so on…
Running it will give you some output like this:
/nix/store/g4ppw7x76dyykj33x99xzf30zq5ym29z-nixos-system-nixos-24.05.20240323.44d0940
├───/nix/store/09fpwkb108ckhljahy7p84if7m8qh1wh-firmware
├───/nix/store/0v0wrr6ngh9d487lhwicwr5z61kz40zw-kmod-31
│ ├───/nix/store/1rm6sr6ixxzipv5358x0cmaw8rs84g2j-glibc-2.38-44
│ │ ├───/nix/store/3sxwxqzkkrgpgaibkm27ggb9kjbzdy31-xgcc-13.2.0-libgcc
│ │ ├───/nix/store/n9sq1bvghs9z0qg6cmwg27y4jmszwgqi-libidn2-2.3.7
│ │ │ ├───/nix/store/77yhmwrwism02371kzyda4d127kdwdnf-libunistring-1.1
│ │ │ │ └───/nix/store/77yhmwrwism02371kzyda4d127kdwdnf-libunistring-1.1 [...]
│ │ │ └───/nix/store/n9sq1bvghs9z0qg6cmwg27y4jmszwgqi-libidn2-2.3.7 [...]
...
To complement that, nix-store --query --size gives you (roughly) the size that a specific package takes on disk.
It’s slightly more complicated than this, but for our purposes it will be enough to understand how much disk is used.
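The same closure information can also be gathered from inside Nix at build time. The sketch below is not the code used for this post; it relies on the closureInfo helper from Nixpkgs, and the toplevel argument is a hypothetical name standing for the system’s config.system.build.toplevel. Note that it reports NAR sizes, which only approximate what ends up on disk.

```nix
# Sketch: summarise the closure of a NixOS system at build time.
# `toplevel` is assumed to be the system's config.system.build.toplevel.
{ pkgs, toplevel }:

# closureInfo writes `store-paths` (one store path per line) and
# `total-nar-size` (a byte count) into its output directory.
pkgs.closureInfo {
  rootPaths = [ toplevel ];
}
```

Building this expression gives a small directory that can be inspected or post-processed without querying the store by hand.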
There are some tools which help visualise all this information in cool ways. Two of my favorites are nix-tree and nix-visualize. However, ideally I wanted an interactive graph so I could see each node in the graph by their size on disk, and inspect their dependencies, search things, and so on. nix-visualize was the closest of the tools to give me a graph, but it wasn’t interactive and the node sizes weren’t based on disk usage, so I tried to write my own.
It took me some hours to come up with code that generated a graphviz file, with node sizes based on disk usage. Coupled with vscode-interactive-graphviz, I felt like I had a good approach to interactively working with the graph, but the visualisations turned out to be too crowded. I tried to add some more space into things, but it was kind of a hack because graphviz likes to be the one to position elements. In the end, I gave up on that idea and decided to just generate a CSV, which worked way better than I expected. No wonder we still use spreadsheets for a lot of things.
The repository with the code and the final config of the NixOS system from this post is here.
An investigation of a minimal, headless NixOS system
With a way to see each package, its disk usage, and all its dependencies, let’s look at the minimal, headless system I mentioned at the beginning of the post. The one that takes ~900MB.
Each subsection below will be a small report of my investigation into some items of this CSV [3]. It starts “easy” and gets progressively more complicated. Feel free to skim and skip any part if you don’t feel like it.
Getting rid of Nix (~179MB reduction)
The heaviest item in that list is a mysterious source package. A quick look into what the heck could be taking 170MB of disk space shows it’s actually a complete copy of Nixpkgs!
$ ls /nix/store/amxd2p02wx78nyaa4bkb0hjvgwhz1dq7-source
CONTRIBUTING.md README.md doc lib nixos
COPYING default.nix flake.nix maintainers pkgs
Searching for that package’s pos (an identifier I used in the code that generates the CSV and the graphviz files) shows that it’s only used by this other package:
This package is a single file which doesn’t have a lot in it other than a link to the source package.
A search through Nixpkgs shows the file coming from here, the actual content of registry coming from here, and the source attribute being set here.
I’m building this system with flakes and I’m using that nixosSystem function from Nixpkgs’s flake.nix, which means by default I get this extra 170MB in the system.
I think it would’ve been easy to just undo what Nixpkgs’s flake.nix is doing, but if you look at the list of the heaviest packages again, you’ll see that Nix itself is the 10th heaviest package in the system.
Nix also pulls a lot of dependencies, each one taking some space too (for example, aws-sdk-cpp-1.11.207 eats another 5.7MB by itself, and is only used by Nix).
After some thinking, I realised that I don’t need Nix in any of these systems. I definitely don’t need it in a microVM, but I also don’t need it on my servers, because I’m building their configurations on an external machine and deploying the built bits directly. So let’s add this to the system configuration:
nix.enable = false;
After rebuilding the system, we’re at ~733MB.
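For a system that still needs Nix at runtime, a narrower alternative would be to stop the Nixpkgs flake module from registering its source. This is an untested sketch, not what this post ends up doing: it assumes the Nixpkgs revision in use provides the nixpkgs.flake.setFlakeRegistry and nixpkgs.flake.setNixPath options, and the space it saves has not been measured here.

```nix
{
  # Assumption: these options come from Nixpkgs's flake integration module
  # and may not exist in older revisions.

  # Do not add a "nixpkgs" entry pointing at the flake source to the registry.
  nixpkgs.flake.setFlakeRegistry = false;

  # Do not add the flake source to NIX_PATH either.
  nixpkgs.flake.setNixPath = false;
}
```

Nix and its own dependencies would still be part of the system in that case, so this only targets the copy of the Nixpkgs source.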
Getting rid of Perl, Python (~242MB reduction)
After removing Nix, the 2nd heaviest package is Python3, and 3rd is Perl.
Python only comes in because of install-systemd-boot.sh (truly a shame, why waste so much disk space like this!), and Perl comes in through a bunch of perl-envs (search for perl-5.38.2-env in the CSV and you’ll see them).
Those perl-envs are all used in the top-level package, so let’s figure out where they’re being used there:
$ grep -nr 'perl-5.38.2-env' /nix/store/7z0y5sscnpx4hczzkjh3jvjgn2mq3106-nixos-system-nixos-24.05.20240323.44d0940
/nix/store/7z0y5sscnpx4hczzkjh3jvjgn2mq3106-nixos-system-nixos-24.05.20240323.44d0940/dry-activate:23:/nix/store/d3qxgm4ffhi2ixx3n9clwqlr6z21dd8i-perl-5.38.2-env/bin/perl \
/nix/store/7z0y5sscnpx4hczzkjh3jvjgn2mq3106-nixos-system-nixos-24.05.20240323.44d0940/activate:43:/nix/store/d3qxgm4ffhi2ixx3n9clwqlr6z21dd8i-perl-5.38.2-env/bin/perl \
/nix/store/7z0y5sscnpx4hczzkjh3jvjgn2mq3106-nixos-system-nixos-24.05.20240323.44d0940/activate:63:/nix/store/zkmm5iha0rsm4ypwfc67byq52gz0jb8b-perl-5.38.2-env/bin/perl /nix/store/rg5rf512szdxmnj9qal3wfdnpfsx38qi-setup-etc.pl /nix/store/jq5a0yw04ichvggf7dx80xc438z2v1gv-etc/etc
/nix/store/7z0y5sscnpx4hczzkjh3jvjgn2mq3106-nixos-system-nixos-24.05.20240323.44d0940/bin/switch-to-configuration:1:#! /nix/store/8mlvyl3sab5hxpxz2naz5g2sfd42a40q-perl-5.38.2-env/bin/perl
To make it easier to parse this bunch of text:
Perl is used in the dry-activate, activate, and bin/switch-to-configuration scripts.
dry-activate only needs Perl to run the update-users-groups.pl script, while the activate script runs the same script and also setup-etc.pl, and bin/switch-to-configuration is a Perl script from the beginning.
I thought Perl was going to be hard to remove, but I was determined to at least take a look.
After all, update-users-groups.pl doesn’t seem like the kind of thing I need if I judge it only by its name (I have no idea what it actually does).
I don’t expect my servers to create any extra users or groups dynamically, so there should be nothing to update.
I decided to search Nixpkgs for that script name to get an idea of how it was being added to the system. It was through this search that I stumbled upon a Nixpkgs tracking issue called Perlless Activation - Tracking Issue.
Someone decided it wasn’t a good idea to have Perl in the base NixOS system for slightly different reasons, and they did a lot of work to get rid of it. Luckily for me, I could piggyback off their work and include the following module in my system configuration:
modules = [
  # ...
  (nixpkgs.outPath + "/nixos/modules/profiles/perlless.nix")
];
After rebuilding the system, we’re at ~491MB. As a bonus, Python is now gone as well!
Deduplicating systemd (~14MB reduction)
systemd is now the 2nd heaviest package. It has some stuff inside that I think could be removed, but since it’s an integral part of the system, let’s overlook it for now. But scanning the list of packages, what’s this in 5th place?
For some reason, our NixOS system has both systemd and systemd-minimal. A look through which packages use systemd-minimal shows that only dbus uses it. It comes from here.
Nixpkgs has a lot of packages, and sometimes, due to circular dependencies or to keep the size of dependencies smaller, it introduces variants of packages/functions that have reduced functionality. systemd-minimal probably exists to avoid certain circular dependencies, but I’m not sure. It’s defined here.
In any case, I’d like to try to get rid of systemd-minimal, since we already have the full systemd in our system anyway. There is no easy way to override the package used by the NixOS module that brings in dbus, so I added a Nixpkgs overlay to change the dbus package directly:
nixpkgs.overlays = [
  (
    self: super:
    {
      dbus = super.dbus.override {
        systemdMinimal = self.systemd;
      };
    }
  )
];
This seems to work, and after rebuilding the system, we’re now at ~477MB.
Removing udev, lvm, sudo and security wrappers (~30MB reduction)
This is where things start to get very messy. While looking through the list of heaviest packages, I saw an hwdb.bin package which seems linked to udev. I don’t know much about udev, but it feels like it’s only needed for scenarios that won’t happen on the kind of servers I want to manage.
In case it is actually used for something important and this breaks the system, I have a feeling that a workaround could be hard-coded and wouldn’t require udev anyway. I’d gladly go into that rabbit hole, but (spoiler alert) you’ll see that I gave up well before that.
There’s an option to disable it:
services.udev.enable = false;
While looking through the stuff adjacent to udev, I noticed that lvm is also enabled by default. By the same reasoning as with udev, I don’t think I’d need lvm for these servers, so I disabled it.
services.lvm.enable = false;
While looking through the lvm stuff, I noticed fuse2 and fuse3 are hard-coded by default (and changing those gets complicated quickly). I saw they’re used by some security wrappers, which also set other security wrappers for mount, umount, sudo, and a bunch of other binaries. This is needed because Nix doesn’t support setuid/setgid binaries by design, so NixOS has a binary that dynamically sets some capabilities and permissions, and then executes any other binary with the elevated bits.
I don’t like having this functionality, especially on a specific-purpose server like the ones I want. Instead of a single wrapper binary which receives an argument with the binary to execute with elevated permissions, I’d rather have X wrapper binaries with hardcoded paths and no parametrisation of any kind (one for each of the X things I want to execute), and that’s only IF I actually need this functionality.
For anything I want to run in these servers, I think I can configure the proper permissions through systemd unit configs instead.
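As an illustration of that approach, a service that needs a single privileged operation (binding a low port, say) can be granted just that capability in its unit instead of going through a setuid wrapper. This is a minimal sketch: the service name my-worker is hypothetical, and pkgs.hello merely stands in for whatever binary the service would really run.

```nix
{ pkgs, ... }:
{
  # "my-worker" is a hypothetical service name used for illustration.
  systemd.services.my-worker = {
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      # Placeholder binary; substitute the real program here.
      ExecStart = "${pkgs.hello}/bin/hello";
      # Run as an unprivileged, dynamically allocated user.
      DynamicUser = true;
      # Grant only the capability the service needs, and nothing beyond it.
      AmbientCapabilities = "CAP_NET_BIND_SERVICE";
      CapabilityBoundingSet = "CAP_NET_BIND_SERVICE";
      # Prevent the process and its children from gaining further privileges.
      NoNewPrivileges = true;
    };
  };
}
```

The point of the design is that privileges are declared per service in the unit, so nothing on the system needs a general-purpose wrapper that elevates arbitrary binaries.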
The security wrappers module doesn’t have an enable option to toggle it off, so one way to get rid of it completely is to add it to the disabledModules attribute.
This requires me to provide dummy versions of the options that the security wrappers module used to declare, because every module gets evaluated by default when building a NixOS system (most of them just won’t do anything because they’re not enabled).
Some of these modules set additional wrappers, so the dummy options are needed to make the module system happy.
({ lib, pkgs, ... }: {
  disabledModules = [ "security/wrappers/default.nix" ];
  options.security = {
    wrappers = lib.mkOption {
      type = lib.types.attrs;
      default = { };
    };
    wrapperDir = lib.mkOption {
      type = lib.types.path;
      default = "/run/wrappers/bin";
    };
  };
  config = {
    # ...
  };
})
I think doing this could break some script that calls mount or umount or fuse (because those are hardcoded in the security wrappers module), but I also think that most scripts that use those are being run directly as root, so I’m not sure.
To finish this section, let’s also disable sudo completely because it’s useless without its security wrapper.
security.sudo.enable = false;
We’re at ~447MB now.
Some other minimal shenanigans
At this point, the 10th and 11th heaviest packages are util-linux and util-linux-minimal, respectively. This seems similar to that systemd-minimal thing from a while ago!
Let’s look at where these are being used:
- util-linux: system-path, a bunch of systemd services and a mount-pstore shell script.
- util-linux-minimal: fuse2 and fuse3, and etc-systemd-system.conf.
Removing fuse is very annoying (although to be honest, with all the mess in the config so far, it wouldn’t even look that bad anymore). But we can at least try to make them use util-linux instead of util-linux-minimal, right?
To get there, let’s look at how these packages are declared in all-packages.nix.
We’ll need an overlay, but trying to change fusePackages to use the normal util-linux will hit an infinite recursion error, so I’ll start by overlaying fuse3:
nixpkgs.overlays = [
  (self: super: {
    fuse3 = (self.lib.dontRecurseIntoAttrs
      (self.callPackage (nixpkgs.outPath + "/pkgs/os-specific/linux/fuse") { })).fuse_3;
  })
];
This builds nicely, but the moment I try to do this with fuse2, the infinite recursion error is back. Sigh. Whatever.
Browsing through the list of packages, I also see systemd-minimal-libs sneaking in there. It’s being used by a bunch of other packages, and it’s equally difficult to add more overlays to get rid of it. More infinite recursions.
This is where I look at the current system config, look at all the notes I made of things to look into that I haven’t yet (the list is right there in the next section), think about how much worse it’ll get by trying to fix all of this, and give up.
Things I noted, but didn’t look at
- With Nix gone, the heaviest package was the linux kernel at ~136MB. I know I can get it down to ~50MB easily (the kernel used by default on NixOS has a lot of modules and extra things that a server doesn’t need), so I left that for later because it was easy.
- One disadvantage of a perlless system is that it can’t switch to a different configuration at runtime, because the script that does this is written in Perl. This isn’t an issue for most of the servers in my scenario. MicroVMs don’t need that, and I’d be perfectly ok just killing a bunch of other servers and starting new ones with the updated configuration.
  However, I made a note to look into that perl script and figure out how much work it would be to build a replacement. This is something that needs to happen anyway at some point, and would benefit the NixOS community at large.
- A bunch of packages build with all locales and some internationalisation content. Those end up taking some good space, so I made a note to look at how to simplify this and get rid of most of the locales and files I wouldn’t need.
- The system has both libressl and openssl packages. libressl is only used for netcat, but I don’t really need it in the servers. In fact, NixOS includes a lot of utilities by default and marks them as required (making it super annoying to remove them) which aren’t really needed on the servers.
- A bunch of extra default config files that aren’t needed (such as bashrc) could be removed. This would also remove some packages that they use, such as bash-completion, which won’t ever be needed in the servers either.
- coreutils and util-linux are both kind of heavy, but it’s very likely that the scripts and things that use them only really need a few binaries from each one. Perhaps an overlay that filters the binaries only to the list that are used would help free up quite a bunch of disk space.
- nix-store has a command to optimise disk space by finding identical files and hard-linking them. This could be helpful in some cases, but might not be possible in others, depending on how a server is imaged, or how new configuration gets pushed to it. It could be useful to decrease the disk space used by some of those “-minimal” packages (as long as they share exactly the same file).
Leaving this to the future
There is a huge audience that uses NixOS as a personal OS, and a lot of the defaults and modules present in NixOS reflect that. NixOS can still be used as a server OS, but it requires a very different set of configurations, and it still ends up not being adequate in every situation.
I can apply many of the configs I used in this post to my existing servers and make them leaner, cutting ~300MB of stuff I don’t need. I got some experience and figured out some tools to help me investigate these issues in detail whenever I feel the need to.
But over the 2 days I spent looking at this, I concluded that trying to mold NixOS into the shape I wanted just isn’t the way to go. I also don’t like the other option if I want to stick with it, which is creating a fork of NixOS that is very opinionated and completely focused on server scenarios. So I’m leaving this for future me to figure out.
I was trying to bring NixOS to a bare minimum, which is an exercise similar to building containers with the bare minimum required for the software in the container to run. One can argue I should just use containers instead, but I think it’s a worthy endeavour to avoid them. I think we have all the tools in regular non-docker, non-kubernetes linux to get to a similar outcome, except we won’t need docker or kubernetes or whatever, and this removes a bunch of complexity from the systems we build.
But doing it on top of NixOS currently feels like a bad path to take.
Footnotes
1. Actually the top-level system as given by the attribute .config.system.build.toplevel, which covers essentially everything the system needs to run.
2. I’m going to use the term package to mean “store object” as defined by the Nix manual, because for most people this is an easier way to reason about store objects.
3. If you want to look at the same CSV I used, you can download it, but you won’t be able to inspect the store paths unless you happen to build the same configuration with the same Nixpkgs version.