Hillingar: MirageOS Unikernels on NixOS

by Ryan Gibb on Dec 14th, 2022


NixOS enables reproducible deployments of systems by managing their configuration declaratively. MirageOS is a unikernel creation framework that builds targeted operating systems for high-level applications, which can run on a hypervisor. By building MirageOS unikernels with Nix, we get reproducible builds of these unikernels and easy deployment on NixOS systems.

Introduction

The Domain Name System (DNS) is a critical component of the modern Internet, allowing domain names to be mapped to IP addresses, mailservers, and more. This allows users to access services independent of their location in the Internet using human-readable names. We can host a DNS server ourselves to have authoritative control over our domain, protect the privacy of those using our server, increase reliability by not relying on a third party DNS provider, and allow greater customisation of the records served. However, it can be quite challenging to deploy one's own server reliably and reproducibly. The Nix deployment system aims to address this. With a NixOS machine, deploying a DNS server is as simple as:

{
  services.bind = {
    enable = true;
    zones."freumh.org" = {
      master = true;
      file = "freumh.org.zone";
    };
  };
}

We can then query it with:

$ dig freumh.org @ns1.freumh.org +short
135.181.100.27
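
The zone file itself can also be managed declaratively. Here is a minimal sketch, reusing the address shown above and Nixpkgs' writeText helper to place the zone in the Nix store (the SOA parameters and hostmaster address are illustrative):

{ pkgs, ... }:

{
  services.bind = {
    enable = true;
    zones."freumh.org" = {
      master = true;
      # write the zone file into the Nix store; illustrative values
      file = pkgs.writeText "freumh.org.zone" ''
        $TTL 3600
        @   IN SOA ns1.freumh.org. hostmaster.freumh.org. (1 7200 3600 1209600 3600)
            IN NS  ns1.freumh.org.
        @   IN A   135.181.100.27
        ns1 IN A   135.181.100.27
      '';
    };
  };
}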

To enable the user to query our domain without specifying the nameserver, we have to create a glue record with our registrar pointing ns1.freumh.org to the IP address of our DNS-hosting machine.

You might notice this configuration is running the venerable BIND [1], which is written in C. As an alternative, using functional, high-level, type-safe programming languages to create network applications can greatly benefit safety and usability whilst maintaining performant execution. One such language is OCaml.

MirageOS [2] is a deployment method for these OCaml programs. Rather than running them as traditional Unix processes, we create a specialised 'unikernel' operating system to run each application. Unikernels offer reduced image sizes through dead-code elimination, as well as improved security and efficiency.

However, to deploy a Mirage unikernel with NixOS, one must use the imperative deployment methodologies native to the OCaml ecosystem, thus eliminating the benefit of reproducible systems that Nix offers. This blog post will explore how we enabled reproducible deployments of Mirage unikernels by building them with Nix.

MirageOS


MirageOS is a library operating system that allows users to create unikernels, which are specialised operating systems that include both low-level operating system code and high-level application code in a single kernel and a single address space. It was the first such 'unikernel creation framework', but comes from a long lineage of OS research, such as the exokernel library OS architecture. Embedding application code in the kernel allows for dead-code elimination, removing OS interfaces that are unused, which reduces the unikernel's attack surface and offers improved efficiency.

Contrasting software layers in existing VM appliances vs. unikernel's standalone kernel compilation approach [3]

Mirage unikernels are written in OCaml [4]. OCaml is more practical for systems programming than other functional programming languages, such as Haskell, as it supports falling back on impure imperative code or mutable variables when warranted.

Nix

The Nix snowflake [5]

Nix is a deployment system that uses cryptographic hashes to compute unique paths for components [6] that are stored in a read-only directory: the Nix store, at /nix/store/<hash>-<name>. This provides several benefits, including concurrent installation of multiple versions of a package, atomic upgrades, and multiple user environments.

Nix uses a declarative domain-specific language (DSL), also called Nix, to build and configure software. The snippet used to deploy the DNS server is in fact a Nix expression. This example doesn't demonstrate it, but Nix is Turing complete. Nix does not, however, have a static type system.
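
For a flavour of the language, here is a small expression (unrelated to packaging, just a sketch) showing that the DSL supports general recursion, which is what makes it Turing complete; evaluating it with nix-instantiate --eval prints 120:

let
  # let-bindings in Nix are recursive, so a function can refer to itself;
  # a naive factorial as an example
  fact = n: if n == 0 then 1 else n * fact (n - 1);
in
  fact 5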

We use the DSL to write derivations for software, which describe how to build said software from input components and a build script. A Nix expression is then 'instantiated' to create a 'store derivation' (a .drv file), which is the low-level representation of how to build a single component. This store derivation is 'realised' into a built artefact, a process hereafter referred to as 'building'.

Possibly the simplest Nix derivation uses bash to create a single file containing Hello, World!:

{ pkgs ? import <nixpkgs> { } }:

builtins.derivation {
  name = "hello";
  system = builtins.currentSystem;
  builder = "${pkgs.bash}/bin/bash";
  args = [ "-c" ''echo "Hello, World!" > $out'' ];
}

Note that builtins.derivation is a function that we're calling with one argument, which is a set of attributes.

We can instantiate this Nix derivation to create a store derivation:

$ nix-instantiate default.nix
/nix/store/5d4il3h1q4cw08l6fnk4j04a19dsv71k-hello.drv
$ nix show-derivation /nix/store/5d4il3h1q4cw08l6fnk4j04a19dsv71k-hello.drv
{
  "/nix/store/5d4il3h1q4cw08l6fnk4j04a19dsv71k-hello.drv": {
    "outputs": {
      "out": {
        "path": "/nix/store/4v1dx6qaamakjy5jzii6lcmfiks57mhl-hello"
      }
    },
    "inputSrcs": [],
    "inputDrvs": {
      "/nix/store/mnyhjzyk43raa3f44pn77aif738prd2m-bash-5.1-p16.drv": [
        "out"
      ]
    },
    "system": "x86_64-linux",
    "builder": "/nix/store/2r9n7fz1rxq088j6mi5s7izxdria6d5f-bash-5.1-p16/bin/bash",
    "args": [ "-c", "echo \"Hello, World!\" > $out" ],
    "env": {
      "builder": "/nix/store/2r9n7fz1rxq088j6mi5s7izxdria6d5f-bash-5.1-p16/bin/bash",
      "name": "hello",
      "out": "/nix/store/4v1dx6qaamakjy5jzii6lcmfiks57mhl-hello",
      "system": "x86_64-linux"
    }
  }
}

And build the store derivation:

$ nix-store --realise /nix/store/5d4il3h1q4cw08l6fnk4j04a19dsv71k-hello.drv
/nix/store/4v1dx6qaamakjy5jzii6lcmfiks57mhl-hello
$ cat /nix/store/4v1dx6qaamakjy5jzii6lcmfiks57mhl-hello
Hello, World!

Most Nix tooling does these two steps together:

$ nix-build default.nix
this derivation will be built:
  /nix/store/q5hg3vqby8a9c8pchhjal3la9n7g1m0z-hello.drv
building '/nix/store/q5hg3vqby8a9c8pchhjal3la9n7g1m0z-hello.drv'...
/nix/store/zyrki2hd49am36jwcyjh3xvxvn5j5wml-hello

Nix realisations (hereafter referred to as 'builds') are done in isolation to ensure reproducibility. Projects often rely on a package manager to make all their dependencies available, and may implicitly rely on system configuration at build time. To prevent this, every Nix derivation is built in a sandbox -- without network access or access to the global file system -- with only other Nix derivations as inputs.

The name Nix is derived from the Dutch word niks, meaning nothing; build actions do not see anything that has not been explicitly declared as an input.

Nixpkgs

You may have noticed a reference to <nixpkgs> in the above derivation. As every input to a Nix derivation also has to be a Nix derivation, one can imagine the tedium involved in creating a Nix derivation for every dependency of your project. However, Nixpkgs [7] is a large repository of software packaged in Nix, where a package is a Nix derivation. We can use packages from Nixpkgs as inputs to a Nix derivation, as we've done with bash.
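
For instance, here is a minimal sketch of a derivation built with Nixpkgs' standard environment, taking the hello package from Nixpkgs as a build-time input (the package choice is arbitrary):

{ pkgs ? import <nixpkgs> { } }:

pkgs.stdenv.mkDerivation {
  name = "hello-output";
  # there is no source tree to unpack
  dontUnpack = true;
  # a build-time dependency from Nixpkgs, put on the build's PATH
  nativeBuildInputs = [ pkgs.hello ];
  installPhase = ''
    hello > $out
  '';
}

Building this with nix-build produces a store path whose contents were generated by Nixpkgs' hello.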

There is also a command-line package manager for installing packages from Nixpkgs, which is why people often refer to Nix as a package manager. While Nix, and therefore Nix package management, is primarily source-based (since derivations describe how to build software from source), binary deployment is an optimisation of this. Since packages are built in isolation and entirely determined by their inputs, binaries can be transparently deployed by downloading them from a remote server instead of building the derivation locally.


Visualisation of Nixpkgs [8]

NixOS

NixOS [9] is a Linux distribution built with Nix from a modular, purely functional specification. It has no traditional filesystem hierarchy (no /bin, /lib, or /usr), instead storing all components in /nix/store. The system configuration is managed by Nix and configured with Nix expressions. NixOS modules are Nix files containing chunks of system configuration that can be composed to build a full NixOS system [10]. While many NixOS modules are provided in the Nixpkgs repository, they can also be written by an individual user. For example, the expression used to deploy a DNS server is a NixOS module. Together these modules form the configuration which builds the Linux system as a Nix derivation.

NixOS minimises global mutable state that -- without knowing it -- you might rely on being set up in a certain way. For example, you might follow instructions to run a series of shell commands and edit some files to get a piece of software working. You may subsequently be unable to reproduce the result because you've forgotten some intricacy or are now using a different version of the software. Nix forces you to encode this in a reproducible way, which is extremely useful for replicating software configurations and deployments, aiming to solve the 'It works on my machine' problem. Docker is often used to fix this configuration problem, but Nix aims to be more reproducible. This can be frustrating at times because it can make it harder to get a project off the ground, but the benefits often outweigh the downsides.

Nix uses pointers (implemented as symlinks) to system dependencies, which are Nix derivations for programs or pieces of configuration. This means NixOS supports atomic upgrades, as the pointers to the new packages are only updated when the install succeeds, and the old versions are kept until garbage collection. This also allows NixOS to trivially support rollbacks to previous system configurations, as the pointers can be restored to their previous state. Every new system configuration creates a GRUB entry, so you can boot previous systems even from your UEFI/BIOS. Finally, NixOS also supports partial upgrades: while Nixpkgs has one global coherent package set, one can use multiple instances of Nixpkgs (i.e., channels) at once, as the Nix store allows multiple versions of a dependency to be stored.

To summarise the parts of the Nix ecosystem that we've discussed:

  • Nix: a deployment system with a declarative DSL for writing derivations.
  • Nixpkgs: a large repository of software packaged with Nix.
  • NixOS: a Linux distribution configured and built with Nix.

Flakes

We also use Nix flakes for this project. Without going into too much depth, they enable hermetic evaluation of Nix expressions and provide a standard way to compose Nix projects. With flakes, instead of using a Nixpkgs repository version from a 'channel' [11], we pin Nixpkgs as an input to every Nix flake, be it a project built with Nix or a NixOS system. Integrated with flakes, there is also a new nix command aimed at improving the Nix UI. You can read more about flakes in a series of blog posts by Eelco Dolstra on the topic [12].
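
A minimal sketch of a flake that pins Nixpkgs as an input and exposes a single package output (the branch name is illustrative):

{
  # recorded in flake.lock on first evaluation
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-22.11";

  outputs = { self, nixpkgs }: {
    # re-export a package from the pinned Nixpkgs
    packages.x86_64-linux.default =
      nixpkgs.legacyPackages.x86_64-linux.hello;
  };
}

Running nix build on this flake then uses exactly the Nixpkgs revision recorded in flake.lock.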

Deploying Unikernels

Now that we understand what Nix and Mirage are, and we've motivated the desire to deploy Mirage unikernels on a NixOS machine, what's stopping us from doing just that? To support deploying a Mirage unikernel, such as one for a DNS server, we need to write a NixOS module for it.

A pared-down [13] version of the bind NixOS module, the module used in our Nix expression for deploying a DNS server on NixOS (§), is:

{ config, lib, pkgs, ... }:

with lib;

{
  options = {
    services.bind = {
      enable = mkEnableOption "BIND domain name server";
      
      zones = mkOption {
        ...
      };
    };
  };

  config = mkIf cfg.enable {
    systemd.services.bind = {
      description = "BIND Domain Name Server";
      after = [ "network.target" ];
      wantedBy = [ "multi-user.target" ];

      serviceConfig = {
        ExecStart = "${pkgs.bind.out}/sbin/named";
      };
    };
  };
}

Notice the reference to pkgs.bind. This is the Nixpkgs repository Nix derivation for the bind package. Recall that every input to a Nix derivation is itself a Nix derivation (§); in order to use a package in a Nix expression -- i.e., a NixOS module -- we need to build said package with Nix. Once we build a Mirage unikernel with Nix, we can write a NixOS module to deploy it.
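
To make the shape of this concrete, here is a purely illustrative sketch of what such a module could look like, mirroring the bind module above; it assumes a unikernel image built for Solo5's hvt target, run under the solo5-hvt tender from Nixpkgs' solo5 package (the service name and image path are hypothetical):

{ config, lib, pkgs, ... }:

with lib;

let
  # hypothetical: the unikernel derivation built with Nix,
  # e.g. taken from a flake input
  unikernel = throw "substitute the unikernel derivation here";
in
{
  options.services.dns-unikernel.enable =
    mkEnableOption "DNS server unikernel";

  config = mkIf config.services.dns-unikernel.enable {
    systemd.services.dns-unikernel = {
      description = "MirageOS DNS unikernel";
      after = [ "network.target" ];
      wantedBy = [ "multi-user.target" ];
      serviceConfig.ExecStart =
        # run the unikernel image under the Solo5 hvt tender
        "${pkgs.solo5}/bin/solo5-hvt ${unikernel}/unikernel.hvt";
    };
  };
}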

Building Unikernels

Mirage uses opam [14], the OCaml package manager. Opam packages, as is common in programming language package managers, each have a file which -- among other metadata and build/install scripts -- specifies their dependencies and version constraints. For example [15]:

...
depends: [
  "arp" { ?monorepo & >= "3.0.0" & < "4.0.0" }
  "ethernet" { ?monorepo & >= "3.0.0" & < "4.0.0" }
  "lwt" { ?monorepo }
  "mirage" { build & >= "4.2.0" & < "4.3.0" }
  "mirage-bootvar-solo5" { ?monorepo & >= "0.6.0" & < "0.7.0" }
  "mirage-clock-solo5" { ?monorepo & >= "4.2.0" & < "5.0.0" }
  "mirage-crypto-rng-mirage" { ?monorepo & >= "0.8.0" & < "0.11.0" }
  "mirage-logs" { ?monorepo & >= "1.2.0" & < "2.0.0" }
  "mirage-net-solo5" { ?monorepo & >= "0.8.0" & < "0.9.0" }
  "mirage-random" { ?monorepo & >= "3.0.0" & < "4.0.0" }
  "mirage-runtime" { ?monorepo & >= "4.2.0" & < "4.3.0" }
  "mirage-solo5" { ?monorepo & >= "0.9.0" & < "0.10.0" }
  "mirage-time" { ?monorepo }
  "mirageio" { ?monorepo }
  "ocaml" { build & >= "4.08.0" }
  "ocaml-solo5" { build & >= "0.8.1" & < "0.9.0" }
  "opam-monorepo" { build & >= "0.3.2" }
  "tcpip" { ?monorepo & >= "7.0.0" & < "8.0.0" }
  "yaml" { ?monorepo & build }
]
...

Each of these dependencies will have its own dependencies with their own version constraints. As we can only link one version of each dependency into the resulting program, we need to solve for a set of dependency versions that satisfies these constraints. This is not an easy problem; in fact, it's NP-complete [16]. Opam uses the Zero Install [17] SAT solver for dependency resolution.

Nixpkgs has a large number of OCaml packages [18], which we could provide as build inputs to a Nix derivation. However, Nixpkgs has one global coherent set of package versions [19]. The support for installing multiple versions of a package concurrently comes from the fact that they are stored at unique paths and can be referenced separately, or symlinked, where required. So different projects or users that use different versions of Nixpkgs won't conflict, but Nix does not do any dependency version resolution -- everything is pinned. This is a problem for opam projects with version constraints that can't be satisfied by a static instance of Nixpkgs.
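
To illustrate the pinning, a given Nixpkgs snapshot exposes exactly one version of each OCaml package. A sketch that can be evaluated with nix-instantiate --eval (the output depends on the snapshot):

let
  pkgs = import <nixpkgs> { };
in
  # whatever single version this snapshot pins -- there is
  # no per-project constraint solving
  pkgs.ocamlPackages.lwt.version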

Luckily, a project from Tweag, opam-nix [20], already exists to deal with this. It runs the opam dependency version solver inside a Nix derivation, and then creates Nix derivations from the resolved dependency versions.
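
A sketch of what using opam-nix looks like in a flake, following the pattern from the opam-nix README (the project name my-unikernel is a placeholder):

{
  inputs.opam-nix.url = "github:tweag/opam-nix";

  outputs = { self, opam-nix, ... }:
    let
      system = "x86_64-linux";
      # solve dependency versions with opam's solver, then build each
      # resolved package as a Nix derivation
      scope = opam-nix.lib.${system}.buildOpamProject { } "my-unikernel" ./. {
        ocaml-base-compiler = "*";
      };
    in {
      packages.${system}.default = scope.my-unikernel;
    };
}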

This still doesn't support building our Mirage unikernels, though. Unikernels quite often need to be cross-compiled: compiled to run on a platform other than the one they're being built on. A common target, Solo5 [21], is a sandboxed execution environment for unikernels. It acts as a minimal shim layer to interface between unikernels and different hypervisor backends. Solo5 uses a different libc, which requires cross-compilation. Mirage 4 [22] supports cross-compilation with toolchains in the Dune build system [23]. This uses a host compiler installed in an opam switch (a virtual environment) as normal, as well as a target compiler [24]. But the cross-compilation context of packages is only known at build time, as some metaprogramming modules may require preprocessing with the host compiler. To ensure that the right compilation context is used, we have to provide Dune with the sources of all our dependencies. A tool called opam-monorepo was created to do just that [25].

We extended the opam-nix project to support the opam-monorepo workflow with this pull request: github.com/tweag/opam-nix/pull/18. This is very low-level support for building Mirage unikernels with Nix, however. In order to provide a better user experience, we also created the Hillingar Nix flake: github.com/RyanGibb/hillingar. This wraps the Mirage tooling and opam-nix function calls so that a simple high-level flake can be dropped into a Mirage project to support building it with Nix. To add Nix build support to a unikernel, simply:

# create a flake from hillingar's default template
$ nix flake init -t github:RyanGibb/hillingar
# substitute the name of the unikernel you're building
$ sed -i 's/throw "Put the unikernel name here"/"<unikernel-name>"/g' flake.nix
# build the unikernel with Nix for a particular target
$ nix build .#<target>

For example, see the flake for building the Mirage website as a unikernel with Nix: github.com/RyanGibb/mirage-www/blob/master/flake.nix.

Evaluation

Hillingar's primary limitations are (1) the complex integration with the OCaml ecosystem required to solve dependency version constraints using opam-nix, and (2) that cross-compilation requires cloning all sources locally with opam-monorepo (§). Another issue that proved an annoyance during this project is the Nix DSL's dynamic typing. When writing simple derivations this often isn't a problem, but when writing complicated logic it quickly gets in the way of productivity, and the runtime errors produced can be very hard to parse. Thankfully there is work towards creating a typed language for the Nix deployment system, such as Nickel [26]. However, gradual typing is hard, and Nickel still isn't ready for real-world use despite having been open-sourced for two years (to within a week, as of writing).

A glaring omission is that, despite it being the primary motivation, we haven't actually written a NixOS module for deploying a DNS server as a unikernel. There are still questions about how to provide zone file data declaratively to the unikernel and how to manage the runtime of deployed unikernels. One option for the latter is Albatross [27], which has recently had support for building with Nix added [28]. Albatross aims to provision resources for unikernels such as network access, share resources between users, and monitor unikernels with a Unix daemon. Using Albatross to manage the inherently imperative processes behind unikernels, and to share access to unikernel resources with other users on a NixOS system, could simplify the creation and improve the functionality of a NixOS module for a unikernel.

There also exists related work on reproducible builds of Mirage unikernels, specifically on improving the reproducibility of opam packages (as Mirage unikernels are themselves opam packages) [29]. Hillingar differs in that it only uses opam for version resolution, instead using Nix to provide dependencies, which provides reproducibility with pinned Nix derivation inputs and builds in isolation by default.

Conclusion

To summarise, this project was motivated (§) by deploying unikernels on NixOS (§). Towards this end, we added support for building MirageOS unikernels with Nix: we extended opam-nix to support the opam-monorepo workflow and created the Hillingar project to provide a usable Nix interface (§).

While only the first was the primary motivation, the benefits of building unikernels with Nix are:

  • Reproducible and low-config unikernel deployment using NixOS modules is enabled.
  • Nix allows reproducible builds, pinning system dependencies and composing multiple language environments. For example, the OCaml package conf-gmp is a 'virtual package' that relies on a system installation of the C/assembly library gmp (the GNU Multiple Precision Arithmetic Library). Nix easily allows us to depend on this package in a reproducible way.
  • We can use Nix to support building on different systems (§).

To conclude, while NixOS and MirageOS take fundamentally very different approaches, they're both trying to bring some kind of functional programming paradigm to operating systems. NixOS does this in a top-down manner, trying to tame Unix with functional principles like laziness and immutability [30]; whereas MirageOS throws Unix out the window and rebuilds the world from scratch in a very much bottom-up approach. Despite these two projects having different motivations and goals, Hillingar aims to get the best of both worlds by marrying the two.


To dive deeper, please see a more detailed article on my personal blog.

If you have a unikernel, consider trying to build it with Hillingar, and please report any problems at github.com/RyanGibb/hillingar/issues!