The MirageOS Retreat: A Journey of Food, Cats, and Unikernels
by Jules Aguillon, Sayo Bamigbade, Enguerrand Decorne, Sonja Heinze, Jan Midtgaard, Lucas Pluvinage on Oct 28th, 2022
MirageOS is an OCaml ecosystem to construct unikernels, i.e., minimal operating systems. Here, we write about our social and technical experience at the MirageOS retreat in Morocco, as well as the vibe and wonderful organisational details. To sum up the technical part, we worked on different facets of the MirageOS world: different kinds of unikernels, some groundwork for Raspberry Pi 4 bare-metal unikernels, and a workflow to leverage an existing deployment/orchestrating infrastructure. The MirageOS retreat was amazing!
Our journey started in Agadir, a Moroccan city right on the Atlantic coast, just south of the Atlas mountains. In Agadir, we had the best fish in the world (according to some) and amazing "cornes de gazelle," a delicious sample of Moroccan culture.
From Agadir, we went to Mirleft, a small town further south, full of square roads and beautiful reefs. That's where the MirageOS retreat took place. The venue had a kitchen and an amazing cook, a place for computers and presentations, a garden with a small pool, and a rooftop with dusty but nice views of the coast.
Both the venue and Mirleft as a whole were extremely inspiring in many ways. One of them was hacking on MirageOS, which was, of course, the main reason we came. But we also enjoyed amazing food, saw old and new friends, and had a great time collaborating and creating with MirageOS.
At least once a year since the first MirageOS retreat in 2016 (with a Covid break in 2021), people get together and work on anything related to MirageOS. These retreats provide a great atmosphere, working environment, and everything else that's needed to be productive and to have a wonderful time.
Besides, the retreat is always a nice opportunity to eat our own dog food.
The organiser, Hannes (among others), always makes sure that most of the infrastructure we rely on is running on MirageOS as much as possible. A welcome addition this year was a local opam cache, which allowed us to download and install packages without crushing the data allowance on the SIM card installed on our main access point.
MirageOS is an ecosystem that constructs unikernels. In a superficial nutshell, a unikernel is a machine image that contains one process and a minimal set of operating system features the process requires. Unikernels are designed to be secure, efficient, and small. MirageOS unikernels are written in OCaml, a functional, semantically rich and type-safe programming language.
MirageOS can be used in a wide range of settings, from robust reimplementations of core system services and protocols (DNS, SSH, TLS, and many more) to higher-level applications like web services. It's also on its way to becoming a good candidate for bare-metal applications on various chipsets (e.g., a good choice for the Raspberry Pi 4; see also the section below on Implementing a Jack Port Driver).
We worked on lots of interesting things, but let's start with the ones that directly relate to MirageOS.
Albatross is an orchestrator for MirageOS unikernels. It runs on a Linux system and manages unikernels using Solo5. It's made of several services, one of which is the remote TLS endpoint, which accepts requests from the network to manage the orchestrator.
Some of us wanted to run Albatross on our favourite Linux distribution, NixOS, and we hoped to be able to hack around this quickly; however, it turned out to be harder than expected. We learned so much about systemd and networking while doing this project.
A Nix flake (a new way of defining packages, which comes with many rough edges) and a NixOS module are added to the main repository in this PR. To test that it works and to play with it, we've written a small tutorial that explains how to build a Qemu VM with Albatross and how to deploy a unikernel using the remote TLS endpoint.
Some of us worked on deploying a coffee chat bot as a MirageOS unikernel. Contrary to how it sounds, it isn't a robot that serves coffee (which would be extremely awesome)! Instead, it's a Slack bot that lets people on our company's Slack channel opt in to a coffee chat with a colleague. The coffee chat bot then randomly matches each person who opted in with another one.
This bot was already written in OCaml, and it's merely a single process. Due to its nature, it doesn't need to do any super complicated operating system stuff, so a natural question came to us: why not make a unikernel out of it?
So we did.
Making a unikernel out of a relatively simple application sounds rather straightforward. The first step was to get rid of all Unix operations. It's incredible how many small Unix calls we were making without even noticing. For example, we were using Unix.time all over the place: for scheduling, for seeding the random library, and for timestamping our database entries.
The database posed another problem. We had been using irmin-unix, which writes to disk using Unix. To fix that, now we use irmin-mem, which writes to memory. We persist (and inspect) the data by syncing our in-memory database with a GitHub repository. If you're not familiar with Irmin (a MirageOS library), its design follows the principles of the Git design and provides a library called irmin-git to bridge the two.
Providing the network stack needed for the Git (and also for the Slack API) communication is one of the typical tasks the operating system needs to do. In our case, that's MirageOS. It has a concept called "devices," which are the operating system features your unikernel might need. Examples of "devices" are network interfaces, network stacks, filesystems (which we didn't need), and monotonic time sources. MirageOS will provide a concrete implementation of such a device at your unikernel's compile time, as long as you declare the device in the MirageOS configuration file.
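To give a feel for the "device" idea without pulling in MirageOS itself, here is a minimal sketch in plain OCaml. It is not the coffee chat bot's code: the CLOCK signature, the Bot functor, and the fixed clock are all hypothetical names invented for this illustration. The point is only that the application is written against a signature, and the concrete implementation is chosen when everything is put together.

```ocaml
(* Hypothetical signature for a wall-clock "device"; this is an
   illustration, not MirageOS's actual clock interface. *)
module type CLOCK = sig
  val now : unit -> float (* seconds since the epoch *)
end

(* The application logic is a functor over the device, so it never
   mentions Unix directly. *)
module Bot (Clock : CLOCK) = struct
  (* Stamp a database entry with the time supplied by the device. *)
  let timestamped entry = Printf.sprintf "%.0f %s" (Clock.now ()) entry
end

(* For the sketch we plug in a fixed clock so that the example has no
   dependencies; a real target would supply its own implementation. *)
module Fixed_clock : CLOCK = struct
  let now () = 1666915200.0
end

module Test_bot = Bot (Fixed_clock)

let () = print_endline (Test_bot.timestamped "opt-in: alice")
```

Swapping Fixed_clock for another module with the same signature changes where the time comes from without touching the Bot code, which is roughly the role the configuration file plays for real devices.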
The things described were just a small part of our nice, educational journey making a coffee chat unikernel. One more detail that's worth mentioning: the bot now uses httpaf for the Slack API interactions. Before, it was using cohttp, which is already independent from Unix (unlike, for example, the OCaml library curly). Porting it to httpaf wasn't technically necessary, but it was a great way to get to know and test the latest "cutting-edge" unikernel features.
We also went bare-metal during the retreat. "Bare-metal" sounds cool, doesn't it? Let us explain what we really mean by it. Often, the way to run a MirageOS unikernel is as follows:
- You have a Linux kernel on your machine and virtualize it via a hypervisor such as KVM.
- That hypervisor is then abstracted further by a tool called Solo5 which integrates well with MirageOS unikernels.
With this workflow, the communication between the unikernel and the hardware goes over several layers of abstraction. A "bare-metal" unikernel, on the contrary, communicates with the hardware directly, without any interfacing kernel such as Linux. The device we chose to do bare-metal work on is the Raspberry Pi 4 (RPi4).
So we needed an RPi4 bare-metal OCaml runtime. Luckily, Dinosaure wrote one last year: Gilbraltar. It also dumps the text of OCaml print statements into the UART, which is a technical way of saying that we can send such text over USB (concretely over a USB to serial TTL cable) and see it. Quite useful for debugging!
As you can see, doing bare-metal work is quite restrictive and everything that tends to be taken for granted needs to be implemented, like drivers, for example.
So that's what we decided to do.
Last year, some colleagues already implemented a driver for LED strips and powered our office's Christmas tree with a bare-metal OCaml RPi4! What is cooler than making our bare-metal RPi4 Christmas tree sing? Well, a lot of things are. Anyways, we love music, so we decided to implement a jack port driver.
Jack port drivers on a digital device are an interesting concept. Digital devices are digital, but jack ports expect analog data. One way the RPi4 can handle that is via a concept called PWM: Pulse Width Modulation. PWM approximates analog signals (i.e., values between 0 and 1) by sending digital signals (i.e., either 0 or 1) really fast, so that the fraction of time the signal is high matches the analog value.
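The idea behind PWM can be sketched in a few lines of OCaml. This is only a software illustration under simplifying assumptions (a fixed period and all the "high" slots grouped at the start); on the RPi4 the modulation happens in hardware, and the function names below are made up for the example.

```ocaml
(* Approximate an analog level in [0, 1] by a train of [period] digital
   pulses: the first [high] slots are 1, the rest are 0. *)
let pwm_period ~period level =
  let level = max 0.0 (min 1.0 level) in
  (* Round to the nearest number of high slots. *)
  let high = int_of_float ((level *. float_of_int period) +. 0.5) in
  List.init period (fun i -> if i < high then 1 else 0)

(* The duty cycle is the fraction of slots in which the signal is high;
   averaged over a period it recovers the (quantised) analog level. *)
let duty_cycle bits =
  float_of_int (List.fold_left ( + ) 0 bits) /. float_of_int (List.length bits)

let () =
  let bits = pwm_period ~period:8 0.25 in
  List.iter print_int bits;
  print_newline ();
  Printf.printf "duty cycle: %.2f\n" (duty_cycle bits)
```

A longer period gives a finer approximation of the analog level, at the cost of needing a faster clock to keep the same audio sample rate.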
That modulation is done on the hardware side of the RPi4, concretely on an RPi4 peripheral also called PWM. Peripherals are RPi4 hardware devices that are mapped to specific address ranges in the RPi4's memory. You communicate with them by writing to or reading from those locations in memory. The address range of each peripheral is structured into registers. One example of a register of the PWM is the PWM FIFO, i.e., the hardware queue that stores the data flowing from the program to the jack port.
Our jack port driver does two things--both by writing to and reading from the right places in the PWM memory range.
- It can initialise the RPi4 for jack port communication (e.g., it sets the RPi4's clock to the correct frequency at which the port reads data from the FIFO, and it configures the correct modes to ensure the right data flow).
- It can send music to the jack port (by writing data to the FIFO--without overflowing it).
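The second task, feeding the FIFO without overflowing it, can be illustrated with a software stand-in. This sketch does not touch any real register: the fifo type, its capacity, and the drain_one function (which plays the role of the hardware consuming a sample) are assumptions made for the example, not details of the actual driver.

```ocaml
(* A software stand-in for a bounded hardware FIFO. The capacity and the
   names below are hypothetical, not taken from the RPi4 documentation. *)
type fifo = { capacity : int; queue : int Queue.t }

let create capacity = { capacity; queue = Queue.create () }

let is_full fifo = Queue.length fifo.queue >= fifo.capacity

(* Push samples until the FIFO is full; return the samples that did not
   fit, so the caller can retry once some have been consumed. *)
let rec write fifo samples =
  match samples with
  | sample :: rest when not (is_full fifo) ->
      Queue.push sample fifo.queue;
      write fifo rest
  | remaining -> remaining

(* Simulate the hardware consuming one sample from the FIFO. *)
let drain_one fifo = Queue.take_opt fifo.queue

let () =
  let fifo = create 4 in
  let left = write fifo [ 10; 20; 30; 40; 50; 60 ] in
  Printf.printf "%d samples still waiting\n" (List.length left)
```

A real driver would check a status register instead of a queue length and wait until there is room again, but the shape of the loop is similar: write while there is space, keep the rest for later.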
To use the new driver, we convert music into the right binary format by simply using ffmpeg. Then we write a program with that music in-memory using the MirageOS tool ocaml-crunch. That program just calls the driver to do the rest and is compiled for the RPi4 target with
This work is strongly related to MirageOS in three ways. First, the program playing music bare-metal on the RPi4 is a unikernel written in OCaml. Second, the program is compiled with gilbraltar, which forms part of the MirageOS ecosystem and whose design and implementation are based on core tools in the MirageOS ecosystem, such as Solo5 and ocaml-solo5. Third, by adding one layer of abstraction to the jack port driver, we can make it a MirageOS "device," so one could use the driver while also leveraging other MirageOS features that work bare-metal on an RPi4.
One of the MirageOS goals is to be able to self-host our infrastructure. At the retreat, many tools we used were based on the MirageOS ecosystem: a DNS resolver (mirage/ocaml-dns), an opam repository cache (robur/opam-mirror), and a portable file transfer application (dinosaure/bob). It's not a surprise that the official website, mirage.io, is a unikernel itself. However, in the past six months, we experienced two website crashes due to Out_of_memory exceptions. The unikernel is configured to run with 1 GB of RAM, so that's a slow-running memory leak that requires investigation.
The question is how to investigate such a leak.
The initial attempt consisted of tracing memory allocations using statmemprof while bombarding the server with requests using benchmarking tools such as ApacheBench.
statmemprof is an implementation of Statistical Memory Profiling in the OCaml runtime. It enables sampling allocations at a fixed rate and tracing values until they are garbage collected. Using memtrace-viewer, one can analyse the memory usage and see which values are still live when the program goes out of memory, for example. For a unikernel with network access, it's possible to add an endpoint to enable tracing on demand: roburio/memtrace-mirage.
Unfortunately, this setup didn't help us identify the leak. Indeed, we can still expect the server to work fine under normal conditions. Somehow we need to understand which rare event, leaking a small bit of memory at a time, is happening enough times to consume all available memory.
Only the Real™ Internet would tell us the answer, so we monitored the live unikernel application. roburio/mirage-monitoring was of great help, as it enables two things:
- Reporting application-wide metrics to an InfluxDB endpoint, which can be displayed using Grafana
- Changing log levels and metrics sources at runtime
Adding mirage-monitoring to a unikernel was surprisingly easy. It was only a matter of updating the configuration file with some functoria voodoo: https://github.com/mirage/mirage-www/pull/767. At some point, it will be upstreamed in the mirage tool so that adding monitoring is a single-line job. The hard part was providing the unikernel with two network stacks, exposing one to the internet while keeping the other for internal use only.
Next, we set up a typical Grafana deployment using InfluxDB/Telegraf for the metrics input and data storage. Logs were displayed using
Now we can see the numbers for the live website. Memory usage, indeed, but also other metrics were included by default, such as the number of established connections in the TCP stack. There we found the source of the leak. Throughout the day, the number of established TCP connections kept increasing.
Finally, at runtime, we temporarily changed the TCP stack's log level to debug, monitored the logs, and waited for the moment when the number of established TCP connections would increase without decreasing afterwards. These logs described what was going on in the TCP stack at the exact moment the connection leak happened. At this point, we figured out that it occurred when a client connected to the server but failed to perform the TLS handshake, so the server dropped the connection without closing it--hence leaking it forever.
Here we go: one less leak.
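The shape of this bug can be reproduced in a toy model. The sketch below is not the actual fix and does not use the real TCP or TLS libraries: the connection record, the counter, and a handshake that fails for odd connection ids are all invented for the illustration. It contrasts a handler that forgets to close on failure with one that closes on every path.

```ocaml
(* Number of connections the simulated stack considers established. *)
let established = ref 0

type conn = { id : int; mutable closed : bool }

let accept id =
  incr established;
  { id; closed = false }

let close conn =
  if not conn.closed then begin
    conn.closed <- true;
    decr established
  end

exception Handshake_failed

(* Hypothetical handshake: fails for odd connection ids. *)
let handshake conn = if conn.id mod 2 = 1 then raise Handshake_failed

(* Leaky version: after a failed handshake the connection is dropped
   without being closed, so the counter never goes back down. *)
let serve_leaky id =
  let conn = accept id in
  match handshake conn with
  | () -> close conn
  | exception Handshake_failed -> ()

(* Fixed version: [Fun.protect] runs [close] whether the handshake
   succeeds or raises. *)
let serve id =
  let conn = accept id in
  try Fun.protect ~finally:(fun () -> close conn) (fun () -> handshake conn)
  with Handshake_failed -> ()

let () =
  List.iter serve_leaky [ 1; 2; 3; 4 ];
  Printf.printf "leaky handler: %d connections left open\n" !established
```

In the toy model, the leaky handler leaves one connection open per failed handshake, which matches the slowly growing "established connections" metric described above.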
Matching logs and metrics to inspect them together has proven to be very useful. We used Grafana for metrics, so the next step would be to also provide logs there, because Grafana supports structured logging through the Loki log aggregation system.
One way to stop web trackers, advertisements, and malware is to block access to sites known to contain such things. A popular approach is through browser extensions like AdBlock and Privacy Badger. Another approach known as a DNS sinkhole involves installing a local DNS server that resolves bad domains to an invalid IP address. This approach has the advantage of working across different operating systems, browsers, and devices (laptops, smartphones, smart-TVs, etc.). For an added bonus, it can also save network bandwidth.
Another project initiated during this year's retreat was to implement Mirage-hole: a DNS sinkhole running as a Mirage Unikernel. It was inspired by Pi-hole for the Raspberry Pi. Starting from a DNS-stub example from dnsvizor (and after a bit of network debugging), we got a unikernel running that would block a single selected domain. We then extended this to fetch and parse a blocklist at start-up. Next, we worked on integrating a little webserver to serve statistics about the requested and blocked domains. Overall, the project was a nice opportunity to talk to and learn from several MirageOS contributors, and it served as a nice tour-de-force of several MirageOS networking libraries.
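The core of a DNS sinkhole, parsing a blocklist and deciding what to answer, fits in a short OCaml sketch. This is not Mirage-hole's code: it assumes a hosts-style blocklist with space-separated fields, uses only the standard library, and the answer type and function names are made up for the example. The actual DNS handling is left out entirely.

```ocaml
module Domains = Set.Make (String)

(* Parse a hosts-style blocklist: lines such as "0.0.0.0 ads.example",
   with '#' comments and blank lines ignored. Only space-separated
   fields are handled in this sketch. *)
let parse_blocklist text =
  String.split_on_char '\n' text
  |> List.fold_left
       (fun acc line ->
         (* Drop everything after a '#'. *)
         let line =
           match String.index_opt line '#' with
           | Some i -> String.sub line 0 i
           | None -> line
         in
         match
           String.split_on_char ' ' (String.trim line)
           |> List.filter (fun field -> field <> "")
         with
         | [ _ip; domain ] -> Domains.add (String.lowercase_ascii domain) acc
         | _ -> acc)
       Domains.empty

(* Blocked names get an unroutable address; everything else would be
   forwarded to the upstream resolver. *)
type answer = Sinkhole of string | Forward

let resolve blocklist name =
  if Domains.mem (String.lowercase_ascii name) blocklist then Sinkhole "0.0.0.0"
  else Forward

let () =
  let blocklist = parse_blocklist "# ads\n0.0.0.0 ads.example\n" in
  match resolve blocklist "ads.example" with
  | Sinkhole ip -> Printf.printf "blocked, answering %s\n" ip
  | Forward -> print_endline "forwarded upstream"
```

Keeping the blocklist in a set makes each lookup cheap even for large lists, and counting the Sinkhole and Forward answers would be one simple way to feed a statistics page.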
Tarides Map is a project intended to show the geographic distribution of all Tarides collaborators as a website. At the retreat, we explored deploying the site in a unikernel. To do this, we had to decide how to serve the files on the server and integrate them into a unikernel. We had two options: ocaml-crunch or Docteur.
We initially used Docteur, inspired by a different project called Pasteur, which uses Docteur and is deployed in a unikernel as a static site. That was exactly what we were aiming to do with Tarides Map. However, integrating Docteur into the project proved to be more difficult than we had expected. One reason was that Solo5 isn't currently supported on macOS, the operating system used to write the project at the retreat. After compiling to Unix instead and numerous hours of debugging, we were eventually able to generate the disk image; however, we still had issues deploying it in a unikernel, so we decided to try using ocaml-crunch instead.
ocaml-crunch proved to be a more straightforward option. We merely had to move some files around so that the directory structure could be turned into a standalone OCaml module to serve the file contents without requiring an external filesystem to be present. After doing this, we were successfully able to deploy the site here.
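To illustrate what "turning a directory into an OCaml module" means, here is a hand-written stand-in for that kind of module, together with a tiny request handler. It is a sketch only: the file names, their contents, and the read function's signature are made up and do not reflect the interface of the code that ocaml-crunch actually generates.

```ocaml
(* Hand-written stand-in for a module embedding static files: the file
   contents are string constants compiled into the binary. The names
   and the [read] signature are illustrative only. *)
module Static_files = struct
  let files =
    [ ("index.html", "<h1>Tarides Map</h1>");
      ("style.css", "body { margin: 0 }") ]

  let read path = List.assoc_opt path files
end

(* Map a request path to an HTTP status code and a body, without
   touching any filesystem. *)
let respond path =
  let path = if path = "/" then "index.html" else path in
  (* Strip a leading slash so that "/style.css" finds "style.css". *)
  let key =
    if String.length path > 0 && path.[0] = '/' then
      String.sub path 1 (String.length path - 1)
    else path
  in
  match Static_files.read key with
  | Some body -> (200, body)
  | None -> (404, "not found")

let () =
  let status, _body = respond "/" in
  Printf.printf "GET / -> %d\n" status
```

Because the files live inside the program, the unikernel needs no block device or disk image to serve them, which is what made this route simpler for a small static site.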
Another very interesting part of the retreat was the dreaming sessions organized by Hannes. The central idea behind this exercise was to allow ourselves to dream about how we envision the MirageOS project in the future, no matter how intangible and seemingly unrealistic. We talked about those dreams in two sessions.
The initial session revolved around gathering these dreams and ideas, without discussing how to achieve them, and letting our minds go free with what we wanted to accomplish with MirageOS. Oftentimes, those dreams would be shared with other participants. Some dreamed about replacing their whole software infrastructure with MirageOS, if not their main operating system! Others dreamed of artistic applications for Mirage, like using it as a backbone for musical endeavors.
The subsequent session revolved around how we could reach those dreams. This facilitated a more practical discussion around the challenges we may face along the way. Interestingly enough, in some instances, it turned out some dreams were either already achieved (like reverse-debugging Solo5!) or were close to being achievable.
A beautiful example of the attendees' dedication is that it did not take long for some to start working on projects like MirageOS-OS, a hypervisor for MirageOS unikernels that is itself written with MirageOS, or to successfully implement a jack port driver for the Raspberry Pi 4, bringing us closer to MirageOS-powered synthesisers and to MirageOS MIDI interfaces!
As mentioned above, the retreat was extremely inspiring, even with respect to topics less related to MirageOS than the ones mentioned here. The one we're most proud of is our Mirleft MirageOS EP that contains five tracks (five in the spirit of Solo5 and OCaml 5.0, of course). Its genre might be best described as OCamlwave! On our EP, you will find many musical oddities ranging from a drum solo recorded on the premises (with glasses, clothes racks, and flip-flops) to a cat-powered cover of Mr Sandman (as an homage to our time singing to Morocco's many, many cute cats) to the occasional dramatic rendition of controversial pull requests on the OCaml compiler.
However, not everything in Mirleft was about music and animals. Some things were also about the beautiful waves.
Mirleft is a paradise for surfing, both for beginners and advanced surfers! We went to a nice sandy beach with perfect conditions to get started with surfing. Advanced surfers would probably go to one of the reefs for surfing, which we, in turn, found amazing for a peaceful walk, sometimes with and sometimes without company from a street dog we christened
And, well, talking about null (apart from naming street animals), we also had plenty of other computer-science-related conversations at the retreat. All of them were extremely enriching! A couple of examples include exception backtracing in Lwt programs and BGP intrinsics.
As this lengthy report can attest, our experience was an amazing one for all. The MirageOS Hack Retreats are always an otherworldly space, where amazing individuals gather to exchange thoughts and create new (and better) software. Friends are made along the way, some bugs are fixed, new ones are found, and great new ideas emerge.
This very special sense of community is rare, so we would like to thank everyone who organized, attended, and tended to the event. Thank you to our delightful hosts, who've been with us since the first retreat in 2016! Thank you as well to Hannes and Robur for organizing those retreats and spending time instilling the same inspiration in the great project that is MirageOS! Finally, thank you to old and new friends, as well as old and new MirageOS hackers, for this amazing week of happy banter and hacking!
PS: Some of the pics in this post were shared among us via bob, a MirageOS unikernel to share files.