Irmin on MirageOS: Introducing the Notafs File System
We are pleased to announce one (or two) new filesystems for MirageOS! The motivation behind creating them is an exciting new use case requiring the system to store data on disk.
MirageOS allows you to compile OCaml applications into unikernels. By selecting the operating system functionalities required, a unikernel can be constructed for your application using only the necessary components. This process reduces the attack surface and increases the hardware efficiency of your final application. MirageOS unikernels can be deployed to various cloud and mobile platforms.
As a case in point, Tarides is developing SpaceOS to run unikernels on satellites, an industry where both security and performance are critical. While MirageOS has a long history of cloud usage and comes with advanced network capabilities, SpaceOS provides an interesting use case for disk storage. A possible use case for satellites is to take pictures from space and send them to Earth for analysis, and since satellites may not be able to send the pictures right away (due to their location, for example), storing high-resolution images on a disk is a must.
So, naturally, the question we asked ourselves was: which MirageOS file system could we use for SpaceOS? That started us on the journey to a new file system, Notafs, which even lets you use Irmin on top!
Why Create a New Filesystem?
At the start, there were a couple of existing filesystems available for MirageOS that we needed to evaluate:
- The Chamelon filesystem is designed to efficiently support a large number of very small files on a disk.
- The Docteur filesystem, which provides read-only file compression.
- The FAT filesystem, which provides support for the (old) standard FAT16 filesystem with restrictions on file sizes.
- The TAR file format, which has the limitation that users can only add new files: deletion or modification of existing files is not supported.
All of the above libraries are well-tested and highly recommended if they fit your application needs! But, as we will illustrate in part 2 with our benchmarks, none of these systems satisfied the SpaceOS requirement to store large files.
Furthermore, using a conventional file system, like those available on traditional operating systems such as Linux and Windows, was not an option either, since they are closely tied to large operating system functionalities. To ensure higher security, MirageOS libraries are written in OCaml, which provides strong memory-safety guarantees. While this is a fantastic design choice for the future of computing, it makes it harder to support a conventional filesystem (programmed in C/C++) natively on MirageOS. A unikernel built on such a filesystem would lose the appeal of a small software stack with high-security guarantees. In comparison, the Notafs solution fits into four thousand lines, which is a much more ‘human-sized’ project to review.
The Challenges With Filesystems
Implementing a filesystem is a complex task requiring compromises, which explains the lack of options for storing large files in the SpaceOS project. Even though file systems may appear simple, with an interface that everyone is familiar with, their critical function is to protect applications from (recoverable) hardware defects. Unless the disk dies, the user should never lose or see corrupted data, even if a power outage were to interrupt the filesystem in the middle of an operation. In other words, a filesystem update should have transactional semantics: either the operation succeeds, or it does not, but applications should never observe an in-between broken state. Without this property, software built on top of the file system would be vulnerable to faults.
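To make the idea of transactional semantics concrete, here is a minimal sketch in OCaml. It is a toy model, not how Notafs is implemented: the `disk` record, the `Power_loss` exception, and the `update` function are all hypothetical names, and the sketch assumes that flipping the single `root` pointer is an atomic step. The new value is first written to a block that nobody reads yet, and only then does the root pointer switch to it, so an interruption leaves the previous value intact.

```ocaml
(* A toy disk: two blocks plus a root pointer naming the block that
   holds the currently committed value. All names are hypothetical. *)
type disk = { blocks : string array; mutable root : int }

(* Simulates a power outage in the middle of an update. *)
exception Power_loss

(* Readers only ever follow the root pointer. *)
let read disk = disk.blocks.(disk.root)

(* Copy-on-write update: write to the block NOT referenced by [root],
   then commit by flipping [root] (assumed atomic in this sketch). *)
let update ?(crash_before_commit = false) disk value =
  let spare = 1 - disk.root in
  disk.blocks.(spare) <- value;                   (* step 1: write aside *)
  if crash_before_commit then raise Power_loss;   (* simulated outage *)
  disk.root <- spare                              (* step 2: commit *)

let () =
  let disk = { blocks = [| "v1"; "" |]; root = 0 } in
  (* A successful update becomes visible as a whole. *)
  update disk "v2";
  assert (read disk = "v2");
  (* An interrupted update is never observed: the old value survives. *)
  (try update ~crash_before_commit:true disk "v3" with Power_loss -> ());
  assert (read disk = "v2")
```

A real filesystem additionally has to cope with partially written blocks and with a root that spans more than one atomic write, which is where checksums and similar techniques usually come in.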
Tarides is well aware of the challenges involved with implementing a filesystem – we maintain the Irmin database, which provides a hierarchical key-value store with high data consistency guarantees, git-inspired history for rollbacks, and distributed replication over the network. While the Irmin API resembles a filesystem, it is a full-blown database and provides many more functionalities. So rather than starting a new filesystem implementation from scratch, we asked ourselves whether we could reuse the Irmin database as a filesystem for MirageOS.
Irmin as a Filesystem?
Using Irmin for this purpose is not a new idea, and Irmin already provides multiple backends that allow users to run its databases on various platforms with different constraints. The irmin-pack backend was of particular interest for MirageOS support. We have spent years optimising its performance since this backend is notably used to store the Tezos blockchain, and it enables the freeing of disk space by truncating the database history to get rid of unnecessary old backups. Finally, the low-level implementation of irmin-pack is especially suited for a port to MirageOS, as it only requires support for a few large (append-only) files from the operating system.
This last technical requirement was especially relevant to the SpaceOS use case, where the satellite needs to be able to store high-resolution pictures on disk. If our custom-made file system supported large files, then we would be able to support SpaceOS and enable the general-purpose use of Irmin for MirageOS unikernels. This realisation still left us with the task of developing the foundations of a file system, but even just the bare functionalities would satisfy our use cases.
Notafs is Born
We called this new file system ‘Notafs’, which, as the name suggests, is not a general-purpose file system: it is designed to handle a small number of large files on Mirage block devices. It can, however, be used to run the irmin-pack backend, which gives users all the benefits of an Irmin filesystem for MirageOS. Running irmin-pack on Notafs lifts the limitations of the latter: the combination supports many file names, is optimised for both small and large files, and includes a git-like history with branching and merging, just to name a few features!
Navigating design restrictions is an excellent way to focus the efforts of a project. In our case, that meant directing them towards the correctness and performance of the few selected operations of Notafs. We didn’t need to implement advanced filesystem operations since missing functionalities could be provided by irmin-pack, including optimised management of many small files. As long as our underlying file system could provide fast operations on large files, the rest was taken care of!
From a unikernel developer’s perspective, Notafs provides an implementation of the Mirage_kv interfaces. This is the standard API for filesystem usage on MirageOS, so any existing unikernel can use it without needing to change its application code. We also provide a command-line interface to simplify the formatting of disks and to allow for external inspection of the unikernel’s file system contents. Check out this example of a minimal setup of Notafs in MirageOS.
While Notafs has restricted functionalities, it provides a nice developer experience and a useful alternative in the file system design space for MirageOS. We are especially proud of the safety guarantees that Notafs provides for your data stored on disk.
Using Irmin on Notafs
To enable users to run Irmin on MirageOS, we provide an implementation of the syscalls required by the irmin-pack backend on top of Notafs. This was straightforward as Notafs provides the functionalities required to run irmin-pack. Thanks to OCaml functors, the irmin-pack codebase was already structured for alternative syscall implementations.
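As a rough illustration of how functors make such a swap possible, the sketch below writes a tiny store once against a signature for append-only files and then instantiates it with an in-memory backend. Everything here is hypothetical and simplified: `APPEND_ONLY`, `Mem`, and `Make_store` are names invented for this example and do not correspond to the actual module types used by irmin-pack or Notafs.

```ocaml
(* Hypothetical signature: the few operations a storage layer needs
   from an append-only file. *)
module type APPEND_ONLY = sig
  type t
  val create : unit -> t
  val append : t -> string -> int        (* returns the write offset *)
  val read : t -> off:int -> len:int -> string
  val length : t -> int
end

(* One possible backend: an in-memory buffer. A disk-backed module
   with the same signature could be substituted without other changes. *)
module Mem : APPEND_ONLY = struct
  type t = Buffer.t
  let create () = Buffer.create 64
  let append t s =
    let off = Buffer.length t in
    Buffer.add_string t s;
    off
  let read t ~off ~len = Buffer.sub t off len
  let length = Buffer.length
end

(* The store logic is written once, against the signature only. *)
module Make_store (F : APPEND_ONLY) = struct
  type t = { file : F.t; index : (string, int * int) Hashtbl.t }

  let create () = { file = F.create (); index = Hashtbl.create 16 }

  (* Never overwrite: append the value and remember where it landed. *)
  let set t key value =
    let off = F.append t.file value in
    Hashtbl.replace t.index key (off, String.length value)

  let get t key =
    match Hashtbl.find_opt t.index key with
    | None -> None
    | Some (off, len) -> Some (F.read t.file ~off ~len)
end

module Store = Make_store (Mem)

let () =
  let s = Store.create () in
  Store.set s "photo" "first";
  Store.set s "photo" "second";
  assert (Store.get s "photo" = Some "second")
```

Because `Make_store` only sees the signature, targeting a different platform comes down to supplying another module that satisfies `APPEND_ONLY`.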
One OCaml subtlety we encountered was handling asynchronous I/O: MirageOS and Notafs use the Lwt monad, while the Irmin syscalls expected a direct-style I/O implementation. This would have been an issue in the past, but we could bridge that gap using the effect handlers that came with OCaml 5. Note that OCaml 5 support is still experimental on MirageOS at the time of writing!
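The general shape of such a bridge can be sketched with the `Effect` module from the OCaml 5 standard library. This is a toy model under stated assumptions, not the actual Notafs code: it does not use Lwt, and `async_read`, `Await_read`, `read`, and `run` are hypothetical names. A callback queue stands in for the asynchronous scheduler. Direct-style code performs an effect, the handler parks its continuation as a callback, and a small loop resumes it once the "I/O" completes.

```ocaml
open Effect
open Effect.Deep

(* Toy asynchronous layer: callbacks are queued and run later by the
   loop in [run]. It stands in for a real scheduler such as Lwt. *)
let pending : (unit -> unit) Queue.t = Queue.create ()

let async_read (i : int) (callback : string -> unit) : unit =
  Queue.push (fun () -> callback (Printf.sprintf "block-%d" i)) pending

(* The effect performed by direct-style code that wants a block. *)
type _ Effect.t += Await_read : int -> string Effect.t

(* Direct-style API: looks synchronous to its caller. *)
let read (i : int) : string = perform (Await_read i)

(* Run direct-style code on top of the callback-based layer. *)
let run (f : unit -> 'a) : 'a =
  let result = ref None in
  match_with f ()
    { retc = (fun v -> result := Some v);
      exnc = raise;
      effc = (fun (type b) (eff : b Effect.t) ->
        match eff with
        | Await_read i ->
          (* Suspend the caller; resume it when the read completes. *)
          Some (fun (k : (b, unit) continuation) ->
            async_read i (fun data -> continue k data))
        | _ -> None) };
  (* Minimal event loop: drain callbacks until none are left. *)
  while not (Queue.is_empty pending) do
    (Queue.pop pending) ()
  done;
  match !result with
  | Some v -> v
  | None -> failwith "computation did not finish"

let () =
  let v =
    run (fun () ->
      let a = read 0 in
      let b = read 1 in
      a ^ "," ^ b)
  in
  assert (v = "block-0,block-1")
```

The code inside `run` never mentions callbacks or promises, which is the property that lets a direct-style codebase sit on top of an asynchronous one.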
From a user perspective, using Irmin addresses the design limitations of Notafs (while keeping its desirable consistency properties) by adding efficient support for managing a large number of small files. Irmin is a much more general-purpose file system, on top of which a user can build a unikernel application with many additional features. We’re hopeful that applications already using Irmin will consider this new bridge for targeting MirageOS unikernels.
Until Next Time!
Thank you for checking out the first part of this two-part series where we’ve introduced Notafs, the motivation behind the file system, the challenges, and its use cases. Look out for part two where we delve into the details behind Notafs’ design alongside benchmarks and visualisations of the file system in action.
Connect with us online on X, Mastodon, Threads, and LinkedIn to stay updated on our latest projects. We also invite you to consult the open-source Notafs repository for more information or to test out the file system yourself!