
Expanding Dune Package Management to the Rest of the Ecosystem

Since we published The Dune Developer Preview, a lot of things have improved on the package management front. While the developer preview has demonstrated how Dune can manage dependencies in a unified workflow, we have been working on making it practical for more projects to adopt Dune to handle their package dependencies. Our goal is to slowly move from a developer preview to a mature feature that the general public can use and rely on.
What do we mean by maturation? The goal is fuzzy (as with all software, it is never 'done'), but we want to get Dune package management into a shape where we can consistently recommend that people use it for their projects. They should be confident that their workflows will continue to work while unlocking the new features that Dune package management brings.
The core points of this are:
- The OCaml Platform Tools should work at least as well with Dune package management as they work with opam. With the new features in Dune, this interoperability should work even better as users do not have to share dependencies with the project in the local switch since tools can be installed automatically, possibly even from precompiled binaries. Do you want MDX? Declare a dependency, and voila, you have MDX.
- Most projects can start using Dune with little to no adjustments. The majority will work out of the box, and the most frequent fix required is to correct the list of project dependencies. No substantial code changes are necessary, and all projects should continue to be compatible with both opam and dune; there is no lock-in to one tool or the other.
Our goal is to successfully build as many projects as possible using Dune's package management feature. But to evaluate what we have left to do, we need to know where we stand now. This blog post will give you an overview of the project's scope and biggest challenges.
Building "All" Packages
What if we want to try to build all the existing OCaml packages? Opam-repository to the rescue! While it might not include proprietary code bases, there are still a significant number of projects we can try to build with it. Fortunately, there has already been prior work done on this subject. Opam-health-check is an existing tool mostly written by Kate that can determine whether packages can be installed on different historical, current, and future OCaml versions. It continuously monitors the state of the opam ecosystem, which inspired its name.
Tarides is running and maintaining multiple opam-health-check instances for the community. The best known are check.ci.ocaml.org, which regularly builds thousands of opam packages on Linux; freebsd.check.ci.dev, which does the same thing on FreeBSD; and windows.check.ci.dev, which, as the name implies, builds packages on Windows to help us with the effort to deliver a better OCaml experience on Windows.
We were wondering whether we could use the tool when building with Dune instead of opam. Fortunately, the software is free, so we could extend the functionality to build Dune projects instead of installing opam packages. This gave rise to the next instance of opam-health-check, dune.check.ci.dev, which builds packages using Dune package management instead of opam.
Which Packages are we Building, Actually?
Wer misst, misst Mist ("Whoever measures, measures rubbish"). – German proverb
Opam takes its installation instructions from the opam metadata files that are collected in opam repositories like opam-repository. This is how the regular opam health check works: it selects (nearly all) packages and attempts to build them.
However, only projects that already use Dune to build can use package management. This happens because, when building a project, you need to know which dependencies to build, where these dependencies get built and installed, and which paths to pass to the compiler so it can find the modules that the dependencies install. Unlike in opam, the packages don't get installed into a location containing all installed libraries (a switch), but into separate directories that will be composed together when building.
That means we need to be a bit more selective about which packages we are going to pick for testing. Picking projects that don't use Dune will fail in 100% of the cases and will not let us draw useful conclusions besides telling us that you need Dune projects to use Dune package management, which we already know.
So, when determining which packages we want to include as our candidates, we need to filter the list of packages to ones that use Dune. The opam-health-check tool expects to call a shell command to generate the list. However, the process of determining which packages count as 'are using Dune' is more complicated, since the best way to determine that would be to detect whether dune build is used in a package and whether the package depends on the dune package.
It's a bit fuzzy, but we decided to only include packages that depend on the 'dune' package. This leaves us with a few false positives (e.g. packages that don't support the most recent versions of Dune) and also some false negatives (packages that use Dune but only pick up the 'dune' dependency through their own dependencies), so this will probably need a bit of revision in the future, but for now, it should be good enough.
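To make the trade-off concrete, here is a minimal sketch of such a filter. It is not the command used on dune.check.ci.dev; the package names, the dictionary layout, and the helper names are all hypothetical and only model the heuristic described above.

```python
# Hypothetical package metadata: direct dependencies and build commands.
# None of these names refer to real opam packages.
PACKAGES = {
    "uses-dune": {"depends": ["ocaml", "dune"], "build": ["dune build -p uses-dune"]},
    "make-based": {"depends": ["ocaml"], "build": ["make"]},
    "transitive-dune": {"depends": ["ocaml", "some-lib"],
                        "build": ["dune build -p transitive-dune"]},
}


def depends_on_dune(meta):
    """The chosen heuristic: a direct dependency on the 'dune' package."""
    return "dune" in meta["depends"]


def candidates(packages):
    """Packages selected for the health check."""
    return sorted(name for name, meta in packages.items() if depends_on_dune(meta))


def missed(packages):
    """False negatives: built with dune, but 'dune' is not a direct dependency."""
    return sorted(
        name
        for name, meta in packages.items()
        if not depends_on_dune(meta)
        and any(cmd.startswith("dune build") for cmd in meta["build"])
    )


print(candidates(PACKAGES))
print(missed(PACKAGES))
```

In this toy data set, the package that calls dune build without declaring 'dune' itself is exactly the kind of false negative the heuristic cannot see.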
What About the Rest?
A significant number of projects use Dune, but far from all of them do. While we can't build the others directly because every build system works differently, all opam packages can be used as dependencies and should just work.
How do we know this? We ran different kinds of tests before, using an internal tool that is quite similar to, but less sophisticated than, opam-health-check. In a previous run on OCaml 4.14, we tested using an opam package as a dependency, attempting to build a project, and then checking the results. For that test, we selected 2505 opam packages (since they were compatible with 4.14, opam install could find a solution) and ran it over a few days. Ultimately, we only had 36 failures; thus, our success rate was a whopping 98%! This means that users can safely start using Dune for package management in their projects as the overwhelming majority of dependencies are compatible.
What is Building a Package, Really?
The biggest challenge is that much of the package metadata in the source archives is incorrect. As a result, dune pkg lock almost certainly picks invalid versions of dependencies. Why is that?
Dependencies Galore
Opam installs packages by inspecting the files in its own metadata repository, opam-repository. This repository is created by authors submitting their packages on release, and from there on, it is maintained by the opam-repository maintainers. They will make sure to add dependencies that have been accidentally left out or adjust when new, incompatible versions of dependencies get published. Older package definitions will be updated to include upper version constraints.
However, if we check out a repository via git or download the source archive and try to build it with Dune, we don't have all these updates. Without them, many packages will fail to build (be it with opam or Dune).
These issues can often be fixed very easily by the author of the package, and having Dune fail to build packages due to invalid dependencies is very disappointing. If the dependencies were to be fixed, the project would either work just fine with Dune package management (success, hooray!) or at least fail with a more interesting error. Marking it as a dependency failure does us a disservice by hiding potential errors.
Our hack to test for Dune package management compatibility rather than accurate dependency declarations was to replace the dependencies from the source archive with information from opam-repository. This was a two-step process:
- Overwriting the opam files with the opam files from opam-repository.
- Removing the dependency information from dune-project because Dune prioritises the information in this file by default.
Step two had an additional challenge as the dune-project file is in S-expression syntax, but the usual helpful processing tools like jq do not support S-expressions. So, we used Jane Street's sexp tool to do the processing, along with a generous helping of common Unix shell tools.
This is not to say that users should be migrating their dependency specifications out of dune-project (they shouldn't), but for our automated processing it was easier to take the updated opam files and use them as-is, instead of migrating them back into the dune-project syntax.
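As an illustration of what step two amounts to, here is a rough sketch that drops dependency fields from the package stanzas of a dune-project file. It is a stand-in written in Python, not the sexp-and-shell pipeline we actually ran. The choice of fields (depends, depopts, conflicts) is an assumption, the example file contents are made up, and the tiny parser ignores comments and does not preserve the original layout.

```python
import re

# One token per match: "(", ")", a double-quoted string, or a bare atom.
TOKEN = re.compile(r'\s*(?:(\()|(\))|("(?:\\.|[^"\\])*")|([^\s()"]+))')

# Assumption: these package fields count as "dependency information".
DEPENDENCY_FIELDS = ("depends", "depopts", "conflicts")


def parse(text):
    """Parse S-expression text into nested lists of strings (no comment support)."""
    text = text.rstrip()
    stack = [[]]
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        pos = match.end()
        if match.group(1):  # "(" opens a new list
            stack.append([])
        elif match.group(2):  # ")" closes the current list
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:  # quoted string or bare atom
            stack[-1].append(match.group(3) or match.group(4))
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def render(node):
    """Turn a parsed node back into single-line S-expression text."""
    if isinstance(node, list):
        return "(" + " ".join(render(child) for child in node) + ")"
    return node


def drop_dependency_fields(stanzas):
    """Remove dependency fields from every top-level (package ...) stanza."""
    cleaned = []
    for stanza in stanzas:
        if isinstance(stanza, list) and stanza[:1] == ["package"]:
            stanza = [
                field
                for field in stanza
                if not (isinstance(field, list) and field
                        and field[0] in DEPENDENCY_FIELDS)
            ]
        cleaned.append(stanza)
    return cleaned


# Made-up dune-project contents for illustration only.
EXAMPLE = """(lang dune 3.17)
(package
 (name demo)
 (synopsis "A demo package")
 (depends
  (ocaml (>= 4.14))
  dune))"""

print("\n".join(render(s) for s in drop_dependency_fields(parse(EXAMPLE))))
```

With the dependency fields gone, the opam files copied over in step one become the only source of dependency information for the build.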
What is a Package, Actually?
When opam builds a package that uses Dune, it usually calls dune build -p <package-name>, which makes Dune ignore everything in the source repository that is not attached to the package name. However, this doesn't work for the health check, as the packages from the same source archive that the tested package depends on need to be built too, not just the one being tested. But you also don't want to build every package from the source archive, as that might introduce additional dependencies and unrelated failures. Likewise, you don't want to build code that is not part of any package (e.g. examples, benchmarks, utilities).
In the end, we solved it by determining the internal dependencies of the project to be built and then collecting these dependencies. We start the build by calling dune build --only-packages <packages-discovered> to restrict the build to only these packages.
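The collection of internal dependencies can be pictured as a small graph traversal. The sketch below is an illustration under assumptions, not the health check's code: the package names are invented, the dependency data would in practice come from the opam files, and the comma-separated argument reflects our understanding of how --only-packages takes a list.

```python
def packages_to_build(target, local_packages):
    """Return the target plus every package from the same source tree it needs.

    local_packages maps each package defined in the source archive to its
    declared dependencies; names missing from the map are external packages.
    """
    selected = set()
    pending = [target]
    while pending:
        name = pending.pop()
        if name in selected or name not in local_packages:
            continue  # already handled, or provided from outside the archive
        selected.add(name)
        pending.extend(local_packages[name])
    return sorted(selected)


def build_command(target, local_packages):
    """Assemble the dune invocation restricted to the discovered packages."""
    names = ",".join(packages_to_build(target, local_packages))
    return ["dune", "build", "--only-packages", names]


# Hypothetical source archive with four packages.
LOCAL = {
    "demo": ["ocaml", "dune"],
    "demo-lwt": ["demo", "lwt"],
    "demo-async": ["demo", "async"],
    "demo-bench": ["demo-lwt", "demo-async"],
}

print(build_command("demo-lwt", LOCAL))
```

Testing demo-lwt in this example builds demo and demo-lwt only, so a failure in demo-async or demo-bench cannot be mistaken for a failure of the package under test.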
Ok, Ok, but Show Me the Results!
The output of these runs is published on dune.check.ci.dev, where we build the candidate packages on Linux amd64 using the Dune developer preview binaries. We chose this platform because it will give us the biggest set of candidates since most packages are developed on systems similar to it. On the website, you can see all the packages we selected and the result of the build. At the time of writing, we have selected 2243 packages to build and 1866 have completed the build successfully, which gives us an 83% success rate in building projects directly! For the remaining 377 packages, the failures can be seen when clicking the entries since opam-health-check keeps logs of all the builds. It is our main tool to determine which issues to tackle next. So as we go forward, we expect the success rate to rise to match opam as closely as possible!
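For readers who want to check the percentages quoted in this post, the arithmetic is short. The helper name below is ours; the numbers are the ones given in the text (2505 packages with 36 failures in the dependency test, and 1866 successes out of 2243 in the direct builds).

```python
def success_rate(total, failed):
    """Percentage of successful builds out of all attempted builds."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - failed) / total


# Earlier test: opam packages used as dependencies on OCaml 4.14.
dependency_test = success_rate(total=2505, failed=36)

# dune.check.ci.dev: projects built directly with Dune package management.
direct_build = success_rate(total=2243, failed=2243 - 1866)

print(f"dependency test: {dependency_test:.1f}%")
print(f"direct builds:   {direct_build:.1f}%")
```

Both results are consistent with the rounded figures of 98% and 83% used in the text.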
Where Do We Go From Here?
Now that we have opam-health-check running and reporting build successes and failures, we can look into the build issues that it has revealed. A lot of them were small stumbling blocks which could have nevertheless been blockers to adoption:
- The potentially simplest issue arose from Dune not supporting packages distributed in ZIP archives. Due to OCaml's strong origins on Unix, most packages are distributed as compressed tarballs (often gzip or bzip2 compressed). However, especially on Windows, the ZIP format is more popular and is also supported in opam. In #11511, we added Dune support for uncompressing ZIP files. We usually call programs to decompress the data to avoid shipping implementations of compression algorithms. However, to use these programs, they need to be available, and what is available depends on the platform. The simplest command to call is unzip from the Info-ZIP project. Still, on some platforms, the tar command also supports decompressing ZIP files as if they were tarballs, so we're trying to use whatever the user might have available.
- When pinning a package, we assume it uses Dune. This works most of the time because a significant number of packages use Dune to build, but if a package does not, we will have to build and install it using the commands that it declares in its opam file. #11513 does just that. It extracts the commands when pinned and uses them when the pinned package needs to be built.
- A somewhat obscure semantic of the way dependencies and conflicts are represented in opam files is that packages which are dependencies are implicitly conjunctions (depending on foo, bar means depending on foo AND bar); however, for conflicts, they are implicitly disjunctions (conflicting with foo and bar means to conflict with foo OR bar). This makes a lot of sense intuitively but is easily forgotten. Dune used to accept a conflict only if all packages were conflicting, and this behaviour flew under the radar for a long time because conflicts are rare. Most of the time, the conflict is only a single package, in which case it doesn't make a difference. This was fixed in #11515, which also simplified the code.
- When solving a project's dependencies, the solver has to go through all of them and find a solution that satisfies all constraints, or it will display an error. These constraints are usually declared in your dune-project or .opam files, but when using Dune package management, there is an additional constraint: the solution needs to be buildable with the currently running version of Dune. Unfortunately, when that constraint could not be satisfied, the solver would crash. In #11554, we solved the issue to some degree: instead of crashing, the solver will display an error message, which will hopefully make it clearer why it can't find a solution.
- Opam has a little-known but very useful feature when declaring package dependencies. Instead of depending on a specific version, the user can use the current version of the package as a variable. This allows projects that consist of multiple packages to depend on each other without having to update all dependencies on every release (an example of this is ocaml-zmq, which comes with async and lwt variants which depend on a common core). However, these constraints don't matter much when building the packages, so we always set the version to dev. Unfortunately, this can cause subtle issues where no solution can be found, so in #11517, the code was changed to attempt to read the version fields to populate the variable with the value the user declared.
- At the moment, Dune handles the compiler in a special way. When attempting to build the compiler, instead of building it in the project, it will build it in a separate location in the user's home directory. This is due to the fact that the compiler can't be moved to a different location at the moment (work is underway to improve the situation; that effort is called "relocatable OCaml"). How OCaml 5.3.0 is packaged in opam-repository changed and introduced a new transitive dependency for the compiler. Thus, the code would not be able to properly detect which opam package is the compiler. This was fixed in #11310 by computing the dependency cone of all possible compiler packages that are currently used to detect which package contains the compiler.
- Opam has a way to mark a package as 'do not pick this package unless requested explicitly': avoid-version. This is, for example, used to mark beta versions of packages that can be installed manually but should not be automatically picked. The solver in Dune does not have such a feature, so originally, Dune sorted these packages to the end of the candidate list, but it would not match the semantics of opam. Dune would then interpret them as forbidden dependencies. However, some older packages failed to build without access to these dependencies, so #11494 was implemented where, instead of failing, the solver tries to minimise the number of dependencies picked that have the avoid-version flag.
- Findlib, the tool whose package specification format is prevalent in the OCaml ecosystem and is also used by Dune, has a feature where parts of packages are installed in subdirectories. These subdirectories can also be optional when certain package features are enabled or disabled during building. It is a rare feature, but some real-world packages use it. Unfortunately, Dune would always assume that these directories existed if they were declared and try to read their contents. But if the directory does not exist (e.g. the feature is disabled), this would lead to a crash. The fix in #11569 is short and shows that all bugs are shallow if enough eyes inspect the code.
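The difference between the conjunctive reading of dependencies and the disjunctive reading of conflicts, mentioned in the list above, is easy to get wrong, so here is a small model of it. This is not Dune's solver code; the function names are hypothetical, and the 'old' variant only mimics the behaviour described for the time before #11515.

```python
def depends_satisfied(depends, installed):
    """Dependencies are a conjunction: every listed package must be present."""
    return all(name in installed for name in depends)


def conflict_triggered(conflicts, installed):
    """Conflicts are a disjunction: any one listed package is enough."""
    return any(name in installed for name in conflicts)


def conflict_triggered_old(conflicts, installed):
    """Model of the earlier behaviour: only a conflict if all are present."""
    return bool(conflicts) and all(name in installed for name in conflicts)


installed = {"foo", "baz"}

# Depending on foo and bar fails because bar is missing.
print(depends_satisfied(["foo", "bar"], installed))

# Conflicting with foo and bar should trigger, since foo is installed.
print(conflict_triggered(["foo", "bar"], installed))

# The old reading missed this conflict because bar is absent.
print(conflict_triggered_old(["foo", "bar"], installed))
```

With a single conflicting package, both readings agree, which matches the observation that the bug went unnoticed for a long time.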
Fixing these issues has gotten us to an (at the time of writing) 83% success rate in building projects according to opam-health-check. That's a pretty good result and makes us confident that the package management feature is on the right track.
The issues above, as well as future issues related to package coverage and their status, are collected in a tracking issue on the Dune bug tracker.
How You Can Help
If you want to help give the OCaml ecosystem a simple one-stop shop for building and installing packages, check out the nightly developer preview and try it with your projects. The team is looking for feedback on how they can improve Dune package management, so please share your thoughts on Discuss, and report any issues on GitHub!
Stay in touch with us on Bluesky, Mastodon, Threads, and LinkedIn. We look forward to hearing from you!