
Opam Health Check: or How we Got to 90+% of Packages Building with Dune Package Management

We have recently posted about the process of enabling Dune to build as many packages as possible. Since then, we've been hard at work, going through the failures and fixing issues as we go along. In today's post, I'll give you an overview of what we have achieved so far, as well as an idea of what is yet to come.
What Has Improved Since Last Time
If you check our tracking issue, you'll notice there are significantly more items there than there were before.
We've improved how dune pkg and the health check handle dependencies, including depexts, and aligned them more closely with how opam does it. There are, however, some intentional differences between how dune pkg and opam do things. Checking the health of opam-repository with both tools has led to fixes that bring dune pkg closer to opam semantics and, in some cases, improve the correctness of the metadata on opam-repository. Below, we'll go through some of these improvements.
- A lot of packages in opam did not declare ocaml as a dependency. In theory, opam is OCaml-agnostic and can install packages written in any language (Topiary is written in Rust, for example). However, in practice, most packages on opam require OCaml, so when a package does not declare a dependency on OCaml, and none of its dependencies capture an OCaml compiler in their dependency cone, then Dune Package Management locks a solution without a compiler. In many cases, this will fail, so many packages have had their metadata updated to include ocaml as a dependency on opam-repository and, where possible, upstream.
- When opam encounters undefined variables, it evaluates them to 'false'. When locking a solution, we translate the build and install instructions into Dune's own variable format. However, in the Dune semantics, unknown variables are not evaluated as false by default. We changed the way we translate variables to wrap the variables with catch_undefined_var in #11512, thus matching the semantics of the original expressions.
- Some packages depend on ez-conf-lib, which is a package that records the place of its executable when it is built. Unfortunately, in the case of Dune package management, that would be a sandbox location, so when other packages attempted to access it, it would not exist at that location anymore. This made the package non-relocatable. In #11598, this was changed to use opam and Dune-provided variables, which are set to the appropriate location when building so that users can find them.
- When packages build, they often need additional dependencies from the operating system: these are called depexts. In Opam-Health-Check, we used opam to install these, but sometimes there was no valid solution, and opam would fail. Unfortunately, the failure displayed an error message, but the process still succeeded with exit code 0. We changed our code to detect the error message in #103 and ended up reporting the issue upstream to opam as #6488.
- When users locked a solution with dune pkg, it would also record the detected depexts. However, differences in how optional packages were handled between opam and dune could lead to not enough packages being installed if we used opam to install depexts. In #104, we changed the logic to use Dune to create the list of depexts and install these in a separate step. This way, there should be no confusion between what opam and Dune consider a dependency.
- While most source archives ship with .opam files, they are technically not required. opam never reads them when installing (since it uses the information from opam-repository), and Dune does not need them as it can read all the required information from dune-project. However, Opam-Health-Check used them to determine which package names existed, so when it encountered packages without .opam files, it assumed there were no packages to build in the source archive. With #97, we read the package names from .opam files and from dune-project to ensure we capture all names.
- When opam builds packages with Dune, for the most part, it uses dune build -p <pkg-name>. The -p flag is a special flag which is mainly used for releasing and implies --release --only-packages <pkg-name>. We couldn't use the same -p flag, as --release itself expanded to a lot of other configuration options, among these --ignore-lock-dir. It meant that if dune pkg lock and then dune build were used, --release would ignore the lock directory. This was implemented so that introducing dune pkg would not break packages in opam-repository that used lock files. However, there aren't many packages in opam-repository that use --release, and building packages with Dune package management in release mode is useful. Dune was patched in #11378 to move --ignore-lock-dir to -p. This allows you to use --release with package management, and #96 was merged to take advantage of it. The use of --release better represents an opam build and enables the building of several key packages, such as base and core, for which --release disables building Jane-Street-internal tests.
- When we looked for which packages to build, we accidentally used a subset search instead of an exact name match. Thus, we would sometimes accidentally pick packages to build that were not meant to be built. This was fixed in #99, ensuring that when determining whether lab should be built, accidentally matching on gitlab would not give us false positives.
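To make the last point concrete, here is a minimal OCaml sketch of the difference between a substring search and an exact name match. It is not the actual Opam-Health-Check code: the helper names (is_substring, select_by_substring, select_exact) and the list of available package names are assumptions chosen purely for illustration.

```ocaml
(* Hypothetical helper: true when [needle] occurs anywhere in [haystack]. *)
let is_substring ~needle haystack =
  let n = String.length needle and h = String.length haystack in
  let rec go i =
    i + n <= h && (String.sub haystack i n = needle || go (i + 1))
  in
  go 0

(* Substring-based selection: the kind of search that caused false positives. *)
let select_by_substring ~wanted available =
  List.filter (fun pkg -> is_substring ~needle:wanted pkg) available

(* Exact-name selection: only a package called exactly [wanted] is kept. *)
let select_exact ~wanted available =
  List.filter (fun pkg -> String.equal wanted pkg) available

(* Illustrative package names only. *)
let available = [ "lab"; "gitlab"; "gitlab-unix" ]

(* The substring search also selects the gitlab packages when asked for lab. *)
let loose = select_by_substring ~wanted:"lab" available

(* The exact match selects lab alone. *)
let strict = select_exact ~wanted:"lab" available
```

With the substring search, asking for lab also picks up every package whose name merely contains it, whereas the exact comparison keeps only lab itself.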
Maybe Some Packages Just Don't Build
It turns out some packages that are on opam-repository just do not build.
This can be due to a lot of reasons. Some packages don't support OCaml 5.3 (the
most recent release at the time of writing and the one we run the checks on),
and others don't support the platform we are running on. Some can't be
downloaded because the server that hosted them disappeared. In such cases,
there is nothing that Dune package management can do besides fail.
Thus, to make a fair comparison, we patched Opam-Health-Check and extended it so it can build the same package with Dune and with opam in the same run. That way, we can see that if a package doesn't build with opam, it is unlikely to magically work when using Dune package management (although that can happen, e.g. when a transient network failure prevents opam from downloading the source tarball).
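The side-by-side comparison can be pictured with a small OCaml sketch. The types and function names below are assumptions made for this illustration, not the data model Opam-Health-Check actually uses; the idea is simply that a failure only counts against Dune package management when the same package builds with opam.

```ocaml
(* Outcome of building one package with one tool. *)
type outcome = Success | Failure

(* How the two builds of the same package compare (illustrative names). *)
type verdict =
  | Both_build
  | Neither_builds (* likely a problem with the package or the platform *)
  | Only_opam_builds (* worth investigating on the Dune side *)
  | Only_dune_builds (* e.g. a transient network failure hit the opam run *)

let compare_builds ~opam ~dune =
  match (opam, dune) with
  | Success, Success -> Both_build
  | Failure, Failure -> Neither_builds
  | Success, Failure -> Only_opam_builds
  | Failure, Success -> Only_dune_builds

(* Keep only the packages whose failure is specific to the Dune build.
   Each result is a (package name, opam outcome, dune outcome) triple. *)
let dune_specific_failures results =
  List.filter_map
    (fun (name, opam, dune) ->
      match compare_builds ~opam ~dune with
      | Only_opam_builds -> Some name
      | Both_build | Neither_builds | Only_dune_builds -> None)
    results
```

Packages that fail with both tools fall into Neither_builds and can be set aside when judging how well Dune package management is doing.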
Some Things We Don't Support
There are some packages that will not work. Often, this is because the packages fail due to how Opam-Health-Check works, which is not something we expect a user of Dune package management to encounter.
Complex Build Commands
When selecting the packages that we plan to build, we make sure to only pick Dune packages. However, the definition of a package 'using Dune' is not clear-cut.
A source might have a dune-project file but never call Dune. A build might
call Dune but also do an arbitrary number of other steps. In opam, this
process is simple because opam will just execute all steps in the build and
install entries, be it launching Dune, calling make, or any other command.
For the health check, we decided to set the limit at dune build. This means
that packages that require extra instructions will most likely fail to build in
the health check.
The reason why we are setting the limit here is twofold:
- Interpreting which commands to run in the health check would require us to implement and evaluate the filters that opam supports for running the commands. Making sure we evaluate things exactly like opam does would be a non-trivial undertaking.
- Packages that need extra commands usually build just fine when these commands are run manually; thus, users of dune pkg can most likely still use Dune package management for such packages on their own machines.
Another bonus reason is that not that many packages are affected by this, so it didn't seem worth the time investment.
What Work is There Still Left to Do?
There are still categories of errors that make it difficult to adopt package management. The most notorious issue is #10855, colloquially called the "in and out of workspace" bug.
It occurs when a project has dependencies, and these dependencies, in turn, depend on a package that is in the project's workspace. Usually, this is a circular dependency, but such a configuration can reasonably happen in some cases, such as when a test dependency uses something from your project. For example, if Lwt uses a test tool that, in turn, depends on Lwt, it is currently impossible to build it with Dune package management, as Lwt would be part of both the build and its own dependencies.
Not many packages are affected, but those that are rank among the most used packages in OCaml. Among these are Lwt, Odoc, and, unfortunately, Dune (due to lots of projects depending on dune-configurator). Thus, at the moment, Dune package management cannot be used to develop Dune itself.
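The shape of the "in and out of workspace" situation can be sketched as a small graph check in OCaml. This is only an illustration of the dependency pattern, not how Dune detects or handles it; the function names, the association-list representation of dependencies, and the test-tool package name are all assumptions.

```ocaml
module S = Set.Make (String)

(* [deps] maps a package name to its direct dependencies (test dependencies
   included). Packages missing from the list are treated as having none. *)
let direct_deps deps pkg =
  match List.assoc_opt pkg deps with Some ds -> ds | None -> []

(* Every package reachable from [roots], the roots themselves included. *)
let reachable deps roots =
  let rec visit seen = function
    | [] -> seen
    | pkg :: rest ->
      if S.mem pkg seen then visit seen rest
      else visit (S.add pkg seen) (direct_deps deps pkg @ rest)
  in
  visit S.empty roots

(* Workspace packages that are pulled back in through an external dependency. *)
let in_and_out_of_workspace ~workspace deps =
  let external_deps =
    List.concat_map (direct_deps deps) workspace
    |> List.filter (fun d -> not (List.mem d workspace))
  in
  let closure = reachable deps external_deps in
  List.filter (fun pkg -> S.mem pkg closure) workspace

(* Hypothetical example: lwt uses a test tool that itself depends on lwt. *)
let example_deps = [ ("lwt", [ "ocaml"; "test-tool" ]); ("test-tool", [ "lwt" ]) ]

let affected = in_and_out_of_workspace ~workspace:[ "lwt" ] example_deps
```

In the example, lwt is both the package being developed and a dependency of one of its own dependencies, which is exactly the configuration described above.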
While addressing these issues was outside the scope of this particular project, we plan to tackle them through future initiatives. Ultimately, our goal is to provide a seamless user experience with Dune package management.
Until Next Time
If you're using Dune Package Management and have feedback or questions, please share your thoughts on Discuss. Our teams are always looking for input in order to improve tools and features, and your feedback can help us make everyone's experience better.
Stay in touch with Tarides on Bluesky, Mastodon, Threads, and LinkedIn. We look forward to hearing from you!