Keeping Up With the Compiler: How we Help Maintain the OCaml Language

by Isabella Leandersson on Jun 19th, 2024

Not all of our projects have a definite end: a grand culmination of effort and time where we pop champagne and set off fireworks (which is, of course, how we celebrate most of the time). Indeed, providing ongoing support for the OCaml ecosystem is one of our biggest priorities, and it means that we resolve issues, maintain libraries, and improve features continuously over time.

Providing high-quality maintenance may not be glamorous, but it is crucial work that keeps the OCaml compiler running smoothly and efficiently. No matter how well-designed an individual feature is, if it does not receive regular maintenance, it risks going out of sync with all the other features. Since the compiler is made up of many interrelated parts, sustained and targeted maintenance of its individual parts, as well as their interactions, is crucial to ensure stability and robustness.

Read on to learn about the process behind ensuring the long-term quality of the compiler and discover what our team has achieved so far!

How do we Maintain the Compiler?

Many teams, companies, and community members collaborate on the development and maintenance of the OCaml compiler. We are just one group among many, and aligning our work with the goals and needs of the ecosystem at large is essential. Our team members generally focus on areas where they have expertise and can be most helpful, including multicore support, Windows compatibility, and build system enhancements. Their skills range from compiler front-end development (including the PPX and type system) to the runtime, from signal handling to the garbage collector.

We work closely with researchers at Inria, who created OCaml and continue to significantly contribute to its development and maintenance, to discuss existing issues and goals. Core maintainers and other contributors to OCaml hold triaging meetings every two weeks, led by Florian Angeletti at Inria, where they review recent issues and pull requests made to the OCaml repository. Each question, bug, or contribution is assigned to a developer responsible for ensuring it is addressed. This is a collaborative process between maintainers across organisations and companies. Several existing core maintainers are also Tarides staff members, and we support their work as part of our internal compiler maintenance effort. More and more Tarides engineers are becoming OCaml core maintainers, with the rights and responsibilities that come along with it, as the community recognises the high quality of their contributions.

Tarides encourages all of our compiler developers to allocate time towards maintenance. This helps us disseminate knowledge more evenly around the teams and ensures continuous attention is put towards improvements, fixes, and optimisations.

Identifying Key Areas of Effective Long-Term Maintenance

The best way to maintain a project is to target areas of strategic importance while remaining flexible and responding to issues as they arise. Our goal is to ensure that the maintenance of OCaml is effective in the long run, and to accomplish this, we focus on areas where we have considerable expertise and on fostering growing community involvement.

  • Long-Term Improvements:

Maintenance work on the compiler does not just mean fixing small issues as they come up; it also includes long-term work towards key features. For example, OCaml 5 introduced a relaxed memory model that provides strong guarantees even for programs that have data races. While the model is prescriptive, OCaml, having started out as a sequential language, does not always enforce it correctly. When such divergences (bugs) are identified, Tarides maintainers aim to identify the expected behaviour based on the prescriptive definition of the memory model and enforce it. In addition, we work on projects that make the build system simpler and hence more maintainable, that increase portability across various platforms (which was reduced in the move from OCaml 4 to OCaml 5), and that improve the user experience on Windows.
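
To make the idea of a data race concrete, here is a minimal sketch (our own illustration, not code from the compiler or its test suite) that assumes OCaml 5 and uses only the standard library's Domain and Atomic modules. The helper names run_in_parallel, racy_count, and atomic_count are hypothetical. The first counter is updated without synchronisation, so increments may be lost; the second uses Atomic and always yields the exact total.

```ocaml
(* Requires OCaml 5: Domain and Atomic are part of the standard library. *)

(* Run [work] on [n_domains] domains in parallel and wait for all of them. *)
let run_in_parallel n_domains work =
  let domains = List.init n_domains (fun _ -> Domain.spawn work) in
  List.iter Domain.join domains

(* Racy version: several domains update a plain [ref] with no
   synchronisation. This is a data race, so some increments may be lost
   and the result can vary from run to run. *)
let racy_count n_domains per_domain =
  let counter = ref 0 in
  run_in_parallel n_domains (fun () ->
    for _i = 1 to per_domain do
      counter := !counter + 1
    done);
  !counter

(* Race-free version: [Atomic.incr] makes every increment indivisible,
   so the final value is exactly [n_domains * per_domain]. *)
let atomic_count n_domains per_domain =
  let counter = Atomic.make 0 in
  run_in_parallel n_domains (fun () ->
    for _i = 1 to per_domain do
      Atomic.incr counter
    done);
  Atomic.get counter

let () =
  let racy = racy_count 4 100_000 in
  let exact = atomic_count 4 100_000 in
  Printf.printf "racy: %d (at most %d), atomic: %d\n" racy (4 * 100_000) exact
```

The racy version is still a well-defined program: it may print a smaller number than expected, but it should not crash or corrupt memory, which is the kind of guarantee the memory model is meant to provide.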

  • Automated Quality Assurance:

Part of guaranteeing the long-term stability of the compiler happens through the Continuous Integration (CI) process. Contributions to the OCaml compiler are subject to rounds of testing to ensure that their code behaves predictably. The first round runs on GitHub Actions and is completed before a contribution is even accepted into the compiler. Only contributions that pass these tests can be accepted. The second round is significantly more exhaustive (and therefore requires much more computing power to execute) and is performed on PRs with a needs-precheck label. This round of testing uses the Jenkins-CI hosted by Inria. Jenkins-CI has a lot more backends than GitHub Actions and is used to check changes that may affect backend code.

In addition to the above-mentioned rounds of CI, multicore tests are frequently performed on the compiler as part of the team's workflow, enabling them to catch and fix some hard-to-spot bugs, including data races. As a case in point, developers are using multicore tests alongside their ongoing efforts to restore MSVC support to OCaml 5.3 in order to ensure that changes do not introduce unwanted behaviours into the code.

  • Community:

When all is said and done, the OCaml compiler exists for the language's wider open-source community, so encouraging community engagement and feedback is vital. We prioritise clear documentation and open discussion in the public repositories to allow everyone to weigh in. We also organise regular compiler hacking events where we invite people into our offices (in 2023, we hosted people in Cambridge and in Chennai) to discuss, hack, and hang out. Our hacking days have sparked significant contributions to the compiler and brought new contributors on board. These initiatives are designed to facilitate open communication, share progress, and involve the community in the ongoing development of the OCaml compiler – thereby fostering alignment with community interests.

The compiler team's effort, in collaboration with other ecosystem members, provides a core service to the OCaml community. Together, we improve and maintain fundamental parts of the OCaml compiler with high levels of oversight, safety, and transparency.

Compiler Maintenance Fixes

Here is a short list illustrating the range of issues, big and small, that the compiler engineers at Tarides address as part of compiler maintenance. This list is far from exhaustive and instead aims to give an overview of the kinds of tasks that we undertake:

Our compiler team not only introduces features and resolves problems relevant to Tarides but also addresses issues raised by different members of the OCaml community. This task includes reviewing external PRs and helping to keep the OCaml compiler maintained and up-to-date. David Allsopp reviewed this PR, which fixed a bug where the CI script accidentally ran 32-bit Cygwin when testing 32-bit MSVC on Windows.

Improving the OCaml user experience involves making error messages easier to understand and, therefore, also falls under the goals of the compiler team. More accessible error messages make the language easier for beginners to use and learn. In this PR, Samuel Hym made a symlink error message that appeared when users tried to link non-existent files much easier to understand.

This pull request addressed a problem where the C compiler from the GNU Compiler Collection would inline some C functions and not others, which made the resulting backtraces inconsistent. Fabrice Buoro updated the ocaml_program frame-pointer backtrace to ignore differences caused by inconsistent inlining decisions made by the C compiler. In addition, caml_program now does all of the previous 'backtrace post-processing' locally in the C code, eliminating the awk, sh, and sed dependencies.

Part of the compiler team's work is also to prepare the compiler for future features. Nick Barnes has been working on bringing statmemprof support to OCaml 5 and preparing the trunk runtime to be compatible with statmemprof. As part of that work, he improved the runtime's backtrace abstractions in this PR. Previously, the OCaml backtrace API allowed backtraces to be obtained as a single per-domain buffer or as an object on the OCaml heap. However, statmemprof needs to be able to use the current backtrace at arbitrary allocation points, when OCaml heap allocation might not be possible. Nick's PR changed the backtrace.h abstraction by adding caml_get_callstack().
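
For readers who have not used backtraces as values, the sketch below shows the OCaml-level counterpart of "a backtrace as an object on the OCaml heap", using the standard library's Printexc module. It is only an illustration of the concept: it is not the C runtime API touched by the PR, and the helper name describe_callstack is hypothetical.

```ocaml
(* Capture at most [depth] frames of the current call stack. The result of
   [Printexc.get_callstack] is an ordinary OCaml value, so obtaining it
   allocates on the OCaml heap. *)
let describe_callstack depth =
  let stack = Printexc.get_callstack depth in
  (Printexc.raw_backtrace_length stack,
   Printexc.raw_backtrace_to_string stack)

let () =
  let frames, text = describe_callstack 8 in
  (* Source locations are only available when the program is built with
     debugging information (the -g flag); otherwise the text is less
     informative. *)
  Printf.printf "captured %d frame(s)\n%s" frames text
```

Because this route allocates, it cannot be used from places where allocation on the OCaml heap is not allowed, which is the limitation described above.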

With this PR, Antonin Décimo improved the detection of the C compiler on Windows, replacing every use of $cc_basename with $ocaml_cv_cc_vendor. This change makes it possible to detect when mingw-w64 and clang-cl are used, and it fixes TSan detection on macOS. As a bonus, better detection raises the quality of bug reports, since users can report which C compiler they use when they experience a bug.

Some bugs can be far-reaching but hard to identify and reproduce (that is, to re-create the conditions under which the bug manifests). These cases call for a lot of patience and meticulousness on the part of the programmer trying to solve them. In PR #13207, Miod Vallat worked on a bug identified by Vesa Karvonen that affected all 64-bit architectures apart from amd64. The bug could only be reproduced when three conditions held at once: the system was calling C code that called back into OCaml code, that OCaml code had to grow its stack, and an exception was pending upon return from the invoked C code. In OCaml 5, as opposed to OCaml 4, growing the OCaml stack changes the value of the exception pointer. As a result, the cached pointer was still pointing to the old stack, an issue that was fixed by ensuring that the register storing the cached exception pointer is refreshed upon return.

Finally, a decent chunk of the team's tasks requires a lot of investigative skills! Some bugs are so hard to define and understand that our engineers must act as detectives, piecing together a larger picture from hard-won clues. This is what Jan Midtgaard did to understand the cause of a rare race condition in the debug runtime. It's essential to allow the problem-solving process to take the time it requires and not try to rush it. Complex problems require a patient and methodical approach, and we have set up the compiler maintenance project to manage these essential tasks.

Sharing the Load

Aside from the fixes contributed by the team, Tarides also contributes significantly to reviewing issues and pull requests. For example, of the 273 PRs that went into the OCaml 5.2.0 release, Tarides was significantly involved in approximately 160. This includes 90 PRs from Tarides with new features and bug fixes (some mentioned above) and 70 PRs made by non-Tarides contributors where Tarides engineers were credited as reviewers. Among the many areas of OCaml maintenance, ensuring that issues and PRs are addressed is significant for the project's sustainability, and Tarides is proud to do its part.

Stay in Touch!

Tarides supports many critical projects and parts of the OCaml ecosystem, as we believe long-term maintenance is vital to improving and strengthening the language over time. We're lucky to collaborate with people and organisations passionate about the language, and we invite you to contribute your own expertise to the repo and join the discussions on the OCaml Discuss Forum.

Follow us on LinkedIn and X (formerly known as Twitter) for the latest news from Tarides. You can also contact us directly on our website if you have questions or want more information about our projects and how you can benefit from them.