Tarides Logo
The planet Saturn with its rings.

Saturn 1.0: Data structures for OCaml Multicore

Isabella Leandersson

Communications Officer

Posted on Wed, 11 Dec 2024

The first version of the Saturn library is out! Saturn is a new OCaml 5 library available on opam, which offers a collection of well-tested, benchmarked, and efficient concurrent data structures ready to be used with OCaml Multicore. Access to concurrent-safe data structures saves developers from the time-consuming and often error-prone process of designing their own.

This post will give you an overview of the library, its main features, and some use cases. The team encourages you to try the data structures and share your feedback in the repo and on Discuss. After you're finished with this post, I also recommend you read the paper on Saturn from the OCaml workshop at ICFP 2024 if you want more details. Let’s dig in!

What is Saturn?

The Saturn repository is made up of one package, saturn, containing all lock-free data structures. You can use Saturn with OCaml 5.2 or later, and it can be installed from opam with the command opam install saturn.

Saturn covers many use cases, from simple stacks and queues to more complex structures, including skip lists, hash tables, work-stealing deques, and more. All of them have been adapted to be compatible with and take advantage of the OCaml 5 memory model. For example, the team behind Saturn had to rework the Michael-Scott queue to avoid memory leaks.

To improve performance, they introduced several micro-optimisations, including preventing false sharing, adding fenceless atomic reads where possible (improving performance on ARM processors), and avoiding extra indirection in arrays and atomics to reduce memory consumption. While optimising the library, their work highlighted some missing features in OCaml 5 and led to upstreamed improvements to the language, such as padded atomics and fixing a CSE bug.

Why Saturn?

Sharing data between multiple threads or cores is a well-known problem in computer science. The most obvious solution is to use a sequential data structure protected by a lock, but this approach can introduce performance overhead due to contention between locks. Liveness issues like deadlock, starvation, and priority inversion are also associated with locks in parallel programming.

The opposite approach is to use a lock-free implementation, relying on fine-grained synchronisation instead of locks. This approach benefits from higher performance and guarantees system-wide progress. However, it does come with its own set of bugs, including the ABA problem (which is largely mitigated in garbage-collected languages), data races, and unexpected behaviours as a result of non-linearisability.

In this quagmire of pitfalls, Saturn stands out as a source of reliable, tested data structures that the user can adopt for their own projects. Developers can pick from a variety of structures knowing that they are suitable and safe, saving them time and making OCaml 5 easier to use.

Benchmarks

The library is still relatively new, so more benchmarks are still to come, but there are some numbers to give you an idea of performance. You can find tables showing the throughput of different queues and stacks on single and multiple domains in the paper from the OCaml Workshop at ICFP 2024 about Saturn. The tests reveal that Saturn implementations either outperform or, in the case of simpler implementations, match the performance of non-Saturn structures. However, the benefit of using Saturn, even in cases where another simple implementation exists, is that the Saturn data structures have built-in optimisations and synchronisation safeguards, and it’s up to each individual developer to decide what that’s worth to them.

The library also has a benchmarking command make bench that can be run from the root of the repository to run a standard set of benchmarks. It outputs in JSON and is intended to be consumed by current-bench.

Tests

Testing Saturn’s structures was a high priority for the team and crucial to ensure the safety, linearisability, and lock-freedom where expected. Saturn was tested using two primary tools: DSCheck and STM. STM is used for both unit testing and linearisability. It can automatically generate random full programs using the API provided, and for Saturn, this is a data structure. It then executes the programs in parallel with two domains and checks all the results against the post-conditions of each function, providing unit testing. STM performs a sequence of random commands in parallel, records the results, and checks whether the observed results can be linearised and reconciled with some sequential execution.

DSCheck is a model checker designed to compute all the possible interleavings of instructions between multiple domains and verify that each one returns the expected result. This method is useful for detecting bugs that only occur in rare circumstances which would otherwise be hard to catch. DSCheck can also be used to verify that a program is lock-free, as it will fail to terminate if any form of blocking is present in the program. The DSCheck implementation has been optimised to make the tests efficient even on Saturn’s more complex data structures.

Formal Verification

The verification work for Saturn is done at Inria, forming part of Clément Allain's PhD work there. Due to the notoriously finicky behaviours of lock-free algorithms, they have formally verified part of Saturn’s data structures and aim to keep going until they’ve covered the entire library. The main criterion for correctness in concurrent data structures is linearisability, which requires each operation in a data structure to appear to take effect instantaneously at some point during its execution (called the linearisation point) so that all linearisation points from all operations form a coherent, sequential history. The team at Inria are formally verifying that this linearisability is correctly present in Saturn's data structures.

To verify linearisability, they use the mechanised concurrent separation logic IRIS, which has been used in the past to verify realistic data structures. All proofs are formalised in Coq and available on GitHub. They manually translated the original code from Saturn to a deeply embedded language in Coq, hoping to potentially automate the process in the future. Work continues to verify data structures for Saturn, ensuring safe and predictable behaviours.

Stay in Touch!

To delve deeper into everything Saturn, I recommend you watch the ICFP 2024 presentation from the OCaml Workshop. You can try Saturn for your projects via the repo, and don’t forget to share your feedback with the team!

Connect with us online on X, Mastodon, Threads, and LinkedIn to stay updated on our latest projects.

Open-Source Development

Tarides champions open-source development. We create and maintain key features of the OCaml language in collaboration with the OCaml community. To learn more about how you can support our open-source work, discover our page on GitHub.

Explore Commercial Opportunities

We are always happy to discuss commercial opportunities around OCaml. We provide core services, including training, tailor-made tools, and secure solutions. Tarides can help your teams realise their vision