Tarides Logo
Four pieces of a half-solved wooden jigsaw puzzle against a blue background

Dynamic Formal Verification in OCaml: An Ortac/QCheck-STM Tutorial

NO

Senior Software Engineer

Posted on Wed, 10 Sep 2025

You may have read our recent post discussing our involvement in the Gospel Project. In today's follow-up post, we will focus on Ortac, a tool we have been developing at Tarides as part of the Gospel Project.

What is Ortac?

Ortac aims to give OCaml programmers easy access to dynamic formal verification, also known as specification-driven testing. At its core, it translates computable Gospel terms into equivalent OCaml expressions. Different Ortac modes can then use these translations to generate OCaml code. This post is about the Ortac/QCheck-STM mode that generates black-box model-based state-machine tests based on the QCheck-STM framework.

Ortac/QCheck-STM Mode

QCheck is a property-based testing framework inspired by QuickCheck. As the name implies, the idea behind property-based testing is to check a property about a function, generally expressed as an equation involving the inputs and outputs of the function, against randomly generated inputs.

In contrast, QCheck-STM checks the behaviour of randomly generated sequences of function calls against a model. The model is implemented as a state machine (hence the STM). Testing a sequence of function calls helps users discover more bugs, especially when a mutable state is involved. In order to use QCheck-STM on a library, the users have to specify which type is the center of attention, also called System Under Test, or SUT for short. They also have to provide a functional model for the SUT, equipped with a next_state function computing the new model given a function call. You can read more about QCheck-STM in a previous post on our blog and in a paper on parallel testing libraries for OCaml 5.

One of the positives of using Ortac/QCheck-STM is that with just some Gospel annotations, a Dune rule, and a configuration file, we can benefit from QCheck-STM tests. Another bonus is that in case of failure, the generated tests will provide a bug report containing the piece of Gospel specification that has been violated, a runnable scenario with the actual returned values to reproduce the failure and, if available, the expected returned value of the failing command according to the function specification.

Since the previous post, the Ortac tool has been improved in several ways. Version 0.4.0 brought support for keeping track of multiple Systems Under Test in the generated tests, all thanks to Nikolaus Huber's work. Then, version 0.5.0 brought support for testing higher-order functions, thanks to Jan Midtgaard. Finally, version 0.6.0 improved the computation of the expected returned value based on the specifications.

The reader curious about how Ortac/QCheck-STM works internally can refer to this paper. The rest of this post will adopt a more practical perspective, using the priority queue as a running example.

Project Setup

Let's first explore a project setup, how to write the Gospel specification of the API, and finally, how to integrate Ortac/QCheck-STM with Dune.

Here is the project's structure:

.
├── dune-project
├── priority-queue.opam
├── src
│   ├── dune
│   ├── priority_queue.ml
│   └── priority_queue.mli
└── test
    ├── dune
    ├── dune.inc
    ├── priority_queue_config.ml
    └── priority_queue_tests.ml

3 directories, 9 files

The idea behind Ortac/QCheck-STM is to not write the tests. Compared to a more traditional project, priority_queue.mli contains some Gospel specifications describing the expected behaviour of the functions, and the files dune.inc and priority_queue_tests.ml are generated by Ortac.

The priority_queue_tests.ml contains the generated QCheck-STM tests for the Priority_queue module, based on the Gospel specifications contained in the priority_queue.mli file and the priority_queue_config.ml file provided by the user. Furthermore, dune.inc contains the generated Dune rules to generate priority_queue_tests.ml and attach the execution of these tests to the runtest alias.

Generating dune.inc is the job of the Ortac/Dune mode, which is called by a hand-written dune rule in the dune file.

Due to how Dune's promote mode interacts with the include stanza, we must create an empty dune.inc. Another possibility is to use a dynamic_include rule (see the rule generation chapter in the Dune docs for more details).

Generating priority_queue_tests.ml is the job of the Ortac/QCheck-STM mode, which is called by a generated dune rule in dune.inc.

Ortac/QCheck-STM is provided by the ortac-qcheck-stm package, and Ortac/Dune by the ortac-dune one. We need to declare these two packages as dependency for our project.

Writing Some Gospel Specifications

Let's take a look at the interface file for our priority queue, where the Gospel specifications are stored.

We begin by declaring the OCaml abstract type of a priority queue alongside its logical specification:

(*@ open Sequence *)

(*@ type 'a priority = 'a * integer *)

(*@ type 'a priority_queue = 'a priority sequence *)

type 'a t
(*@ mutable model contents : 'a priority_queue
    with t
    invariant let q = t.contents in
              forall i.
              1 <= i < length q
              -> snd q[i-1] >= snd q[i] *)

Gospel annotations are written inside special comments opened with (*@. We can open modules from the logical library that Gospel provides. The Sequence Module contains the definition of a mathematical sequence and some operations. Ortac comes with an implementation of the Gospel logical library. We can also declare Gospel types using the same syntax as the OCaml one.

Gospel annotations immediately following an OCaml type or val are attached to it, a bit like documentation comments. So, in the example above, mutable model contents ... is the specification for the type 'a t. Gospel is a specification language based on models. Thus, in the first line of the specification, we give our OCaml type a model named contents. The model is also marked as mutable. That doesn't mean that the model itself is mutable but that the model of an 'a t may change when that 'a t is mutated. It is a way of specifying that the OCaml type 'a t is mutable.

Now, our model contents has a Gospel type: 'a priority_queue. If we unfold the definitions of the Gospel types priority_queue and priority given just before in the file, this means that we will see OCaml values of type 'a t as mathematical sequences of pairs of an element and an integer representing this element priority. There are other models we could have reasonably chosen, but let's go with this one. Let's also note that the model we choose is not directly related to the actual implementation. The logical model should make sense for any possible implementation.

We also use the invariant mechanism from Gospel to express that we keep the elements in decreasing order of priority in the model. This is not necessary, but it makes expressing the inspection and the evolution of the logical model a lot easier.

Now, we need to be able to express three fundamental operations precisely in our model: inserting a new element with its associated priority, looking at the next element, and deleting the next element.

(*@ function insert (q : 'a priority_queue)
                    (a : 'a)
                    (i : integer)
                    : 'a priority_queue =
      let higher = filter (fun x -> snd x > i) q in
      let equal = filter (fun x -> snd x = i) q in
      let lesser = filter (fun x -> snd x < i) q in
      higher ++ snoc equal (a, i) ++ lesser *)

(*@ function peek (q : 'a priority_queue) : 'a option =
      if q = empty
      then None
      else Some (fst (hd q)) *)

(*@ function delete (q : 'a priority_queue) : 'a priority_queue =
      if q = empty
      then q
      else tl q *)

The insert function does all the specification heavy lifting and is responsible for maintaining the invariant we've declared for our logical model. Note that Ortac/QCheck-STM will include invariant verifications in the generated tests!

The insert function is also the place where we choose to store the elements of the same priority in a FIFO manner by using the function snoc from the Gospel logical library.

The way insert is written makes peeking and deleting straightforward. In both cases, it suffices to look at and delete the first element of the sequence.

As we can see, this is similar to a very naive trusted functional implementation of the logical model. This makes a lot of sense if we think about it! As Ortac compiles Gospel specifications into OCaml code, it consumes computable specifications. This process also looks like part of what we would have written if we had been using QCheck-STM directly. These three functions are the ones necessary to implement the functional state-machine that the QCheck-STM tests will rely on, whether hand-written or Ortac-generated. One of the benefits of using Ortac/QCheck-STM is that it will test the Gospel implementation of these functions against the model's invariants for free!

We could argue that using an ordered list as a model for a priority queue is inefficient. But, as we are in a testing situation, performance matters less than the correctness of the model.

We now have all the necessary vocabulary to talk about the behaviour we expect from a priority queue:

val empty : unit -> 'a t
(*@ q = empty ()
    ensures q.contents = Sequence.empty *)

val insert : 'a t -> 'a -> int -> unit
(*@ insert q a i
    modifies q.contents
    ensures q.contents = insert (old q.contents) a i *)

val peek : 'a t -> 'a option
(*@ o = peek q
    ensures o = peek q.contents *)

val extract : 'a t -> 'a option
(*@ o = extract q
    modifies q.contents
    ensures o = peek (old q.contents)
    ensures q.contents = delete (old q.contents) *)

Gospel function's specification takes the form of a contract. The first line of which is a header naming the arguments and the returned value. What follows is a sequence of clauses describing the expected behaviour. Note that in these clauses, the names insert, peek, and delete refer to the Gospel functions defined above.

Here, the functions' contract is pretty straightforward. It is worth noting that in order to be able to include a function in the tests, Ortac/QCheck-STM asks that whenever a new SUT is created, or an existing one is modified, the contract contains a post-condition (an ensures clause) describing, again, in a computable way, the value of the related model.

When there are other post-conditions, they are added to the tests. Besides, when a post-condition is describing, in a computable way, the output value, Ortac/QCheck-STM will use it to include information about the expected returned value in the bug report in case of test failure.

Integrating Specification-Driven Tests With a Dune Workflow

Now, with a bit more effort, specification-driven testing is just a dune runtest away!

The first thing Ortac/QCheck-STM needs is a minimal configuration file in order to know how to build the QCheck-STM test suite we want. The minimal configuration consists of defining the sut type and the init_sut value.

The sut type is the type we want to focus the tests on, and the init_sut value is how to create an initial value to start the tests.

The configuration file also contains some additional information. Here we shadow the QCheck generators for ints and chars so that we only deal with three levels of priorities and readable elements in the tests.

type sut = char t

let init_sut = empty ()

module Gen = struct
  let int = oneofl [0;1;2]
  let char = char_range 'a' 'z'
end

Finally, we want to be able to generate and launch the tests with a dune runtest. Thanks to the Ortac/Dune plugin, we only need to write one rule:

(rule
 (alias runtest)
 (mode promote)
 (deps
  (:specs %{project_root}/src/priority_queue.mli))
 (action
  (with-stdout-to
   dune.inc
   (run ortac dune qcheck-stm %{specs}))))

(include dune.inc)

Other than that, you are all set to implement a priority queue against specification-driven generated tests!

Current and Future Work

As mentioned in the repo, Ortac/QCheck-STM has already proven itself useful by discovering and helping fix a number of issues.

Thanks to Charlène Gros, the Ortac/Wrapper mode has just been released. Ortac/Wrapper consumes Gospel annotations to generate runtime assertion checking. Given an annotated OCaml interface file, this plugin will generate a new module with the same signature but with an implementation instrumented with assertions taken from the Gospel specifications. Each function is wrapped with an assertion of its pre- and post-conditions.

Another project is to make Ortac/QCheck-STM also target QCheck-STM/Domains. For now, Ortac/QCheck-STM only generates QCheck-STM tests for testing the library in a sequential context. One of the strengths of QCheck-STM is that it provides a way to test mutable data structures in a parallel context using domains.

The whole project is under active development and should continue to evolve. We are also committed to open-source software, so if you want to take Ortac for a spin (and I encourage you to), please don't hesitate to ask questions, contribute issues, or even PRs!

Until Next Time

You can connect with us on Bluesky, Mastodon, Threads, and LinkedIn or sign up to our mailing list to stay updated on our latest projects. We look forward to hearing from you!

Open-Source Development

Tarides champions open-source development. We create and maintain key features of the OCaml language in collaboration with the OCaml community. To learn more about how you can support our open-source work, discover our page on GitHub.

Explore Commercial Opportunities

We are always happy to discuss commercial opportunities around OCaml. We provide core services, including training, tailor-made tools, and secure solutions. Tarides can help your teams realise their vision