Effective ML Through Merlin's Destruct Command

by Xavier Van de Woestyne on May 29th, 2024

The Merlin server and OCaml LSP server, two closely related OCaml language servers, enhance productivity with features like autocompletion and type inference. Their lesser-known yet highly useful destruct command simplifies the use of pattern matching by generating exhaustive match statements, as we'll illustrate in this article. The command has recently received a bit of love, making it more usable, and we are taking advantage of this refresh to introduce it and showcase some use cases.

A good IDE for a programming language ought to provide contextual information, such as completion suggestions, details about expressions like types, and real-time error feedback. However, in an ideal world, it should also serve as a code-writing assistant, capable of generating code as needed. And even though there are undeniably commonalities among a broad range of programming languages, allowing for the "generalisation" of interactions with a code editor via a protocol (such as LSP), some languages possess uncommon or even unique functionalities that require special treatment. Fortunately, it is possible to develop functionalities tailored to these particularities. These can be invoked within LSP through custom requests to retrieve arbitrary information and code actions to transform a document as needed. Splendid! However, such functionality can be more difficult to discover, as it somewhat denormalises the IDE user experience. This is the case with the destruct command, which is immensely useful and saves a great deal of time.

In this article, we'll attempt to fathom the command's usefulness and its application using somewhat simplistic examples. Following that, we'll delve into a few less artificial examples that I use in my day-to-day coding. I hope that the article is useful and entertaining both for people who already know destruct and for people who don't.

Destruct in Broad Terms

OCaml allows the expression of algebraic data types that, coupled with pattern matching, can be used to describe data structures and perform case analysis. When a pattern match is not exhaustive, the compiler emits warning 8, known as partial-match. Hence, it is advisable to keep match blocks exhaustive.

The destruct command aids in achieving completeness. When invoked at the cursor position (via M-x merlin-destruct in Emacs, :MerlinDestruct in Vim, and Alt + d in Visual Studio Code), it generates patterns. The command behaves differently depending on the cursor’s context:

  • When it is called on an expression, it replaces it by a pattern match over its constructors.

  • When it is called on a pattern of a non-exhaustive matching, it will make the pattern matching exhaustive by adding missing cases.

  • When it is called on a wildcard pattern, it will refine it if possible.

For those unfamiliar with the term destruct, pattern matching is case analysis, and expressing the form (a collection of patterns) on which you match is called destructuring, because you are unpacking values from structured data. This is the same terminology used in JavaScript.

Let's examine each of these scenarios using examples.

Destruct on an Expression

Destructing an expression works in a fairly obvious way. If the typechecker is aware of the expression type (in our example, it knows this by inference), the expression will be substituted by a matching on all enumerable cases.

Destruct on expression
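
As a rough textual stand-in for this first behaviour, here is a minimal sketch. The type t and the function to_string are hypothetical names chosen for illustration, and the exact text Merlin generates may differ between versions. The placeholders that destruct leaves on the right-hand side of each branch have been filled in by hand so that the snippet compiles.

```ocaml
(* Hypothetical type and function, used only for illustration. *)
type t = Foo | Bar

(* Before, with the cursor on the expression [x] in the body:

     let to_string (x : t) = x

   After destruct, [x] is replaced by a match with one branch per
   constructor of [t]. Each branch initially carries a placeholder;
   the strings below were written by hand afterwards. *)
let to_string (x : t) =
  match x with
  | Foo -> "Foo"
  | Bar -> "Bar"
```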

Destruct on a Non-Exhaustive Matching

The second behaviour is, in my opinion, the most practical. Although I rarely need to substitute an expression with a pattern match, I often want to perform a case analysis on all the constructors of a sum type. By implementing just a single pattern, such as Foo, my match expression is non-exhaustive, and if I destruct on this, it will generate all the missing cases.

Destruct on non-exhaustive match
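
The same idea can be sketched for a partial match. The type and the function to_int below are assumptions made for the example. Whether Merlin emits the missing cases as separate branches or groups them in a single or-pattern may depend on its version, so the layout shown here is only indicative.

```ocaml
(* Hypothetical sum type with three constructors. *)
type t = Foo | Bar | Baz

(* Before, only [Foo] is handled, so the match is partial (warning 8):

     let to_int x = match x with Foo -> 0

   After destruct on the [Foo] pattern, the missing constructors are
   added. The right-hand sides below were filled in by hand. *)
let to_int (x : t) =
  match x with
  | Foo -> 0
  | Bar -> 1
  | Baz -> 2
```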

Destruct on a Wildcard Pattern

The final behaviour is very similar to the previous one; when you destruct a wildcard pattern (or a pattern producing a wildcard, for example, a variable declaration), the command will generate all the missing branches.

Destruct on wildcard

Dealing With Nesting

When used interactively, it is possible to destruct nested patterns to quickly achieve exhaustiveness. For example, let’s imagine that our variable x is of type t option:

  • We start by destructing our wildcard (_), which will produce two branches, None and Some _.
  • Then, we can destruct on the associated wildcard of Some _, which will produce all conceivable cases for the type t.

Destruct on nested patterns
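
The two steps above can be sketched as follows, assuming a hypothetical type t with two constructors and a hypothetical function label. The intermediate states are shown as comments, and the final right-hand sides are written by hand.

```ocaml
(* Hypothetical type, used only for illustration. *)
type t = Foo | Bar

(* Step 0, a single wildcard branch:
     match x with _ -> ...
   Step 1, destruct on the wildcard:
     match x with None -> ... | Some _ -> ...
   Step 2, destruct on the wildcard inside [Some _], which enumerates
   the constructors of [t]. The result is the match below. *)
let label (x : t option) =
  match x with
  | None -> "none"
  | Some Foo -> "some foo"
  | Some Bar -> "some bar"
```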

In the Case of Products (Instead of Sums)

In the previous examples, we were always dealing with cases whose domains are perfectly defined, only destructing cases of simple sum type branches. However, the destruct command can also act on products. Let's consider a very ambitious example where we will make exhaustive pattern matching on a value of type t * t option, generating all possible cases using destruct alone:

Destruct on nested tuples
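
For reference, a fully enumerated match on t * t option could end up looking like the sketch below. The type t and the function describe are assumptions for the example, and the order of the generated branches may differ from what Merlin produces. With two constructors in t and three shapes for t option (None, Some Foo, Some Bar), there are six cases in total.

```ocaml
(* Hypothetical type, used only for illustration. *)
type t = Foo | Bar

(* [t * t option] reads as [t * (t option)]. Every combination of the
   two components is listed, so no wildcard is needed. *)
let describe (pair : t * t option) =
  match pair with
  | Foo, None -> "Foo, nothing"
  | Foo, Some Foo -> "Foo, Foo"
  | Foo, Some Bar -> "Foo, Bar"
  | Bar, None -> "Bar, nothing"
  | Bar, Some Foo -> "Bar, Foo"
  | Bar, Some Bar -> "Bar, Bar"
```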

It can be seen that when used interactively, the command saves a lot of time, and coupled with Merlin's real-time feedback regarding errors, one can quickly ascertain when our pattern matching is exhaustive. In a way, it's a bit like a manual "deriver."

The destruct command can act on any pattern, so it also works within function arguments (although their representation has changed slightly in OCaml 5.2.0), and in addition to destructing tuples, it is also possible to destruct records, which can be very useful for our quest for exhaustiveness!

Destruct on nested records
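
A comparable sketch for records, assuming a hypothetical record type config with a field of type t and a boolean field. The idea is that repeated destructs on the field patterns eventually enumerate every combination. The precise intermediate patterns Merlin proposes are not reproduced here.

```ocaml
(* Hypothetical types, used only for illustration. *)
type t = Foo | Bar
type config = { tag : t; enabled : bool }

(* Each field pattern has been refined until every combination of
   [tag] and [enabled] appears exactly once. *)
let score (c : config) =
  match c with
  | { tag = Foo; enabled = true } -> 3
  | { tag = Foo; enabled = false } -> 2
  | { tag = Bar; enabled = true } -> 1
  | { tag = Bar; enabled = false } -> 0
```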

When the Set of Constructors is Non-Finite

Sometimes types are not finitely enumerable. For example, how are we to handle strings or even integers? In such situations, destruct will attempt to find an example. For integers, it will be 0, and for strings, it will be the empty string.

Destruct on non-enumerable values
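
As an illustration, here is what matches on an integer and on a string might look like once the example values mentioned above have been introduced. The function names is_zero and is_empty are hypothetical. Since these types cannot be enumerated, a wildcard branch is still required to keep the match exhaustive.

```ocaml
(* For integers, the example value is 0. The wildcard covers every
   other integer. *)
let is_zero (n : int) =
  match n with
  | 0 -> true
  | _ -> false

(* For strings, the example value is the empty string. The wildcard
   covers every other string. *)
let is_empty (s : string) =
  match s with
  | "" -> true
  | _ -> false
```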

Excellent! We have covered a large portion of the behaviors of the destruct command, which are quite contextually relevant. There are others (such as cases of destruction in the presence of GADTs that only generate subsets of patterns), but it's time to move on to an example from the real world!

The Quest for Exhaustiveness: Effective ML

In 2010, Yaron Minsky gave an excellent presentation on the reasons (and advantages) for using OCaml at Jane Street. In addition to being highly inspiring, it provides specific insights and gotchas on using OCaml effectively in an incredibly sensitive industrial context (hence the name "Effective ML"). It was in this presentation that the maxim "Make illegal states unrepresentable" was publicly mentioned for the first time, a phrase that would later be frequently used to promote other technologies (such as Elm). Moreover, the presentation anticipates many discussions on domain modeling, which are dear to the Software Craftsmanship community, by proposing strategies for domain reduction (later extensively developed in the book Domain Modeling Made Functional).

Among the list of effective approaches to using an ML language, Yaron presents a scenario where one might too hastily use the wildcard in a case analysis. The example is closely tied to finance, but it's easy to transpose into a simpler example. We will implement an equal function for a very basic type:

type t = 
  | Foo
  | Bar

The equal function can be trivially implemented as follows:

let equal a b = 
  match (a, b) with
  | Foo, Foo -> true
  | Bar, Bar -> true
  | _ -> false

Our function works perfectly and is exhaustive. However, what happens if we add a constructor to our type t?

  type t =
    | Foo
    | Bar
+   | Baz

Our function, in the case of equal Baz Baz, will return false, which is obviously not the expected behavior. Since the wildcard keeps our match exhaustive, the compiler won't emit any warning. That's why Yaron Minsky argues that, in many cases, a wildcard clause is probably a mistake. If our function had listed every case explicitly instead of relying on a wildcard, adding a constructor would have raised a partial-match warning, forcing us to explicitly decide how to behave in the presence of the new constructor! Therefore, using a wildcard in this context deprives us of fearless refactoring, which is a strength of OCaml. This is indeed an argument in favor of using a preprocessor to generate equality functions, using, for example, the eq standard deriver or the more hygienic Ppx_compare. But sometimes, using a preprocessor is not possible. Fortunately, the destruct command can assist us in defining an exhaustive equality function!
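
The pitfall is easy to reproduce. The snippet below simply combines the extended type with the unchanged wildcard-based equal function from above. The binding named surprising is an addition for the sake of the demonstration.

```ocaml
type t = Foo | Bar | Baz

(* The function is unchanged after adding [Baz]. It still compiles
   without any warning because the wildcard absorbs the new case. *)
let equal a b =
  match (a, b) with
  | Foo, Foo -> true
  | Bar, Bar -> true
  | _ -> false

(* [Baz, Baz] matches neither of the first two branches, so it falls
   through to the wildcard and the result is [false]. *)
let surprising = equal Baz Baz
```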

We will proceed step by step, specifically separating the different cases and using nested pattern matching to make the various cases easy to express in a recurrent manner:

Destruct for equal on Foo and Bar

As we can see, destruct allows us to quickly implement an exhaustive equal function without relying on wildcards. Now, we can add our Baz constructor to see how the refactoring unfolds! By adding a constructor, we quickly detect a recurring pattern where we try to give the destruct command as much leeway as possible to generate the missing patterns!

Destruct for equal on Foo, Bar and Baz
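
One possible shape for the resulting function is sketched below. It assumes the nested-match layout described earlier: an outer match on a, and one inner match on b per constructor. The exact code shown in the animation may differ. When Baz is added to the type, the outer match and every inner match become partial, so each of them can be completed with destruct, and only the boolean results have to be written by hand.

```ocaml
type t = Foo | Bar | Baz

(* No wildcard anywhere: adding a constructor to [t] makes the outer
   match and all inner matches partial, which the compiler reports. *)
let equal a b =
  match a with
  | Foo -> (
      match b with
      | Foo -> true
      | Bar -> false
      | Baz -> false)
  | Bar -> (
      match b with
      | Foo -> false
      | Bar -> true
      | Baz -> false)
  | Baz -> (
      match b with
      | Foo -> false
      | Bar -> false
      | Baz -> true)
```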

Fantastic! We were able to quickly implement an equal function. Adding a new case is trivial, leaving destruct to handle all the work!

Coupled with modern text editing features (e.g., using multi-cursors), it's possible to save a tremendous amount of time! Another example of the immoderate use of destruct (but too long to be detailed in this article) was the Mime module implementation in YOCaml for generating RSS feeds.

In Conclusion

Paired with a formatter like OCamlFormat (to neatly reformat generated code fragments), destruct is an unconventional tool in the IDE landscape. It aligns with algebraic types and pattern matching to simplify code writing and move towards code that is easier to refactor and thus maintain! Aware of the command's utility, the Merlin team continues to maintain it, keeping up with the latest features of OCaml to make the command as usable as possible in as many contexts as possible!

I hope this collection of illustrated examples has motivated you to use the destruct feature if you were not already aware of it. Please do not hesitate to send us ideas for improvements, fixes, and fun use cases via X or LinkedIn!

Happy Hacking.
