Effective ML Through Merlin's Destruct Command
Senior Software Engineer
The Merlin server and OCaml LSP server, two closely related OCaml language
servers, enhance productivity with features like autocompletion and type
inference. Their lesser-known yet highly useful destruct command simplifies
the use of pattern matching by generating exhaustive match statements, as we’ll
illustrate in this article. The command has recently received a bit of love,
making it more usable, and we are taking advantage of this refresh to introduce
it and showcase some use cases.
A good IDE for a programming language ought to provide contextual information,
such as completion suggestions, details about expressions like types, and
real-time error feedback. However, in an ideal world, it should also serve as a
code-writing assistant, capable of generating code as needed. And even though
there are undeniably commonalities among a broad range of programming languages,
allowing for the "generalisation" of interactions with a code editor via a
protocol (such as LSP), some languages
possess uncommon or even unique functionalities that require special
treatment. Fortunately, it is possible to develop functionalities tailored to
these particularities. These can be invoked within LSP through custom
requests to retrieve arbitrary information and code actions to transform a
document as needed. Splendid! However, such functionality can be more difficult
to discover, as it somewhat denormalises the IDE user experience. This is the
case with the destruct command, which is immensely useful and saves a great
deal of time.
In this article, we'll try to get a sense of the command's usefulness and its
application using somewhat simplistic examples. Following that, we'll delve into
a few less artificial examples that I use in my day-to-day coding. I hope that
the article is useful and entertaining both for people who already know destruct
and for people who don't.
Destruct in Broad Terms
OCaml allows the expression of algebraic data types that, coupled with pattern
matching, can be used to describe data structures and perform case analysis. If
a pattern match is not exhaustive, the compiler emits warning 8, known as
partial-match, at compile time. Hence, it is advisable to keep match blocks
exhaustive.
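As a minimal sketch of that warning (the type `colour` and the function `name` are illustrative names, not taken from the original article), the commented-out definition below leaves out a constructor and would be flagged as a partial match, while the second definition covers every case:

```ocaml
(* Hypothetical sum type, used only for illustration. *)
type colour = Red | Green | Blue

(* Non-exhaustive version, kept inside a comment so that the block compiles
   cleanly. The compiler would report warning 8 (partial-match) here because
   the constructor Blue is not handled:

   let name = function
     | Red -> "red"
     | Green -> "green"
*)

(* Exhaustive version: every constructor has its own branch. *)
let name = function
  | Red -> "red"
  | Green -> "green"
  | Blue -> "blue"
```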
The destruct command aids in achieving completeness. When applied to a pattern
(via M-x merlin-destruct in Emacs, :MerlinDestruct in Vim, and Alt + d in
Visual Studio Code), it generates patterns. The command behaves differently
depending on the cursor’s context:
- When it is called on an expression, it replaces it by a pattern match over its constructors.
- When it is called on a pattern of a non-exhaustive matching, it will make the pattern matching exhaustive by adding missing cases.
- When it is called on a wildcard pattern, it will refine it if possible.
For those unfamiliar with the term destruct, pattern matching is case analysis, and expressing the form (a collection of patterns) on which you match is called destructuring, because you are unpacking values from structured data. This is the same terminology used in JavaScript.
Let's examine each of these scenarios using examples.
Destruct on an Expression
Destructing an expression works in a fairly obvious way. If the typechecker is aware of the expression type (in our example, it knows this by inference), the expression will be substituted by a matching on all enumerable cases.
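A rough sketch of this first behaviour, assuming a two-constructor type `t` and a hypothetical function `to_string`: the comment shows the starting point, and the code shows the shape obtained after calling destruct on `x` and filling in the generated placeholders by hand. The exact text Merlin produces may differ between versions.

```ocaml
(* Assumed type for the illustration. *)
type t = Foo | Bar

(* Before: the cursor is on the expression x in the body.

   let to_string (x : t) = x

   After calling destruct on x, the body becomes a match with one branch per
   constructor, each right-hand side being a placeholder to fill in. Below,
   the placeholders have been replaced by hand so that the code compiles. *)
let to_string (x : t) =
  match x with
  | Foo -> "Foo"
  | Bar -> "Bar"
```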
Destruct on a Non-Exhaustive Matching
The second behaviour is, in my opinion, the most practical. Although I rarely
need to substitute an expression with a pattern match, I often want to perform a
case analysis on all the constructors of a sum type. By implementing just a
single pattern, such as Foo, my match expression is non-exhaustive, and if I
destruct on this, it will generate all the missing cases.
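To illustrate, here is a hedged sketch with an assumed three-constructor type `t` and a hypothetical function `to_int`. Only `Foo` is written by hand; the other branches are the ones destruct is expected to add, with their bodies filled in afterwards.

```ocaml
(* Assumed type for the illustration. *)
type t = Foo | Bar | Baz

(* Starting point (non-exhaustive, only Foo is handled):

   let to_int x =
     match x with
     | Foo -> 0

   Calling destruct on the Foo pattern adds the missing cases, Bar and Baz,
   with placeholder bodies. They are completed by hand below. *)
let to_int x =
  match x with
  | Foo -> 0
  | Bar -> 1
  | Baz -> 2
```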
Destruct on a Wildcard Pattern
The final behaviour is very similar to the previous one; when you destruct a
wildcard pattern (or a pattern producing a wildcard, for example, a variable
declaration), the command will generate all the missing branches.
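A short sketch of the wildcard case, again with an assumed type `t` and a hypothetical function `is_foo`. The single catch-all branch is refined into one branch per constructor, and the bodies are then written by hand.

```ocaml
(* Assumed type for the illustration. *)
type t = Foo | Bar

(* Starting point, with a single catch-all branch:

   let is_foo x =
     match x with
     | _ -> false

   Calling destruct on the wildcard refines it into the constructors of t. *)
let is_foo x =
  match x with
  | Foo -> true
  | Bar -> false
```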
Dealing With Nesting
When used interactively, it is possible to destruct nested patterns to quickly
achieve exhaustiveness. For example, let’s imagine that our variable x is of
type t option:
- We start by destructing our wildcard (_), which will produce two branches, None and Some _.
- Then, we can destruct on the associated wildcard of Some _, which will produce all conceivable cases for the type t.
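The two steps above can be sketched as follows, assuming `t` has the constructors `Foo` and `Bar`; the function name `describe` and the returned strings are made up for the example.

```ocaml
(* Assumed type for the illustration. *)
type t = Foo | Bar

(* Step 0: a single wildcard branch.
     match x with _ -> ...
   Step 1: destruct on the wildcard yields the branches None and Some _.
   Step 2: destruct on the wildcard inside Some _ yields Some Foo and
           Some Bar.
   The bodies are then written by hand. *)
let describe (x : t option) =
  match x with
  | None -> "nothing"
  | Some Foo -> "some foo"
  | Some Bar -> "some bar"
```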
In the Case of Products (Instead of Sums)
In the previous examples, we were always dealing with cases whose domains are
perfectly defined, only destructing the branches of simple sum types. However,
the destruct command can also act on products. Let's consider a very ambitious
example where we will build an exhaustive pattern match on a value of type
t * t option, generating all possible cases using destruct alone.
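As a sketch of where repeated destructs should lead, assuming `t` has two constructors, a value of type `t * t option` has 2 × 3 = 6 shapes. The function name `describe` and its result strings are illustrative only.

```ocaml
(* Assumed type for the illustration. *)
type t = Foo | Bar

(* Six branches: two constructors for the left component, times three shapes
   (None, Some Foo, Some Bar) for the right component. *)
let describe (x : t * t option) =
  match x with
  | Foo, None -> "Foo alone"
  | Foo, Some Foo -> "Foo with Foo"
  | Foo, Some Bar -> "Foo with Bar"
  | Bar, None -> "Bar alone"
  | Bar, Some Foo -> "Bar with Foo"
  | Bar, Some Bar -> "Bar with Bar"
```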
It can be seen that, when used interactively, the command saves a lot of time and, coupled with Merlin's real-time feedback regarding errors, lets us quickly ascertain when our pattern matching is exhaustive. In a way, it's a bit like a manual "deriver."
The destruct command can act on any pattern, so it also works within function
arguments (although their representation has changed slightly for 5.2.0), and
in addition to destructing tuples, it is also possible to destruct records,
which can be very useful for our quest for exhaustiveness!
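Here is a hedged sketch of the kind of code one might end up with when patterns sit in a function argument and involve a record. The record type `item`, its fields `tag` and `weight`, and the function `scaled_weight` are all hypothetical, and the code shows a plausible end result rather than Merlin's literal output.

```ocaml
(* Assumed types for the illustration. *)
type t = Foo | Bar
type item = { tag : t; weight : int }

(* The record is taken apart directly in the function argument, then the
   sum-typed field is analysed case by case. *)
let scaled_weight { tag; weight } =
  match tag with
  | Foo -> weight
  | Bar -> 2 * weight
```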
When the Set of Constructors is Non-Finite
Sometimes types are not finitely enumerable. For example, how
are we to handle strings or even integers? In such situations, destruct will
attempt to find an example. For integers, it will be 0, and for strings, it
will be the empty string.
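A small sketch of what matches built around those example values could look like. The function names are hypothetical, and the trailing wildcard branch is my assumption about how the remaining values stay covered, not a statement about Merlin's exact output.

```ocaml
(* 0 is the example value mentioned for integers. *)
let describe_int (n : int) =
  match n with
  | 0 -> "zero"
  | _ -> "non-zero"

(* The empty string is the example value mentioned for strings. *)
let describe_string (s : string) =
  match s with
  | "" -> "empty"
  | _ -> "non-empty"
```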
Excellent! We have covered a large portion of the behaviors of the destruct
command, which are quite contextually relevant. There are others (such as cases
of destruction in the presence of GADTs that only generate subsets of patterns),
but it's time to move on to an example from the real world!
The Quest for Exhaustiveness: Effective ML
In 2010, Yaron Minsky gave an excellent presentation on the reasons (and advantages) for using OCaml at Jane Street. In addition to being highly inspiring, it provides specific insights and gotchas on using OCaml effectively in an incredibly sensitive industrial context (hence the name "Effective ML")! It was in this presentation that the maxim "Make illegal states unrepresentable" was publicly mentioned for the first time, a phrase that would later be frequently used to promote other technologies (such as Elm). Moreover, the presentation anticipates many discussions on domain modeling, which are dear to the Software Craftsmanship community, by proposing strategies for domain reduction (later extensively developed in the book Domain Modeling Made Functional).
Among the list of effective approaches to using an ML language, Yaron presents a
scenario where one might too hastily use the wildcard in a case analysis. The
example is closely tied to finance, but it's easy to transpose into a simpler
example. We will implement an equal function for a very basic type:
type t =
| Foo
| Bar
The equal function can be trivially implemented as follows:
let equal a b =
match (a, b) with
| Foo, Foo -> true
| Bar, Bar -> true
| _ -> false
Our function works perfectly and is exhaustive. However, what happens if we add
a constructor to our type t?
type t =
| Foo
| Bar
+ | Baz
Our function, in the case of equal Baz Baz, will return false, which is
obviously not the expected behavior. Since the wildcard makes our function
exhaustive, the compiler won't emit any warning. That's why Yaron Minsky
argues that a wildcard clause is, in many cases, probably a mistake. If
our function had listed every case explicitly instead of relying on a wildcard,
adding a constructor would have triggered a partial-match warning, forcing us
to explicitly decide how to behave in the presence of the new constructor!
Therefore, using a wildcard in this context deprives us of fearless
refactoring, which is a strength of OCaml. This is indeed an argument in favor
of using a preprocessor to generate equality functions, using, for example, the
eq standard deriver or the more hygienic Ppx_compare.
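The silent failure described above can be reproduced in a few lines. This is the same wildcard-based equal function, applied to the type extended with Baz; the assertions at the end are only there to make the surprising result visible.

```ocaml
type t = Foo | Bar | Baz

(* Same definition as before: it still compiles without any warning after
   Baz is added, because the wildcard swallows every new combination. *)
let equal a b =
  match (a, b) with
  | Foo, Foo -> true
  | Bar, Bar -> true
  | _ -> false

let () =
  assert (equal Foo Foo);
  assert (not (equal Foo Bar));
  (* The surprising result: Baz is not considered equal to itself. *)
  assert (not (equal Baz Baz))
```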
But sometimes, using a preprocessor is not possible. Fortunately, the destruct
command can assist us in defining an exhaustive equality function!
We will proceed step by step, separating the different cases and using nested pattern matching so that each case is easy to express in a repetitive manner.
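A sketch of the kind of definition this process leads to, for the two-constructor version of t. The nested layout is one plausible way to arrange it: destruct on `a` first, then destruct on `b` inside each branch, and finally write the boolean results by hand.

```ocaml
type t = Foo | Bar

(* No wildcard anywhere: each constructor of a gets its own branch, and each
   branch enumerates every constructor of b. *)
let equal a b =
  match a with
  | Foo -> (
      match b with
      | Foo -> true
      | Bar -> false)
  | Bar -> (
      match b with
      | Foo -> false
      | Bar -> true)
```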
As we can see, destruct allows us to quickly implement an exhaustive equal
function without relying on wildcards. Now, we can add our Baz constructor to
see how the refactoring unfolds! By adding a constructor, we quickly detect a
recurring pattern where we try to give the destruct command as much leeway
as possible to generate the missing patterns!
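For reference, here is a sketch of the function once Baz has been handled everywhere, assuming the nested layout with one inner match per constructor of the first argument. With a wildcard-free definition, adding Baz makes every inner match (and the outer one) non-exhaustive, so each spot that needs attention is reported and can be completed with destruct.

```ocaml
type t = Foo | Bar | Baz

(* Every match lists all three constructors explicitly, so adding a fourth
   constructor later would again be flagged by the compiler. *)
let equal a b =
  match a with
  | Foo -> (
      match b with
      | Foo -> true
      | Bar -> false
      | Baz -> false)
  | Bar -> (
      match b with
      | Foo -> false
      | Bar -> true
      | Baz -> false)
  | Baz -> (
      match b with
      | Foo -> false
      | Bar -> false
      | Baz -> true)
```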
Fantastic! We were able to quickly implement an equal function. Adding a
new case is trivial, leaving destruct to handle all the work!
Coupled with modern text editing features (e.g., using multi-cursors),
it's possible to save a tremendous amount of time! Another example of the
immoderate use of destruct (but too long to be detailed in this article) was
the Mime module implementation in YOCaml for generating RSS feeds.
In Conclusion
Paired with a formatter like OCamlFormat (to neatly reformat generated code
fragments), destruct is an unconventional tool in the IDE landscape. It aligns
with algebraic types and pattern matching to simplify code writing and move
towards code that is easier to refactor and thus maintain!
Aware of the command's utility, the Merlin team continues to maintain it,
keeping pace with the latest features of OCaml to make the command as usable as
possible in as many contexts as possible!
I hope this collection of illustrated examples has motivated you to use the
destruct feature if you were not already aware of it. Please do not hesitate to
send us ideas for improvements, fixes, and fun use cases via X or LinkedIn!
Happy Hacking.