Feature Parity Series: Improving Developer Tooling on macOS

When considering which projects to focus on, our highest priority tends to be those that restore support for tools users rely on or introduce new tools that address a compelling problem. One of those tools is the LLDB debugger, which needed some attention after the OCaml 5 update.

LLDB is the primary supported debugger for macOS and comes included with Xcode as part of Apple's developer tools. It supports both the ARM64 and AMD64 platforms, which OCaml also supports. Ensuring a smooth macOS experience is crucial, as it enables development on the Apple hardware used in the community and by Tarides engineers. This post provides an overview of the work done to enhance the macOS debugging experience for OCaml developers.

LLDB and Our Goals

Let's begin with some context about the technology we're discussing today. Debuggers are tools used to trace, manipulate, and visualise the state of a target program running on a target system. Developers use debuggers for tasks like tracing a program's control flow, inspecting the values of variables during execution, halting the program at predetermined locations, and executing functions within the running process.

Why focus on LLDB? It is a well-maintained project that supports many of the platforms we target, including macOS, iOS, FreeBSD, Windows, and Linux. Most significantly for OCaml, LLDB is the only supported debugger on ARM64 macOS (an important and popular developer platform) and comes included with Xcode. Providing a debugging experience on macOS therefore requires LLDB. GDB, another well-known open-source debugger, is unfortunately unavailable for ARM64 macOS.

Based on our usage, we recognised that LLDB needed some attention and initially raised issue #12933, highlighting that setting breakpoints within LLDB was broken. Further investigation revealed other problems, such as backtraces being printed with incorrect results. Additionally, we saw the opportunity to bring GDB's existing OCaml value printing to LLDB and to run debugger tests within OCaml's test suite.

To make an impact with LLDB support, we focused on:

Fixing how to create breakpoints in LLDB
Porting GDB's Python-based value printers to LLDB
Improving the debugging information emitted by the OCaml compiler

Breakpoints and Name Mangling

Breakpoints are a common feature in debuggers; they allow developers to halt program execution when a specific piece of code is executed. LLDB provides several methods for creating breakpoints: you can use a memory address, specify the name of a function, or combine a filename with a line number. Happily, creating breakpoints from memory addresses worked, but the other two methods were broken.
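For readers who script LLDB, the three styles map onto methods of `lldb.SBTarget` in LLDB's Python API. The sketch below is only an illustration: the helper name `set_breakpoints` and its parameters are hypothetical, and `target` is assumed to be an `lldb.SBTarget` obtained inside an LLDB session.

```python
def set_breakpoints(target, address=None, function=None, location=None):
    """Create breakpoints on an lldb.SBTarget, one per style requested.

    `set_breakpoints` is a hypothetical helper; only the SBTarget method
    names come from LLDB's Python API.
    """
    created = []
    if address is not None:
        # Style 1: a raw memory address.
        created.append(target.BreakpointCreateByAddress(address))
    if function is not None:
        # Style 2: a function name (for OCaml, the mangled symbol name).
        created.append(target.BreakpointCreateByName(function))
    if location is not None:
        # Style 3: a source filename combined with a line number.
        filename, line = location
        created.append(target.BreakpointCreateByLocation(filename, line))
    return created
```

Inside LLDB's embedded interpreter, `target` would typically come from `lldb.debugger.GetSelectedTarget()`.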

To understand how LLDB can set a breakpoint using a function name, we must explain how the compiler treats source-level names in OCaml and how that impacts LLDB. In the OCaml compiler, there is a process called name mangling, which involves turning the name of a program entity in OCaml into a form that is unique and can be linked against. Often, there are repeated names for particular functions or even modules in an OCaml codebase, and the compiler needs to generate unique names for them before sending them all to the linker, which is responsible for producing the final executable as a Mach-O binary.

Concretely, when setting a breakpoint based on a function name, it is necessary to use the mangled name produced by OCaml. During OCaml 5 development (while fixing a linking bug), the name mangler was changed to generate names like camlModule.function_name. Let's illustrate how this works with an example. Consider this Fibonacci program:

(* fib.ml *)
let rec fib n =
  if n = 0 then 0
  else if n = 1 then 1
  else fib (n-1) + fib (n-2)

let main () =
  let r = fib 20 in
  Printf.printf "fib(20) = %d" r

let _ = main ()

The main function would get the name camlFib.main_123, and the fib function would be camlFib.fib_271; note that the trailing number can change between different runs of the compiler. These names are then used to create breakpoints within LLDB. Tab completion helps out, allowing you to specify part of the name and use tab to fill in the rest.

Unfortunately, using . in these names conflicted with LLDB, which used . for other purposes and wouldn't accept names with it present. Rather than modifying LLDB itself, we needed to alter the separator used in mangled names.

A one-character change seems simple, right? The other program that consumes these mangled names is the linker, and linkers restrict which characters can appear in a name, often to a printable subset of ASCII. This problem had already appeared in the MSVC porting work, where the linker on that platform wouldn't accept . either and a workaround was introduced to use $ instead. That was the approach we decided to take, and we modified name mangling for all platforms to use $. PR #13050 describes how this change impacts other areas. The change will appear in OCaml 5.4, fixing the problem of setting breakpoints using mangled names and providing consistent names across platforms.
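To make the difference concrete, here is a small sketch of the name shape described above. The `mangle` function is hypothetical and deliberately simplified: the real mangler lives in the OCaml compiler and handles cases this ignores, and the numeric stamps are just the illustrative ones used earlier.

```python
def mangle(module, name, stamp, separator="$"):
    """Build a symbol in the shape caml<Module><sep><name>_<stamp>.

    Hypothetical helper for illustration only; it is not the compiler's
    implementation.
    """
    return f"caml{module}{separator}{name}_{stamp}"

# Shape used in OCaml 5 before the change: '.' as the separator.
old_style = mangle("Fib", "fib", 271, separator=".")
# Shape after the change described above: '$' as the separator.
new_style = mangle("Fib", "fib", 271)

print(old_style)  # camlFib.fib_271
print(new_style)  # camlFib$fib_271
```

Note that `$` is special in most shells, so a name like this needs quoting if it is passed on a command line.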

Setting breakpoints based on filename and line number was a more exciting experience, which we will cover in a later post.

Printing OCaml Values Using Python

OCaml uses a uniform memory representation in which all values can be kept in a single machine word, typically 64 bits on modern hardware. An OCaml value is either an immediate integer or a pointer to some other memory representing the value. This representation has implications for debuggers, as understanding how to print an OCaml value requires familiarity with this memory structure.
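As a rough sketch of what a debugger script has to do with a raw word: OCaml marks immediate integers by setting the lowest bit, so an integer n is stored as 2n + 1, while pointers are word-aligned and have that bit clear. The helper names below are hypothetical and assume 64-bit words.

```python
WORD_BITS = 64
WORD_MASK = (1 << WORD_BITS) - 1

def encode_int(n):
    """Return the 64-bit word OCaml uses for the integer n (2n + 1)."""
    return ((n << 1) | 1) & WORD_MASK

def decode_int(word):
    """Recover the integer from an immediate word (arithmetic shift)."""
    if word >= 1 << (WORD_BITS - 1):
        word -= 1 << WORD_BITS  # reinterpret the word as signed
    return word >> 1

def describe(word):
    """Classify a raw word the way a value printer must do first."""
    if word & 1:
        return f"int {decode_int(word)}"
    return f"pointer 0x{word:x}"

print(describe(encode_int(6765)))   # int 6765
print(describe(0x7F0000001000))     # pointer 0x7f0000001000
```

This is why a debugger that knows nothing about OCaml shows fib 20 as 13531 rather than 6765.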

Both GDB and LLDB can be extended using Python as a scripting language. This capability enables developers to implement custom printing formats for values, add new commands, and perform various other useful functions. GDB already had macros for printing OCaml values; rewriting them in Python would allow them to be used with both debuggers.

The resulting core ocaml.py library understands OCaml's uniform memory representation and how to print out values. The GDB-specific file gdb.py handles integrating with GDB's value printer, and a similar lldb.py exists for LLDB. The previous GDB macro file was retained for backward compatibility, but it now prints a deprecation warning when used. Beyond the core printing functionality, the new system also introduces two improved commands, ocaml and ocaml find. The ocaml command is introduced in PR #13136 and leaves room for future sub-commands, while ocaml find is the heap search command, adapted from the earlier GDB macros.
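One way to picture a debugger-independent core is a formatter that only asks for a function to read a word of memory, leaving each debugger integration to supply that function. The sketch below is an assumption about the general shape, not the contents of ocaml.py: `format_value` and `read_word` are hypothetical names, and it assumes the default 64-bit block header layout (tag in the low 8 bits, size in words above bit 10).

```python
def format_value(word, read_word, depth=3):
    """Render an OCaml value given a way to read one word of memory.

    `read_word(address)` is the only debugger-specific piece; a GDB or
    LLDB integration would each provide its own implementation.
    """
    if word & 1:
        # Immediate integer (non-negative values only, for brevity).
        return str(word >> 1)
    header = read_word(word - 8)  # the header sits just before the block
    tag = header & 0xFF
    size = header >> 10
    if tag != 0 or depth == 0:
        # Only tag-0 blocks (tuples, records) are expanded in this sketch.
        return f"<block tag={tag} size={size}>"
    fields = [
        format_value(read_word(word + 8 * i), read_word, depth - 1)
        for i in range(size)
    ]
    return "(" + ", ".join(fields) + ")"

# A fake heap holding the tuple (1, 2): header, then two tagged integers.
heap = {0x1000: (2 << 10) | 0, 0x1008: 3, 0x1010: 5}
print(format_value(0x1008, heap.__getitem__))  # (1, 2)
```

The depth limit keeps the sketch from looping forever on cyclic values.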

Check out PR #13136 for more details, including several examples of what formatting you can expect when working with the debuggers. The end result is a Python-based solution shared between the two debuggers that can be easily extended in future.

Improving the Debugging Information

Debugging information is any data that is required by the debugger to perform its task. Often, this is extra information outside of the main executable. For example, a debugger needs to associate machine code in an executable with the source code used to produce it. DWARF is one such debug information format used on macOS and Linux systems.

We identified two issues with the debug information produced by OCaml:

Printing backtraces produced incorrect results
LLDB would not display the OCaml source for an executable

A backtrace is a visual representation of the current call stack for a program. There are two ways a debugger generates a backtrace (a process called unwinding): Call Frame Information (CFI) or Frame Pointers. CFI is part of the larger DWARF specification and is already used in the OCaml compiler. The team therefore started by understanding the CFI emitted by the compiler and how to validate fixes to it. What followed was a series of PRs improving CFI.
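Of the two approaches, frame-pointer unwinding is the simpler to sketch. On both ARM64 and AMD64, each frame that maintains a frame pointer stores the caller's frame pointer at the address it points to, with the return address in the next word. The simulation below uses a plain dictionary as memory; the addresses and the `unwind` helper are made up for illustration.

```python
def unwind(memory, fp, pc, max_frames=64):
    """Follow the chain of saved frame pointers to collect a backtrace.

    memory[fp] holds the caller's frame pointer and memory[fp + 8] holds
    the return address into the caller.
    """
    frames = [pc]
    while fp != 0 and len(frames) < max_frames:
        return_address = memory[fp + 8]
        fp = memory[fp]
        frames.append(return_address)
    return frames

# Three simulated frames; a saved frame pointer of 0 ends the chain.
stack = {
    0x7000: 0x7020, 0x7008: 0x1040,  # innermost frame
    0x7020: 0x7040, 0x7028: 0x2010,  # its caller
    0x7040: 0x0000, 0x7048: 0x3000,  # outermost frame
}
print([hex(a) for a in unwind(stack, fp=0x7000, pc=0x1010)])
# ['0x1010', '0x1040', '0x2010', '0x3000']
```

The chain only works if every frame preserves the frame pointer register. If one function overwrites it with something else, the walk reads garbage or stops early, and the backtrace loses frames.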

The first PR, #13079, focused on fixing the backtraces for macOS on the ARM64 platform. Somewhere in the 5.1.1 update, the CFI that OCaml produced changed in a way that caused LLDB to lose part of the stack trace when moving between C and OCaml frames.

The bug was caused by a difference in how frame pointers were handled: C frames maintained them while OCaml frames did not. The OCaml code generator reused the x29 frame pointer register in the Iextcall fast path when calling from OCaml to C. Once identified, the fix involved correctly saving the x29 register, resulting in better backtraces for ARM64 on both macOS and Linux. PR #13595 is a follow-up bugfix for CFI-based backtraces, where the wrong CFA register was used. This part of the code was rewritten when frame pointer support was added for ARM64 on macOS and Linux in #13500. Speaking of frame pointers, #13163 enabled them on the other macOS platform, AMD64. Now, debuggers can use either CFI or frame pointers to unwind the call stack when printing backtraces.

The issue of LLDB not displaying OCaml source code for an executable was due to missing DWARF information, which is necessary on macOS but not on other platforms. We understand how to fix it and are working on a solution. Interestingly, the same lack of DWARF information is why setting breakpoints based on filenames and line numbers is broken.

Extras

While working on CFI for ARM64, we found and fixed CFI issues on other platforms, in particular Linux on ARM64 and RISC-V. The three PRs, #13241, #13261, and #13271, address incorrect Call Frame Information (CFI) for LLDB and GDB. These bugs highlighted how difficult it is to ensure that CFI is correct and that debuggers work as expected. Although both GDB and LLDB use CFI, their interpretations of the specification sometimes differ, leading to subtle bugs.

To allow users to check their debugger's functionality and to help ensure that future changes don't break the debugging experience, the team enabled both the GDB and LLDB native debuggers to run as part of the OCaml test suite. Since both debuggers provide Python APIs to facilitate interaction between them and the test suite, programmers can write Python-scripted tests for their code. The PR #13199 has more details. Since merging this change, we have discovered a few bugs, such as #13509. The idea of writing a debugger test suite in Python is so good that it is what LLDB does, so we are in good company!

Finally, we gathered all the details we discovered about OCaml debugging and added a chapter to the OCaml manual documenting what we learnt. PR #13737 does just that, covering more technical details that might be interesting.

Until Next Time

Let us know your experience with debugging OCaml 5 programs! You can always share your feedback or questions on Discuss. For more information on how to debug OCaml with LLDB, check out Tim McGilchrist's blog post, which provides an excellent overview.

You can connect with us on Bluesky, Mastodon, Threads, and LinkedIn or sign up for our mailing list to stay updated on our latest projects. We look forward to hearing from you!
