Diagnostics and recovery tool for Irmin backends

Mentor: Clément Pascutto

Location: Tarides office, Paris

Irmin is an OCaml library for building mergeable, branchable distributed data stores. It is built on the same principles as Git.

irmin-pack is a disk-optimized backend for Irmin that follows the pack-file architecture of Git. It was released in November 2019 and is now used in production by hundreds of users to store the Tezos ledger, providing a 10x size reduction compared to the former storage solution while maintaining similar speed and memory performance. Although it offers some integrity and consistency guarantees even after an unexpected event (e.g. a crash), misusing the storage may lead to inconsistent states or make the data impossible to recover.

The goal of this internship is to build a tool that inspects the storage contents to check for possible corruption and helps repair it when possible. The tool should also provide ways to extract usage statistics about the backend, relying on the existing benchmarking infrastructure. It will be integrated into the irmin command-line tool and released as part of the irmin-unix package. It should be generalized into a unified framework for diagnostics and recovery that works with any Irmin backend.

Necessary skills

  • Non-trivial functional programming experience

For more information and to apply, please email clement@tarides.com.
