Diagnostics and recovery tool for Irmin backends
Mentor: Clément Pascutto
Location: Tarides office, Paris
Irmin is an OCaml library for building mergeable, branchable distributed data stores. It is built on the same principles as Git.
irmin-pack is a disk-optimized backend for Irmin that follows the pack file
architecture of Git. It was released in November 2019 and is now used in
production by hundreds of users to store the Tezos ledger, providing a 10x
size reduction compared to the former storage solution while retaining
similar speed and memory performance. Although irmin-pack offers some
integrity and consistency guarantees even after an unexpected event (e.g. a
crash), misusing the storage may lead to inconsistent states or make the data
unrecoverable.
The goal of this internship is to build a tool that inspects the storage
contents to detect possible corruption and helps repair it when possible. The
tool should also provide ways to extract usage statistics about the backend,
relying on the existing benchmarking infrastructure. It will be integrated
into the irmin command-line tool and released as part of the irmin-unix
package. It should be generalized into a unified framework for diagnostics
and recovery that works with any Irmin backend.
- Non-trivial functional programming experience