Internship - Optimising tree data structures for very large directories

Tarides 

Tarides is a tech start-up founded in Paris in 2018 by pioneers of programming languages and cloud computing. Tarides develops a software infrastructure platform to deploy secure, distributed applications with strict resource constraints and low-latency performance requirements. Today, Tarides is composed of a diverse team of 35+ people.

Tarides was part of the Founder program of Station F in 2018 (6% acceptance rate) and was selected in France for the “Concours d’Innovation i-Lab”, organized by the French Ministry of Higher Education, Research and Innovation in partnership with Bpifrance (15% acceptance rate). This national contest rewards company creation and innovative technologies. Tarides was also recognised at FIC 2020 (International Cybersecurity Forum), the leading European event on cybersecurity. These awards acknowledge the innovation of the solutions developed by Tarides and underline the interest from the cybersecurity community.


Internship at Tarides 

Tarides internships are an excellent opportunity to participate in open-source functional programming with tangible real-world applications.

Our interns each work on a personal project that will have a meaningful impact on the project and the wider OCaml open-source ecosystem. Each intern is assigned a mentor at Tarides to give advice and guidance when necessary. Below are ideas for potential internship topics. These are intended as suggestions only; if you're excited about a particular aspect of our work at Tarides, let us know and we'll do our best to accommodate you.


Context

The trees in Irmin are composed of nodes and blobs. In Irmin-pack, the backend used for Tezos, the nodes are called inodes and are optimised for very large directories: a single node can have up to 3 million children. Inodes are similar to B-trees: a large node organises its children in a tree, and each intermediate node of that tree can hold up to n children. Inodes are themselves Merkle trees: the hash of the root of an inode is obtained by hashing all the internal nodes of the inode tree. The shape of that tree, and therefore its hash, must be the same regardless of the order in which values are added.

Lookup and insertion in inodes have the same complexity as in binary trees, O(log n), and the on-disk representation of inodes is also optimised. Finally, the memory cost of these operations needs to be bounded.

The scope of this internship is to extract the inode data structure from Irmin, and to investigate different data structures that can replace it.

The inodes currently used in Irmin have fixed parameters (for instance, the number of children in the nodes and in the leaves), so a first step in the quest for better inodes is to look for better parameters.
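As a rough illustration of what exploring parameters can look like, the sketch below estimates how many levels of intermediate nodes a directory needs for a few candidate branching factors. The function name, the candidate values and the leaf size of 32 are hypothetical and are not Irmin's actual settings; the estimate also assumes children are spread evenly across the tree.

```python
def estimated_depth(n_children: int, branching: int, max_leaf: int) -> int:
    """Levels of intermediate nodes needed, assuming an even spread."""
    depth = 0
    capacity = max_leaf  # children a single leaf can hold
    while capacity < n_children:
        capacity *= branching  # one more level multiplies the capacity
        depth += 1
    return depth


# Compare candidate branching factors for a directory of 3 million children.
for branching in (2, 16, 32, 64):
    print(branching, estimated_depth(3_000_000, branching, max_leaf=32))
```

A real study would measure hashing cost, disk reads and memory use on actual workloads rather than rely on such an estimate.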

 

Qualifications 

(You don’t have to meet 100% of the qualifications to apply.)

- Non-trivial functional programming experience

- Ability to read scientific publications and implement concepts described in them

- Basic knowledge of algorithms and data structures

  

What we offer

  • Nice office in Paris (Place de la Contrescarpe, Paris 5)
  • Flexible working hours and possibility to work remotely
  • Supportive team environment with experienced Technical and Team Leads
  • A “ticket restaurant” card 
  • 100% of public transportation pass reimbursed


Process

If shortlisted, you will have two online interviews: a general interview followed by a technical interview.

We welcome applications from people of all backgrounds. We strive to create a representative, inclusive and friendly team, because we know that different experiences, perspectives and backgrounds make for a better workplace.
