
Secure Virtual Messages in a Bottle with SCoP

Irina Mariuca Asavoea

Senior Software Engineer

Christine Rose

Technical Writer

Posted on Tue, 08 Mar 2022

People love to receive mail, especially from loved ones. It’s heartwarming to read each word as their thoughts touch our deepest feelings. Now imagine someone else, like a postal worker, reading those private sentiments. Imagine how violated you’d feel if your postal carrier handed you an opened letter with a knowing smile. Of course, people trust that postal employees won’t read their personal correspondence; however, they regularly risk their privacy when sending emails, images, and messages.

Around 300 billion emails traverse the Internet every single day. They travel through portals with questionable security, and the messages often contain private or sensitive data. Most online communication services are composed of multiple components with complex interactions. If anything goes wrong, it can result in critical security incidents, leaving an unlocked door for malicious hackers to breach private information for profit or just for fun. Since it takes considerable technical skill and reliable infrastructure to operate a secure email service, most Internet users must rely on third-party operators. In practice, only a few large companies can handle communications with the proper security levels. Unfortunately for regular people, these companies profit from mining their personal data. In response to this global challenge, Tarides focused its efforts on addressing these issues and finding solutions to protect both personal and professional data.

An Innovative Solution

Our work resulted in the project "Secure-by-Design Communications Protocols" (SCoP), a secure, easily deployable solution to preserve users' privacy. In essence, SCoP puts your messages in a secure, virtual ‘bottle’ to protect them from invasive actions. This bottle represents a secure architecture using type-safe languages and unikernels for both email and instant messaging. We mould unikernels (specialised applications that run on a VM) into refined meshes linked by TLS-secured communication pipes, as depicted in the image below.

Figure: TLS communication pipes

The SCoP virtual bottle creates a trustworthy information flow where dedicated unikernels ensure security for communication from origin to destination. We carefully design every component of SCoP as an independent library, using modern development techniques to avoid commonly reported threats and flaws. The OCaml-based development enables this safe online environment, which eliminates many frequently exploited security pitfalls. Moreover, SCoP keeps energy consumption low thanks to its lightweight, low-latency components.

We mostly focused on the sender’s side, securing the message inside the SCoP bottle. For instant messages, we created a capsule with a Matrix client library, and for emails we based our bottle on the SMTP protocol and Mr. MIME. For further protection, we developed the bottle’s ‘cork’ with the Hamlet email corpus.

The SCoP Processes

First, we generated Hamlet, a collection of emails, to test our parser implementation against existing projects and to ensure that the encoder and decoder stay equivalent, meaning that decoding an encoded email gives back the original. After we successfully parsed and encoded one million emails, we used Hamlet to stress-test our SMTP stack.
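
The encoder/decoder equivalence check can be pictured as a round-trip test. The following is only a minimal sketch, written in Python with the standard `email` module for brevity; it is not Mr. MIME or the Hamlet tooling, and the helper name `roundtrip_ok` and the sample message are hypothetical.

```python
from email import message_from_string


def roundtrip_ok(raw: str) -> bool:
    """Parse, re-encode, and re-parse a message; report whether anything changed."""
    first = message_from_string(raw)
    second = message_from_string(first.as_string())
    same_headers = first.items() == second.items()
    same_body = first.get_payload() == second.get_payload()
    return same_headers and same_body


# Hypothetical single-part message standing in for one corpus entry.
SAMPLE = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: Message in a bottle\n"
    "\n"
    "Hello from the shore.\n"
)

if __name__ == "__main__":
    print(roundtrip_ok(SAMPLE))
```

Running such a check over a large corpus is one way to surface inputs where the parser and the encoder disagree.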

Secondly, we created an SMTP extension mechanism and support for SPF, including an implementation of DMARC, a security framework that builds on DKIM and SPF. We completed four components: SPF, DKIM, SMTP, and Mr. MIME, which together can generate a correctly signed email, its signatures, and the DKIM field containing the signatures.

In essence, we designed the SMTP sender bottle with a mesh of unikernels connected via secured communication pipes. The SMTP Submission Server unikernel receives the sender’s authentication credentials and checks them against the secured database maintained by Irmin. After it confirms the credentials, it sends the email for sealing (via a TLS pipe) to the DKIM signer. Then the DKIM signer unikernel, responsible for handling IP addresses, communicates via the nsupdate protocol with the Primary DNS Server. The DKIM signer places the sender’s and receiver’s addresses on the email, seals it with the DKIM signature, and sends it to the SMTP relay for distribution. The SMTP relay unikernel communicates with the DNS resolver unikernel to locate the receiver by its DNS name, then it coordinates this location with the Irmin database to verify the authorisation according to the SPF protocol. After all these checks have passed, the signed and sealed email is secured in the SCoP bottle and launched through Cyberspace.
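
To make the order of the stages easier to follow, here is a rough sketch of the sender-side flow as three chained functions. It is written in Python for brevity and is not the SCoP code: every name in it (`Email`, `CREDENTIALS`, `submission`, `dkim_sign`, `relay`, `send`) is hypothetical, the credential store is a plain dictionary rather than Irmin, the "signature" is a placeholder digest rather than a real DKIM signature, and the TLS pipes, nsupdate exchange, and SPF check are left out.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass
class Email:
    sender: str
    recipient: str
    body: str
    headers: dict = field(default_factory=dict)


# Hypothetical stand-in for the credential database. A real store would use
# a salted, deliberately slow password hash rather than bare SHA-256.
CREDENTIALS = {"alice@example.org": hashlib.sha256(b"correct horse").hexdigest()}


def submission(email: Email, password: str) -> Email:
    """Stage 1: accept the email only if the sender's credentials check out."""
    expected = CREDENTIALS.get(email.sender, "")
    supplied = hashlib.sha256(password.encode()).hexdigest()
    if not hmac.compare_digest(expected, supplied):
        raise PermissionError("authentication failed")
    return email


def dkim_sign(email: Email) -> Email:
    """Stage 2: attach a placeholder where the DKIM signature would go."""
    digest = hashlib.sha256(email.body.encode()).hexdigest()
    email.headers["DKIM-Signature"] = f"placeholder-digest={digest}"
    return email


def relay(email: Email, mx_table: dict) -> str:
    """Stage 3: look up where the recipient's domain receives mail."""
    domain = email.recipient.split("@", 1)[1]
    return mx_table[domain]


def send(email: Email, password: str, mx_table: dict) -> str:
    """Run the three stages in order and return the chosen destination host."""
    return relay(dkim_sign(submission(email, password)), mx_table)
```

In the actual architecture each stage is a separate unikernel, so a failure in one stage cannot corrupt the state of the others.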

Next, we developed the Matrix protocol’s client library, and we used it to enable notifications from the CI system that tests all the new OCaml packages. We also designed an initial PoC for a Matrix server-side daemon.

We made significant progress in deploying DNSSEC, a set of security extensions over DNS. While we completed our first investigation into the DNSSEC prototype, we also discovered several issues, so we addressed those as lessons learned.

Finally, we completed the SCoP bottle with the email receiver, which Spamtacus (the Bayesian spam filter) guards against spam intruders. Furthermore, the OCaml-Matrix server represents our solution to take care of the instant communication in the Matrix federation.

A Secure-by-Design SMTP Stack

We researched state-of-the-art email spam filtering methods and identified machine learning as the main trend. We followed this path and equipped our email architecture with a spam-filter unikernel, which uses a Bayesian method for supervised learning of spam and acts as a proxy for internet communication in the SMTP receiver. This spam filter works in two stages: preparation, where the unikernel is trained to detect spam, and operation, where the unikernel integrates into the SMTP receiver unikernel architecture to filter spam emails. Our spam-filter unikernel can also be used independently as an individual anti-spam tool to help enforce GDPR rules and protect the user’s privacy by preventing spam-induced attacks, such as phishing.

We integrated our spam filter into a unikernel positioned at the beginning of the SMTP receiver stack. There it acts as a first line of defence in an eventual attack targeting the receiver: the unikernel format isolates the attack and allows the SMTP receiver to maintain functionality. The spam-filter unikernel can be extended to act as an antivirus by analysing email attachments for certain features known to characterise malware. We’ve already laid the groundwork for the antivirus with a prototype analysis of email attachments. Moreover, the spam-filter unikernel can contribute a list of frequent spammers to the firewall, which we plan to add to the SMTP receiver as the next step in our development of SCoP.

How the Technology Works

DKIM, SPF, and DMARC are three email authentication protocols meant to ensure email security by verifying the sender’s identity. The RFCs our implementations follow for DKIM, SPF, and DMARC are RFC8463 (which updates the base DKIM specification, RFC6376), RFC7208, and RFC7489, respectively.

DKIM provides a signer protocol and the associated verifier protocol. The DKIM signer lets the sending domain mark the emails it considers legitimate. Our implementation of the DKIM verifier is associated with the SMTP receiver. It follows the RFC8463 standard and supports the Ed25519 signing algorithm, i.e., the elliptic curve cryptography generated from the formally verified specification in the fiat project from MIT.
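
As an illustration of what a verifier reads first, the sketch below splits a `DKIM-Signature` header value into its tags and checks whether it announces the `ed25519-sha256` algorithm defined in RFC8463. It is written in Python for brevity and is not our OCaml verifier; the function names are hypothetical, the `bh` and `b` values in the example are dummy strings, and whitespace handling is simplified compared to the RFC grammar.

```python
def parse_dkim_tags(header_value: str) -> dict:
    """Split a DKIM-Signature value into a {tag: value} dictionary."""
    tags = {}
    for part in header_value.split(";"):
        if "=" not in part:
            continue  # skip empty trailing segments
        name, value = part.split("=", 1)
        # Simplification: drop all whitespace, including header folding.
        tags[name.strip()] = "".join(value.split())
    return tags


def uses_ed25519(tags: dict) -> bool:
    """True when the signature claims the Ed25519-SHA256 algorithm."""
    return tags.get("a", "").lower() == "ed25519-sha256"


# Illustrative header value; the bh= and b= contents are dummy data.
EXAMPLE = (
    "v=1; a=ed25519-sha256; c=relaxed/relaxed; d=example.org; s=selector1;\n"
    " h=from:to:subject; bh=ZHVtbXktYm9keS1oYXNo; b=ZHVtbXktc2lnbmF0dXJl"
)

if __name__ == "__main__":
    parsed = parse_dkim_tags(EXAMPLE)
    print(parsed["d"], parsed["s"], uses_ed25519(parsed))
```

A real verifier then fetches the public key published in DNS for the `s` selector under the `d` domain and checks the `b` signature over the listed headers and the body hash.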

SPF is an open standard that specifies a method to identify legitimate mail sources using DNS records, so email recipients can consult a list of IP addresses to verify that the emails they receive come from an authorised domain. Hence, SPF works on an allowlisting principle: mail from sources that the domain has not authorised can be rejected, which helps control and prevent sender fraud. Our implementation of the SPF verifier follows the RFC7208 standard.
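
The lookup described above can be sketched as follows. This is a deliberately reduced illustration in Python, not our RFC7208 implementation: the function name `spf_check` is hypothetical, the SPF record is passed in as a string instead of being fetched from DNS, and only the `ip4`, `ip6`, and `all` mechanisms are evaluated (`a`, `mx`, `include`, `redirect`, and macros are ignored).

```python
import ipaddress

# Qualifier prefixes and the result each one yields when its mechanism matches.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_check(record: str, sender_ip: str) -> str:
    """Evaluate a simplified SPF record against the connecting IP address."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # no usable SPF record
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier is "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        mechanism, _, value = term.partition(":")
        mechanism = mechanism.lower()
        if mechanism in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
        elif mechanism == "all":
            return QUALIFIERS[qualifier]
    return "neutral"  # nothing matched


if __name__ == "__main__":
    record = "v=spf1 ip4:192.0.2.0/24 -all"
    print(spf_check(record, "192.0.2.10"))
    print(spf_check(record, "198.51.100.7"))
```

The first matching mechanism decides the result, which is why a trailing `-all` rejects every source the domain has not listed.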

DMARC (Domain-based Message Authentication, Reporting, and Conformance) enables a sender to indicate that their messages are protected by SPF and DKIM, and gives the recipient clear instructions to follow if an email does not pass SPF or DKIM authentication (reject, junk, etc.). As such, DMARC is used to create domain reputation lists, which can help determine the actual email source and mitigate spoofing attacks. Our implementation of the DMARC verifier is integrated in the SMTP receiver and follows the RFC7489 standard.
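
The decision a receiver derives from a DMARC record can be sketched like this. It is an illustration in Python, not our verifier: the function names and the returned labels (`deliver`, `quarantine`, `reject`) are hypothetical, the record is passed in as a string rather than fetched from DNS, identifier alignment is assumed to have been checked by the caller, and tags such as `pct` and `sp` are ignored.

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into a {tag: value} dictionary."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip().lower()] = value.strip()
    return tags


def dmarc_disposition(record: str, spf_aligned_pass: bool, dkim_aligned_pass: bool) -> str:
    """Decide what to do with a message under the domain's published policy."""
    tags = parse_dmarc(record)
    if tags.get("v") != "DMARC1":
        return "deliver"  # no valid policy published
    if spf_aligned_pass or dkim_aligned_pass:
        return "deliver"  # one aligned pass is enough for DMARC to pass
    policy = tags.get("p", "none").lower()
    if policy in ("quarantine", "reject"):
        return policy
    return "deliver"  # p=none asks for monitoring only


if __name__ == "__main__":
    record = "v=DMARC1; p=reject; rua=mailto:reports@example.org"
    print(dmarc_disposition(record, spf_aligned_pass=False, dkim_aligned_pass=False))
```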

Our secure-by-design SMTP stack contains the DKIM/SPF/DMARC verifier unikernel on the receiver side. This unikernel verifies the email sender’s DNS characteristics via a TLS communication pipe. If the DNS verification passes, the spam-labelled email goes to the SMTP relay to be dispatched to the email client. If the DNS verification doesn’t pass, we can use the result to construct a DNS reputation list to improve the SMTP security via a blacklisting firewall.
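
One possible shape for such a reputation list is a counter of failed verifications per sending domain, with a threshold above which a domain is handed to the firewall. The sketch below is written in Python for brevity and is purely hypothetical: the class name `ReputationList`, its methods, and the threshold value are assumptions, not part of SCoP.

```python
from collections import Counter


class ReputationList:
    """Track verification failures per domain and list repeat offenders."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold  # failures needed before blocking (assumed value)
        self.failures = Counter()

    def record(self, domain: str, verified: bool) -> None:
        """Register the outcome of one DKIM/SPF/DMARC verification."""
        if not verified:
            self.failures[domain.lower()] += 1

    def blocked(self) -> list:
        """Domains whose failure count has reached the threshold."""
        return sorted(d for d, n in self.failures.items() if n >= self.threshold)


if __name__ == "__main__":
    reputation = ReputationList(threshold=2)
    for domain, ok in [("spoof.example", False), ("good.example", True), ("spoof.example", False)]:
        reputation.record(domain, ok)
    print(reputation.blocked())
```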

Matrix Server

The Matrix server in our OCaml Matrix implementation manages clients who are registered to rooms that contain events. Events represent client actions, such as sending a message. Our implementation follows the Matrix specification standard, from which we extracted the parts describing the subset of Matrix components we chose to implement for our OCaml Matrix server MVP. The OCaml implementation environment provides secure-by-design properties and avoids various vulnerabilities, like the recently discovered buffer overflow that produces considerable information disclosure in other Matrix implementations, e.g., Element.
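
The relationship between clients, rooms, and events can be pictured with a small data model. This is a sketch in Python, not the data structures of our OCaml server: the class and method names are hypothetical, and only the `m.room.message` event type and `m.text` message type are taken from the Matrix specification.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    sender: str   # Matrix user ID, e.g. "@alice:example.org"
    type: str     # event type, e.g. "m.room.message"
    content: dict


@dataclass
class Room:
    room_id: str
    members: set = field(default_factory=set)
    events: list = field(default_factory=list)

    def join(self, user_id: str) -> None:
        """Register a client as a member of the room."""
        self.members.add(user_id)

    def send(self, event: Event) -> None:
        """Append an event, but only if its sender has joined the room."""
        if event.sender not in self.members:
            raise PermissionError(f"{event.sender} is not in {self.room_id}")
        self.events.append(event)


if __name__ == "__main__":
    room = Room("!ci:example.org")
    room.join("@ci-bot:example.org")
    room.send(Event("@ci-bot:example.org", "m.room.message",
                    {"msgtype": "m.text", "body": "Build finished"}))
    print(len(room.events))
```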

The Matrix clients are user applications that connect to a Matrix server via the client-server API. We implemented an OCaml-CI client, which communicates with the Matrix servers via the client-server API and tested the integration of the OCaml-CI communication with both Synapse and our OCaml Matrix server. Please note that our OCaml Matrix server supports a client authentication mechanism based on user name identification and password, according to the Matrix specification for authentication mechanisms.
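
To give an idea of what travels over the client-server API, the sketch below builds the two requests a notification client needs: a password login and a text message sent to a room. It is written in Python for brevity and is not the OCaml-CI client; the function names are hypothetical, nothing is sent over the network, and the paths use the `v3` prefix of recent Matrix specification versions (older servers expose the same endpoints under `r0`).

```python
import json
from urllib.parse import quote


def login_request(user: str, password: str) -> tuple:
    """Build the method, path, and JSON body for a password login."""
    body = {
        "type": "m.login.password",
        "identifier": {"type": "m.id.user", "user": user},
        "password": password,
    }
    return "POST", "/_matrix/client/v3/login", json.dumps(body)


def message_request(room_id: str, txn_id: str, text: str) -> tuple:
    """Build the method, path, and JSON body for sending a text message."""
    path = (
        "/_matrix/client/v3/rooms/"
        f"{quote(room_id, safe='')}/send/m.room.message/{quote(txn_id, safe='')}"
    )
    body = {"msgtype": "m.text", "body": text}
    return "PUT", path, json.dumps(body)


if __name__ == "__main__":
    print(login_request("ci-bot", "secret")[1])
    print(message_request("!ci:example.org", "txn1", "Build finished")[1])
```

The message request must carry the access token returned by the login response, and the transaction ID lets the server ignore a retried request instead of posting the message twice.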

Spam Filter

We researched the state of the art in email spam filtering and identified machine learning as the main trend. We followed this trend and equipped our email architecture with a spam filter unikernel, which uses a Bayesian method for supervised learning of spam and acts as a proxy to the internet communication in the SMTP receiver. The spam filter implementation works in two stages: preparation, when the unikernel is trained to detect spam, and operation, when the unikernel is integrated into the SMTP receiver architecture of unikernels to filter the spam emails. It is worth mentioning that the spam filter unikernel can be used independently as an individual anti-spam tool to help enforce the GDPR rules and protect the user's privacy by preventing spam-induced attacks such as phishing.
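
The two stages can be illustrated with a small naive Bayes classifier: `train` corresponds to the preparation stage and `is_spam` to the operation stage. This is a generic textbook sketch in Python, not the Spamtacus implementation; the class name, the whitespace tokeniser, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


class BayesSpamFilter:
    """Naive Bayes over word counts, with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text: str) -> list:
        return text.lower().split()

    def train(self, text: str, label: str) -> None:
        """Preparation stage: learn from a message labelled 'spam' or 'ham'."""
        self.word_counts[label].update(self.tokens(text))
        self.message_counts[label] += 1

    def score(self, text: str, label: str) -> float:
        """Log of P(label) times the product of P(word | label)."""
        if min(self.message_counts.values()) == 0:
            raise ValueError("train on at least one spam and one ham message first")
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.message_counts.values())
        log_prob = math.log(self.message_counts[label] / total_messages)
        for word in self.tokens(text):
            count = self.word_counts[label][word] + 1  # add-one smoothing
            log_prob += math.log(count / (total_words + len(vocabulary)))
        return log_prob

    def is_spam(self, text: str) -> bool:
        """Operation stage: pick the more likely of the two labels."""
        return self.score(text, "spam") > self.score(text, "ham")


if __name__ == "__main__":
    spam_filter = BayesSpamFilter()
    spam_filter.train("win free money now", "spam")
    spam_filter.train("free prize claim now", "spam")
    spam_filter.train("meeting notes attached", "ham")
    spam_filter.train("lunch tomorrow at noon", "ham")
    print(spam_filter.is_spam("claim your free money"))
```

Working with log probabilities avoids numerical underflow when a message contains many words.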

We integrate the spam filter into a unikernel positioned at the beginning of the SMTP receiver stack as the first line of defence in an eventual attack targeting the receiver. In this situation, the unikernel format isolates the attack and allows the SMTP receiver to maintain functionality. The spam filter unikernel can be extended to act as an antivirus by analysing email attachments for certain features that are known to characterise malware. We have already laid the groundwork for the antivirus with a prototype analysis of email attachments. Moreover, the spam filter unikernel could contribute a list of frequent spammers to the firewall, which we plan to add to the SMTP receiver as the next step in future work.

The DAPSI Initiative

Much of the SCoP project was possible thanks to the DAPSI initiative. They gave Tarides the incentive to further explore an open and secure infrastructure for communication protocols, especially emails. First, DAPSI supported our team by providing necessary financing, but their contribution to our project’s prosperity runs much deeper than funding. DAPSI facilitated multiple coaching sessions that helped broaden our horizons and establish reachable goals. Notably, their business coaching enabled us to identify solutions for our market strategy. Their technical coaching and training gave us access to experts in data portability and GDPR regulations, which opened our perspective to novel trends and procedures. Additionally, DAPSI helped raise our visibility by organising public communications, and DAPSI’s feedback revealed insights on how to better exploit our project’s potential and what corners of the cyber-ecosystem to prioritise. We are deeply grateful to DAPSI for their support and backing, and we’re thrilled to have passed Phase 2!

Up Next for SCoP

We’re excited to further develop this project. We’ll be experimenting with deploying unikernels on smaller chipsets, such as those found in IoT devices. We’d also like to research secure data porting in other domains such as journalism, law, or banking.

Of course, we’ll be maintaining each of the SCoP components in order to follow the latest available standards and state-of-the-art technology, including periodic security analyses of our code-base and mitigation for newly discovered vulnerabilities.

As in all of our work at Tarides, we strive to benefit the entire OCaml community and beyond. Please find more information on SCoP through our blog posts: DAPSI Initiative and DAPSI Phase 1.

In association with NGI, EU, Zabala, FGS, cap-digital, IMT Starter, Fraunhofer IAIS.
