Team of PhD Researchers Unveil AI-Powered Open Source Platform For COVID-19 Vaccine Development

  • Dr. Daouda and his team built the platform using open source technologies
  • They also have a public API, and the code is available on GitHub under an open-source license

A team of machine learning, immunology, and bioinformatics researchers has unveiled an AI-powered, open source, interactive web platform to help accelerate vaccine development for COVID-19.

The team is led by Tariq Daouda, PhD, who is currently a postdoctoral researcher at Harvard Medical School. It consists of volunteers who have doctoral degrees in machine learning and immunobiology, bioinformaticians, and web developers.

Public API

Developing a vaccine is typically lengthy and costly, largely because of the number of virus-infected cells that need to be analysed. These cells tend to be rare, fragile, and precious. Dr. Daouda developed an AI algorithm that can predict which parts of a virus, called epitopes, are more likely to be exposed at the surface of infected cells. He did this while obtaining his doctorate at the Institute for Research in Immunology and Cancer at the Université de Montréal.

Researchers can use these predictions to generate a significantly shorter list of potential targets to test when creating a vaccine. This reduces a process that typically takes weeks or months to a matter of hours.
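As a rough illustration of how per-epitope predictions could shrink a candidate list, here is a minimal sketch in Python. It is not the team's code: the `Candidate` type, the `shortlist` function, the 0.5 threshold, and the placeholder peptide names and scores are all assumptions made for this example.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One candidate target with a hypothetical prediction score in [0, 1]."""
    peptide: str
    score: float


def shortlist(candidates: List[Candidate],
              threshold: float = 0.5,
              top_k: int = 3) -> List[Candidate]:
    """Keep candidates scoring at least `threshold`, best first, at most `top_k`."""
    kept = [c for c in candidates if c.score >= threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    # Placeholder names and scores, invented purely for illustration.
    pool = [
        Candidate("PEPTIDE_A", 0.90),
        Candidate("PEPTIDE_B", 0.20),
        Candidate("PEPTIDE_C", 0.70),
        Candidate("PEPTIDE_D", 0.60),
        Candidate("PEPTIDE_E", 0.95),
    ]
    for c in shortlist(pool):
        print(c.peptide, c.score)
```

In practice the threshold and list size would depend on how many candidates a lab can afford to test experimentally.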

Dr. Daouda and his team built the platform using open source technologies. This allowed them to accelerate development of the project and makes it easier for researchers who use the platform to collaborate and share results with each other. They also have a public API, and the code is available on GitHub under an open-source license.

ArangoDB serves as the backend of the portal

Tariq Daouda added, “COVID-19 stresses the need to accelerate the design of vaccines and therapies to reduce the human and economic impact of global pandemics. People infected with COVID-19 tend to have a significant decrease in circulating immune cells during the acute infection phase, making it difficult to systematically isolate enough immune cells to study appropriately in a lab. Through the results now available on, the team is utilizing open source technologies to connect machine learning to biomedicine to help accelerate learnings and findings. The hope is that the scientific community will be able to leverage these results to help prioritize ongoing experimental work towards developing effective vaccination strategies.”

The neural network, called CAMAP, has been made available for any researcher to use on the platform. It generates predictions for potential vaccine targets and also contains interactive visualizations that allow researchers to plot their results and use them for further research.

Free cloud resources

ArangoDB, the open source multi-model graph database, serves as the backend of the portal. It stores over 182,000 epitopes and their metadata: approximately 39,000 from SARS-CoV-2, 39,000 from SARS-CoV-1, and 104,000 from normal human sequences for comparison. ArangoDB gives the project a streamlined deployment stack with flexible data access as it evolves in the future.
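The dataset composition quoted above can be checked with a few lines of Python. This is only a sketch built on the approximate counts given in the article. The dictionary keys and the `composition` helper are names chosen for this example and do not reflect the portal's actual schema.

```python
from typing import Dict, Tuple

# Approximate epitope counts as reported in the article, keyed by sequence source.
EPITOPE_COUNTS: Dict[str, int] = {
    "SARS-CoV-2": 39_000,
    "SARS-CoV-1": 39_000,
    "human": 104_000,
}


def composition(counts: Dict[str, int]) -> Tuple[int, Dict[str, float]]:
    """Return the total count and each source's share of that total."""
    total = sum(counts.values())
    shares = {source: n / total for source, n in counts.items()}
    return total, shares


if __name__ == "__main__":
    total, shares = composition(EPITOPE_COUNTS)
    print(f"total epitopes: {total}")
    for source, share in shares.items():
        print(f"{source}: {share:.1%}")
```

The three groups add up to the reported 182,000, with the human comparison set making up a little over half of the stored epitopes.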

Jörg Schad, Ph.D., is the Head of Engineering and Machine Learning at ArangoDB and a core member of the team. He is responsible for cloud infrastructure and maintenance. ArangoDB is providing free cloud resources in the form of its fully-managed cloud offering, ArangoDB Oasis.

Schad added, “The current situation with the novel coronavirus pandemic requires the combined expertise from a wide range of experts and diverse backgrounds. We are thrilled to be able to collaborate and contribute our knowledge and platform for scalable and high-performance data, and especially graph, processing to this project.”

