Stability AI is best known for its open-source Stable Diffusion image generator. But the company wants to explore other AI business areas as well.
Stability AI is investing in OpenBioML, a research community that aims to focus on the positive use of AI in the life sciences. It’s a field where big tech heavyweights like Alphabet and DeepMind also see potential. These companies, however, have far better access to computing capacity than other research groups, which makes for an unequal competition.
This is where OpenBioML comes in: The decentralized research lab aims to give many researchers access to computing and storage capacities that were previously only available to the best-funded research institutions. In addition, OpenBioML aims to create incentives to publish the most advanced predictive models.
“We all stand to gain from improved biotechnology, and if machine learning is to become increasingly central to computational biology, we need to ensure these capabilities are discovered and exploited to the fullest,” the organization writes.
Image AI advances for DNA sequence prediction
Stability AI supports the organization with a cluster hosted on AWS with more than 5,000 Nvidia A100 GPUs for AI training. This capacity is said to be sufficient to train up to ten AlphaFold 2-like AI models. AlphaFold 2 is DeepMind’s open source predictive AI for protein folding.
OpenBioML is currently working on three projects:
- In DNA Diffusion, diffusion models, known from DALL-E 2 or Stable Diffusion, will be trained to generate DNA sequences based on text. The goal is to have models that can generate cell type-specific or context-specific DNA sequences with specific regulatory properties based on text input. The project is led by pathology professor Luca Pinello at Massachusetts General Hospital and Harvard Medical School.
- The BioLM project will apply advances in natural language processing to biology and chemistry. Together with EleutherAI, the research group aims to train and open source specific biochemical language models. The models should be able to solve a range of tasks, such as generating protein sequences.
- LibreFold aims to give more researchers access to protein folding prediction systems similar to AlphaFold 2. The project builds on preliminary work done by RoseTTAFold and Nvidia’s OpenFold. LibreFold is designed to facilitate experiments with various protein folding prediction systems. The project is led by researchers at University College London, Harvard and Stockholm.
For more information and to participate, visit the OpenBioML website.