AIDO.Protein: Revolutionizing Protein Understanding with Efficient Mixture-of-Experts Models
Type | research |
---|---|
Area | AI Digital Twin |
Published(YearMonth) | 2412 |
Source | https://www.biorxiv.org/content/10.1101/2024.11.29.625425v1 |
Tag | newsletter |
Proteins are fundamental to life, yet their complexity poses challenges for understanding and design. AIDO.Protein, a novel addition to the AI-driven Digital Organism framework, introduces the first mixture-of-experts (MoE) architecture for protein modeling. Scaling to 16 billion parameters, this sparse model activates only two of eight experts per input token, achieving significant computational efficiency during training and inference. Pretrained on 1.2 trillion amino acids from UniRef90 and ColabfoldDB, AIDO.Protein delivers state-of-the-art performance across tasks in the xTrimoPGLM benchmark and ProteinGym Deep Mutational Scanning (DMS) assays, outperforming non-MSA-based models. It also excels in structure-conditioned protein sequence generation, setting new standards for protein design. By combining efficiency with effectiveness, AIDO.Protein provides a powerful foundation for advancing protein science in medicine, agriculture, and environmental applications. Models and code are openly available to accelerate collaborative research.
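
To make the "two of eight experts per token" routing concrete, here is a minimal NumPy sketch of a sparse top-2 mixture-of-experts layer. Only the expert count (8) and the number of active experts (2) come from the abstract. Everything else is an assumption for illustration: the names `top_k_route` and `moe_layer` are hypothetical, each expert is reduced to a single linear map, the gate weights are a softmax over the two selected logits, and the toy dimensions are arbitrary. It is not the AIDO.Protein implementation, whose router details are not given here.

```python
import numpy as np

NUM_EXPERTS = 8  # from the abstract
TOP_K = 2        # from the abstract


def top_k_route(router_logits, k=TOP_K):
    """Select the k highest-scoring experts per token.

    router_logits: (num_tokens, num_experts)
    Returns (indices, weights), each of shape (num_tokens, k).
    """
    # Indices of the k largest logits per token (order within the k is arbitrary).
    idx = np.argpartition(router_logits, -k, axis=-1)[:, -k:]
    picked = np.take_along_axis(router_logits, idx, axis=-1)
    # Assumption: softmax over the selected logits only, so weights sum to 1.
    picked = picked - picked.max(axis=-1, keepdims=True)
    weights = np.exp(picked)
    weights /= weights.sum(axis=-1, keepdims=True)
    return idx, weights


def moe_layer(x, router_w, expert_ws, k=TOP_K):
    """Sparse MoE forward pass: each token is processed by only k experts.

    x:         (num_tokens, d_model)
    router_w:  (d_model, num_experts)
    expert_ws: (num_experts, d_model, d_model), one toy linear map per expert
    """
    idx, weights = top_k_route(x @ router_w, k)
    out = np.zeros_like(x)
    for e in range(expert_ws.shape[0]):
        # Tokens that routed to expert e, and which of their k slots it occupies.
        token_ids, slot = np.nonzero(idx == e)
        if token_ids.size == 0:
            continue  # this expert does no work for this batch
        expert_out = x[token_ids] @ expert_ws[e]
        out[token_ids] += weights[token_ids, slot, None] * expert_out
    return out, idx, weights


# Toy usage: 5 tokens (e.g. amino-acid embeddings) with a 16-dim hidden size.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))
router_w = rng.normal(size=(16, NUM_EXPERTS))
expert_ws = rng.normal(size=(NUM_EXPERTS, 16, 16))
out, idx, weights = moe_layer(x, router_w, expert_ws)
```

In this sketch each token touches only 2 of the 8 expert weight matrices, which is where the per-token compute saving of a sparse model comes from: the total parameter count grows with the number of experts, while the work per token grows only with the number of active experts.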