AIDO.Cell: Scaling AI to Model Cellular Systems with Transcriptome-Wide Context

Type: research
Area: AI, Digital Twin
Published (YearMonth): 2412
Source: https://www.biorxiv.org/content/10.1101/2024.11.28.625303v1
Tag: newsletter

AIDO.Cell is the cellular module of the AI-driven Digital Organism framework, built to model cellular systems with transcriptome-wide context. It is a family of dense Transformer models ranging from 3M to 650M parameters, pretrained on 50 million human cells from diverse tissues, and it takes the entire human transcriptome as input without truncation or sampling. This is made possible by a read-depth-aware masked gene expression pretraining objective together with FlashAttention-2 and large-scale distributed training. The authors report state-of-the-art performance in zero-shot clustering, cell-type classification, and perturbation modeling. The scaling behavior observed across model sizes offers guidance for optimizing single-cell models. Models and code are openly available.
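To make the phrase "read-depth-aware masked gene expression pretraining" more concrete, here is a minimal sketch of how one training example might be built under that kind of objective. It is an illustration under stated assumptions, not the authors' implementation: the names `make_masked_example`, `masked_mse`, and `MASK_VALUE`, the binomial downsampling step, the `log1p` transform, and the default fractions are all hypothetical choices made for this note.

```python
import math
import random

MASK_VALUE = -1.0  # hypothetical sentinel marking a hidden expression value


def make_masked_example(counts, mask_frac=0.15, keep_prob=0.5, seed=0):
    """Build one (inputs, target, masked, depth) example from raw gene counts.

    counts    : list of non-negative integer read counts, one per gene
    mask_frac : fraction of genes whose input value is hidden (assumed value)
    keep_prob : per-read survival probability when simulating a shallower run
    """
    rng = random.Random(seed)

    # Simulate lower read depth by binomially thinning every gene's count.
    shallow = [sum(rng.random() < keep_prob for _ in range(c)) for c in counts]

    # The target keeps the full depth; the input sees the thinned profile.
    target = [math.log1p(c) for c in counts]
    inputs = [math.log1p(c) for c in shallow]

    # Hide a random subset of genes in the input.
    n_mask = max(1, round(mask_frac * len(counts)))
    masked = sorted(rng.sample(range(len(counts)), n_mask))
    for i in masked:
        inputs[i] = MASK_VALUE

    # Depth indicators: log total reads of the input and of the target.
    depth = (math.log1p(sum(shallow)), math.log1p(sum(counts)))
    return inputs, target, masked, depth


def masked_mse(pred, target, masked):
    """Mean squared error computed only over the masked genes."""
    return sum((pred[i] - target[i]) ** 2 for i in masked) / len(masked)
```

In this sketch the model would receive `inputs` for every gene (no truncation to the top-expressed genes) plus the two `depth` values, and would be trained to minimize `masked_mse` between its predictions and `target`. Passing both depths is what makes the setup "read-depth-aware" here: the model is told how shallow its input is and how deep the profile it must reconstruct should be.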