Towards multimodal foundation models in molecular cell biology

Bibliographic Details
Main Authors: Cui, Haotian (Author), Tejada-Lapuerta, Alejandro (Author), Brbić, Maria (Author), Sáez Rodríguez, Julio (Author), Cristea, Simona (Author), Goodarzi, Hani (Author), Lotfollahi, Mohammad (Author), Theis, Fabian J. (Author), Wang, Bo (Author)
Format: Article (Journal)
Language: English
Published: 16 April 2025
In: Nature
Year: 2025, Volume: 640, Issue: 8059, Pages: 623-633
ISSN: 1476-4687
DOI: 10.1038/s41586-025-08710-y
Online Access: Publisher, license required, full text: https://doi.org/10.1038/s41586-025-08710-y
Publisher, license required, full text: https://www.nature.com/articles/s41586-025-08710-y
Author Notes: Haotian Cui, Alejandro Tejada-Lapuerta, Maria Brbić, Julio Saez-Rodriguez, Simona Cristea, Hani Goodarzi, Mohammad Lotfollahi, Fabian J. Theis & Bo Wang
Description
Summary: The rapid advent of high-throughput omics technologies has created an exponential growth in biological data, often outpacing our ability to derive molecular insights. Large-language models have shown a way out of this data deluge in natural language processing by integrating massive datasets into a joint model with manifold downstream use cases. Here we envision developing multimodal foundation models, pretrained on diverse omics datasets, including genomics, transcriptomics, epigenomics, proteomics, metabolomics and spatial profiling. These models are expected to exhibit unprecedented potential for characterizing the molecular states of cells across a broad continuum, thereby facilitating the creation of holistic maps of cells, genes and tissues. Context-specific transfer learning of the foundation models can empower diverse applications from novel cell-type recognition, biomarker discovery and gene regulation inference, to in silico perturbations. This new paradigm could launch an era of artificial intelligence-empowered analyses, one that promises to unravel the intricate complexities of molecular cell biology, to support experimental design and, more broadly, to profoundly extend our understanding of life sciences.
Item Description: Published online: 16 April 2025
Accessed on 29 October 2025
Physical Description: Online Resource