Genomic Foundation Models Collapsing Core Analysis Work
A new class of foundation models trained on genomic sequence data at scale is directly performing the core analytical functions of bioinformatics. Evo (Arc Institute, 2024) was trained on 2.7 million prokaryotic and phage genomes and performs sequence generation, variant effect prediction, and regulatory element design. Geneformer (Theodoris et al., Nature 2023) enables single-cell analysis, gene regulatory network inference, and in-silico perturbation modeling. Enformer (DeepMind, Nature Methods 2021) predicts gene expression from DNA sequence more accurately than earlier sequence-based models. scGPT (University of Toronto, Nature Methods 2024) handles single-cell multi-omic data integration, cell type annotation, and perturbation prediction. These are not incremental improvements to existing tools; they are replacements for entire analytical workflows.