New AI Model Sheds Light on Tissue Organization and Cell Interactions

Researchers at Helmholtz Munich and the Technical University of Munich (TUM) have introduced Nicheformer, a novel large-scale foundation model that merges single-cell analysis with spatial transcriptomics. This model, trained on over 110 million cells, provides a fresh approach to understanding how cells are structured and interact within tissues, vital knowledge for advancing insights into health and disease.

Single-cell RNA sequencing has revolutionized biological research by revealing which genes are active in individual cells. However, the method requires removing cells from their original environment, which discards essential information about their spatial context and neighboring cells. Spatial transcriptomics, in contrast, retains this spatial information but is harder to scale and faces technical limitations. As a result, researchers have long struggled to investigate cell identity and tissue architecture at the same time.

Nicheformer addresses these challenges by learning from both dissociated single-cell data and spatial data. The model can “transfer” spatial context back to cells that were previously analyzed in isolation, effectively reconstructing their roles within the broader tissue. To make this possible, the research team developed SpatialCorpus-110M, one of the largest curated datasets of single-cell and spatial data available.

In findings published in Nature Methods, the researchers demonstrated that their model consistently outperformed existing methods and showed that spatial patterns leave discernible traces in gene expression even after cells are dissociated. Beyond performance, the study also examined the interpretability of Nicheformer, uncovering biologically significant patterns within its internal layers and providing deeper insights into the model’s learning processes.

“With Nicheformer, we can now scale the transfer of spatial information onto dissociated single-cell data,” stated Alejandro Tejada-Lapuerta, a Ph.D. student at Helmholtz Munich and TUM, who co-authored the study alongside Anna Schaar. “This advancement opens numerous possibilities for investigating tissue organization and cellular neighborhoods without the need for additional experiments.”

The research aligns with the emerging concept of a virtual cell, a computational representation of how cells behave and interact within their native environments. While this idea is gaining traction in both biology and AI, prior models have typically treated cells as isolated units without considering their spatial relationships. Nicheformer is the first foundation model that learns directly from spatial organization, enabling the reconstruction of how cells perceive and influence their surroundings.

Beyond introducing this innovative capability, the researchers also proposed a comprehensive suite of spatial benchmarking tasks designed to challenge future models in capturing tissue architecture and collective cellular behavior, essential steps toward the creation of biologically realistic AI systems.

“With Nicheformer, we are laying the groundwork for developing general-purpose AI models that accurately represent cells in their natural context, forming the basis for a Virtual Cell and Tissue model,” remarked Prof. Fabian Theis, Director of the Computational Health Center at Helmholtz Munich and a professor at TUM. “Such models have the potential to revolutionize our understanding of health and disease while guiding the creation of new therapies.”

Looking ahead, the team plans to develop a “tissue foundation model” that also learns the physical relationships between cells. This model could significantly aid in analyzing tumor microenvironments and other intricate biological structures, with direct implications for diseases such as cancer, diabetes, and chronic inflammation.