AI Foundation Models for Understanding Cellular Responses to Radiation Exposure
Authors:
Abraham Stroka1* ([email protected]), Rebecca Weinberg1, Sara Forrester1, Casey Stone2, Rafael Vescovi2, Dan Schabacker1, Thomas S. Brettin3, Arvind Ramanathan3, Rick Stevens3
Institutions:
1Biosciences Division, Argonne National Laboratory; 2Data Sciences and Learning Division, Argonne National Laboratory; 3Computing, Environment and Life Sciences Directorate, Argonne National Laboratory
Goals
Develop a machine learning pipeline that identifies key morphological features in cells treated with low-dose radiation.
Abstract
The use of machine learning models in cellular biology has increased sharply with rapid advances in artificial intelligence. These models, trained on cellular images, often analyze thousands of image features, from cell count to nucleus diameter. However, the use of vision transformers in cellular classification models remains underexplored.
Vision transformers work differently from other image classification models, such as convolutional neural networks (CNNs). Rather than passing the source image through a stack of convolutional layers, a vision transformer divides the original image into fixed-size patches, flattens and linearly projects each patch into an embedding, and passes the resulting sequence of embeddings through the transformer. Many vision transformers have been developed in recent years, demonstrating promising validation accuracy when classifying large numbers of images; for this project, the researchers implemented the MURA vision transformer, an efficient variant that has shown reliable validation accuracy.
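The patch step described above can be illustrated with a minimal NumPy sketch. It is not the MURA model or the group's pipeline; the image size (224 x 224), channel count (5), patch size (16), embedding width (64), and the function names `patchify` and `embed_patches` are all assumptions chosen for illustration, and the projection weights are random rather than learned.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping square patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, p, cols, p, C) -> (rows, cols, p, p, C) -> (rows * cols, p * p * C)
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def embed_patches(patches, weight, bias):
    """Linearly project each flattened patch to an embedding vector (a token)."""
    return patches @ weight + bias


# Hypothetical input: one 224 x 224 image with 5 channels (assumed values).
rng = np.random.default_rng(0)
image = rng.random((224, 224, 5))

patches = patchify(image, patch_size=16)

# Random stand-in for the learned projection; 64 is an assumed embedding width.
weight = rng.normal(scale=0.02, size=(patches.shape[1], 64))
bias = np.zeros(64)
tokens = embed_patches(patches, weight, bias)

print(patches.shape, tokens.shape)
```

In a full vision transformer, position embeddings would be added to `tokens` before the sequence enters the attention layers; that step is omitted here to keep the sketch focused on patching and projection.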
While vision transformers have many applications in the analysis of cellular images, this study focuses on human umbilical vein endothelial cells (HUVEC) that have undergone low-dose radiation exposure. A large amount of research has been done on the effects of acute, high doses of radiation on the morphological profile of human cells. By comparison, the effects of low-dose radiation on cellular morphology have received far less study. If phenotypic features of HUVEC morphology can be identified using a vision transformer, this could enable more advanced and efficient screening for low-dose radiation exposure.
Image data from the JUMP-Cell Painting Consortium was used to develop and fine-tune the vision transformer pipeline. The JUMP-Cell Painting Consortium is a collaboration between multiple laboratories to create a large repository of publicly available cell painting images. More specifically, its dataset contains approximately 115 terabytes of images of human osteosarcoma (U2OS) cells that have undergone either a chemical treatment or a genetic perturbation and have then been stained and imaged using the cell painting assay. The size, consistency, and variety of this dataset provide an excellent benchmark for the MURA model’s validity and the developed pipeline’s effectiveness.
To provide clean and accurate data for the vision transformer model, the group performed cell segmentation on the cell painting images before training the model. Cell segmentation involves identifying the border of each individual cell within the source image and extracting it, so that each cell is contained within its own image to be fed into the model. This ensures that factors such as cell count or clustering do not influence the training of the vision transformer. The CellProfiler application was integrated into the group’s pipeline to perform cell segmentation. CellProfiler is widely used software designed for biological image processing. For this project, the researchers developed a pipeline that stacks the image channels of each cell painting image, identifies the cellular components, segments the cells, and exports the single-cell images into the group’s vision transformer pipeline.
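The stack, segment, and export steps can be sketched in NumPy, assuming the segmentation itself has already produced an integer label mask (0 for background, one positive integer per cell). This is not the group's CellProfiler pipeline and uses no CellProfiler API; the function names `stack_channels` and `crop_cells`, the toy image sizes, and the choice to zero out pixels outside each cell are assumptions for illustration.

```python
import numpy as np


def stack_channels(channel_images):
    """Stack per-channel 2D images of equal shape into one (H, W, C) array."""
    return np.stack(channel_images, axis=-1)


def crop_cells(stacked, labels):
    """Return {cell_id: crop}, one bounding-box crop per labelled cell.

    `labels` is an (H, W) integer mask: 0 is background and each cell has
    its own positive id. Pixels inside the bounding box that belong to the
    background or to another cell are set to zero (an assumed convention).
    """
    crops = {}
    for cell_id in np.unique(labels):
        if cell_id == 0:
            continue  # skip background
        mask = labels == cell_id
        ys, xs = np.nonzero(mask)
        y0, y1 = ys.min(), ys.max() + 1
        x0, x1 = xs.min(), xs.max() + 1
        crop = stacked[y0:y1, x0:x1].copy()
        crop[~mask[y0:y1, x0:x1]] = 0
        crops[int(cell_id)] = crop
    return crops


# Toy example: two channels and two cells on a 6 x 6 field of view.
channels = [np.full((6, 6), 1.0), np.full((6, 6), 2.0)]
labels = np.zeros((6, 6), dtype=int)
labels[0:2, 0:3] = 1
labels[3:6, 4:6] = 2
labels[3, 4] = 0  # make cell 2 non-rectangular

stacked = stack_channels(channels)
crops = crop_cells(stacked, labels)
for cell_id, crop in crops.items():
    print(cell_id, crop.shape)
```

Because each crop contains exactly one cell, properties of the field of view such as cell count or clustering are not visible to a model trained on the crops. In practice the crops would also need to be padded or resized to a common shape before being batched.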
Funding Information
Argonne National Laboratory’s work on the LUCID: Low-dose Understanding, Cellular Insights, and Molecular Discoveries program was supported by the U.S. DOE’s Office of Science BER program, under Contract DE-AC02-06CH11357.