A Prompt Engineering Approach for Root Confocal Image Segmentation Using the Segment Anything Model

Authors:

Song Li¹,²* ([email protected]), Upasana Sivaramakrishnan¹,², Andrea Ramirez³, Sanchari Kundu¹, José Dinneny³

Institutions:

¹School of Plant and Environmental Science, Virginia Tech–Blacksburg; ²Department of Electrical and Computer Engineering, Virginia Tech–Blacksburg; ³Biology Department, Stanford University–Palo Alto

Goals

Establish a digital anatomical atlas of the roots of 11 members of the Brassicaceae family to inform the understanding of gene function and the connection between genotype and phenotype. The long-term goal is to develop stress-tolerant oilseed crops to advance sustainable biofuel production.

Abstract

Comparative anatomical studies of diverse plant species are vital for understanding changes in gene function, such as those involved in solute transport and hormone signaling in plant roots. By extracting quantitative phenotypic data from root cells, researchers can characterize how those cells respond to environmental stimuli and, in turn, how genes control root cell development. Accurate segmentation of individual cells is the essential first step in comparative anatomical analysis of root cells and underpins the analysis of whole-root traits. Existing software such as PlantSeg and MorphoGraphX uses a neural network architecture called U-Net for cell wall segmentation. U-Net is an earlier-generation model that requires training on a large number of manually labeled confocal images, and retraining it to adapt to new images is time-consuming. Foundational models like the Segment Anything Model (SAM) hold promise across various domains because of their zero-shot learning capability, which, combined with prompt engineering, can reduce the time and effort traditionally spent on dataset annotation and enable a semiautomated training process. In this research, the team evaluated SAM’s segmentation capabilities against PlantSeg, a state-of-the-art model for plant cell segmentation. PlantSeg segmented 2,332 plant cells from 20 confocal images of Arabidopsis roots; however, manual inspection showed that 792 of these segmentations (34.0% of all segmented cells) were incorrect. In contrast, SAM without fine-tuning (Vanilla SAM, or V-SAM) segmented 1,052 cells, of which only 7.8% were incorrect. Although V-SAM found only 68.3% of the correct cells found by PlantSeg, this is surprisingly good performance because V-SAM was never trained on root confocal images.
The researchers then fine-tuned V-SAM using human prompts for ~1,000 cells, drawn as rectangular bounding boxes around cells that V-SAM had not segmented. This annotation is substantially simpler than the labeling U-Net requires, in which every cell-wall pixel in each training image must be labeled. With the fine-tuned SAM (f-SAM), the researchers correctly segmented 2,885 cells from the 20 confocal images, which is 187% of the number PlantSeg segmented correctly. These findings demonstrate the efficiency of SAM in confocal image segmentation and showcase its adaptability and performance relative to existing tools. By addressing challenges specific to confocal images, this approach offers a robust solution for studying plant structure and dynamics. Overall, this research highlights the potential of foundational models like SAM in specialized domains and underscores the importance of tailored approaches for achieving accurate semantic segmentation in confocal imaging.
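
The percentages reported above can be re-derived from the stated cell counts. The short Python sketch below does only that arithmetic; the variable names are illustrative, and the reading of the 68.3% figure as all V-SAM segmented cells divided by PlantSeg's correct cells is an assumption that happens to match the reported number, not something the abstract states. The V-SAM incorrect count is likewise an estimate, since only the 7.8% rate is given.

```python
# Counts as reported in the abstract (20 Arabidopsis root confocal images).
PLANTSEG_TOTAL = 2332      # cells segmented by PlantSeg
PLANTSEG_INCORRECT = 792   # PlantSeg segmentations judged incorrect by manual inspection
VSAM_TOTAL = 1052          # cells segmented by V-SAM (no fine-tuning)
VSAM_ERROR_RATE = 0.078    # reported fraction of incorrect V-SAM cells (raw count not given)
FSAM_CORRECT = 2885        # cells segmented correctly by f-SAM


def percent(numerator, denominator):
    """Return numerator / denominator as a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# Correct PlantSeg cells are the total minus the incorrect ones.
plantseg_correct = PLANTSEG_TOTAL - PLANTSEG_INCORRECT

# Share of PlantSeg segmentations that were incorrect (reported as 34.0%).
plantseg_error_pct = percent(PLANTSEG_INCORRECT, PLANTSEG_TOTAL)

# f-SAM correct cells relative to PlantSeg correct cells (reported as 187%).
fsam_vs_plantseg_pct = percent(FSAM_CORRECT, plantseg_correct)

# Assumption: all V-SAM segmented cells relative to PlantSeg correct cells.
vsam_vs_plantseg_pct = percent(VSAM_TOTAL, plantseg_correct)

# Estimate only: incorrect V-SAM cells implied by the reported 7.8% rate.
vsam_incorrect_estimate = round(VSAM_TOTAL * VSAM_ERROR_RATE)

print("PlantSeg correct cells:", plantseg_correct)
print("PlantSeg error rate (%):", plantseg_error_pct)
print("f-SAM vs. PlantSeg correct (%):", fsam_vs_plantseg_pct)
print("V-SAM total vs. PlantSeg correct (%):", vsam_vs_plantseg_pct)
print("Estimated incorrect V-SAM cells:", vsam_incorrect_estimate)
```

Working from the counts rather than the rounded percentages makes explicit which denominator each figure uses: PlantSeg's error rate is taken over all of its segmentations, while the f-SAM comparison is taken over PlantSeg's correct cells only.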

Funding Information

This project is supported by DOE-BER, DE-SC0022985.