Functional Characterization of GT47 Glycosyltransferases in Duckweed to Facilitate Predictive Biology
Authors:
Breeanna R. Urbanowicz1,2* ([email protected]), Charles J. Corulli1,2, Alexander S. Graf1, Tasleem Javaid1,2, Daniel H. Tehrani1,2, Digantkumar Gopaldas Chapla1,2, Samantha Hennen3, Samantha J. Ziegler3, Vivek S. Bharadwaj4, Kelley W. Moremen1,2, Maria J. Peña1,2, Yannick J. Bomble3, Pradeep K. Prabhakar1,2
Institutions:
1Complex Carbohydrate Research Center, University of Georgia; 2Department of Biochemistry and Molecular Biology, University of Georgia; 3Biosciences Center, National Renewable Energy Laboratory; 4Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory
Goals
The long-term objective is to develop optimized experimental design schemas that combine computational prediction with high-throughput functional validation to study plant processes at the systems level and to efficiently translate the knowledge gained into links between genome sequence and gene function.
Abstract
Duckweeds are fast-growing aquatic energy crops that produce large amounts of biomass enriched in complex, non-cellulosic carbohydrates, which are highly amenable to conversion into fuels and bioproducts. Enzymes called glycosyltransferases (GTs) participate in the biosynthesis of these carbohydrates, which enable storage of carbon and energy as glycopolymers. To date, precise functional predictions for GTs remain extremely difficult or completely unreliable because many GT families are polyspecific. We are performing family-wide characterization of GT47 enzymes in duckweeds as a model for high-throughput (HTP) functional studies of enzymes involved in carbohydrate metabolism. This family is highly expanded in energy crops, and its members display diverse substrate specificities, ultimately contributing to the synthesis of almost every class of polysaccharide within biomass (Zhang et al. 2023). Gene functional validation is being performed using a multidisciplinary approach involving enzyme and substrate library construction, HTP biochemical assays, and computational biology. The combined data are being used to populate a machine learning framework to better enable prediction of plant GT function. We will also present preliminary work on how regression models trained on GT sequences can be used to predict donor specificity, highlighting the need for robust, curated data for interpretive machine learning frameworks. Functional validation achieved through this research project will be used to assign plant gene function and study processes at the systems level to efficiently link genome sequence with function in a feedstock-agnostic manner.
References
Zhang, L., et al. 2023. “Glycosyltransferase Family 47 (GT47) Proteins in Plants and Animals,” Essays in Biochemistry 67(3), 639–52. DOI:10.1042/EBC20220152.
Funding Information
This research was supported by the U.S. DOE, Office of Science, BER program, GSP grant no. DE-SC0023223. A portion of this research is supported by the Facilities Integrating Collaborations for User Science (FICUS) program (DOI:10.46936/fics.proj.2023.60868/60008910) and uses resources at the DOE Joint Genome Institute (JGI) (https://ror.org/04xm1d337) and the Environmental Molecular Sciences Laboratory (EMSL) (https://ror.org/04rc0xn13), which are DOE Office of Science User Facilities operated under Contract Nos. DE-AC02-05CH11231 (JGI) and DE-AC05-76RL01830 (EMSL).