Genomic Science Program
U.S. Department of Energy | Office of Science | Biological and Environmental Research Program

Abstract Test

Title | PI | PI Institution | Presenter | Research Area
A Gene-Editing System for Large-Scale Fungal Phenotyping in a Model Wood Decomposer | Zhang | University of Minnesota–Saint Paul | Zhang | Biosystems Design | University

The team combines CRISPR/Cas9-based genome editing and network analysis for large-scale phenotyping in a model wood decomposer fungus relevant to the DOE mission area. The overall goal is to develop a high-throughput genetic platform for discovering the distinctive genes and genetic features that enable fast wood degradation in brown rot fungal species. Through this research, the team hopes to provide stand-alone tools and resources for discovering novel fungal genetic mechanisms that can be used in combination to advance relevant plant biomass conversion research in the post-genomic era.

This research focuses on a unique group of wood-decomposing basidiomycetes, the brown rot fungi, which harbor industrially relevant pathways for extracting carbohydrates from lignocellulose and have broad relevance to global carbon cycling. Distinct from other fungi, brown rot species use nonenzymatic reactive oxygen species (ROS) mechanisms to modify lignin and selectively extract sugars. From a process-efficiency standpoint, their degradative mechanisms represent a pathway ‘upgrade’ relative to the ancestral approaches in white rot species (Hibbett and Donoghue 2001, Eastwood et al. 2011). These fungi acquired this capacity over evolutionary time by shedding, rather than gaining, genes from the carbohydrate-active enzyme (CAZy) repertoire (Martinez et al. 2009, Floudas et al. 2012, Riley et al. 2014). This paradox makes brown rot fungi promising candidates for discovering unknown genetic mechanisms governing plant biomass degradation. Although DOE mission relevance is clear, and the team has made major genomically informed advances in brown rot, progress is limited by an inability to manipulate genes in any brown rot fungal strain.

The team’s earlier work indicates that reshuffling of the fungal genome and of gene regulation might play key roles in determining brown rot efficacy (Zhang et al. 2016, Zhang et al. 2017, Zhang et al. 2019). Using functional genomic tools, team members recently elucidated a staggered two-step (i.e., oxidation-then-hydrolysis) gene regulation model for brown rot, published in PNAS in 2016 and in mBio in 2019 (Zhang et al. 2016, Zhang et al. 2019). Although these genomic studies have greatly advanced understanding of brown rot, its genetic basis remains uncharacterized and unharnessed. Specific gaps remain: (1) the functions of genes involved in the two-step model remain unverified and ambiguous, (2) the gene regulatory mechanism used to control and consolidate the two steps is unclear, and (3) the functions of the majority of genes identified by multiomics are either hypothetical or unknown and await interrogation. These gaps exist, to a large extent, because there is no robust genome-editing tool for validating and discovering brown rot genetic features.

In this project, team members plan to integrate systems biology, genome editing, and network modeling to address these key gaps. The project has three objectives, and progress has recently been made toward each:

Obj. 1: Create the CRISPR/Cas9-based editing system to validate brown rot gene functions.

To make genetic manipulation available in brown rot fungal species, researchers first built a genetic transformation platform in a model species, Gloeophyllum trabeum. Building on this platform, researchers developed a gene reporting system that relies on a laccase reporter gene and a rapid colorimetric detection method for optimizing expression elements and transformation procedures (Li et al. 2023). Using this system, a codon-optimized eSpCas9 gene, originally from Streptococcus pyogenes, was fused to eGFP and expressed in G. trabeum, with a nucleoplasmin nuclear localization signal directing the protein to the nucleus. sgRNAs targeting a cellulase gene, Cel5A (Gene ID 57704), were then expressed in the G. trabeum-Cas9 mutant for plasmid-based CRISPR-Cas9 gene editing. Three U6 promoters, from Aspergillus niger, Trametes versicolor, and G. trabeum, were each tested for their efficiency in driving sgRNA expression. In parallel, a preassembled Cas9-sgRNA ribonucleoprotein method was also tried in the same brown rot species for scarless gene disruption. Going forward, once the gene-editing tool is fully built, researchers will continue to work on genes that have not yet been functionally validated, with the aim of generating a first-ever single-gene mutant library for brown rot phenotypic studies.
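As an illustration of how candidate sgRNA target sites might be enumerated for a gene such as Cel5A, the following minimal sketch scans a DNA sequence for 20-nt protospacers followed by an NGG protospacer-adjacent motif (PAM), the motif recognized by S. pyogenes Cas9. The function names, the toy sequence, and the forward-strand-only scan are assumptions made for illustration; the abstract does not describe the team’s sgRNA design procedure.

```python
import re


def find_protospacers(seq, spacer_len=20):
    """Return (start, protospacer, pam) for each NGG PAM site on the forward strand.

    Illustrative only: a practical design tool would also scan the reverse
    complement and score candidates for off-target matches.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def gc_fraction(spacer):
    """Fraction of G and C bases in a spacer, a common first-pass filter."""
    return (spacer.count("G") + spacer.count("C")) / len(spacer)


# Hypothetical toy fragment, NOT the real Cel5A sequence.
toy_fragment = "GATTACAGATTACAGATTACATGGCC"
for start, spacer, pam in find_protospacers(toy_fragment):
    print(start, spacer, pam, round(gc_fraction(spacer), 2))
```

Candidate spacers found this way would still need to be checked against the whole genome for unintended matches before use.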

Obj. 2: Build a carbon utilization network to discover key genetic features used by brown rot.

To build a gene co-expression network for discovering novel brown rot genetic features, the team measured the transcriptomes of brown rot species in response to a broad spectrum of lignocellulose-derived carbon sources. Two brown rot species, G. trabeum and Rhodonia placenta, were used in cross-species comparisons to discover shared or distinct mechanisms. Through genome-wide co-expression modeling, key modules and their “hub” genes associated with lignocellulose polymers or monomers were discovered. DNA affinity purification sequencing (DAP-seq) was then used to identify the cis- and trans-regulatory elements involved in the carbon signaling pathway and revealed regulatory machinery unique to brown rot (papers in prep.). Within the whole project, this objective will expand the pool of candidate genes for large-scale phenotypic screening.
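The idea of a “hub” gene in a co-expression network can be sketched as follows: genes are connected when their expression profiles across conditions are strongly correlated, and hubs are the genes with the most connections. This is a simplified hard-threshold version assumed for illustration, with hypothetical gene names and expression values; the abstract does not specify which co-expression method or thresholds the team used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_by_degree(expr, threshold=0.9):
    """Rank genes by the number of partners with |r| >= threshold.

    expr maps gene name -> expression values across carbon-source conditions.
    """
    degree = {gene: 0 for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            degree[g1] += 1
            degree[g2] += 1
    # Highest degree first; ties broken alphabetically for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical expression values across five carbon sources.
toy_expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [3, 1, 4, 1, 5],
}
print(rank_by_degree(toy_expr))
```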

Obj. 3: Integrate network analysis with CRISPR/Cas9 library for large-scale phenotypic screens.

This objective aims to develop a pipeline that uses a multiplexed sgRNA library for genome editing and mutant library construction for large-scale phenotypic screens, followed by next-generation sequencing (NGS) to discover key functional genes. As a first step, tens of plasmid constructs were delivered simultaneously into G. trabeum and combined with the laccase reporter to optimize a high-throughput transformation procedure. Once this ongoing trial succeeds, the team will integrate it with the plasmid-based CRISPR/Cas9 method for large-scale gene editing.
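One way the NGS readout of a pooled sgRNA screen could be tallied is sketched below: each read is matched against the library by its spacer sequence, and per-guide counts are compared between conditions downstream. The library, the read layout (spacer at a fixed offset), and exact matching are all assumptions made for illustration; the planned pipeline is not described in the abstract.

```python
from collections import Counter


def count_guides(reads, library, start=0):
    """Count reads whose spacer window exactly matches a guide in the library.

    library maps guide name -> spacer sequence; all spacers are assumed to
    have the same length and to begin at position `start` in every read.
    """
    spacer_len = len(next(iter(library.values())))
    name_by_seq = {seq.upper(): name for name, seq in library.items()}
    counts = Counter({name: 0 for name in library})
    for read in reads:
        window = read[start:start + spacer_len].upper()
        name = name_by_seq.get(window)
        if name is not None:
            counts[name] += 1
    return counts


# Hypothetical guides and reads, for illustration only.
toy_library = {
    "guide_1": "ACGTACGTACGTACGTACGT",
    "guide_2": "TTTTGGGGCCCCAAAATTTT",
}
toy_reads = [
    "ACGTACGTACGTACGTACGTGTTTTAGA",
    "ttttggggccccaaaattttGTTTTAGA",
    "ACGTACGTACGTACGTACGTGTTTTAGA",
    "NNNNNNNNNNNNNNNNNNNNGTTTTAGA",
]
print(dict(count_guides(toy_reads, toy_library)))
```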

Collectively, through this project, researchers anticipate providing stand-alone tools and resources to elucidate fundamental microbial processes relevant to the DOE mission area, advancing new engineering designs for lignocellulose bioconversion.

Model-Guided Design of Synthetic Microbial Consortia for Next-Generation Biofuel Production | Zengler | University of California–San Diego | Thiruppathy | Biosystems Design | University

The team proposes to lay the foundation for bioproduction using multifaceted microbial communities. Researchers will build metabolic community models of increasing complexity by integrating multiomics datasets. These models will guide engineering designs for optimized production of biofuels from lignocellulosic biomass. Furthermore, researchers will use innovative approaches to augment existing communities for improved bioproduction and complete conversion of different biomass feedstocks. Overall, these strategies will provide knowledge of the functional metabolic exchanges driving interspecies interactions in microbial communities, thus providing insights into fundamental biological processes. Lessons learned here will be crucial for designing stable microbial communities for various biotechnology applications in the future.

Microbial communities are ubiquitous, and they are gaining recognition for their influence on the environment and for their industrial potential, such as in bioenergy production. The multiplicity of intertwined, interspecies metabolite interactions within these communities regulates their ultimate functional organization and assembly. This allows them to perform complex functional tasks unreachable by axenic systems, such as the breakdown of hardy lignocellulosic materials into high-energy volatile fatty acids (VFAs).

Bioproduction of one such fatty acid, butyrate (BA), from sustainable lignocellulosic sources has gained attention owing to butyrate’s versatile applications as a precursor for jet fuel, polymers, fibers, and even cosmetics. However, current industrial processes have been forced to rely on monoculture setups requiring expensive enzymatic preprocessing of raw materials. Thus, there is a need to rationally design reproducible, tunable consortia that can replicate the collective capabilities of natural communities, thereby removing the need for the expensive preprocessing steps and significantly increasing the economic benefit of using cheap feedstocks.

Here, team members characterized the metabolic interactions of the mutualistic co-culture of Clostridium thermocellum and Clostridium thermobutyricum, recently shown to be effective in converting lignocellulosic biomass to butyrate (Chi et al. 2018), and identified bottlenecks that could be relieved by augmentation with additional microbes to increase biomass conversion performance and efficiency. The approach is two-pronged. Researchers first used high-quality, manually curated genome-scale metabolic models (GEMs) for both species to unravel the metabolic exchange network of the co-culture, compartmentalized as a community model containing 1,777 reactions, 1,679 metabolites, and 1,569 genes. This allowed the team to computationally identify metabolic bottlenecks responsible for the co-culture’s limited butyrate production efficiency. Simultaneously, team members experimentally identified substrate inefficiencies in the co-culture setup by measuring solids deconstruction percentages and characterizing the monomeric and oligomeric sugars left unused in the substrate. Researchers tested the co-culture on raw corn stover and deacetylated-milled corn stover (DMR) and found the highest butyrate production on DMR (without enzymatic preprocessing), with a solids deconstruction of 83.1%. Xylose oligomers made up the largest share of the unused sugar moieties, along with some arabinose and glucose oligomers.

To identify candidate isolates that could augment the co-culture to improve carbohydrate utilization, researchers collected various soil samples and enriched them on raw corn stover, bagasse, switchgrass, poplar, and pine substrates as well as on supernatants from the DMR/co-culture experiments. The enrichments were carried out under anoxic conditions at 55°C, identical to those used with the Clostridia co-culture. They were passaged multiple times into fresh media, keeping the raw lignocellulosic plant materials as their sole carbon source, to ensure the selected members are producing butyrate, which was validated by HPLC.

Following isolation, the isolates will be tested for compatibility with the co-culture using a high-throughput community design and construction method (Coker et al. 2022). These constructed consortia will then be further engineered to optimize production of butyrate. Engineering strategies will be guided by community metabolic models, ensuring the collective capability of the designed community is reproducible and optimized toward bioproduction. This study will lay the foundation for advanced bioproduction using multifaceted microbial communities inspired by nature and will expand knowledge on intra-community microbial interactions.

Reproducible Plant Growth in Fabricated Ecosystems (EcoFAB 2.0) Reveals that Nitrogen Supply Modulates Root Exudation | Zengler | University of California–San Diego | Northen | Biosystems Design | University

This project couples novel lab and field studies to develop the first predictive model of grass-microbiomes based on new mechanistic insights into dynamic plant-microbe interactions in the grasses Sorghum bicolor and Brachypodium distachyon that improve plant N use efficiency (NUE). The results will be used to predict plant mutants and microbial amendments that improve low-input biomass production for validation in lab and field studies. To achieve this goal, the team will determine the mechanistic basis of dynamic exudate exchange in the grass rhizosphere, with a specific focus on the identification of plant transporters and proteins that regulate root exudate composition and how specific exudates select for beneficial microbes that increase plant biomass and NUE. Researchers will further develop a predictive plant-microbe model for advancing sustainable bioenergy crops and will predictively shift plant-microbe interactions to enhance plant biomass production and N acquisition from varied N forms.

Understanding plant-microbe interactions requires examination of root exudation under nutrient stress using standardized and reproducible experimental systems. Researchers grew Brachypodium distachyon hydroponically in novel fabricated ecosystem devices (EcoFAB 2.0) under three inorganic nitrogen forms (NO3−, NH4+, or NH4NO3), followed by nitrogen starvation. Liquid chromatography with tandem mass spectrometry (LC-MS/MS) analyses of exudates, together with measurements of biomass, medium pH, and nitrogen uptake, showed that EcoFAB 2.0 yields low intra-treatment data variability. Furthermore, the three inorganic nitrogen forms caused differential exudation, characterized by abundant amino acids/peptides and alkaloids. By comparison, N-deficiency decreased N-containing compounds but increased shikimates/phenylpropanoids. Subsequent bioassays with two shikimates/phenylpropanoids (shikimic and p-coumaric acids) on the rhizobacterium Pseudomonas putida or Brachypodium seedlings revealed that shikimic acid promoted bacterial and root growth, while p-coumaric acid stunted seedlings. The next objective was to identify transport mechanisms for organic acids and inorganic nitrogen by creating plant mutants with ABC or nitrogen transporters knocked out. These mutations caused significant phenotypic and exometabolic changes. In conclusion, results suggest that (1) Brachypodium alters exudation in response to nitrogen status, which can affect rhizobacterial growth; (2) EcoFAB 2.0 is a valuable standardized plant research tool; and (3) plant root exudation can be altered by membrane transport engineering.

Microbial Guilds and Niches Enable Targeted Modifications of the Microbiome | Zengler | University of California–San Diego | Moyne | Biosystems Design | University

This project couples novel lab and field studies to develop the first predictive model of grass-microbiomes based on new mechanistic insights into dynamic plant-microbe interactions in the grasses Sorghum bicolor and Brachypodium distachyon that improve plant N use efficiency (NUE). The results will be used to predict plant mutants and microbial amendments that improve low-input biomass production for validation in lab and field studies. To achieve this goal, the team will determine the mechanistic basis of dynamic exudate exchange in the grass rhizosphere with a specific focus on the identification of plant transporters and proteins that regulate root exudate composition and how specific exudates select for beneficial microbes that increase plant biomass and NUE. The team will further develop a predictive plant-microbe model for advancing sustainable bioenergy crops and will predictively shift plant-microbe interactions to enhance plant biomass production and N acquisition from varied N forms.

Microbiome science has contributed greatly to the understanding of microbial life and provided insights into the essential roles of microbial communities, from global element cycling to human health. However, a comprehensive understanding of how these communities are assembled, maintained, and function as a system is still lacking. In particular, the nature of microbe-microbe interactions and how microbial communities respond to perturbations remain poorly understood. As a result, current microbiome science is largely descriptive and correlation-based, rather than predictive and based on mechanistic understanding.

To achieve predictive microbiome science, it is necessary to comprehensively elucidate the metabolic role of each microbe and its interactions with others. Such knowledge would enable the manipulation of a microbe’s trajectory within a community, for example by selectively promoting or limiting its growth.

In this study, the team presents a new method that integrates transcriptional and translational regulation measurements to reveal how each microbe allocates its resources for optimal proteome efficiency. Protein translation is the most energy-intensive process in a cell, and microbes closely regulate their resource allocation by prioritizing essential functions through differential translational efficiency (TE). Direct measurement of TE in a microbial community sample would provide insights into the metabolic role of each member of the community and allow for a better understanding of interactions with other members.

The team performed metatranscriptomics and metatranslatomics analyses to directly measure TE in situ in a 16-member synthetic community (SynCom) composed of rhizosphere isolates grown in a complex culture medium. This approach allowed the team to perform a guild-based microbiome classification, grouping microbes according to the metabolic pathways they prioritize, independent of their taxonomic relationships. Team members demonstrated that guilds predicted competition between members of the same guild with 100% sensitivity and 74% specificity (77% accuracy) in the SynCom. Furthermore, gene-level analysis of TE allowed the team to predict each microbe’s substrate preferences, i.e., its niche in the community. Such Microbial Niche Determination (MiND) predicted which particular microbes would benefit from substrate supplementation with 54% sensitivity and 83% specificity (78% accuracy) in the SynCom.
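To make the TE-based grouping concrete, the sketch below computes a per-gene translational efficiency as the ratio of translatome abundance to transcriptome abundance (a commonly used definition, assumed here because the abstract does not give a formula) and assigns each microbe to a guild named after the pathway with the highest mean log2 TE. The pseudocount, pathway labels, isolate names, and abundance values are all hypothetical.

```python
from collections import defaultdict
from math import log2


def log2_te(translatome, transcriptome, pseudocount=1.0):
    """log2 ratio of translatome to transcriptome abundance for one gene."""
    return log2((translatome + pseudocount) / (transcriptome + pseudocount))


def prioritized_pathway(genes):
    """Return the pathway with the highest mean log2 TE.

    genes is a list of (pathway, translatome_abundance, transcriptome_abundance).
    """
    values = defaultdict(list)
    for pathway, translatome, transcriptome in genes:
        values[pathway].append(log2_te(translatome, transcriptome))
    means = {pathway: sum(v) / len(v) for pathway, v in values.items()}
    return max(means, key=means.get)


def group_into_guilds(community):
    """Group microbes by the pathway they prioritize, ignoring taxonomy."""
    guilds = defaultdict(list)
    for microbe, genes in community.items():
        guilds[prioritized_pathway(genes)].append(microbe)
    return dict(guilds)


# Hypothetical abundances for three isolates and two pathways.
toy_community = {
    "isolate_1": [("amino_acid_uptake", 7, 1), ("sugar_catabolism", 1, 3)],
    "isolate_2": [("amino_acid_uptake", 3, 1), ("sugar_catabolism", 1, 1)],
    "isolate_3": [("amino_acid_uptake", 1, 7), ("sugar_catabolism", 15, 3)],
}
print(group_into_guilds(toy_community))
```

In this toy example, isolates 1 and 2 fall into the same guild and would therefore be predicted to compete with each other.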

It is worth noting that as microbes adapt their translational regulation to community settings, such predictions could not be achieved using axenic culture approaches (e.g., phenotypic microarray or growth curves) or partially functional measurements (e.g., metagenomics or metatranscriptomics).

By combining TE-based MiND and guild predictions, researchers were able to selectively manipulate the SynCom, increasing or decreasing the abundance of targeted members either by providing preferred substrates or by giving an advantage to their competitors. Importantly, this method is scalable to more complex, natural samples. Team members applied MiND to native soil samples and demonstrated its applicability in predicting changes and manipulating microorganisms in complex microbiomes.

In conclusion, the method presented in this study represents a significant step toward predictive microbiome science by providing a comprehensive understanding of the metabolic role of each microbe and its interactions with others. The guild-based microbiome classification and the MiND approach allow for the manipulation of microbial communities and have potential applications in various fields such as agriculture, biotechnology, and human health.

Unraveling Metabolic Interactions in a Rhizosphere Microbial Community Through Genome-Scale Modeling | Zengler | University of California–San Diego | Kumar | Biosystems Design | University

This project couples novel lab and field studies to develop the first predictive model of grass-microbiomes based on new mechanistic insights into dynamic plant-microbe interactions in the grasses Sorghum bicolor and Brachypodium distachyon that improve plant N use efficiency (NUE). The results will be used to predict plant mutants and microbial amendments that improve low-input biomass production for validation in lab and field studies. To achieve this goal, researchers will determine the mechanistic basis of dynamic exudate exchange in the grass rhizosphere with a specific focus on the identification of plant transporters and proteins that regulate root exudate composition and how specific exudates select for beneficial microbes that increase plant biomass and NUE. The team will further develop a predictive plant-microbe model for advancing sustainable bioenergy crops and will predictively shift plant-microbe interactions to enhance plant biomass production and N acquisition from varied N forms.

The team presents manually curated genome-scale metabolic models for 17 bacteria commonly found in the rhizosphere. These bacteria, isolated from the switchgrass rhizosphere, represent dominant members of grass rhizospheres and belong to various genera, such as Arthrobacter, Bacillus, Bosea, Bradyrhizobium, Brevibacillus, Burkholderia, Chitinophaga, Lysobacter, Methylobacterium, Mucilaginibacter, Mycobacterium, Niastella, Paenibacillus, Rhizobium, Rhodococcus, Sphingomonas, and Variovorax. The models were generated using standard pipelines and incorporated a total of 3,877 reactions and 2,663 metabolites. Each model contains between 790 and 1,788 genes, covering 15% to 30% of the respective microbial genome. The models were curated using information from the literature and public databases such as KEGG, UniProt, and MetaCyc. All models were simulated on 215 different carbon and nitrogen sources, and the results were compared with experimental measurements to improve model predictions. The models accurately predicted the growth phenotypes of rhizosphere bacteria 90% of the time. Specifically, there were 1,475 true-positive predictions (correctly identified growth), 1,542 true-negative predictions (correctly identified no growth), 388 false-positive predictions (incorrectly identified growth), and 250 false-negative predictions (incorrectly identified no growth).
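The four prediction counts above can be turned into standard classification metrics with a few lines of arithmetic. The sketch below uses the usual definitions of accuracy, sensitivity, specificity, and precision; the function name is illustrative. The counts sum to 3,655, which matches 17 models × 215 sources, and as listed they give an overall accuracy of about 82.5%. The abstract does not state how the 90% figure was derived, so it may refer to a different subset or metric.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard metrics from the four cells of a confusion matrix."""
    total = tp + tn + fp + fn
    return {
        "total": total,
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # growth correctly predicted
        "specificity": tn / (tn + fp),  # no-growth correctly predicted
        "precision": tp / (tp + fp),
    }


# Counts as reported in the abstract.
metrics = classification_metrics(tp=1475, tn=1542, fp=388, fn=250)
for name, value in metrics.items():
    print(name, value if name == "total" else round(value, 3))
```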

Next, the team deployed these highly curated metabolic models to study the interactions in a synthetic microbial community (SynCom) of the rhizosphere, developing a computational framework for community metabolic models. The community reconstruction included every member whose relative abundance exceeded a minimum cutoff (>0.01%) under any tested condition. Each member’s metabolic model was treated as a separate compartment linked to a shared metabolic pool, with connections refined by experimentally determined phenotypic data (i.e., Biolog plates). Flux balance analysis was used to simulate the growth of individual members as well as that of the entire community, with the biomass of each member and of the community optimized during the simulation. To further improve the computational framework, the team added a module that constrains the activity of reactions using metatranscriptomics and metatranslatomics data, reflecting internal resource allocation for each bacterium. This analysis predicts metabolic exchanges between community members and uncovers the nature of interactions, such as competition and cooperation, between rhizosphere microbes.
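The member-selection rule described above (relative abundance above 0.01% under any tested condition) can be sketched as a simple filter. The data layout, member labels, condition names, and abundance values below are hypothetical; only the cutoff and the “any condition” logic come from the abstract.

```python
def select_members(abundance, cutoff_percent=0.01):
    """Return members exceeding the cutoff in at least one condition.

    abundance maps member -> {condition: relative abundance in percent}.
    """
    return sorted(
        member
        for member, by_condition in abundance.items()
        if any(value > cutoff_percent for value in by_condition.values())
    )


# Hypothetical relative abundances (percent) under two conditions.
toy_abundance = {
    "member_A": {"condition_1": 12.5, "condition_2": 0.004},
    "member_B": {"condition_1": 0.002, "condition_2": 0.009},
    "member_C": {"condition_1": 0.0, "condition_2": 0.02},
}
print(select_members(toy_abundance))
```

Here member_B is excluded because it never exceeds the cutoff, while member_C is retained because it does so under one condition.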

Designing Novel Enzymes for Complete Degradation of Recalcitrant Polyamides | Zanghellini | Arzeda Corporation | Sangha | Bioenergy

The project objectives are to (1) design enzymes capable of complete depolymerization of nylon 6 and nylon 66 and (2) engineer bacterial strains able to metabolize the degradation products to higher-value sustainable materials.

As of 2015, a total of 6.3 billion tons of plastic waste had been generated globally. It is estimated that only 9% of this total has been recycled, while 12% has been incinerated to recover energy values, and the remainder has entered landfills. New technologies are needed to address this ever-growing problem. An alternative approach, harnessing the power of biology to not just depolymerize plastics back to their monomer precursors but convert them into higher-value products, offers stronger economic incentives and in turn would be expected to drive more rapid and widespread adoption. Toward that end, this work focuses on combining cutting-edge computational protein design and synthetic biology to address the challenge of complete biodegradation and upcycling of the recalcitrant polymers nylon 6 and nylon 66.

Although natural enzymes have been shown to be capable of degrading amorphous portions of polyamides such as nylon 6 and nylon 66, complete enzymatic degradation has not been demonstrated. Researchers hypothesize this to be due in large part to a lack of natural enzymes able to efficiently catalyze degradation of the crystalline portion of the polymer. Researchers are computationally designing enzymes to alleviate this limitation by introducing and optimizing polyamide hydrolysis activity in scaffolds with open active sites. In conjunction, researchers are screening and engineering bacterial strains able to metabolize nylon 6 and nylon 66 degradation byproducts directly into central metabolism. Such platform strains can be used to produce a wide variety of fermentation products from central metabolites. Integration of nylon 6 and nylon 66 depolymerizing enzymes into these engineered hosts will provide a novel, elegant, and cost-effective consolidated fermentation process for nylon upcycling to higher-value sustainable materials.

Encapsulin Nanocompartment Systems in Rhodococcus opacus for Compartmentalized Biosynthesis Applications | Yung | Lawrence Livermore National Laboratory | Yung | Biosystems Design | Early Career

This project is focused on understanding how encapsulin nanocompartment systems can be used to enhance the biosynthesis of next-generation biomaterials in Rhodococcus species. The project seeks (1) to probe the mechanistic basis for how these compartments are regulated, biosynthesized, and maintained, and (2) to engineer these systems to achieve new biosynthetic functions (e.g., inorganic nanoparticle biosynthesis).

With recent innovations in synthetic biology, engineered microbes now have the potential to produce a wide variety of bioproducts from renewable sources (e.g., biomass) to support the U.S. bioeconomy. However, biosynthetic pathways leading to these products are often hindered by poor reaction efficiencies and toxicity, resulting in low yields.

Compartmentalization of these pathways could potentially overcome these challenges through co-localization, concentration, and sequestration. The goal of this Early Career research is to identify mechanisms for engineering compartmentalized biosynthesis in the emerging model bioproduction bacterium, Rhodococcus opacus PD630, using its native encapsulin nanocompartment system (herein called encapsulins). Toward this goal, the native regulation, biosynthesis, and maintenance of the encapsulin system will be investigated to uncover potential pathways for controlling encapsulin production. Gene-editing methods will be used to engineer encapsulins with novel structural properties for expanded bioproduction capabilities. As a case study, the R. opacus encapsulin system will be redirected to support and control the biosynthesis of cadmium sulfide (CdS) nanoparticle semiconducting materials used in optical and electronic applications (e.g., solar panels, light-emitting diodes). CdS nanoparticles have gained considerable interest as a bioproduction target because they can potentially be produced by upcycling cadmium-contaminated waste streams. Ultimately, this work will establish encapsulin compartmentalization systems as a means of improving yields and enabling new biosynthetic routes toward next-generation bioproducts and biomaterials in support of DOE’s mission to build a strong bioeconomy and thus enhance U.S. energy security.

Systems Biology to Enable Modular Metabolic Engineering of Fatty Acid Production in Cyanobacteria | Young | University of Wisconsin–Madison | Zuniga | Bioenergy | University

The overall objective of this project is to use systems biology to identify metabolic control points and bottlenecks that regulate flux to free fatty acids (FFAs) in cyanobacteria. The central hypothesis is that cyanobacterial lipid metabolism can be modularized into pathways “upstream” and “downstream” of the nodal metabolite acetyl-CoA, which can be separately studied and optimized to enhance overall FFA production. The team plans to test its central hypothesis and accomplish the overall objective of this project by pursuing the following specific aims:

  1. Identify (upstream) metabolic control points regulating acetyl-CoA precursor availability. The working hypothesis is that engineering glycolytic pathways in PCC 7002 will reveal rate-controlling steps that can be manipulated to maximize acetyl-CoA availability.
  2. Assess flux bottlenecks in the (downstream) fatty acid biosynthesis pathway. The working hypothesis is that multiomics analyses of thioesterase-expressing strains will elucidate regulatory nodes that control FFA production and overall lipid metabolism in PCC 7002.

Cyanobacteria are attractive hosts for biomanufacturing because of their ability to rapidly fix CO2, grow in nutrient-poor environments, and produce renewable chemicals directly from photosynthesis. Unlike triacylglycerol production in green algae, producing free fatty acids (FFAs) using genetically engineered cyanobacteria results in the secretion of the product into the culture medium for efficient recovery. However, an incomplete understanding of the regulation of cyanobacterial lipid metabolism limits the ability to rationally engineer high-titer FFA-producing strains. The overall objective of this project is to use systems biology to identify metabolic control points and bottlenecks that regulate flux to FFAs in the fast-growing, halotolerant Synechococcus sp. strain PCC 7002 via the modular optimization of metabolic pathways that are “upstream” and “downstream” of the nodal metabolite acetyl-CoA in a “push-pull” metabolic engineering strategy. Recent work has centered on FabH, the previously identified bottleneck of cyanobacterial FFA synthesis. Unexpectedly, overexpressing the kinetically superior E. coli FabH inhibits FFA production in a C8-producing PCC 7002 strain. Team members are using a suite of systems biology approaches, including 13C flux analysis, metabolomics, lipidomics, and proteomics, to investigate this and other distinctive FFA production phenotypes. These data will allow the team to identify and correct metabolic bottlenecks limiting FFA biosynthesis (i.e., “pull”) and optimize the carbon flux directed toward FFA synthesis (i.e., “push”). This approach will provide a deeper understanding of how fatty acid flux is regulated upstream and downstream of acetyl-CoA, enabling integrated “push-pull” metabolic engineering strategies to produce lipid products directly from photosynthetic CO2 fixation in cyanobacteria.

 

Inter-Facility Collaboration: Overarching Challenges and Opportunities Identified Through the “Genomes to Structure and Function” Virtual Workshop | Adams | Lawrence Berkeley National Laboratory | Adams | Structural Biology

The goal of this workshop was to explore the need for the DOE Biological and Environmental Research (BER) community to combine genomic, functional, and structural approaches to advance their research.

The goal of BER is to achieve a predictive understanding of complex biological, Earth, and environmental systems with the aim of advancing the nation’s energy and infrastructure security. To pursue this goal, collaborations among experts in diverse research areas that lead to multidisciplinary projects are indispensable. The roles of DOE’s user facilities, which offer unique and powerful resources for such research projects, are evolving, and expectations for the facilities are increasing. To respond to users’ needs, the Joint Genome Institute (JGI) and Environmental Molecular Sciences Laboratory (EMSL) initiated the Facilities Integrating Collaborations for User Science (FICUS) program in 2014. This collaboration has grown into a successful program, advancing more than 100 multidisciplinary projects to date. Similarly, new interfacility collaborations among the JGI, EMSL, and user resources for BER structural biology and imaging at the Basic Energy Sciences program’s synchrotron and neutron facilities are becoming essential for cutting-edge transdisciplinary science. To explore the need for the BER research community to combine genomic, functional, and structural approaches to advance their research, team members hosted the “Genomes to Structure and Function” virtual workshop, focusing on molecular structures, intracellular organization, material synthesis and decomposition, imaging the rhizosphere, and cellular organization. This workshop identified three overarching categories of challenges and opportunities: science, technology development, and interfacility integration. The team recently completed the report summarizing these findings, which are presented here on behalf of the organizing committee.

Metabolism in Microbial Communities and the Associated Biochemistry of Polymer DeconstructionYeatesUniversity of California–Los Angeles DOE Institute for Genomics and Proteomics GunsalusStructural BiologyUCLA DOE IGP

This team’s microbiology projects within the UCLA DOE Institute for Genomics and Proteomics employ molecular, biochemical, and in silico approaches to examine model microbial communities and their metabolic partners to better understand the processes that drive anaerobic carbon recycling in nature. This information impacts multiple areas of BER interest including bioconversion of model substrates in natural and manmade environments, the associated biochemistry of key degradative enzymes, and the design of plant-based biomass deconstruction strategies for biofuel production. Metabolic pathways for key substrates are being elucidated in model syntrophic communities with focus on their key enzymes and associated oxidation-reduction reactions. Next-generation omics methods are in development to interrogate environmentally relevant pathways and interactions in microbial communities as well as to test newly proposed functions where possible. Using the cellulolytic model microorganism, Clostridium thermocellum, team members are examining how anaerobic microbes synthesize and assemble their extracellular cellulosome structures that degrade lignocellulose.

Major activities within the UCLA-DOE Institute in the past year deal with three core areas of investigation.

Elucidation of syntrophic microbial pathways for metabolism of model substrates. Genomic, proteomic, and informatic studies were performed on defined microbial communities to elucidate how representative fatty acid substrates are metabolized. Core pathway enzymes for short- and branched-chain fatty acids were elucidated and further characterized in Syntrophomonas wolfei and S. wolfei subsp. methylbutyratica cells grown with Methanospirillum hungatei or Methanobacterium formicicum as the methanogenic partner. Recombinant and structural studies of acyl-CoA reductase enzymes of the beta-oxidation pathways were performed to further explore the biochemistry of these thermodynamically limiting steps during syntrophic cell growth. Associated electron transfer pathways leading to hydrogen production were also examined and documented.

To further explore syntrophic microbial diversity, PacBio long-read sequencing approaches were used to sequence, assemble, and annotate genomes of previously isolated syntrophic bacterial strains that utilize other model substrates when grown in co-culture with suitable methanogen partners. Team members are extending the gene annotation methods beyond the standard homology-based inferences to those based on co-evolution, such as phylogenetic profiling, phenotypic profiling, and operon conservation, with the goal of supporting microbial pathway prediction and modeling.
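
As an illustration of the phylogenetic profiling idea mentioned above, the following minimal sketch compares presence/absence profiles of gene families across genomes and reports families whose profiles resemble that of a query. The gene and genome names, the Jaccard similarity measure, and the 0.75 cutoff are all assumptions made for this example and are not taken from the team's pipeline.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of genome identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def profile_partners(profiles, query, min_similarity=0.75):
    """Rank gene families whose presence/absence profile resembles the query's."""
    target = profiles[query]
    hits = []
    for gene, genomes in profiles.items():
        if gene == query:
            continue
        score = jaccard(target, genomes)
        if score >= min_similarity:
            hits.append((gene, score))
    # Highest similarity first; ties broken alphabetically for stable output.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Hypothetical toy profiles: gene family -> genomes in which it is present.
profiles = {
    "geneA": {"g1", "g2", "g3", "g4"},
    "geneB": {"g1", "g2", "g3", "g4"},
    "geneC": {"g1", "g2", "g3"},
    "geneD": {"g5", "g6"},
}

partners = profile_partners(profiles, "geneA")
print(partners)
```

Gene families that are consistently gained and lost together across genomes receive high scores, which is the signal that co-evolution-based annotation relies on; a real analysis would also need to account for phylogenetic relatedness among the genomes.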

Acyl-lysine modification of syntrophic pathway proteins. Proteomic and mass spectrometry studies were performed to further characterize protein post-translational modifications of carbon and electron transfer pathway enzymes in model syntrophic strains. Because protein modification can affect enzyme activity, these data will help decipher the relationship between such modifications and the metabolism of syntrophic microbial communities. Acyl-lysine modifications, which can arise from reactive metabolites, were found in strikingly high abundance in the proteome of model syntrophic bacteria. Acetyl, butyryl, 3-hydroxybutyryl, and crotonyl modifications were observed in both S. wolfei and S. wolfei subsp. methylbutyratica. Interestingly, the methylbutyratica subspecies, which is capable of metabolizing longer carbon substrates, also displayed instances of methylbutyrylation, valerylation, and hexanoylation. The type and relative abundance of these modifications change significantly in response to different carbon sources, correlating with metabolic bottleneck points in the microbes’ degradation pathways.

Cellulosome assembly and display in cellulolytic anaerobic bacteria. In companion microbial studies, the team is investigating how highly cellulolytic anaerobic bacteria synthesize, assemble, and display cellulosomes. Clostridium thermocellum, a model bacterium capable of directly converting cellulosic substrates into ethanol and other biofuels, is being used to investigate how the cell fine-tunes the enzyme composition of its cellulosome using anti-σ factors to control gene expression in response to sensing extracellular polymers. Team studies have shown that the RsgI9 anti-σ factor interacts with cellulose via a C-terminal bi-domain unit that is likely extended from the cell surface. Current work seeks to elucidate the mechanism of signaling through its Conserved RsgI Extracellular (CRE) domain, which researchers hypothesize is proteolyzed to transduce signals to downstream σ factors that modulate the expression of cellulosome components. Researchers are also employing in silico comparative genomics approaches to identify conserved cellulosome biogenesis pathway components whose functional importance will be assessed in C. thermocellum. The results of these studies will provide new insight into anaerobic carbon recycling by naturally cellulolytic bacteria and could guide rational engineering efforts to create microbes that are capable of converting plant biomass into biofuels, materials, and chemicals.

Electron Diffraction for High-Resolution Structure Determination of BiomoleculesYeatesUniversity of California–Los Angeles DOE Institute for Genomics and ProteomicsVlahakisStructural BiologyUCLA DOE IGP

The team aims to develop electron diffraction (ED) methods to extract high-resolution structural information from biomolecules, determine novel biomolecular structures, expand phasing methods, and identify the extent to which ED data can inform on emergent properties of structures. Current goals include resolving the presence of substrates bound to enzymes within nanocrystals, identifying significant differences in electron scattering originating from crystals of opposite chirality, deconvoluting nanoscale lattice imperfections to resolve structures from more homogeneous subregions of a single crystal, and understanding the rates and impact of electron beam damage on biomolecular crystals during data collection. Ultimately, the project’s objectives are to address the remaining gaps where electron crystallography still lags behind its X-ray counterpart and to develop methods and strategies for closing them so that accurate structural information can be obtained, even from nanocrystals.

Electron diffraction has recently emerged as a powerful means of elucidating atomic structures from 3D crystals only nanometers in size, circumventing the need for the large, pristine crystals that successful structure determination by X-ray diffraction requires. The group aims to develop ED methods further, such that high-resolution structural information about biologically compelling molecules may be extracted from a wider range of targets with greater confidence. The team will discuss work done to expand the scope of molecular targets for electron diffraction as well as the range of ways to retrieve phase information from ED data, through the determination of structures encompassing short oligopeptides, cyclic peptides, natural products, and proteins. The team also interrogates the extent to which ED can be expected to provide accurate information on certain granular but nonetheless critical qualities of a molecule’s structure, such as the presence and configuration of a substrate bound in an enzyme’s active site or a chiral molecule’s absolute configuration. To carry out these investigations, researchers employ data collection from two methods in parallel:

  • continuous-rotation selected area diffraction (microED), which enables researchers to routinely determine structures and assess the quality of structural information readily accessible to the vast majority of current investigators in the field, and
  • scanning nanobeam diffraction (4D-STEM), which allows researchers to correlate high-resolution information in real space and reciprocal space to visualize lattice heterogeneity within single crystals and potentially resolve structures from crystal sub-domains.

As biomolecules are almost ubiquitously sensitive to damage by the electron beam, researchers probe mechanisms of radiation damage experienced by nanocrystals during data collection with both methods, offering insights into how the final structures solved by ED are impacted by beam damage and enabling the development of strategies for minimizing the damage delivered during ED experiments. Lastly, the team will perform ongoing work to validate and innovate sample preparation and data acquisition strategies for resolving improved macromolecular structures by electron diffraction.

Investigating Plant and Microbe Systems for Understanding the Formation and Modulation of Amyloid and Amyloid-Like ProteinsEisenbergUniversity of California–Los Angeles DOE Institute for Genomics and ProteomicsLutterStructural BiologyUCLA DOE IGP

The team aims to investigate plant and microbe systems as a platform for understanding the formation and modulation of amyloid or amyloid-like assemblies. This includes low complexity structures that create phase-separated physical systems that may be functional or pathogenic as well as those that form large macromolecular assemblies, particularly fibrils. Researchers therefore aim to better identify potential amyloid-forming sequences across microbial and plant species and identify compounds or molecules that inherently modulate the amyloid-forming propensity of such segments. The downstream aims of the work involve the identification of functional or pathogenic roles for such assemblies and specifically their roles in plant stress tolerance in response to increased pressures on agriculture resulting from climate change.

The aggregation of proteins into the amyloid state is often associated with disease, but amyloids or amyloid-forming sequences exist throughout the tree of life and have been known to carry out functional roles under normal physiological conditions. The team is therefore interested in the identification of amyloid-forming segments across microbes and plants and their modulation by endogenous molecules.

For amyloid prediction, the team relies on the fact that at their core, amyloid fibrils contain peptide segments assembled into tightly mated and interdigitated structures referred to as ‘steric zippers.’ Algorithms that rely on protein design approaches, such as secondary structure predictors and Rosetta modelers, have accurately predicted amyloid propensity for short segments, but they are computationally expensive to apply on the scale of entire genomes. Machine learning offers a faster and potentially accurate alternative for predicting amyloid propensity. Team members analyzed the amyloid-forming propensity of short peptide segments by training a neural network on the calculated Rosetta energy scores of computed six-residue steric zipper structures. Training a network on these data leverages many hours of compute time already invested in zipper structure predictions and promises to replace the expensive process with fast evaluation by the trained network. The network can yield propensity scores for six-residue segments in seconds rather than hours or days. After training the network on over a million structures, researchers evaluated its performance by scoring its accuracy against a computed set during training and assessed its speed of prediction when scoring entire genomes containing millions of hexapeptide sequences. Experimentally, researchers evaluate predicted scores from the network by probing the propensity of synthetic peptides to form steric zippers in solution. For those segments that form what appear to be amyloid assemblies, researchers characterize their structures by X-ray crystallography or electron diffraction and compare their structures and scores to those predicted by the network. Predicted segments tested thus far are competent in forming either fibrils or crystals with cross-beta structures. Separately, the team has applied bioinformatic tools to identify proteins with amyloid sequence features in the Arabidopsis thaliana proteome. Gene ontology data were then used to identify proteins with functional roles in environmental stress responses, including heat and drought tolerance as well as roles in response to pathogens. From these, proteins with previously validated in vivo functional roles were selected to investigate how the propensity of their low-complexity domains (LCDs) to drive protein phase separation and aggregation contributes to protein function, with the aim of modulating this behavior and the associated stress-response functions.
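
The genome-scale scoring step described above amounts to sliding a six-residue window along each protein sequence and evaluating every window with a fast scorer. The sketch below shows that scanning pattern only. The scorer here is a hypothetical placeholder (fraction of residues from an arbitrary hydrophobic set) standing in for the trained network; it is not the team's model and makes no claim about real amyloid propensity. The sequence and threshold are likewise made up.

```python
def hexapeptides(sequence, k=6):
    """Yield (start_index, segment) for every k-residue window in a sequence."""
    for i in range(len(sequence) - k + 1):
        yield i, sequence[i:i + k]


# Placeholder only: an arbitrary residue set, NOT a validated amyloid predictor.
PLACEHOLDER_RESIDUES = set("VILMFWY")


def placeholder_score(segment):
    """Hypothetical stand-in for the trained network's propensity score."""
    return sum(res in PLACEHOLDER_RESIDUES for res in segment) / len(segment)


def scan(sequence, scorer, threshold):
    """Return (index, segment, score) for windows scoring at or above threshold."""
    hits = []
    for index, segment in hexapeptides(sequence):
        score = scorer(segment)
        if score >= threshold:
            hits.append((index, segment, score))
    return hits


# Made-up example sequence.
sequence = "GSGSVIVIVFGSGS"
hits = scan(sequence, placeholder_score, threshold=1.0)
print(hits)
```

Because each window is scored independently, the scan is linear in proteome length and trivially parallel, which is what makes replacing an expensive per-segment structural calculation with a fast learned scorer attractive.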

The team’s efforts to identify amyloid modulators have focused on plant natural products identified using open-access databases and computational evaluation of absorption, distribution, metabolism, excretion, and toxicity (ADMET) profiles. Researchers score these compounds based on their drug-like physicochemical properties and targeting of model amyloids. The effects of the highest-ranking compounds on known amyloid proteins were evaluated in vitro using the fluorescent reporter Thioflavin T and transmission electron microscopy. Preliminary results show that several of the tested plant small molecules inhibit amyloid formation or significantly change sample morphology. Further characterization will involve determining the effect of the compounds on aggregation and on prion-like spread from cell to cell in biosensor cell assays.

In conclusion, this research will identify plant amyloid-forming proteins and plant molecules that influence amyloid aggregation, with implications for their functional roles in environmental stress responses.

Phenotypic Characterization of Sorghum Nitrogen Responsive Gene Edits Using High-Throughput PhenotypingJinUniversity of Nebraska–Lincoln JinBioenergyUniversity

Inefficient use of inorganic nitrogen (N) fertilizers can result in environmental issues and increased farming costs. To address this issue, this project, supported by the U.S. Department of Energy (DOE), aims to investigate the phenotypic and molecular effects of N treatments on a set of sorghum gene edits identified as N-responsive candidate genes through previous studies. As a pilot study, the team used UNL’s LemnaTec phenotyping facility to collect imagery data for five sorghum edits (including one triple gene edit, two double gene edits, and two single gene edits) throughout different developmental stages. The team extracted numerical phenotypic values from the images and performed statistical analyses to determine the effects of N treatment on each edit. The project’s results showed statistically significant phenotypic effects for several edits in response to N treatments. Based on these preliminary findings, the team will refine phenotypic characterization procedures and conduct additional phenotyping for other gene edits.

The outcomes of this project will contribute to the development of sustainable and efficient crop production methods, thus advancing agricultural practices in a more environmentally responsible manner.

A Leaf-Level Spectral Library to Support High-Throughput Plant Phenotyping: Predictive Accuracy and Model TransferWijewardaneMississippi State UniversityWijewardaneBioenergyUniversity

Leaf-level hyperspectral reflectance data has become an effective tool for high-throughput phenotyping of plant leaf traits due to its rapid, low-cost, multisensing, and nondestructive nature. However, model calibration is often expensive regarding the number of samples, time, and labor; and models show poor transferability among different datasets. Building large spectral datasets across multiple species enables accurate model calibration and improves model transferability. The team pursued three specific objectives: (1) assemble a large library of leaf hyperspectral data (n=2,460) constructed from maize and sorghum, (2) evaluate two machine-learning approaches to estimate nine leaf properties, and (3) investigate the utility of this spectral library for predicting external datasets (n=445) including maize, sorghum, soybean, and camelina using extra-weighted spiking. Partial least squares regression models exhibited higher predictive performance than deep neural network models.

Models calibrated solely using the spectral library showed poor performance when applied to external datasets (R² < 0.3 for N, P, and Ca with camelina samples). However, model transferability improved significantly when extra-weighted spiking with a small dataset (n=20) was employed (R² > 0.6 for N, P, and Ca with camelina samples), indicating that it can improve the effectiveness and utility of spectral libraries in high-throughput phenotyping contexts.
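
The sketch below illustrates the spiking idea under stated assumptions: extra weighting is taken to mean adding multiple copies of a few samples from the target dataset to the library before recalibration, and a single-predictor least-squares line is swapped in for partial least squares regression to keep the example dependency-free. All numbers are made up, and the function names are hypothetical.

```python
def fit_line(samples):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def spike(library, local_samples, weight=1):
    """Add `weight` copies of each local (spiking) sample to the calibration set."""
    return library + local_samples * weight


def mean_abs_error(model, samples):
    """Mean absolute prediction error of a (slope, intercept) model."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in samples) / len(samples)


# Made-up data: the library follows y = 2x, the external set follows y = 2x + 5.
library = [(float(x), 2.0 * x) for x in range(10)]
spiking_set = [(3.0, 11.0), (7.0, 19.0)]
target_holdout = [(2.0, 9.0), (4.0, 13.0), (6.0, 17.0)]

error_library_only = mean_abs_error(fit_line(library), target_holdout)
error_spiked = mean_abs_error(fit_line(spike(library, spiking_set, 1)), target_holdout)
error_extra_weighted = mean_abs_error(
    fit_line(spike(library, spiking_set, 10)), target_holdout
)
print(error_library_only, error_spiked, error_extra_weighted)
```

In this toy case two unweighted spiking samples barely move a model dominated by the library, whereas replicating them pulls the calibration toward the target dataset, which is the intuition behind giving the spiking subset extra weight.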

Ghost Imaging of Biological Samples Using Non-Degenerate Entangled PhotonsWernerLos Alamos National Laboratory RyanBioimaging

This project aims to image plants using noninvasive and nondisruptive methods. Using entangled photons, plant samples will be probed with near-infrared (NIR) photons, but an image will be generated with visible photons that never interacted with the sample, in an approach called ghost imaging. This method will measure the NIR absorption of plant matter to determine chemical composition without requiring the introduction of fluorescent probes. The advantage of ghost imaging is that the plant will experience extremely low light flux from the probing photons, leaving the organism undisturbed.

Studying dynamic processes in plants using optical techniques is challenging because plants are sensitive to visible light. For example, camelina and sorghum are two species that are drought tolerant and have high economic values. However, water redistribution within these plants during various growth phases is difficult to measure without disturbing the ongoing biological processes. As photosynthesis is active when plants are illuminated with visible light, the process of probing plants inherently induces an undesirable response. The team reports on efforts to image plants in the near-infrared (NIR) region of the spectrum where chemical signatures can inform about the distribution of specific biomasses such as water, lipids, and lignocellulose.

While most common techniques in the NIR (e.g., FT-IR and Raman spectroscopy) involve high photon fluxes to measure chemical signatures, team members are pursuing ghost imaging to minimize any potential disruption to the functions of the plants. Ghost imaging involves generating photon pairs that are spatially and temporally correlated. One photon of a pair is used as a probe to be absorbed by or transmitted through a living plant. The other photon is sent along a different optical path to be detected by an imaging device that records the spatial origin of the photon. By correlating the photons that are transmitted through the plant with those that arrive on the imaging detector, an absorption image can be formed with extremely low light flux. Furthermore, the two photons of a pair can have different wavelengths, allowing the plant to be probed in the NIR while the image is formed in the visible. Ghost imaging is challenging because of the requirements on the imaging detector. Common scientific cameras cannot operate at high enough frame rates to perform the correlation, and other detector technologies, such as single-photon avalanche photodiode (SPAD) arrays, have not reached maturity for application to these types of imaging experiments.
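
The correlation step can be illustrated with a toy one-dimensional simulation. This is a sketch under strong simplifying assumptions: perfect position correlation within each pair, ideal detectors, and no background or timing jitter. The transmission values, pair count, and function name are invented for the example and do not describe the team's instrument.

```python
import random


def ghost_image(transmission, n_pairs, seed=0):
    """Estimate per-pixel transmission from simulated photon-pair coincidences."""
    rng = random.Random(seed)
    n_pixels = len(transmission)
    idler_counts = [0] * n_pixels   # photons seen by the imaging detector
    coincidences = [0] * n_pixels   # idler photons whose partner reached the bucket
    for _ in range(n_pairs):
        # Both photons of a pair share one transverse position (ideal correlation).
        pixel = rng.randrange(n_pixels)
        idler_counts[pixel] += 1
        # The probe photon passes through the sample with probability T[pixel];
        # only then does the single-element (bucket) detector click.
        if rng.random() < transmission[pixel]:
            coincidences[pixel] += 1
    return [c / i if i else 0.0 for c, i in zip(coincidences, idler_counts)]


# Made-up sample: opaque, half-absorbing, and transparent regions.
transmission = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
estimate = ghost_image(transmission, n_pairs=6000)
print([round(value, 2) for value in estimate])
```

The image is built entirely from the positions recorded on the imaging detector, yet it reproduces the sample's transmission because only coincident events are counted. In practice the coincidence test is a timing comparison between detectors, which is why the detector's temporal resolution is the limiting requirement.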

Team members demonstrate the application of a low-light imager, Nocturnal Camera (NCam), to ghost imaging for biological research. NCam is capable of recording single-photon arrivals with the spatial and temporal resolution to perform ghost imaging. Specific details about the design of a microscope based on ghost imaging with NCam will be presented as well as the most recent results from imaging experiments.

Biological Imaging of Plant and Fungal Samples Using Infrared LightMoralesLos Alamos National LaboratoryMoralesBioimaging

The team aims to measure water, lignocellulose, and lipid content in plants using a quantum ghost imaging microscope that exploits entangled photon pairs. In this approach, one photon from the entangled photon pair passes through the sample and is detected by a single-element detector, while the other is detected by an imaging sensor. The microscope probes a sample in the near- or mid-infrared (where a molecular vibrational fingerprint is detectable) and images in the visible, where efficient low-noise imaging detectors are available. Entangled photon pairs are generated one pair at a time by spontaneous parametric down-conversion. Because the two photons are detected simultaneously, directly counting coincidences of photons correlated in detector arrival time greatly reduces the noise of the system. This system generates high signal-to-noise ghost images with an illumination less than starlight, allowing extended monitoring over time with little to no perturbation to the biological sample.

The team demonstrates the utility of infrared spectroscopy on environmental and plant samples as a complementary method for identifying key molecular features for biological monitoring using quantum ghost imaging microscopy. For this project, team members characterized two plant models (poplar and sorghum) and a species of fungus to assess the features obtained by infrared light on samples broadly relevant to the DOE BER research portfolio. Researchers perform conventional Fourier transform infrared spectroscopy to obtain infrared spectra of chemical features such as lignocellulose, proteins, and lipids in samples to generate a database relevant to sample biomass. This database is then analyzed by principal component analysis to determine the spectral basis on which samples can be separated according to phenotypic differences. By identifying the features that are most divergent between samples, researchers can assess the major biological shifts being observed at probe wavelengths during long-term monitoring by quantum ghost imaging.
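
The principal component step can be sketched as follows. This is a minimal, dependency-free illustration that finds only the first principal component by power iteration; the "spectra" are made-up absorbance values in three arbitrary bands for two hypothetical sample groups, not measured data, and a real analysis would typically use a numerical library.

```python
import math


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def first_component(rows, iterations=200):
    """Return (loading vector, sample scores) for the first principal component."""
    x = center(rows)
    d = len(x[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        # Multiply v by X^T X without forming the covariance matrix explicitly.
        projections = [sum(a * b for a, b in zip(row, v)) for row in x]
        w = [sum(projections[i] * x[i][j] for i in range(len(x))) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # start vector orthogonal to all variance; keep previous v
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, scores


# Made-up absorbances in three bands; rows 0-1 and rows 2-3 are two sample groups.
spectra = [
    [1.0, 0.2, 0.50],
    [1.1, 0.2, 0.55],
    [0.2, 0.2, 0.60],
    [0.3, 0.2, 0.65],
]
component, scores = first_component(spectra)
print(component, scores)
```

The loading vector indicates which bands differ most between samples, and the sign of each score separates the two groups, which mirrors how divergent spectral features would be chosen as probe wavelengths.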

Putting Microorganisms on the Map: Continental-Scale Context for Thousands of Newly Sampled Microbial Genomes from North American WetlandsWrightonColorado State UniversityWrightonEnvironmental MicrobiomeUniversity

Despite the vital roles of microorganisms in transforming nutrients and controlling greenhouse gas (GHG) fluxes in wetlands, knowledge of wetland microbes is often limited to taxonomic identity alone and rarely includes cross-site comparisons. The team proposes to address this knowledge gap using coordinated, reproducible field measurements collected across a wetland-methane continuum from at least 25 wetlands in the continental United States. The overarching project goal is to decode the unifying microbial properties governing soil carbon decomposition and methane fluxes, both within and among freshwater wetlands. This project tests the overarching hypothesis that microbial genomic attributes are conserved across high methane-emitting samples or wetlands, such that some level of biological representation in models will enhance predictions of soil methane fluxes. First, the team will use a cross-wetland approach to define the microbial membership, physiology, and interactions directly contributing to wetland methane production. Next, the team will uncover the microbial decomposition network features that classify high methane-emitting wetlands. Using this information, the team will test the genomic resolution needed to make robust predictions of methane fluxes across regional and global models. These integrated field, laboratory, and modeling approaches will identify the biotic attributes conserved across methane-emitting wetlands, such that some level of biological representation in models will enhance predictions of soil methane fluxes.

Today, wetland contributions to the global methane budget are estimated from ecosystem-scale models. These models exclude representation of soil microbial metabolism or are based on incomplete or outdated knowledge of the physiological controls on soil methane metabolism. Instead, these models use abiotic (e.g., temperature) and indirect biotic (e.g., gross primary production) variables to approximate the environmental states enabling soil methane flux. However, years of observations from more than 40 freshwater wetlands showed that these variables only partially predicted annual methane fluxes. The deviation of the predictions from observations indicates that predictions derived only from abiotic variables incompletely represent methane flux, especially for the highest emitting wetlands. Team members posit that knowledge of methane-cycling microorganisms and their physiological networks will enhance freshwater wetland model predictions. In this proposed research, the team will identify the microbial processes impacting wetland methane fluxes and evaluate their biogeographic conservation. Researchers will distill this content into an ecosystem model with the goal of closing “the gap” between measured and predicted methane fluxes from wetlands.

In the first year of this project, the team leveraged publicly available 16S amplicon data to begin addressing the biogeographical conservation of microbial community composition and metabolic function across wetlands. Researchers cataloged the microbial community membership from 1,118 wetland samples collected from nine geographically dispersed wetlands. Samples screened with 16S rRNA were also contextualized with soil chemistry and flux data, and many sites also included genome-resolved metagenomics, transcriptomics, and metabolomics. The analysis shows that marshes, fens, and bogs have distinct microbial communities, such that wetland type, more than geography, climate, or soil features, drives microbial community composition. Despite sitewide differences at the community level, researchers observed that six genera of methanogens (e.g., Methanoregula, Methanothrix) and four genera of methanotrophs (e.g., Methylobacter) were core across wetland samples. Notably, the three highest methane-emitting wetlands, all marshes, shared methanogen and methanotroph membership and distribution patterns. In fact, the dominance of Methanoregula is a strong predictor of methane flux across wetlands. Researchers also show that wetlands with annual methane emissions that deviate the most from temperature-based predictions had the highest Methanoregula relative abundance. Together, these preliminary cross-wetland data illuminate the conservation of microorganisms across high methane-emitting wetlands, narrowing the diverse soil carbon cycling community to a “most-wanted” list of fruitful targets for genomic and metabolic network efforts and yielding the process-based knowledge needed for biologically aware models.

To begin to illuminate the metabolic features of carbon decomposition in high methane-emitting wetlands, the team created the second version of the Genome Resolved Open Watershed (GROW2) database. This public genomic resource contains the identity and distribution of 26,000 unique microbial genomes from wetlands, including over 500 methanogen and methanotroph genomes. This spatial sampling scheme, coupled to a breadth of ecological dimensions (e.g., wetland type, methane emission rate, land use), will enable the team to systematically identify the microorganisms and metabolic networks associated with high methane-emitting wetlands. As a preliminary approach, the team performed site-specific co-occurrence analysis to uncover the network of microorganisms coordinated to methanogens at each site. Linking the 16S rRNA data to the project’s GROW2 genomes, team members developed metabolic profiles for these methanogen-connected taxa. Researchers show that obligate fermenters (e.g., many syntrophs) have the highest connectivity to the most abundant methanogens. Additionally, the highest methane-emitting wetlands had the fewest methanogen connections, suggesting streamlined metabolic circuits may contribute to enhanced methane production across wetland soils. Ultimately, GROW2 is a living road map, articulating the power of microbiome science to decode microbial organismal and metabolic patterns at scales necessary for ingestion into predictive modeling frameworks. GROW2 is publicly available on KBase, engendering collaborative enterprises with the goal of advancing a new era of climate-driven research in wetlands.
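
A site-specific co-occurrence analysis of this kind can be sketched as a correlation network in which taxa are linked when their abundances rise and fall together across samples. The example below is only illustrative: the use of Pearson correlation, the 0.8 threshold, the taxon labels other than Methanoregula, and all abundance values are assumptions, since the abstract does not specify the team's method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def neighbors(abundance, focal, threshold=0.8):
    """Taxa whose abundance correlates with the focal taxon at or above threshold."""
    return sorted(
        taxon
        for taxon, series in abundance.items()
        if taxon != focal and pearson(abundance[focal], series) >= threshold
    )


# Made-up relative abundances across five samples from one hypothetical site.
abundance = {
    "Methanoregula": [1, 2, 3, 4, 5],
    "fermenter_A": [2, 4, 6, 8, 10],
    "fermenter_B": [1, 2, 3, 5, 4],
    "unrelated": [3, 1, 4, 1, 5],
}

connected = neighbors(abundance, "Methanoregula")
print(connected, len(connected))
```

Counting the neighbors of each methanogen per site gives the connectivity measure that can then be compared across wetlands; compositional data usually call for more careful correlation estimators than the plain Pearson coefficient used here.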

Construction of a Microbial Methane Observatory Reveals Metabolic Dynamics of Freshwater Wetland MicrobiomesWrightonColorado State UniversityOliverioEnvironmental MicrobiomeUniversity

This project interrogates microbial contributions to carbon cycling in soils from a freshwater, coastal wetland adjacent to Lake Erie, OH. This site was selected because it has the highest annual methane fluxes within the AmeriFlux network. To profile the microbial contributions to methane release, the team instrumented the wetland with in situ porewater and greenhouse gas (GHG) flux (i.e., CO2 and CH4) measurements and a flux tower for site-wide flux measurements. A first goal of this project required construction of a microbial genome-resolved database to interrogate microbial physiological contributions to wetland GHG fluxes. Second, the team developed a computational application in the annotation software Distilled and Refined Annotation of Metabolism (DRAM) to profile the microbial traits in this genomic database. Lastly, the team mapped metatranscriptomes collected from temporal sampling and highly spatially resolved sampling to resolve expressed microbial community metabolic traits over years, within seasons, and along centimeter depth to meter land-coverage gradients. Leveraging the multiomics data in conjunction with geochemical, metabolomic, and GHG measurements, researchers provide unprecedented insight into how hydrological perturbations impact methane flux from a coastal, freshwater wetland.

Freshwater wetlands contribute over a third of global methane emissions and store 30% of global soil carbon, which may become increasingly available for microbial degradation into GHGs. Despite their global climatic significance, it remains difficult to establish a mechanistic understanding of the microbial controls on soil biogeochemical processes, limiting the ability to predict GHG emissions. Here, researchers built a comprehensive catalog of 17,333 metagenome-assembled genomes representing 2,502 dereplicated genomes spanning 72 phyla from freshwater wetlands, of which 57% represent novel lineages with no genomic representation. The team then coupled its wetlands genome database with 133 genome-resolved metatranscriptomes and highly resolved compositional and geochemical profiling of over 700 samples to identify the dominant axes of environmental variation shaping the composition and transcription of microbial genomes in the wetlands system. Researchers further delineate the contributions of major microbial lineages to biogeochemically relevant processes including methanogenesis, sulfate reduction, and denitrification.

Although soil methanogens and methanotrophs have been cultivated for decades, the project’s genome recovery approach resulted in 88 methanogen genomes representing all methanogenic pathways. These include three novel families, 17 novel genera, and, for some taxa, the first genomes identified in a wetland environment, illuminating the phylogenetic and metabolic diversity harbored in terrestrial ecosystems. First, the team used NMR-measured metabolites to survey the distribution of methanogenic substrates across the wetland; with flooding, researchers observed increases in the methanogen substrates acetate and methanol only in the surface soils. Concomitant with increased availability of these substrates, the team observed that gene activity from methanogens utilizing acetate and methanol in these surface soils increased 4.9- and 1.6-fold, respectively, with flooding. This expanded substrate availability and the higher substrate concentrations likely contributed to the nearly 5-fold increase in methane production reported in the surface soils with flooding. Surprisingly, while researchers expected that the shift to anoxia with flooding would decrease aerobic methanotrophy in surface soils, gene expression data indicated that aerobic methanotrophs metabolized methane using very low oxygen concentrations. Team members also demonstrated the first gene expression evidence for nitrate-enabled methanotrophy in soil systems. Given that methanol-utilizing methanogens and anaerobic methanotrophs are currently not accounted for in climate models, the findings provide important data that add realism to biogeochemical models of terrestrial methane emissions.

Using co-expression network analysis, the team revealed coordinated, active microbial neighborhoods that are localized to depth but stable across months, seasons, and years. These microbial guilds are predictive of in situ GHG concentrations. Using integrated metabolite and metatranscriptome data, researchers reconstructed the metabolic circuitry explaining a substantial fraction of variability in these GHGs. The findings link biogeochemical shifts to the genome transcription of specific microbial lineages and processes and begin to establish an integrative framework for leveraging high-dimensional multiomics data towards process-based models of the ecology and biogeochemistry of freshwater wetlands.
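
As a rough illustration of the co-expression idea described above, the following sketch links genes whose expression profiles are strongly correlated across samples and groups them into connected modules. The gene names, expression values, and the 0.9 correlation threshold are hypothetical, and the abstract does not specify the team's actual network method, so this should be read as one simple way to build such a network rather than the approach used in the study.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def coexpression_modules(expression, threshold=0.9):
    """Group genes into modules: connected components of the graph whose
    edges join gene pairs with correlation >= threshold."""
    neighbors = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if pearson(expression[g1], expression[g2]) >= threshold:
            neighbors[g1].add(g2)
            neighbors[g2].add(g1)

    modules, seen = [], set()
    for gene in expression:
        if gene in seen:
            continue
        # Depth-first walk collects every gene reachable from this one.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Hypothetical transcript abundances across five samples.
expression = {
    "mcrA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mtaB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "dsrA": [5.0, 4.0, 3.1, 2.0, 1.0],
    "narG": [4.8, 4.1, 3.0, 2.2, 0.9],
}
modules = coexpression_modules(expression)
print(modules)
```

In this toy input, the two genes that rise together form one module and the two that fall together form another; real analyses typically also normalize the data and choose the threshold more carefully.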

Membrane Potential Dynamics during the Cell Cycle of Single Proliferating and Nonproliferating Bacterial Cells | Weiss | University of California–Los Angeles | Bharadwaj | Bioimaging
  • To develop optical probes that use the quantum-confined Stark effect and quantum tunneling techniques to quantitatively investigate bacterial membrane potentials (MP).
  • To develop advanced detectors for quantum technologies, including time-gated single-photon avalanche diode (SPAD) array imagers and time-correlated single-photon counting (TCSPC) avalanche diode (SPAD) arrays for various imaging
  • To develop the phasor fluorescence lifetime imaging microscopy (phasor-FLIM) approach for classical pulsed laser light MP imaging using the SwissSPAD3 detector, and install current and future detectors on an inverted microscope with a scanning stage and a biofilm microfluidic
  • To develop and benchmark non-classical light FLIM for membrane potential imaging using entangled two-photon absorption and antibunching g(2)(r) imaging of MP.
  • To test both classical and non-classical light MP imaging techniques on DOE-relevant model biofilms of Bacillus subtilis and Shewanella oneidensis MR-1.

Membrane potential (MP) is a crucial aspect of cellular physiology. Recent studies have explored its role in cell-cell communication and metabolic coordination within bacterial biofilms. MP plays an important role in broadcasting the metabolic status of cells located in different parts of the biofilm. MP has also been shown to play an important role in the stress response of microbial cells under various stressors.

Despite the expanding understanding of MP’s role in bacteria, little is known about how it changes during the cell cycle and under different growth conditions. In this project, the team studies MP changes in single bacterial cells associated with different growth conditions (proliferating vs. non-proliferating) and under various stressors, using VoltageFluor (VF) dyes. VF dyes insert into the lipid bilayer of the bacterial cell’s plasma membrane. They are presumed to exhibit an electric field-dependent photo-induced electron transfer (PeT) process that modulates the fluorescence quantum yield and the excited-state lifetime of the dye. Preliminary intensity-based widefield fluorescence measurements have been performed over long periods of time to record MP fluctuations in single cells, in both rapidly growing and growth-inhibited cultures of Bacillus subtilis (B. subtilis). Very different MP time-trajectories have been recorded for proliferating and non-proliferating cells (Figure 1).

Further experiments utilizing the dye’s excited-state lifetime as an MP readout are being performed using (1) a commercial confocal FLIM and (2) a home-built widefield FLIM set-up that utilizes a time-gated SPAD camera and a phasor-based analysis to quantitatively capture small changes in resting MP during the cell cycle. A better understanding of MP changes in proliferating bacteria may provide better tools to manipulate microbial cell growth.
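
To make the phasor-based lifetime readout mentioned above more concrete, here is a minimal sketch of the standard phasor transform applied to a simulated decay. The 2 ns lifetime, 12.5 ns laser period, and bin count are hypothetical values chosen for illustration; this is not the team's analysis pipeline, and real data would also need background and instrument-response correction.

```python
import math


def phasor(decay, period_ns):
    """Return phasor coordinates (g, s) of a decay histogram sampled
    uniformly over one laser period, at the laser repetition frequency."""
    n = len(decay)
    omega = 2 * math.pi / period_ns  # angular frequency in rad/ns
    times = [k * period_ns / n for k in range(n)]
    total = sum(decay)
    g = sum(i * math.cos(omega * t) for i, t in zip(decay, times)) / total
    s = sum(i * math.sin(omega * t) for i, t in zip(decay, times)) / total
    return g, s


def phase_lifetime(g, s, period_ns):
    """Phase lifetime tau = s / (g * omega); exact only for a single-exponential decay."""
    omega = 2 * math.pi / period_ns
    return s / (g * omega)


# Hypothetical single-exponential decay: 2 ns lifetime, 12.5 ns laser period.
PERIOD_NS, TAU_NS, N_BINS = 12.5, 2.0, 4096
decay = [math.exp(-(k * PERIOD_NS / N_BINS) / TAU_NS) for k in range(N_BINS)]

g, s = phasor(decay, PERIOD_NS)
tau = phase_lifetime(g, s, PERIOD_NS)
print(f"g={g:.3f}, s={s:.3f}, tau={tau:.2f} ns")
```

A single-exponential decay lands on the "universal semicircle" (g - 0.5)^2 + s^2 = 0.25, so a shift in lifetime moves the point along that arc, which is what makes small lifetime changes easy to track without curve fitting.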

The team plans to image whole biofilms and resolve MP changes in individual cells within the community under steady-state conditions (i.e., steady-state fluctuations) and dynamically under non-equilibrium conditions upon introducing various environmental stressors, spatial gradients, nutrient limitation, and substrate consumption conditions.

Laboratory for BioMolecular Structure: A DOE-Funded National Cryo-EM Center | McSweeney | Brookhaven National Laboratory | Wang | Structural Biology

The Laboratory for BioMolecular Structure (LBMS) at Brookhaven National Laboratory (BNL) provides peer-reviewed research access, support, and training for the use of cryo-electron microscopy (cryo-EM). Cryo-EM has been widely employed to determine biomolecular structures, and the establishment of LBMS makes it possible for the research community to advance the foundational knowledge of the biological complexity of plant and microbial metabolism and their interfaces. A key goal for LBMS is to attract DOE-BER sponsored researchers to take advantage of LBMS’ cryo-EM and help them in all phases including project initiation, sample preparation, data collection, processing, and interpretation. Three tiers of training are offered so that new users can later conduct independent research and experienced users get the very best data possible at LBMS. LBMS also establishes a culture of innovation to extend the state of the art by exploring new methods of sample preparation, data collection and analysis, and automation, leading to improved throughput and accuracy of the structures determined.

To advance understanding of biological processes and complexity, cryo-electron microscopy (cryo-EM) has become a preferred method for studying structures of biological macromolecules and high-order machinery. However, for many institutions and research groups, acquiring and operating a state-of-the-art cryo-EM facility remains cost-prohibitive. With joint support from New York State and the Department of Energy (DOE), the Laboratory for BioMolecular Structure (LBMS) at Brookhaven National Laboratory, a national cryo-EM center, provides cutting-edge instruments and operations for imaging biological structures and processes. LBMS provides merit-based, no-cost access to non-proprietary users as well as cost-recovery access to proprietary users.

The mission of LBMS is to accelerate fundamental understanding about the building blocks and their functions in all living organisms. LBMS strives to foster expeditious developments in biotechnology and medicine to meet the Nation’s urgent needs in biofuels and healthcare. LBMS fulfills its mission by offering training and access to highly advanced cryo-EM capabilities to the broad research community. In 2022, LBMS supported 92 users to collect 809,032 images, and 71 structures were determined. Ten papers were accepted/published and fourteen manuscripts are under review. LBMS also offers training to current and potential users. In addition to in-person training for sample preparation and screening, and remote training for data collection on the high-resolution EM, the team hosted the second annual cryo-EM course, which was attended by ~400 people from 24 countries. For current and potential LBMS users, the team organized four quarterly cryo-EM workshops that focused on practical aspects of cryo-EM techniques and involved intensive interactions among instructors and attendees. The average rating of the workshops was 4.5/5.0 and 100% of the attendees would recommend the workshops to others.

To help users perform their best achievable research, LBMS explores and optimizes emerging methods of instrumentation, sample preparation (negative staining, vitrification, cryo-focused ion beam milling), data collection (single particle analysis, cryo-electron tomography), data analysis, and automation. LBMS is a user-centric facility that pursues excellence through user training, user support, instrument operation, and facility development.

Phage Engineering for Targeted Editing of Microbial Communities | Northern | Lawrence Berkeley National Laboratory | Adler | Environmental Microbiome | m-CAFEs

Understanding the interactions, localization, and dynamics of grass rhizosphere communities at the molecular level (genes, proteins, metabolites) to predict responses to perturbations and understand the persistence and fate of engineered genes and microbes for secure biosystems design. To do this, advanced fabricated ecosystems are used in combination with gene editing technologies such as CRISPR-Cas and bacterial virus (phage)-based approaches for interrogating gene and microbial functions in situ—addressing key challenges highlighted in recent DOE reports. This work is integrated with the development of predictive computational models that are iteratively refined through simulations and experimentation to gain critical insights into the functions of engineered genes and interactions of microbes within soil microbiomes as well as the biology and ecology of uncultivated microbes. Together, these efforts lay a critical foundation for developing secure biosystems design strategies, harnessing beneficial microbiomes to support sustainable bioenergy, and improving the understanding of nutrient cycling in the rhizosphere.

Bacteriophages are estimated to be the most abundant biological entities on Earth, outnumbering bacteria by ten to one. Owing to their natural ecological abundance, genetic diversity, and ability to transduce DNA, they represent attractive gene delivery vehicles to edit microbial communities in situ. However, the ability to broadly edit phages themselves has been limited by the diversity of mechanisms phages use to protect their DNA genomes. To edit this diversity of phages, researchers establish a generalizable editing approach for phage genetic manipulation based on the RNA-guided, RNA-targeting endonuclease LbuCas13a. Researchers find LbuCas13a to be a remarkably potent phage inhibitor, suggesting that phage RNA is generally vulnerable during viral infection. When LbuCas13a is challenged against phages spanning the Escherichia coli phage phylogeny, researchers find no apparent phage-encoded limits to its antiviral activity. Further, researchers highlight how this potent anti-phage activity can be leveraged to flexibly edit diverse phages with edits as small as a single codon or as large as multi-gene deletions. Researchers further discuss opportunities for engineered phages to edit microbial communities within fabricated ecosystems using phage-derived base editing technology and novel phages infecting members of synthetic microbial consortia. The ability to robustly edit bacteriophages will not only lead to a deeper understanding of phage genetic diversity but also facilitate meaningful genetic changes to microbial communities.

Metabolic Modeling and Genetic Engineering of Enhanced Anaerobic Microbial Ethylene Synthesis | North | The Ohio State University | North | Bioenergy | University

To develop robust and optimized anaerobic ethylene pathways in photosynthetic and lignocellulosic bacteria for high-yield conversion of renewable CO2 and lignocellulose into bioethylene. This will be accomplished by:

1: First, bioinformatically mining and experimentally screening methylthio-alkane reductase homologs, SAM hydrolase homologs, and alcohol dehydrogenase homologs from cultivated and uncultivated organisms for functional enzymes that enhance ethylene yields.

2: Next, constructing and employing predictive systems-level models of ethylene production. Using a physics-based R. rubrum model, researchers predict enzymes that participate in competing or supporting pathways and are thus targets for selection studies to increase ethylene yields.

3: Finally, researchers metabolically engineer bacteria for enhanced, sustained ethylene production from CO2 and lignocellulose. Researchers assemble the best-performing genes under control of optimized active transcription elements on a modular DNA fragment in a combinatorial manner with guidance from predictive models (Aim 2).

Previously, researchers detailed a pathway in the phototrophic bacterium Rhodospirillum rubrum that produces ethylene in the absence of oxygen from methionine and ATP (Fig. 1; North et al. 2020). Traditional ethylene production involves energy-intensive cracking of petroleum fossil fuels to meet the 300 million metric ton annual demand. Thus, a sustainable microbial platform for the renewable production of ethylene is urgently needed. The goal of this project is to optimize this anaerobic ethylene production pathway, as outlined above.

Enzyme Screens: Experimentally verified amino acid sequences for each reaction (Fig. 1, *) were queried against Uniref, KEGG, GenBank, and JGI-IMG databases using an e-value cutoff of 1e-10. Sequences were also searched using BLAST (e-value cutoff of 1e-10) in a curated set of assembled genome databases spanning soils, rivers, and the human gut. Combined sequences from all databases were dereplicated using CD-HIT to remove identical sequences. Methylthio-alkane reductase genes were synthesized by the JGI DNA synthesis program, and synthesis of SAM hydrolases and alcohol dehydrogenases is in process. Insertion of the methylthio-alkane reductase genes into an R. rubrum native methylthio-alkane reductase deletion strain revealed that sequences from closely related alphaproteobacteria like Rhodoblastus sphagnicola increase ethylene synthesis from 2-methylthioethanol up to 2-fold. Furthermore, these screens revealed sequences from clostridia, bacilli, negativicutes, and fibrobacter species that are also methylthio-alkane reductases, expanding the understanding of the diversity of organisms that can synthesize hydrocarbons from volatile organic sulfur compounds. In parallel to these enzyme screens, researchers have further enumerated the substrates and reactions catalyzed, revealing that propane, propylene, and butane can also be synthesized.
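
The screening steps above (keep hits at or below an e-value cutoff, then drop identical sequences) can be sketched as follows. This assumes BLAST's standard 12-column tabular output (`-outfmt 6`), where the subject ID is the second column and the e-value the eleventh, and it treats the cutoff as inclusive. The exact-match `dereplicate` function is a simple stand-in for CD-HIT, not a reimplementation of it, and all hit names and sequences are hypothetical.

```python
def filter_hits(blast_tsv_lines, max_evalue=1e-10):
    """Return unique subject IDs from BLAST tabular (outfmt 6) lines
    whose e-value is at or below the cutoff, in first-seen order."""
    kept = []
    for line in blast_tsv_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.rstrip("\n").split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue and subject_id not in kept:
            kept.append(subject_id)
    return kept


def dereplicate(sequences):
    """Keep the first ID seen for each distinct sequence (case-insensitive exact match)."""
    seen, unique = set(), {}
    for seq_id, seq in sequences.items():
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique[seq_id] = seq
    return unique


# Hypothetical BLAST rows and protein fragments, for illustration only.
blast_lines = [
    "query1\thitA\t62.1\t410\t150\t3\t1\t408\t5\t412\t3e-45\t160",
    "query1\thitB\t35.0\t120\t75\t4\t10\t128\t2\t119\t2e-04\t41.2",
    "query1\thitC\t58.7\t400\t160\t2\t1\t399\t1\t398\t1e-10\t98.5",
]
sequences = {"hitA": "MKTLLV", "hitB": "MSSAAG", "hitC": "mktllv"}

hits = filter_hits(blast_lines)
unique = dereplicate({hit: sequences[hit] for hit in hits})
print(hits, list(unique))
```

Here `hitB` fails the e-value cutoff, and `hitC` passes it but is removed as an exact duplicate of `hitA`.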

Physics-based Modeling: To ensure that the model of cell metabolism is complete enough to capture the important dynamics for ethylene production, researchers iteratively built and tested the model on four photoheterotrophic growth conditions: growth on fumarate, malate, acetate, and ethanol. Testing and refining the model on a variety of growth conditions helps to ensure that the principles discovered regarding the metabolic processes are general and not artifacts of an incomplete model. Researchers have now tested the model with each growth condition and a range of redox states and compared the results to various experimental studies. In summary, the model predicts that flux through phosphoenolpyruvate carboxykinase (PEP-CK) during growth on malate and fumarate is thermodynamically feasible for synthesizing phosphoenolpyruvate in lower glycolysis, as observed experimentally (McCully and McKinlay 2016). However, the model indicates that the reductive TCA (rTCA) cycle is the thermodynamically favored mechanism for pyruvate synthesis to support gluconeogenesis, amino acid synthesis, and de novo nucleotide synthesis. In contrast, during growth on acetate and ethanol, the thermodynamically preferred route to synthesize pyruvate utilizes the ethylmalonyl-CoA pathway for acetyl-CoA synthesis followed by reductive carboxylation to pyruvate by pyruvate ferredoxin oxidoreductase. To predict ethylene production, the computational model has been analyzed with metabolic control analysis (concentration control coefficients) and with computational protein knock-down (reduced pathway activity) and overexpression (increased pathway activity) to predict which enzymes need to be up- or down-regulated to enhance ethylene production. Initial results agree very well with parallel experimental work.
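
For readers unfamiliar with concentration control coefficients, the sketch below estimates them numerically for a toy two-step pathway S -> X -> P. The pathway, rate laws, and parameter values are hypothetical and have no connection to the R. rubrum model; the point is only to show what "d ln X / d ln e" means and how a simulated knock-down or overexpression (a small change in enzyme level) reveals which enzyme controls an intermediate's concentration.

```python
import math


def steady_state_x(e1, e2, s=1.0, k1=2.0, k2=1.0, keq=5.0):
    """Steady-state intermediate X for S -> X -> P with
    v1 = e1*k1*(S - X/keq) and v2 = e2*k2*X (solved from v1 == v2)."""
    return e1 * k1 * s / (e1 * k1 / keq + e2 * k2)


def concentration_control_coefficient(enzyme_index, e=(1.0, 1.0), rel_step=1e-4):
    """Central-difference estimate of d ln X / d ln e_i at enzyme levels e."""
    up, down = list(e), list(e)
    up[enzyme_index] *= 1 + rel_step    # slight overexpression
    down[enzyme_index] *= 1 - rel_step  # slight knock-down
    d_ln_x = math.log(steady_state_x(*up)) - math.log(steady_state_x(*down))
    d_ln_e = math.log(1 + rel_step) - math.log(1 - rel_step)
    return d_ln_x / d_ln_e


c1 = concentration_control_coefficient(0)
c2 = concentration_control_coefficient(1)
print(f"C_e1 = {c1:.3f}, C_e2 = {c2:.3f}, sum = {c1 + c2:.1e}")
```

A positive coefficient means raising that enzyme raises the intermediate, a negative one means it drains it, and the two coefficients cancel, as the summation theorem for concentration control coefficients requires.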

Metabolic Engineering: In Rhodospirillum rubrum, at least 6 ethylene pathway enzymes are under transcriptional regulation of SalR, a member of the LysR family of transcription factors. SalR, and thus ethylene synthesis activity, is sensitive to available exogenous sulfur, such that ethylene synthesis is turned off at sulfur concentrations ≥ 200 μM. Instead of replacing the promoter for each gene, researchers are taking a more efficient approach by engineering SalR expression and activity to activate ethylene synthesis. When SalR is constitutively expressed, the engineered strain continuously produces ethylene, even in the presence of ≥ 500 μM of exogenous sulfate and methionine. However, exogenous cysteine still inhibits ethylene synthesis, as it does when SalR is expressed from its native promoter, suggesting cysteine is a SalR effector molecule. Through predictive structural modeling (AlphaFold), researchers have identified residues for substitution studies.

Using 13CO2 to Track Carbon Flow Mediated by Fungal-Bacterial Interactions in Grassland Soils | Nguyen | University of Hawaiʻi–Mānoa | Yuan | Environmental Microbiome | University

As the dominant groups of soil microbes, fungi and bacteria together drive essential biogeochemical cycles belowground. However, the dynamics, mechanisms, and ecological implications of bacterial-fungal interactions (BFIs) are not well understood, especially on the community level and under abiotic stress. The broader goal is to build a quantitative and mechanistic framework to address how BFIs determine the availability and fate of C and N across the complexity of soil niches. The three interrelated objectives each address a critical factor determining the behavior of BFIs across soil C source, mineralogy, and water availability: (1) measure how grassland BFIs are shaped by the availability of different C sources, and in turn mediate soil C and N mineralization; (2) determine how BFIs may mediate C stabilization and mineralization via aggregation and mineral surface interaction across soils of different mineralogies; and (3) quantify how reduced water availability interplays with C source, C availability, and soil mineralogy in structuring BFIs and BFI-mediated soil processes.

Fungi and bacteria are the two dominant groups of soil organisms that consume, process, and translocate plant-derived soil organic matter (SOM) and thus are critical to global biogeochemical cycling. While the separate, stereotypical processing of SOM by fungi and by bacteria is relatively well documented, there is an increasing recognition that niche-bridging fungal networks and their interacting bacteria commonly co-mediate the flow and fates of plant-derived carbon (C). Less understood is how bacteria and fungi interact, and how these interactions change in response to environmental perturbations to influence the rate of C processes and fates of C in soil. The hyphosphere, the soil explored by fungal hyphae, represents a potential hotspot niche of microbial activity that is just starting to be understood.

To characterize the interactions between fungi and bacteria in driving the flow of photosynthetic C belowground, researchers developed the Dynamic Ecosystem Labeling (DEL) system to deliver 13CO2 to live plants, which assimilated the 13C and transferred it belowground via rhizodeposition. In a large-scale field manipulation study on the impacts of reduced precipitation on soil microbial interactions in a Mediterranean grassland, the team conducted four separate labeling events. The labeling events differed in duration, from 5 to 14 days, and covered different phenological timepoints of the dominant grass Avena spp.: seedling, exponential growth, and peak biomass. During labeling, the DEL system tightly controlled the chamber headspace CO2 concentration, driving the headspace average 13C atom% to 42.9 ± 13.3% and a maximum of 70.3%. A part of the 13C incorporated into the ecosystem was quickly mineralized and detected as CO2 over at least four weeks after labeling ended. Remarkably, 23% of the 13C that entered the soil was still detectable after 2 years, three quarters of which was found to be associated with soil minerals. Sufficient stable isotope was incorporated into the belowground microbial communities such that labeled C could be followed into the DNA through quantitative Stable Isotope Probing (qSIP). Using amplicon sequencing of 16S and ITS genes coupled with qSIP, the DNA of nearly 150 taxa of bacteria and fungi was found to be significantly enriched with 13C after 5 days of labeling, indicative of a potential food web that facilitates the bioprocessing and flow of rhizodeposits. Although the fungal and bacterial community composition and co-occurrence networks changed most profoundly over time, reduced precipitation significantly reduced the number of taxa that were 13C enriched. This suggests that bacteria and fungi under reduced precipitation might have less access to newly fixed plant C. 13C provides a powerful and sensitive tracer to follow and quantify the flow of C mediated by soil bacteria and fungi.
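
The retention figures above combine two fractions, which the short calculation below makes explicit, alongside the definition of atom% 13C. The two retention fractions come from the abstract; the atom counts passed to `atom_percent_13c` are hypothetical and chosen only to reproduce the reported 42.9 atom% headspace average.

```python
def atom_percent_13c(n_13c, n_12c):
    """Atom% 13C = 13C atoms / (12C + 13C atoms) * 100."""
    return 100.0 * n_13c / (n_13c + n_12c)


# Figures reported in the abstract.
retained_fraction = 0.23          # share of soil-entered 13C detectable after 2 years
mineral_share_of_retained = 0.75  # three quarters of that retained 13C

# Share of all soil-entered 13C that ends up mineral-associated after 2 years.
mineral_associated = retained_fraction * mineral_share_of_retained
other_retained = retained_fraction - mineral_associated

print(f"Mineral-associated: {mineral_associated:.2%} of the 13C that entered the soil")
print(f"Retained, not mineral-associated: {other_retained:.2%}")

# Hypothetical atom counts (429 13C per 1000 C atoms), for illustration only.
print(f"{atom_percent_13c(429, 571):.1f} atom% 13C")
```

In other words, roughly one sixth of the labeled C that entered the soil was both still present and mineral-associated two years later.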

In a greenhouse experiment, the DEL system was used to label Avena barbata growing in mesocosms that had an air gap to separate the hyphosphere of the arbuscular mycorrhizal fungus (AMF) Rhizophagus intraradices from the rhizosphere. The team quantified the amount of C being transported away from the rhizosphere by AMF hyphae crossing the air gap, and studied the microbial composition associated with AMF hyphae and their 13C enrichment. In six weeks, over 1% of the total soil C in the hyphal compartment was 13C labeled, a quarter of which was found associated with soil minerals. Amplicon sequencing indicated that AMF significantly modified the soil prokaryote community composition, but not diversity; nineteen amplicon sequence variants significantly increased in the presence of AMF, including Arthrobacter sp., Caulobacter sp., Rhizobium sp., Dongia sp., and Verrucomicrobia. Identification of the 13C enriched taxa, which could be the primary consumers of 13C imported via AMF hyphae, is underway.

Incorporating 13C-informed C pool sizes and microbial activity into the Microbial ENzyme Decomposition (MEND) model improved the prediction of the decomposition rates for different C pools. The next steps include (1) developing a BFI module for the MEND model; (2) using field-based fungal ingrowth core experiments and simplified soil interactions microcosm (SIM) experiments to derive the essential parameters that indicate the types and strengths of BFIs; and (3) assessing the soil C stability and storage mediated by the BFIs in different soil mineralogies and under reduced water availability.

Fungal-Bacterial Interactions: Bridging Soil Niches in Regulating Carbon and Nitrogen Processes | Nguyen | University of Hawaiʻi–Mānoa | Nguyen | Environmental Microbiome | University

As dominant groups of soil microbes, fungi and bacteria together drive essential biogeochemical cycles belowground. However, the dynamics, mechanisms, and ecological implications of bacterial-fungal interactions (BFIs) are poorly understood, especially on the community level and under abiotic stress. The goal is to build a quantitative and mechanistic framework to address how BFIs determine the availability and fate of C and N across the complexity of soil niches. The three interrelated objectives each address a critical factor determining the behavior of BFIs across soil C source, mineralogy, and water availability: (1) measure how grassland BFIs are shaped by the availability of different C sources, and in turn mediate soil C and N mineralization; (2) determine how BFIs may mediate C stabilization and mineralization via aggregation and mineral surface interaction across soils of different mineralogies; and (3) quantify how reduced water availability interplays with C source, C availability, and soil mineralogy in structuring BFIs and BFI-mediated soil processes.

Fungi and bacteria are the two dominant groups of soil organisms that consume, process, and translocate plant-derived organic matter and thus are critical to global nutrient cycling. Fungal hyphal networks are important gateways for C and nutrient exchanges between plants and soils, and there is an increasing recognition that such processes are co-mediated by their interactions with bacteria. Yet the understanding of these interactions has generally been correlative, and the mechanisms of these interactions in the context of nutrient cycling are far from understood. The team hypothesizes that bacterial-fungal interactions (BFIs) fundamentally determine the outcomes of soil ecosystem function by enabling C and N mineralization, competing for limited nutrients, and contributing to soil organo-mineral interactions and aggregate formation. Through this project, researchers aim to build a quantitative and mechanistic framework of how BFIs can change soil processes and the availability and fate of C and N across the complexity of soil niches in different soil types and abiotic conditions. Researchers will present the initial concepts of this project through a set of three experiments: (1) Characterize how Mediterranean and tropical grassland BFIs are shaped by the availability of different C sources, and quantify soil C and N mineralization mediated by the BFIs; (2) Understand whether and how BFIs mediate C stabilization and destabilization in soils of different mineralogies; and (3) Quantify how drought interplays with C source, C availability, and soil mineralogy in influencing BFIs and BFI-mediated soil processes.

Researchers propose three major tasks that scale in complexity, starting with a field experiment where the team will segregate soil niche compartments by using ingrowth cores that sequentially exclude incoming roots and fungi. This allows researchers to trace 13C in isotopically labeled photosynthate as it moves through these niche compartments. Next, in a field-based mesocosm experiment, researchers will deploy the same ingrowth cores into intact monoliths of different soil types and use 13C and 15N tracers to measure how soil mineralogy interplays with microbial processes that influence the incorporation of these tracers into soil aggregates and mineral surfaces. Then, researchers will fine-tune the understanding of the mechanisms that control BFIs in a laboratory soil microcosm experiment that measures the molecules involved when bacteria and fungi come into direct contact and the functional outcomes of these interactions. These three experiments will be performed under a drought treatment that can help address how microbial interactions change with environmental conditions. Finally, researchers will integrate the data from these experiments using network analysis and omics-informed, niche-identified ecosystem (MEND) modeling.

The project will address the gap resulting from the over-simplified culture-based BFI studies and the mainly correlative field surveys of fungal-bacterial co-occurrence. Leveraging the above methods and technologies, researchers will be able to identify the mechanisms important to BFIs across broad grassland ecosystems, quantify C and N dynamics, model interactions in natural soil environments, and use this powerful set of data to better predict terrestrial C and N cycles under climate change.

Broadening Aromatic Compound Degradation in Acinetobacter baylyi Using Synthetic Metabolic Pathways | Neidle | University of Georgia | Baugh | Biosystems Design | University

This project expands the ability of the soil bacterium Acinetobacter baylyi ADP1 to degrade aromatic compounds. Applications range from lignin valorization to bioremediation of recalcitrant environmental pollutants. Biotechnology applications will benefit from new methods to design and evolve efficient pathways for the metabolism of specific compounds. Pathways targeting the degradation of aromatic compounds were constructed and evolved in ADP1, an ideal model organism for synthetic metabolic pathways. To enable consumption of target compounds, researchers are combining parts of characterized foreign pathways. For compounds without known metabolic routes, including pyrogallol and syringol, the synthetic pathways incorporate modified, as well as foreign, enzymes. Pathways are improved via laboratory evolution, targeted gene amplification, and serial transfer, which enable growth-based selection. Novel metabolic functions can be demonstrated in ADP1 and exported to other microbes.

Lignin is a vastly underutilized, energy-rich renewable resource. Initial processing yields heterogeneous mixtures of aromatic compounds that are naturally degraded slowly by microbial consortia. Industrial applications could benefit from combining the metabolic abilities of these bacteria into a single laboratory strain. Carbon funneled through central metabolism could then be used to synthesize a product. Researchers use A. baylyi ADP1, a genetically malleable bacterium in which chromosomal changes can be introduced by simply mixing cells and linear DNA fragments (Bedore et al. 2023). During adaptive evolution, genetic regions can be duplicated, facilitating the accumulation of beneficial mutations (Pardo et al. 2023). Together, these methods offer a generalizable system to create catabolic modules that can be mixed, matched, and applied to diverse aromatic compounds.

As the goal is the metabolism of compounds with unknown natural degradative routes, the team designed a pathway for one such compound, pyrogallol. Extradiol cleavage enzymes that use a structurally similar substrate, catechol, have been reported to cleave pyrogallol, albeit poorly. Bacteria have been discovered to use pyrogallol as a carbon source, but the natural catabolic pathway is unclear. As a first step in the synthetic pathway, researchers inserted genes from Pseudomonas putida into the A. baylyi chromosome. These genes encode extradiol catechol dioxygenases, enzymes that might be modified for pyrogallol cleavage. Two different enzymes, XylE and TodE, replaced a native catechol dioxygenase, which, unlike each foreign enzyme, catalyzes intradiol ring cleavage. Thus, the ring-cleavage products generated by the foreign enzymes are not normally encountered by A. baylyi. A second module is needed to route the non-native ring-cleavage intermediates to central metabolism. This second module consists of a foreign catabolic pathway encoded by multiple pra genes. This foreign pra-gene pathway was previously expressed in A. baylyi, and its functionality was achieved during laboratory evolution. The pra-encoded enzymes can route the products of both catechol and pyrogallol cleavage to central metabolism.

There are several challenges to this approach. Enzymes often initially function sub-optimally in new hosts. Therefore, functionality of XylE or TodE is not guaranteed, even for cleaving the natural substrate, catechol. Additionally, in using catechol or pyrogallol as the carbon source, the catechol dioxygenase must work in conjunction with the pra-encoded enzymes. This pathway comes from a Paenibacillus species with a different natural substrate (protocatechuate) and regulatory context. Growth requires balanced metabolism of many chemical intermediates, several of which are toxic. As a final hurdle, the function of a non-native dioxygenase must be honed and altered to improve activity on pyrogallol, rather than catechol.

Strategies to engineer pyrogallol degradation exploit growth-based laboratory evolution and methods to alter gene dosage. Genes encoding a catechol dioxygenase and the Pra pathway are first integrated in different positions in the A. baylyi chromosome. Independent gene amplification of either module can facilitate selection of functional strains (Tumen-Velasquez et al. 2018). Fluorescent biosensors can detect pathway intermediates to assist strain selection (Jha et al. 2015). After strains with novel capabilities are isolated, the contribution of mutations to the resulting phenotype, identified via whole genome sequencing, can be determined using transformation methods (Bedore et al. 2023).

In this project researchers evolve and select for A. baylyi strains with foreign extradiol catechol dioxygenases capable of cleaving aromatic growth substrates. In these strains, growth using catechol, anthranilate, and benzoate, each as a sole carbon source, required interdependent functions of a foreign catechol dioxygenase and pra-encoded enzymes. The current focus is altering substrate specificity of the dioxygenase to allow use of pyrogallol as the carbon source. Through strategic design of synthetic pathways, researchers are developing A. baylyi ADP1 for biotechnology applications. The unique genetic system of this strain allows researchers to rapidly engineer strains, interchanging different metabolic and regulatory modules.

Quantify the Impact of Your Data with DOE JGI’s Genome Citation Service | Fagnan | JGI | Fagnan | Crosscutting | JGI

Development of a new resource that deepens the community’s understanding of the impact of data reuse.

Steady increases in sequencing capacity, combined with rapid accumulation of publications and associated resources, have increased the complexity of maintaining associations between literature and genomic data. Accumulated errors and omissions in the literature and databases compound the difficulty of the task. Automated approaches to maintaining and confirming associations among these resources have become necessary.

Here researchers present the U.S. Department of Energy (DOE) Joint Genome Institute’s (JGI) Genome Citation Service (GCS), which discovers literature that incorporates genome data whether or not the source of the data was properly attributed by the authors. This service provides a number of advantages over manual curation including consistent coverage of public resources, automatic updating of genome project metadata, and augmentation of genome project metadata through documentation of previously unrecognized uses by the scientific community. The service significantly reduces labor costs associated with manual literature review while improving the quality, accuracy and consistency of genome metadata maintained by the DOE JGI.

The DOE JGI seeks to deepen its understanding of the impact of its user community’s science by connecting its data products to publications. The GCS facilitates this understanding, improves credit attribution for data generators, and can encourage data sharing by allowing scientists to see how reuse amplifies the impact of their original studies. Doing so supports JGI’s commitment to FAIR data practices and allows JGI to meet its obligations as a DOE Office of Science Public Reusable Research (SC PuRe) data resource.

The GCS increases the number of known publications that incorporate JGI data products and, as a publicly available resource, the GCS enables researchers to better understand their impact. Researchers seek feedback from the Genomic Science program (GSP) community on the usability of this resource.

Engineering Novel Microbes for Upcycling Waste Plastic | Moon | Washington University–St. Louis | Diao | Biosystems Design | University

The goal is to develop a consolidated biological process to upcycle waste polyethylene terephthalate.

Polyethylene terephthalate (PET) represents 12% of global solid waste. Chemical recycling of PET has been one option for addressing this global problem, but it suffers from relatively high process costs and the extremely low price of virgin PET (~$1/kg). One solution is to upcycle waste PET rather than recycle it into the same PET, typically of lower quality. PET upcycling can be achieved by depolymerizing PET into terephthalic acid (TPA) and ethylene glycol (EG) and biologically converting these monomers into value-added products. However, only a handful of reports demonstrate microbes capable of growing on both TPA and EG generated from PET as sole carbon sources. To overcome this limitation, researchers screened strains and discovered a Rhodococcus strain, RPET, that grows well on the alkaline hydrolysis products of PET as the sole carbon source without any purification step. Notably, this strain can grow on a mixture of TPA and EG at extremely high concentrations (up to 0.6 M) and at the high osmolarity resulting from alkaline hydrolysis and pH neutralization. The resultant media supported RPET’s growth without any purification or sterilization step other than dilution. In addition, many synthetic biology tools developed for a related species, Rhodococcus opacus (DeLorenzo et al. 2018; DeLorenzo et al. 2021; Diao, Carr, and Moon 2022), were functional in RPET, facilitating its metabolic engineering. In this presentation, researchers discuss the effort to develop this novel chassis for waste PET valorization, with conversion of PET into carotenoids (up to $7,500/kg) as a proof-of-concept demonstration (Moon et al. 2022). Specifically, researchers discuss lycopene production of up to 1.3 mg/L from PET using this technology (Diao et al. 2023). Along with other efforts (Moon 2022), this technology can help address global plastic pollution and sustainable chemical production.

HypoRiPPAtlas: An Atlas of Hypothetical Natural Products for Mass Spectrometry Database Search | Mohimani | Carnegie Mellon University | Guler | Computational Biology | University

Recent analysis of hundreds of thousands of public microbial genomes has resulted in the discovery of over a million biosynthetic gene clusters (BGCs; Hadjithomas et al. 2016; Blin et al. 2016; Kautsar et al. 2021). Gene-to-molecule approaches are therefore urgently needed for microbial and plant natural product (NP) discovery in light of rapidly growing microbial and plant genetic resources. Currently, the NPs for the majority of BGCs remain unknown. The Global Natural Products Social (GNPS) molecular networking infrastructure harbors billions of mass spectra of NPs with unknown structures and biosynthetic genes. To bridge the gap between large-scale genome mining and mass spectral datasets for NP discovery, researchers developed HypoRiPPAtlas, an atlas of hypothetical NP structures that can be readily used for in silico database search of tandem mass spectra.

HypoRiPPAtlas is constructed by mining the genomes of 22,671 microbial strains from the RefSeq database using seq2ripp, a novel machine learning tool for prediction of ribosomally synthesized and post-translationally modified peptides (RiPPs). Seq2ripp outperforms existing RiPP mining tools in identifying known MIBiG RiPPs from genomic inputs. Searching the hypothetical molecules from the Atlas against 46 mass spectral datasets from GNPS resulted in the discovery of numerous RiPPs, including two novel lasso peptides and one lanthipeptide from Streptomyces sp. NRRL B-2660, WC-3904 and WC-3560. Moreover, seq2ripp discovered ten plant RiPPs, including elaeagnin, a member of a new BURP-domain-derived RiPP class with a novel post-translational modification (PTM) from silverberry (Elaeagnus pungens). By addressing the fundamental challenge of predicting structures from NP biosynthetic genes, the HypoRiPPAtlas approach therefore has the potential to close the gap between biosynthetic genes and their natural products in genomic NP discovery. The approach could be extended to other NP classes in the future by implementing the corresponding biosynthetic logic.
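As a rough illustration of the first step in searching hypothetical molecules against tandem mass spectra, the sketch below filters candidate molecule-spectrum pairs by precursor mass. It is a minimal sketch under stated assumptions, not the HypoRiPPAtlas scoring method: the function names, the 10 ppm tolerance, the charge states, and all masses are hypothetical and chosen only for illustration. A real search would go on to score fragment ions for each surviving pair.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def precursor_mz(neutral_mass, charge):
    """Return the m/z of the [M + zH]z+ ion for a neutral monoisotopic mass."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def ppm_error(observed_mz, theoretical_mz):
    """Return the signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def candidate_matches(hypothetical, spectra, tol_ppm=10.0, charges=(1, 2, 3)):
    """List (molecule_id, spectrum_id, charge) triples whose precursor m/z
    values agree within tol_ppm. Tolerance and charges are assumptions."""
    matches = []
    for mol_id, mass in hypothetical.items():
        for spec_id, observed in spectra.items():
            for z in charges:
                if abs(ppm_error(observed, precursor_mz(mass, z))) <= tol_ppm:
                    matches.append((mol_id, spec_id, z))
    return matches


# Made-up inputs for illustration only: neutral masses of two hypothetical
# RiPPs and the precursor m/z values of two hypothetical spectra.
hypothetical = {"ripp_A": 2000.0, "ripp_B": 1500.5}
spectra = {"scan_1": 1001.007276, "scan_2": 751.3}

print(candidate_matches(hypothetical, spectra))
```

Filtering on precursor mass first keeps the number of expensive fragment-level comparisons small, which matters when an atlas of hypothetical structures is searched against many spectra.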

HypoRiPPAtlas and the seq2ripp pipeline are both publicly available at hyporippatlas.npanalysis.org. Users can examine hypothetical RiPPs mined from publicly available genomes and upload their own paired genomic and mass spectral datasets to launch custom seq2ripp runs.

Scalable Computational Tools for Inference of Protein Annotation and Metabolic Models in Microbial Communities | Miller | University of Colorado–Denver | Davoudi | Computational Biology | University

High-throughput omics technologies have made the assembly of microbial genomes recovered from the environment routine. Computational inference of the protein products encoded by these genomes, and the associated biochemical functions, should allow for the accurate prediction and modeling of microbial metabolism, organismal interactions, and ecosystem processes. However, a lack of scalable, probabilistic protein annotation tools limits the full potential of metabolic modeling. The approach to inference of improved models relies on developing new computational tools in three main areas: 1) improved protein annotations, 2) iterative cycles of gap-filling metabolic models with improved protein annotations and informing probabilistic protein annotations based on metabolic models, and 3) integrating improved protein annotations with community-level flux balance metabolic models. Researchers aim to make these tools broadly accessible via the DOE Systems Biology Knowledgebase (KBase; Arkin et al. 2018).

In the past year, researchers continued to improve the genome annotation and modeling capabilities emerging from this project. They achieved this by advancing traditional homology-based approaches and by developing new methods that leverage genome-scale metabolic models and techniques from the field of natural language processing.

The DRAM (Shaffer et al. 2020) app in KBase was significantly improved, enhancing reliability, usability, and overall quality of output annotations, all within the KBase framework that allows for interoperability of these annotations with other annotation and modeling apps. This work culminated in a new publication highlighting the utility of DRAM within KBase (Shaffer et al. submitted). The team also extended DRAM to predict microbial traits (e.g., nitrate reducer, aerobe, fermenter) from protein annotations. These traits were developed and validated (below) via extensive expert curation.

Beyond DRAM, researchers also deployed a new tool called GLM4EC, for which they trained and fine-tuned a modified Generalized Language Model (GLM; also known as a Large Language Model) based on ProteinBERT (Brandes et al. 2022) for the task of annotating microbial proteins with Enzyme Commission (E.C.) numbers. This model was trained to learn sequence embeddings and annotation classification from a subset of UniRefKB annotated with E.C. numbers. On held-out test sets, the model predicts E.C. numbers with high precision and recall, regardless of input sequence length. Researchers are now testing new models that have additional global features beyond E.C. numbers available for pretraining and exploring the utility of alternative model architectures. Integration of GLM4EC into KBase, along with the improved version of DRAM and other existing annotation pipelines, provides multiple hypotheses for the function of gene products within genomes and metagenome-assembled genomes (MAGs) in KBase, all of which can be integrated or explored in a common, interoperable framework with other KBase tools.

To aid in determining which of the potentially alternative functions a gene product actually performs, researchers developed machine learning classifiers to predict growth phenotypes based on multiple functional annotations. These classifiers are now loaded into KBase, along with apps enabling them to be applied to predict phenotypes for any KBase genome or MAG. Further, in collaboration with the Hofmockel Science Focus Area (SFA), researchers have integrated apps in KBase that reconcile metabolic models with predicted phenotypes, using the alternative functions proposed by the new annotation apps to associate gene candidates with gapfilled reactions. This reconciliation improves model accuracy from 56% to 72% on average. It also yields numerous corrected annotations across all genomes, particularly in MAGs, where many genes and associated functions are often missing due to incomplete assemblies. Combined with the recent enhancements to the ModelSEED pipeline to improve energy biosynthesis prediction in methanogens (a key target for this project), researchers now have models with greatly improved accuracy to predict both core and periphery metabolism.
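To make the accuracy comparison concrete, the sketch below scores a model by the fraction of growth conditions for which its predicted growth/no-growth call agrees with the reference phenotype. This is a minimal sketch that assumes accuracy is defined this way; the abstract does not state the exact metric, and the function name, condition names, and values are hypothetical.

```python
def phenotype_accuracy(predicted, observed):
    """Fraction of shared growth conditions where the model's growth call
    (True = growth, False = no growth) matches the reference phenotype."""
    shared = predicted.keys() & observed.keys()
    if not shared:
        raise ValueError("no shared conditions to compare")
    correct = sum(predicted[c] == observed[c] for c in shared)
    return correct / len(shared)


# Made-up reference phenotypes and model predictions for illustration only.
observed = {"glucose": True, "acetate": True, "xylose": False, "lactate": True}
draft_model = {"glucose": True, "acetate": False, "xylose": True, "lactate": True}
reconciled_model = {"glucose": True, "acetate": True, "xylose": True, "lactate": True}

print(phenotype_accuracy(draft_model, observed))
print(phenotype_accuracy(reconciled_model, observed))
```

In this toy case, reconciliation corrects one false-negative growth call (acetate), which is the kind of change that gapfilling with alternative annotations is meant to produce.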

With all these components in place, researchers are applying this improved pipeline to build models for over 2,000 MAGs loaded into KBase from the Genome Resolved Open Watersheds (GROW) project. The MAG models from this analysis now have many more reactions and annotations, particularly for poorly annotated clades. Researchers are now assembling these MAG models into compartmentalized community metabolic models for each of the 178 GROW samples. Researchers are loading hand-curated microbial traits (in part aided by DRAM inference) and corresponding trophic interaction networks into KBase as phenotypes, enabling the gapfilling of community models to replicate these hypothesized expert-curated trophic webs. The degree of gapfilling required to replicate trophic webs provides valuable feedback for identification of potential errors in these webs. The resulting models can also be tested against a growing collection of metatranscriptome data gathered from these samples, measuring agreement between reaction flux and associated gene expression.
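One simple way to measure agreement between reaction flux and associated gene expression is a rank correlation across reactions. The sketch below computes Spearman's rank correlation in plain Python. It is an illustrative assumption rather than the project's stated method: the choice of Spearman correlation, the use of absolute flux, and all values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Made-up per-reaction values for illustration only. Absolute flux is used
# (an assumption) so that reversible reactions running backward still count.
fluxes = [0.1, -2.0, 5.0, 9.0]
expression = [3.0, 10.0, 40.0, 90.0]

print(spearman([abs(v) for v in fluxes], expression))
```

A rank-based measure is a reasonable default here because flux and transcript abundance are on different scales and are not expected to be linearly related.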

Systems Metabolic Engineering of Novosphingobium aromaticivorans for Lignin Valorization | Michener | Oak Ridge National Laboratory | Michener | Biosystems Design | Early Career

The goal is to engineer a non-model bacterium, Novosphingobium aromaticivorans, for valorization of depolymerized lignin to value-added bioproducts. The project involves (1) discovery and optimization of pathways for assimilation of lignin-derived aromatic compounds, (2) engineering conversion pathways that match the stoichiometry of aromatic catabolism, and (3) development of genome-scale mapping techniques to identify new engineering targets in non-model bacteria.

Lignin is one of the most abundant renewable materials found in nature. This heterogeneous aromatic polymer is composed of a variety of p-hydroxyphenyl (H), guaiacyl (G), and syringyl (S) monomers that are connected by diverse chemical linkages. Lignin valorization would improve biofuel economics, potentially through bacterial conversion of thermochemically depolymerized lignin into valuable bioproducts. Novosphingobium aromaticivorans F199 is an Alphaproteobacterium capable of degrading G, S, and H monomers and, due to its genetic tractability and broad catabolic capabilities, is an emerging model organism for conversion of lignin-derived aromatic compounds. However, F199 cannot natively catabolize every component of depolymerized lignin, which limits conversion yields (Azubuike et al. 2022).

Researchers are identifying new aromatic degradation pathways to increase the catabolic potential of F199 using a combination of barcoded transposon insertion sequencing, proteomics, experimental evolution, and in vitro biochemistry. The team demonstrated this approach with the aromatic monomer syringate (Cecil et al. 2018), the β-1 linked dimer 1,2-diguaiacylpropane-1,3-diol (DGPD; Presley et al. 2021), and, more recently, the monomer guaiacol (Bleem et al. 2022). However, there are multiple lignin-derived aromatic compounds that F199 catabolizes poorly or not at all. Researchers have evolved F199 to rapidly and completely catabolize the common β-O-4 dimer guaiacylglycerol-β-guaiacyl ether (GGE), and in the process identified an uncharacterized native catabolic pathway. Researchers have also isolated a Novosphingobium strain that can assimilate the β-β linked dimer pinoresinol and identified the relevant catabolic pathway. Current efforts include detailed characterization of the key pinoresinol pathway enzymes and transfer of the pathway into F199.

In addition to optimizing lignin assimilation, researchers are converting the resulting intermediates into value-added products, such as building blocks for bio-derived polymers, and have demonstrated production from a model lignin-derived aromatic compound, ferulate. Finally, to better understand the effect of host genetic variation on pathway function, researchers are adapting a novel technique, bacterial quantitative trait locus (QTL) mapping, to F199. Researchers have demonstrated intraspecific recombination between strains of N. aromaticivorans and are currently studying and optimizing this process. By combining novel pathway discovery, heterologous expression, and optimization, researchers are engineering N. aromaticivorans F199 to efficiently valorize lignin-derived compounds.

Advanced Photon Source Capabilities for Environmental and Biological Science | Michalska | Argonne National Laboratory | Michalska | Structural Biology
  1. Development of an integrated research platform and comprehensive user support at the Advanced Photon Source for BER-relevant research
  2. Building a diverse user base through outreach and training activities
  3. Integration with BER user facilities for streamlined access and enabling multimodal experiments

Light sources provide a wide range of x-ray-based tools for research pertinent to the Office of Science Biological and Environmental Research (BER) mission. Over the past 25 years, the Advanced Photon Source (APS) at Argonne National Laboratory has been at the forefront of research in biological, geological, geochemical, and environmental sciences. The ongoing generational upgrade of the APS facility will offer transformative opportunities for the BER community to address scientific challenges. After completion in 2024, the so-called APS-U will become the nation’s brightest high-energy, storage-ring-based x-ray source, delivering x-rays that will be 500 times brighter than they are today. The APS-U will allow researchers to study samples at higher resolutions and unprecedented spatial and temporal scales. The combination of macromolecular crystallography, x-ray fluorescence microscopy, tomography, absorption spectroscopy, and small/wide-angle x-ray scattering will enable visualization of biological and environmental samples at scales ranging from Angstroms to centimeters and timescales from picoseconds to seconds. With the x-ray source’s high brightness, investigation of the dynamics of biological processes will be achievable. In addition to the extraordinary spatial resolution across a large field of view, high-throughput and multimodal data collection will provide unprecedented statistical analysis of complex biological and environmental systems, allowing researchers to address their enormous heterogeneity. To maximize APS-U impact on BER science, the eBERlight program is being developed to specifically support the user community pursuing research within the BER mission. eBERlight is expected to allocate beamtime, facilitate and coordinate access, and support its users throughout their entire interaction with the APS, helping with project and proposal development, design of the experimental workflow, sample preparation, and data collection and analysis. To ensure an optimal infrastructure for a one-stop portal, the program will also leverage additional Argonne resources for sample preparation and data analysis.

Root-Mediated Impacts of Plant Volatile Organic Compound Emissions on Soil Carbon | Meredith | University of Arizona | Honeker | Environmental Microbiome | University

The overarching project goal is to verify and quantify volatile organic compounds (VOCs) as direct and indirect contributors to soil C stabilization within the rhizosphere and beyond through teleconnections in the soil matrix, and to determine their underpinning ecological and metabolic mechanisms. The long-term motivation for the project is to transform the current conceptual understanding and predictive capacity of microbial systems and soil C stabilization to include the important roles of volatile compounds. This presentation falls under the objective to determine the direct pathways and contributions of root-released VOCs and VOC transformations by soil microbiomes to soil C cycling and stabilization. Specifically, in this task, researchers will aim to identify VOC-consuming microbes and traits and identify pathway(s) for VOC-C stabilization in soil pools using soil incubations and time-resolved 13C-VOC stable isotope labeling.

Plants are recognized as the dominant source of VOCs to the atmosphere, where they play critical roles in air quality and climate, yet the parallel impact of plant-derived VOCs on the pedosphere (soil) remains poorly quantified. VOCs released by decomposing litter can contribute to soil carbon (C) pools including those associated with soil C stabilization, and researchers hypothesize that root VOCs can also contribute to these soil C pools. Furthermore, researchers anticipate that this pathway for soil C stabilization will depend on plant physiological traits, rhizosphere microbes and their activity, and soil environmental factors.

While the composition and flux of root-soil emissions of VOCs in the rhizosphere remain poorly quantified due to the lack of developed methods, here the team characterized the belowground release of plant VOCs to the soil system in two distinct ecosystems. Beneath the bioenergy-relevant crop sorghum in a semi-arid agroecosystem at the Maricopa Agriculture Center in AZ, the team quantified soil VOC concentrations (proton transfer reaction time-of-flight mass spectrometer interfaced with novel in situ diffusion gas probes; Roscioli et al. 2021; Gil-Loaiza et al. 2022) and root metabolites (nuclear magnetic resonance; NMR) to identify root VOCs and estimate their potential in situ uptake rates in soil. In addition, the team leveraged an experimental soil warming treatment in the temperate coniferous Blodgett Experimental Forest in the Sierra Foothills in CA to explore associations between regions of root growth, increased VOC concentrations (NMR), and enhanced soil organic carbon (SOC) stocks. In bulk surface soils, researchers observed increases in both SOC stock and VOC concentrations, including ethanol and methanol. Together, these field observations begin to define the contributions of root VOCs to soil C stabilization. They also serve as a reference for upcoming experimental tests of the impact of plant traits and environmental conditions on this under-resolved C transformation pathway.

Direct Routes for Microbial Carbon Stabilization of Volatile Organic Compounds in Soil | Meredith | University of Arizona | Gil-Loaiza | Environmental Microbiome | University

The overarching project goal is to verify and quantify volatile organic compounds (VOCs) as direct and indirect contributors to soil carbon (C) stabilization within the rhizosphere and beyond through teleconnections, and to determine their underpinning ecological and metabolic mechanisms. The long-term motivation for this project is to transform the current conceptual understanding and predictive capacity of microbial systems and soil C stabilization to include the important roles of volatile compounds. This presentation falls under the objective to determine the direct pathways and contributions of root-released VOCs and VOC transformations by soil microbiomes to soil C cycling and stabilization. Specifically, in this task, researchers will aim to identify VOC-consuming microbes and traits and identify pathway(s) for VOC-C stabilization in soil pools using soil incubations and time-resolved 13C-VOC stable isotope labeling.

VOCs are ubiquitous carbon pools in the Earth system, but often remain uncharacterized as vectors of soil organic C transformations. Roots, litter, aboveground vegetation, and microbial metabolism are all sources of VOCs; however, little is known about how these omnipresent metabolites (Honeker et al. 2021; Meredith and Tfaily 2022) can contribute to C cycling in soils. In this project, researchers aim to verify and quantify the direct contributions of VOCs to soil C pools and determine their underpinning ecological and metabolic mechanisms. Here, researchers present the efforts to identify and quantify VOC uptake rates in soil, identify VOC-consuming microbes and traits, and determine whether the VOC uptake contributes to soil C pools. In a first experiment, the team exposed soil columns to controlled amounts of three VOCs (isoprene, methyl vinyl ketone, ethanol) below the soil surface to mimic root emissions. By monitoring subsurface VOC concentrations using novel in situ diffusive soil probes (Roscioli et al. 2021; Gil-Loaiza et al. 2022) connected to a proton transfer reaction time of flight mass spectrometer, researchers identified different uptake capacities for the three VOCs. Furthermore, the team discovered that microbial communities appeared to dramatically enhance their isoprene uptake rates in response to isoprene exposure, revealing that microbial VOC metabolism can acclimate to resource availability. By tracking shifts in community composition and the isoprene monooxygenase gene copy number in these experiments, the aim is to identify the responsible microbes and the metabolic pathways involved. 
In a second experiment, by characterizing shifts in soil C concentration and composition (Fourier-transform ion cyclotron resonance mass spectrometry) in response to additions of different VOCs (methanol, acetaldehyde, acetone, isoprene, and the monoterpene α-pinene), researchers will test the hypothesis that VOCs can have a direct impact on soil carbon composition. Researchers will use these results to refine plans for subsequent 13C isotope-labeling experiments to track VOC-C into soil C pools and microbial groups. These experiments will provide new understanding of the direct impacts of VOCs on soil C and of their importance to conceptual models of soil C.

Deploying Top-Down and Bottom-Up Strategies for Genetic Engineering of Auxenochlorella protothecoides for the Production of Sustainable Biofuels | Merchant | University of California–Berkeley | Roth | Biosystems Design | University

Auxenochlorella protothecoides, a Trebouxiophyte oleaginous alga, is a reference organism for discovery and a platform for synthetic biology driven by photosynthesis. Researchers will expand transformation markers, regulatory sequences and reporter genes, improve transformation efficiency, and develop RNP-mediated gene-editing methods for genome modification. Systems analyses and metabolic modeling approaches will inform genome modifications for rational improvement of photosynthetic carbon fixation and strain engineering to produce cyclopropane fatty acids. The team will identify regulatory factors and signaling pathways responsible for activating fatty acid and triacylglycerol biosynthesis and will manipulate them to increase lipid productivity. Non-photochemical quenching and a regulatory circuit for maintaining photosynthesis under Cu limitation, both of which are absent in A. protothecoides, will be introduced to improve photosynthetic resilience, and the performance of engineered strains will be modeled. Here, researchers focus on two genetic engineering strategies, improving core photosynthesis and upregulating de novo fatty acid biosynthesis.

Microalgae can play an integral role in a sustainable bioeconomy by helping to meet the rising demand for energy and products. Microalgae use solar energy to capture and convert CO2 into biomass and achieve rapid growth without competing with food crops for land and water. However, there are considerable practical limitations in the photosynthetic production of biofuels from microalgae, resulting in low productivity and high costs. A. protothecoides has a highly flexible metabolism with rapid growth reaching high density under photoautotrophic and heterotrophic conditions; under heterotrophic conditions, cells accumulate large amounts of triacylglycerol (TAG). However, as with all photosynthetic organisms, one of the challenges that ultimately limits the theoretical yield is the low efficiency of photosynthetic energy conversion. Photosynthesis works efficiently when light is limiting. Under moderate to high light intensities, carbon fixation becomes limiting, and photoprotection mechanisms known as non-photochemical quenching (NPQ) are induced to minimize the generation of reactive oxygen species and photoinhibition. To enhance photosynthetic efficiency, the team will increase sedoheptulose-bisphosphatase (SBPase) activity to relieve the rate-limiting step of the Calvin-Benson (CB) cycle, thereby enhancing carbon fixation and decreasing the dissipation of light energy through the induction of NPQ.

cis-regulatory elements, regions of non-coding DNA, play critical roles in transcriptional regulation in algae and plants and can be used in genetic engineering approaches to increase the expression of fatty acid biosynthesis (FAS) genes. Here, researchers established a pipeline to extract transcription factors based on Interproscan IDs from the available A. protothecoides 0710 genome. Using available transcriptomic data, the team identified novel putative transcription factors involved in regulation of de novo FAS and glycolysis, which produces the precursor for FAS, pyruvate, and then tested candidate transcription factors using time-resolved qPCR analysis to confirm their upregulation during TAG accumulation. Researchers are also using systems biology tools such as cis-regulatory element discovery using MEME suite (Bailey et al. 2015) to identify key regulatory motifs potentially involved in regulation of FAS. Simultaneously, researchers will generate and integrate transcriptomic and proteomic data to create gene regulatory networks and identify hub transcription factors and use DAP-Seq (Bartlett et al. 2017) to find and validate binding targets in promoters, untranslated regions (UTR) and introns. Altogether, this work will inform the engineering of strains to improve total lipid accumulation in A. protothecoides.

Strains with engineered SBPase will be characterized for their growth, biomass, and photosynthetic capacities to test whether they have enhanced carbon fixation. Metabolomics analysis of polar metabolites will inform researchers on how the flux through the CB cycle has changed and offer new strategies for strain improvement. Similarly, strains engineered with the transcription factors will be analyzed for lipid accumulation. The metabolome data will be fed into metabolic flux modeling to iteratively improve the engineering strategies through the design-build-test-learn (DBTL) cycle. While characterizing each of these strategies will be important for understanding the regulation of each biochemical pathway, combining both in a single strain will likely be necessary to optimize lipid production.

Tools and Targets: Engineering Modified Fatty Acids and Improved Photosynthetic Resilience in Auxenochlorella protothecoides | Merchant | University of California–Berkeley | Moseley | Biosystems Design | University

Auxenochlorella protothecoides, a Trebouxiophyte oleaginous alga, is a reference for discovery and a platform for photosynthesis-driven synthetic biology and sustainable bio-production. Researchers will expand transformation markers, regulatory sequences and reporter genes, improve transformation efficiency, and develop RNP-mediated gene-editing methods for genome modification. Systems analyses and metabolic modeling approaches will inform genome modifications for rational improvement of photosynthetic carbon fixation and strain engineering to produce cyclopropane fatty acids. Regulatory factors and signaling pathways responsible for activating fatty acid and triacylglycerol biosynthesis will be identified, and researchers will manipulate them to increase lipid productivity. Non-photochemical quenching and a regulatory circuit for maintaining photosynthesis under Cu-limitation, both of which are absent in A. protothecoides, will be introduced to improve photosynthetic resilience, and the performance of engineered strains will be modeled.

Researchers have established transformation and targeted gene-replacement by homologous recombination in A. protothecoides and now seek to engineer strains with improved photosynthesis that produce modified fatty acids. As an essential preliminary step, researchers have resolved the organellar genomes and generated a gapless, ~22 Mbp (haploid size), phased diploid nuclear genome of strain UTEX 250. Iso-seq and RNA-seq analyses facilitated the annotation of >7500 gene models on 12 chromosomes. Elemental analysis by ICP-MS was used to optimize the defined growth medium, enabling luxury consumption of essential nutrients. Subsequent systems analyses will be carried out to identify the full complement of metalloproteins encoded in the genome and to understand how metal nutrients are allocated. Two biosafe transformation markers, Arabidopsis THIC, restoring thiamine prototrophy, and Saccharomyces SUC2, enabling sucrose assimilation, are in routine use, along with the nptII gene, conferring resistance to the aminoglycoside antibiotic, G418. Additional herbicide resistance transformation markers are under development, along with counter-selectable recyclable marker genes that will allow for scarless and marker-free integrations. Strong promoters from genes encoding an ammonium transporter, stearoyl-ACP desaturase, methionine synthase, RuBisCO small subunit, photosystem I and light-harvesting complex components have been demonstrated to activate expression of codon-optimized transgenes. Chlamydomonas BKT1, encoding beta-carotene ketolase, Gaussia princeps luciferase, Venus and mCherry have been used as quantifiable reporter genes for evaluating promoter strength, and plastid targeting of the fluorescent proteins was demonstrated. Researchers have also investigated the optimal arrangement of ORFs in polycistronic constructs; the results are consistent with more balanced translation when the shorter of the two ORFs is upstream. 
This is compatible with the finding that ORFs in endogenous polycistronic genes across the green lineage usually have the shorter ORF upstream.

One stated goal is to modify Auxenochlorella fatty acid and lipid biosynthesis to produce medium (mid)-chain length cyclopropane fatty acids (CPFAs) suitable as precursors for jet fuel. This will require three things: chain-length control to increase mid-chain fatty acids; control of saturation level, since cyclopropane fatty acid synthases (CPS) compete with endogenous microsomal desaturases for mono-unsaturated substrates; and increased exchange between phospholipids (the site of CPFA synthesis) and Kennedy pathway intermediates in TAG biosynthesis, so that CPFAs are incorporated into storage lipids. Initial experiments established that accumulation of C12:0 and C14:0 fatty acids increased in strains expressing the Cuphea wrightii FATB2 thioesterase gene, driven by a SAD2 promoter that is activated during N-starvation and lipid production. C16:0 levels were increased by expressing FATB3 from Brassica juncea. Researchers will test whether mid-chain levels can be enhanced further by co-expressing CwFATB2 with a beta-ketoacyl-ACP synthase gene (CwKASA1) from the same species. To generate CPFAs, the team intends to first make Auxenochlorella strains with reduced Δ12-desaturase activity, thus removing the most significant competing activity for CPS. These strains will simultaneously overexpress phosphatidylcholine:diacylglycerol cholinephosphotransferase (PDCT), lysophosphatidylcholine acyltransferase (LPCAT), and lysophosphatidic acid acyltransferase (LPAAT) from Sterculia foetida and Litchi chinensis, both plant species that accumulate high amounts of CPFAs in their seed oils. The team will then screen CPS from E. coli and CPFA-accumulating plants or marine bacteria to identify the most active enzymes in Auxenochlorella.

Researchers will also introduce a regulatory circuit into Auxenochlorella to maintain photosynthetic resilience in response to Cu-deficiency. The major sink for Cu in plants is the thylakoid lumen protein plastocyanin, which transfers electrons from the cyt b6/f complex to oxidized photosystem I. Consequently, Cu limitation, which is common in many environments, severely reduces plant photosynthesis and growth. Some algae and cyanobacteria acclimate to Cu deficiency by substituting a heme-containing cytochrome that performs the same electron carrier function as plastocyanin, thereby maintaining high rates of photosynthetic electron transfer and reducing their cellular Cu quota. In Chlamydomonas the CYC6 gene, encoding Cyt c6, is regulated by a copper-sensing transcription factor CRR1. A potential CRR1 homolog is identified in the Auxenochlorella genome, and preliminary RNA-seq analysis indicates that a putative copper transporter gene, CTR1, is activated during Cu starvation. The team will use the CTR1 promoter to activate expression of a synthetic, codon-optimized CYC6 gene in Cu-deficient Auxenochlorella cells, and test for improved photosynthetic performance and growth. Efficient import of heterologous Cyt c6 into the Auxenochlorella thylakoid lumen may require replacement of the native bipartite transit peptide with the transit peptide of an endogenous lumen targeted protein, such as Auxenochlorella plastocyanin. This work will establish A. protothecoides as a powerful photosynthetically driven cell chassis for sustainable bioproduction of fuels and specialty products.

A Gapless and Phased Diploid Genome Assembly for Auxenochlorella protothecoides Facilitates Metabolic Modeling and Proteomics Analyses | Merchant | University of California–Berkeley | Boyle | Biosystems Design | University

Auxenochlorella protothecoides, a Trebouxiophyte oleaginous alga, is a reference for discovery and a platform for photosynthesis-driven synthetic biology and sustainable bio-production. Researchers will expand transformation markers, regulatory sequences and reporter genes, improve transformation efficiency, and develop RNP-mediated gene-editing methods for genome modification. Systems analyses and metabolic modeling approaches will inform genome modifications for rational improvement of photosynthetic carbon fixation and strain engineering to produce cyclopropane fatty acids. Regulatory factors and signaling pathways responsible for activating fatty acid and triacylglycerol biosynthesis will be identified, and researchers will manipulate them to increase lipid productivity. Non-photochemical quenching and a regulatory circuit for maintaining photosynthesis under Cu-limitation, both of which are absent in A. protothecoides, will be introduced to improve photosynthetic resilience, and the performance of engineered strains will be modeled.

High quality reference genomes and structural annotations are the foundation of many systems and synthetic biology approaches. Researchers have produced a gapless and phased genome assembly for the diploid A. protothecoides strain UTEX 250 using high-coverage Pacific Biosciences (PacBio) HiFi sequencing and Omni-C linked read sequencing. The haploid length of the UTEX 250 nuclear genome is 22 Mb, which is arranged on 12 chromosomes ranging from 0.5 to 4.1 Mb. The genome is GC-rich (64%) and generally highly heterozygous; the two haplotypes differ at ~3% of sites, enabling allele-specific transformation and allele-specific gene expression to be quantified. However, approximately a third of the genome is homozygous, including three entire chromosomes, suggesting widespread loss-of-heterozygosity events as observed in other vegetative diploids (e.g. yeasts). Complete circular plastome (84.6 kb) and mitogenome (54.0 kb) assemblies have also been produced.

To produce highly accurate structural annotations, researchers have sequenced PacBio Iso-Seq libraries from mixotrophic and heterotrophic conditions, and ~60 million paired-end and stranded RNA-seq reads from several other growth conditions. Using these data, researchers have annotated ~7,600 gene models per haploid genome, approximately 70% of which are supported by full-length Iso-Seq reads. More than 200 complex gene models were corrected by manual annotation. The quality of the annotations is supported by a Benchmarking Universal Single-Copy Orthologs (BUSCO) score of ~99% completeness, with all missing BUSCOs manually confirmed to be biologically absent from the genome. The A. protothecoides genome is remarkably compact with respect to gene content and features <5% repeats, although a handful of potentially active DNA transposons have been identified.

The A. protothecoides haploid gene number is less than half that of the reference green alga Chlamydomonas reinhardtii. Researchers are presently performing comparative analyses among several high-quality algal genomes to functionally characterize gene presence and absence, with particular focus on the GreenCut of proteins that are typically found in photosynthetic plants and green algae. Researchers are using the structural annotations to improve synthetic biology approaches, including the identification of potential condition-specific promoters and the refinement of Kozak sequence and codon bias optimization. A. protothecoides is also unusual among green algae in that several geographically and environmentally diverse isolates are available in culture. Researchers are targeting an ~20 strain pan-genome, with a view to identifying standing genetic variation that may facilitate strain improvement, e.g., growth in brackish water.

The new genome sequence will be used to update and improve the first draft metabolic network of A. protothecoides that the team published previously. To model growth in a variety of conditions, a complete biomass analysis will be performed for growth in heterotrophic, autotrophic and mixotrophic conditions. The experimentally obtained data will be used to develop accurate biomass objective equations for each growth condition. Constraint-based metabolic modeling approaches, such as flux balance analysis (FBA), will be used to predict growth and yields in different environmental and genetic backgrounds. These simulations will also be used to identify gene targets to further improve production of cyclopropane fatty acids. Isotope-assisted metabolic flux analysis (13C-MFA) will also be used to quantify fluxes in different growth conditions and mutant strains.
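For readers unfamiliar with FBA, the standard textbook formulation is a linear program. The statement below is a generic sketch, not the team's specific model: the symbols S (stoichiometric matrix, metabolites by reactions), v (flux vector), c (objective weights, typically selecting the biomass reaction) and the flux bounds are the usual notation and are introduced here only for illustration.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}
```

In this picture, a condition-specific biomass objective equation corresponds to a different choice of c (and of the biomass reaction column in S), and growth conditions or gene knockouts are represented by changing the bounds on uptake or enzyme-catalyzed fluxes.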

Assessment and quantification of protein production is a key step in evaluating engineering outcomes. Researchers have extended proteomics pipelines to A. protothecoides using state-of-the-art technologies resident at PNNL, and in a single pilot experiment the team captured >6100 A. protothecoides proteins, representing around 80% of the ~7600 proteins encoded in the genome. More detailed analyses are now possible because the genome is complete. For applications where it is critical to know the exact abundance of proteins, researchers will employ targeted proteomics with selected reaction monitoring (SRM) to determine the absolute amount of a subset of proteins. For instance, metabolic models of flux from carbon fixation to triacylglycerol biosynthesis will be more accurate when they can incorporate the concentration of active sites for key pathway enzymes. Candidate targets for the SRM approach in Auxenochlorella include transgenes from engineering efforts, orthologs of fatty acid and lipid biosynthesis enzymes, CBC enzymes, PSI, PSII and light-harvesting proteins.

Ghost Imaging in the X-ray Regime | McSweeney | National Synchrotron Light Source II | Goodrich | Structural Biology

The objective of the NSLS-II Quantum Microscope Project is to explore the possibility of reducing the dose of X-rays interacting with biological samples, while enhancing the resolution and contrast of the measurements, by using the ghost imaging (GI) technique (Pittman et al. 1995; Shapiro and Boyd 2012). Experiments were performed with 15 keV and 9.6 keV X-ray beams in type-I (quantum) and type-II (classical) GI, respectively.

The poster will report details on the results obtained with type-I GI, where a diamond crystal was used to down-convert 15 keV X-rays to correlated photon pairs, which were then detected by an event-based, photon-counting Timepix3 camera with 1.56 ns time resolution and 55 × 55 µm² pixels (CERN 2023). The Timepix3 camera served simultaneously as signal and idler detector, and the collection of spatially correlated photon pairs demonstrated the presence of type-I phase-matched spontaneous parametric down-conversion (SPDC).

Furthermore, results will be reported on a type-II GI experiment using speckle patterns generated by a membrane diffuser illuminated by a 9.6 keV X-ray beam. The speckled light was used to periodically illuminate the object following a sample-in/sample-out method. The ghost image of the object was created from the classical correlations between the corresponding frames. To our knowledge, this is the first ghost image of a biological sample (E. cardamomum seed) with a variable transmission profile, highlighting the capability of this technique to image objects beyond binary masks or shadows.
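The correlation step in classical (type-II) GI can be illustrated with a toy simulation. The sketch below is not the experiment's reconstruction code: it assumes a 1-D object, uniformly random "speckle" values, an identical speckle pattern in the sample-out and sample-in frames, and a sample-in frame summed to a single bucket value. The function name `ghost_image` and all parameter values are hypothetical.

```python
import random

def ghost_image(transmission, n_frames=5000, seed=0):
    """Estimate a transmission profile from bucket/reference correlations."""
    rng = random.Random(seed)
    n_pix = len(transmission)
    sum_ref = [0.0] * n_pix    # running sum of reference intensity per pixel
    sum_prod = [0.0] * n_pix   # running sum of bucket * reference per pixel
    sum_bucket = 0.0
    for _ in range(n_frames):
        # Sample-out frame: the speckle pattern itself (reference).
        speckle = [rng.random() for _ in range(n_pix)]
        # Sample-in frame: total intensity transmitted by the object (bucket).
        bucket = sum(t * s for t, s in zip(transmission, speckle))
        sum_bucket += bucket
        for i, s in enumerate(speckle):
            sum_ref[i] += s
            sum_prod[i] += bucket * s
    mean_bucket = sum_bucket / n_frames
    # Covariance estimate: G(x) = <B * I(x)> - <B> * <I(x)>
    return [sum_prod[i] / n_frames - mean_bucket * sum_ref[i] / n_frames
            for i in range(n_pix)]

# Hypothetical object with a graded (non-binary) transmission profile.
obj = [0.0, 0.5, 1.0, 0.5, 0.0, 1.0, 0.0, 0.5]
image = ghost_image(obj)
print([round(g, 3) for g in image])
```

Because the covariance at each pixel is proportional to the local transmission, intermediate values (here 0.5) are recovered as intermediate image values, which is the sense in which GI can go beyond binary masks.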

Development of a CenH3-Based Haploid Inducer in Hexaploid Camelina sativa | Henry | University of California–Davis | Henry | Bioenergy | University

Camelina (Camelina sativa) is a promising oilseed crop that is particularly well suited for cultivation in the northwestern United States. The ECON project is an interdisciplinary project focused on two main objectives: i) enhancing nitrogen utilization efficiency and ii) boosting oil yield. The long-term goals are to increase the economic profitability of camelina cultivation by reducing the negative impact of nitrogen fertilization and by making its productivity competitive with other major oilseed crops such as canola. Approaches include characterizing genetic and genomic natural variation within camelina for the ability to absorb, translocate and assimilate nitrogen, and for recruiting beneficial rhizo-microbes to improve nitrogen acquisition. Researchers are also investigating the mechanisms underlying these differences to optimize yield potential by increasing seed size and enhancing oil synthesis. Within the ECON project, the laboratory aims to develop a haploid inducer line for camelina, a tool that will be instrumental for accelerating the breeding of several loci of interest in a polyploid background.

Haploid induction is a powerful breeding tool. Among other things, it allows the rapid production of complex genotypes when constructing experimental and breeding lines. This is particularly critical when dealing with polyploid genomes such as allohexaploid C. sativa. For example, selfing a parent with 6 heterozygous non-linked loci is expected to result in 0.024% homozygous progeny. Crossing the same parent to a haploid inducer (HI) will produce 1.5% progeny with the desired genotype. Modification of the centromere-specific histone variant CENH3 engenders haploid induction in Arabidopsis thaliana. Specifically, crosses between a haploid inducer line carrying a mutated form of CENH3 and a wild-type line result in frequent elimination of the haploid inducer chromosomes and produce offspring of different types in similar numbers: paternal haploids, aneuploids, and diploids. Haploids are formed when all the HI chromosomes are lost and only the WT chromosomes are retained.
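The two percentages quoted above follow from simple Mendelian arithmetic, sketched below under stated assumptions: the loci are unlinked and inherited disomically, one specific homozygous allele is wanted at every locus, and the haploid-induction figure refers to the fraction among haploid progeny. The function names are hypothetical.

```python
from fractions import Fraction

def target_fraction_selfing(n_loci: int) -> Fraction:
    """Fraction of selfed progeny homozygous for the chosen allele at every locus."""
    # Aa x Aa yields the chosen homozygote (e.g. AA) with probability 1/4 per locus.
    return Fraction(1, 4) ** n_loci

def target_fraction_haploid(n_loci: int) -> Fraction:
    """Fraction of haploid progeny carrying the chosen allele at every locus."""
    # A gamete from an Aa parent carries the chosen allele with probability 1/2.
    return Fraction(1, 2) ** n_loci

selfing = target_fraction_selfing(6)   # (1/4)^6 = 1/4096
haploid = target_fraction_haploid(6)   # (1/2)^6 = 1/64

print(f"selfing:           {float(selfing) * 100:.3f}%")
print(f"haploid induction: {float(haploid) * 100:.2f}%")
print(f"enrichment:        {haploid / selfing}x")
```

Under these assumptions the selfing figure is 1/4096 (about 0.024%) and the haploid figure is 1/64 (about 1.56%, quoted in the text as 1.5%), a 64-fold enrichment for six loci.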

The goal is to develop a cenH3-based haploid inducer in C. sativa. C. sativa is a very close relative of A. thaliana, but the situation is complicated by the fact that the camelina genome harbors three functional copies of the cenH3 gene, all of which need to be modified to produce an HI. A TILLING population of Camelina var. Ames 1043 was previously developed in the laboratory and more than 300 high-reliability mutations were identified in the cenH3 genes. Through a series of crosses and selection steps, the following three mutations were combined into a single line, in the homozygous state: a nonsense mutation allele (genome 1), a missense mutation that is known to result in haploid induction in A. thaliana (genome 2), and a splice-site variant (genome 3). The resulting potential haploid inducer was crossed to WT Ames and the progeny were screened for the presence of haploids. No haploid plant was recovered, but >85% of the progeny lacked at least one chromosome, confirming that genome elimination occurs in these crosses, albeit not efficiently enough to eliminate all 20 chromosomes of the haploid inducer parent. Interestingly, chromosomes from genome 2 were preferentially lost in the aneuploid progeny, consistent with previously documented relative expression dominance of the other two sub-genomes within the Camelina genome. Taken together, the results so far suggest that CENH3-based haploid induction is a feasible approach in camelina but is complicated by the fact that a large percentage of aneuploid progeny survive, presumably because of the buffering effect of the polyploid background. Researchers are in the process of combining weaker CENH3 alleles in a new potential haploid inducer line in order to further increase the loss of HI chromosomes and hopefully obtain fully haploid lines.

Transcriptome and Gene Regulatory Network Analyses in Camelina Nitrogen Response and Seed Development | Lu | Montana State University | Correr | Bioenergy | University

Camelina is a Brassicaceae oilseed crop that has great potential to become a sustainable source of bioenergy in the U.S. However, its low nitrogen use efficiency and its low seed and oil yield compared to other major oilseed crops hinder this potential. The goal of this project is to decipher the genetic and physiological mechanisms that determine nitrogen use efficiency and oilseed yield during the most critical processes of the camelina life cycle: 1) how camelina, in partnership with soil microbes, maximizes its ability to absorb and assimilate nitrogen into vegetative biomass; and 2) upon the transition to reproductive growth, how nitrogen is efficiently remobilized from senescing tissues (leaves and silicles) into sinks (seeds) to optimize yield potential by increasing seed size and enhancing oil synthesis.

Camelina (Camelina sativa (L.) Crantz) is an oilseed with potential as a crop for second generation biofuel production. To achieve sustainable production of camelina oil, the energy-consuming inputs need to be optimized, especially by reducing the amount of nitrogen in the production system. Researchers need to obtain a systems-level understanding of genetic and physiological mechanisms that may be used to enhance nitrogen use efficiency (NUE) and to improve agronomic and seed traits in camelina. The team grew three C. sativa accessions–Suneson, Cam 70 and Cam 116–under two nitrogen conditions of 0.55 mM and 5.5 mM of nitrate in the nutrient solution, and assessed the transcriptomes of six organs–flower, 10 days after fertilization (DAF) seed, pod, leaf, stem, and root. In almost all tissues, the largest number of differentially expressed genes (DEGs) was attributable to genotype, but differences attributable to nitrogen input were notable in flower, pod, and stem. The team identified functional terms enriched with DEGs due to nitrogen, which were mostly involved in abiotic responses and changes in photosynthesis-associated processes. In seeds, researchers also identified significant differences in genes of the phenylpropanoid pathway, biosynthesis of flavonoids, and maintenance of seed dormancy.

Little is known about the processes that regulate the development of camelina seeds and affect their viability. A previous study showed that miR167A overexpression results in lower levels of α-linolenic acid, but also larger seeds and delayed seed maturation in camelina (Na et al. 2019). Researchers therefore aimed to identify genes responsible for this altered seed development through a weighted co-expression network analysis. Researchers examined published gene expression profiles of the camelina wildtype (cv. Suneson) and miR167OE, a transgenic line overexpressing miR167A, at 8, 10 and 12 days after flowering (DAF). One group of co-expressed genes increased in expression from 8 to 10-12 DAF in miR167OE; however, their expression levels at 10-12 DAF were similar to those in Suneson at 8 DAF. In this group, the team found significant enrichment of genes in auxin response, seed oilbody biogenesis, seed maturation and seed germination. Genes with the opposite expression pattern included those with functions in the flavonoid biosynthesis pathway, including BANYULS and multiple members of the TRANSPARENT TESTA family. They are potentially involved in the determination of seed coat color but can also influence other seed traits. In addition to genes found in the enrichment analysis, the team also identified other candidates involved in seed germination: DELAY OF GERMINATION 1 (DOG1), ABA-HYPERSENSITIVE GERMINATION 1 (AHG1) and FIE2. The expression profiles of those genes showed profound differences at 8 DAF in miR167OE compared to all the other samples. Based on the expression patterns of these genes, researchers hypothesized that earlier germination occurs in miR167OE compared to Suneson despite delayed seed maturation. A two-week germination assay of both genotypes confirmed this: the transgenic genotype indeed germinated faster. These outcomes provide evidence that, besides changing the oil profile, miR167A overexpression also interferes with seed development and maturation, resulting in faster germination rates in miR167OE.
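As a rough illustration of what "weighted" means in a co-expression network, the sketch below computes a soft-thresholded adjacency, |Pearson r|^β, between gene expression profiles. This is the common unsigned-network definition and is assumed here rather than taken from the study; the gene names, expression values and β = 6 are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def adjacency(profiles, beta=6):
    """Unsigned weighted adjacency |r|^beta for every pair of genes."""
    genes = sorted(profiles)
    return {(g, h): abs(pearson(profiles[g], profiles[h])) ** beta
            for i, g in enumerate(genes) for h in genes[i + 1:]}

# Hypothetical expression values across four samples (not data from the study).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 7.8],   # tracks geneA closely
    "geneC": [5.0, 1.0, 4.0, 2.0],   # weakly related to geneA
}
adj = adjacency(profiles)
for pair, weight in adj.items():
    print(pair, round(weight, 3))
```

Raising the correlation to a power keeps strongly co-expressed pairs near 1 while pushing weak correlations toward 0, so groups of co-expressed genes stand out without a hard cutoff.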

Developing, Understanding, and Harnessing Modular Carbon/Nitrogen-fixing Tripartite Microbial Consortia for Versatile Production of Biofuel and Platform Chemicals | Lin | University of Michigan | Lin | Biosystems Design | University

The overall goal of this project is to design, construct, analyze and optimize a synthetic microbial consortium system consisting of three closely interacting members: a CO2-fixing photosynthetic specialist, a N2-fixing specialist, and a third specialist that can convert organic carbon and nitrogen generated by the first two specialists to synthesize a desired product. By integrating complementary expertise from multiple research laboratories at three institutions, researchers are pursuing three specific objectives: i) Develop tripartite microbial consortia for carbon/nitrogen fixation and production of bio-molecules with various nitrogen/carbon ratios; ii) Investigate molecular and cellular mechanisms governing the tripartite consortia via omics study and predictive modeling; and iii) Explore alternative spatial configurations and develop scalable design principles.

Microbial communities are ubiquitous in nature, exhibiting incredibly versatile metabolic capabilities and remarkable robustness. Inspired by these synergistic microbial ecosystems, rationally designed synthetic microbial consortia are emerging as a new paradigm for bioprocessing and offer tremendous potential for solving some of the biggest challenges society faces. In this project, the team focuses on a tripartite consortium consisting of a CO2-fixing photosynthetic specialist, a N2-fixing specialist, and a third specialist that can convert organic carbon and nitrogen generated by the first two specialists to synthesize a desired product. In addition to CO2 fixation, a noteworthy feature of this design is the elimination of the requirement for nitrogen fertilizer, which is produced by ammonia synthesis via the Haber-Bosch process and accounts for an estimated 2% of global energy expenditure. Researchers aim to develop a modular and flexible model system capable of producing diverse bio-molecules (varying C:N ratio) as advanced biofuel or platform chemicals, to dissect this complex ecosystem using a spectrum of cutting-edge systems approaches, and ultimately to derive scalable and broadly applicable design principles for maximizing system performance.

The first prototype tripartite consortium employs genetically modified strains of the photosynthetic cyanobacterium Synechococcus elongatus that secretes sucrose (Abramson et al. 2016) and the nitrogen-fixing bacterium Azotobacter vinelandii that secretes ammonia (Barney et al. 2015), respectively, to form a symbiotic foundation for supporting a third producer member. Researchers demonstrated supported growth for a range of producer strain candidates, including a sucrose-metabolizing Escherichia coli K-12 derivative strain, Corynebacterium glutamicum, and Bacillus subtilis, using a multi-chamber bioreactor system under continuous culture conditions (Carruthers et al. in preparation).

Ongoing investigations include: i) development of predictive mathematical models of the tri-culture system to systematically explore the parameter space and understand how different biological parameters and operating strategies affect system performance, such as yield and productivity; and ii) omics analysis of monocultures and cocultures under controlled yet suboptimal/stressed conditions to identify the molecular bottlenecks limiting the performance of each member and hence the overall tri-culture.

Development of Classically Entangled Light for Depth-Resolved Quantum Mimicry Bioimaging | Liao | University of Colorado–Boulder | Liao | Bioimaging

This project aims to build upon the principles of classical entanglement of light and develop new and untested (1) classes of anti-correlated light sources and (2) quantum-inspired imaging protocols that fit into the theoretical framework for recapitulating desirable super-performing imaging traits (e.g., performance that surpasses classical limits). More specifically, efforts will be focused on testing the quantum-like characteristics of newly developed light pulses and applying them to enhance the performance of optical coherence tomography, a label-free cross-sectional imaging method that is suited for in situ probing of plant biology.

Quantum imaging has attracted growing interest over the past three decades, motivated by successful demonstrations that it could outperform its classical counterpart in several aspects. However, challenges associated with the low brightness of entangled photons and reliance on photon-sparse imaging protocols have stalled attempts at translating those technologies to practical biological field use. Those issues have also necessitated long data acquisition times and made imaging of dynamic biological processes challenging. Surprisingly, several phenomena once thought to be exclusive to quantum entangled photons have been successfully replicated with classical light carrying anti-correlations or nonseparable degrees of freedom (e.g., spin and orbital angular momenta, wavelengths, spatial, and temporal modes). These discoveries gave rise to an emerging field known as classical entanglement or mode-entanglement of light, such as that involving arbitrarily tailored vector beams. The ability to perform quantum mimicry using special forms of classical light has far-reaching implications, both in the potential of overcoming inherent shortcomings of quantum light sources and in the practical considerations of translating those advantages for robust imaging applications. The project will perform research on the underpinning principles for optical wavefront and field control of structured vector beams, such that the knowledge can be applied to the design and construction of light sources for quantum mimicry imaging. Researchers will subsequently develop interferometric systems and image reconstruction protocols for optical coherence tomographic bioimaging based on the considerations for those classically entangled beams. The imaging instrument will be characterized and validated by benchmarking its performance against comparable technologies without classical entanglement. 
The expected outcomes could potentially lead to quantum-like imaging advantages without sacrificing optical brightness. The quantum-like advantages or enhancement pursued in this project include low-noise, high-sensitivity imaging through turbid and scattering media. These enhanced capabilities could benefit plant research on multiple fronts, from imaging dynamically evolving bio-events with high precision to probing photo-sensitive biosystems with the lowest dose possible. By collaborating with experts in plants and microbiological systems at a later phase of the project, the developed imaging technology will be designed to be applicable for future in situ imaging of plant biological systems relevant to biomass and bioenergy investigations.

Discovering Transcriptional Regulators of Photosynthesis in Energy Sorghum to Improve Productivity | Long | University of Illinois at Urbana–Champaign | Pelech | Bioenergy | CABBI

This research aims to identify and investigate the transcription factors involved in the regulation of photosynthesis in energy sorghum. The major goal of this project is to model and validate gene regulatory networks that reveal the relationship between transcription factors and photosynthesis, particularly those that cause a loss of efficiency in leaves of the lower canopy. This information will allow researchers to rank transcription factors by importance and thus guide future design strategies for developing energy sorghum cultivars with improved photosynthetic light-use efficiency and overall productivity.

C4 grasses such as annual energy sorghum hybrids (Sorghum bicolor) have great potential both for carbon sequestration and as feedstocks for biofuels and building materials. Sorghum is also exceptionally drought tolerant, which allows cultivation on land that is marginal for most food crops; as a vegetative crop, it also avoids the problems faced by grain crops during the water-deficit-sensitive reproductive phase (Mullet et al. 2014). However, in contrast to most plants, sorghum belongs to a clade of C4 species that has undergone a maladaptive loss of photosynthetic efficiency in self-shaded leaves within the canopy, and current models predict that this loss results in a 15-20% reduction in potential productivity (Pignon et al. 2017). Specifically, most plants have evolved to dynamically tune their photosynthetic machinery by shifting the stoichiometry of proteins involved in the light reactions of photosynthesis to maintain a high maximum absolute quantum efficiency of CO2 assimilation (𝛷𝐶𝑂2,𝑚𝑎𝑥) in the shade. Seminal work has shown that the lower self-shaded leaves from C4 bioenergy crops (bioenergy sorghum, Miscanthus and maize) do not retain a high 𝛷𝐶𝑂2,𝑚𝑎𝑥 compared to their upper sun-exposed leaves, which is due to the change in light environment, not leaf age (Pignon et al. 2017; Collison et al. 2020). Variation in the severity of this 𝛷𝐶𝑂2,𝑚𝑎𝑥 loss between sorghum cultivars suggests that this maladaptive trait may be the result of differences in the expression of one or more genes (Jaikumar et al. 2021). Since transcription factors (TFs) are key regulators of gene expression in response to environmental stimuli such as changes in light intensity and quality, researchers hypothesize that key TFs cause the observed maladaptive loss of photosynthetic efficiency in energy sorghum and that optimizing their expression will restore photosynthetic efficiency and alleviate suboptimal 𝛷𝐶𝑂2,𝑚𝑎𝑥 in the shaded canopy. 
Researchers further hypothesize that genes influencing 𝛷𝐶𝑂2,𝑚𝑎𝑥 will have expression patterns that correspond to measurable changes in photosynthetic traits and that researchers will be able to identify these genes by comparing changes in expression in response to the light environment across energy sorghum cultivars and canopy positions. Therefore, researchers will identify these key transcription factors by analyzing variations in gene expression and photosynthetic traits such as 𝛷𝐶𝑂2,𝑚𝑎𝑥 across light conditions and sorghum cultivars. Researchers will also use in planta validation of TF gene targets to model a gene regulatory network to describe the regulation of photosynthesis in sorghum. Identifying the cause of photosynthetic inefficiency in shaded energy sorghum canopies and engineering solutions to restore the 15-20% loss in productivity and enhance yield will improve the overall potential of this bioenergy crop to meet the growing needs for energy security.

Overview of CABBI Conversion Theme | Zhao | University of Illinois Urbana–Champaign | Shen | Bioenergy | CABBI

The overarching goal of the Conversion theme is to investigate the cellular metabolism and gene regulation mechanisms of three non-model yeasts (Issatchenkia orientalis, Yarrowia lipolytica and Rhodosporidium toruloides) and to develop new synthetic biology and systems biology tools for characterizing and engineering these yeasts to produce chemicals and fuels from renewable plant biomass.

Microorganisms are increasingly used to produce biofuels and chemicals. However, developing robust microorganisms for the economical production of biofuels and bioproducts from low-cost, often-recalcitrant feedstocks at large scale with high titers, rates, and yields (TRYs) remains a significant challenge. Key reasons include: i) lack of understanding of how native metabolism and physiology constrain the production of non-natural compounds; ii) difficulty in identifying compounds that can be efficiently produced in living organisms and the best host, natural or engineered, for doing so; iii) the time-consuming and expensive design-build-test-learn (DBTL) cycle for metabolic engineering; and iv) lack of known enzymes with desired activity and substrate specificity for the synthesis of target natural or non-natural compounds.

To address these scientific challenges, the Conversion theme will build on the accomplishments of the first five years of research and focus on the following main objectives: (1) Develop a self-driving biofoundry for metabolic engineering and enzyme engineering; (2) Develop artificial intelligence (AI)/machine learning (ML) algorithms for biosystem design; (3) Engineer non-model yeasts for cost-effective production of four main target chemicals: 3-hydroxypropanoic acid (3-HP) and citramalate in Issatchenkia orientalis, and triacetic acid lactone (TAL) and fatty alcohols in Yarrowia lipolytica and Rhodosporidium toruloides, with the metabolic engineering efforts continuing to be accelerated by the tools developed; (4) Investigate the underlying biological mechanisms that lead to high-level production of target products using systems biology and the biofoundry; (5) Investigate the molecular basis for genetic instability and develop engineering strategies to create robust production organisms for large-scale fermentation; (6) Develop an end-to-end pipeline coupled with techno-economic analysis/life cycle analysis (TEA/LCA) to demonstrate the process economics for fermentative production of these target chemicals, which not only provides guidance to the metabolic engineering efforts in (3) and (5), but also integrates the Conversion theme with the Feedstock and Sustainability themes.

To achieve these objectives, researchers have assembled an interdisciplinary team with broad and complementary skills, including experts in metabolic engineering, synthetic biology, systems biology, enzyme engineering, analytical chemistry, catalysis, bioprocessing and fermentation, process economics analysis, data mining and machine learning, and plant synthetic biology.

In this poster, researchers will present an overview of the Center for Advanced Bioenergy and Bioproducts Innovation (CABBI) Conversion research program and highlight one representative project in which multiomics analysis is performed on Saccharomyces cerevisiae and Issatchenkia orientalis to understand their metabolism.

Integrating Measurements and Models to Improve Projections of Ecosystem Carbon Balance in Bioenergy Agriculture | Brzostek | CABBI | Juice | Bioenergy | CABBI

The goal of the Center for Advanced Bioenergy and Bioproducts Innovation (CABBI) Sustainability Theme research is to design a sustainable bioeconomy (Figure 1). A critical part of meeting this goal is to integrate empirical measurements with models to project future scenarios of bioenergy systems that can inform sustainable management choices that enhance carbon (C) sequestration and nitrogen (N) retention. Here, researchers present multiple case studies demonstrating how observational and experimental measurements have been used to enhance the predictive capabilities of ecosystem models.

Research in the CABBI Sustainability Theme aims to understand the interactions between candidate bioenergy feedstocks, and the environmental conditions and geographic locations under which they are grown in order to inform the understanding of key factors needed for the economically and environmentally sustainable production of bioenergy and bioproducts for fossil fuel displacement. Within this framework, advancing predictive understanding of bioenergy systems through ecological modeling is critical to address how to provide energy, economic, and ecosystem C benefits to help slow the rate of climate change. To meet this challenge, CABBI researchers couple empirical measurements to ecosystem models that simulate plant yields and nutrient dynamics in order to predict future ecosystem C stocks under different environmental and feedstock scenarios. Here, researchers present three case studies illustrating how CABBI researchers integrate data with models to improve projections of bioenergy C balance.

  1. Researchers developed a new bioenergy ecosystem model, FUN-BioCROP, that simulates plant-microbe interactions, microbial physiology, and emerging mechanisms of stable soil C creation. Model parameterization integrated both observational and experimental data. On the observational side, this included long-term measurements of total soil C pools and the form of soil C stabilization under different bioenergy crops. On the experimental side, researchers used a novel laboratory experiment to trace the fate of two bioenergy litters (miscanthus and corn) to improve the parameterization of key microbial traits. When the team ran the improved model forward, researchers found divergent responses of bioenergy feedstocks to environmental change.
  2. The predictive ability of ecosystem models is primarily constrained by data quality and quantity. Using model sensitivity analyses, researchers identified root and rhizome biomass, the response of plant C allocation to N fertilization, and cycling of the litter layer as key data limitations to the predictive understanding of miscanthus by the Agro-IBIS model. To address these limitations, the team designed targeted field campaigns. The team found that N fertilization alters both the magnitude and timing of belowground C allocation and that the litter layer in miscanthus comprises ~1/4 of the total aboveground biomass. Both findings represent key processes and C pools that are priorities for future model revisions.
  3. DayCent has been at the forefront of models used to predict the C and N consequences of bioenergy production. A common criticism of DayCent has been its assumption that soil decomposition follows first-order decay. To address this criticism, researchers integrated the explicit microbial dynamics of FUN-BioCROP into DayCent. The microbial DayCent model better captured the seasonal profile of ecosystem respiration of miscanthus and switchgrass derived from eddy-covariance measurements. Moreover, it showed an upper limit to soil C accumulation over time, whereby ongoing plant inputs enhanced microbial biomass, leading to priming losses of soil C. These results have important implications for estimating soil C accumulation in the emerging bioeconomy.
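
The contrast in case study 3 between first-order decay and explicit microbial dynamics can be illustrated with a toy calculation. The sketch below is not the DayCent or FUN-BioCROP formulation. It assumes a single soil C pool, a generic two-pool microbial model with Michaelis-Menten uptake, and hypothetical parameter values (`k`, `vmax`, `km`, `cue`, `death`) chosen only for illustration.

```python
def first_order_soil_c(c0, k, inputs, years, dt=0.01):
    """Forward-Euler integration of dC/dt = inputs - k * C (first-order decay)."""
    c = c0
    for _ in range(int(round(years / dt))):
        c += dt * (inputs - k * c)
    return c


def microbial_soil_c(c0, b0, inputs, years, vmax=2.0, km=50.0,
                     cue=0.4, death=0.2, dt=0.01):
    """Forward-Euler integration of a generic two-pool microbial model.

    uptake = vmax * B * C / (km + C)        (Michaelis-Menten uptake)
    dC/dt  = inputs + death * B - uptake    (dead biomass returns to soil C)
    dB/dt  = cue * uptake - death * B       (cue = carbon use efficiency)

    All parameter values are hypothetical and used for illustration only.
    """
    c, b = c0, b0
    for _ in range(int(round(years / dt))):
        uptake = vmax * b * c / (km + c)
        dc = inputs + death * b - uptake
        db = cue * uptake - death * b
        c += dt * dc
        b += dt * db
    return c, b


if __name__ == "__main__":
    for inputs in (1.0, 2.0):
        c_first = first_order_soil_c(0.0, 0.05, inputs, 400)
        c_mic, b_mic = microbial_soil_c(10.0, 1.0, inputs, 1000)
        print(f"inputs={inputs}: first-order C={c_first:.2f}, "
              f"microbial C={c_mic:.2f}, microbial biomass={b_mic:.2f}")
```

In the first-order case, the long-run soil C stock scales with inputs (inputs / k). In this particular toy microbial model, extra inputs raise microbial biomass instead, while the long-run soil C stock is set by the microbial parameters. This is a property of the sketch under its stated assumptions, not a result from the models described above.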

Across these case studies, the integration of empirical data into models resulted in improved C balance projections that represent the most up-to-date state of knowledge of the factors creating persistent soil C and allowing for the most sustainable bioenergy crop production.

Increasing the Value of Bioenergy Grasses—Expressing Engineered Traits in the Right Place at the Right Time | Marshall-Colon | Center for Advanced Bioenergy and Bioproducts Innovation | Fu | Bioenergy | CABBI

The overarching goal of the Center for Advanced Bioenergy and Bioproducts Innovation (CABBI) Feedstock Production Theme research is to deliver resilient, highly productive grasses that contain large amounts of lipids. Researchers have made significant advances in engineering production of oils, specialty fatty acids, and other organic compounds in vegetative tissues of these grasses, as well as increasing biomass yield, resource use efficiency, and environmental resilience. A specific challenge researchers are still working on is targeting expression of engineered traits in the right place at the right time, paving the way for CABBI crops that produce oil in stem storage tissues at the end of the season.

Overview: To contribute to the development of the bioeconomy, feedstocks need to be improved to be economically and environmentally sustainable for both processor and farmer. CABBI strives to do this by engineering carbon allocation to produce oils, specialty fatty acids, and other organic compounds, as well as by increasing biomass yield, resource use efficiency, and environmental resilience. Because such improvements require coordinated changes in multiple plant traits, researchers are working simultaneously on trait improvements and how best to combine and implement them to get the traits in the right place at the right time. In the high-biomass C4 grass feedstocks (sorghum, miscanthus, and sugarcane), the right place for value-added products is stems. As the right time for different traits varies, researchers are performing analyses across multiple key developmental time points.

Spatial atlas of the sorghum stem: Bioenergy sorghum’s 4-5 m stems account for ~80% of the harvested biomass. Stems accumulate high levels of sucrose that could be used to synthesize bioproducts if information about stem cell-type gene expression and regulation were available to enable engineering. To obtain this information, Laser Capture Microdissection (LCM) was used to isolate transcriptome profiles from five major cell types present in vegetative stems of Sorghum bicolor L. Moench cv. Wray. Transcriptome analysis identified genes with cell-type specific and cell-preferred expression patterns that reflect the distinct characteristics and regulatory functions of each cell type. The newly discovered cell-type specific genes can be used as markers for downstream analyses, such as single-cell transcriptomics. Analysis of cell-type specific gene regulatory networks (GRNs) revealed that 1) different biological functions distinguish vascular and non-vascular cell types; 2) distinct transcription factor families regulate the cell-type specific expression of genes; and 3) cell-type specific transcription factors have both direct and indirect modes of regulation to modulate the expression of cell-type specific genes. The team used the LCM data to gain insights into stem secondary cell wall (SCW) networks. By combining the spatial resolution of the LCM-derived stem cell-type specific transcriptome with a stem developmental profile of SCW formation, researchers uncovered 1) the previously unknown spatial expression of key SCW genes across sorghum stem cell types, 2) cell-type specific SCW regulatory networks and network motifs, and 3) potential regulators that repress SCW formation in pith parenchyma cells. The cell-type transcriptomic dataset provides a valuable source of information about the function of sorghum stems and GRNs that will enable the engineering of bioenergy sorghum stems.

Future directions: The spatial transcriptomes provide rich information about steady-state gene expression and identified cell-type specific hub genes that likely play key regulatory roles in signaling within each cell type. However, the products of gene expression, the proteome and metabolome, better reflect macro-level phenotypes because they strengthen the link between gene expression and gene function. A continuing collaboration among CABBI, GLBRC, and Environmental Molecular Sciences Laboratory (EMSL) will expand the sorghum stem molecular atlas by incorporating spatial proteomic and metabolomic data into the existing GRNs, which will improve network predictions by more directly linking genes to stem phenotypes. Likewise, researchers will expand the network analysis to include spatial transcriptomes from two additional developmental time points (onset of anthesis and post-anthesis). These combined analyses will reveal the spatio-temporal dynamics of stem metabolic and signaling networks. This deep understanding of these spatio-temporal dynamics is critical to being able to engineer stems and redirect bioproducts to the right place at the right time.

Connections across Feedstocks: An actionable outcome of the above analyses is the identification of specific promoter elements that drive cell type specific expression at different points in development. Such molecular tools provide a direct path to engineer sorghum stems to accumulate high-value bioproducts of interest to CABBI and its sister Bioenergy Research Centers (BRCs) and decrease the burden on conversion groups in the biofuel industry. Hence, this is a cross-BRC priority. With the close relation between sorghum, sugarcane, and miscanthus, the team expects that these findings will provide insights into stem-specific expression across the feedstocks of interest. The team also expects many of the findings and methodologies from this study to be useful to researchers interested in engineering other grasses of interest, such as switchgrass and maize. Combining this knowledge with CABBI advances in accumulating oil in sugarcane and sorghum, improvements in water use efficiency and photosynthesis, and breeding to identify plants with higher yields and greater geographic range will generate feedstocks that can be a foundation for a strong bioeconomy.

Pooled Microbial CRISPR Screens Using Single-Cell RNA Sequencing | Carothers | University of Washington | Brandner | Biosystems Design | University

The goal is to design genome-wide CRISPRa/i programs for carbon-conserving bioproduction. To achieve this goal, researchers will develop new approaches for high-throughput analysis powered by single-cell RNA sequencing.

CRISPR activation and interference tools have transformed the ability to reprogram microbial hosts for bioproduction. However, building large genetic programs is a time-consuming and iterative process due to the limited understanding of host metabolic and transcriptional regulatory networks. The goal is to develop a custom bacterial single-cell RNA sequencing platform to profile the impact of multi-gene CRISPR gene regulatory programs on thousands of transcriptomes. This approach will provide a high-throughput, low-cost, and information-rich technology to investigate design rules for CRISPR activation and interference (CRISPRa/i), identify heterogeneity in engineered strains, and rapidly assess both intended and unintended transcriptional responses from CRISPR programs. In eukaryotic systems, similar approaches have transformed the ability to interrogate gene function and delineate regulatory networks, but these methods have not yet been implemented in bacteria.

Here, researchers have applied a custom microbial single-cell RNA sequencing platform (microSPLiT) to profile the impact of CRISPRa perturbations on transcriptomic states in engineered E. coli. To validate the platform, researchers targeted genes involved in aromatic amino acid biosynthesis. Single-cell analysis revealed distinct gene expression signatures and variable stress responses for the CRISPRa targets despite belonging to the same metabolic pathway. These results demonstrate that the platform can provide information-rich readouts from CRISPRa programs for high-throughput metabolic engineering in bacteria.

Uncovering the Microbial Networks that Degrade Plant-Derived Phenolic Compounds and Their Role in Peatland Soil Carbon Sequestration: Revisiting the ‘Enzyme Latch’ Hypothesis | Kostka | Georgia Institute of Technology | Kostka | Environmental Microbiome | University

The goal is to elucidate the fundamental principles driving physiology and metabolic exchange within microbial interaction networks that regulate the rate-limiting steps in soil organic matter (SOM) degradation, specifically the oxidation of phenolic compounds derived from lignocellulose and lignin-like polymers in carbon-rich peatlands and their role in the preservation of organic matter under anaerobic, water-saturated conditions. The project combines multiomics with advanced analytical chemistry to test the enzyme latch hypothesis and its response to climate change drivers. Field and laboratory investigations will be integrated to construct and calibrate a predictive framework that links specific microbial processes and interactions to the mechanisms driving the rate limiting steps of enzymatic SOM decomposition (phenolic compound oxidation, hydrolysis), SOM persistence, and greenhouse gas production in peatland soils. The project leverages infrastructure and extensive datasets of DOE’s Spruce and Peatland Responses Under Changing Environments (SPRUCE) in Marcell Experimental Forest.

Peatlands represent climate critical regions that cover only 3% of the Earth’s land surface but store approximately 1/3 of all soil carbon (C). The future role of peatlands in C sequestration remains uncertain and depends on the impact of global change-related perturbations on their C balance. Hypotheses driving the proposed research are: 1) Under flooded anoxic conditions, persistent, plant-derived compounds (lignocellulose and lignin phenols) act as bottlenecks to microbial SOM decomposition by binding and inhibiting microbial hydrolase enzymes (e.g. CAZymes, peptidases); thus their degradation is the rate limiting step in SOM decomposition. 2) Soil moisture content and O2 availability along with SOM quality largely determine the functional diversity of heterotrophic microbes and metabolic pathways of lignocellulose and lignin degradation, which in turn regulate soil C storage through the enzyme latch. 3) Climate change drivers, which are expected to warm and dry out peatlands, will release the enzyme latch and accelerate SOM decomposition by enhancing the oxidation of phenolic compounds and concomitantly stimulating hydrolase activity. 4) Conversely, warming induced shifts in plant species composition away from mosses and toward lignin-rich vascular plants (ericaceous shrubs) will act to bolster the enzyme latch, inhibiting microbial decomposition through the accumulation of plant-derived phenolic compounds.

The project leverages an unprecedented time series generated from S1 Bog at the SPRUCE site (MN, U.S.), including shotgun metagenomic profiles (131 metagenomes, 2.4 Tbp of sequences), amplicon sequencing, and physical-chemical-biological data from 2014-2022. A total of 810 dereplicated metagenome-assembled genomes (MAGs) of prokaryotes have been obtained across all metagenome datasets, and short read recruitment against these MAGs reveals that they represent the majority of the sampled microbial communities at all peat depths. Taxonomic diversity is dominated by the Acidobacteria, which comprise half of the 10 most abundant MAGs, with some recovered genomes comprising up to ~15% of the total community. These organisms are known to be metabolically flexible and contain an abundance of genes that encode degradation of plant-derived polysaccharides under both aerobic and anaerobic conditions. Metagenomes were screened for phenol oxidase and peroxidase genes, revealing the metabolic potential for phenolic compound oxidation.

Plant-derived phenolic compounds implicated as inhibitory compounds in the latch mechanism, including sphagnum acid, are persistent in surface soils. These compounds are supplied at the surface by plant litter but appear resistant to microbial decay as evidenced by their accumulation at depths down to 200 cm. Using microcosm experiments researchers quantified the inhibitory effect of soluble phenolics on anaerobic C mineralization and linked this effect to soil organic matter quality and peatland type. By manipulating the concentration of free soluble phenolics with polyvinylpyrrolidone (PVP), a compound that binds and inactivates phenolics, thereby preventing phenolic-enzyme interactions, rates of CO2 and CH4 production in soils were shown to be 62% and 54% inhibited by naturally occurring porewater phenolics, respectively.

Phenol oxidase and hydrolase activities were profiled with depth and between whole-ecosystem warming treatments in soils of the SPRUCE enclosures. At the in situ pH of 4 and room temperature, all measured enzyme activities (phenol oxidase, β-glucosidase, cellobiohydrolase, β-N-acetylglucosaminidase, acid phosphatase) declined with peat depth. A generational drought occurred in 2021, which provided intriguing evidence for the complex controls of the enzyme latch mechanism. Phenol oxidase and hydrolase activities were inversely correlated with water table elevation, which dropped by 0.11 m during the drought and declined with whole-ecosystem warming. Thus, researchers hypothesize that enzyme activity is enhanced by putative oxygenation due to drought and higher evapotranspiration rates in the warming treatments.

The recent metabolomic observations from a range of peatlands indicate that the decomposability of peat SOM, and conversely C sequestration, is directly correlated with carbohydrate content (more reactive C substrates) and inversely correlated with aromatic content (recalcitrant C that resists degradation). Notably, Sphagnum-dominated peat soils are outliers, exhibiting a high proportion of labile substrates (carbohydrate content), but low GHG production rates. A range of site-specific factors are likely to impact enzyme-phenol interactions, including biotic (functional diversity of microbial communities, vegetation) and abiotic (temperature, pH) parameters. While the investigations to date suggest that temperature limits the latch mechanism over other factors, soluble phenolics from Sphagnum mosses appear to be especially effective at limiting decomposition under anoxic conditions.

Computational Tools for Multiomic Data Standardization and Integration to Represent Whole Microbial Communities | Kostka | Georgia Institute of Technology | Konstantinidis | Environmental Microbiome | University

The goals of this research are to: i) standardize methods for detecting the relative abundance of molecular features (e.g., genes, pathways or species) in various omics datasets (e.g., metagenomics, metatranscriptomics, metaproteomics and metabolomics) for use by the scientific community; ii) extend the previously developed dynamic mathematical models for water-based ecosystems by integrating the standardized data from (i) and additional omics data, such as metabolomics, as parameters of the model towards identifying microbe-microbe and microbe-environment interactions within a microbial community; and iii) apply the advanced models to appropriate multiomic data from the DOE’s Spruce and Peatland Responses Under Changing Environments (SPRUCE) project to provide insights into the microbial interaction networks that mediate belowground carbon cycling in these peatland soils as well as how these interaction networks may be altered by climate change drivers (e.g., elevated temperature and CO2).

Microbial species, especially in soils, are engaged in incredibly complex interactions based on their physiological responses to the environment and chemical communication via a wide range of molecules in low concentrations. Deciphering the multi-dimensional causes and consequences of such interactions during environmental transitions is challenging because traditional methods reveal only the numerically dominant members of the community related to the flow of the major carbon and nitrogen sources, and/or are typically limited to static correlation networks of abundances that cannot encompass well the dynamic environment. A predictive understanding of how the functioning of soil (and other) ecosystems responds to future environmental perturbations is limited by the inability to elucidate the physiological interactions within complex soil microbial communities and the effect of the physicochemical environment on those interactions. Further, the identification of key microbial guilds, i.e., microbial groups of species that exploit the same resource(s) related to carbon turnover, remains essentially elusive.

To address these challenges, researchers have recently developed mathematical models that represent microbe-microbe and microbe-environment interactions within a community and can predict how these interactions change in the future when environmental parameters such as temperature or precipitation change, i.e., the models represent dynamic models of whole microbial communities. The mathematical models are based on the principle of the ecological Lotka-Volterra (LV) differential equations and require time-series omic data (Dam et al. 2016). Applications of these models to available metagenomic data from freshwater lakes led to a number of insights and predictions, some of which are supported by biological evidence. For instance, an interaction cluster was identified in Lake Lanier (Atlanta, GA) that included cyanobacterial primary producers and proteobacterial heterotrophs that live on the exudates of the cyanobacteria (Dam et al. 2020). Notably, ~46% of all species-species interactions in Lake Lanier were negative, indicating competition, while others were positive, suggesting cooperation. These results contrasted with those for Lake Mendota (Madison, WI), a lake that freezes in the wintertime, which indicated a higher level (~66%) of competition, presumably because Lake Lanier experiences much milder weather fluctuations (Dam et al. 2020). Even though some of the findings may appear to be somewhat anticipated based on existing knowledge, it is important to note that the mathematical, biologically agnostic approach is able to quantify these effects on population abundance dynamics and interactions, which is essential for forecasting future behavior. Therefore, the modeling framework provides a new strategy for integrating omics data to summarize the functionality and species-species interactions within natural habitats, while the application of these models to soil data as part of this project represents a novel contribution.
Researchers will report on the efforts to adapt these LV models to the soil multiomic data available from the DOE’s SPRUCE project.
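
To make the Lotka-Volterra idea concrete, the following is a minimal sketch of a generalized LV system, dx_i/dt = x_i (r_i + sum_j a_ij x_j), integrated with a simple forward-Euler scheme. It is not the formulation used in Dam et al. The growth rates `r`, the interaction matrix `a`, and the helper names are hypothetical, and the parameter-fitting step that a real analysis of time-series omic data would need is omitted.

```python
def simulate_glv(x0, r, a, t_end, dt=0.001):
    """Forward-Euler integration of dx_i/dt = x_i * (r_i + sum_j a[i][j] * x_j).

    x0: initial abundances, r: intrinsic growth rates,
    a: interaction matrix (a[i][j] is the effect of species j on species i).
    """
    x = list(x0)
    n = len(x)
    for _ in range(int(round(t_end / dt))):
        growth = [r[i] + sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * x[i] * growth[i] for i in range(n)]
    return x


def fraction_negative(a):
    """Share of nonzero between-species (off-diagonal) interactions that are negative."""
    off_diag = [a[i][j] for i in range(len(a)) for j in range(len(a))
                if i != j and a[i][j] != 0]
    if not off_diag:
        return 0.0
    return sum(1 for v in off_diag if v < 0) / len(off_diag)


if __name__ == "__main__":
    # Hypothetical two-species community with mutual competition.
    r = [1.0, 1.0]
    a = [[-1.0, -0.5],
         [-0.5, -1.0]]
    print(simulate_glv([0.1, 0.2], r, a, t_end=50.0))
    print(fraction_negative(a))
```

Once an interaction matrix has been estimated, a summary such as the share of negative interactions reported for each lake can be read directly from the signs of its off-diagonal entries, which is what `fraction_negative` does in this sketch.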

An essential part of mathematical modeling is the accurate estimation of in situ abundance of molecular features (e.g., genes, pathways or species). However, how to precisely measure the abundance of features in metagenomic or other omic datasets remains challenging because the available methods have not yet been standardized, and it is not clear how the data from different approaches can be compared and interpolated. Estimation of in situ abundances is the cornerstone for several additional downstream analyses such as identifying differentially abundant taxa between samples, metabolic modeling, etc. Researchers will present the approaches to standardize the methods for detecting the relative abundance of features in various omics datasets (e.g., metagenomics, metatranscriptomics, metaproteomics and metabolomics) for use by the scientific community. As a representative example, researchers have recently advanced the tool for metagenomic (or metatranscriptomic) read recruitment plotting to provide precise estimates of whole-genome and individual gene abundances, and the extent of intra-population gene-content and sequence diversity (Gerhardt et al. 2021). Further, by analyzing publicly available genome and metagenome data, researchers show that the diversity within species is organized in distinct 99.5% Average Nucleotide Identity (ANI) clusters that can be used to consistently describe genomovars and strains (Rodriguez-R et al. 2022). Using these standards and concepts, researchers will also present the efforts to quantify how individual species and strains within species respond to the temperature and CO2 treatments applied at the DOE’s SPRUCE project.
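
As an illustration of how a 99.5% ANI threshold could be used to group genomes, the sketch below clusters genomes by single linkage over a table of pairwise ANI values. This is not the procedure of Rodriguez-R et al. The single-linkage choice, the function name `cluster_by_ani`, and the ANI values in the example are all assumptions made for illustration.

```python
def cluster_by_ani(genomes, ani, threshold=99.5):
    """Group genomes whose pairwise ANI is >= threshold (single linkage).

    genomes: list of genome names.
    ani: dict mapping (name_a, name_b) pairs to ANI in percent.
    Returns a sorted list of sorted clusters.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), value in ani.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Hypothetical pairwise ANI values (percent) for four genomes of one species.
    ani = {
        ("A", "B"): 99.8, ("B", "C"): 99.6, ("A", "C"): 99.4,
        ("A", "D"): 97.9, ("B", "D"): 98.1, ("C", "D"): 98.0,
    }
    print(cluster_by_ani(["A", "B", "C", "D"], ani))
```

Single linkage joins A and C through B even though their direct ANI is below the threshold. A stricter linkage rule would split them, so the linkage choice matters when cluster boundaries are not sharp.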

Identification of Regulatory Mechanisms Underlying Cell Differentiation in Sorghum Biomass | Kirst | University of Florida | Vermerris | Bioenergy | University

Researchers plan to alter the genetic regulation of the cellular developmental programs that generate the vegetative tissues of sorghum, with the aim to increase the proportion of cells that are less recalcitrant to biomass deconstruction.

Plant biomass is comprised of distinct cell types, which largely determine its physical and chemical properties, and hence, its recalcitrance to biomass processing aimed at generating fermentable sugars that microbes can convert to biofuels. For example, the walls of parenchyma cells in the stalks of maize (Zea mays L.) and sorghum (Sorghum bicolor [L.] Moench) can be broken down using milder pretreatment conditions and with lower cellulase loadings than the lignified cells present in the outer rind of the stalk (Zeng et al. 2012; Zeng et al. 2012; Li et al. 2018). The different cell types within the sorghum plant develop from undifferentiated meristem cells through variation in the spatio-temporal expression of regulatory genes that control structural genes. The understanding of the role of specific genes and their regulation in this developmental process is incomplete. Uncovering the function of the complete ensemble of genes involved in the differentiation and maturation of the cells that make up sorghum biomass creates the opportunity to manipulate its cellular composition, impacting its physical and chemical properties and, consequently, its value for bioenergy.

Researchers propose applying single-cell genome and transcriptome analysis of the sorghum shoot apex and stem to identify the function of genes involved in differentiating cells that determine biomass composition. Inferred cell lineage trajectories involved in the development of the cellular components of biomass will be explored to discover their regulators. The specific objectives are to (1) define the function of each gene (including specific members within gene families) with respect to the development of the main cell types that determine sorghum biomass and its cell-wall composition; (2) construct the cellular lineages that give rise to each cell type that composes biomass (from the shoot apical and vascular cambium meristem cells to cells in the stem), and identify genes and cis-regulatory elements that contribute to the lineage progression; (3) categorize the function of gene and associated cis-regulatory components for their relevance in the control of cellular lineages that lead to each cell type, and (4) validate multiple targets in isolation and in parallel, to confirm their role in biomass development and their potential for enhancing biomass yield and its properties.

Single-Cell Genomics of Poplar Wood Development | Kirst | University of Florida | Pereira | Bioenergy | University

(1) Uncover the cellular developmental program of woody biomass in the perennial bioenergy crop poplar by dissecting the cell lineages that originate in the vascular cambium. (2) Modify the regulatory programs that lead to the formation of the various cell types in poplar wood to achieve less recalcitrant biomass for bioenergy production.

The vascular cambium is responsible for the production of the secondary xylem, which comprises the most abundant form of biomass on Earth, wood. Despite its massive global importance, the genetic networks underlying the production of secondary xylem remain partially ambiguous. Differentiation of stem cells in the plant apex gives rise to aerial tissues and organs. These cells later differentiate to form the vascular cambium, from which secondary xylem is generated. Here the team used single-nuclei RNA sequencing (snRNA-seq) to determine cell-type specific transcriptomes of the Populus trichocarpa vegetative shoot apex and lignified stem to create a cell-type specific atlas of their tissues and uncover the regulators of cell lineage trajectories.

From P. trichocarpa shoot apex, researchers identified highly heterogeneous cell populations that clustered into seven broad groups represented by 18 transcriptionally distinct types (Fig. 1). Next, researchers established the developmental trajectories of the epidermis, leaf mesophyll, and vascular tissue. Motivated by the high similarities between Populus and Arabidopsis cell populations in the vegetative apex, researchers applied a pipeline for interspecific single-cell gene expression data integration. Researchers contrasted the developmental trajectories of primary phloem and xylem formation in both species, establishing the first comparison of vascular development between a model annual herbaceous and a woody perennial plant species. In addition to providing a cell atlas of the shoot apical meristem and its derived lineages, the results offer a valuable resource for investigating the principles underlying cell division and differentiation between herbaceous and perennial species.

In parallel to the Populus shoot apex analysis, researchers performed snRNA-seq on 11,673 nuclei derived from lignified stem to profile the transcriptome and create a single-cell atlas. Cell-type specific marker genes were utilized to identify 20 transcriptionally distinct cell clusters representing nearly all cell types within the sampled tissue. Reporter gene assays were carried out to confirm the cluster identity of vessel elements, fibers, ray parenchyma, cambial cells, and sub-cell type vessel-associated cells. Finally, a developmental trajectory analysis of cambial cells and their xylem-specific derivatives was carried out to identify lineages containing putative regulators related to vascular development and xylogenesis. This trajectory analysis identified putative regulators of the cell lineages that result in the formation of fibers and vessels. The functions of these regulators are being evaluated in knockout experiments, in which researchers will assess if cell lineages can be redirected toward developing specific cell types.

Scenario-Based Technoeconomic and Life-Cycle Analyses of Biomass Conversion Technologies | Keasling | Lawrence Berkeley National Laboratory | Simmons | Bioenergy | JBEI

The vision of the Joint BioEnergy Institute (JBEI) is that bioenergy crops can be converted into economically viable, carbon-neutral biofuels and renewable chemicals currently derived from petroleum, and many other bioproducts that cannot be efficiently produced from petroleum.

Cellulosic biofuels have not yet reached cost parity with conventional petroleum fuels. One of the central approaches used at JBEI to identify promising biofuels and bioproducts, and the conversion technologies capable of producing them, is combining technoeconomic and life-cycle analysis. At JBEI, researchers have developed scenario-based methodologies to conduct this work. This poster will present findings from three of these scenarios: (1) engineering bioenergy crops to generate value-added bioproducts in planta, which can reduce input requirements relative to microbial chassis and skip costly deconstruction and conversion steps (Yang et al. 2022; Yang et al. 2020); (2) detailed process configurations for production of the bioadvantaged sustainable aviation fuel dimethylcyclooctane (DMCO), used to estimate the minimum selling price and life-cycle greenhouse gas (GHG) footprint considering three different hydrogenation catalysts and two bioconversion pathways (Baral et al. 2021); and (3) evaluation of protic and aprotic ionic liquids for the deconstruction and conversion of mixed woody biomass feedstocks using a one-pot configuration (Achinivu et al. 2022). This work provides new insights into the tradeoffs, challenges, and opportunities that arise when the advanced biomass conversion technologies being developed at JBEI are realized at commercial scale.

Characterizing Lignocellulose Breakdown Mechanisms in Anaerobic Gut Fungi | Keasling | JBEI | Jin | Bioenergy | JBEI

The vision of the Joint BioEnergy Institute (JBEI) is that bioenergy crops can be converted into economically viable, carbon-neutral biofuels and renewable chemicals currently derived from petroleum, as well as many other bioproducts that cannot be efficiently produced from petroleum.

Lignocellulose is an attractive feedstock for renewable chemical manufacturing and bio-based chemical production, which reduces dependency on petroleum, but its recalcitrant structure requires multiple catalytic steps for sufficient hydrolysis, rendering current production technologies energetically and economically expensive. Anaerobic gut fungi (AGF) provide an attractive alternative strategy for biomass valorization by secreting a variety of carbohydrate-active enzymes (CAZymes) to efficiently degrade unpretreated biomass (Solomon et al. 2016). Many CAZymes are colocalized in fungal cellulosomes (enzyme complexes) that are thought to accelerate lignocellulose breakdown compared to the action of freely diffusing enzymes (Lillington et al. 2021). Anaerobic gut fungi control cellulosome compositions by regulating the production of biomass-degrading enzymes based on fungal life stage and the complexity of available substrates, as supported by advanced microscopy imaging and transcriptomic characterization (Solomon et al. 2016; Lillington et al. 2021). To further assess the differences in mechanistic properties among cellulosomes, researchers first developed an isolation technique to harvest these complexes from fungi grown on aqueous cellulose (cellobiose), fibrous cellulose (Whatman filter paper), and lignocellulose (reed canary grass), followed by protein purification via fast protein liquid chromatography. Researchers then used nanostructure-initiator mass spectrometry (NIMS) probes, which quantify oligosaccharide abundance in solution, to measure cellulase and hemicellulase activity based on probe hydrolysis (Deng et al. 2014). Amorphous cellulose probes, crystalline cellulose probes, and hemicellulose probes were used to detect any preference in degradation among enzyme complexes. Researchers observed that cellulosomes produced when growing fungi on lignocellulose were the most active and digested all three NIMS probes at a comparable level. Additionally, cellulosomes secreted by fungi grown on filter paper digested cellulose better than those from fungi grown on cellobiose but exhibited a strong preference for cellulose over hemicellulose. Currently, researchers are screening a library of 200+ synthesized genes from fungal cellulosomes to unmask their functions and roles in lignocellulose degradation via a combination of proteomics and NIMS characterization.

Additionally, two-dimensional heteronuclear single quantum coherence nuclear magnetic resonance (2D-HSQC-NMR) spectroscopy provides evidence of an up to 8% reduction in β-aryl ether linkages, a typical lignin bond, along with changes in lignin monomer composition, after incubating biomass (sorghum, switchgrass, and poplar) with two AGF strains, Neocallimastix californiae and Anaeromyces robustus. The same phenomenon is observed in lignin content analysis and gel permeation chromatography. This is the first time that lignin modification has been observed with high confidence under an anaerobic environment. Researchers are working on expanding the biomass library to include different types of softwood and hardwood, to develop an increased mechanistic understanding of anaerobic lignin modification in plants with different lignin compositions. Moreover, researchers are developing and applying lignin-based probes to characterize possible mechanisms of anaerobic lignin breakdown.

Understanding the Role of Lignin in Native Architecture of Engineered Plant Cell Walls via Multi-Dimensional Solid-State NMR | Keasling | JBEI | Gao | Bioenergy | JBEI

The vision of the Joint BioEnergy Institute (JBEI) is that bioenergy crops can be converted into economically viable, carbon-neutral biofuels and renewable chemicals currently derived from petroleum, as well as many other bioproducts that cannot be efficiently produced from petroleum.

As a major component (∼30%) of the plant secondary cell wall, lignin is a promising renewable feedstock for the production of platform chemicals due to its high aromaticity. However, the heterogeneity of the lignin structure leads to biomass recalcitrance, which significantly impedes the efficiency of total biomass conversion in a biorefinery context. Researchers propose that understanding how lignin interacts with the other major cell wall components (i.e., cellulose and hemicellulose) and contributes to the 3D cell wall nanoarchitecture will provide invaluable insights into the nature of biomass recalcitrance. This, in turn, will support the successful engineering of bioenergy crops with optimized biomass, which can be fully deconstructed and valorized into biobased products. These studies employ multi-dimensional solid-state nuclear magnetic resonance (ssNMR), including one-dimensional cross-polarization and direct polarization, two-dimensional refocused Incredible Natural Abundance Double Quantum Transfer Experiment (INADEQUATE) and Proton Driven Spin Diffusion (PDSD), and spin-lattice relaxation time measurements, to monitor the native polymer arrangements in the intact secondary cell walls of engineered plants with reduced recalcitrance traits. Here researchers will present data on the arrangement of cell wall polymers in model and crop species with reduced lignin. Researchers use this information to understand factors underlying cell wall properties, and to support predictive cell wall design and biomass deconstruction.

Evolutionary Engineering is a Versatile Strain Optimization Approach for Sustainable Bioproduction | Keasling | JBEI | Feist | Bioenergy | JBEI

The vision of the Joint BioEnergy Institute (JBEI) is that bioenergy crops can be converted into economically viable, carbon-neutral biofuels and renewable chemicals currently derived from petroleum, as well as many other bioproducts that cannot be efficiently produced from petroleum.

Strain engineering of microbes for bioproduction is a challenge and requires multiple optimization approaches and engineering cycles to reach commercial viability. An approach that is gaining utility in the Design-Build-Test-Learn (DBTL) cycle is adaptive laboratory evolution (ALE), as it can uniquely solve problems encountered in the strain engineering process using selection and the natural ability of microbes to adapt. A platform to effectively utilize ALE, built around custom automation, process control software, and bioinformatics pipelines, was developed and has been effectively applied to engineer a range of microorganisms. Specific use cases demonstrating how ALE can be used in different implementations of the DBTL cycle will be presented (Sandberg et al. 2019). First, an implementation where ALE is used after the Build step but before the Test step to engineer Pseudomonas putida to utilize non-native hemicellulose monomers is presented (Lim et al. 2021). Second, an implementation where aggregated ALE-generated mutations in the ALE database (ALEdb) are utilized in the Design step to introduce novel mutations for enhanced glycerol uptake is described (Phaneuf et al. 2021). The third implementation describes how ALE can replace the Design and Build steps to generate a strain with enhanced secretion of and tolerance to L-serine in one DBTL cycle, contributing to a commercially viable production strain (Mundhada et al. 2017). Finally, the presentation outlines what is likely achievable for ALE and laboratory automation in the short term and how the approach can be broadly applied to solve more problems in industrial bioproduction.

Domesticating Cyanobacteria Through Development of New Genome Engineering Tools, and Isolation of New Bio-Prospected Model Strains | Church | Harvard University | Schubert | Biosystems Design | University

Cyanobacteria are facile models of photosynthesis and chassis organisms for carbon-negative bioproduction. To help advance the development of cyanobacteria biotechnology, a team of researchers is both developing new genetic tools for cyanobacteria and bio-prospecting novel cyanobacteria model strains. The new genetic tools aim both toward genome-scale study of cyanobacterial genomes, including applications like recoding and directed evolution of bacterial genomes, and toward improved performance in bioproduction. Novel cyanobacterial model strains differ from common model strains in ways that further the understanding of, and improve, photosynthetic microbial bioproduction, further enabling the carbon-neutral bioeconomy.

Facile genome engineering tools further the ability to explore and apply biology, and previous work generating recoded and/or biocontained organisms, or performing directed evolution in bacteria, shows that recombineering is a critical tool (Mandell et al. 2015; Lajoie et al. 2013; Schubert et al. 2021). Recombineering is enabled by phage single-stranded DNA-annealing proteins (SSAPs), which both produce and stabilize single-stranded DNA recombination intermediates and recruit them to the replisome (Filsinger et al. 2021; Caldwell et al. 2019). These proteins are known to have host-specific activity, and finding SSAPs that function efficiently in a specific clade of bacteria is a first step toward improving tools for precision genome engineering (Filsinger et al. 2021; Wannier et al. 2020, 2021). Researchers have identified 22 candidate SSAPs, occurring within cyanobacteria or their phages, in both protein databases and metagenomic databases. These proteins could extend recombineering approaches that enable multiplex engineering, recoding, and genome-scale remodeling into cyanobacteria and help form a roadmap for identifying efficient SSAPs in new model microbes.

Recombineering is one genome engineering tool that is well suited to rational approaches. In contrast, transposon mutagenesis and Transposon Insertion Sequencing (TnSeq) have been successful at improving understanding of genomes through non-rational approaches and fast generation of large pools of genetic diversity (Gray et al. 2015). This project demonstrates that simple modifications to existing transposon mutagenesis procedures result in many random transposon insertions within cyanobacterial genomes. The resulting strains, with large numbers of inactivated and/or upregulated genes, suggest strategies for directed evolution and genome streamlining.

The project is also seeking to apply these technologies in unreported cyanobacterial model strains that could improve photosynthesis-based bioproduction. Indeed, various cyanobacterial model strains are used in the literature for these efforts as well as for studying the fundamental processes of photosynthesis (Goodchild-Michelman et al. 2023). Together with the Two Frontiers project (twofrontiers.org), which aims to explore microbial communities in extreme environments, including high CO2 environments, researchers have isolated two closely related cyanobacterial strains with promising growth phenotypes from seawater in the photic zone off the coast of Sicily. The two strains have ~99% nucleotide identity with each other, with 32,897 single nucleotide polymorphisms between them, and ~98% identity to the closest relative with an available genome sequence, Cyanobacterium aponinum PCC10605. Comparative genomics reveals that each novel strain possesses 50–60 unique genes that distinguish it from both the other strain and PCC10605, and 108 shared genes that distinguish the two strains from PCC10605. One novel strain remarkably grows to a higher density in batch growth than Synechococcus sp. PCC11901, which holds the published record for high-density batch biomass growth in cyanobacteria (Włodarczyk et al. 2020; Mills et al. 2022). Research shows that cells larger and denser than those of common model strains may improve the economics of dewatering cyanobacterial biomass, and thus of producing bioproducts. The second strain possesses unique characteristics such as programmed formation of large aggregates and phototactic motility. In sum, these strains obtained from a CO2-emitting volcanic vent are a promising new model for studies in cyanobacteria and possibly for photosynthetic bioproduction.

Defining the Influence of Environmental Stress on Bioenergy Feedstocks at Single-Cell Resolution | Cole | JGI | Dao | Bioenergy | Early Career

Biomass derived from plant feedstocks is a renewable and sustainable energy resource, but these resources are vulnerable to environmental stresses such as water and nutrient limitations. Understanding how cells work independently and in concert to regulate plant responses to stress will be crucial to improving their performance. The Early Career Research Project awarded to this group aims to apply several cutting-edge single-cell and spatially resolved transcriptome sequencing approaches to construct a comprehensive single-cell resource for plants and to better understand the complexity behind stress response among diverse cell types. To this end, researchers have profiled thousands of individual sorghum root cells grown under normal and phosphate-limited conditions. Preliminary results have indicated several genes whose expression is potentially altered by stress in a cell type–specific way, with genes in the vasculature being particularly affected. The team is currently integrating these nascent data with additional single-cell data from other species, including maize, Brachypodium, and switchgrass. Researchers are also characterizing environmental stress using other advanced profiling methods, including spatial transcriptomics and spatial metabolomics. The team hopes to build a multispecies model of cell type–specific stress responses that can be tested under agriculturally relevant conditions using the EcoPOD technology at Lawrence Berkeley National Laboratory, and eventually used to develop new targeted intervention strategies.

Expression of S-Adenosyl Methionine Hydrolase Modifies Lignin in Sorghum | Keasling | JBEI | Eudes | Bioenergy | JBEI

The vision of the Joint BioEnergy Institute (JBEI) is that bioenergy crops can be converted into economically viable, carbon-neutral biofuels and renewable chemicals currently derived from petroleum, as well as many other bioproducts that cannot be efficiently produced from petroleum.

Plant biomass represents a large renewable source of fermentable sugars for the synthesis of bioproducts. These sugars are stored as cell wall polymers, mainly cellulose and hemicellulose, and are embedded with lignin, which makes their enzymatic hydrolysis challenging. One of the strategies to reduce cell wall recalcitrance is the modification of lignin content and composition. Lignin is a phenolic polymer of methylated aromatic alcohols, and its synthesis in tissues developing secondary cell walls is a significant sink for the methyl donor S-adenosylmethionine (AdoMet) due to the involvement of methyltransferases. Researchers previously demonstrated in Arabidopsis that specific expression of S-adenosylmethionine hydrolase (AdoMetase, E.C. 3.3.1.2) in stem tissues reduces AdoMet content and impacts lignin biosynthesis (Eudes et al. 2016). This engineering approach was tested in the bioenergy crop sorghum (Sorghum bicolor L.). AdoMetase was expressed in sorghum lignifying tissues using the promoter of the caffeic acid O-methyltransferase gene. Both AdoMet content and lignin content are reduced in transgenics. 2D-HSQC NMR analysis of cell walls showed relative enrichment of non-methylated p-hydroxycinnamyl (H) units and a reduction of tricin (T), guaiacyl (G), and syringyl (S) lignin units in transgenics. Gel permeation chromatography revealed differences in the number-average molecular weight (Mn) and weight-average molecular weight (Mw) of lignin in AdoMetase lines. Quantification of cell wall-bound hydroxycinnamates showed a reduction of ferulate in all transgenic lines. These modifications in engineered sorghum reduce cell wall recalcitrance: higher yields of glucose and xylose were obtained after enzymatic saccharification of biomass compared to wild-type plants. Considering that some transgenic lines display no substantial reduction in biomass yield, this engineering approach provides a valuable option for the improvement of lignocellulosic biomass feedstock.
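For readers less familiar with the two averages reported from gel permeation chromatography, the following minimal sketch computes Mn and Mw from a discrete distribution using the textbook definitions Mn = Σ(N·M)/ΣN and Mw = Σ(N·M²)/Σ(N·M). The function name and the example distribution are hypothetical and are not data from this study; real GPC output is a calibrated detector trace rather than chain counts.

```python
def molecular_weight_averages(molar_masses, chain_counts):
    """Return (Mn, Mw, dispersity) for a discrete polymer distribution.

    molar_masses: molar mass of each fraction (g/mol)
    chain_counts: number of chains in each fraction (hypothetical input;
                  GPC instruments report detector signal, not counts)
    """
    if len(molar_masses) != len(chain_counts) or not molar_masses:
        raise ValueError("inputs must be non-empty and of equal length")
    total_chains = sum(chain_counts)
    total_mass = sum(n * m for n, m in zip(chain_counts, molar_masses))
    second_moment = sum(n * m * m for n, m in zip(chain_counts, molar_masses))
    mn = total_mass / total_chains      # number-average molecular weight
    mw = second_moment / total_mass     # weight-average molecular weight
    return mn, mw, mw / mn              # dispersity is always >= 1


# Illustrative (made-up) distribution: three fractions of lignin chains.
mn, mw, dispersity = molecular_weight_averages([1000.0, 2000.0, 4000.0], [3, 2, 1])
print(f"Mn = {mn:.1f} g/mol, Mw = {mw:.1f} g/mol, dispersity = {dispersity:.3f}")
```

Because Mw weights heavier chains more strongly than Mn does, a shift in one average without the other indicates a change in the shape of the distribution, not only in its mean.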

Development of a Quantum-Optimal Bioimaging System for Plant-Microbiome Interactions | Kasevich | Stanford University | Reynolds | Structural Biology
  • Design and develop a quantum-information-optimal multi-pass (MP) microscope
  • Demonstrate MP stimulated Raman scattering microscopy for high-sensitivity, label-free chemical imaging
  • Develop technologies for quantum-optimal quantitative phase imaging and extend this approach towards interaction-free measurements
  • Apply artificial intelligence/machine learning methods to establish Raman signatures of different bacterial and fungal species in different environmental conditions
  • Apply the Raman MP microscope to follow plant-bacteria interactions during infections of isolated plant cells and tissues
  • Use the Raman MP microscope to elucidate how individual bacterial species interact with each other under a changing biofilm environment
  • Develop microfluidic devices coupled with Raman MP microscopy for efficient separation and concentration of soil bacteria into single species
  • Perform single-cell geno- and phenotyping and cryo-electron tomography of purified cells

Researchers present progress towards the development of quantum-information-optimal multi-pass imaging technologies based on re-imaging optical systems (Juffmann et al. 2016). A multi-pass microscope interrogates a sample multiple times in a programmable and deterministic fashion. This leads to a metrological advantage for imaging weak scatterers. This enhanced sensitivity can yield a significant reduction in the damage imparted to the sample or can reduce image acquisition time. The approach can enter a quantum non-destructive regime where the photon interaction with the image target is fully coherent, and the imaging process becomes quantum non-destructive when conditioned upon the detection of single photons. Recent theoretical analysis has shown that this imaging approach saturates quantum information bounds and compares favorably with bounds obtained using squeezed and other entangled probe states, but avoids the technical complexity associated with the production of such states (Koppell and Kasevich 2022).

As proof-of-concept experiments, researchers will use these protocols for the study of microbe-microbe and microbe-plant interactions in multi-pass stimulated Raman microscopy (MP-SRS) configurations. These configurations will enable volumetric, chemically specific, imaging of thick samples. Furthermore, building on the demonstration of continuous-wave multi-pass flow cytometry (Israel et al. 2022), the Raman MP microscope will be integrated into extremely efficient, label-free microfluidic separators for isolating single species of soil bacteria. Researchers plan to design these quantum imaging technologies into compact and robust systems for shared use among the BER science community.

Development of High-Throughput Light-Sheet Fluorescence Lifetime Microscopy for 3D Functional Imaging of Metabolic Pathways in Plants and Microorganisms | Kasevich | Stanford University | Bowman | Structural Biology

The goal is to realize a high-speed lifetime imaging platform for light-sheet microscopy of metabolic pathways and plant-microbe interactions using electro-optic fluorescence lifetime microscopy (EO-FLIM; Bowman and Kasevich 2021; Bowman et al. 2019). Wide-field optical modulators allow efficient lifetime capture combined with low noise readout on standard scientific cameras. This system will find broad applications in plant imaging and will provide lifetime contrast using both fluorescent labels and endogenous autofluorescence. Multi-dimensional imaging optics enable lifetime multiplexing and unmixing of autofluorescent signatures. These optics allow simultaneous acquisition of space, polarization, nanosecond time, and wavelength.

Researchers have completed the design and construction of two custom microscopes for EO-FLIM imaging. The first microscope [Fig. 1(a, b)] allows multi-dimensional wide-field FLIM and will be expanded in the future for 2-photon lightsheet excitation. Its design includes two 40 MHz resonant Pockels cells and a compact multi-spectral unit (MSU) for simultaneous FLIM imaging in several spectral bands. The MSU is broadly applicable to any microscope, and in the system, it enables ongoing experiments with multi-label and autofluorescence unmixing. The microscope is equipped with a supercontinuum laser source for multi-band excitation and also a doubled Ti:Sapphire laser and pulse-picker for ultraviolet excitation. The second microscope (Fig. 1c) is a lightsheet platform for large field-of-view FLIM imaging. The resonant Pockels cell for this system is driven at 80 MHz and provides a 17 mm aperture for imaging. Imaging optics have been optimized and initial volume acquisitions are underway.

Critical to both systems is an efficient data analysis pipeline. Previous work has primarily focused on rapid single-frame lifetime estimation using a single camera exposure. By combining multiple exposures taken at different Pockels cell drive phases, it is also possible to extract multi-exponential information. Multi-phase EO-FLIM data is well suited to phasor analysis to study multi-exponential decays without fitting or histogram binning. Researchers are now implementing phasor analysis on EO-FLIM datasets to allow real-time display of lifetime data as it is acquired [Fig. 1(e, f)]. Phasor analysis will also enable multi-label unmixing and visualization of lifetime shifts from autofluorescent species upon binding to different substrates.
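The idea of extracting a phasor from several drive phases can be illustrated with a generic phase-stepped calculation. This is a sketch under stated assumptions, not the project's pipeline: it assumes an idealized sinusoidal gate with known modulation depth, equally spaced phase steps, and noise-free intensities, and all function names are hypothetical.

```python
import math


def single_exponential_phasor(tau_s, freq_hz):
    """Analytic phasor (g, s) of a mono-exponential decay at one frequency."""
    wt = 2.0 * math.pi * freq_hz * tau_s
    return 1.0 / (1.0 + wt * wt), wt / (1.0 + wt * wt)


def simulate_phase_steps(g, s, n_steps=4, mean_counts=1000.0, modulation_depth=1.0):
    """Idealized intensities at equally spaced gate phases (assumed model)."""
    phases = [2.0 * math.pi * k / n_steps for k in range(n_steps)]
    return [mean_counts * (1.0 + modulation_depth * (g * math.cos(p) + s * math.sin(p)))
            for p in phases]


def phasor_from_phase_steps(intensities, modulation_depth=1.0):
    """Recover (g, s) from K >= 3 equally spaced phase-stepped intensities."""
    n_steps = len(intensities)
    if n_steps < 3:
        raise ValueError("need at least three phase steps")
    phases = [2.0 * math.pi * k / n_steps for k in range(n_steps)]
    total = sum(intensities)
    g = 2.0 * sum(i * math.cos(p) for i, p in zip(intensities, phases)) / (modulation_depth * total)
    s = 2.0 * sum(i * math.sin(p) for i, p in zip(intensities, phases)) / (modulation_depth * total)
    return g, s


# Hypothetical example: 2.5 ns lifetime, 40 MHz modulation, four phase steps.
FREQ_HZ = 40e6
g_true, s_true = single_exponential_phasor(2.5e-9, FREQ_HZ)
g_est, s_est = phasor_from_phase_steps(simulate_phase_steps(g_true, s_true))
# For a mono-exponential decay, s / g equals omega * tau.
tau_est = s_est / (g_est * 2.0 * math.pi * FREQ_HZ)
print(f"g = {g_est:.4f}, s = {s_est:.4f}, tau = {tau_est * 1e9:.3f} ns")
```

No curve fitting is involved: each pixel maps to a point in the (g, s) plane, mono-exponential species fall on the universal semicircle, and mixtures fall inside it as intensity-weighted combinations of their components.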

Several biological samples have been developed for FLIM imaging. A collection of engineered Pseudomonas putida cells, each expressing a different fluorescent protein, has been generated to cover a large spectral range and will be used for unmixing FLIM signals from a population of bacterial cells containing different fluorescent proteins. The same strains are being used for phasor analysis of outer membrane vesicles. eGFP fused with tetraspanin Tet-8 serves as a marker of vesicles in Arabidopsis plants for live FLIM imaging of the root hair cells (Fig. 1d). Two carbon cycling enzymes, 4-hydroxybutyl-CoA dehydrogenase (4HBD) and enoyl-CoA reductase/carboxylase (ECR), as well as glucose-6-phosphate dehydrogenase (G6PD), have been expressed and purified with and without the autofluorescent molecules NADPH and/or FAD to establish FLIM and phasor signatures (Fig. 1e), which will be used for unmixing multidimensional live imaging of cells expressing these enzymes.

Researchers have also applied the EO-FLIM optics to kilohertz rate high-speed FLIM imaging of a FRET-based genetically encoded voltage sensor in Drosophila, enabling lifetime detection of action potentials in vivo. Lifetime readout significantly improves the signal-to-noise and stability of voltage recordings (Bowman et al. 2023). This work enables future directions for imaging dynamic signals throughout plants.

Exploring Switchgrass Genetic Diversity with Multiple Reference Genomes | Juenger | University of Texas–Austin | Schmutz | Bioenergy | University

Overall, researchers are striving to improve bioenergy feedstock production by understanding the genetic basis of plant-environment interactions. This goal includes testing for climate adaptation, modeling beneficial and stressful biotic interactions, and exploring the mechanisms of abiotic stress responses. During their work (Lowry et al. 2019; Lovell et al. 2021; Napier et al. 2022), researchers discovered a massive amount of physiological and molecular variation in switchgrass. While this diversity is the raw material that allows breeders to improve feedstock production, making use of this variation is very challenging: the immense DNA differences between some switchgrass genotypes mean that traditional methods to explore genetic diversity simply do not work. Under the work presented here, researchers are developing multiple genome resources that span this diversity to provide the foundation for molecular characterization of switchgrass biomass production, stress responses, and biotic interactions.

A single haploid reference genome gives breeders the resources to connect alleles to traits, a significant step towards accelerating crop improvement. However, breeding programs often leverage highly diverged germplasm, which contains large-scale variants that are not readily identified with a single reference genome. For example, in switchgrass, the fast-growing southern lowland AP13 genotype (which serves as the reference genome; Lovell et al. 2021) is ~1 million years diverged from the cold-tolerant northern upland gene pool. To assist breeding and gene discovery efforts, researchers have developed 16 total reference genome haplotypes from eight outbred heterozygous genotypes that span the genetic diversity of switchgrass, including a new annotated, haplotype-resolved AP13. These fully de novo chromosome-scale genomes include three northern uplands, three southern lowlands, and two spanning the latitudinal range of the newly discovered coastal ecotype. Here, researchers present the progress on these genomes and an analysis of structural variation, including the presence of a putative minor chromosome that appears to segregate in coastal switchgrass populations. These variants can serve as a priori targets for ongoing molecular breeding efforts to make switchgrass a more economically and ecologically sustainable biofuel feedstock.

The Influence of Switchgrass Internode Anatomy on Biofuel Production-Relevant Traits | Juenger | University of Texas–Austin | Bartley | Bioenergy | University

This project aims to understand the environmental and genetic influences on switchgrass composition towards increasing sustainability of switchgrass production for biorefining by developing generalist and specialist plant ideotypes that maximize biomass yield and composition, stress tolerance, and carbon sequestration capacity.

Switchgrass (Panicum virgatum L.) is a perennial warm-season tallgrass and a promising feedstock for production of lignocellulosic biofuels. Local environmental conditions produce phenotypic variation across the geographical range of switchgrass. Researchers hypothesized that internode anatomy plasticity, representing evolutionarily driven local adaptation, modulates traits important for efficient biofuel production, including biomass yield (i.e., height), hydraulic conductivity, and biomass digestibility. Researchers analyzed the anatomy of the lowest above-ground internodes in clones of upland (VS16, DAC) and lowland (AP13, WBC) switchgrass genotypes at three common garden sites in south Texas, central Missouri, and central Michigan. Lowlands are larger in many traits, including height, average xylem diameter, and internode annulus radius. A few traits, such as cortical cell wall thickness (CCWT) and sclerenchyma radial thickness, deviate from this pattern, lack isometry with height or internode diameter, and rank differently among genotypes across sites. Tillers with larger total leaf area, height, outer diameter, annulus radius, and average xylem diameter had larger maximum stem-specific hydraulic conductance (KS). Additionally, sheath radial thickness positively correlates with KS (SCC = 0.94), while CCWT and rind % annulus radius negatively correlate with KS (SCC = -0.43 and -0.47, respectively). Comparing different digestion treatment parameters revealed that CCWT negatively correlates with biomass digestibility (R = -0.52), while no significant relationships were found with lignin or cellulose content. Thus, the significant anatomical plasticity in biofuel production-relevant traits present among genotypes across environments supports optimizing internode anatomy to favor cell wall deconstruction efficiency for lignocellulosic biorefining.
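The abstract does not expand "SCC"; assuming it denotes the Spearman rank correlation coefficient, the statistic can be sketched as a Pearson correlation computed on ranks, with tied values given their average rank. The function names and the example measurements below are hypothetical and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up trait values for five tillers (illustration only).
sheath_thickness = [1.0, 2.0, 3.0, 4.0, 5.0]
conductance = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman(sheath_thickness, conductance)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based coefficient only asks whether two traits rise and fall together, so it tolerates the nonlinear scaling that is common between anatomical dimensions and hydraulic conductance.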

Structural Biology Center at Sector 19 of Advanced Photon Source | Joachimiak | Argonne National Laboratory | Joachimiak | Structural Biology
  1. Provide an integrated x-ray macromolecular structural biology platform and advanced user support at the Advanced Photon Source for BER and general users.
  2. Development and implementation of new methods and applications in macromolecular.
  3. Support diverse user outreach and training.

The Structural Biology Center (SBC) at Argonne National Laboratory operated insertion device (ID) and bending magnet (BM) beamlines at Sector 19 of the Advanced Photon Source (APS) as a user facility for macromolecular crystallography beginning in 1997. The facility was funded by the Department of Energy Office of Biological and Environmental Research. The 19ID undulator beamline was designed and built to take full advantage of the high flux, brightness, and quality of x-ray beams delivered by the APS and was the first macromolecular crystallography facility open to users. 19BM was added to the user program in 1999. These two beamlines delivered small x-ray beams of very low angular divergence onto micrometer-size crystal samples, thereby permitting studies of large and complex molecular systems at atomic resolution. The high flexibility inherent to the optics design, coupled with a kappa-geometry goniometer and beamline control software, enabled development of optimal strategies for protein crystallographic experiments, thus maximizing the chances of their success. Large-area detectors allowed high-quality diffraction data to be measured rapidly to the crystal diffraction limits.

Users collected data on site or remotely, and data were processed and structures determined with advanced software in near real time. Many users had limited synchrotron or crystallography expertise, and SBC staff provided extensive training and support. The facility offered a flexible schedule on one of the most efficient data collection and structure determination platforms for protein crystallography and demonstrated high productivity (4,966 PDB deposits from 19ID, 1,408 from 19BM, and 2,796 peer-reviewed publications).

The SBC promoted scientific and technological innovation in support of the DOE mission by providing a world-class macromolecular crystallography facility to BER and the biology research community. The SBC exploited major advances in macromolecular x-ray crystallography and addressed the most challenging structural biology problems to expand scientific knowledge. The SBC was an important component in structural biology innovation and in structural genomics, metagenomics, proteomics, and genomics research, with a major focus on systems biology, bio-nanomachines, medicine, and bio-catalysis. These fields are highly relevant to bioenergy resources, health, national security, and a clean, sustainable environment. More recently, the SBC has been contributing to serial crystallography, a high-performance computing pipeline for data analysis, and SARS-CoV-2 research.

Engineered Overlapping Genes Paired with Directed Evolution Prolongs the Evolutionary Stability of a Genetic Circuit | Park | Lawrence Livermore National Laboratory | Park | Environmental Microbiome | Secure Biosystems Design

A primary goal of this Science Focus Area (SFA) project is to establish genetic sequence entanglement as a generalizable biocontainment strategy to protect engineered functions against mutational inactivation and to mitigate the horizontal transfer of invasive genes. Sequence entanglement was inspired by overlapping genes found in many viral genomes and involves the synthetic encoding (entangling) of two genes within the same DNA sequence space through use of alternative reading frames. As such, mutations within the entangled region likely impact the function of both genes, providing a mechanism to constrain the allowable mutational space. The team thus hypothesizes that by entangling a gene-of-interest (GOI) with an essential gene, the evolution of the GOI can be constrained, because mutations in the GOI frame become non-permissible when they are deleterious in the frame encoding the essential gene. As a proof-of-concept, researchers assessed the utility of an entanglement design in which a toxin is entangled with an essential gene to improve the genetic stability of a kill-switch circuit.

The development of synthetic biological circuits that maintain functionality over application-relevant timescales remains a significant challenge. Synthetic circuits are often burdensome to cellular fitness and are subject to evolutionary pressures, which select for mutated and non-functional circuits. In this study, researchers employed a gene overlap technique called synthetic sequence entanglement, in which one protein is encoded entirely within an alternate reading frame of another gene, to enhance the sequence stability of a burdensome engineered genetic circuit. Specifically, the toxin-encoding relE gene was entangled within ilvA, which encodes threonine deaminase, an enzyme essential for isoleucine biosynthesis. This pairing makes it possible to test whether an essential function (isoleucine biosynthesis) can increase the mutational robustness of a gene prone to mutational inactivation (relE).

Starting from a partially functional entanglement design in which significant missense mutations (~79% of entangled residues) were introduced within the ilvA sequence to accommodate a wild-type amino acid sequence for RelE, the team made targeted modifications of an internal ribosome binding site that simultaneously enhanced the expression of the RelE toxin and the function of IlvA. Using this optimized design, researchers show that entanglement of relE with ilvA significantly increased the evolutionary stability of the toxic relE gene, which retained function for >130 generations. This stabilizing effect was achieved through a complete alteration of the allowable mutational landscape such that mutations inactivating both entangled gene products were disfavored. Instead, small deletions, insertions, and point mutations accumulated within the regulatory region of ilvA for the majority of lineages. By reducing baseline relE expression, these more benign mutations lowered circuit burden, which suppressed the accumulation of relE inactivating mutations, thereby prolonging kill-switch function. Overall, this work demonstrates the utility of sequence entanglement to increase the evolutionary stability of burdensome synthetic circuits.

Engineering Overlapping Genes in Bacteria | Jiao | Lawrence Livermore National Laboratory | Jiao | Environmental Microbiome | Secure Biosystems Design

A primary goal of the Science Focus Area (SFA) is to establish genetic sequence entanglement (in which two genes are encoded within the same DNA sequence through use of alternative reading frames) as a generalizable biocontainment strategy to protect engineered functions against mutational inactivation and to mitigate the horizontal transfer of invasive genes. Achieving sequence entanglement remains a significant challenge due to sequence constraints that necessitate large-scale redesign of the entangled proteins. Through design-build-test-learn (DBTL) iterations using the Constraining Adaptive Mutations using Engineered Overlapping Sequences, eXtended (CAMEOX) algorithm (Blazejewski, Ho, and Wang 2019), high-throughput functional assays, and state-of-the-art machine learning and protein structural prediction algorithms, researchers aim to improve the accuracy of entanglement designs and expand the application to a broad range of microbes.

To generate an initial data set for model training and development, researchers initiated a DBTL campaign of an entanglement pair composed of infA and aroB. infA encodes the translation initiation factor 1 (72 AA) that is essential for growth, and aroB encodes 3-dehydroquinate synthase (362 AA) that is required for aromatic amino acid biosynthesis. CAMEOX was used to generate 130,000 entanglement designs, of which 2,000 were selected for experimental testing. The functionality of infA and aroB was assayed separately through selection. The results revealed that ~10-30% of aroB variants and ~25% of infA variants were highly enriched in the surviving population, indicative of protein function. Researchers found that the gene fitness scoring metric generated by CAMEOX, the pseudolikelihood score, correlates well with the experimental enrichment scores, confirming the pseudolikelihood score as a reliable indicator of protein fitness. Combining the results from both assays for infA and aroB, researchers identified 14 variants with high enrichment values for both genes; these are being tested for functionality within the entangled context.

Using the experimentally measured fitness data for infA and aroB, the team trained random forest (RF) classifiers based on amino acid composition and used the model to predict the fitness of CAMEOX-designed entanglement solutions. Researchers found that simple classifiers such as the frequency of certain amino acids were able to predict variant fitness with high accuracy, even when a small number of measured variants was used in the training set. Using these RF models, researchers screened the complete set of 130,000 infA/aroB CAMEOX designs and identified 29 with potential functionality for both genes. Experimental testing of these variants is underway.
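
As a rough illustration of the kind of input such classifiers take, the following Python sketch computes an amino acid composition feature vector. The function name `aa_composition`, the feature ordering, and the example sequence are assumptions made here; the abstract does not describe the team's actual feature pipeline.

```python
# The 20 standard amino acids, in a fixed (alphabetical) feature order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(protein):
    """Return the frequency of each standard amino acid as a 20-element list."""
    protein = protein.upper()
    counts = [protein.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [n / total for n in counts]


# Hypothetical variant sequence, used only to show the output shape.
features = aa_composition("MAKEDNIEMQGTV")
print(dict(zip(AMINO_ACIDS, features)))
```

One such vector per variant, paired with its measured functional/non-functional label, could then be passed to any random forest implementation (for example, scikit-learn's `RandomForestClassifier`).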

In addition to RF models, researchers have leveraged AlphaFold to rank CAMEOX variants according to how foldable their structures are. Relying on AFRank (Roney and Ovchinnikov 2022), the team predicted the structures of CAMEOX variants and used the predicted confidence metrics as a proxy for variant fitness. This approach complements sequence-based screening methods for selecting the best variants in silico for experimental testing. Besides improving variant ranking, researchers have also modified the algorithm to expand the diversity of proposed CAMEOX solutions. Researchers have developed gradient-based Markov Chain Monte Carlo (MCMC) methods for designing the entangled nucleotide sequences. This new optimization protocol generally improves over the previous greedy optimization algorithms and enables the generation of more diverse sequences with better fitness scores.
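
To show how a sampling-based search differs from greedy optimization, here is a minimal Python sketch of a plain Metropolis sampler over nucleotide sequences. It is not the gradient-based MCMC method described above and does not use CAMEOX scores: `toy_score` (GC fraction), the temperature, and the step count are placeholders chosen only for illustration.

```python
import math
import random

ALPHABET = "ACGT"


def toy_score(seq):
    """Placeholder fitness: GC fraction (stands in for a real design score)."""
    return sum(base in "GC" for base in seq) / len(seq)


def metropolis_search(start, score, steps=500, temperature=0.05, seed=0):
    """Single-site Metropolis sampling; returns the best sequence visited."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        pos = rng.randrange(len(current))
        proposal = current[:pos] + rng.choice(ALPHABET) + current[pos + 1:]
        proposal_score = score(proposal)
        delta = proposal_score - current_score
        # Improvements are always accepted; worse moves are accepted with
        # probability exp(delta / T), which a greedy search would never do.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = proposal, proposal_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


best_seq, best_val = metropolis_search("ATATATATAT", toy_score)
print(best_seq, best_val)
```

Occasionally accepting a worse proposal lets the chain leave local optima and visit a wider range of sequences than a greedy search that accepts only improvements.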

In addition to engineering entanglement for specific gene pairs, the team seeks to comprehensively assess entanglement feasibility for a wide array of gene pairs to identify features of DNA/protein sequences that make genes more amenable to co-encoding. Leveraging improved speed and automation of CAMEOX, the team undertook a campaign to computationally generate entanglement designs for nearly all conditionally essential genes in Escherichia coli (94) by entangling them with one another. An additional 24 genes of interest (positive controls with naturally entangled phiX174 genes, reporter genes that allow for quantitative phenotypic characterization, antibiotic resistance cassettes) were also included. This in silico campaign yielded more than 8 million pairwise entanglement solutions in both +1 and +2 reading frames. Using custom evolutionary models of the parental proteins, the team developed a scoring rubric for CAMEOX designs that makes it possible to quantify and compare the entangle-ability of individual genes as well as the compatibility of gene pairs. By generating and testing specific designs, the team found CAMEOX can successfully design functional protein variants with low sequence conservation (<50% identity) to their natural orthologs. Furthermore, researchers have identified specific features of proteins that correlate with better entanglement outcomes, such as enrichment for amino acids with a higher degree of codon degeneracy, a property also observed with naturally occurring entangled genes.
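
The codon degeneracy feature mentioned above can be computed directly from the standard genetic code. The sketch below is one simple way to summarize it for a protein; the function name and the choice of the mean as a summary statistic are assumptions, not the team's scoring rubric.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons per amino acid (e.g., Leu = 6, Met = 1).
DEGENERACY = Counter(CODON_TABLE.values())


def mean_codon_degeneracy(protein):
    """Average number of synonymous codons across the residues of `protein`."""
    return sum(DEGENERACY[aa] for aa in protein) / len(protein)


print(mean_codon_degeneracy("MW"))   # residues with a single codon each
print(mean_codon_degeneracy("LSR"))  # residues with six codons each
```

A higher value means more synonymous choices per residue, leaving more freedom to satisfy the second reading frame.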

Developing Genome Engineering Technologies in Cupriavidus necator for Carbon-Negative Biomanufacturing | Isaacs | Yale University | Nakamura | Biosystems Design | University

The accelerating climate crisis combined with rapid population growth poses some of the most urgent challenges to humankind. A major contributing factor to this crisis is the unabated release and accumulation of CO2 across the biosphere. Researchers can take advantage of this abundance of available CO2 to transform the way the world produces and uses carbon by engineering CO2-fixing biosystems to produce commodity fuels and chemicals. Engineering efforts in CO2-fixing organisms are currently limited by an incomplete understanding of genotype-phenotype relationships and by inefficient genome engineering tools for discerning these relationships. To address such issues, researchers have developed an integrated computational and experimental workflow, computer-aided design of synthetic genetic elements (CAD-SGE), to redesign multigene biological pathways for mobilization, expression, and characterization in versatile organisms (Patel et al. 2022). Heterologous expression and characterization of large pathways (e.g., the tyrocitabine and violacein pathways) have been demonstrated using this CAD-SGE technology in diverse prokaryotes and eukaryotes, including Escherichia coli, Pseudomonas putida, Klebsiella aerogenes, Salmonella enterica, and Saccharomyces cerevisiae. Here, researchers demonstrate the expansion of the landing pad-based mobilization strategy into Cupriavidus necator, an aerobic autotroph that utilizes the Calvin cycle. Genomic integration of the landing pad will allow streamlined integration and expression of large heterologous genetic elements in C. necator. By implementing the CAD-SGE technology in C. necator and other CO2-fixing organisms, researchers hope to improve the understanding of CO2-utilizing microbes and contribute to the development of sustainable and carbon-neutral technologies.

A Cell-Free Protein Evolution Platform | Jewett | Northwestern University | Landwehr | Biosystems Design | University

The biosynthesis of high-value sustainable chemicals is a major goal of synthetic biology and is important for securing the energy future. For many important molecular biotransformations, efficient enzymes have yet to be discovered or engineered. High-throughput methods to prepare enzyme mutants and measure their activity, as well as to evaluate their engineering potential and transform them into industrial tools, are important to overcoming this bottleneck. Researchers addressed this challenge by developing a cell-free DNA assembly and protein synthesis platform that allows rapid screening of thousands of sequence-defined enzyme mutants in iterative design-build-test-learn cycles. These rich datasets are also amenable to machine learning algorithms that try to capture the sequence-fitness landscape. As a model, researchers demonstrate the utility of the platform by engineering a formyl-CoA synthetase for the activation of formate, a transformation that does not readily occur in nature and has implications for building synthetic carbon-fixing metabolic pathways. To design, researchers used molecular structure and sequence homology to guide selection of amino acid residues for creating new-to-nature enzymes that catalyze the desired chemistry. To build, researchers used site saturation mutagenesis to create a DNA library of enzyme variants containing mutations at the identified residues. To test, researchers developed a visual detection scheme to bypass the need for traditional chromatography-based analytics. The team anticipates that the work engineering a formyl-CoA synthetase will serve as a blueprint for future efforts aimed at the maturation of a diverse array of biocatalysts.
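
The "build" step can be pictured with a short Python sketch that enumerates a site saturation library at the protein level: every chosen position is replaced by each of the 19 other standard amino acids. The parent sequence, positions, and function name are hypothetical, and the sketch lists protein variants rather than the DNA constructs the platform actually assembles.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_saturation_variants(protein, positions):
    """Map mutation labels (e.g. 'K2A') to single-mutant protein sequences.

    `positions` are 1-based residue numbers selected for saturation.
    """
    variants = {}
    for pos in positions:
        wild_type = protein[pos - 1]
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                label = f"{wild_type}{pos}{aa}"
                variants[label] = protein[:pos - 1] + aa + protein[pos:]
    return variants


# Hypothetical 5-residue parent sequence with two saturated positions.
library = site_saturation_variants("MKTAY", [2, 4])
print(len(library), "variants, e.g.", library["K2A"])
```

Each saturated position contributes 19 single mutants, so the library size grows linearly with the number of positions when mutations are not combined.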

Plant Methanol Emission at the Interface of the Photosynthetic C1 Pathway, Leaf Water Status, and Growth | Jardine | Lawrence Berkeley National Laboratory | Jardine | Bioenergy | Early Career

The Poplar Esterified Cell Wall Transformations and metabolic INtegration (PECTIN) project aims to study the metabolism of cell wall ester modifications and volatile intermediates, and their role in central physiological processes in the emerging biofuel species California poplar (Populus trichocarpa). A key goal of this research is to evaluate abiotic stress responses in plants with modified expression patterns of key genes involved in cell wall metabolism with altered amounts of methyl and acetyl groups present on cell walls. These genetic modifications will be evaluated for potential impacts on plant hydraulics, physiology, and stress responses. Understanding and manipulating the metabolism of cell wall modifications will not only provide important knowledge on the physiology and ecology of plants but will also allow the generation of engineered bioenergy crops such as poplar for sustainable production of biofuels and bioproducts, addressing BER’s goal of developing renewable bioenergy resources.

While previously considered a byproduct of growth, the release of methanol from methylated pectin in primary cell walls of plants dramatically alters their elasticity, a critical parameter controlling initiation and propagation of tissue morphogenesis and growth. Given the large reservoir of methylated pectin in leaves and other plant tissues associated with the primary cell wall, leaf methanol emissions are assumed to derive from this large, stored carbon reserve through light-independent, temperature-driven growth processes, and therefore to have no direct metabolic connection with photosynthesis. In this study, researchers use 13CO2-labeling to demonstrate that methyl esters on primary cell walls of leaves of C3 plants are produced directly from photosynthetically linked C1 metabolism that is unrelated to photorespiration, within minutes of light exposure, through a proposed series of connected intracellular and extracellular pathways. This occurs in parallel with methanol release from the primary cell wall during temperature-stimulated growth processes, which are constrained by midday leaf water stress. Upon illumination of individual leaves and branches, 13C/12C-methanol emission ratios continuously increased during photosynthesis under an elevated 13CO2 atmosphere (500-1000 ppm). Dynamic branch 13CO2 labeling of photosynthesis lasting 2.5 days showed daily increases of 13C/12C-methanol emission ratios peaking at the end of the light period, with 13C-methanol emissions gradually increasing at the expense of 12C-methanol and 13C/12C-methanol ratios reaching up to 50%. In the dark, branch 13C/12C-methanol emission ratios remained constant despite strong night-time 13C-methanol emission dynamics that mimicked 12C-methanol emissions. At midnight, when the leaf water potential recovered (-0.2 +/- 0.1 MPa) from midday values (-1.0 +/- 0.1 MPa), branch emissions of both 12C-methanol and 13C-methanol increased steadily throughout the night, despite leaf temperature and transpiration slowly decreasing. The results are consistent with a distinction between biosynthesis and incorporation of photosynthetically derived C1 carbon into leaf primary cell walls and methanol production during growth. An accelerated growth phase occurs between midnight and midday, when growth and methanol emissions increased with temperature, and a decelerated growth phase occurs between midday and midnight, when growth and methanol emissions decreased with temperature, constrained by leaf water stress. The results provide evidence for a rapid and direct connection between photosynthesis and photorespiration-independent C1 metabolism; a temperature-independent hydraulic control over methanol emissions and growth rates at night; and a temperature-stimulated (morning), followed by inhibited (afternoon), methanol emission and growth during the day.

The observations are consistent with a biochemical model integrating CO2 fixation by the Calvin cycle with the C1 pathway involving: 1) Photosynthesis, 2) The phosphorylated serine pathway, which synthesizes the donor methyl group of methionine in the chloroplast, 3) Export of methionine to the cytosol followed by activation to S-adenosylmethionine (AdoMet), which is imported into many organelles and used to transfer methyl groups to polysaccharides, nucleic acids, proteins, lipids, and secondary metabolites, 4) Methyl esterification of new pectin monomers in the Golgi with AdoMet, 5) Transport, export, and incorporation of the newly synthesized highly methyl esterified pectin into the growing primary cell wall, and 6) Methanol production during growth associated with pectin demethylation during the day and night. These observations are consistent with the emerging view of methanol emissions as a chemical signal of leaf growth primarily occurring at night and early morning, due to reduced leaf water stress. The results are also consistent with a critical role of methanol production linked to diurnal changes in primary cell wall elasticity associated with pectin demethylation and growth. Although the rise in atmospheric CO2 inhibits major metabolic pathways like photorespiration and the isoprenoid pathway, the photorespiration-independent photosynthetic C1 pathway may accelerate. Thus, photosynthetic production of AdoMet may play a critical, yet poorly understood, role in enhancing growth rates of plants and net primary productivity of ecosystems during terrestrial CO2 fertilization.

AI/ML for Bioenergy Research in CABBI | Leakey | CABBI | Zhao | Bioenergy | CABBI

The Center for Advanced Bioenergy and Bioproducts Innovation (CABBI) aims to develop and apply a wide variety of artificial intelligence (AI)/machine learning (ML) tools for bioenergy research, ranging from plant growth to biosystems design to process development to sustainability studies.

Thanks to recent advances in genomics, data science, and automation, there is a growing trend in developing and applying AI/ML tools for bioenergy research. In the past few years, CABBI has made much progress in this emerging research area and is now poised to take the lead in developing it into a mature field. In this poster, researchers will highlight a few representative AI/ML projects in CABBI. These examples include: (1) leveraging AI/ML to predict end-of-season yield and phenotype for sorghum and miscanthus from an aerial imagery time series (Leakey lab), (2) designing gRNA for CRISPR-based genetic engineering of plants (Hudson lab), (3) developing a self-driving biofoundry for biosystem design (Zhao/Sweedler labs), (4) enabling single nucleotide polymorphism genotyping of Issatchenkia orientalis by AI/ML (Yoshikuni lab), and (5) using AI/ML and satellite data to predict miscanthus yields in research plots and commercial fields (VanLoocke lab).

From Strand Design Principles to SNP Detection—Probing Oligonucleotide Hybridization at the Single Molecule Level | Church | Harvard Medical School | Stein | Biosystems Design | University

Advance understanding of oligonucleotide hybridization at the single molecule level in order to maximize detection sensitivity and throughput in DNA-based highly multiplexed fluorescence microscopy applications.

Recent advances in DNA nanotechnology have become a major driver of highly multiplexed and super-resolution fluorescence microscopy (Beliveau et al. 2012; Porreca et al. 2007; Boyle et al. 2011). Clever sequential schemes of in situ sequencing and barcoding, together with libraries of Oligopaint (oligonucleotide-based primary hybridization) probes (targeting DNA and RNA) and DNA-functionalized antibodies (targeting proteins), can provide visual access to a vast number of targets in the genome, transcriptome, or proteome with subcellular resolution (Beliveau et al. 2012; Larsson, Frisén, and Lundeberg 2021; Bouwman, Crosetto, and Bienko 2022; Zhuang 2021; Jerković and Cavalli 2021; Hickey et al. 2022). Signal amplification strategies such as rolling circle amplification (Lee et al. 2014), linear appending (Kishi et al. 2019), or a minimum number of hybridization probes per RNA/DNA target (Wang et al. 2016) allow robust detection yet come at the cost of resolution and limited throughput. This work sets out to establish a simple single-molecule approach to assay the absolute efficiency of successful 1:1 probe hybridization events, given a known number of surface-immobilized DNA origami each carrying just a single copy of the complementary target site. Researchers further quantify the influences of oligo purification, directionality, and length of single-stranded overhang and derive design principles that maximize target hybridization and, hence, minimize the need for signal amplification. In a final proof-of-concept, the team demonstrates the versatility of this approach by quantifying single nucleotide polymorphism detection efficiencies.
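
Because each origami carries exactly one target site, the absolute efficiency reduces to a simple proportion: origami with a bound probe divided by all origami counted. The Python sketch below computes that fraction with a Wilson score interval; the counts are invented, and the use of a Wilson interval is an assumption made here rather than the analysis used in the study.

```python
import math


def hybridization_efficiency(n_hybridized, n_targets, z=1.96):
    """Return (efficiency, lower, upper) using a Wilson score interval.

    n_hybridized: origami on which a probe was detected
    n_targets:    all origami counted (one target site each)
    z:            normal quantile (1.96 for roughly 95% coverage)
    """
    if n_targets <= 0 or not 0 <= n_hybridized <= n_targets:
        raise ValueError("counts must satisfy 0 <= n_hybridized <= n_targets")
    p = n_hybridized / n_targets
    denom = 1 + z ** 2 / n_targets
    center = (p + z ** 2 / (2 * n_targets)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / n_targets + z ** 2 / (4 * n_targets ** 2)
    ) / denom
    return p, center - half_width, center + half_width


# Hypothetical counts from one field of view.
efficiency, lower, upper = hybridization_efficiency(80, 100)
print(f"efficiency = {efficiency:.2f} ({lower:.2f} to {upper:.2f})")
```

Reporting an interval alongside the point estimate makes comparisons between probe designs less sensitive to how many origami were counted in each condition.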

Novel Systems Approach for Rational Engineering of Robust Microbial Metabolic Pathways | Jarboe | Iowa State University | Jarboe | Biosystems Design | University

The goal of this project is to develop and implement a process for improving bioproduction under conditions that are appealing for industrial processes, such as high temperature and low pH. This approach addresses the failure of metabolic reactions due to inhibition, denaturation, misfolding or disorder of enzymes. Researchers will develop and implement a framework that identifies these enzymes and then identifies their robust replacement enzymes. The engineering strategy of replacing enzymes to improve bio-production is well-established, but rarely applied to system-wide stressors. Researchers apply a systems genomics approach to improve bioproduction, with Escherichia coli as the model organism. Butanol production at high temperature and succinate production at low pH are the model systems. This approach is complementary to improvement of microbial robustness by engineering the cell membrane and has advantages relative to evolutionary-based organism improvement by prioritizing bioproduction rather than growth.

Temperature Tolerance: Given the increase in thermostability data available from recent proteomics studies, researchers are taking a closer look at which enzymes are limiting growth at higher temperatures using metabolic models. While the melting temperature of an enzyme is useful, knowledge of the range of temperatures a given enzyme can withstand and remain active can help provide a more complete picture of an organism’s sensitivity to temperature. Researchers are using embeddings from protein language models such as esm2 to predict enzyme temperature optimum as well as the melting temperature when not available and have found improvements over previous methods (e.g., ProTstab2, DeepET). Researchers are exploring ways to improve predictive models such as by integrating protein structures with embeddings from the language models. Growth-limiting enzymes in response to increasing temperature in E. coli have been previously identified through the use of genome-scale models (Chang et al. Science 2013). This prior analysis relied on estimated values for many of the key enzyme-specific temperature parameters. In the years since this original study, there have been multiple proteomic studies of enzyme melting temperature (Tm) and researchers have used this data, along with literature reports of enzyme assays and the ProTstab2 Tm predictors (Yang et al. BMC Genomics, 2019) to compile an updated set of E. coli Tm values for use in assessment of enzyme thermosensitivity and prioritizing enzymes for experimental assessment.

Acid Tolerance: The pH tolerance efforts have prioritized modeling the effect of pH on the allocation of cellular resources. In the current version of the model, researchers assume that the cell interior is maintained at near-neutral pH by the energetic investment in proton extrusion via ATP. Researchers have completed a two-stage optimization model. In the first stage, cell growth is maximized subject to thermodynamic constraints and a limit on total available enzyme protein. The flux distributions associated with this maximum growth are then used in a minimization of overall enzyme protein cost. This model predicts a sharp decrease in specific growth rate when the extracellular pH drops below 4.9. Next, researchers investigated the thermodynamic and kinetic properties of individual enzymes in the model. Generally, low pH leads to additional enzyme protein cost to maintain thermodynamic feasibility and reaction rate, represented by the thermodynamic term and kinetic term, respectively (Figure 2). Interestingly, the glucose transporter and glycolysis enzymes account for the majority of those with the highest enzyme protein requirement, especially at low pH.
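
One way to write down the two-stage structure described above is sketched below in LaTeX. All symbols are assumptions introduced here for illustration (the abstract does not give the model's notation or exact constraints): S is the stoichiometric matrix, v the flux vector, e_j the protein allocated to enzyme j, E_total the enzyme protein budget, and the two eta factors stand for the thermodynamic and kinetic terms.

```latex
% Stage 1: maximize the specific growth rate under thermodynamic
% and total enzyme protein constraints.
\mu^{*} = \max_{v,\,e}\ \mu
\quad \text{s.t.} \quad
S v = 0, \qquad
\Delta_r G'_j(\mathrm{pH}) < 0 \ \text{ whenever } v_j > 0, \qquad
\sum_j e_j \le E_{\text{total}}

% Stage 2: fix the fluxes v^{*} found in stage 1 and minimize
% the total enzyme protein needed to carry them.
\min_{e}\ \sum_j e_j
\quad \text{s.t.} \quad
e_j \ \ge\ \frac{v^{*}_j}{k_{\text{cat},j}\,
  \eta^{\text{th}}_j(\mathrm{pH})\,
  \eta^{\text{kin}}_j(\mathrm{pH})},
\qquad 0 < \eta^{\text{th}}_j,\ \eta^{\text{kin}}_j \le 1
```

In this form, a drop in either eta factor at low pH raises the minimum protein needed to sustain the same flux, which is the additional enzyme cost the paragraph describes.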

Researchers have compiled E. coli enzyme pH sensitivity data from the literature and are in the process of assessing the accuracy of the patcHwork sequence/structure-based tool for estimating enzyme pH sensitivity (https://patchwork.biologie.uni-freiburg.de/upload.php). Among the enzymes with literature data available, phosphatidylserine synthase (PssA) is especially pH sensitive. Researchers tested the effect of increased expression of PssA on growth at a range of pH values. At values as low as pH 4.0, the increased expression of PssA was beneficial for growth. Efforts to increase the pH tolerance of the PssA enzyme are ongoing.

Modeling of enzyme stability: Investigations of existing predictions of protein stability found them to be unreliable. Researchers are beginning to apply protein language models such as ESM1 to the problem but have recently found similar DNA codon language models (Outeiral and Deane 2022) that are expected to be more effective for this type of prediction, and are developing new machine learning models that use this type of DNA language model to predict protein melting temperatures. (This relates to the importance of specific codons for the rate of translation, which affects co-translational folding.) Researchers also plan to extend this type of model to the prediction of optimal pH for E. coli enzymes, which is a relatively underdeveloped problem.

This builds upon the team’s own expertise in utilizing protein language models to make major improvements in protein sequence matching and function prediction (Kilinc, Jia, and Jernigan 2023).

A Pipeline for High-Throughput Genomic Recoding in Organisms Beyond E. coli | Church | Harvard Medical School | Tas | Biosystems Design | University

This project presents a pipeline for constructing genomically recoded non-model organisms, with a focus on recoding the 5.9 Mb Pseudomonas putida genome to use 59 codons. The team designed the 59-codon genome with a newly developed computational model that considers the impact of individual and combinatorial codon changes on gene expression, gene function, and growth. Required input datasets have been obtained for the computational model, and new genome engineering tools have been developed to optimize the introduction of synonymous mutations.

Genomic recoding consists of removing a set of codons from the entire genome while maintaining the same protein sequences (Ostrov et al. 2016, 2020). Recoding provides attractive properties including tight biocontainment, virus resistance, and efficient non-standard amino acid incorporation (Lajoie et al. 2013; Mandell et al. 2015; Robertson et al. 2021; Nyerges et al. 2022). However, to date, it has only been possible to fully recode the model organism Escherichia coli (Ostrov et al. 2020; Fredens et al. 2019). To enable recoding in non-standard organisms, broad-host computational and experimental methods are needed to fully leverage this enormous potential.
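
The core operation of recoding, swapping target codons for synonymous ones while leaving the protein unchanged, can be sketched in a few lines of Python. The replacement map and the toy coding sequence below are hypothetical and are not the codon set chosen for the 59-codon P. putida design; a real pipeline must also account for overlapping genes, regulatory elements, and expression effects.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def recode(cds, replacements):
    """Replace target codons in an in-frame coding sequence.

    Raises ValueError if any replacement would change the encoded residue.
    """
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    recoded = [replacements.get(codon, codon) for codon in codons]
    for old, new in zip(codons, recoded):
        if CODON_TABLE[old] != CODON_TABLE[new]:
            raise ValueError(f"{old} -> {new} is not a synonymous change")
    return "".join(recoded)


# Hypothetical replacement map: two sense codons and one stop codon.
illustrative_map = {"AGA": "CGT", "TTG": "CTG", "TAG": "TAA"}
print(recode("ATGAGATTGTAG", illustrative_map))
```

The synonymy check is the essential safeguard: a replacement map that alters any residue is rejected rather than silently changing the protein.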

Here, the team presents the datasets, computational tools, and genome engineering technologies that are being assembled to enable high-throughput genome recoding in non-standard organisms, with validation in P. putida. Researchers first generated high-quality genome maps and datasets to identify all relevant genomic elements, with an emphasis on identifying promoters and ribosome binding sites in the P. putida genome.

Researchers next applied a computational framework that connects genome sequences with growth, using information from metabolism, expression, and regulation, to design a 59-codon P. putida genome. This model predicts a minimal reduction in growth compared to the wild-type P. putida strain.

Finally, the team developed highly efficient recombineering tools in P. putida, including new recombinases and retron-based approaches to work with both single-stranded DNA and double-stranded DNA. In addition, researchers are working on extending the capacity for recombineering-based approaches, e.g., CRISPR-associated nucleases.

In summary, this study presents a pipeline to recode organisms beyond E. coli, along with the computational and experimental techniques that have been established. This work aims (1) to validate the first fully recoded non-standard model organism, P. putida-59; (2) to achieve the first environmental bacterium that is biocontained and resistant to viruses; (3) to establish design rules for computational and experimental expansion of recoding beyond E. coli; and (4) to accelerate recoding processes.

High (School)-Throughput Screening of BAHD Transferases | Acheson | University of Wisconsin−Madison | Acheson | Bioenergy | GLBRC

BAHD acyltransferases represent a large family of enzymes typically found in plants. They use acyl-CoA donors (produced by acyl-CoA ligases) to form esters or amides with alcohol or amine acceptor molecules. The products of these reactions are incorporated into large polymers such as lignin and suberin or into small secondary metabolites including biofuel precursors, antimicrobials, antifungals, or compounds that contribute to drought resistance. The goal is to elucidate the identities and functions of these enzymes and use them, in conjunction with acyl-CoA ligases, to precision engineer bioenergy crops. Pairing specific acyl-CoAs and BAHD transferases can allow fine tuning of lignin content either for simple deconstruction (Zip-lignin) or for incorporation of useful aromatics that can then easily be clipped off, increasing the net value of the plant biomass (Karlen et al. 2016; de Vries et al. 2022).

BAHD acyltransferases can produce valuable molecules in bioenergy crops. The discovery and characterization of specific BAHD acyltransferases led to the creation of Zip-lignin, where introduction of ester-linked monolignols allows hydrolysis under mild conditions, avoiding the harsher chemical treatments needed to remove lignin during bioenergy processing (Karlen et al. 2016). Further investigation showed that specific aromatics could be incorporated into terminal lignin positions, such as p-hydroxybenzoate, which can easily be clipped off due to attachment via an ester linkage (de Vries et al. 2022). Thus, the ability to tune lignin composition not only allows for improved deconstruction, but also positions lignin as an attractive source of energy-rich molecules. By taking advantage of consistently improving genomic data and tools, the team curated lists of high-potential target genes focusing on two priority bioenergy crops and a model plant (poplar, sorghum, Arabidopsis). Selected genes were synthesized into cell-free expression vectors by the Joint Genome Institute (JGI) and were then screened using a wheat germ cell-free system (Cell Free Sciences) by a team of high school student laboratory members. The expressed proteins were screened for potential activity and categorized by preferred substrates. Active enzymes catalyzing interesting reactions were then incorporated into Populus sp. to assess in vivo impacts, and cell-based expression systems such as Escherichia coli have been used to facilitate structural and biochemical characterization. The work presented here has given further understanding of the breadth of molecules this large family of enzymes can produce, and of how these molecules may be useful in producing more energy-efficient plants or in providing engineered plant sources for fine chemicals.

Chemical Imaging of Plant-Soil-Microbe Systems at the Stanford Synchrotron Radiation Lightsource | Hodgson | SLAC National Accelerator Laboratory | Richardson | Structural Biology

The Structural Molecular Biology resource at the Stanford Synchrotron Radiation Lightsource (SSRL) develops, operates, and supports synchrotron capabilities for biological and environmental research including macromolecular crystallography, small angle scattering, x-ray absorption spectroscopy (XAS) and x-ray fluorescence (XRF) imaging. Three dedicated XRF imaging beamlines cover a range of spatial scales (µm to cm) and elements of biological importance (P, S, K, Ca, and metals). A powerful aspect of the XRF imaging beamlines is that they can perform µ-XAS to characterize the oxidation state, or chemical species, at a single point within a sample. Combining XRF with XAS is a tool for generating spatial distribution images of individual chemical species of an element within a sample. Researchers will present examples of P, S, K, Mn and Fe chemical imaging from BER-relevant systems, such as microbial aggregates, plant components, mineral-organic matter complexes and soil cores as well as detailing the rich information that can be gained from synchrotron analyses using two specific examples.

Sphagnum moss is a key genus in terrestrial peatlands, responsible for most of the primary production and recalcitrant organic matter in these ecosystems, in turn impacting both C and N cycles. Sphagnum growth and production are partly dependent on a symbiosis with N-fixing microbes. Researchers from ORNL used XRF imaging at SSRL to visualize the exchange of S compounds during stages of Sphagnum colonization by cyanobacteria. Data showed increased production of sulfate from Sphagnum as the percentage of colonization increased. Additionally, an increase in the local production of reduced S compounds (thiols/thio-ethers, likely in amino acids) and sulfonate (in either taurine or sulfoacetate) was observed within colonized hyaline cells only. These data support the hypothesis that Sphagnum-produced choline-sulfate is being exchanged for microbially derived S- and N-bearing amino acids. Understanding the function of this relationship under warming conditions will provide insight into whether peatland ecosystems will remain net C sinks or become C sources due to climate change.

Potassium is persistently limited in most environments; however, studies of rhizosphere-phyllosphere K cycling are generally based on bulk measurements, which provide limited information on the biological processes that control K bioavailability. Researchers at EMSL have developed synthetic soil habitats (SSH), which simulate soil chemical and physical properties and are compatible with multi-modal imaging techniques. In collaboration with SSRL, researchers from EMSL are using SSH and multiple lines of synchrotron investigation to visualize the organic and inorganic processes controlling fungal sourcing, transport, and transformation of K during C-limitation, including: (i) XRF imaging of fungal hyphae on SSH in the presence and absence of an inorganic K source, (ii) XRF imaging of hyphae removed from SSH to determine the K phases and their role in hyphae, (iii) XRF imaging of the SSH surface to determine the availability of K resulting from fungal degradation of an inorganic K source, and (iv) XAS and theoretical calculations to determine the K bonding environment in organic K compounds to improve the knowledge of the dominant chelating ligands and properties. Combined with mass spectrometry imaging and multiomics analyses, these data indicate the importance of fungal-exuded tartaric and citric acids, which are likely responsible for sensing K-rich minerals and for uptake/storage of K by fungi, respectively. Additionally, the formation of clay minerals tens of µm in size on the SSH after 30 days of fungal growth provides insight into fungal mineral degradation over environmentally relevant spatial and temporal scales. This readily bioavailable pool of K could be a source for other microbes and plants in a more complex system.

The SSRL Structural Molecular Biology resource supports the development of advanced methodologies and research, collaborative research, service, training, and dissemination in structural molecular biology using synchrotron radiation at SSRL. The integrated program is supported by the DOE Office of Biological and Environmental Research, and by the National Institutes of Health, National Institute of General Medical Sciences (P30GM133894).

Harnessing Regulatory Variation to Elucidate Drought Resilience Mechanisms in Sorghum | Eveland | Donald Danforth Plant Science Center | Eveland | Bioenergy | University
  • Overall project objective: To define and functionally characterize genes and pathways related to drought stress tolerance in sorghum and the molecular mechanisms by which these factors drive phenotypic diversity.
  • Establish a foundation for deep explorations of gene regulatory networks in sorghum through integrative genomics analyses.
  • Enhance understanding of how genotype drives phenotype and environmental adaptation using high-resolution, field-based phenotyping of sorghum mutant collections and a novel diversity panel.
  • Map and characterize genes contributing to drought responsive phenotypes in sorghum.
  • Experimentally validate predictions of gene function using molecular and genetic assays and targeted gene editing.

Development of the next generation of bioenergy feedstocks will require strategies that utilize resource-limited agricultural lands, including the introduction of novel traits into crops to increase abiotic stress tolerance. This project investigates the innate drought resilience of sorghum (Sorghum bicolor), a bioenergy feedstock and cereal crop. Drought is a complex trait and identifying the genes underlying sorghum’s innate drought tolerance and how they are regulated in the broader context of the whole plant and its environment requires advanced approaches in genetics, genomics, and phenotyping.

This project leverages a field-based phenotyping infrastructure at Maricopa, Ariz., which provides an exceptional capability for managed stress trials in a hot and arid environment through controlled irrigation. An automated field scanner system collects high-resolution phenotyping data using a variety of sensors throughout the growing season, from seedling establishment to harvest. A sorghum mutant population was phenotyped under the field scanner to compare drought-stressed and well-watered plants. Each mutant’s genome has been sequenced so that sequence variants can be linked with phenotypes. Being able to assess the genotype-to-phenotype link in response to drought over the life cycle of the plant will facilitate discovery of genes and their functions. The team has also constructed a diverse panel of sorghum lines, which maximizes variation in water-use efficiencies as well as genetics and geographic origin. Each of these lines has also been sequenced through various efforts. This population was phenotyped in a controlled-environment drought-response experiment, where samples were collected for population-level expression analyses, and will subsequently be field-phenotyped under the field scanner this summer. State-of-the-art phenotyping data analytics pipelines have been developed as part of this project and DOE-funded initiatives (see poster by Gonzalez et al.) and are being extended to define stress-related phenotypes at multiple scales. Bulked segregant analysis-seq is used to accelerate mapping of causal loci that underlie mutants of interest. So far, researchers have identified candidate genes underlying defects in leaf senescence, plant architecture, and male fertility. Regulatory maps generated from diverse sorghum lines in response to stress are being used to nominate gene candidates and place them in the larger context of a drought response network. In addition, gene editing and transgenic methods are being used to characterize gene function.
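The bulked segregant analysis-seq step mentioned above can be illustrated with a minimal sketch. It assumes the generic SNP-index approach (alternate-allele read fraction in a mutant bulk minus that in a wild-type bulk); the abstract does not describe the team's actual pipeline, and the site names, read counts, and the 0.5 threshold below are hypothetical.

```python
def snp_index(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the alternate (mutant) allele."""
    if total_reads <= 0:
        raise ValueError("site has no read coverage")
    return alt_reads / total_reads


def delta_snp_index(mutant_bulk, wildtype_bulk) -> float:
    """SNP index in the mutant bulk minus SNP index in the wild-type bulk."""
    return snp_index(*mutant_bulk) - snp_index(*wildtype_bulk)


# Hypothetical (alt_reads, total_reads) per variant site: (mutant bulk, wild-type bulk).
SITES = {
    "chr1:1200": ((20, 40), (19, 40)),
    "chr3:5400": ((39, 40), (13, 40)),
    "chr7:880": ((22, 40), (25, 40)),
}


def rank_candidates(sites, threshold=0.5):
    """Sites whose delta SNP index meets the (arbitrary) threshold, strongest first."""
    scored = {site: delta_snp_index(m, w) for site, (m, w) in sites.items()}
    hits = [site for site, delta in scored.items() if delta >= threshold]
    return sorted(hits, key=lambda site: -scored[site])


print(rank_candidates(SITES))
```

Sites linked to a recessive causal mutation are expected to approach a SNP index of 1 in the mutant bulk while staying lower in the wild-type bulk, which is why a large positive difference flags a candidate region.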

This work will identify control points for enhancing the productivity of bioenergy crops in marginal environments through precision breeding or engineering, and thus accelerate the development of improved varieties that are high yielding with limited water resources.

Opportunities Linking Omics and Structural Biology at PNNL: Excelling at Cryo-EM | Evans | Pacific Northwest National Laboratory | Evans | Structural Biology

This project is focused on the operation of a new state-of-the-art cryogenic Transmission Electron Microscope (Krios G3i) at the Environmental Molecular Sciences Laboratory (EMSL) to advance DOE BER user research in protein/small-molecule structural biology and whole-cell ultrastructure. The operation of EMSL’s new Krios G3i instrument is a joint funding venture between EMSL and DOE BER and the microscope is available to the general EMSL user community and DOE/BER researchers in a 50/50 split allocation.

This project was designed to rejuvenate cryo-electron microscopy (cryo-EM) research at EMSL with a new microscope and new semi-automated or automated workflows. EMSL users can access this new instrument free of charge via the normal EMSL user proposal calls which permit combining cryo-EM with other capabilities at EMSL such as mass spectrometry or super-resolution fluorescence microscopy. Access is offered free of charge for DOE BER users with PNNL staff time funded by this current project. The DOE BER access mechanism allows for an expedited submission and review process for cryo-EM only projects.

The new Krios G3i microscope is fully operational and has been applied to multiple EMSL and DOE BER user projects. The microscope has complete screening, data collection, and image processing workflows for: (1) micro-electron diffraction of small molecule or protein crystals; (2) single-particle analysis of soluble and membrane protein complexes; and (3) electron tomography of whole cells or isolated organelles. It is equipped with a K3 direct electron detector, Ceta-D camera, phase plate, and Bioquantum energy filter. In addition to semi-automated data collection, researchers have installed automated image processing workflows for real-time feedback on session quality and full 3D reconstruction for all workflows. To date, the team has demonstrated sub-angstrom resolution micro-electron diffraction, sub-2 angstrom resolution from 3D single-particle protein structure determination, and sub-nanometer resolution for whole-cell tomography. While researchers can provide rapid access for samples that arrive pre-frozen on clipped and pre-screened grids, they can also begin with samples that arrive in buffer and require all steps of the cryo-EM workflow. In a subset of cases, researchers can also start from a provided gene of interest and employ a cell-free expression system to produce enough protein for structural characterization. This poster presentation will highlight several of the recent user results as well as demonstrate an example of going from receiving a gene clone through cell-free expression and cryo-EM structure determination in less than 9 days. The team will also showcase the new user-friendly AutoMicroED software, which permits quick and direct determination of small molecule structure even from heterogeneous datasets to accelerate scientific discovery with micro-electron diffraction.

Fluorescence Lifetime Measurements using Entangled-Photon Coincidences | Laurence | Lawrence Livermore National Laboratory | Eshun | Structural Biology

Researchers have utilized entangled photon pair coincidences to measure the fluorescence lifetime of the organic dye Rhodamine 6G, allowing lifetime measurements to be conducted at low intensities with a continuous wave (CW) laser and without ultrafast pulsed lasers.

Quantum entangled photons have been studied over the years for their strong correlations and potential for novel applications in imaging, communication, and computing. In the realm of imaging and spectroscopy, these correlations allow measurements to be made that may offer increased information, as well as to be conducted with much lower light intensities, which is advantageous in the imaging and spectroscopy of sensitive materials. With these goals in mind, researchers have generated a high flux of entangled photon pairs using spontaneous parametric downconversion (SPDC) and with this, have measured the fluorescence lifetime of the organic dye Rhodamine 6G, utilizing entangled photon pair coincidences, conducting these measurements at low intensities, with a CW laser and without the use of ultrafast pulsed lasers. This differs from alternative methods of obtaining fluorescence lifetimes without pulsed or modulated lasers, such as the use of antibunching, which requires significantly longer integration times.

Ghost imaging, which involves using one photon to probe an object in front of a bucket detector with the other photon serving as a reference to reconstruct the image, has been widely demonstrated and proven to work (Pittman et al. 1996; Padgett and Boyd 2017). Many experiments utilizing the concept of ghost imaging have been theorized, such as a method hypothesized by Scarcelli and Yun (2008) for using fluorescence excited by entangled photons for imaging with coincidence counts; one photon is absorbed by the sample, producing fluorescence, and the other holds information on where that absorption took place. The ability to successfully measure the fluorescence induced by low-intensity excitation is critical to achieving such fluorescence imaging and can be further exploited by using the time correlations of the photon pair to measure the fluorescence lifetime of the substance being studied. The signal photon acts as a probe, excites the sample, and causes emission of a fluorescence photon, which is measured by a single photon detector. A delayed time correlation exists between the idler photon at a reference detector and the fluorescence photon; thus, coincidence counts can be measured between them when fluorescence occurs, and a plot of coincidences against time delay will reveal a correlation peak whose decay time is the fluorescence lifetime of the sample. Lifetime measurements are usually carried out with the time-correlated single photon counting method, using pulsed lasers or alternatively a form of modulated excitation. The entangled photon method achieves these measurements with a continuous wave laser and runs with 1-minute integration times.
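How a plot of coincidences against time delay comes to carry the lifetime can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' acquisition code: each idler detection is taken as time zero, the fluorescence photon follows after an exponentially distributed delay, and detector timing jitter is modeled as Gaussian. The function names, the jitter value, and the bin width are hypothetical.

```python
import random
from collections import Counter


def simulate_delays(n_pairs, lifetime_ns, jitter_ns, seed=0):
    """Delay between idler detection and fluorescence detection for each pair."""
    rng = random.Random(seed)
    return [
        rng.expovariate(1.0 / lifetime_ns) + rng.gauss(0.0, jitter_ns)
        for _ in range(n_pairs)
    ]


def coincidence_histogram(delays_ns, bin_width_ns):
    """Count coincidences per delay bin; keys are integer bin indices."""
    counts = Counter(int(d // bin_width_ns) for d in delays_ns)
    return dict(sorted(counts.items()))


delays = simulate_delays(n_pairs=5000, lifetime_ns=3.7, jitter_ns=0.1, seed=1)
histogram = coincidence_histogram(delays, bin_width_ns=0.5)
for bin_index in range(0, 20, 4):
    print(f"{bin_index * 0.5:4.1f} ns  {histogram.get(bin_index, 0)}")
```

The counts fall off with delay, and the decay constant of that fall-off is the simulated lifetime. A real measurement would also contain uncorrelated background coincidences, which this sketch omits.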

Entangled photons centered around 532nm were generated using Type-I spontaneous parametric down conversion (SPDC). A 266nm continuous wave laser (Topwave 266-300, Toptica) was used to pump a 1mm thick BBO crystal cut for degenerate Type-I SPDC [NCBBO5100-266(1)-A, Newlight Photonics]. Harmonic beam splitters (Y4-2037-45, CVI Laser Optics) filter out any residual 532nm from the third harmonic generation of the 266nm pump. The pump power can be attenuated by rotating a half wave plate (HWP [WPH10M-266, Thorlabs]) placed in front of a Glan prism polarizer (GLB10-UV, Thorlabs) before being passed through a narrow bandpass filter (ET270/15BP, Chroma) to ensure a clean 266nm centered pump. The pump passes unfocused through the BBO crystal and then through a 35mm focal length collimating lens (LA4052-UV, Thorlabs). After the crystal and collimating lens, a dichroic mirror (355nm reflection cutoff; Di01-R355-25×36, Semrock) separates the remaining 266nm pump from the generated SPDC light, reflecting the pump into a beam dump. A wide bandpass interference filter (545QM75, Omega Optical) further cleans the SPDC emission from any spurious pump light. Photons of an entangled photon pair have corresponding energies that add up to that of the pump. Therefore, the signal and idler photons can be separated via wavelength, as each photon on one side of the SPDC center wavelength has a corresponding photon on the opposite side. A dichroic mirror cut near 532nm (FF538-FDi02-t3-25×36, Semrock) transmits all light 532nm and above to create the idler path and reflects all light 532nm and below to create the signal path.
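The energy-conservation constraint that lets signal and idler be separated by wavelength can be written as 1/λ_pump = 1/λ_signal + 1/λ_idler. The sketch below evaluates it for the 266 nm pump named in the text; the function name and the example signal wavelengths are illustrative assumptions.

```python
def idler_wavelength_nm(pump_nm: float, signal_nm: float) -> float:
    """Idler wavelength from energy conservation: 1/pump = 1/signal + 1/idler."""
    if signal_nm <= pump_nm:
        raise ValueError("signal wavelength must be longer than the pump wavelength")
    return 1.0 / (1.0 / pump_nm - 1.0 / signal_nm)


PUMP_NM = 266.0  # pump wavelength stated in the text

# Example signal wavelengths at or below the 532 nm degenerate point (illustrative).
for signal_nm in (510.0, 520.0, 532.0):
    idler_nm = idler_wavelength_nm(PUMP_NM, signal_nm)
    print(f"signal {signal_nm:.1f} nm -> idler {idler_nm:.1f} nm")
```

A signal photon below 532 nm always pairs with an idler above 532 nm, and the two coincide at the degenerate wavelength, which is why a single dichroic cut near 532 nm can split the pair into two arms.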

The idler path acts as a reference while the signal path is the excitation arm. An optical relay is constructed in the signal arm with two 250mm lenses (AC254-250-A-ML, Thorlabs) to ensure the beam profile of the SPDC ring is clearly conjugated onto the back aperture of the excitation objective. A dichroic mirror (Di02-R532-25×36, Semrock) reflects the signal beam into an objective lens, and the sample is held above the objective in a glass bottom dish. Two objective lenses, a 20x 0.42 Numerical Aperture objective (M Plan Apo 20x 378-804-3, Mitutoyo), and a 25x 1.1 Numerical Aperture water immersion lens (N25X-APO-MP 25x, Nikon) were used interchangeably in this study. The resulting emission is collected via the same objective and transmitted through the dichroic mirror and emission filter and focused on the signal detector. The two photon counting detectors (PMA Hybrid 40 Photomultiplier Detector, Picoquant) are connected to a time controlling device (ID900, IDquantique) to measure coincidence counts.

The measurement in Fig. 1 shows the fluorescence lifetime decay of a 100 µM solution of Rhodamine 6G. These measurements were conducted with three different concentrations of the dye (100, 10, and 1 µM; not all graphs shown) and with two objectives of differing numerical aperture (NA). As expected, the measured coincidence counts increase with objective NA and with dye concentration, confirming the fluorescence origin of the measured signal. Ethanol is used as a control and shows background-level coincidence counts and no correlations (Fig. 1). Fitting the decay of the measured curves provides the fluorescence lifetime; researchers obtained lifetimes of 3.7 ns, which agrees with previously measured fluorescence lifetime values for this dye (Beaumont, Johnson, and Parsons 1993; Magde, Wong, and Seybold 2002). With this study, researchers have shown that fluorescence lifetime can be measured at low intensities with entangled photons, using a CW laser rather than a pulsed source.
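The abstract does not say how the decay curves were fitted. As one simple possibility, a single-exponential decay can be fitted by a least-squares line through the logarithm of the counts, with the lifetime given by the negative inverse slope. The sketch below uses synthetic, noise-free counts and ignores background and the instrument response; `fit_lifetime_ns` is a hypothetical helper.

```python
import math


def fit_lifetime_ns(delays_ns, counts):
    """Least-squares line through ln(counts) versus delay; lifetime = -1 / slope."""
    points = [(t, math.log(c)) for t, c in zip(delays_ns, counts) if c > 0]
    if len(points) < 2:
        raise ValueError("need at least two bins with counts")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("counts do not decay with delay")
    return -1.0 / slope


# Synthetic decay with a 3.7 ns lifetime, sampled every 0.5 ns (illustrative data).
delays = [0.5 * k for k in range(40)]
counts = [1000.0 * math.exp(-t / 3.7) for t in delays]
print(round(fit_lifetime_ns(delays, counts), 3))
```

With real coincidence data, a constant background would need to be subtracted first, and bins with few counts would deserve lower weight than this unweighted fit gives them.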

Conformational Equilibria Underlying Electron Bifurcation and Transfer in Thermotoga Maritima Fix/EtfABCX: Novel Structural States and Their Correlation to Catalysis by Anaerobic SEC-SAXS | Hura | Lawrence Berkeley National Laboratory | Murray | Structural Biology

The conformational states associated with electron bifurcation in metalloenzyme complexes (bifurcases) are sought in order to obtain a mechanistic understanding of their functions. Static structures of bifurcase enzymes often lack the detail necessary to fully provide such an understanding, so a small-angle X-ray scattering (SAXS) approach has been adopted for the characterization pipeline. Anaerobic size-exclusion chromatography-coupled SAXS (SEC-SAXS) achieves sufficient isolation of macromolecular species and protection from oxygen to provide an assay for bifurcase and metalloenzyme complex structure in solution. With these tools, the current project seeks to understand electron bifurcation for potential application in engineering biofuel systems.

Electron bifurcation is a newly discovered, albeit evolutionarily ancient, means of energy conservation used by anaerobic microorganisms that simultaneously delivers low- and high-potential electrons to acceptors in a thermodynamically favorable fashion. The cryo-EM structure of one enzyme performing flavin-based bifurcation, Fix/EtfABCX from Thermotoga maritima, was recently solved to 2.9 Å resolution (Feng et al. 2021). This structure suggested a model for its catalytic mechanism in which the low-potential electrons generated during bifurcation transfer to ferredoxin and the high-potential electrons reduce menaquinone. The structure depicts Fix/EtfABCX as a membrane-associated, symmetric superdimer, with each half formed by an ABCX heterotetramer. Presented here, anaerobic small-angle X-ray scattering analysis (Classen et al. 2013; Rosenberg, Hura, and Hammel 2022) of Fix/EtfABCX shows that the enzyme adopts an asymmetric morphology in solution, which can be further exaggerated or reverted towards symmetry by the presence of NAD+ or NADH, respectively. These conditions also affect the overall rigidity of the system, which, together with the observed conformational changes, implies that coenzyme presence primes the system for accepting reducing equivalents, interacting with substrates, and progressing through its catalytic cycle. Interestingly, the presence of NADH induces formation of supertetrameric Fix/EtfABCX from two superdimer particles, suggesting that a change in redox state and/or the presence of this coenzyme imparts a conformation conducive to oligomerization and a subsequent role in the bifurcation process. Collectively, these solution-state observations provide structural constraints by which the states of Fix/EtfABCX must abide during catalysis and also serve as a foundation for future studies of oxygen-sensitive metalloenzymes involved in electron bifurcation.

Impacts of Altered Climate on Microbial Growth and Nutrient Assimilation in an Ombrotrophic Peat Bog | Hungate | Northern Arizona University | Bell | Environmental Microbiome | University

The work proposed here will integrate genomics- and isotope-enabled measurements of Growth Rate, growth Efficiency, and the stoichiometry of Essential Nutrients during growth, an integration researchers call GREEN omics. The overarching objective is to develop and apply omics approaches to investigate microbial community processes involved in nutrient cycling. The specific objectives of the proposed work are 1) to evaluate the microbial ecology of nutrient uptake, testing hypotheses about nutrient assimilation in response to temperature variation; 2) to evaluate the ecology of nutrient-use efficiency for soil microorganisms within a framework of ecological theory; and 3) to develop new isotope-enabled genomics and transcriptomics techniques that probe the microbial ecology of nutrient dissimilation. This work will push the frontier of isotope-enabled genomics by connecting quantitative stable-isotope probing to ecological theory about nutrient assimilation, nutrient-use efficiency, and metabolic efficiency, and by applying these tools to understand the basic biology and ecology of soil microorganisms and how they transform nutrients in the environment.

Numerous studies have examined overall shifts in microbial community composition and diversity metrics under increased temperature, while far fewer have considered the importance of elevated CO2 due to climate change. Standard sequencing methods to profile soil microbiomes cannot determine whether community shifts are driven by increases in select members, losses of others, or masked by large pools of relic DNA. Instead, researchers used quantitative stable isotope probing with 18O-water to estimate the in situ growth rates of individual bacterial taxa under long-term elevated CO2 and across a gradient of warming treatments in a northern Minnesota peat bog at the SPRUCE (Spruce and Peatland Responses Under Changing Environments) experiment, representing a particularly vulnerable terrestrial carbon reservoir.
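The abstract does not give the equations used to turn 18O enrichment into taxon-level growth rates. As a minimal sketch of only the final step, assume exponential growth with no mortality and assume a hypothetical input `new_dna_fraction`, the share of a taxon's DNA synthesized during the incubation. Then the fraction of new DNA is 1 − exp(−g·t), so g = −ln(1 − f)/t. The taxon names and values below are made up for illustration.

```python
import math


def growth_rate_per_day(new_dna_fraction: float, incubation_days: float) -> float:
    """Per-capita growth rate g from f = 1 - exp(-g * t), assuming no mortality."""
    if not 0.0 <= new_dna_fraction < 1.0:
        raise ValueError("new_dna_fraction must be in [0, 1)")
    if incubation_days <= 0:
        raise ValueError("incubation_days must be positive")
    return -math.log(1.0 - new_dna_fraction) / incubation_days


# Hypothetical taxa and labeled-DNA fractions after a 10-day incubation.
example = {"taxon_A": 0.05, "taxon_B": 0.30, "taxon_C": 0.00}
for taxon, fraction in example.items():
    print(taxon, round(growth_rate_per_day(fraction, 10.0), 4))
```

A taxon with no detectable labeling returns a rate of zero, which corresponds to the "little to no growth" category reported for many taxa under ambient CO2.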

The microbial communities at SPRUCE have been subjected to experimental conditions for more than four years, and therefore had ample time to acclimate to the treatments. Researchers found that a large proportion of bacterial taxa displayed little to no growth across temperatures under ambient CO2 concentrations, but faster growth under certain temperatures with elevated CO2, highlighting a strong interplay between warming and CO2 concentrations. Growth responses of multiple taxa could be clustered into three response patterns under ambient CO2, and just two response patterns under elevated CO2. Elevated CO2 shifted the temperature of maximum growth for the two dominant lineages of the peat microbiome. The temperature of maximum growth among Proteobacteria increased under elevated CO2, contrasting the Acidobacteria whose maximum growth temperature decreased. Researchers found support for phylogenetic conservation of growth patterns among Acidobacteria and Proteobacteria at approximately the genus-level under ambient, but not elevated CO2. Among Proteobacteria, groups with tight plant associations such as Rhizobiales (N-fixing symbionts) exhibited enhanced growth only at elevated temperature and CO2, suggesting these microorganisms may benefit from increases in rhizodeposition. These results suggest that certain taxa may be predisposed for growth under altered climate conditions, with a disproportionate influence on carbon cycling and peatland feedbacks to climate change.

Northern peatlands are also of interest due to their exceedingly low inorganic nitrogen (N) contents and reliance on organic N. As such, researchers have additionally leveraged peat collected from SPRUCE to conduct a laboratory incubation to explore the interactive effects of N source and temperature on nutrient assimilation and growth. Researchers used multiple isotope tracers to characterize either microbial growth (18O-water) or N assimilation with three different substrates representing inorganic (15N-ammonium and 15N-nitrate) and organic (15N-glutamate) sources. Metabolomics profiles using FT-ICR indicate significant similarities among unique metabolites across the N treatments relative to controls. N-amendment increased the richness of molecules classed as condensed hydrocarbons (less bioavailable) at 25 °C while decreasing richness of protein-like (more bioavailable) molecules at 15 °C relative to matched unamended controls. Further, the average metabolite pool contained less potential energy at 25 °C, across all N amendments as well as control soils. Preliminary results reveal a significant impact of both temperature and substrate at the bulk level on respiration and N assimilation into microbial biomass, with respiration most strongly impacted by glutamate amendment but N-assimilation into microbial biomass highest under ammonium amendment. Ongoing work to generate taxon-specific growth rates and nutrient use efficiencies will be used to clarify whether differences at the bulk level are driven by the differential responses of individual taxa, and SIP-metagenomes and SIP-proteomes will be leveraged to explore altered metabolism and nutrient allocation under the treatments.

Together these experiments reveal how individual microorganisms vary in their acquisition, use, and release of nutrients—attributes that directly impact the rate and fate of environmental nutrient transformations in a globally important ecosystem.

Impact of Soil Viruses on Microbial Compositions and Functions | Hofmockel | Pacific Northwest National Laboratory | Wu | Environmental Microbiome | Phenotypic Response of SOIL Microbes

PNNL’s Phenotypic Response of Soil Microbiomes Science Focus Area aims to achieve a systems-level understanding of the soil microbiome’s phenotypic response to changing moisture. Researchers perform multi-scale examinations of molecular and ecological interactions occurring within and between members of microbial consortia during organic carbon decomposition, using chitin as a model compound. Integrated experiments address spatial and inter-kingdom interactions among bacteria, fungi, viruses, and plants that regulate community functions throughout the soil profile. Data are used to parametrize individual- and population- based models for predicting interspecies and inter-kingdom interactions. Predictions are tested in laboratory and field experiments to reveal individual and community microbial phenotypes. Knowledge gained provides fundamental understanding of how soil microbes interact to decompose and sequester organic carbon and enables prediction of how biochemical reaction networks shift in response to changing moisture regimes.

Soil is known to harbor diverse and abundant viruses, but most soil viruses are uncharacterized. The ecological impacts of soil viruses and their responses to climate change remain understudied. To address these knowledge gaps, researchers launched a cross-scale study, from viral genes of interest to viral communities in soil microcosms and field experiments. Viruses carry auxiliary metabolic genes (AMGs) that potentially contribute to soil metabolic processes while tuning the host machinery towards their own replication (Jansson and Wu 2022). Researchers therefore first focused on AMGs as viral genes of interest. In collaboration with the Joint Genome Institute (JGI), the Environmental Molecular Sciences Laboratory (EMSL), and the Stanford Synchrotron Radiation Lightsource (SSRL), researchers verified the activity of a chitosanase encoded by a soil viral AMG and determined the first protein structure within the all-chitosanase family GH75 at ultra-high resolution (<0.9 Å; Wu et al. 2022). Co-crystal structures with site-directed mutants and chitohexaose further elucidated the catalytic mechanism of the viral chitosanase. This study provides more molecular evidence that soil viruses may aid their hosts in organic carbon decomposition. To quantify the metabolic contribution of soil viruses at the community level, researchers next investigated viral population dynamics under environmental perturbation.

Using three contrasting field experiments, researchers tested the viral response to changes in soil moisture, studied viral communities in a range of grassland soils with different historical precipitation patterns (Wu et al. 2021), and then generated hypotheses to test in a soil incubation experiment (Wu et al. 2021). In collaboration with the National Energy Research Scientific Computing Center (NERSC), researchers assembled the deeply sequenced soil metagenomes (>1 Tb each; Nelson et al. 2020) and recovered a total of 2,631 viral contigs including 14 complete circular viral genomes (Wu et al. 2021). Researchers found that soil with a lower historical moisture content harbored significantly higher viral diversity and abundance, while displaying less evidence of virus-host interactions, suggesting a predominance of lysogenic viruses in drier soils. The detection of AMGs involved in 18 metabolic pathways further supports the finding of viral contributions to carbon metabolism in soil (Wu et al. 2022). Researchers then selected the grassland soil exposed to an intermediate historical precipitation, and either experimentally wetted the soil to saturation or air-dried the soil to represent experimental wet and dry treatments, respectively (Wu et al. 2021).  Researchers observed a lower overall level of transcription in drier soil, but across more diverse DNA viruses. A higher fraction of non-coding RNAs and more transcripts of lysogenic markers (i.e., integrases and excisionases) were detected in drier soil, further supporting a higher prevalence of lysogenic viruses in drier soils as shown in the field study.

To demonstrate the direct viral impact on the soil microbiome with changing soil moisture, researchers applied High-Throughput Chromosomal Conformation Capture (Hi-C) metagenomics to capture and identify viruses that were infecting hosts at the time of sampling and metatranscriptomics to detect the transcriptional activity of the host-associated viruses (Wu et al., submitted). Although the host-associated viruses accounted for only 5.3% to 15.0% of the total viral sequence abundance, they shared similar patterns that were previously detected in the whole viral communities (Wu et al. 2021; Wu et al. 2021). The host-associated viruses in wetter soils had higher transcriptional levels and were inversely correlated with abundances of their hosts (p < 0.05). The richness (number of different types of virus) of the host-associated viruses and the average viral copies per host (VPH), however, were higher in drier soils. These results suggest that viral infections were mostly lytic under wet conditions while more prevalent and lysogenic under dry conditions. The hosts infected by soil viruses were found to be central in community co-occurrence networks, highlighting the impact of viral infections on soil microbiome structure. This study is the first to target the detection of host-associated viruses in soil and reveals the impact of soil viruses on microbial composition with changing soil moisture.

Future work aims to bridge the findings across scales by leveraging modeling capabilities (e.g., mechanistic modeling to simulate viral predation in porous systems) to test and generate a more comprehensive and transferable understanding of soil viruses. The cross-scale framework will continue to provide new information on the influence of changing soil moisture on viruses and their potential ecological impacts on soil microbiomes.

OMEGGA: A Computationally Efficient Omics-Guided Global Gapfilling Algorithm for Phenotype-Consistent Metabolic Network Reconstruction | Hofmockel | Pacific Northwest National Laboratory | Song | Environmental Microbiome | Phenotypic Response of SOIL Microbes

PNNL’s Phenotypic Response of Soil Microbiomes Science Focus Area (SFA) aims to achieve a systems-level understanding of the soil microbiome’s phenotypic response to changing moisture. Researchers perform multi-scale examinations of molecular and ecological interactions occurring within and between members of microbial consortia during organic carbon decomposition, using chitin as a model compound. Integrated experiments address spatial and inter-kingdom interactions among bacteria, fungi, viruses, and plants that regulate community functions throughout the soil profile. Data are used to parametrize individual- and population-based models for predicting interspecies and inter-kingdom interactions. Predictions are tested in laboratory and field experiments to reveal individual and community microbial phenotypes. Knowledge gained provides fundamental understanding of how soil microbes interact to decompose organic carbon and enables prediction of how biochemical reaction networks shift in response to changing moisture regimes.

Genome-scale metabolic networks are a valuable tool for gaining a mechanistic understanding of microbial metabolism and predicting trophic interactions within microbial communities. Metabolic network models are constructed through three key steps: draft metabolic model construction based on genome annotations, filling knowledge gaps in biochemical pathways by adding missing reactions (gapfilling), and further refinement and curation. The DOE Systems Biology Knowledgebase (KBase) automates this process by providing a suite of computational apps and modules (https://docs.kbase.us/apps/analysis/metabolic-modeling). Draft networks constructed based on genome annotations alone do not contain all of the key reactions necessary for robust predictions, and therefore fail to predict biomass production/cell growth as experimentally observed. Gapfilling (i.e., identifying and adding those missing reactions) is a critical next step for enhancing the predictive power of metabolic networks. Current gapfilling algorithms seek a minimal number of reactions following the parsimonious approach and repeat this process in a sequential manner for a given set of phenotypic growth data. However, the reactions added in this way are not always biologically relevant, causing the model predictions to be inconsistent.

To address these issues, researchers designed a new gapfilling algorithm termed OMEGGA (OMics-Enabled Global GApfilling). As indicated by its name, OMEGGA uses diverse data sources (such as amplicon, transcriptomic, proteomic, and metabolomic data) to simultaneously fit a draft metabolic model to all available phenotype data. In this work, researchers demonstrate the two major capabilities of OMEGGA: global and omics-guided gapfilling.

For global (or simultaneous) gapfilling, researchers developed a linear programming (LP)-based algorithm to identify a minimal set of reactions meeting all experimentally observed growth conditions, without iterative fitting. The LP-based algorithm shows far superior performance compared to existing mixed integer linear programming (MILP)-based algorithms, as demonstrated through a case study using Escherichia coli. While the computational burden builds up exponentially as the number of media conditions increases, the actual computational time was acceptable, indicating that the algorithm can be flexibly extended to more complex datasets by leveraging higher performance computational power in KBase. Importantly, the design of the LP-based algorithm allows researchers to use non-proprietary LP solvers, avoiding any potential licensing issues. In parallel, researchers also developed an algorithm that incorporates (1) gene annotations from multiple complementary pipelines (e.g., RAST, Prokka, Koala, DeepEC), and (2) additional omics data (e.g., transcriptomic profiles). In the case study of E. coli, the gapfilling solutions showed much stronger genomic and experimental consistency than typical parsimonious gapfilling. The inclusion of biologically relevant reactions is critical to avoid false positives, which traditionally requires manual curation.
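The difference between sequential and global gapfilling can be illustrated with a minimal sketch. It does not reproduce the OMEGGA LP formulation: it swaps in a simple network-expansion growth test and an exhaustive search over candidate sets, and every reaction, metabolite, and function name below is hypothetical. Ties between equally small solutions are broken by name order, an arbitrary choice that is enough to show how fitting one medium at a time can add more reactions than fitting all media at once.

```python
from itertools import combinations

# Toy network; all reaction and metabolite names are hypothetical.
# Each reaction maps to (substrates, products).
DRAFT = {
    "D1": ({"glc"}, {"g6p"}),
    "D2": ({"g6p"}, {"f6p"}),
    "D3": ({"fru"}, {"f6p"}),
    "D4": ({"pyr"}, {"biomass"}),
}
CANDIDATES = {
    "C1": ({"g6p"}, {"pyr"}),  # rescues growth on glucose only
    "C2": ({"fru"}, {"pyr"}),  # rescues growth on fructose only
    "C3": ({"f6p"}, {"pyr"}),  # rescues growth on both media
}
MEDIA = {"glucose": {"glc"}, "fructose": {"fru"}}


def can_grow(model, medium, target="biomass"):
    """Network expansion: fire reactions until no new metabolite appears."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in model.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return target in available


def minimal_addition(model, candidates, media):
    """Smallest candidate subset (ties broken by name order) that
    enables growth on every medium in `media`; None if none exists."""
    names = sorted(candidates)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            trial = {**model, **{n: candidates[n] for n in combo}}
            if all(can_grow(trial, m) for m in media):
                return set(combo)
    return None


def sequential_gapfill(draft, candidates, media):
    """Fit one medium at a time, keeping earlier additions."""
    model, added = dict(draft), set()
    for medium in media.values():
        chosen = minimal_addition(model, candidates, [medium])
        if chosen is None:
            return None
        added |= chosen
        model.update({n: candidates[n] for n in chosen})
    return added


def global_gapfill(draft, candidates, media):
    """Fit all media simultaneously."""
    return minimal_addition(draft, candidates, list(media.values()))


print("sequential:", sorted(sequential_gapfill(DRAFT, CANDIDATES, MEDIA)))
print("global:    ", sorted(global_gapfill(DRAFT, CANDIDATES, MEDIA)))
```

In this toy case the sequential pass adds two condition-specific reactions, while the global pass finds the single reaction shared by both media. The exhaustive search scales poorly and is used here only for clarity.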

Researchers will extend the test cases to include non-model organisms by using condition-specific multiomics and phenotype data from the Model Soil Consortia-2 (MSC-2) and associated isolated organisms developed through PNNL’s Soil Microbiome SFA. Researchers are working with the KBase team to incorporate the OMEGGA algorithm into KBase. The team will build an external application library for gapfilling and incorporate that into existing KBase apps that can leverage the multiomics data to derive more biologically relevant and realistic gapfilling solutions. To improve quality and supportability of the software, testing and documentation will also be incorporated into automated processes. The new optimization algorithms and KBase apps greatly facilitate the construction of high-quality metabolic networks by simultaneously incorporating molecular and phenotypic observations, eliminating the need for time-consuming, manual troubleshooting.

Removal of Primary Nutrient Degraders Reduces Growth and Modifies Functional Pathways of Soil Microbial Communities with Genomic Redundancy | Hofmockel | Pacific Northwest National Laboratory | McClure | Environmental Microbiome | Phenotypic Response of SOIL Microbes

PNNL’s Phenotypic Response of Soil Microbiomes Science Focus Area aims to achieve a systems-level understanding of the soil microbiome’s phenotypic response to changing moisture. Researchers perform multi-scale examinations of molecular and ecological interactions occurring within and between members of microbial consortia during organic carbon decomposition, using chitin as a model compound. Integrated experiments address spatial and inter-kingdom interactions among bacteria, fungi, viruses, and plants that regulate community functions throughout the soil profile. Data are used to parametrize individual- and population-based models for predicting interspecies and inter-kingdom interactions. Predictions are tested in laboratory and field experiments to reveal individual and community microbial phenotypes. Knowledge gained provides fundamental understanding of how soil microbes interact to decompose and sequester organic carbon and enable prediction of how biochemical reaction networks shift in response to changing moisture regimes.

Many ecosystem functions related to plant growth or carbon and nutrient cycling and stabilization rely on microbial metabolism. As a result, microbial communities are major drivers of carbon use efficiency (CUE). Furthermore, ecological theory indicates the importance of keystone community members that may carry out critical aspects of community functioning or interact with many other community members positively or negatively. Therefore, a quantitative assessment of species-specific responses within a community context is required to understand how microorganisms and keystone species within a soil community interact to support collective growth and community function related to carbon cycling.

To investigate how individual members of a microbial community contribute to decomposition, community growth, and CUE, researchers used a model substrate, chitin, and a Model Soil Consortium, MSC-2 (McClure et al. 2022). While MSC-2 can grow using chitin as the sole carbon source, the individual functions and metabolic contributions of the constituent species remain unknown. To quantify specific roles within this model community, researchers carried out experiments leaving out members of MSC-2 to test the implications for community biomass yields, CO2 production, proteomic and lipidomic profiles, and extracellular metabolites. Researchers chose two members to iteratively leave out: Streptomyces, as it is predicted via gene expression analysis to be a major chitin degrader in the community, and Rhodococcus, as it is predicted via species co-abundance analysis to interact with several other members (McClure et al. 2020). The experiments revealed that when MSC-2 lacked Streptomyces, growth and respiration of the community were severely reduced, even though other members of MSC-2 can degrade chitin. Removal of Streptomyces also led to changes in abundance for several other species compared to the complete MSC-2 community, pointing to a comprehensive shift in community structure when important members are removed. In addition, while the absence of Streptomyces led to differences in microbial taxonomy compared to the complete MSC-2 community, there were only minor changes compared to the community's starting point, indicating minimal growth and activity when this keystone species is removed. In contrast, while the absence of Rhodococcus also led to taxonomic changes compared to the complete MSC-2 community, removal of this keystone species had little effect on community growth and respiration. A further multiomic analysis of communities lacking Streptomyces showed that without this member the proteomic profile of the community was distinct from that of the complete MSC-2. Proteins from Sinorhizobium and Ensifer were the most abundant in a community lacking Streptomyces, while proteins from Ensifer and Sphingopyxis were more prevalent in the complete MSC-2. Major differences were also seen between the lipidomic profiles of MSC-2 with and without Streptomyces, with the breakdown of triglycerides more prevalent in complete MSC-2 communities. As complete MSC-2 communities grow far better than those lacking Streptomyces, this triglyceride breakdown may be a response to chitin being fully metabolized, forcing members to rely on other energy sources at the later timepoints of the experiments. Interestingly, polar metabolite abundances in the culture supernatants were relatively similar between communities with and without Streptomyces.

These results show that when keystone, chitin-degrading members are removed, other members, even those with the ability to degrade chitin, do not fill the same metabolic niche to promote community growth. In addition, highly connected members may be removed with similar or even increased levels of community growth and respiration. This suggests that removal of keystone members can have positive or negative effects on overall community growth, an outcome driven not only by the identity of the keystone member but also by the magnitude and type of the interactions it has with other members. The findings are critical to a better understanding of soil microbiology, specifically in how communities maintain activity when biotic or abiotic factors lead to changes in biodiversity in soil systems.

Crosslinking the Department of Energy Systems Biology Knowledgebase (DOE-KBase) and the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) to Support Protein Function Discovery | Henry | Argonne National Laboratory | Henry | Computational Biology | KBase

The Department of Energy Systems Biology Knowledgebase (DOE-KBase) and the RCSB Protein Data Bank (PDB) offer synergistic functionality to investigate and engineer proteins. The collection of systems biology data and tools in KBase enables scientists to analyze their datasets in the context of public data and share their findings. RCSB PDB provides access to >200K experimentally determined, rigorously validated, and expertly biocurated 3D structures of proteins and nucleic acids within the open PDB archive. The RCSB.org web portal provides a variety of tools for searching, analyzing, and visualizing 3D biostructure data together with annotations from ~50 public resources. RCSB.org now supports parallel delivery of >1M computed structure models from AlphaFold DB and RoseTTAFold. The objective of this project was to lay the foundation for interoperation of these two resources, streamlining the ability of users to leverage structural biology data and workflows provided by RCSB PDB within the KBase platform.

Many projects currently funded by DOE BER aim to mechanistically understand a wide range of complex biological systems with the ultimate goal of supporting the rational manipulation, prediction, and design of these systems. The large fraction of proteins with unknown or incompletely characterized function is one of the greatest impediments to this goal. Structural biology is central to resolving and understanding protein function, particularly with the advent of the AlphaFold2 and RoseTTAFold algorithms for rapidly predicting new computed structure models (CSMs) of proteins with accuracies comparable to that of low-resolution experimental structures. Yet, structural biology approaches are greatly amplified when combined with systems biology data and tools. Toward this end, the KBase and RCSB PDB teams collaborated to develop a series of applications within the KBase platform that leverage the powerful capabilities of RCSB.org data-delivery and API services for integrating PDB data into systems biology and structural biology workflows across KBase and RCSB PDB.

Researchers demonstrate these newly developed workflows with two exemplar scenarios. In the first scenario, researchers identify genes encoded by the bacterium Micrococcus luteus that are responsible for producing proteins capable of degrading pyridine, a toxic compound found in coal tar. This workflow demonstrates combined use of transcriptomics, mechanistic modeling, and chemoinformatics to propose candidate genes for a novel biochemical pathway for pyridine degradation. Researchers then apply the new KBase-RCSB PDB pipeline to: (1) rapidly search the PDB for experimental structures that are homologous to the candidate gene products; (2) seek experimental structures of proteins co-crystallized with pyridine to identify pyridine binding sites; and (3) import and view AlphaFold2-generated structures for the candidate gene products, comparing each predicted structure with the closest experimental structures represented in the PDB archive. Researchers then conduct structure motif searches at RCSB.org to further characterize the pyridine binding site and perform structure comparisons of the AlphaFold predictions with the collection of experimental PDB structures and CSMs available at RCSB.org. Ultimately, binding site analyses inform docking simulations in KBase that refine the gene candidates for the novel pathway.

In the second scenario, researchers exemplify concerted use of KBase and RCSB PDB to discover the unknown pyrimidine reductase (EC 1.1.1.193), a key enzyme in riboflavin (vitamin B2) biosynthesis in Arabidopsis thaliana. Using annotation, modeling, and gapfilling tools in KBase, the team confirmed that the gene encoding pyrimidine reductase was not identified in the Arabidopsis annotation selected for this scenario. Researchers applied two newly developed KBase-RCSB PDB interface apps. The PDB-Import PDB Metadata into KBase Genome app enabled the team to query PDB for experimental structures that match sequences of any of the gene products in the entire Arabidopsis genome. Doing so exposed two Arabidopsis gene products with significant similarity to multiple experimental structures of microbial proteins in PDB currently annotated as pyrimidine reductases. The team used the Query RCSB Databases for Protein Structures app to import and view these structures and to obtain links to views of these structures in RCSB.org. The team then used a capability within RCSB.org to compare the experimental structures of interest with Arabidopsis AlphaFold2 CSMs now available on RCSB.org. The team used the RCSB.org pairwise structure alignment tool to determine that AlphaFold2 CSMs of both Arabidopsis candidate gene products aligned to distinct portions of microbial pyrimidine reductase structures housed in the PDB. Although both Arabidopsis proteins are structurally similar, a detailed 3D analysis revealed that AT3G47390 lacks essential zinc-binding residues within its putative deaminase domain. Taken together with the observation that AT4G20960 had already been identified as the deaminase, this led to the hypothesis that AT3G47390 is the pyrimidine reductase (which was confirmed in publications).

The poster will display all KBase Narratives and RCSB tools applied in each of these scenarios, which are also described in detail in a publicly available training workshop on YouTube: https://www.youtube.com/watch?v=vs_UyhhtSFk&list=PLHib7JgKNUUf8Z8jSK57FsJrms94wpaL0&index=1

Integration of Enzyme Function Initiative Tools in the KBase Platform | Henry | Argonne National Laboratory | Henry | Computational Biology | KBase

Protein families of unknown function are a significant challenge facing the DOE BER research community because they prevent comprehensive metabolic reconstructions of both individual microorganisms and microbiome systems. While many tools in KBase and elsewhere today permit the discovery of completely new protein families, unfortunately very few tools exist, particularly in KBase, to study the function of these families. Fortunately, the Enzyme Function Initiative (EFI; http://enzymefunction.org) offers a suite of tools specifically designed to address this important problem. In this project researchers are working to fully integrate the EFI toolset into KBase, with complete ties to DOE BER sequencing sources including all sequence data in KBase as well as the JGI IMG Database.

Advances in computational methods and DNA sequencing now allow for single projects to generate tens to hundreds of metagenome sequences and potentially tens of thousands of isolate or metagenomically assembled genomes (MAGs) from diverse ecosystems. In theory, computational inference of the protein products encoded by these genomes, and the associated biochemical functions, should allow for the accurate prediction and modeling of key microbial traits, organismal interactions, and ecosystem processes that drive biogeochemical cycles. Unfortunately, the rate of generation of metagenomes, isolate genomes, and MAGs, along with related multiomic datasets, currently far outpaces the ability to translate these genome-enabled findings into ecosystem-informed predictive knowledge.

One of the most significant challenges currently inhibiting the understanding of complex biological systems from genomic and multiomic data is the staggering number of proteins that have completely unknown functions. About 50% of the proteins encoded by the genes in complete microbial genomes, and an even higher proportion of those encoded by microbial genes from environmental samples, cannot be reliably assigned a function. These unknown functions translate into large gaps in the metabolic reconstructions, prevent researchers from explaining more than 25% of most metabolomes, and obfuscate the functional interdependencies that guide the structure of all microbiome systems. Despite these challenges, virtually all of the functional annotation tools currently available in KBase and other platforms focus largely on assigning functions to proteins that are very similar to other proteins of known function (e.g. via propagation of function based on close sequence homology). Because the sequence boundaries between functions cannot be specified in the absence of orthogonal information, homology-based annotations often are incorrect. Tools are needed that are designed to integrate multiple sources of evidence to decode the functions of uncharacterized protein families and understand the limits of annotation propagation.

The Enzyme Function Initiative (EFI) toolkit is designed to fill this exact niche in protein function discovery. The EFI tool pipeline is comprised of three analysis steps: (1) generation of sequence similarity networks (SSNs) enabling the semi-automated reconstruction of high-quality protein families built around any protein sequence of interest (EFI-EST; https://efi.igb.illinois.edu/efi-est/); (2) parallel exploration of the genome neighborhood context of a protein family across a diverse set of input genomes to discover potential functionally linked gene products/enzymes that can be used to infer novel enzymatic functions and metabolic pathways (EFI-GNT; https://efi.igb.illinois.edu/efi-gnt/); and (3) determination of metagenome abundance of clusters in the SSNs for protein families using chemically guided functional profiling (CGFP) to discover the physiological/environmental context in which the proteins are expressed (EFI-CGFP; https://efi.igb.illinois.edu/efi-cgfp/). The EFI tools also provide links to structure data in Protein Data Bank (PDB) to gain further clues about protein function.
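The first step, building a sequence similarity network and reading protein families off its clusters, can be sketched as follows. This is an illustration only and not the EFI-EST implementation: the similarity score is a crude stand-in computed with Python's `difflib` rather than a real pairwise alignment, and the sequences, names, and thresholds are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy sequences; real input would be full-length proteins.
SEQS = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQK",
    "p3": "GGSSPPLLWW",
    "p4": "GGSSPPLLWF",
}


def similarity(a, b):
    """Stand-in similarity in [0, 1]; an alignment score would be used in practice."""
    return SequenceMatcher(None, a, b).ratio()


def build_ssn(seqs, threshold):
    """Return an adjacency map with an edge wherever similarity >= threshold."""
    edges = {name: set() for name in seqs}
    for a, b in combinations(seqs, 2):
        if similarity(seqs[a], seqs[b]) >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def clusters(edges):
    """Connected components of the network, found by depth-first search."""
    seen, found = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        found.append(component)
    return found


for cluster in clusters(build_ssn(SEQS, threshold=0.8)):
    print(sorted(cluster))
```

Raising the threshold splits clusters into smaller groups and lowering it merges them, which is why choosing the cutoff is typically the semi-automated, user-guided part of building a family.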

Researchers are deploying the EFI tools into the KBase platform, with re-engineering on the backend to permit a seamless integration with the KBase database of isolate, reference genome, metagenome, and MAG sequences. This will greatly enhance the value of the EFI tools to the DOE BER research community; greatly expand the ability of these tools to access more diverse sequence data and annotation sources (e.g., IMG); significantly ease the long-term maintenance of the EFI tools by linking to the KBase data update cycle; and greatly enhance the capacity for users of the KBase platform to study protein families of unknown function.

IMAGINE BioSecurity: Genome-Scale Engineering and Modeling for Secure Biosystems Design | Guarnieri | National Renewable Energy Laboratory | Suzuki | Biosystems Design | IMAGINE BioSecurity

The Integrative Modeling and Genome-scale Engineering for Biosystems Security (IMAGINE BioSecurity) Science Focus Area project seeks to establish an understanding of the behavior of engineered microbes in controlled versus environmental conditions to predictively devise new strategies for responding to biological escape. To this end, the IMAGINE Team integrates core capabilities in synthetic and applied systems biology to develop a high-throughput platform for the design, generation, and analysis of biocontainment strategies in industrially relevant and emerging, next-generation microbes.

Genetically modified organisms (GMOs) have emerged as an integral component of a sustainable bioeconomy, with an array of applications in agriculture and bioenergy. However, the rapid development of GMOs and associated synthetic biology approaches raises several biosecurity concerns related to environmental escape of GMOs, detection thereof, and impact upon native ecosystems. To establish a secure bioeconomy, novel biocontainment strategies—informed by a fundamental understanding of systems level governing mechanisms—are needed. To this end, the IMAGINE Team is developing an array of passive and active synthetic biocontainment strategies to effectively minimize laboratory escape frequency while concurrently maintaining maximal laboratory performance. Researchers have selected a series of non-model, industrial, and/or next-generation microbial hosts to serve as chassis for secure biosystems design, including Pseudomonas putida, Synechocystis sp. PCC6803, Clostridium ljungdahlii, Mycoplasma mycoides, and Saccharomyces cerevisiae.

To facilitate the analysis of combinatorial constructs in the target organisms, a method termed combinatorial genetics en masse (CombiGEM; Wong et al. 2016) for generating combinatorial genotypes en masse and tracking them in mixed populations using DNA barcodes and next-generation sequencing was implemented. Combinatorial biocontainment strategies are being developed and evaluated for the capacity to reduce GMO escape frequency in laboratory and environmental simulation settings. Additional efforts to target synthetic carbon, nitrogen, and phosphorus storage auxotrophies are under development. In parallel, researchers have initiated assessment of the metabolic burden associated with implementation of these strategies, with the goal of maximizing biocontainment while maintaining optimal microbial fitness in deployment settings. Engineered strains are experimentally analyzed via growth, escape frequency, and bioproductivity using high-throughput screening in laboratory and environmental mesocosm settings. Strains are concurrently subjected to fitness and escape frequency screening assays to assess the effect of genetic safeguards on strain fitness and biocontainment efficacy. Researchers have also initiated the assessment of the robustness and fitness of the engineered microbes via computational robustness and genome-scale metabolic modeling to understand the underlying mechanisms that govern the efficacy of biocontainment and metabolic fitness.
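Tracking barcoded combinatorial genotypes in a mixed population comes down to counting barcode combinations in sequencing reads and comparing their frequencies between pools. The sketch below is a simplified illustration, not the CombiGEM pipeline: it assumes each read is exactly two fixed-length barcodes concatenated, and the function names, pseudocount, and read data are all hypothetical.

```python
import math
from collections import Counter


def count_combinations(reads, barcode_len):
    """Count barcode pairs, assuming each read is two concatenated barcodes.
    Reads of any other length are skipped."""
    counts = Counter()
    for read in reads:
        if len(read) != 2 * barcode_len:
            continue
        counts[(read[:barcode_len], read[barcode_len:])] += 1
    return counts


def log2_enrichment(before, after, pseudocount=1.0):
    """Log2 ratio of each combination's frequency after vs. before selection.
    The pseudocount keeps combinations absent from one pool finite."""
    combos = set(before) | set(after)
    n_before = sum(before.values()) + pseudocount * len(combos)
    n_after = sum(after.values()) + pseudocount * len(combos)
    return {
        combo: math.log2(
            ((after[combo] + pseudocount) / n_after)
            / ((before[combo] + pseudocount) / n_before)
        )
        for combo in combos
    }


# Hypothetical reads from a pool before and after a growth challenge.
before = count_combinations(["AACC"] * 4 + ["GGTT"] * 4, barcode_len=2)
after = count_combinations(["AACC"] * 7 + ["GGTT"] * 1, barcode_len=2)

for combo, score in sorted(log2_enrichment(before, after).items()):
    print(combo, round(score, 3))
```

A positive score marks a combination that became more frequent in the later pool and a negative score one that was depleted. Real data would also need barcode error correction and replicate handling, which are omitted here.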

Systems level analyses of these hosts in the absence and presence of biocontainment constraints will elucidate principles that (i) govern effective biocontainment and laboratory performance and (ii) drive biological systems in their natural environments. These learnings will establish an extensive library of biocontainment modules and strains, testing platform, and systems knowledgebase, and lay the foundation for predictive design of biocontainment strategies with enhanced stability and resilience in diverse microbial hosts. Combined, these efforts will reduce the risk associated with deployment of GMOs, ultimately forwarding a secure bioeconomy.

IMAGINE BioSecurity: Mesocosm-Based Methods to Evaluate Biocontainment Strategies and Impact of Industrial Microbes Upon Native Ecosystems | Guarnieri | Biosciences Center | Guarnieri | Biosystems Design | IMAGINE BioSecurity

The Integrative Modeling and Genome-scale Engineering for Biosystems Security (IMAGINE BioSecurity) Science Focus Area project seeks to establish an understanding of the behavior of engineered microbes in controlled versus environmental conditions to predictively devise new strategies for responding to biological escape. To this end, the IMAGINE Team integrates core capabilities in synthetic and applied systems biology to develop a high-throughput platform for the design, generation, and analysis of biocontainment strategies in industrially relevant and emerging, next-generation microbes.

Genetically modified industrial production microbes and their associated bioproducts have emerged as an integral component of a sustainable bioeconomy. However, the rapid development of these innovative technologies raises biosecurity concerns, namely, the risk of environmental escape. Thus, the realization of a bioeconomy hinges not only on the development and deployment of microbial production hosts, but also on the development of secure biosystems and biocontainment designs. Current laboratory-based biocontainment testing systems do not accurately reflect complexities found in natural environments, necessitating an environmentally relevant analysis pipeline that allows for the detection of rare escapees, the effect of associated bio-products, and the impact on native ecologies. To this end, researchers have developed an approach that utilizes soil mesocosms and integrated systems analyses to evaluate the efficacy of novel biocontainment strategies and to assess the impact of production systems upon terrestrial microbiome dynamics. Researchers demonstrate the utility of this approach by modeling a perturbation event by contaminating the mesocosms with the industrial microbial chassis, Saccharomyces cerevisiae. The resultant data demonstrate that researchers can track the fate of the contaminating microbe with high sensitivity in the soil, as well as monitor broader impacts of the perturbation on the underlying soil microbiome with a high degree of spatial and temporal resolution. The findings presented here support the use of this mesocosm-based approach to assess the environmental impact of industrial microbes and to validate biocontainment strategies.

The Genetics of Pathogen and Microbiome Control in the Switchgrass Leaf | Lowry | GLBRC | Wallendael | Bioenergy | GLBRC

This study aims to uncover genetic regions responsible for governing switchgrass responses to both pathogenic and mutualist fungi, and to understand how these responses may be coordinated.

Leaf fungal microbes can be fundamental drivers of host plant success, as they include pathogens that devastate crop plants as well as taxa that enhance nutrient uptake, discourage herbivory, and antagonize pathogens. In a replicated diversity panel of biofuel switchgrass, researchers quantified genetic and environmental variation in leaf fungal relationships, both for the whole microbiome and for a specific pathogen, leaf rust. While fungal colonization of the leaf varies over space and time, researchers uncovered genome-wide associations (GWAs) with several informative loci. In particular, three cysteine-rich receptor-like kinase genes (crRLKs) were linked to a genetic locus associated with microbiome structure. Since each of these genes is consistently upregulated in switchgrass genotypes typically more susceptible to fungal disease, researchers conclude that they may play a central role in the plant's response to pathogens. Response to leaf rust is polygenic and environmentally sensitive, but resistance alleles are associated with higher biomass, indicating that breeding for rust-resistant plants will benefit growth without trade-offs in the absence of rust. Switchgrass response to fungal colonists is complex and variable, but an experimental design that accounts for variation over space and time allows for greater definition of the genetic loci underlying fungal interactions.

Systems Framework to Enhance the Potential of Camelina as Oilseed Crop | Grotewold | Michigan State University | Grotewold | Bioenergy | University

The project will (1) characterize the genetic variation, gene expression, and chromatin accessibility across Camelina varieties and growth conditions, and (2) develop the tools to understand and manipulate Camelina gene expression. An important goal is to identify key genes and genomic regions to target in breeding efforts to enhance productivity. A further objective is to make available to the research community a set of tools and resources for Camelina of a kind that has so far been limited to model plant systems.

The adoption of Camelina sativa as an industrial oilseed crop hinges on being able to increase its modest yield. This is in part constrained by the limited knowledge of the gene regulatory networks responsible for plant growth and responses to the environment, and by a poor understanding of the genetic diversity and gene content across Camelina accessions. To address these shortcomings, researchers have started the development of a Camelina transcription factor (TF) open reading frame (ORF) collection (TFome) that will accelerate the discovery of protein-DNA interactions. For example, Camelina TF ORFs were used for carrying out DNA affinity purification with high-throughput sequencing (DAP-seq) towards the identification of candidate fatty acid regulators (Gomez-Cano et al. 2022). Researchers have standardized conditions for embryo and seed Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) to identify accessible chromatin regions and compare them between the three subgenomes that comprise the hexaploid genome of C. sativa. Researchers are also in the process of completing the sequencing of a new version of the Camelina variety Suneson. Finally, researchers are standardizing conditions for the growth and transformation of the diploid and tetraploid precursors of hexaploid varieties, to facilitate crop resynthesis and introduction of novel genetic diversity.

A Temporal Atlas and Response to Nitrate Availability of 3D Root System Architecture in Diverse Pennycress (Thlaspi arvense L.) Accessions | Sedbrook | Illinois State University | Griffiths | Bioenergy | University

This project employs evolutionary and computational genomic approaches to identify key genetic variants that have enabled Thlaspi arvense L. (Field Pennycress; pennycress) to locally adapt and colonize all temperate regions of the world. This, combined with knowledge of metabolic and cellular networks derived from first principles, guides precise laboratory efforts to create and select high-resilience lines, both from arrays of random mutagenesis and by employing cutting-edge CRISPR genome editing techniques. This project will deliver speed-breeding methods and high-resilience mutants inspired by natural adaptations and newly formulated biological principles into a wide range of commercial pennycress varieties to precisely adapt them to the desired local environments.

Roots have a central role in plant resource capture and are the interface between the plant and the soil, affecting multiple ecosystem processes. Field pennycress (Thlaspi arvense L.) is a diploid annual cover crop species that has potential utility for reducing soil erosion and nutrient losses and has oil-rich seeds (30-35% oil) suitable for biofuel or high-protein animal feed. The objectives of this research were to (1) precisely characterize root system architecture and development, (2) understand adaptive responses of pennycress roots to nitrate nutrition, and (3) determine genotypic variance available in root development and nitrate plasticity. Using a root imaging and analysis pipeline, researchers characterized 4D pennycress root system architecture under four nitrate treatments across time. Significant nitrate condition responses and genotype interactions were identified for many root traits, with a greater impact on lateral root traits. In trace nitrate conditions, a greater lateral root count, length, and interbranch density and a steeper lateral root angle were observed compared to high nitrate conditions. Genotype-by-nitrate-condition interactions were observed for root width, width-depth ratio, mean lateral root length, and lateral root density. Further, a large format mesocosm system has been developed and used to visualize root system architecture of mature plants with neighbors. Using this system, researchers are assessing how variation in pennycress root system architecture can affect ecosystem service and abiotic stress tolerance scaling from single plant to canopy level traits. These results illustrate root trait variance available in pennycress accessions and useful targets for breeding of improved nitrate-responsive cover crops for greater productivity, resilience, and ecosystem service.

Understanding Plant Signaling via Innovations in Probe Delivery and Imaging | Greenberg | The University of Chicago | Greenberg | Bioimaging | University

The team aimed to optimize nanofibers to deliver DNA expression constructs to plant cells and to develop and use a custom-built fiber optic microscope and image analysis platform that enables iterative, non-destructive measurements of plant tissues over time. These tools were developed together with research aimed at understanding receptor-mediated trafficking of the growth-promoting PSK peptide and the responses to it. The goals are (a) to use the microscope to image the trafficking of a fluorescent bioactive peptide and its receptor; (b) to improve and test different nanofiber designs for delivering probes to plants; and (c) to further discover and validate the transcriptional changes due to PSK-induced signaling.

Microscope: The team upgraded the fiber optic microscope to optimize plant stabilization and imaging. The microscope includes 2 LED light sources plus a new white LED for brightfield imaging and interchangeable fiber optic lenses with different magnifications. The team created a mount to allow upright imaging, fabricated a 3D-printed leaf clip, and mounted the fiber on an extensible arm with 5-axis control (X-Y-Z plus pitch and yaw) for precise sample manipulation and fine focus to obtain high-resolution, iterative micrographs using live plants. With these refinements, the team observed receptor-dependent trafficking of the bioactive peptide PSK-TAMRA from one side of a leaf to the other.

Nanofibers: The team reported success in using vertically aligned carbon nanofiber arrays (VACNFs) to deliver DNA constructs to various plant tissues and obtain their expression there (Morgan et al. 2022). Through a user proposal at the Center for Nanophase Materials Sciences, the team designed and implemented a strategy to transfer VACNFs from a rigid silicon substrate to SU-8, a flexible substrate. Importantly, this permits use of the fibers to deliver reagents to curved plant structures. To overcome the hydrophobicity of SU-8, fibers in the flexible film were coated with a 2-3 nm layer of silicon oxide. Using a rolling motion to drive fibers through plant cells, the team succeeded in delivering DNA and dye to various curved plant organs. The team has been invited to submit a manuscript to JoVE Journal that includes these recent innovations.

Biological materials/deliverables: The team conducted a time series transcriptomic analysis of root and shoot responses to PSK and revealed tissue-specific and time-dependent plant responses to this peptide hormone. The study also included a comprehensive analysis of PSK effects on whole seedlings during early development. The team found that PSK down-regulates the expression of a specific transcription factor family that regulates plant defense genes. This family is also the most enriched transcription factor family among the PSK down-regulated genes that are associated with plant immunity. By comparing the transcriptome data with publicly available data, the team found that PSK has the opposite regulatory effects on that transcription factor family and defense-related genes compared with the responses triggered by the microbe-associated molecular pattern peptide flg22. This observation may explain the antagonism between these two peptide ligands (PSK and flg22). Researchers are currently testing predictions from the RNA-seq experiments using qPCR and physiological and biochemical readouts, and are preparing a manuscript based on the findings.

DOE-funded research benefits for dissemination and deployment of bioimaging technology:

  1. A major advance is the iterative, non-destructive fluorescence imaging of bioactive peptides, their receptors and output signaling responses in intact plants that are highly relevant to improving traits for energy applications. This includes documenting changes in growth parameters and cell longevity and the accompanying signaling events.
  2. Nanofibers for introducing non-permeable probes and biomolecules into plant cells accelerate the discovery of plant signaling response components in many species in response to many stimuli/environmental conditions. Fibers also serve as fiducial markers for the iterative imaging developed. Finally, the approach can also be used for genome editing.

Metagenomic Insights into Microbial Traits Influencing Community Dynamics and N Cycling Post-Fire | Glassman | University of California–Riverside | Nelson | Environmental Microbiome | University

Climate change coupled with shifting land use patterns has increased the frequency and size of wildfires across the globe. These wildfire disturbances deplete soil microbial biomass and alter the community composition of the soil microbiome, which drives critical terrestrial biogeochemical cycling across ecosystems. Through this project, the team aims to couple field-derived metagenomic datasets, biogeochemical data via experimental pyrocosms, and ecosystem models to evaluate the impact of wildfire on microbially mediated N cycling across ecosystems (Mediterranean grasslands, chaparral shrublands, coniferous forests).

Wildfires, which are increasing in both frequency and severity with climate change, reduce soil microbial biomass and alter the community composition of the soil microbiome, selecting for pyrophilous taxa with encoded traits that enable them to thrive in burned soil. The soil microbiome plays a vital role in biogeochemical cycling and ecosystem function and is an important player in terrestrial nitrogen (N) cycling, but it is poorly understood how the altered post-fire soil microbiome community composition influences microbially mediated soil N cycling and subsequent emissions of greenhouse gases (GHGs) like nitrous oxide from post-fire ecosystems. Multiomics (i.e., metagenomics and metatranscriptomics) data allow researchers to infer lifestyle traits (e.g., using Grime’s C-S-R framework) and function of pyrophilous taxa in post-fire soils. Through this project, the team has compiled an extensive multiomic dataset including 108 metagenomes and 12 metatranscriptomes from fire-impacted Colorado coniferous forests representing different burn severities (low and high severity) and across time (60-year chronosequence to 1-year post-fire). These sequencing efforts, totaling nearly 9 Tb of data, have resulted in 1651 metagenome-assembled genomes (MAGs) that span the Actinobacteria (n = 861 MAGs), Proteobacteria (n = 315), Acidobacteria (n = 115), along with 17 other bacterial phyla. Further, putative pyrophilous taxa from previous studies are represented, including the Actinobacteria Arthrobacter (n = 14) and Blastococcus (n = 11), and Proteobacteria Massilia (n = 9). Data from 1 year post-fire (Nelson et al. 2022) revealed that pyrophilous traits (e.g., fast growth, heat resistance, ability to use pyrogenic carbon) were critical in the post-fire soil microbiome, with their importance increasing with increased burn severity.
Further, the dominance of MAGs exhibiting these traits was coupled to the loss of N cycling functions, including the absence of evidence for the expression of the bacterial gene catalyzing N fixation (nifH) and loss of both nitrifying taxa (Nitrospira) and genes (amoA and nxrAB) in severely burned surface soils. Further analyses of MAGs derived from longer-term studies (3, 5, and 11 years post-fire to 6 decades post-fire) and other ecosystems (i.e., California grasslands and chaparral shrublands) will reveal whether these short-term influences on N cycling are unique to Colorado coniferous forests and whether they recede with time following burning. Combined, these datasets will shed light on the impact of wildfire on ecosystem N losses and the emission of GHGs from wildfire-impacted landscapes.

Predicting Post-Fire N Cycling through Traits and Cross-Kingdom Interactions | Glassman | University of California–Riverside | Glassman | Environmental Microbiome | University

Wildfires are increasing in frequency, size, and severity across the globe. Unlike ecosystem disturbances that primarily impact vegetation, wildfires kill microbes, thereby dramatically altering the composition, function, and abundance of post-fire soil microbiomes, with downstream impacts on soil nitrogen (N) cycling. Despite widespread microbial mortality during fires, post-fire environments can also favor the growth of pyrophilous (fire-loving) microbes. Pyrophilous microbes have been documented in widespread post-fire habitats, yet their traits and impacts on ecosystem N losses remain largely uncharacterized. Here, researchers focus on how wildfire severity and pyrophilous microbial interactions regulate N cycling and the emission of greenhouse gases (GHGs) like nitrous oxide, a powerful greenhouse gas with 300 times the warming potential of carbon dioxide, with implications for long-term ecosystem recovery, regional air quality, and Earth’s climate. The overarching project goal is to answer the question: Do conserved genomic traits and cross-kingdom interactions drive post-fire N cycling across ecosystems?

Understanding how microbial interactions and their traits govern N cycling is critical to forecasting post-fire soil N dynamics and ecosystem recovery. Using pyrophilous microbiomes as model systems, the team will scale up across systems of increasing complexity, from individual genomes to more complex microbiomes to predict the impacts of wildfire disturbance on ecosystem N cycling with the DEcomposition Model of ENzymatic Traits (DEMENT). Across three ecosystems that are experiencing increased fire frequency (Mediterranean grasslands, chaparral shrublands, montane coniferous forests) the team asks: (1) How do microbial traits change during post-fire succession? (2) How does fire severity influence microbial succession and gene expression of N cycling functions? (3) How do cross-kingdom interactions change during post-fire succession? (4) How do traits and interactions affect ecosystem N fates and cycling?

To answer these questions, the team will (1) identify putative pyrophilous traits across microbiota and biomes and cross-kingdom interactions among archaea, bacteria, fungi, and viruses that affect N cycling genes and biogeochemistry using metagenomic datasets and culture- and microcosm-based assays; (2) test and refine the trait predictions by coupling microbiomes and N cycling genes to N biogeochemistry via experimental pyrocosms to simulate soil heating under controlled and replicable conditions; and (3) scale up microbial traits and interactions to the ecosystem level by integrating the measurements with the trait-based DEMENT model. Microbiological insights assessed via genomics, metagenomics, metatranscriptomics, and viromics will be paired with biogeochemical approaches that combine traditional laboratory assays with isotopic approaches to track the production and consumption of substrates involved in GHG emissions. Experimental approaches and modeling will span microbial domains (archaea, bacteria, fungi, and viruses) and diverse fire-impacted ecosystems (Mediterranean grasslands, chaparral shrublands, and montane coniferous forests) to assess the generality of the results at broad spatial and temporal scales.

Optimizing Enzymes for Plastic Upcycling using Machine Learning Design and High Throughput Experiments | Gauthier | Dana-Farber Cancer Institute | Fram | Bioenergy | University

This project aims to create new and optimized polyethylene terephthalate (PET)-depolymerizing enzymes (PETases) useful for industrial application. [Aim 1] Design novel PETases that are significantly different (25-65+ mutations) from known PET-depolymerizing enzymes and contain unique properties useful for performant enzymatic PET recycling and upcycling. Introducing many simultaneous mutations, while maintaining function, will enable researchers to more efficiently search for altered properties that depend on primary amino acid sequence. [Aim 2] Optimize previously described PETases by testing millions of mutagenized variants using directed evolution. Starting with existing functional PETases and exploring small changes in many distinct sequences using a novel ultra-HTP functional assay, researchers will optimize enzymes with improved properties by varying experimental conditions. [Aim 3] Characterize performance metrics of new and optimized PETases in detail including solvent tolerance, stability, catalytic rate, and substrate promiscuity.
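Aim 1 describes designs by how many mutations separate them from known enzymes (25-65+). A minimal sketch of that count follows, assuming the two sequences are already aligned to equal length and that only substitutions are counted. The function names and the toy sequences are hypothetical and are not taken from the project.

```python
def count_mutations(reference: str, design: str) -> int:
    """Count positions at which two pre-aligned, equal-length sequences differ."""
    if len(reference) != len(design):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for ref_aa, new_aa in zip(reference, design) if ref_aa != new_aa)


def meets_divergence_target(reference: str, design: str, minimum: int = 25) -> bool:
    """True if the design carries at least `minimum` substitutions.

    The default of 25 mirrors the lower bound quoted in Aim 1.
    """
    return count_mutations(reference, design) >= minimum


if __name__ == "__main__":
    # Toy peptides for illustration only; these are not real PETase sequences.
    reference = "MKTAYIAKQR"
    design = "MKSAYLAKQK"
    print(count_mutations(reference, design))
    print(meets_divergence_target(reference, design))
```

A real comparison would start from a proper sequence alignment and would need a rule for insertions and deletions, which this sketch deliberately leaves out.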

Plastic use is ubiquitous in the modern world, and PET is one of the most abundantly produced plastics (and the most highly produced polyester), with ~65 million metric tons manufactured annually. To the consumer, PET is likely most recognizable as the plastic used to make beverage bottles. As with many plastics, traditional mechanical or chemical means of PET deconstruction and upcycling are costly and inefficient.

Recently, biological enzymes capable of breaking down PET into its basic building blocks (terephthalic acid and ethylene glycol) have garnered significant attention as an attractive means of dealing with the plastic problem. These enzymes are currently undergoing pilot studies for implementation in enzyme-based recycling. However, there are significant limitations to current enzymes, including the need to perform costly pre-processing of the plastic waste before the enzymes are able to work. Further optimization of these enzymes is necessary to make the process profitable and thereby incentivize commercialization of this biology-based green recycling technology.

This project aims to apply recent advances in artificial intelligence and machine learning to design new versions of enzymes capable of breaking down PET. Based on preliminary experiments using this evolution-informed computational design strategy, researchers believe it is possible to create a highly diverse set of enzymes that have exceptional properties useful for industrial recycling. Testing these enzymes is typically labor intensive, but using a new robotically enabled platform, researchers will be able to experimentally characterize key enzymatic properties of thousands of these designed enzymes.

In addition to applying machine learning approaches to design new enzymes, the team has developed a novel method that, by experimentally testing millions of small changes to enzyme structure, enables optimization of existing enzymes that are known to break down plastic. The key to this approach is to encapsulate individual variants of each enzyme in single droplets together with plastic nanoparticles, creating a mini reaction, and then to select the droplets that successfully break down PET in order to isolate the winning enzyme variants.

Ultimately, the result of these studies will be the discovery of highly optimized enzymes capable of breaking down PET plastics in an industrial recycling setting, enabling a powerful and green solution to the plastic problem.

 

Structure and Molecular Mechanism of a Bacterial ADP-forming 4-Coumarate CoA Ligase | Fox | University of Wisconsin−Madison | Chaudhury | Bioenergy | GLBRC

Acyl-CoA ligases are enzymes that catalyze the adenosine triphosphate (ATP)-dependent conjugation of carboxylic acids to coenzyme-A (CoA) and enable the entry of these acids into metabolism as activated CoA thioesters. These CoA thioesters are important metabolic intermediates that feed carbon from the organic acids into a variety of anabolic and catabolic pathways allowing for the construction/destruction of diverse molecular scaffolds, thereby enabling cellular function. The aim of this project is the identification and biochemical characterization of acyl-CoA ligases from plant and microbial sources with the end goal of enabling precision metabolic engineering of the important bioenergy crop Populus trichocarpa (black cottonwood) with a focus on lignification.

4-Coumarate CoA ligases (4CLs) are a subset of acid-thiol ligases (E.C. 6.2.1.-) that catalyze the ATP-dependent thioesterification of 4-coumarate to CoA. In plants, these enzymes act upstream in the phenylpropanoid pathway and are therefore attractive targets for metabolic engineering, as they provide control of metabolic flux over the downstream reactions that are rich in valuable and useful metabolites (Vogt 2010). 4CLs are also present in microbes, where they function in aromatic degradation pathways that proceed through CoA thioesters. The researchers report the structure and mechanism of a bacterial 4CL (ferA) from the ferulic acid degradation cluster of the lignin-degrading bacterium Sphingomonas SYK-6 (Masai et al. 2002). This is the first reported crystal structure of a conformation of a bacterial 4-coumarate CoA ligase. Catalysis in ferA proceeds via a reversible central metabolism-like ADP-forming mechanism through an acyl-phosphate intermediate instead of the AMP-forming mechanism widely reported for plant 4CLs, which proceeds through an acyl-adenylate intermediate. The co-crystal structure of ferA (2.46 Å, Rwork 22%, Rfree 25%), crystallized in the pre-ATP hydrolysis conformation bound to inorganic phosphate, feruloyl-CoA, and the non-hydrolysable ATP analogue AMPPNP, offers insight into the catalytic machinery and all the ligand binding sites on the ferA enzyme, and sheds light on the molecular interactions that enable enzyme specificity and catalysis. Researchers combine these structural insights with optical and NMR spectroscopic studies of the forward and reverse ligase reactions to elucidate the role of protein conformation and its intricate links to ligand binding and group transfer catalysis.

Extracting Functional Traits from Large Volumes of Field Phenomics Data | Eveland | Donald Danforth Plant Science Center | Gonzalez | Bioenergy | University
  • Use the University of Arizona’s Field Scanner to collect high spatial and temporal resolution field phenomics data on 430 ethyl methanesulfonate (EMS)-mutagenized families in the BTx623 background under well-watered and water-limited conditions.
  • Develop software and machine learning (ML) models to extract fine-scale phenotypic trait data at individual plant and organ levels from field phenomics data.
  • Leverage fine-scale phenotypic trait data to study genotype-phenotype associations in response to drought to facilitate discovery of genes and their functions.

Studying dynamic plant responses to environmental conditions has historically been difficult due to the low throughput and long-term cost of longitudinal data collection in the field setting (Reynolds et al. 2019). Recent technological advances have resulted in small, low-cost, and high-resolution sensors that can be used to rapidly collect phenotypic trait data at regular time intervals in field or greenhouse settings (Li et al. 2020; Sooriyapathirana et al. 2021). Today, high spatial and temporal resolution field phenomics data is being collected to extract information on dynamic plant responses to biotic and abiotic stress under real-world field conditions. When phenotypic trait data from multiple sensors are combined, a multidimensional understanding of plant morphology and physiology can facilitate the discovery of genes and their functions. The University of Arizona is home to the world’s largest outdoor plant phenotyping system, the Field Scanner, which encompasses numerous sensors for collecting plant phenotypic trait data. The Field Scanner collects red-green-blue (RGB), photosystem II (PSII) chlorophyll fluorescence, and thermal images as well as 3D point clouds using laser scanners. The Field Scanner raw data is being processed using PhytoOracle (PO), a series of scalable, modular phenomic data processing pipelines (Gonzalez et al. 2022). The processed data generated by PO is enabling the extraction of increasingly fine-scale traits at various levels, from the field to plot and whole plant to individual organs. To extend phenotyping capabilities, novel machine learning (ML) algorithms are being developed to extract multidimensional phenomics datasets that can be mined for genotype-phenotype associations related to abiotic stress.

ML models that aim to segment plant point clouds can provide fine-scale phenotypic data on plant morphology at individual plant and organ levels (Figure 1). Various traditional shape descriptors can be extracted, including height, volume, and angle. Additionally, topological data analysis (TDA), a mathematical framework for studying the underlying connections of points and their properties, can be used to study shape nuances that may not be captured by traditional shape descriptors. Persistence diagrams and Euler characteristic curves are common TDA methods, which aim to capture topological signatures that summarize shape features from which shape nuances can be studied (Amézquita et al. 2022; Amézquita 2020; Chazal and Michel 2021). A variety of traditional and TDA shape descriptors are being collected from 430 EMS-mutagenized sorghum families in the BTx623 background under well-watered and water-limited conditions across the life cycle of plants. These shape descriptors are being leveraged to identify genotype-phenotype associations related to drought stress to enable gene discovery. This work will identify induced variation in drought-resilient traits for enhancing the productivity of bioenergy crops under drought conditions through fine-scale phenotyping. This information will drive the breeding and engineering of improved, climate-resilient varieties capable of maintaining productivity under limited resources.
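To make the Euler characteristic curve mentioned above more concrete, the following is a minimal sketch on a simplified input. It assumes a small 2D grid of values (for example, a rasterized height map) rather than the 3D point clouds used in the project, and it treats each grid cell as a filled square, so that the Euler characteristic is vertices minus edges plus faces. The function names and the toy grid are hypothetical.

```python
def euler_characteristic(cells):
    """Euler characteristic (V - E + F) of a set of filled unit squares.

    `cells` is a set of (row, col) positions. Corners and sides shared by
    neighbouring squares are counted once because they are stored in sets.
    """
    vertices, edges = set(), set()
    for r, c in cells:
        vertices.update([(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)])
        edges.update([
            ((r, c), (r, c + 1)),          # top side
            ((r + 1, c), (r + 1, c + 1)),  # bottom side
            ((r, c), (r + 1, c)),          # left side
            ((r, c + 1), (r + 1, c + 1)),  # right side
        ])
    return len(vertices) - len(edges) + len(cells)


def euler_characteristic_curve(grid, thresholds):
    """Euler characteristic of the sublevel set {value <= t} for each threshold t."""
    curve = []
    for t in thresholds:
        filled = {
            (r, c)
            for r, row in enumerate(grid)
            for c, value in enumerate(row)
            if value <= t
        }
        curve.append(euler_characteristic(filled))
    return curve


if __name__ == "__main__":
    # Toy grid: low corners, higher sides, highest centre.
    grid = [
        [1, 2, 1],
        [2, 3, 2],
        [1, 2, 1],
    ]
    print(euler_characteristic_curve(grid, [0, 1, 2, 3]))  # [0, 4, 0, 1]
```

In this toy case the curve records four separate corner cells at threshold 1, a ring with one hole at threshold 2, and a solid block at threshold 3, which is the kind of shape signature a single height or volume value would not capture.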

Probing Photoreception with New Quantum-Enabled Imaging | Evans | Pacific Northwest National Laboratory | Evans | Bioimaging

This project will develop new hybrid quantum-enabled imaging platforms that combine advances in adaptive optics, quantum entanglement, coincidence detection, ghost imaging, quantum phase-contrast microscopy, and multidimensional nonlinear coherent spectromicroscopy to characterize photoreception. This approach has three main aims that are intended to be developed in parallel. The first two aims focus on developing new quantum imaging approaches in which entangled photons will be employed to investigate biological samples with increased spatial resolution (Aim 1) and detection sensitivity (Aim 2) while permitting lower flux or sample interrogation with lower-energy photons. Aim 3 focuses on using coherent (non-entangled) photons and four-wave mixing to visualize photoreception and other quantum coherent processes occurring naturally within biosystems to better track ultrafast protein dynamics and the flow of metabolites between compartments in real-time.

During the current project period, researchers installed the Leica and Olympus optical microscopes with fluorescence, coherent anti-Stokes Raman scattering, stimulated Raman scattering, and magnetic particle imaging modes. The team also installed the Picoemerald optical parametric oscillator and two Coherent lasers to be used in Aims 1–3. Work was initiated on both the ghost imaging and the quantum phase contrast imaging, with milestones of developing the theory and numerical simulation for these aims as well as developing the control software for the necessary spatial light modulator. For Aim 2, researchers demonstrated the ability to perform quantum-enhanced phase imaging without coincidence counting, and for Aim 3, 1.2 picosecond quantum beats have been detected with time-resolved coherent Raman scattering. Finally, the team expressed multiple proteins involved with photoreception and has begun their structural and spectroscopic characterization, and researchers have cultured all cell types that will be imaged as part of the testing and commissioning phases. Looking forward to the next project period, plans are to finish development of two-color entangled quantum ghost imaging and to begin applying Aims 1–3 to quantum-enabled imaging and probing of biological samples. In this poster, the team will detail the progress from the project and highlight future plans for further advancing the technologies as well as making them available to the broader research community.

Unraveling Spatiotemporal and Physicochemical Constraints on Soil Viral Community Composition and Viral Particle Integrity | Emerson | University of California–Davis | Emerson | Environmental Microbiome | University

The overarching goal of this project is to assess and compare the contributions of active, infectious viruses and inert viral particles to biogeochemistry across diverse terrestrial ecosystems. Using a multiomics approach, the team seeks to establish spatiotemporal patterns in soil viral community composition and activity linked to host carbon and nitrogen metabolism in grasslands, shrublands, woodlands, and wetlands. Leveraging a prescribed forest fire and a peatland temperature and atmospheric CO2 manipulation experiment also allows exploration into feedbacks between soil viruses and carbon dynamics in response to environmental change. Through laboratory experiments, researchers are investigating the chemical composition, fate, transport, and integrity of viral particles in soil. By integrating field and laboratory experiments across a variety of soil edaphic properties and spatiotemporal scales, this project is expanding understanding of the soil virosphere and its influence on carbon and nutrient cycling.

Viruses have been recognized as highly abundant but poorly characterized members of the soil microbiome. By infecting soil microbes, viruses likely have substantial impacts on terrestrial biogeochemical processes under their hosts’ control. Viral particles (virions) may also play more direct roles in soil biogeochemical cycling as packets of carbon, nitrogen, and phosphorus, but the time scales and environmental conditions that determine virion infectivity, transport, and/or sorption to soil particles are unknown. This project uses a combination of field, laboratory, and computational approaches to distinguish between infective and inert virions and to assess their respective contributions to soil biogeochemical cycling.

Using a post-0.22 µm ‘viral size-fraction’ metagenomics (viromics) approach, researchers are exploring the conditions and temporal scales over which virions are produced, remain infective, and decay in soil. Research has shown that viromes can recover ~500 times more viral sequence than total metagenomes, but at the start of this project, it was unknown whether viromes reflected recent or long past infections. Results suggest that soil viromes generally capture very recently active viral communities, particularly in moist soil, but can reflect earlier infections in less active communities (e.g., in seasonally dry soils) and can be dominated by compromised (inert) viral particles after extreme temperature perturbations (e.g., heating to 90 °C or freezing).

The project’s interpretation that most soil viromes capture a short window of recent activity is consistent with repeated findings of highly divergent soil viral communities over spatial distances as short as 1 m in nearly all habitats explored thus far. Results from the first 2 years of this award suggested that soil viral communities were so distinct by site on a regional scale that more localized habitat comparisons would be more tractable (Durham et al. 2022). Seven distinct wetland habitats (sites) were thus sampled over a 0.6 km² area in the Bodega Bay Natural Reserve on the California Pacific Coast. Viral communities were most distinct by site, with few populations shared between sites. Secondarily, viral communities with similar habitat characteristics (e.g., plant community composition and/or salinity) were most similar. Although reducing the spatial area of the study and selecting seemingly similar habitats (all wetlands) improved resolution of viral ecological patterns, ongoing efforts are focused on further reducing complexity.
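One common way to quantify how distinct two communities are is the Bray-Curtis dissimilarity between their abundance profiles. The sketch below shows that calculation under stated assumptions: the abstract does not say which metric the project uses, and the function name, population labels, and counts are hypothetical.

```python
def bray_curtis(profile_a: dict, profile_b: dict) -> float:
    """Bray-Curtis dissimilarity between two abundance profiles.

    Each profile maps a viral population label to a non-negative abundance.
    Returns 0.0 for identical profiles and 1.0 when no population is shared.
    """
    total = sum(profile_a.values()) + sum(profile_b.values())
    if total == 0:
        raise ValueError("at least one profile must contain non-zero abundance")
    populations = set(profile_a) | set(profile_b)
    shared = sum(
        min(profile_a.get(p, 0), profile_b.get(p, 0)) for p in populations
    )
    return 1.0 - 2.0 * shared / total


if __name__ == "__main__":
    # Hypothetical abundance tables for two sampling sites.
    site_1 = {"vOTU_1": 10, "vOTU_2": 5}
    site_2 = {"vOTU_2": 5, "vOTU_3": 10}
    print(round(bray_curtis(site_1, site_2), 3))  # 0.667
```

Values near 1 for most site pairs would correspond to the pattern described above, in which few viral populations are shared between sites.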

Centered on a highly spatiotemporally resolved viromic study of two habitats in the Jepson Prairie grassland (eight locations, 30 time points since November 2020), ongoing work seeks to unravel the relative contributions of space, time, habitat, and dispersal on patterns of soil viral community composition. Briefly, viromes at Jepson Prairie were most distinct by habitat (between mounds and their adjacent swales, defined by differences in topography, plant community composition, and hydrology), but viral community compositional patterns over time were different within each habitat. The four swales exhibited similar viral community successional patterns over time, likely reflecting greater mixing in the swale habitats via intermittent flooding, whereas the four mounds (which rise ~0.5 m above swales and never flood) were more distinct over space, reflecting dispersal limitation. Analysis of the full dataset of >300 viromes is ongoing.

To investigate physicochemical constraints on virion integrity and viral community composition, researchers are analyzing data from three burned habitats and a laboratory temperature manipulation experiment. In shrublands and woodlands that burned during the dry season in the LNU Complex Fires in August 2020, the team is characterizing viral community successional dynamics after fire. A prescribed burn in a mixed conifer forest in Spring 2021 was also leveraged to compare burned and unburned soil viral communities. Preliminary results suggest that habitat and location differentiate viral community composition more than the impact of fire, but viral richness was lower post-fire than in unburned or pre-fire soils. The degree of virion inactivation and the timing of viral community recovery post-fire seem to depend on soil moisture and depth. Briefly, heating experiments are revealing that viral particle ‘survival’ thresholds are similar to those known for bacteria, with a reduction in survival at 60 °C and nearly complete removal of intact virions (with some inert, compromised virions remaining) at 90 °C.

Together, these results are revealing that the relative importance of spatial distance (and dispersal), time, and environmental conditions in structuring viral communities varies. With substantial differences in environmental parameters, habitat seems to trump all other factors at both local and global scales, but in local environments under similar conditions, space and time can be important. Results from this project are facilitating a better understanding of viral contributions to terrestrial biogeochemical cycling, both as dynamic components of soil organic matter and through their infection of hosts responsible for carbon and nutrient cycling.

Community Engagement Strategies of the National Microbiome Data Collaborative (NMDC) | Eloe-Fadrosh | Lawrence Berkeley National Laboratory | Kelliher | Environmental Microbiome | NMDC

The vision of the National Microbiome Data Collaborative (NMDC) is to connect data, people, and ideas to advance microbiome innovation and discovery. With this vision in mind, the NMDC seeks to support a findable, accessible, interoperable, and reusable (FAIR) microbiome data sharing network through infrastructure, data standards, and community building that addresses pressing challenges in environmental sciences. The NMDC engagement strategy focuses on promoting a collaborative ecosystem for diverse microbiome researchers and implementing community feedback in all of the NMDC efforts and products.

The NMDC is a multi-national laboratory initiative focused on advancing innovation and discovery in the field of microbiome science through the project’s development of products and tools for the environmental microbiome research community (Wood-Charlson et al. 2020). The NMDC provides the community with three products: (1) The Submission Portal, (2) The Data Portal (Eloe-Fadrosh et al. 2022), and (3) NMDC EDGE, each aimed at making multiomics microbiome data FAIR. Each of these products was designed to address the needs and wants of the larger research community. The NMDC team implements recommendations and insights gleaned from usability testing and community feedback to continuously improve products. The team routinely engages with microbiome researchers to discuss how they want the NMDC products to look and operate, as well as to understand what new functionality would benefit future research. In addition, the NMDC communicates and engages with many types of stakeholders, including funding agencies, publishers, institutions, programs, projects, and individual scientists. As part of these collaborative efforts, the NMDC hosts and co-hosts workshops (Vangay et al. 2021), webinars, presentations, panel discussions, and other events aimed at spreading awareness of and lowering barriers to adoption of FAIR principles in microbiome research and data generation. The NMDC Ambassador program allows early career researchers to host some of these events, thus expanding the overall reach of the content and training materials, while providing the Ambassadors with valuable experiences and career opportunities. The NMDC Champions program brings together microbiome researchers from diverse backgrounds to contribute to the NMDC (e.g., by beta-testing the NMDC products, co-authoring publications with the NMDC team, and providing feedback). The NMDC will continue to prioritize community engagement as the products and network grow.

Refactoring Metabolism to Control the Persistence of Genetically Engineered Microorganisms in the Environment | Egbert | Pacific Northwest National Laboratory | Elmore | Biosystems Design | Persistence Control of Soil Microbiomes

The Persistence Control Science Focus Area (PerCon SFA) at PNNL is focused on developing fundamental understanding of factors governing the persistence of engineered microbial functions in rhizosphere environments. From this understanding, the team will establish design principles to control the environmental niche of native rhizosphere microbes for the model bioenergy crop sorghum through data-driven genome reduction and engineered metabolic addiction to plant root exudates. These principles will lead to secure plant-microbe biosystems that promote stress-tolerant and highly productive biomass crops.

Persistence control is an engineering approach in which survival of genetically modified microorganisms is restricted to a target environmental niche. Refactoring the catabolic repertoire of these organisms such that they have reduced fitness outside of the target niche and enhanced fitness within the target niche is one of several powerful tools for achieving persistence control (Fig. 1). Within the PerCon SFA, researchers are refactoring the catabolism of plant growth–promoting rhizobacteria (PGPR) to use crop-specific root exudate compounds and not use abundant, widespread soil compounds (e.g., lignocellulose, chitin, and non-crop-specific root exudates) as carbon sources. This has the potential to enable responsible application of beneficial genetically modified PGPR in the environment, while preventing their uncontrolled spread outside of the crop rhizosphere. Through the use of Sorghum bicolor (sorghum) as a model crop, and with random barcode transposon mutagenesis (RB-TnSeq), researchers have identified genes whose removal from the model sorghum rhizobacterium Pseudomonas facilor TBS28 will abolish its use of common carbon sources. Sorgoleone (a lipophilic benzoquinone) and MHPP (a phenylpropanoid methyl ester) are two compounds entirely or almost entirely unique to sorghum root exudates, respectively. This makes them excellent model nutrients for expansion of the TBS28 catabolic repertoire, but the catabolic pathways for their consumption are unknown. Researchers isolated three bacteria, including a new species, that utilize sorgoleone and found that the model rhizobacterium Pseudomonas fluorescens SBW25 utilizes MHPP. Using RB-TnSeq and RNAseq individually, dozens of genes were found that were potentially involved in sorgoleone and MHPP catabolism. When the two methods were used together, the number of candidate genes was reduced by an order of magnitude, enabling detection of key enzymes in each catabolic pathway. A core set of four highly conserved genes is essential for use of sorgoleone, serves as a biomarker for sorgoleone catabolic function, and is enriched in members of the sorghum rhizosphere. With a combination of genetic characterization and biochemical enzyme assays, the team identified two enzymes responsible for funneling MHPP into a common aromatic catabolism pathway. This includes a novel family of esterases that demethylates MHPP and other plant-derived phenylpropanoid methyl esters. To enable facile and efficient site-specific integration of non-native catabolic pathways and other heterologous genetic programs into bacterial genomes, researchers developed serine recombinase-assisted genome engineering (SAGE; Elmore et al. 2023). For this project, the team is actively using a multiplexed version of SAGE to evaluate genetic parts for rational control of gene expression in diverse bacteria, complement gene deletions for pathway discovery, introduce machinery for CRISPR interference, and engineer PGPR that can consume MHPP and sorgoleone as carbon sources.
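The narrowing of candidate genes by combining RB-TnSeq and RNAseq can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two lines of evidence are combined by intersecting a set of genes with a fitness defect and a set of induced genes; the abstract does not state how the team combined them. The gene names, scores, and thresholds below are hypothetical.

```python
from typing import Dict, Set


def candidates_from_fitness(fitness: Dict[str, float], threshold: float = -1.0) -> Set[str]:
    """Genes whose disruption lowers fitness on the substrate (score <= threshold)."""
    return {gene for gene, score in fitness.items() if score <= threshold}


def candidates_from_expression(log2_fc: Dict[str, float], threshold: float = 1.0) -> Set[str]:
    """Genes induced on the substrate (log2 fold change >= threshold)."""
    return {gene for gene, lfc in log2_fc.items() if lfc >= threshold}


# Hypothetical toy data: RB-TnSeq-style fitness scores and RNAseq-style log2 fold changes.
fitness = {"geneA": -2.5, "geneB": -1.4, "geneC": 0.1, "geneD": -3.0, "geneE": -0.2}
log2_fc = {"geneA": 3.2, "geneB": 0.3, "geneC": 2.8, "geneD": 4.1, "geneE": 0.0}

fitness_hits = candidates_from_fitness(fitness)
expression_hits = candidates_from_expression(log2_fc)

# Keep only genes supported by both lines of evidence.
shared_candidates = fitness_hits & expression_hits
print(sorted(shared_candidates))
```

Each method alone flags three genes in this toy example, while the intersection keeps two, which mirrors in miniature how requiring agreement between assays shrinks a candidate list.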

Development of Synthetic Microbial Communities to Study Consortium Engraftment Dynamics and to Improve the Yield Performance of Sorghum | Egbert | Pacific Northwest National Laboratory | Egbert | Biosystems Design | Persistence Control of Soil Microbiomes

The PNNL Persistence Control Science Focus Area (SFA) aims to gain a fundamental understanding of factors governing the persistence of engineered microbial functions in rhizosphere environments. With this understanding, the SFA is investigating design principles to control the environmental niche of native rhizosphere microbes. Researchers are examining the efficacy of genome reduction and metabolic addiction to plant root exudates in environmental isolates as persistence control strategies using the bioenergy crop sorghum and defined microbial communities as a model ecosystem. The engraftment dynamics of non-native microbes into a reduced-complexity microbial community and the establishment of defined-isolate synthetic communities in field environments are two major areas of current investigation. Effective persistence control will lead to secure plant-microbe biosystems that promote stress-tolerant and highly productive biomass crops.

In the past two decades, research on the plant microbiome has shown the importance of plant-associated microbes (PAM) in modulating crop performance (Compant et al. 2019). These studies have paved the way for use of PAM to provide economic and sustainable solutions to current bioenergy cropping and, more generally, agricultural challenges. However, applying PAM within the context of field agriculture has met with mixed success, in part because the introduced microbes must persist within the context of a resilient existing microbiome to be successful. PAM engineering may help overcome these limitations, including via the use of strategies like genome reduction to control the environmental niche of target microbes in agricultural soils (Ke, Wang, and Yoshikuni 2021). A key step to implementing this approach is understanding how engineered microbes may co-colonize with or be suppressed by the native microbiome.

In the Persistence Control SFA, researchers have developed two representative, reduced-complexity microbial communities to aid understanding of the colonization dynamics of engineered microbes in field-like conditions. First, a naturally evolved consortium of ~50 species was developed through repeated dilutions and plate passaging on synthetic growth media emulating the rhizosphere nutrient environment of sorghum. Here, with this reduced-complexity community, the team describes new assays examining the engraftment efficiency of an engineered host that is not part of the enrichment community, as well as taxonomic differences in communities that allowed or rejected colonization of the engineered host. These analyses showed that engraftment was possible but appeared to be the exception rather than the rule. In addition, successfully engrafted species showed strong co-abundances with other members of the community, pointing to possible points of microbial interaction that may drive engraftment. These studies begin to reveal the mechanisms by which addition of a microbe to an existing community, as a method of PAM engineering, might take place.

To investigate environmental persistence and plant-growth promotion in field environments, researchers developed a defined synthetic community from sorghum rhizosphere microbiome isolates and tested this community with sorghum in growth chambers and the field. In contrast to the reduced-complexity enrichment community, this community was designed using network co-abundance analysis of 16S taxonomic surveys from sorghum fields and established by co-culturing 56 sorghum isolate strains representing 18 bacterial genera. The project demonstrates that this synthetic community can stably colonize the rhizosphere and roots of sorghum during lab-based in planta experiments, and that it enhances overall shoot biomass compared to mock-treated controls. Remarkably, field experiments replicate the findings observed in the lab-based in planta data. These results reveal that the synthetic community is a stable and reproducible community that colonizes sorghum plants and unexpectedly improves their performance in agricultural soils. The team anticipates the enrichment and defined communities will reveal the drivers of isolate colonization into rhizosphere microbiomes and enable the scaling of rhizosphere synthetic biology from laboratory to field settings. This knowledge will promote the responsible deployment of engineered microbial functions in cropping settings to reduce nutrient inputs, promote drought resilience, and suppress plant pathogens.

EndoPopulus: Elucidating the Molecular Mechanisms of N-Fixation by Populus Endophyte, Burkholderia vietnamiensis WPB | Doty | University of Washington | Aufrecht | Bioenergy | University

The overall goal of this project is to investigate the roles and molecular mechanisms of endophytes in supporting productivity and fitness of Populus. Using systems biology approaches at both lab and field scales, the project will identify the molecular and physiological impacts of the bio-inoculants on the host plant in responding to nutrient and water limitation and determine if bio-inoculants not only increase plant nutrient stores but also “prime” plants for tolerance and resilience to abiotic stresses. Goals include identifying the molecular mechanisms of enhanced plant production and fitness by diazotrophic endophytes. The team will then integrate the plant physiology data with the molecular plant-microbe interactions data to develop a systems-level understanding of the genetic and molecular basis for diazotrophic endophytic mutualism in Populus.

Biological nitrogen fixation (BNF) by microbial diazotrophs can significantly contribute to N availability and uptake in non-nodulating plant species, like Populus spp. There is currently a knowledge gap surrounding the molecular mechanisms underlying plant-diazotroph interactions and the spatial and temporal variations in microbial expression of genes involved in nitrogen fixation. In this work, the team seeks to identify which nitrogenous biomolecules diazotrophs are producing, how BNF is regulated in an axenic culture, and finally how Populus trichocarpa regulates N fixation during co-culture with diazotrophic endophytes. Through a 15N2 time course enrichment study, researchers identified key nitrogenous metabolites and proteins that are synthesized by the diazotroph Burkholderia vietnamiensis (WPB). Using a fluorescent transcriptional reporter for the nitrogen-fixing gene nifH, researchers found that nifH is not uniformly expressed across genetically identical colonies of WPB. This result led to a follow-on targeted metabolomics study in colony sections with and without nifH expression to identify which of the key nitrogenous metabolites are produced in each scenario. Although WPB does not require the host plant to fix N, it was hypothesized that the plant can regulate N fixation via metabolite exchange with the diazotroph under environmental changes including water-limiting conditions. Using liquid chromatography–mass spectrometry, researchers have identified 39 compounds in P. trichocarpa root exudates that are differentially abundant between drought and well-watered conditions. Currently, the team is testing the influence of these root exudates on the differential gene expression of WPB (including the nifH gene) in axenic culture. Additionally, researchers have cultured WPB with the host plant directly in the RhizoChip, a synthetic soil habitat, which enabled direct imaging of the expression of microbial nifH within root epidermal cells. The team found that nifH expression is heterogeneous within root tissues, depends on the presence of soluble N compounds, and is localized to the root elongation zone where the WPB forms a unique physical interaction with the root cells. Finally, to understand the spatial distribution of metabolites exchanged by the plant, matrix-assisted laser desorption ionization–mass spectrometry imaging (MALDI-MSI) was used to image the distribution of various hormones and metabolites in root cross sections of plants subjected to drought vs. a well-watered condition. This experiment was repeated with plants that were inoculated with a community of endophytes to determine how the presence of endophytes alters the internal molecular environment of the root and how abiotic stressors like drought affect these interactions. These comprehensive experiments merge multiomics, chemical, and optical imaging data from axenic microbial cultures, plants, and co-cultures to identify the key molecular mechanisms regulating beneficial plant-endophyte interactions.

EndoPopulus: Elucidation of the Roles of Diazotrophic Endophyte Communities in Promoting Productivity and Resilience of Populus Through Systems Biology Approaches | Doty | University of Washington | Doty | Bioenergy | University

The overall project goal is to move toward an understanding of the holobiont, how plants and the microbial community within them interact in ways that promote the productivity of the whole. Integration of plant physiology data with the molecular plant-microbe interactions (multiomics) data from greenhouse and field experiments will allow development of a systems-level understanding of the genetic and molecular basis for diazotrophic endophytic mutualism in Populus. This deeper level of understanding of the plant responses will guide construction of microbial communities in order to optimize the impacts of bioinoculants for environmental sustainability of bioenergy crops.

Poplar trees are important feedstocks for bioenergy and ecosystem services, but more efficient and resilient growth is essential for sustainability. The microbiome of wild poplar is a rich resource for plant growth promotion, with some of the contributing micro-organisms able to provide nitrogen and bioavailable phosphorus. In addition, these micro-organisms may promote plant tolerance of other environmental stresses, such as drought. The first objective of this project is to unravel the molecular mechanisms of nitrogen fixation in an optimized constructed community of endophytes isolated from wild poplar. Initially studying a single aerobic diazotrophic strain with highly dynamic nitrogen fixation, researchers characterized N-fixation using a nitrogenase gene promoter fusion to green fluorescent protein, fluorescence-activated cell sorting, poplar RhizoChip assays, and a time series of 15N-guided molecular analyses. Nitrogenase gene expression patterns both in vitro and in planta suggested environmental signals are at play, and the resulting signature of nitrogenous compounds was identified using 15N-metabolomics.

A synergistic effect of specific non-diazotrophic strains with diazotrophs was revealed in vitro, moving toward an understanding of inter-species cooperation in a constructed community. Identification of the potential crosstalk molecules between strains will be a next step. A constructed community of eight endophyte strains was optimized with complementary symbiotic traits including nitrogen fixation and synergistic interactions, phosphate solubilization, and hormone production. Genome-scale transposon mutagenesis of two of the diazotrophs of the constructed community is complete, allowing for a series of experiments to be conducted to evaluate the genetic requirements for nitrogen fixation by these aerobic strains. Directed mutagenesis of specific genes is underway. To determine the molecular and physiological impacts of N-fixing endophytes on the host plant, the team is conducting both field- and greenhouse-level studies. In April 2022, a field site at the Roza Research Station in Prosser, Washington, was planted with poplar trees inoculated with the constructed community. Trees are being monitored and samples collected for analysis of the impacts of the endophytes versus controls at multiple levels, including the plant physiological, genomic, and metabolomic levels. Meanwhile, a series of more controlled greenhouse-level experiments has been performed, with more currently in progress.

Full genomic sequencing and de novo assemblies were completed for the strains making up the constructed community. The annotated draft assemblies were submitted to the Type (Strain) Genome Server (TYGS) for whole genome-based taxonomic classification, with four strains (Azospirillum sp. SherDot2, Sphingobium sp. WW5, Herbiconiux sp. 11R-B1, and Rhizobium sp. PTD1) classified as potentially novel species and four strains identified to the species level (Rahnella aceris R10, R. aceris WP5, Azotobacter beijerinckii SherDot1, and Rhodotorula graminis WP1). None of the strains were predicted to be human pathogens by PathogenFinder (v1.1), and in silico analyses were completed cataloging the genetic features associated with plant growth–promoting (PGP) traits for each of the strains. Strain-specific primer (SSP) sets have been designed and are currently being screened for the ability to successfully target unique DNA sequences in each of the strains. These will be used in digital droplet PCR assays to verify colonization and localization by plant compartment and to estimate the relative abundance of the strains within and between treatment groups. The SSPs will be used in conjunction with metabarcoding studies targeting the 16S rRNA genes to investigate changes in the community structure of the plant microbiome from the field and greenhouse experiments under abiotic stresses, providing information toward the objective of identifying the mechanisms of plant impacts on the microbial community.

By studying both the impacts of a constructed microbial community on the host plant as well as the impacts of the plant on the microbiome, the team hopes to move toward an understanding of the holobiont, how plants and the microbial community within them interact in ways that promote the productivity of the whole.

EndoPopulus: Endophyte Inoculation Enhances Populus Physiological Responses to Abiotic Stress | Doty | University of Washington | Banan | Bioenergy | University

The overall goal of the EndoPopulus Project is to understand how, at a molecular level, the micro-organisms within the poplar tree microbiome can affect the host plant health and stress tolerance. The project will use a plant physiological approach to determine the impacts of endophyte inoculation on host plant performance under normal, nutrient-limited, and water-limited conditions. Data from field and greenhouse trials will be used to develop a process-based plant physiological model to generate testable hypotheses for further greenhouse examination of the microbial mechanisms responsible for altered plant productivity. Finally, physiological results will be integrated with microbiology, metabolomic, and transcriptomic data to generate an improved systems-level understanding of the plant-endophyte system scaling from the molecular to the canopy level.

Sustainable biofuel feedstock production is a key target for securing energy supply under climate change while minimizing environmental impacts. Inoculation with endophytes, mutualist microbes living inside plants, is a potential strategy to achieve this in forestry applications by improving host resource use efficiency, productivity, and stress tolerance. Endophytes isolated from trees in the family Salicaceae have been shown to provide benefits such as fixing atmospheric nitrogen and synthesizing phytohormones in various in vitro and in planta experiments. Predicting how these benefits operate under production contexts requires a process-based understanding of plant, endophyte, and environmental interactions. This research investigates the physiological mechanisms by which inoculation with Salicaceae endophytes improves Populus performance under nutrient-limited or water-limited conditions.

Leaf epidermal morphology, photosynthesis, and whole-plant architecture and biomass were measured on native and hybrid poplars inoculated with Salicaceae endophytes in a series of greenhouse and field experiments. Experimental data were used to parameterize a coupled leaf gas exchange model to estimate the contribution of endophyte inoculation to biophysical and biochemical aspects of photosynthesis. Endophyte-associated reductions in stomatal conductance varied with time of day and were most pronounced under higher light intensities. Likewise, improvements in photosynthetic water-use efficiency were greatest in inoculated plants under drought, while inoculation reduced stomatal guard cell size regardless of water availability. Additionally, late-season photosynthetic capacity (Vcmax and Jmax) was greater for inoculated plants under both greenhouse and field conditions. Under nitrogen-limited conditions, inoculated plants were taller and had root systems with greater total root length, branch number, and branching frequency compared to non-inoculated controls. These results suggest that the greatest benefits of endophyte inoculation on host productivity and resource use efficiency may be realized under stress conditions, while other aspects of physiology, particularly stomatal morphology and photosynthesis, responded across a range of environments. A combination of microbiology, transcriptomic, and metabolomic approaches will be used to connect changes in plant physiology to endophyte diversity, abundance, and activity. Finally, process-based modeling will be used to scale these leaf- and plant-scale changes to the canopy scale to predict tree performance in biofuel feedstock production applications.
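As a rough illustration of how Vcmax, electron transport, and stomatal conductance enter a leaf gas exchange calculation, the sketch below uses the widely used Farquhar-von Caemmerer-Berry (FvCB) formulation and the ratio of net assimilation to stomatal conductance as intrinsic water-use efficiency. Whether the team's coupled model takes this form is an assumption; the abstract does not specify it. The kinetic constants are commonly cited 25 °C values treated here as assumptions, and all leaf parameter values are hypothetical.

```python
def fvcb_net_assimilation(ci, vcmax, j, rd=1.0,
                          gamma_star=42.75, kc=404.9, ko=278.4, o2=210.0):
    """Net CO2 assimilation (umol m-2 s-1) under the FvCB model.

    ci, gamma_star, kc are in umol mol-1; ko and o2 are in mmol mol-1.
    j is the electron transport rate, which approaches Jmax at saturating light.
    The default constants are assumed illustrative values, not fitted ones.
    """
    km = kc * (1.0 + o2 / ko)                      # effective Michaelis constant
    a_c = vcmax * (ci - gamma_star) / (ci + km)    # Rubisco-limited rate
    a_j = j * (ci - gamma_star) / (4.0 * ci + 8.0 * gamma_star)  # RuBP-regeneration-limited rate
    return min(a_c, a_j) - rd


def intrinsic_wue(a_net, gs):
    """Intrinsic water-use efficiency: net assimilation per unit stomatal conductance."""
    return a_net / gs


# Hypothetical comparison: a higher Vcmax and a lower stomatal conductance.
a_control = fvcb_net_assimilation(ci=300.0, vcmax=60.0, j=150.0)
a_inoculated = fvcb_net_assimilation(ci=300.0, vcmax=75.0, j=150.0)
print(a_control, a_inoculated)
print(intrinsic_wue(a_control, gs=0.30), intrinsic_wue(a_control, gs=0.20))
```

In this toy setting the leaf is Rubisco-limited, so raising Vcmax raises net assimilation, and lowering conductance at a fixed assimilation rate raises intrinsic water-use efficiency. In a fully coupled model, intercellular CO2 would itself respond to conductance rather than being held fixed as it is here.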

Novosphingobium aromaticivorans for ccMA Production from Lignin Biomass | Donohue | GLBRC | Vilbert | Bioenergy | GLBRC

This project will discuss Novosphingobium aromaticivorans as a bacterial host for production of the commodity chemical cis,cis-muconic acid (ccMA). It will provide further knowledge of the metabolism of aromatics by N. aromaticivorans, as well as a method to produce commodity chemicals from renewable carbon sources that are generated using green microbial engineering techniques.

Millions of tons of the commodity chemical cis,cis-muconic acid (ccMA) are produced annually from finite fossil fuel sources to make nylon, polyesters, and other materials (Choi et al. 2020). The establishment of a sustainable bioeconomy hinges on the ability to use renewable sources for production of these in-demand chemicals. Lignin is an under-utilized, abundant renewable resource that represents a potential carbon source for bio-based production of valuable chemicals. Due to the heterogeneity of lignin, it is challenging to extract commodity chemicals from lignin using current methods (Beckham et al. 2016). This poster discusses the team’s work to funnel lignin streams into a single compound by leveraging the selectivity of enzymes with metabolic engineering techniques. N. aromaticivorans is an ideal candidate for lignin funneling as it is genetically tractable and has the native ability to catabolize lignin-derived oligomers and metabolize multiple lignin monomers simultaneously. Previously, the lab was able to convert biomass aromatics into 2-pyrone-4,6-dicarboxylic acid using an engineered N. aromaticivorans strain (Perez et al. 2019). This poster will discuss both the genetic engineering of the N. aromaticivorans metabolic pathway for production of ccMA and the activity of enzymes along this metabolic pathway. In particular, it will highlight the activity of putative N. aromaticivorans enzymes involved in decarboxylation of protocatechuic acid to catechol and oxidative ring cleavage of catechol to ccMA. This work directly compares the activity of the putative N. aromaticivorans enzymes in vitro and in vivo with other known homologous enzymes to engineer the most efficient ccMA production strain in N. aromaticivorans.

Defining Transcriptomic Dynamics in Sorghum in Multiple Abiotic Stresses | Donohue | GLBRC | Ko | Bioenergy | GLBRC

Increasing crops’ resilience to the changing climate is critical to sustaining the bioeconomy on a large scale. Crop resilience is orchestrated by gene reprogramming events in response to environmental stresses. To better understand how gene expression changes are coordinated in sorghum, an important bioenergy crop, in response to climate stress, researchers performed time-course transcriptome profiling under three major abiotic stress conditions followed by co-expression network analyses. These analyses provide not only new insights into the gene regulatory dynamics in response to stress but also fundamental resources for genetic engineering and molecular breeding.

The growth, development, and productivity of crops are challenged by abiotic stresses such as drought, heat, and salinity, whose severity is projected to increase steadily and irreversibly in the future. To survive and thrive in the environment, plants rapidly execute gene expression changes for necessary cellular and metabolic functions. Despite its significance in basic research and field applications, a comprehensive landscape of the abiotic stress-responsive transcriptome changes in bioenergy crops remains elusive to date. The use of dynamic time-course data on multiple abiotic stress-responsive transcriptomes in shoots and roots of sorghum (Sorghum bicolor (L.) Moench), with support from the DOE Joint Genome Institute (JGI), allowed dissection of the complex stress-, tissue-, and phase-specific gene responses through streamlined gene network modeling. The team established a series of co-expression modules of abiotic stress-responsive genes in shoots and roots, separately, where marker genes for various phytohormones are significantly enriched. The expression dynamics in the transcriptome data facilitated gene regulatory network mapping that identified potential candidate transcription factors (TFs) upstream of tissue-specific phytohormone network hub genes. The team proposes that in sorghum, the dynamic regulation of phytohormone marker genes in the abiotic stress-responsive co-expression network modules is coordinated by the master TFs in a tissue-specific manner. This knowledge can be used to boost bioenergy crop productivity and agricultural sustainability through genetic, chemical, and biotechnology engineering.

Plant-Microbe Interfaces: Molecular Insights into the Mutualistic Symbiosis Between Populus and Plant Growth-Promoting Bacteria | Doktycz | Oak Ridge National Laboratory | Piatkowski | Environmental Microbiome | Plant-Microbe Interfaces

The goal of the Plant-Microbe Interfaces (PMI) Science Focus Area (SFA) is to characterize and interpret the physical, molecular, and chemical interfaces between plants and microbes and determine their functional roles in biological and environmental systems. Populus and its associated microbial community serve as the experimental system for understanding the dynamic exchange of energy, information, and materials across this interface and its expression as functional properties at diverse spatial and temporal scales. To achieve this goal, the project focuses on (1) defining the bidirectional progression of molecular and cellular events involved in selecting and maintaining specific, mutualistic Populus-microbe interfaces, (2) defining the chemical environment and molecular signals that influence community structure and function, and (3) understanding the dynamic relationship and extrinsic stressors that shape microbiome composition and affect host performance.

Plants have been co-evolving with microbes since their emergence onto land some half a billion years ago, yet predictive understanding of how mutualistic symbioses between these diverse groups of organisms are selected and maintained is still in its infancy. Recently, considerable attention has been focused on plant growth–promoting bacteria (PGPB) for their application to agriculture, bioprotection, and phytoremediation. Such PGPB are known to stimulate plant growth through enhancing nutrient acquisition or modulating hormone levels in select crop species, but little is known about their potential impacts on woody perennials like poplar that are relevant to sustainable biofuel production. To address this knowledge gap, researchers first isolated and sequenced bacteria from the roots of field-grown Populus (Blair et al. 2018; Carper et al. 2021). This culture collection represents over 3,200 unique bacterial isolates, and full genome sequences are available for over 550 of these isolates. Using comparative genomics, the team identified potential PGPB isolates that have the enzymatic machinery required for nitrogen fixation. Results showed that one of these isolates, Rahnella sp. OV588, catalyzes acetylene reduction in vitro and that this activity is dependent on the nitrogenase (nifH) enzyme. In co-culture experiments, Rahnella efficiently colonized the endosphere of axenic poplar plants and catalyzed acetylene reduction in planta under nitrogen-limiting conditions. Bacterial inoculation significantly increased root biomass, suggesting that the plant host can benefit from treatment with PGPB. To understand the molecular mechanisms involved in the establishment of this plant-microbe symbiosis, researchers quantified the host transcriptomic response over a time course. Gene expression was affected by both time and microbial treatment, with the largest treatment effect observed at 24 hours post-inoculation. Genes induced by PGPB treatment were enriched for biological processes including root morphogenesis, hormone metabolism, and sugar signaling, but also showed signatures of an induced systemic immune response. These findings elucidate the mechanisms by which mutualistic relationships are established between poplar and members of its microbiome and highlight possible strategies to improve biofuel feedstock production in marginal habitats.

Plant-Microbe Interfaces: Capturing and Interpreting the Role of Populus’ Microbiome | Doktycz | Oak Ridge National Laboratory | Pelletier | Environmental Microbiome | Plant-Microbe Interfaces

The goal of the Plant-Microbe Interfaces (PMI) Science Focus Area (SFA) is to characterize and interpret the physical, molecular, and chemical interfaces between plants and microbes and determine their functional roles in biological and environmental systems. Populus and its associated microbial community serve as the experimental system for understanding the dynamic exchange of energy, information, and materials across this interface and its expression as functional properties at diverse spatial and temporal scales. To achieve this goal, the project focuses on (1) defining the bidirectional progression of molecular and cellular events involved in selecting and maintaining specific, mutualistic Populus-microbe interfaces, (2) defining the chemical environment and molecular signals that influence community structure and function, and (3) understanding the dynamic relationship and extrinsic stressors that shape microbiome composition and affect host performance.

Microbial communities play an integral role in the health and survival of their plant hosts. In ongoing efforts in the PMI SFA, researchers are capturing members of Populus’ microbiome to understand basic concepts of plant and environmental selection. Representative bacterial strains from environmental samples of Populus roots have been isolated using a direct plating approach and compared to amplicon-based sequencing analysis of root samples (Carper et al. 2021). The resulting culture collection contains 3,211 unique isolates representing 10 classes, 18 orders, 45 families, and 120 genera from six phyla, based on 16S rRNA gene sequence analysis. The collection represents a significant fraction of the natural community of plant-associated bacteria as determined by phylogenetic analysis. Additionally, a representative set of 553 strains has had its genomes sequenced to facilitate functional analyses. This culture collection allows for the exploration of microbial community function and an understanding of basic concepts of plant- and environment-dependent selection.

The team is employing this collection to understand the mechanisms of microbial adaptation to Populus’ root endosphere and rhizosphere. Microbial diversity of the endosphere is low compared to rhizosphere, indicating high selectivity of this compartment for specific taxa and microbial adaptation of functions needed to compete in this unique environment. Using Populus-bacterial model systems with communities of specific Variovorax strains, researchers successfully demonstrated that they could identify bacterial strains, genes, and associated functions potentially required for fitness in Populus’ root rhizosphere or endosphere niches. L-fucose metabolism, glycoside hydrolases, pili/fimbriae production, and exopolysaccharide production were identified as important bacterial traits associated with efficient endosphere colonization. The team found the enrichment of genes in the L-fucose metabolic pathway intriguing, as metabolism of cell surface fucose residues has been implicated in enrichment of beneficial mammalian gut bacteria and suppression of pathogens. Additionally, L-fucose biosynthesis and fucosylation of cell surface macromolecules have been demonstrated to play a role in plant immune response. Researchers constructed gene deletion mutants of L-fuconolactonase in the L-fucose metabolic pathway via homologous recombination in Variovorax and confirmed this strain is defective in L-fucose growth and in Populus root colonization, relative to the wild-type strain.

In other applications of the microbial collection, the team has constructed microbial communities to elucidate organizational principles of community formation. Using genome-defined strains, systematic experiments, and computational modeling, researchers are identifying potential metabolic exchanges among species and gaining mechanistic insights into community structure. Co-culture and serial transfer experiments performed in defined media identified emergent, stable microbial communities (Wang et al. 2021; Shrestha et al. 2021). Using a complex medium environment, the effects of different initial inoculum ratios, up to three orders of magnitude, on community structure were investigated. The final compositions of the mixed communities with various starting compositions indicate that community structure is not dependent on the initial inoculum ratio. Modeling and omics analysis provide mechanistic insights into the emergence of community structure and indicate competitive relationships among the persistent organisms. These findings advance understanding of bacterial community formation and may guide efforts to manage rhizosphere bacterial communities. Collectively, these diverse applications of cultured representatives of Populus’ microbial community are facilitating understanding of how Populus selects microbial partners and how its microbiome is structured.

Accelerating Carbon-Negative Biomanufacturing Through Systems-Level Biology and Genome Optimization | Jewett | Stanford University | Fackler | Biosystems Design | University

The accelerating climate crisis combined with rapid population growth poses some of the most urgent challenges to humankind, all linked to the unabated release and accumulation of CO2 across the biosphere. By harnessing the capacity to leverage chemoautotrophic gas–fermenting microorganisms, the project can begin to take advantage of this abundance of available CO2 to transform the way the world creates and uses carbon-based materials. This interdisciplinary project will address existing challenges around genomic optimization of chemoautotrophic gas–fermenting microorganisms to establish versatile and efficient CO2-utilizing biosystems. The team will use in silico, in vitro, and in vivo methods to develop novel tools for genome engineering while advancing systems-level knowledgebases for industrially relevant CO2-fixing organisms. These new developments will be deployed towards generating streamlined genomes in Clostridium autoethanogenum under the contexts of bioproduction and biocontainment.

The project seeks to develop and integrate innovative cell-free tools, genome engineering techniques, and machine learning–based methods for predictive design of CO2-fixing biosystems that deliver new routes to solve energy security and environmental stewardship challenges. The key idea is to develop (1) genome-scale cell-free tools for proteome study and design, (2) genome engineering techniques for systems-level design, (3) high-throughput cell-free enzyme engineering approaches, (4) machine learning and molecular tools to guide enzyme and product selection and engineering, and (5) transcription factor–based biosensors for genetic regulation. The team then aims to apply these new tools across industrially relevant CO2-fixing organisms.

Anaerobic acetogens (specifically the model acetogen, C. autoethanogenum) have emerged as sustainable biomanufacturing platforms capable of producing valuable chemicals from flexible non-food and waste feedstocks that are already used at a commercial scale by LanzaTech today, building on their ability to natively ferment CO2 or CO and produce a valuable product (ethanol) with high selectivity. Advances over the past decades including genome engineering tools, cell-free prototyping, and metabolic models have enabled carbon-negative production of commodity chemicals such as acetone and isopropanol in engineered acetogens at industrially relevant productivities for prolonged periods (Liew et al. 2022).

LanzaTech has previously published a polished genome sequence along with metabolomic, proteomic, and transcriptomic profiles. These datasets have enabled synthesis of sophisticated genome-scale metabolic models and computational analysis for the genome of C. autoethanogenum (Brown et al. 2014; Simpson et al. 2019). From these datasets, ~19% of the genome has been identified as essential for autotrophic growth (Woods et al. 2022). Even for scientists armed with developed models, automated gene engineering capabilities, and gene essentiality information, generation of a genome-streamlined strain through iterative knockout is a daunting task that includes abundant possible permutations that would be resource and time intensive.

To facilitate prioritization of gene knockout targets for strain optimization, cell-free metabolic engineering (CFME) has been previously demonstrated (Liew et al. 2022). Building on this work, the team plans to further expand by identifying and removing protein effectors that have the greatest effects on CO2 to product metabolism. Additionally, using CFME, researchers can screen candidate effectors as combinations to prototype strains that have been iteratively reduced well before the engineered strain exists. This data will improve or validate gene ontology and develop genotype-phenotype linkages that inform in silico modeling of cellular processes.

While CFME has been demonstrated to be able to handle rapid, low-volume and high-throughput screens, genome engineering, in the context of gas fermentation, still relies on at least microliter volumes and generation times significantly longer than those of other model organisms. The team is able to leverage unique gas fermentation capabilities developed at LanzaTech, including the world’s first and only automated biofoundry capable of anaerobic gas fermentation at high throughputs. This integrated system, developed with support from the Biological and Environmental Research (BER) Program’s Genomic Science program DE-SC-0019090, enables genome engineering, colony picking, cultivation, screening, and strain repository at throughputs inaccessible to bench scientists. The large amount of generated data is managed in a custom-built laboratory information management system and provides the team with access to an ever-growing repository of host knockout genotypes for use as platforms for metabolic engineering.

In order to develop strains with increased genomic efficiency that are robust enough to tolerate industrial fermentation, the team, in partnership with the DOE Joint Genome Institute (JGI), has developed the largest dataset to date characterizing transcription factor binding sites (TFBS) across six clostridial genomes. Here, the initial analysis of that dataset and some conserved TFBS motifs are demonstrated. Further work is planned to build and report on regulatory networks and to use that output to inform in vitro experiments. Validation through novel cell-free methods will aid in refining the knowledgebase through elucidating principles of regulation, promoter sequence ‘leakiness,’ dynamic range of the transcription factor in the presence of inducer, and ligand sensitivity.

Ultimately, the team will deliver novel C. autoethanogenum strains that have been streamlined to metabolize CO2 into products through continuous fermentation. These strains will have improved genomic efficiency and increased functional stability in industrial bioproduction settings.

Combining GWAS of Metabolomic and Transcriptomic Datasets to Accelerate Discovery of Genes Regulating the Effect of Drought on Plant Growth and Metabolism in Sorghum and Setaria | Baxter | Donald Danforth Plant Science Center | Hubbard | Biosystems Design | University

Bioenergy feedstocks need to be deployed on marginal soils with minimal inputs to be economically viable and have a low environmental impact. Currently, crop water supply is a key limitation to production. The yields of C4 bioenergy crops such as Sorghum bicolor have increased through breeding and improved agronomy. Still, the amount of biomass produced for a given amount of water use (water-use efficiency, or WUE) remains unchanged. Therefore, this project aims to develop novel technologies and methodologies to redesign the bioenergy feedstock Sorghum for optimal WUE. Within this broader context, this subproject is leveraging the sorghum pangenome and large phenotypic datasets in Setaria viridis and S. bicolor to discover metabolically important genes for the regulation of WUE in the C4 grasses. This project aims to develop and demonstrate novel methods and resources to accelerate the production of genetic variants and accelerate phenotyping in both reverse genetics and forward genetics approaches leading to discovery of genes regulating metabolic regulators of WUE.

Plants make an amazing array of metabolites to grow and respond to environmental change. The large number of compounds created by plants are poorly characterized, and the genetic programs controlling them are largely unknown. In order to better understand the metabolomic response of C4 plants to drought stress, researchers conducted parallel experiments in Sorghum and Setaria using diversity panels. Plants were grown in a controlled environment phenotyping system at two watering levels, and samples were harvested 6 days after the watering levels were set.

Metabolites for each sample were quantified in an untargeted fashion via liquid chromatography–mass spectrometry (LC-MS) using two different columns in both positive and negative mode to identify a large number of compound classes. A third of the samples were also profiled for RNA transcripts. The ~3,800 metabolomics samples, each run on two columns in two modes, created an immense informatics challenge. To improve the sensitivity and accuracy of metabolite detection in similar large datasets, the team has developed a suite of three computational tools to overcome the challenges of unreliable algorithms and inefficient validation protocols: isolock, autoCredential, and anovAlign (IAA). Isolock uses metabolite-isotopologue pairs (isopairs) to calculate and correct for mass drift noise across LC-MS runs. AutoCredential leverages statistical features of LC-MS data to amplify naturally present 13C isotopologues and validate metabolites through isopairs. AnovAlign, an ANOVA-derived algorithm, aligns retention time windows across samples to improve delineation of retention time windows for mass features. Using the IAA suite, researchers have quantified thousands of mass features across the ~3,800 metabolomics samples. Genome-wide association study (GWAS) analysis has identified a large number of loci affecting these metabolites, including several loci in syntenic regions of the Setaria and Sorghum genomes for the same metabolite. Using informatics tools to harness the sorghum pangenome, researchers are combining the loci with transcriptomic and genomic data to identify candidate genes and alleles underlying the metabolomic response to water deficit, as well as leveraging tandem mass spectrometry to better characterize promising mass features.
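The idea of correcting mass drift across LC-MS runs can be illustrated with a generic sketch. This is not the isolock algorithm, whose details are not given here; it assumes a simple median parts-per-million (ppm) offset estimated from a set of anchor features whose reference m/z values are known. The function names `estimate_drift_ppm` and `correct_mz` are hypothetical.

```python
from statistics import median


def estimate_drift_ppm(observed_mz, reference_mz):
    """Estimate run-level mass drift as the median ppm offset of anchor features.

    observed_mz and reference_mz are paired lists: the m/z measured in this run
    and the expected m/z for the same anchor feature (an assumption of this sketch).
    """
    return median(
        (obs - ref) / ref * 1e6 for obs, ref in zip(observed_mz, reference_mz)
    )


def correct_mz(mz_values, drift_ppm):
    """Remove a constant ppm drift from every m/z value in a run."""
    return [mz / (1.0 + drift_ppm * 1e-6) for mz in mz_values]


# Hypothetical anchors measured with a uniform +5 ppm drift.
reference = [100.0, 200.0, 400.0]
observed = [ref * (1.0 + 5e-6) for ref in reference]

drift = estimate_drift_ppm(observed, reference)
corrected = correct_mz(observed, drift)
```

Using the median rather than the mean keeps the estimate robust when a few anchor features are mismatched; a real workflow would likely model drift that varies with retention time or m/z.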

The Predictive Power of Phylogeny on Growth Rates in Soil Bacterial Communities | Hungate | Northern Arizona University | Walkup | Environmental Microbiome | University

Microorganisms are major engines of the land carbon cycle, responsible for influencing the composition and radiative properties of the atmosphere, and for both creating and consuming soil organic carbon, a resource that provides multiple ecosystem services, and, when lost, exacerbates climate change. This project investigates the interactions within microbial communities and between microbes and their environment that underpin these dual roles of microorganisms in creating and consuming soil carbon. Overarching objectives are to develop and apply omics approaches to investigate microbial community processes involved in carbon and nutrient cycling, develop community and taxon-specific microbial controls over key biogeochemical processes in terrestrial environments, and test quantitative ecological and biogeochemical principles using omics data. This work aims to facilitate scaling of taxon-specific microbial data to connect the ecology of microorganisms with ecosystem level rates of carbon and nutrient cycling.

Predicting ecosystem function is critical to assess and mitigate the impacts of climate change. Quantitative predictions of microbially mediated ecosystem processes are typically uninformed by microbial biodiversity. Yet new tools allow the measurement of taxon-specific traits within natural microbial communities. There is mounting evidence of a phylogenetic signal in these traits, which may support prediction and microbiome management frameworks. Researchers investigated phylogeny-based trait prediction using bacterial growth rates from soil communities in Arctic, boreal, temperate, and tropical ecosystems. Here, research shows that phylogeny predicts growth rates of soil bacteria, explaining up to 58% of the variation within an ecosystem. Despite limited overlap in community composition across these ecosystems, shared nodes in the phylogeny and ancestral trait reconstruction allowed cross ecosystem predictions, which showed that phylogenetic relationships can explain up to 38% of the variation in growth rates across biomes. Results suggest that shared evolutionary history creates similarity in the relative growth rates of related bacteria in the wild, allowing phylogeny-based predictions to explain a significant amount of the variation in taxon-specific functional traits, within and across ecosystems.

Agent-Based Algal Modeling for the Rational Engineering of Chlamydomonas reinhardtii | Boyle | Colorado School of Mines | Boyle | Biosystems Design | University

The overall research objective is to develop an experimentally validated multiparadigm multiscale modeling framework that will enable the most advanced and predictive metabolic modeling of diurnally grown photosynthetic organisms to date. The genome-scale metabolic model of Chromochloris zofingiensis will be embedded into an agent-based modeling framework to allow modeling of diurnal growth; the model will also be able to simulate intracellular fluxes, cell-to-cell interactions, cell-to-environment interactions, metabolite diffusion, and spatial distribution. This modeling approach will allow simulation of metabolic shifts that occur due to diel cycles and generation of rational engineering strategies to design production strains that are not impacted negatively by this natural phenomenon.

Economical algae production requires growth under outdoor light, but the diel nature of sunlight complicates modeling efforts. Researchers have developed a solution: a fully functional 3D agent-based model capable of simulating algal growth under diurnal conditions. By combining systems biology data from Chlamydomonas reinhardtii grown in diurnal light (Strenkert et al. 2019) with agent-based modeling and detailed tracking of nutrient and light conditions, this model performs better than traditional steady-state metabolic models (Metcalf et al. 2022). To develop a model of growth during diurnal light, the team needed to decouple the standard biomass formation equation to allow different components of biomass to be synthesized at different times of the day. The model was able to more accurately predict qualitative phenotypic outcomes of the starchless mutant, sta6. The model then predicted growth of single-gene knockouts, and potential targets were identified for rational engineering efforts to increase productivity. The team will discuss recent advances in characterizing these mutants and further improvement of the model by including light and nutrient tracking. This model enables evaluation of the impact of genetic and environmental changes on the growth, biomass composition, and intracellular fluxes for diurnal growth.

Predicting Gene Functions in Plants with Single-Cell Genomic Data | Dinneny | Stanford University | Timilsena | Bioenergy | University

The rapid sequencing of genomes and transcriptomes in bioenergy crops and other plant species has outpaced the rate at which gene functions can be accurately annotated. Wet-bench validation of gene functions is very laborious and time consuming for non-model species. Even in a model organism like Arabidopsis thaliana, the majority of gene functions have not been validated with wet lab experiments. Traditional computational methods for assigning gene functions largely rely on sequence homology, which cannot account for gene expression activities in different tissues or cell types. In this project, researchers tested whether single-cell gene expression data can be used to improve gene function annotation. The team compared bulk and single-cell RNA-seq datasets from roots and assessed the performance of seven machine learning algorithms for predicting gene functions. Researchers found that random forest performs best among these methods. The team further asked whether single-cell genomic data can provide additional information because expression data from more diverse cell populations are captured by single-cell (sc)RNA-seq as compared to bulk RNA-seq. Surprisingly, bulk RNA-seq data were found to have better accuracy in predicting many gene functions as compared to scRNA-seq data. A comparison of scRNA-seq datasets from different tissues showed that leaf scRNA-seq data provide higher accuracy in predicting chloroplast- and photosynthesis-related genes as compared to root scRNA-seq data. This observation suggests that the specificity of the information content in single-cell datasets from different tissues is biologically relevant. Because of the diversity of cell types captured by scRNA-seq, researchers found that an increasing number of Uniform Manifold Approximation and Projection clusters may help to improve the prediction accuracy for single-cell data.
The future direction of this work is to incorporate stress responsive scRNA-seq data and regulatory networks (DAP-seq) information to further expand the prediction of novel gene functions in oil seed crops. Experimental validation for selected genes will be performed in the coming years.

Probabilistic Annotation and Ensemble Metabolic Modeling in KBase | D’haeseleer | Lawrence Livermore National Laboratory | D’haeseleer | Computational Biology | KBase

Functional annotation tools such as Rapid Annotation using Subsystem Technology (RAST) or Kyoto Encyclopedia of Genes and Genomes (KEGG) do not always agree, and it is not obvious how best to leverage them for metabolic modeling. This project is developing tools for the DOE Systems Biology Knowledgebase (KBase) to give users a principled way to weigh multiple sources of functional annotation against each other, enable better metabolic modeling of hard-to-annotate organisms and pathways, allow analysis of uncertainty in the resulting model’s network structure or behavior, and provide an infrastructure on which to build more sophisticated machine learning techniques in KBase.

The µBiospheres Science Focus Area (SFA) at LLNL investigates metabolic interactions in bioenergy-relevant microbial communities. A critical part of this research is development of genome-scale models of metabolism, which requires well-annotated genomes. By combining annotations from multiple sources, researchers can achieve a more complete metabolic network reconstruction, greatly reducing the effort required to curate quality metabolic models (Griesemer et al. 2018). In previous work, researchers developed a set of KBase apps to import, compare, and merge functional annotations from a wide range of different functional annotation tools into KBase for metabolic modeling to achieve significantly improved metabolic models. These apps have proven to be very useful and are currently in daily use in this SFA and several other research groups using KBase.

It is quite common for functional annotation tools to disagree on the function that should be assigned to certain genes, and this uncertainty can have significant consequences for the resulting metabolic networks and the behavior they predict for the organism. The team is now developing a set of tools to deal with these disagreements in a more systematic manner: by calculating the likelihood of metabolic reactions given the annotations from various sources, and then carrying those reaction likelihoods through into the modeling results. Researchers have modified the existing import app to support importing annotation scores and evidence codes, such as reaction probabilities, log likelihoods, Basic Local Alignment Search Tool (BLAST) scores, or hidden Markov model scores.

Researchers can use a Naive Bayes approach to estimate the probability of each reaction assigned to a gene, given the annotations from a range of different annotation tools. For this the team first needs to evaluate the reliability—False Positive and False Negative rates—for each of the major annotation tools (currently, RAST, Prokka, Distilled and Refined Annotation of Metabolism, and KEGG), by running them on a reference dataset consisting of 15,000 enzymes in Swissprot that have experimental evidence codes. This rigorous validation effort has also led to some significant improvements in the ModelSEED biochemistry database, beyond the well-curated set of template reactions that are normally used by KBase’s metabolic modeling engine.
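The Naive Bayes combination described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each tool's call is conditionally independent given whether the reaction is truly present, and the tool names, error rates, prior, and the function name `reaction_posterior` are all hypothetical.

```python
def reaction_posterior(calls, rates, prior=0.5):
    """Posterior probability that a gene catalyzes a reaction, given tool calls.

    calls: dict mapping tool name -> True if the tool assigned the reaction.
    rates: dict mapping tool name -> (false_positive_rate, false_negative_rate).
    prior: assumed prior probability that the reaction is present.
    """
    like_present = 1.0  # P(all calls | reaction present)
    like_absent = 1.0   # P(all calls | reaction absent)
    for tool, called in calls.items():
        fpr, fnr = rates[tool]
        if called:
            like_present *= 1.0 - fnr  # true positive rate
            like_absent *= fpr         # false positive rate
        else:
            like_present *= fnr        # false negative rate
            like_absent *= 1.0 - fpr   # true negative rate
    numerator = prior * like_present
    return numerator / (numerator + (1.0 - prior) * like_absent)


# Hypothetical error rates for two annotation tools.
rates = {"tool_a": (0.1, 0.2), "tool_b": (0.1, 0.2)}

p_agree = reaction_posterior({"tool_a": True, "tool_b": True}, rates)
p_split = reaction_posterior({"tool_a": True, "tool_b": False}, rates)
```

With these made-up rates, two agreeing tools push the posterior close to 1, while a split call leaves it only moderately above the prior, which is the behavior that makes the reaction probabilities useful for downstream uncertainty analysis.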

Enzymes in Swissprot have historically been annotated using Enzyme Commission (EC) numbers, which are far from ideal when they need to be mapped to unique metabolic reactions for modeling. Some ECs are overly generic, forcing the team either to omit them altogether or to instantiate them as multiple unique reactions. The team is building support into the Ontology application programming interface (which translates from EC numbers and other annotation vocabularies to ModelSEED reactions) to filter out unbalanced, overly generic, or otherwise unsuitable reactions for metabolic modeling. In the longer term, the team may use the new Rhea reaction identifiers that are being curated into the Swissprot database as the reference dataset, which should provide a much more direct mapping to ModelSEED reactions.

Once reaction probabilities are associated with all the genes in a genome, researchers can then sample from those probabilities to create an ensemble of metabolic models (Medlock, Moutinho, and Papin 2020). Each of these models can then be analyzed using the existing gapfilling and modeling tools (including support for the next generation of modeling tools that the KBase team is developing), eventually resulting in an ensemble of Flux Balance Analysis solutions that reflects the uncertainty in the underlying enzyme annotations. The team will develop a set of analysis tools to study this ensemble of solutions, using clustering, averaging, analysis of alternative pathway solutions, etc. This will result not only in higher-quality metabolic network reconstruction but also in much greater insight into the sources of uncertainty in the network, enabling prioritization of how to most efficiently reduce that uncertainty through additional manual curation or experimental data.
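The sampling step can be illustrated with a small sketch. It assumes each reaction is included independently with its estimated probability; the reaction identifiers and the function names `sample_ensemble` and `inclusion_frequency` are hypothetical, and gapfilling and Flux Balance Analysis of each sampled model are outside the scope of this example.

```python
import random


def sample_ensemble(reaction_probs, n_models, seed=0):
    """Draw n_models reaction sets, including each reaction with its probability."""
    rng = random.Random(seed)  # fixed seed so the ensemble is reproducible
    ensemble = []
    for _ in range(n_models):
        model = {rxn for rxn, p in reaction_probs.items() if rng.random() < p}
        ensemble.append(model)
    return ensemble


def inclusion_frequency(ensemble):
    """Fraction of ensemble members that contain each sampled reaction."""
    counts = {}
    for model in ensemble:
        for rxn in model:
            counts[rxn] = counts.get(rxn, 0) + 1
    return {rxn: count / len(ensemble) for rxn, count in counts.items()}


# Hypothetical reaction probabilities for one genome.
probs = {"rxn_certain": 1.0, "rxn_excluded": 0.0, "rxn_uncertain": 0.5}

ensemble = sample_ensemble(probs, n_models=200)
freq = inclusion_frequency(ensemble)
```

Reactions whose inclusion frequency sits far from 0 or 1 are the ones where the ensemble members disagree, so they are natural candidates for manual curation or targeted experiments.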

This work will provide the SFA and other KBase users a principled way to weigh annotation sources against each other, enable better metabolic modeling of hard-to-annotate organisms and pathways, allow analysis of uncertainty in the resulting model’s network structure or behavior, and provide an infrastructure on which to build more sophisticated machine learning techniques in KBase.

NLP for Synthetic Biology: Providing Generalizable Literature Mining Through KBase | Dehal | Lawrence Berkeley National Laboratory | Yoo | Computational Biology | KBase

The scientific literature contains many decades of research results which may inform the identification and subsequent engineering of microbial targets for novel applications, yet this knowledge remains largely inaccessible to current researchers due to the scale of the literature and the limitations of current manual information extraction practices using literature search and laborious manual curation processes used in String DB and AraNet. This project will produce a reusable proof-of-concept demonstration applying state-of-the-art natural language processing (NLP) techniques within the DOE Systems Biology Knowledgebase (KBase) framework to automatically extract organism traits from the literature for synthetic biology research. This work seeks to address important knowledge gaps in this field, while simultaneously providing a meaningful staging ground to expose new NLP tools to the KBase community and to gather feedback on their efficacy and use. The team will leverage NLP and data collection methods that have been previously developed and successfully applied in isolated settings to accomplish this effort, working with existing KBase tools and functionality, and producing outreach material to communicate and disseminate this work and its user-facing outcomes.

Earth is facing some serious biological resource problems: a scarcity of renewable energy, lack of novel remedies for endemic infectious diseases, water pollution, shortages of arable soil and the resultant food crises, and the degradation of ecosystems, to name a few of the most pressing. The project posits that the ability to domesticate and genetically engineer non-model microorganisms from relevant niches would help assess new potential solutions to many of these life-threatening global challenges. Though technological innovations are happening at a rapid pace in addressing many of these challenges, the information for each potential new model organism is distributed throughout the literature and inaccessible to many practitioners, making every new synthetic biology, bioenergy, and bioproduct project in a non-model organism unnecessarily difficult. This lack of organized information not only limits machine-readable approaches, but also makes it difficult to assess the scope of work, identify knowledge gaps, and offer suggestions for investment to overcome technological barriers. For example, after decades of development in the field of synthetic biology, it is still challenging to identify suitable microbial targets for specific applications, conditions, and genetic tools necessary for the cultivation and engineering of non-model microorganisms. A generalized literature mining tool that keeps track of new technologies and genetic tools important for biotechnology practitioners would be invaluable. This tool will enable discovery of information gaps and opportunities that are buried within the vast literature. For example, the tool should be able to determine whether the organism of choice is appropriate for domestication in the lab and which genetic tools exist for the organism, more easily than by searching hundreds of primary literature sources, thereby saving time, effort, and money for the DOE-funded project.
By building this prototype literature mining service into KBase, the team will be able to surface issues with KBase platform integration (including how the integration should be done), to identify the scoping and scaling needs, and to explore how to best address the needs of users.

Recently, improved techniques from the field of NLP have made it possible to analyze text at unprecedented scales (e.g., millions of documents), while extracting meaningful contextual information in ways not previously possible. These techniques are highly suitable to help address the above-mentioned knowledge gap. Here, the team will apply NLP to the biological literature to extract organism traits (see Figure 1) and deposit this mined knowledge in a useful form, supporting the creation of automated, curated centralized systems essential for growing and engineering of DOE-relevant microorganisms. While the BNL NLP techniques have extremely compelling applications, their value to BER researchers has been limited by access and dissemination of their results. KBase integration will bring the knowledge captured by the literature to a much wider audience.

Ultra-Sensitive Protein-SIP to Quantify Activity and Substrate Uptake in Microbiomes with Stable Isotopes | Kleiner | North Carolina State University | Kleiner | Environmental Microbiome | University

The project’s goal is to use stable isotope probing (SIP) to address the question of how microbes and minerals make necromass that persists.

SIP approaches are a critical tool in microbiome research to determine associations between species and substrates, as well as the activity of species. The application of these approaches ranges from studying microbial communities important for global biogeochemical cycling to host-microbiota interactions in the intestinal tract. Current SIP approaches, such as DNA-SIP or nanoscale secondary ion mass spectrometry, allow researchers to analyze incorporation of stable isotopes with high coverage of taxa in a community and at the single-cell level, respectively; however, they are limited in terms of sensitivity, resolution, or throughput.

The team has developed an ultra-sensitive, high-throughput protein-based SIP approach (Protein-SIP), which cuts the cost of labeled substrates by 50-99% as compared to other SIP and Protein-SIP approaches and thus enables isotope labeling experiments on much larger scales and with higher replication. The approach allows for the determination of isotope incorporation into microbiome members with species-level resolution using standard metaproteomics liquid chromatography–tandem mass spectrometry measurements. At the core of the approach are new algorithms to analyze the data, which have been implemented in open-source software. Research demonstrates sensitivity, precision, and accuracy using bacterial cultures and mock communities with different labeling schemes. Furthermore, the team benchmarks the approach against two existing Protein-SIP approaches and shows that in the low labeling range, the team’s approach is the most sensitive and accurate. Finally, researchers measure translational activity using 18O heavy water labeling in a 63-species community derived from human fecal samples grown on media simulating two different diets. Activity could be quantified on average for 27 species per sample, with nine species showing significantly higher activity on a high-protein diet as compared to a high-fiber diet. Surprisingly, among the species with increased activity on high protein were several Bacteroides species known as fiber consumers. Apparently, protein supply is a critical consideration when assessing growth of intestinal microbes on fiber, including fiber-based prebiotics.

In conclusion, research demonstrates the Protein-SIP approach allows for the ultra-sensitive (0.01% to 10% label) detection of stable isotopes of elements found in proteins, using standard metaproteomics data.

Biofuels Disruption of Membrane Domains: An Unrecognized Mode of Solvent Stress | Davison | Oak Ridge National Laboratory | Elkins | Bioenergy | Biomass Deconstruction

The Solvent Disruption of Biomass and Biomembranes Science Focus Area (SFA) provides fundamental knowledge about how solvents alter the structures of plant cell walls and microbial membranes. The project’s overarching hypothesis is that knowledge of partitioning or binding of the solvent from the bulk phase to biomass or biomembranes can help predict maximal or minimal disruption. Solvents disrupt biological structures comprising amphiphilic molecules and polymers (e.g., membranes and biomass). Determining common biophysical principles of solvent disruption will lead to new understandings of how solvents affect the relevant structures. This information will help determine the ultimate microbial limits in tolerating specific solvents, as well as the eventual design of co-solvents best suited for pretreatment. The SFA will integrate the power of world-class neutron scattering capabilities and leadership-class supercomputing facilities available at ORNL. These capabilities are complemented by expertise in biodeuteration and biomembranes at ORNL, plant cell wall chemistry at the University of Tennessee, and interpreting small-angle neutron scattering (SANS) data at the University of Cincinnati.

A sustainable bioeconomy will undoubtedly rely on the efficient production of lignocellulosic biofuels that can be combusted directly in automobile engines or catalytically upgraded to long-chain hydrocarbons for use as diesel and aviation fuels. Typical target biofuels include ethanol and n-/isobutanol, which act as amphiphilic co-solvents in the aqueous environment of fermentation. Amphiphilic alcohols are well known to have chaotropic effects on biological molecules, which contributes to their inherent toxicity to the microbial biocatalysts used for fermentation. While these toxic effects can be broad, targeting all biological macromolecules, the cellular membrane is recognized as especially vulnerable to disruption due to the partitioning of amphiphilic co-solvents into the lipid bilayer. Functionally, this leads to membrane thinning, destabilization, loss of membrane potential, and, eventually, cell death.

While the impact of high co-solvent titers on the transverse structure of the membrane is known, the effects on lateral biomembrane structure have not been well studied. Lateral membrane structure is described as differences in lipid composition and membrane physical properties across the plane of the membrane. This can be understood by analogy with an in-plane phase separation, which is dictated by the presence of high and low melting point lipid species as well as sterols (or their microbial analogs). Depending on the ratio of these molecules, phase separation occurs in the membrane, creating regions of different local composition and physical properties. In the biological context, these structures are colloquially known as membrane microdomains or lipid rafts. Rafts play an important role in many cellular functions because they serve as platforms to segregate and organize membrane proteins; this co-localization is critical to oligomerization of membrane proteins, and by extension, optimal cellular function.

In this project, researchers pursue the hypothesis that amphiphilic co-solvents, such as ethanol and n-/isobutanol, alter or disrupt functional membrane microdomains, leading to an unrecognized mode of co-solvent toxicity and cellular stress at non-lethal co-solvent concentrations. The team examines this hypothesis in a range of model and in vivo lipid membrane systems, including phase-separating membrane mimics in the form of unilamellar vesicles, microbial membrane extracts, and engineered microbial systems that allow tunable membrane compositions. Domain structure and behavior are examined with a variety of non-destructive techniques, including microscopy and SANS, with and without the solvent ethanol (see figure). Computational approaches, including molecular dynamics (MD) simulations, are also leveraged to provide molecular detail and an understanding of membrane behavior in the presence of co-solvents that is experimentally inaccessible.

Tan et al. (2023) have made a significant step forward in validating the SFA’s hypothesis by (1) demonstrating a direct disruption of a model lipid raft due to the presence of an amphiphilic co-solvent (ethanol) and (2) elucidating the physical mechanism by which the disruption of the lipid domains occurred. The team shows that unequal partitioning of ethanol between the co-existing phases leads to an increased hydrophobic mismatch in the thickness of these phases and a corresponding increase in the domain line tension. This increase acts as a driver to minimize the ratio of domain interface to domain area. This mechanism represents the physical basis for a novel mode of co-solvent-induced cell stress due to domain disruption. Continuing work will further test the hypothesis using more complex sample types, including live cells, and compare the effects of other amphiphilic biofuel molecules on domain organization. Further validation of the SFA’s hypothesis will provide a more holistic understanding of solvent-membrane interactions and inform actionable approaches to mitigating toxicity and improving biofuel yields from fermentative microbes.

Visualization of Solvent Disruption of Biomass and Biomembrane Structures in the Production of Advanced Biofuels and Bioproducts | Davison | Oak Ridge National Laboratory | Davison | Bioenergy | Biomass Deconstruction

The Solvent Disruption of Biomass and Biomembranes Science Focus Area (SFA) provides fundamental knowledge about how solvents alter the structures of plant cell walls and microbial membranes. The project’s overarching hypothesis is that knowledge of partitioning or binding of the solvent from the bulk phase to biomass or biomembranes can help predict maximal or minimal disruption. Solvents disrupt biological structures comprising amphiphilic molecules and polymers (e.g., membranes and biomass). Determining common biophysical principles of solvent disruption will lead to new understandings of how solvents affect the relevant structures. This information will help determine the ultimate microbial limits in tolerating specific solvents, as well as the eventual design of co-solvents best suited for pretreatment. The SFA will integrate the power of world-class neutron scattering capabilities and leadership-class supercomputing facilities available at ORNL. These capabilities are complemented by expertise in biodeuteration and biomembranes at ORNL, plant cell wall chemistry at the University of Tennessee, and interpreting small-angle neutron scattering (SANS) data at the University of Cincinnati.

A sustainable bioeconomy will undoubtedly rely on the efficient production of lignocellulosic biofuels that can be combusted directly in automobile engines or catalytically upgraded to long-chain hydrocarbons for use as diesel and aviation fuels. The plant cell wall structure of biomass is an intricate design of several carbohydrate polymers encased in the hydrophobic lignin polymer to protect against degradation. The recalcitrance of lignocellulosic biomass to deconstruction, due to the complex physicochemical structure of plant cell walls, is a challenge in biological-based biorefinery systems. Pretreatment and genetic modification are two approaches in biomass conversion that have succeeded in modifying the structure of lignocellulose to enable better enzymatic deconstruction. However, the structural differences among pretreatment-solubilized biomass biopolymers have not been extensively investigated. The SFA’s goal is to understand the molecular-level mechanisms that drive efficient biomass deconstruction. ORNL scientists have reported direct experimental and computational evidence of the physical chemical principles underlying pretreatment. Here, the team will discuss the use of molecular dynamics (MD) simulations and experimental pretreatments with acids and with acidified solvents, combined with scattering measurements, to elucidate structural changes in the three key biomass polymers (cellulose, hemicellulose, and lignin). This will be illustrated by several examples.

Researchers determined that solvent mixtures with both hydrophilic and hydrophobic interactions are key for efficient deconstruction of biomass as revealed by neutron scattering and molecular simulation. The team elucidated the effect of tetrahydrofuran (THF)-water pretreatment on the nanoscale architecture of biomass and the role the co-solvents play in solubilizing lignin and cellulose (Pingali et al. 2020). In situ SANS determined temperature-dependent changes in biomass morphology; whereas lignin dissociates over a wide temperature range (>25°C), cellulose disruption occurs only above 150°C. SANS with contrast variation and MD simulations provided direct evidence for the formation of THF-rich nanoclusters (~0.5 nm) on the nonpolar cellulose surfaces and on hydrophobic lignin, and equivalent water-rich nanoclusters on polar cellulose surfaces.

In another example, three organosolv pretreatment systems, based on ethanol (EtOH), tetrahydrofuran, and γ-valerolactone in dilute aqueous acid, were used on wild-type and two transgenic switchgrasses with altered lignin. All organosolv pretreatments significantly reduced the molecular weight of lignin in particular, with a decrease of up to ~90% observed in EtOH-pretreated lignin compared to untreated lignin. A correspondence was found between the molecular weight reduction of lignin molecules in the experiments and the number of hydrogen bonds between lignin and the organic solvents as calculated in the MD simulation, suggesting a connection between the depolymerization of lignin and its ability to hydrogen bond with the organic solvents.

To understand the role of noncellulosic switchgrass polymers on the overall efficiency of pretreatment, the structural evolution of the noncellulosic polymers of the plant cell wall was investigated during dilute acid pretreatment by employing in situ SANS on various polymer fractions from switchgrass (Yang et al. 2021). In this study, researchers observed real-time structural changes not possible to observe by any other technique. These interpretations were consistent with MD simulations. These results suggest that not only lignin but also hemicellulose can form aggregate particles within plant cell walls during pretreatment. These concepts can be employed to tune pretreatment technologies that maximize deconstruction of biomass and facilitate the separation of its components for upgrading to energy and materials.

Visualizing Spatial and Temporal Responses of Plant Cells to the Environment | Dahlberg | SLAC National Accelerator Laboratory | Joubert | Structural Biology

This project will develop correlative light and electron microscopy (CLEM) cryo-electron tomography (cryo-ET) tools for high-resolution imaging of fluorescent biosensors in plant cells under different physiological stress conditions.

The plant cell wall is a complex dynamic structure that functions as the primary interface through which interactions between plants and their environment are mediated. To study the nanometer-scale structural changes that correspond to a plant’s response to stress, researchers are developing cryogenic correlative light and electron microscopy methods that use fluorescent biosensors that report on various aspects of a cell’s physiology.

Cryogenic preparation of plant cells has been notoriously challenging due to various morphological and physiological features that are unique to terrestrial plants and dissimilar from the unicellular and multicellular organisms, cell types, and macromolecules for which robotic plunge-freezing techniques were developed. Because plant cells have a thick cellulosic cell wall, a relatively large size, and large water-filled vacuoles, high-pressure freezing techniques need to be optimized to ensure complete vitrification of the entire plant cell. Using root tips of Arabidopsis seedlings grown on agar and in liquid medium, the team uses cryo-focused ion beam–scanning electron microscopy (cryoFIB-SEM) milling, including Cryo-LiftOut (CLO) techniques, to prepare ultrathin lamellae for cryo-ET data collection. The project additionally uses integrated fluorescence microscopy (iFLM) approaches to guide milling and target specific fluorescent biosensors for tomographic data collection.

The tools and workflows developed here will be transformative for the field of cryo-ET by demonstrating that fluorescence can be used in cryogenic CLEM experiments to characterize specific aspects of the subcellular chemical environment.

Understanding the Effects of Populus-Mycorrhizal Associations on Plant Productivity and Resistance to Abiotic Stress | Cregger | Oak Ridge National Laboratory | Cregger | Bioenergy | Early Career

The overarching goal of this project is to create sustainable, multipurpose bioeconomies whereby globally important feedstocks can be produced while simultaneously maximizing soil health and mitigating adverse impacts of climatic change. In this project, the unique ability of Populus species to associate with both ectomycorrhizal (ECM) and arbuscular mycorrhizal (AM) fungi will be leveraged to examine how variation in these associations alters plant productivity, abiotic stress response, and belowground soil carbon cycling.

Within the myriad of possible plant-microbe interactions occurring belowground, plant-mycorrhizal associations are widespread with the two most common mycorrhizal types being ECM and AM. Belowground plant interactions with these fungi have been shown to increase water uptake and nutrient acquisition and alter soil carbon storage. It is unclear how these two dominant mycorrhizal fungal types differ in their abilities to offer benefits to the plant and change belowground carbon and nutrient cycling. Most plants associate with one type of mycorrhizal fungi, and individual plant species are less likely to associate with both AM and ECM fungal species. Populus species, however, uniquely associate with AM and ECM simultaneously in natural settings, thus providing an ideal experimental system for examining how mycorrhizal fungal types confer benefits to their host. The team will take advantage of high-throughput plant phenotyping, greenhouse, and field experiments to characterize how variation in Populus-mycorrhizal associations alters the plant response to drought, and researchers will manipulate plant-mycorrhizal interactions to influence plant productivity, increase plant drought tolerance, and enhance soil health.

In objective one of this project, researchers identified drought tolerant and drought susceptible genotypes of Populus trichocarpa and Populus deltoides x P. trichocarpa hybrids. In a series of replicated greenhouse experiments, researchers grew 37 genotypes of P. trichocarpa and 29 unique hybrid genotypes (P. trichocarpa x P. deltoides and P. deltoides x P. trichocarpa) in double-autoclaved potting mix under well-watered and drought conditions. Before manipulating water availability, baseline plant phenotypes (e.g., plant height, stomatal conductance, leaf chlorophyll, leaf protein) were measured. Next, the team initiated an acute drought on half of the plants and monitored soil volumetric water content over the course of 1 week. When plants began to wilt significantly, hyperspectral images of leaf four were captured using a Headwall camera over the 900 to 2500 nm wavelength range to identify early indications of drought tolerance. Further, researchers characterized water-use efficiency, leaf protein, leaf chlorophyll, and changes in plant height, leaf number, and above/below ground biomass. Overall, the team found significant variation in plant phenotype across genotypes and in response to acute drought. Populus genotypes that varied in drought tolerance will be used in upcoming manipulative experiments with mycorrhizal fungi. This work will be expanded to identify P. deltoides genotypes that vary in drought tolerance.

Within objective two, researchers will characterize variation in mycorrhizal community composition, colonization, and abundance across drought tolerant/susceptible Populus species/genotypes. In February and July of 2022, root and rhizosphere soil samples were collected from drought tolerant and susceptible P. trichocarpa in a genome-wide association study (GWAS) plantation in Davis, Calif. Across these genotypes, the team found that both AM and ECM fungi colonized the roots, and drought tolerant genotypes had a greater percentage of hyphae, a greater number of arbuscules, and a larger Hartig net compared to drought susceptible trees. Amplicon and metatranscriptomic sequencing are in progress to characterize the AM and ECM taxa across these trees. Further, culturing of these unique organisms is underway so that they can be used in manipulative experiments.

Combined, these initial results highlight significant genetic variation in the response of Populus to drought when grown without microbial symbionts and further demonstrate variation in belowground mycorrhizal communities across drought tolerant and susceptible genotypes. Plant and fungal resources resulting from these experiments will be used to evaluate how these differences drive changes in host abiotic stress tolerance and soil carbon cycling.

Elucidating the Genetic Components of the Physiological and Metabolic Processes Governed by the TORC Regulatory Module in Poplar | Coleman | University of Maryland | Coleman | Bioenergy | University

The goal of this research is to identify the target of rapamycin complex-1 (TORC1)–mediated regulatory and signaling pathways and the linkage between TOR and Rho of Plants (ROP) GTPase nutrient sensing in poplar to elucidate the functional role of genes regulating nutritional responses. Through experiments using CRISPR gene editing, genomics, biochemical, and computational approaches, the project will decipher the functional significance of poplar TOR coupled to ROP GTPase nutrient sensing and signaling. This will allow the team to elucidate function and divergence in this nutrient signaling relay allowing for identification of the regulatory and signaling networks and hubs.

Understanding the genetic basis of complex physiological and metabolic traits, including resource use efficiency and abiotic stress, is necessary for improving lignocellulose crops for a sustainable biobased economy. Poplar (Populus spp.) is an important and sustainable bioenergy and bioproduct plant feedstock, yet understanding of the pathways and networks governing resource use efficiency and responses to abiotic stresses is poorly developed. Nutrient sensing and signaling is a fundamental mechanism that modulates cellular activities that mediate growth, development, and biomass accrual. The protein TOR kinase is part of an evolutionarily conserved central hub that integrates not only nutrient signals but also signals related to energy, hormones, and biotic and abiotic stresses through its pivotal role in regulating transcription, translation, and metabolism. In plants, TOR signaling of diverse nitrogen signals has been linked to signal integration by ROP GTPases that modulate TOR activation and signaling outputs. Although the ROP GTPase/TOR signaling relay has been shown to be important to integrating nitrogen signals, little is known about these signaling components in poplar. Interestingly, compared to other plant species, such as Arabidopsis and rice, poplar contains two TOR genes, and researchers have discovered that the two poplar TOR genes can be gene edited to vary gene dosage, resulting in viable plants with discernible phenotypes. Since in other plant species TOR knockouts or nulls are embryo lethal, researchers have a unique opportunity to employ genome editing to determine if the two poplar TOR genes have functionally diverged along with establishing the role of the TOR/ROP GTPase relay in nitrogen sensing and signaling.

The overall objective of this research is to identify the TOR-mediated regulatory and signaling pathways and the linkages between TOR and ROP GTPase nutrient sensing in poplar to elucidate the functional roles of these genes in regulating nutritional responses. Through experiments that use CRISPR gene editing, genomics, biochemical, and computational approaches, the project will decipher the functional role and divergence of the two poplar TOR genes by altering gene dosage and identifying the functional linkage between ROP GTPase signal integration and TOR activity. The specific objectives of the project are to (1) characterize carbon and nitrogen mediated TOR activation in poplar cells; (2) alter gene dosage of the two poplar TOR genes to determine their functional roles and divergence in carbon and nitrogen signaling; (3) determine which members of the small GTPase ROP gene family can integrate diverse nitrogen signals to activate TOR; (4) identify how the regulatory and signaling networks and associated regulatory factors downstream of the ROP-GTPase-TOR nutrient sensing and signaling relay have diverged; and (5) validate the function of genetic factors downstream of the ROP GTPase/TORC relay in nitrogen sensing and signaling.

The results of this project will allow the team to elucidate function and divergence in this nutrient signaling relay allowing for identification of the regulatory and signaling networks and hubs involved in nitrogen responses in poplar. Although this research is focused on nitrogen nutrition, it is likely that the results of this project will also uncover aspects of how TOR and ROP GTPases modulate other processes such as abiotic and biotic stress, and developmental processes regulating biomass yield.

Construction of a Synthetic 57-Codon E. coli Chromosome to Achieve Resistance to All Natural Viruses, Prevent Horizontal Gene Transfer, and Enable Biocontainment | Church | Harvard Medical School | Nyerges | Biosystems Design | University

The project is finalizing the construction of a fully recoded 3.97 Mb Escherichia coli genome that relies on the use of only 57 genetic codons. For this aim, the genome was computationally designed, synthesized, and assembled into 88 segments. In the final steps of genome construction, the team combines and optimizes these segments in vivo to assemble the fully recoded, viable chromosome. In parallel with the construction of this 57-codon organism, the team investigates whether mobile genetic elements and environmental viruses can overcome the genetic isolation of organisms bearing modified genetic codes.

The team presents the construction of a recoded, 57-codon E. coli genome, in which seven codons are replaced with synonymous alternatives in all protein-coding genes. For this aim, the entirely synthetic recoded genome was assembled as 88 episomal segments of 25-48 kb each, individually tested for functionality, and then integrated into the genome. Developing a specialized integration system and optimizing workflow enhanced integration efficiency to 100%, resulting in an order-of-magnitude increase in construction speed. Researchers are now combining recoded genomic clusters with a novel technology that builds on the latest developments in recombineering and CRISPR-associated nucleases (Wannier et al. 2020, 2021). In parallel with genome construction, researchers developed novel experimental methods to identify fitness-decreasing changes and troubleshoot these cases. Leveraging massively parallel genome editing and accelerated laboratory evolution allowed correction of partially recoded strains’ fitness within weeks (Nyerges et al. 2018). As the final assembly of this E. coli genome approaches, dependency on non-standard amino acids is also being implemented.
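The recoding idea in this paragraph, replacing a set of codons with synonymous alternatives in every protein-coding gene, can be illustrated with a minimal sketch. The abstract does not list the seven codons or their replacements, so the mapping below is a hypothetical example chosen only to be synonymous under the standard genetic code; the function names are likewise illustrative and are not the project's design software.

```python
# Hypothetical mapping of seven "forbidden" codons to synonymous replacements.
# ASSUMPTION: this specific set is illustrative; the abstract does not name the codons.
FORBIDDEN_TO_SYNONYM = {
    "AGA": "CGT", "AGG": "CGT",  # arginine -> arginine
    "AGC": "TCT", "AGT": "TCT",  # serine -> serine
    "TTA": "CTG", "TTG": "CTG",  # leucine -> leucine
    "TAG": "TAA",                # stop -> stop
}


def split_codons(cds):
    """Split an in-frame coding sequence into a list of three-base codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def recode(cds, mapping=FORBIDDEN_TO_SYNONYM):
    """Replace every in-frame forbidden codon with its synonymous alternative."""
    return "".join(mapping.get(codon, codon) for codon in split_codons(cds))


def forbidden_codons_present(cds, mapping=FORBIDDEN_TO_SYNONYM):
    """Return the set of forbidden codons still present in frame."""
    return set(split_codons(cds)) & set(mapping)


if __name__ == "__main__":
    gene = "ATGAGATTATAG"  # toy sequence: Met-Arg-Leu-stop
    print(recode(gene))
    print(forbidden_codons_present(recode(gene)))
```

Removing seven of the 64 codons leaves the 57 referred to in the project title. A real genome design would also have to handle cases this sketch ignores, such as overlapping genes or regulatory sequences embedded in coding regions.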

Previous experiments showed that rational genetic code engineering could isolate Genetically Modified Organisms (GMOs) from natural ecosystems by providing resistance to viral infections and blocking horizontal gene transfer (HGT); however, the question of how natural mobile genetic elements and viruses could cross this genetic-code-based barrier remained unanswered. By systematically investigating HGT into E. coli Syn61∆3, an E. coli strain with a synthetic, 61-codon genetic code, researchers discovered that transfer RNAs (tRNAs) expressed by viruses and other mobile genetic elements readily substitute for cellular tRNAs and abolish genetic-code–based resistance to HGT (Nyerges et al. 2022). The team also discovered 12 new bacteriophages in environmental samples that can infect and lyse this 61-codon organism. These viruses express 10-27 tRNAs, including functional tRNAs needed to replace the host’s missing tRNA genes. Researchers also identified viruses with tRNAs that hold the potential to abolish the virus resistance of the 57-codon organism. These findings suggest that the selection pressure of organisms with compressed genetic codes can facilitate the rapid evolution of viruses and mobile genetic elements capable of crossing a genetic-code–based barrier. Therefore, additional genetic biocontainment technologies were developed to simultaneously block GMOs’ unwanted proliferation, eliminate viral infections, and prevent transgene escape (Nyerges et al. 2022).

In sum, this genome synthesis work will soon (1) demonstrate the first 57-codon organism, (2) establish a tightly biocontained chassis for new-to-nature protein production, and (3) open a new avenue for the bottom-up synthesis and refactoring of microbial genomes, both computationally and experimentally. Furthermore, this research demonstrates that horizontally transferred tRNA genes of mobile genetic elements and viruses can substitute for deleted cellular tRNAs and thus rapidly abolish compressed genetic codes’ resistance to viral infections and HGT.

Expanding Knowledge of Bacterial-Fungal Interactions in Environmental Microbiomes | Chain | Los Alamos National Laboratory | Robinson | Bioenergy | Bacterial-Fungal Interactions

As part of the LANL Science Focus Area (SFA) on Bacterial-Fungal Interactions (BFIs), researchers are developing novel bioinformatic and experimental tools and resources for the identification and characterization of BFIs that occur within complex natural microbiomes. The theoretical framework of this project is built upon gaining a more comprehensive understanding of how bacteria and fungi sense, respond to, and co-evolve with one another using multiomics-based interrogations. A more complete understanding of the molecular mechanisms underlying BFIs allows interrogation of how the dynamics of these relationships are altered in the context of environmental change (e.g., nutrient availability, temperature). Through these studies, the team hopes to gain a predictive understanding of how these interactions impact microbiome function and how they may be altered to steer the function of soil ecosystems and increase resilience to climate change and other environmental perturbations.

Diverse members of the bacterial and fungal kingdoms often co-dominate environmental microbiomes. Over the past decade, it has become clear that members from these two kingdoms frequently interact (Robinson et al. 2021). However, many gaps in knowledge, resources, and data remain in the field of BFIs. The BFI SFA has performed a number of investigations that have provided increased knowledge on the diversity of these interactions, how bacteria and fungi co-evolve, the molecular mechanisms that drive BFIs, and other important areas within the field. These investigations have involved taxonomically diverse bacteria and fungi, including a mix of model organisms, which provide high tractability and resources for experimental work, and non-model organisms from dryland environments. This mix has provided fundamental knowledge on the mechanisms employed by bacteria and fungi to sense and respond to one another using model BFIs, while enabling continued investigations into the diversity of these mechanisms in natural systems through the use of non-model BFIs. Furthermore, this approach allows the assessment of how BFIs may be altered over evolutionary time or as a result of changes in their environment. This poster will highlight several of the most substantial results and provide perspective on integrating results from multiple investigations to gain a more complete understanding of how and why BFIs occur and the potential impacts of BFIs on microbiome dynamics and function.

An Online Public Resource for Bacterial-Fungal Interaction Research | Chain | Los Alamos National Laboratory | Chain | Bioenergy | Bacterial-Fungal Interactions

As part of the LANL Science Focus Area (SFA) on Bacterial-Fungal Interactions (BFIs), researchers are developing novel bioinformatic and experimental tools and resources for the identification and characterization of BFIs that occur within complex natural microbiomes. The theoretical framework of this project is built upon gaining a more comprehensive understanding of how bacteria and fungi sense, respond to, and co-evolve with one another using multiomics-based interrogations. A more complete understanding of the molecular mechanisms underlying BFIs allows interrogation of how the dynamics of these relationships are altered in the context of environmental change (e.g., nutrient availability, temperature). Through these studies, the team hopes to gain a predictive understanding of how these interactions impact microbiome function and how they may be altered to steer the function of soil ecosystems and increase resilience to climate change and other environmental perturbations.

The burgeoning field of BFI research is quickly gaining interest in broader fields such as microbial ecology and evolution. This is at least partially because current knowledge on BFIs suggests that interactions between these two kingdoms are quite common (Robinson et al. 2021). Furthermore, it has been demonstrated that BFIs can have direct impacts on the ecological functions performed by participating bacterial and fungal partners, suggesting that BFIs play an important role in the ecology and evolution of environmental microbiomes (Deveau et al. 2018; Pierce et al. 2021). However, it is currently very challenging to assess the state of the field, particularly with respect to which bacterial and fungal taxa have been previously reported as participants in BFIs. This is due to the non-standardized methods and even formats in which BFIs are reported in the literature, as different descriptors (BFIs, endofungal, symbiotic, etc.) are often used, and some interactions are only reported in tables and/or figures. To address this problem, the BFI SFA has performed a comprehensive search to establish a database containing current knowledge on BFIs and which bacterial and fungal taxa participate in them. This database has been integrated into the BFI Research Portal (https://sfa-bfi.edgebioinformatics.org/search), which can be queried using specific taxa to identify whether any known BFIs involving those taxa have been previously reported in the literature. The database is a dynamic resource that will be updated as new BFI descriptions are published, and users can submit descriptions of BFIs not presently represented in the database for inclusion after internal review. Additionally, the team is continually analyzing non-published fungal sequencing data from the National Center for Biotechnology Information Sequence Read Archive to find potential bacterial associates/interactions.
Interactive visual outputs allow users to expand upon their initial queries to gain insights as to the diversity of BFIs related to their taxa or lineages of interest. This centralized resource will continually be developed to allow researchers in the field the ability to conduct BFI relevant analyses using custom bioinformatic workflows and enable the BFI database to be integrated into any analyses performed within the portal.

Multiomics-Driven Microbial Model Optimization | Carothers | University of Washington | Shin | Biosystems Design | University

The project’s goal is to create genome-scale models of endogenous metabolic pathways and develop metabolic sensitivity maps to identify reactions that dominate the control of flux. Specifically, the team aims to gain insight into the regulatory mechanisms within pathways and predict outcomes of metabolic interventions at the genome scale.

Biomanufacturing offers a sustainable way to reduce humanity’s reliance on petrochemical-derived commodity products. Despite the advent of omics data and genome-scale models, there is no straightforward process for integrating all these data to design biochemical pathways that produce chemicals at an industrial scale. To understand and engineer metabolism, researchers must identify which enzymes exert the most influence on metabolite concentrations and fluxes through the biochemical pathway. Theoretical work also suggests that the set of enzymes exerting control over the pathway changes under different growth conditions. To identify these influential enzymes, steady-state enzyme perturbation data are used within a genome-scale context containing multiple metabolic engineering interventions. Researchers approximate Michaelis-Menten kinetics near the reference steady state with a lin-log model and supply these calculations to a Bayesian inference model. The inference model estimates each reaction’s influence on the metabolic pathway and thus provides metabolic intervention targets for improving bioproduction titers and rates. This method was successfully applied to estimate sensitivities in yeast metabolism (St. John et al. 2019; McNaughton et al. 2021). This poster will present the extension of this approach to Pseudomonas putida.
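As a rough illustration of the lin-log approximation mentioned above, the sketch below evaluates a lin-log rate law and estimates flux control coefficients for a toy two-step pathway with one internal metabolite. It does not reproduce the project's Bayesian inference model; the pathway, the elasticity values, and all function names are hypothetical and chosen only to show how enzyme perturbations near a reference steady state translate into sensitivities.

```python
import math

def linlog_rate(v_ref, enzyme_ratio, elasticities, log_conc_ratios):
    """Lin-log rate law: v = v_ref * (e/e_ref) * (1 + sum_i eps_i * ln(x_i/x_i_ref))."""
    driving = 1.0 + sum(eps * chi for eps, chi in zip(elasticities, log_conc_ratios))
    return v_ref * enzyme_ratio * driving

def steady_state_flux(e1, e2, eps1, eps2, v_ref=1.0):
    """Flux through a toy pathway S -> X -> P with one internal metabolite X.

    e1, e2 are enzyme levels relative to the reference state; eps1, eps2 are the
    (assumed) elasticities of reactions 1 and 2 toward X. At steady state both
    rates are equal, which gives chi = ln(x/x_ref) in closed form.
    """
    chi = (e1 - e2) / (e2 * eps2 - e1 * eps1)
    return linlog_rate(v_ref, e1, [eps1], [chi])

def flux_control_coefficients(eps1, eps2, step=1e-6):
    """Estimate d ln(J) / d ln(e_i) by perturbing each enzyme level slightly."""
    j_ref = steady_state_flux(1.0, 1.0, eps1, eps2)
    j_e1 = steady_state_flux(1.0 + step, 1.0, eps1, eps2)
    j_e2 = steady_state_flux(1.0, 1.0 + step, eps1, eps2)
    scale = math.log(1.0 + step)
    return math.log(j_e1 / j_ref) / scale, math.log(j_e2 / j_ref) / scale

# Hypothetical elasticities: X inhibits reaction 1 and is the substrate of reaction 2.
c1, c2 = flux_control_coefficients(eps1=-0.5, eps2=1.0)
print(round(c1, 3), round(c2, 3))
```

In this toy case the two coefficients sum to one, as the summation theorem of metabolic control analysis requires, and the reaction with the larger coefficient is the more attractive intervention target.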

Genome-Wide Gene Regulation by Transcriptional CRISPRa/i Tools in Non-Model Bacteria | Carothers | University of Washington | Kiattisewee | Biosystems Design | University

CRISPR activation (CRISPRa) and CRISPR interference (CRISPRi) are modular tools that can regulate gene expression of both heterologous and endogenous genes of microorganisms. The project’s goal is to use CRISPRa/i to build large gene regulatory networks (GRNs) spanning more than 25 genes. This study will establish a new paradigm for genome-wide design and significantly improve the ability to engineer microbes for next-generation bioproduction applications.

CRISPRa, developed by this group, is an emerging tool for transcriptional regulation in bacteria that can modulate gene expression in trans without directly modifying the DNA target (Dong et al. 2018; Fontana et al. 2020). The team also characterized the rules for effective CRISPRa in Escherichia coli and Pseudomonas putida, potential chassis for bioproduction of aromatic compounds, and found that the rules governing highly functional CRISPRa are portable across organisms (Fontana et al. 2020; Kiattisewee et al. 2021). With the incorporation of protospacer adjacent motif–flexible dCas9 proteins, researchers can target almost any endogenous gene in the bacterial genome (Kiattisewee et al. 2022), and by combining CRISPRa with CRISPRi gene repression, gene expression can be tuned up or down to any desired level (Tickman et al. 2021). In this study, researchers have demonstrated that CRISPRa/i tools can be used to control heterologous gene expression for bioproduction of various fine chemicals. The team has also investigated CRISPRa/i of more than 25 endogenous genes related to carbohydrate, amino acid, and fatty acid metabolism. This CRISPRa/i platform should make it possible to combine heterologous and endogenous gene regulation and further accelerate Design-Build-Test-Learn cycles of strain engineering in non-model bacteria.
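To illustrate why a more flexible protospacer adjacent motif (PAM) requirement expands the set of targetable sites, the sketch below scans one DNA strand for 20-nt spacers followed by a PAM. NGG is the canonical PAM of Streptococcus pyogenes Cas9; the relaxed pattern NGN is used here only as an illustrative assumption and is not taken from the abstract, which does not name the dCas9 variants. The function name and the example sequence are hypothetical, and real guide design must also consider the reverse strand and the distance rules for CRISPRa.

```python
import re

def find_pam_sites(sequence, pam_pattern, spacer_len=20):
    """Return 0-based start positions of spacers that are immediately followed
    by a match to pam_pattern (IUPAC 'N' = any base). Top strand only."""
    seq = sequence.upper()
    pam_regex = re.compile(pam_pattern.upper().replace("N", "[ACGT]"))
    pam_len = len(pam_pattern)
    sites = []
    for start in range(len(seq) - spacer_len - pam_len + 1):
        pam_start = start + spacer_len
        # fullmatch with pos/endpos checks exactly the PAM-sized window.
        if pam_regex.fullmatch(seq, pam_start, pam_start + pam_len):
            sites.append(start)
    return sites

# Hypothetical 24-nt sequence: a 20-nt stretch followed by "TGGA".
example = "ACGT" * 5 + "TGGA"
strict_sites = find_pam_sites(example, "NGG")   # canonical PAM
relaxed_sites = find_pam_sites(example, "NGN")  # assumed relaxed PAM
print(len(strict_sites), len(relaxed_sites))
```

Comparing the two counts over a promoter region gives a quick sense of how many additional candidate target sites a relaxed PAM would open up.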

FatPlants: A Comprehensive Information System for Lipid-Related Genes and Metabolic Pathways in Plants | Cahoon | University of Nebraska | Durrett | Biosystems Design | University

The team will develop a dedicated web resource that provides a “one-stop” solution for plant acyl-lipid metabolism so that the research community can use it to study lipid science, model lipid networks, and pursue their own hypotheses.

Increasing seed oil content for biofuels and bioproducts by breeding and biotechnology has resulted in trade-offs or penalties with respect to protein content, seed size, or seed fitness. The molecular basis for this impasse is mostly speculative. Current global profiling approaches used to better understand both the metabolic consequences of altered oil content and composition and the basis for reduced yield must also contend with off-target genetic mutations, which confound cause-effect interpretations. The team proposes a diverse, integrated strategy to study the consequences of higher and tailored lipid production using transgenic plants specifically engineered to produce altered seed oil content and composition. As a continuation of a prior project, researchers are developing a “one-stop-shop” community web resource for all data pertaining to modifying oil composition and increasing oil content in plants, leveraging data generated from this project together with curated public data from other funded websites and the literature. The FatPlants framework and tools currently exist for a number of crop and model oilseeds, including camelina (Camelina sativa). As part of the B5 project, the team is expanding these resources to include pennycress (Thlaspi arvense) and Cuphea viscosissima, an “extreme” producer of seeds with oils rich in medium-chain fatty acids. Researchers will present all the known fatty acid–related proteins and genes in these species and overlay these data with lipidomic measurements from seeds of B5 target species. As a comparative analysis tool, FatPlants includes a pathway viewer, a protein structure viewer, the Basic Local Alignment Search Tool, a protein-protein interaction viewer, and a Gene Ontology enrichment viewer. To strengthen interactions among B5 investigators, an internal data-sharing space with user authentication has been provided to all collaborating labs. The website is publicly available as a community tool at www.fatplants.net.

B5: Bigger Better Brassicaceae Biofuels and Bioproducts—An Overview | Cahoon | University of Nebraska–Lincoln | Cahoon | Biosystems Design | University

The project addresses three goals:

  • Systems-guided interrogation of the plastid bio-factory for enhanced production of fatty acids and predictable production of fatty acids with tailored chain-lengths
  • Synthetic biology tool development for predictable and high-throughput oilseed crop engineering
  • Integration of the redesigned plastid bio-factory with extra-plastidial metabolism for enhanced oils and biocontainment.

B5 will address the imperative need for sustainable liquid fuels and oils of defined structures desired by the U.S. bioenergy and oleochemical sectors. The project will integrate fundamental knowledge generation and synthetic biology tool development to predictably and more rapidly develop non-food Brassicaceae oilseeds that produce high quantities of oils and oils with tailored fatty acid compositions (Figure 1). B5’s multidisciplinary team will interrogate plastid metabolic circuitry for carbon flux through fatty acid biosynthesis in seeds of the Brassicaceae species pennycress and camelina. Focus on both species will generate “rules” for next-generation metabolic engineering of Brassicaceae oilseeds and provide higher-value and broader cover/rotation crop options for U.S. farmers. B5 efforts will be guided by mathematical models as well as biochemical data acquired from seeds of metabolically “extreme” species that produce exceptionally high levels of medium-chain fatty acids. In concert, B5 will develop synthetic biology tools to deliver transgene combinations into defined genome regions and advanced gene-editing methods for tunable up- or down-regulation or replacement of endogenous genes. Aided by a comprehensive analytical learning platform and computational models, B5 will integrate data and toolsets to develop enhanced pennycress and camelina germplasm through design, build, test, learn (DBTL) cycles. Given the central metabolic role of fatty acids in the cell, robust and integrated DBTL cycles will be key to discovering how plants “fight back” against lipid metabolic remodeling. The high-quality genome sequences, plethora of genomic resources, existing metabolic engineering toolbox, and simple Agrobacterium-based floral infiltration transformation systems make both pennycress and camelina ideal crops for modified oil production.
These attributes will also accelerate synthetic biology chassis optimization and introduction of genetic biocontainment technology for safe, sustainable production on marginal and underutilized land across wide portions of the United States. The availability of U.S. Department of Agriculture–Animal and Plant Health Inspection Service regulated field sites and a high-throughput camera-based phenotyping system will facilitate agronomic evaluation of engineered germplasm under diverse environments. In addition to significant fundamental and translational outputs, B5 will further develop extant databases (e.g., FatPlants, ARALIP) for the scientific community and train nine undergraduates, six graduate students, and 11 postdoctoral scientists as the next generation of investigators to tackle U.S. and global energy security, natural resources, and environmental challenges.

Changes in Amino Acid Distribution Across Populus trichocarpa Roots with and Without Microbes in a Rhizosphere-on-a-Chip Habitat | Cahill | Oak Ridge National Laboratory | Cahill | Bioimaging

The goal of this project is to develop new technologies to image changes in chemistry occurring in the rhizosphere in living biosystems. The team created in situ Liquid Extraction Mass Spectrometry (in situ-LE-MS), a liquid microjunction-surface sampling probe mass spectrometry (LMJ-SSP-MS) imaging modality that enables non-destructive imaging of plant rhizospheres with broad chemical coverage and chemical specificity. In situ measurement of amino acid distributions was achieved for Populus cuttings grown in a synthetic rhizosphere-on-a-chip system. Amino acid distributions varied across the Populus root structure. Co-culture with rhizosphere bacteria altered amino acid distributions in a species-specific manner. These data provide new insight into the high degree of spatial variance in root exudates within the rhizosphere.

The rhizosphere is an incredibly complex environment, containing thousands of unique exogenous chemical species arranged in a complex spatial network. Such compounds are known to affect plant-microbe organization, interactions, and, ultimately, growth and survival. Because of this importance, the role of exogenous compounds in the rhizosphere is under active investigation, specifically the relationship between plant physiology and the spatiotemporal distribution of molecular components. These compounds include, among others, organic acids, polysaccharides, proteins, and amino acids (AAs), which can play a variety of roles in the rhizosphere, including acting as a nutrient source for microbial colonization or as a deterrent against pathogenic species. The exudation of AAs is one of the biggest components of plant carbon loss and, when AAs are released into carbon-deficient soil, can create hot spots of significantly enhanced microbial growth. In turn, microbes alter the relative distribution of exuded AAs, which can then be re-assimilated by the plant. However, directly measuring the spatial distribution of AAs and other exogenous compounds is challenging because of the complex and dynamic nature of the rhizosphere and its limited accessibility for analysis. As a result, even for abundant molecules like AAs, little is known about how they are spatially distributed along plant roots, how their distribution changes over time, and how their distribution affects microbial composition and, ultimately, plant health.

To understand the complex chemical dynamics of exuded compounds in the rhizosphere, the team developed in situ-LE-MS, an LMJ-SSP-MS imaging modality designed to measure exudate chemistry in situ in synthetic rhizosphere-on-a-chip environments. Uniquely, this technology extracts a very small volume of liquid from the rhizosphere and can measure exudates without additional sample preparation, enabling in situ, non-destructive MS imaging for the first time. Here, researchers chemically imaged AA distributions across Populus plants grown in rhizosphere-on-a-chip systems. Populus was cultured with the rhizosphere bacteria CF313 and PM419, individually and in co-culture, and compared with a control (Populus without bacteria). Multivariate analyses were used to identify AA distributions that differed between samples and to relate them to root morphology annotated through brightfield imaging of root structure.

Metabolome-Informed Proteome Imaging of Lignocellulose Decomposition by a Naturally Evolved Fungal Garden Microbial Consortium | Burnum-Johnson | Pacific Northwest National Laboratory | Burnum-Johnson | Environmental Microbiome | Early Career

The objective of this Early Career Research project is to gain transformative molecular-level insights into microbial lignocellulose deconstruction through comprehensive and informative review of underlying biological pathway data yielded by the integration of spatiotemporal multiomic measurements (i.e., proteomics, metabolomics, and lipidomics). One of this project’s goals is to uncover the mechanisms that drive cooperative fungal-bacterial interactions that result in the degradation of lignocellulosic plant material in the leafcutter ant fungal garden ecosystem. The project’s approach will enrich the current knowledge base needed for a predictive systems-level understanding of the fungal-bacterial metabolic and signaling interactions that occur during cellulose deconstruction in an efficient, natural ecosystem.

The leafcutter ant fungal garden is known as a natural model system for efficient plant matter degradation. The degradation processes are largely mediated by the symbiotic fungal and bacterial members of the complex microbial consortium. These symbiotic microbes, each with unique metabolic capabilities, are however heterogeneously organized in space within the sample. Previous mass spectrometry (MS) studies profiled molecules from bulk fungal garden samples, thereby averaging the biological processes across the ecosystem and masking their spatial localization, biological origin, and molecular dynamics (Khadempour et al. 2021). To overcome this limitation, researchers performed microscale imaging across 12 µm-thick fungal garden serial sections by applying a metabolome-informed proteome imaging (MIPI) approach. This approach combines two spatial multiomics MS modalities to obtain comprehensive molecular characterization across and through the fungal garden. Matrix-assisted laser desorption/ionization (MALDI) imaging profiled metabolites with a spatial resolution of 50 µm and correlated morphologically unique features with metabolome profiles of interest (i.e., lignocellulose degradation). The identified regions of interest (ROIs) were selected for subsequent microdissection and microscale proteomic imaging using microPOTS (microdroplet processing in one pot for trace samples) with an integrated metaproteomic approach to detect metabolic activities and identify microbial community members.

Untargeted MALDI-Fourier-transform ion cyclotron resonance–MS imaging analysis revealed heterogeneous spatial distribution of various molecular features across the fungal garden sections. Researchers leveraged the METASPACE annotation platform to search against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and tentatively annotate 650 unique metabolites, colocalizing metabolomic signatures with distinct microscopic features. The MALDI images mapped the presence of phenylpropanoids, benzaldehydes, flavonoids, plant hormones, Krebs cycle compounds, sugars, amino acids, and other molecules produced by the complex community in the fungal garden. Differential relative abundance and accumulation of low molecular weight lignin products (coniferyl alcohol, coniferyl aldehyde, sinapoyl aldehyde, cinnamate, ferulate, caffeate, vanillin, etc.) were observed, indicating specific spatial patterns. Another observed spatial pattern colocalized with a unique ant wing-like feature that was characterized mainly by primary metabolites, such as soluble sugars, amino acids, and fatty acids. Informed by these metabolomic features, the team selected the wing and three additional ROIs characterized as lignocellulose degradation hotspots for subsequent microPOTS metaproteomics analyses. Selected ROIs and their biological replicates were dissected using a laser capture microdissection system and collected in individual wells of the microPOTS chip. Peptides resulting from on-chip sample preparation were analyzed by liquid chromatography (LC)-MS/MS. For the metaproteomic analyses, a reference database was first curated from 50 million proteins of known members of the consortium, which were grouped into >24 million clusters based on sequence similarity, to annotate the high-resolution tandem MS spectra with stringent matching criteria.
A total of 7,392 non-conservative and taxon-specific peptides that mapped to 2,239 unique protein clusters were detected, unveiling a complex community. Arthropod peptides (5,178) were highly represented and observed only in the wing ROI, while fungal peptides (1,825) and comparatively low-abundance plant (552) and bacterial (47) peptides were localized in the other three ROIs mapped as lignocellulose degradation zones. Metaproteomics data revealed the presence of a fungal ligninolytic auxiliary enzyme and several fungal carbohydrate-active enzymes, such as hemicellulases, cellulases, pectinases, and amylases, in the lignocellulose degradation hotspot ROIs. The metabolic functions detected at the microscale provide more direct evidence that fungi cultivated by leafcutter ants, such as Leucoagaricus gongylophorus, degrade plant cell walls in the leafcutter ant garden ecosystem.

Leveraging the capability of MIPI to spatially profile a plethora of metabolites and peptides provided molecular insight into species-specific activities in this multimember, heterogeneous ecosystem. Integration of MIPI data unraveled some of the processes in this complex ecosystem by reconstructing crucial parts of the lignocellulose decomposition pathways in distinct microscopic ROIs. Integrating the enzyme and metabolite data showed a strong correlation in abundance and spatial localization between the two omics modalities. Mechanistic understanding of this symbiotic system can aid in the biological production of biofuel precursors and bioproducts from plant biomass. This novel micron-scale multiomic MS workflow can be applied to other complex and heterogeneous biological systems to enhance the understanding of community member interactions and dynamics.

BioPoplar: A Tunable Chassis for Diversified Bioproduct Production | Buell | Agricultural Research Service, U.S. Department of Agriculture | Buell | Biosystems Design | University

Domestication and breeding efforts have shown that selection of specific plant architecture traits across a wide array of plant species, both annuals and perennials, results in improved traits for human use, whether for food, feed, or fuel. Similarly, selective breeding can yield distinct chemotypes of crops with desired chemical profiles or compositions. Today, precise knowledge of gene regulation and function can be generated through high-resolution omics technologies, and a synthetic biology toolkit can be constructed to engineer plant genomes at the DNA sequence, chromatin accessibility, and expression levels. Thus, society has entered an era where it is possible to model, design, and then engineer precise changes in plant genomes that will lead to predictable, modified traits.

In this project, researchers will re-engineer poplar as a multipurpose crop that can be used for bioenergy, biomaterial, and bioproduct production. A cell atlas will be generated that encompasses gene expression, gene regulatory networks, and the cis-regulatory elements responsible for gene expression at the cell-type level, providing the requisite knowledge base and tools for precision bio-based design and fabrication of multipurpose poplar. The team will couple single-cell datasets with new genome and epigenome editing tools to develop new morphotypes of poplar that have altered tree and leaf architecture. These morphotypes will substantially improve biomass potential via increased stand density and tree integrity, photosynthetic capture, and trichome density, and will serve as the foundational chassis. These chassis will have altered leaf-to-stem ratios and/or trichome density, and researchers can further engineer their cell wall composition and/or introduce novel molecules such as precursors for drop-in fuels, thus making chemotypes of poplar that are ‘customized’ to their biomaterial or bioproduct applications and simultaneously ‘maximized’ in optimal morphotypes. The project will employ an iterative design process in which metabolic pathways are optimized to create unique chemotypes with tailored biomaterial and bioproduct composition.

This project will yield poplar chassis with multipurpose uses including bioenergy, biomaterials, and bioproduct production. The generation of a robust cell-type–specific set of transcription factors and cis-regulatory elements, together with the ability to modulate gene expression at high resolution (i.e., in specific cell types), will enable precision genome engineering of metabolism, a significant advance in the ability to modulate plant biochemistry. The change in architecture will be exploited to permit production of bioproducts (drop-in fuel precursors) in leaves and biomaterials (modified wood composition) in wood, as well as changes in agronomic production practices, such as increased stand density, that lead to increased yield. Collectively, these engineered chassis and tools provide the platform for a new era of poplar biology, agronomy, and processing.

Synthetic Membrane Biology of Microbial Cell Factories: Lipid Interactions that Shape the Inner Mitochondrial Membrane | Budin | University of California–San Diego | Budin | Bioenergy | Early Career

The goal of this Early Career Program project is to engineer the structure and properties of cell membranes to improve the performance of industrially relevant microbes. The project’s first objective is to enhance the rate and efficiency of the respiratory metabolism by engineering the organization of the Electron Transport Chain. Engineering efforts will define the limits of respiratory metabolism and seek to increase the production of energy-intensive next-generation biofuels. The second objective is to apply the emerging biochemistry of intracellular lipid trafficking pathways to develop new transporters for the capture of valuable biochemicals produced by the engineered yeast.

The inner mitochondrial membrane (IMM) is the site of bulk ATP generation for yeast cell factories and is thus essential for production of high-energy biofuels and bioproducts. The IMM is defined by highly curved cristae membranes (CM), whose lipidome is composed of unsaturated phospholipids and cardiolipin (CL). Recent efforts combined experimental lipidome dissection with multiscale modeling to investigate how lipid interactions shape CM morphology and metabolic function. When modulating fatty acid unsaturation in engineered yeast strains, the team observed that loss of di-unsaturated phospholipids (PLs) led to a surprising breakpoint in IMM topology and respiratory capacity. PL unsaturation modulates the organization of ATP synthases that shape cristae ridges, phenocopying the loss of CM-shaping proteins. Based on molecular modeling of mitochondrial-specific membrane adaptations, the team hypothesized that conical lipids like CL buffer against the effects of saturation on the IMM. Loss of CL was found to collapse the IMM at intermediate levels of PL saturation, a function that is independent of ATP synthase oligomerization. To explain this interaction, researchers employed a continuum modeling approach, finding that lipid- and protein-mediated curvatures act in concert to form curved membranes in the IMM. These results suggested that fermentation conditions that alter the fatty acid pool, such as oxygen availability or overproduction of saturated fatty acids in engineered strains, determine the importance of CL. While loss of CL has only a minimal phenotype in highly aerated shake flasks, its synthesis was found to be essential in microaerobic fermenters, which promote saturated lipidomes. Lipid- and protein-mediated mechanisms of curvature generation thus act together to support mitochondrial architecture in industrially relevant environments.

Developing Chassis for LDPE Upcycling from Microbes Native to the Gut Microbiome of Yellow Mealworms | Blenner | University of Delaware | Klauer | Biosystems Design | University

This project aims to enable the efficient depolymerization of polyolefins and their upcycling to itaconic acid through novel genomic insights into nutrient-enhanced polyolefin degradation by the yellow mealworm gut microbiome and through genetic tool development for gut microbiome isolates and engineered microbial communities.

Waste plastics represent a significant untapped source of carbon. Annually, more than 200 million tonnes of plastic waste in the form of high-density polyethylene (HDPE), low-density polyethylene (LDPE), and polypropylene (PP) are generated and pollute soils, waterways, and bodies. No robust system exists to capture this carbon; however, in prior laboratory work, researchers observed that yellow mealworm gut microbes are able to chemically modify these wastes, suggesting an opportunity for biological upcycling. Microbial profiling of these communities reveals non-model taxa as the dominant microbes. Furthermore, the plastic-degrading metabolic pathway has not yet been elucidated. A barrier to understanding plastic metabolism and engineering biological upcycling platforms is the lack of genetic tools to manipulate these species.

As a first step, the team previously enriched Tenebrio molitor guts for fast plastic-degrading microbial communities to identify optimal chassis to develop for upcycling. Enrichment included growth on plastics alone and on plastics with nutritional supplements such as oats, bran, and banana peels. Mealworm gut communities supplemented with oats showed the best plastic degradation rate. Researchers are further studying the effect of micronutrients and macronutrients in mealworm diets to enrich for an optimal polyolefin-degrading community. The nutritional composition of the plastic and oats diet will be altered by supplementing with macronutrients such as protein and fats and with micronutrients such as nitrogen and potassium. From enriched communities, the team plans to cultivate key players in the degradation process and understand how the optimized community is structured, informing future creation of reduced-complexity engineered microbial communities for plastic degradation and upcycling.

More than 300 microbial isolates were cultivated from the microbial communities of LDPE-, HDPE-, and PP-fed mealworm guts, with and without oat supplementation, by applying standard microbiological methods and isolating morphologically distinct colonies from each gut condition. By screening these isolates for growth with polyethylene powder as the primary carbon source, 30 taxonomically unique isolates were found to use LDPE powder as a carbon source for growth in a liquid mineral medium. Improved growth with LDPE particles as the primary carbon source, relative to the mineral medium alone, suggested that these microbes participate in the degradation process. Scanning electron microscopy (SEM) provided evidence of LDPE degradation, showing microbial colonization of plastic particles, biofilm formation, and surface modifications to the particles. In contrast, no surface modifications were observed via SEM on LDPE films treated with the same microbial isolates, indicating that degradation efficiency varies with the mechanical and chemical properties of the polymer, such as form factor, additives, and processing conditions. Based on this characterization survey and on abundance in the microbial profiles, five taxa belonging to the genera Staphylococcus, Enterococcus, Corynebacterium, Brevibacterium, and Kocuria were identified as robust chassis for development as upcycling platforms.

To understand the degradation and upcycling potential of these isolates, researchers bioinformatically screened them for polyolefin-degrading enzymes. Polyolefin backbone cleavage is anticipated to be initiated by secreted enzymes via oxygen radical chemistry, because oxygen radicals are highly reactive toward carbon-carbon bonds. Protein families (pfam) that perform oxygen radical chemistry, such as monooxygenases, dioxygenases, and peroxidases, were identified in isolate genomes in order to find PE-active enzymes. Genomes of the isolates that grow best on LDPE were mined for pfam with a high number of genes performing this chemistry relative to closely related microbial taxa. Select genes were heterologously expressed in Escherichia coli, and the resulting enzymes were purified and tested on plastic substrates. Modification of the plastic substrate by enzymatic activity is observed as an increase in percent crystallinity after enzyme treatment of LDPE films.
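The percent crystallinity readout mentioned above can be sketched as follows, assuming it is derived from a melting enthalpy measured by differential scanning calorimetry, which is a common way to estimate polymer crystallinity. The abstract does not state how crystallinity was calculated, so this is only an illustration. The reference enthalpy of 293 J/g for fully crystalline polyethylene is a commonly cited literature value and is exposed as a parameter; the function names and example enthalpies are hypothetical.

```python
def percent_crystallinity(melting_enthalpy_j_per_g, reference_enthalpy_j_per_g=293.0):
    """Percent crystallinity = 100 * (measured melting enthalpy / enthalpy of a
    100% crystalline reference). The 293 J/g default is an assumed literature
    value for polyethylene, not a figure from the abstract."""
    if reference_enthalpy_j_per_g <= 0:
        raise ValueError("reference enthalpy must be positive")
    if melting_enthalpy_j_per_g < 0:
        raise ValueError("melting enthalpy cannot be negative")
    return 100.0 * melting_enthalpy_j_per_g / reference_enthalpy_j_per_g

def crystallinity_change(enthalpy_before, enthalpy_after, reference_enthalpy_j_per_g=293.0):
    """Change in percent crystallinity (percentage points) after a treatment."""
    before = percent_crystallinity(enthalpy_before, reference_enthalpy_j_per_g)
    after = percent_crystallinity(enthalpy_after, reference_enthalpy_j_per_g)
    return after - before

# Hypothetical melting enthalpies (J/g) for an untreated and an enzyme-treated film.
print(crystallinity_change(enthalpy_before=117.2, enthalpy_after=131.85))
```

A positive change would be read as evidence of modification only if it exceeds the variation seen between replicate untreated films.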

As a first step towards the development of robust genetic engineering tools for these isolates, the team is collecting genome and methylome sequence data. The genome sequences are a prerequisite for targeted genetic engineering. The methylome data will help researchers understand which restriction modification systems are present in each isolate and design genetic parts to work in the presence of these systems.

In summary, mealworm microbiomes were enriched for optimum plastics degradation by supplementing with various co-feeds. New co-feed studies will identify key nutrients that further improve plastic degradation. From previously enriched communities, over 300 organisms have been isolated, 30 of which are able to grow with plastic as their primary carbon source. Five organisms from non-model genera appear able to grow on LDPE powder and are somewhat abundant in enriched communities, indicating that these microbes are suitable chassis for engineered polyolefin degradation and upcycling. Enzymes initiating the first degradation step of the upcycling process are being mined bioinformatically from microbial isolates and communities. The degradation capability of these enzymes is being analyzed chemically by Fourier-transform infrared spectroscopy and mechanically by differential scanning calorimetry. In parallel, downstream metabolic degradation processes are being evaluated using metagenomics and metatranscriptomics. Once polyolefin deconstruction processes are elucidated, they will be enhanced in isolated strains and reassembled into reduced-complexity communities, as well as engineered into a heterologous host for the eventual production of itaconic acid from plastic substrates via metabolic engineering of the host.

Analysis of the Beneficial Associations of Sorghum with Arbuscular Mycorrhizal Fungi Studied with Genetics, Genomics, and Microbiomics | Bennetzen | University of Georgia | Bennetzen | Bioenergy | University

This project is designed to investigate the interactions between sorghum genes, arbuscular mycorrhizal fungi (AMF), the sorghum root-associated microbiome, and numerous environmental factors that contribute to sorghum biomass production. A key focus is on host genetics, so the team is investigating 337 Bioenergy Association Panel accessions to perform a full genome-wide association study (GWAS) analysis.

In first-year pilot studies, researchers determined that very different AMF populations were associated with sorghum roots at two field locations in Georgia and Arizona. Root types and developmental stages were also identified that were consistent for AMF abundances, infection structures, and percent colonization, thereby decreasing the number of sample extractions that will be needed in all future experiments. These first-year studies also indicated that a good deal of microbiome analysis could be performed most effectively by shotgun sequence analysis, rather than amplicon analysis.

In second-year studies of input effects, performed at a Georgia field location, the team determined efficient and reproducible methods for sample production and collection, and generated the full set of 4,044 samples (337 genotypes × three replications × four treatments) to determine the effects of phosphate and nitrogen levels on all of the investigated field and sample properties in a randomized complete block design. These measured properties included quantification and type characterization of AMF and other microbes on or within the sorghum roots, expression of eukaryotic genes on or within the sorghum roots, microscopic analysis to classify and quantitate AMF infection structures, and agronomic sorghum traits, such as plant height, plant biomass, plant tillering, plant flowering time, and plant mineral content. Most of these analyses are still in the data generation stage, but researchers did observe that high nitrogen tended to increase tiller number, whereas high phosphate tended to decrease plant height, potentially indicating disruption of mycorrhizal associations at the Georgia field site.
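
The sample count above follows directly from the design. As a minimal sketch, the layout of a randomized complete block design can be enumerated as follows. Only the counts (337 genotypes, four treatments, three replications) come from the abstract; the genotype labels, the treatment names, the assumed 2 × 2 phosphate-by-nitrogen factorial, and the function `rcbd_layout` are hypothetical and for illustration only.

```python
import random
from itertools import product


def rcbd_layout(genotypes, treatments, n_blocks, seed=0):
    """Return (block, plot, genotype, treatment) tuples for a simple RCBD.

    Each block (replication) contains every genotype x treatment
    combination exactly once, in an independently shuffled order.
    """
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        units = list(product(genotypes, treatments))
        rng.shuffle(units)  # randomize plot order within this block
        for plot, (genotype, treatment) in enumerate(units, start=1):
            layout.append((block, plot, genotype, treatment))
    return layout


# Hypothetical labels; the abstract gives only the counts.
genotypes = [f"G{i:03d}" for i in range(1, 338)]          # 337 accessions
treatments = ["lowP_lowN", "lowP_highN",
              "highP_lowN", "highP_highN"]                # assumed 2 x 2 factorial
layout = rcbd_layout(genotypes, treatments, n_blocks=3)

print(len(layout))  # 337 * 4 * 3 samples
```

The actual field layout may differ (for example, a split-plot arrangement of treatments); the sketch only shows how the three factors multiply to the reported sample total.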

The team has also made major technical advances, including creating and training an automated imaging platform to allow high throughput classification and segmentation of AMF structures, such as arbuscules and vesicles, which in turn provided the size, density, and percent colonization of AMF structures in field-grown sorghum roots for GWAS analysis. Over 100 fungal morphotypes have been cultured from sorghum roots in Georgia, including three AMF species that are being propagated on sorghum as pure cultures. In addition, 100+ endophytic bacteria and fungi have been isolated and cultured from first-year studies in Arizona, and about half have been shown to exhibit drought tolerance by a polyethylene glycol assay. These cultured microbes will be used for planned greenhouse tests of hypotheses generated from descriptive field data from the major GWAS fields in Arizona and Georgia.

Understanding and Engineering Crown Root Development to Improve Water-Use Efficiency in Bioenergy Grasses | Baxter | Donald Danforth Plant Science Center | Goudinho Viana | Biosystems Design | University

This study aims to improve water-use efficiency in the bioenergy grass Sorghum bicolor by engineering crown root development. Crown roots play an important role in water and nutrient acquisition. To precisely modify crown root development, this project is engineering the location and gene expression level of a newly identified crown root regulator called Crown Root Defective using synthetic genetic circuits. Since synthetic genetic circuits have never been utilized in crop species before, researchers are currently testing and optimizing individual circuit building blocks in Setaria and Sorghum. The project goal is to enable the construction of circuits that drive gene expression in a predictable manner in these bioenergy grasses. Through the precise control of genes influencing crown root development and the identification of new root development regulators, the team aims to improve water-use efficiency.

S. bicolor is a biofuel crop that offers great potential because of its tolerance to drought and heat and its low cost of production compared to other potential feedstocks. Despite yield gains through breeding, its productivity in suboptimal conditions is still limited. Acquisition of water and nutrients is mediated by the root, and in grasses the mature root system is primarily composed of crown roots (Viana, Scharwies, and Dinneny 2022). In the panicoid grass model Setaria viridis, soil moisture around the crown stimulates the development of crown roots, while drought conditions inhibit their growth, which facilitates the conservation of water (Sebastian et al. 2016). Despite their importance in crop productivity, the genetic mechanisms behind crown root development in response to environmental factors are not well understood. To engineer water-use efficiency in panicoid bioenergy grasses, researchers aimed to elucidate the key genetic mechanisms and subsequently alter crown root development. The team identified a mutant named crown root defective-1 (crd-1) that is specifically impaired in crown root development under well-watered conditions. Interestingly, this defect is rescued under drought stress conditions. Through Bulk Segregant Analysis, the mutation was mapped to a single nucleotide polymorphism that disrupts a splice site, resulting in a premature stop codon in a gene encoding a WD-repeat protein. Researchers independently generated secondary alleles and complementation lines to confirm that the identified gene is in fact the causal gene for the phenotype.

Because of its promising phenotype, the team decided to leverage crd-1 to alter the number of crown roots by precisely modifying its expression level. In the past, synthetic genetic circuits were successfully used in combination with tissue-specific promoters to modify root architecture in Arabidopsis thaliana (Brophy et al. 2022). Despite success in developing tools and modifying model plants, the transfer of these applications to crop plants has not always been successful. The team therefore established transient systems in Sorghum and S. viridis to test and optimize individual components of synthetic circuits for application in grass species. These circuits are built using synthetic transcription factors composed of bacterial DNA-binding proteins fused to transcriptional activation or repression domains. Using transient protoplast expression, the team has shown that the strength of the synthetic transcription factors is influenced by both the DNA-binding protein and the activation domain. Researchers hypothesize this is due to the different conformation of each synthetic transcription factor, resulting in altered accessibility of the activation domains. Moreover, the transcriptional activity of synthetic transcription factors is directly correlated with the number of binding sites used in the synthetic promoters. Together, the results from transient assays show that these parts can be modified in multiple aspects to drive gene expression in a predictable manner. Subsequently, genetic circuits were successfully built that implement the Boolean NOT implies logic operation in grasses.
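
For readers unfamiliar with the logic operation named above, the following is a minimal sketch of its truth table. It assumes that "NOT implies" refers to material nonimplication (often written NIMPLY, that is, A AND NOT B); the abstract does not spell this out, and the function name `nimply` and the inputs `a` and `b` are hypothetical stand-ins for two circuit inputs.

```python
def nimply(a: bool, b: bool) -> bool:
    """Material nonimplication: output is on only when a is on and b is off."""
    return a and not b


# Print the full truth table as 0/1 values: a, b, output.
for a in (False, True):
    for b in (False, True):
        print(int(a), int(b), int(nimply(a, b)))
```

Under this reading, the output gene would be expressed only when the first input is present and the second is absent.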

In future research, the team aims to identify more factors influencing crown root development. WD-repeat proteins can act as scaffold proteins that organize multiprotein complexes and can also regulate gene expression. To understand the function of the WDR6 protein, researchers performed a yeast two-hybrid screen, which led to the discovery of several binding partners, including members of the Growth-Regulating Factor (GRF) family of transcription factors. The team is now working on validating these interactions and planning to conduct functional studies to further explore the role of the GRFs in crown root development. Through the identification of more players in the crown root development pathway and initial data from the genetic circuits, results indicate a promising future for engineering root architecture and other complex traits in bioenergy grasses.

Accelerating Discovery of Genes Regulating Stomatal Patterning and Water Use Efficiency in C4 Crops with Novel High-Throughput Methods for Mutagenesis and Phenotyping | Baxter | Donald Danforth Plant Science Center | Tan | Biosystems Design | University

Bioenergy feedstocks need to be deployed on marginal soils with minimal inputs to be economically viable and have a low environmental impact. Currently, crop water supply is a key limitation to production. The yields of C4 bioenergy crops such as Sorghum bicolor have increased through breeding and improved agronomy. Still, the amount of biomass produced for a given amount of water use (water-use efficiency, or WUE) remains unchanged. Therefore, this project aims to develop novel technologies and methodologies to redesign the bioenergy feedstock sorghum for optimal WUE. Within this broader context, this subproject is using Setaria viridis as a rapid cycling model for gene discovery. The project aims to develop and demonstrate novel methods and resources to accelerate both the production of genetic variants and phenotyping of WUE traits as part of reverse and forward genetics approaches to discover genes regulating stomatal patterning and WUE.

Stomata regulate the exchange of CO2 and water vapor between the leaf and atmosphere, and therefore play a key role in determining WUE. However, relatively little is known about the genes that regulate stomatal patterning and WUE in C4 grasses. Previous work has identified several hundred candidate genes through a combination of genome-wide association study and transcriptome-wide association study. To validate these discoveries and advance efforts to engineer improved WUE of bioenergy crops, the team is developing novel methods to accelerate the use of forward and reverse genetics for gene discovery. Researchers conducted a forward genetic screen of 155 families of an N-nitroso-N-methylurea–mutagenized Setaria population, of which 100 lines show small stature and/or altered leaf color phenotypes. Whole-plant WUE is being assessed by imaging and automated lysimeters. These families are part of a larger population which is being fully sequenced by DOE Joint Genome Institute to create a sequence indexed mutant population. This data is being paired with screening for abnormalities in stomatal patterning. To accelerate a reverse genetic screen, researchers developed new methods for viral delivery of mutagenesis reagents in both Setaria and tobacco and are using these methods for studying stomatal developmental genes through loss- and gain-of-function mutations. Both the forward and reverse genetic approaches utilize high throughput optical tomography imaging to generate high-resolution images of the leaf surface.

A robust machine learning model was developed for identifying the size, shape, and number of epidermal cells in maize, and the team is adapting the model for Setaria. This work demonstrates a positive feedback loop of high-throughput phenotyping to genotyping in which genes of interest can be quickly identified and tested for a role in water-use efficiency. Success in this effort could be leveraged to accelerate research on a wide range of other traits and species.

Quantum Optical Microscopy of Biomolecules near Interfaces and Surfaces (QuOMBIS) | Backlund | University of Illinois Urbana–Champaign | Backlund | Bioimaging

Researchers will develop three complementary microscopy techniques that exploit quantum correlations in light: Hong-Ou-Mandel interferometric tomography, g(2) correlation function imaging, and passive and active transverse mode sorting. Upon initial demonstration, the team will incorporate these methods into a single platform for tracking and imaging individual and few fluorescently labeled biomolecules, including cellulases, in the context of nearby biological interfaces and surfaces in order to unravel the fundamental processes involved in the conversion of lignocellulosic biomass into renewable fuels.

Since the publication of Hooke’s Micrographia in 1665, the scientific disciplines of light microscopy and (sub)cellular biology have progressed in lockstep with one another. Advances in the spatial and temporal resolution, specificity, and sensitivity of optical methods have continually led to new capabilities and insights in biological imaging. The pace of this evolution has quickened in the past century, as a mastery of the physics of light according to Maxwell’s equations has been wielded to more fully exploit classical effects like interference and diffraction. As the classical limits of light microscopy near saturation, however, sustained improvement in bioimaging technology is ultimately untenable without a more fundamental shift in research direction. Just as the field of quantum computing has gained prominence in anticipation of the inevitable breakdown of Moore’s Law, quantum-enabled light microscopy will likely provide the path forward for (sub)cellular biological imaging.

The team aims to help lead this effort by developing three complementary quantum microscopy modalities that each address a different challenge inherent to (sub)cellular microscopy:

  1. Hong-Ou-Mandel Interference Microscopy to enable loss- and noise-tolerant depth imaging with exquisite resolution;
  2. g(2) Microscopy to facilitate orders-of-magnitude sensitivity improvement in focusing and tracking single quantum emitters atop oppressive classical backgrounds at reduced excitation powers; and
  3. Transverse Mode Sorting Microscopy to enable super-resolution microscopy at low excitation powers and high temporal resolution.

Preliminary results demonstrate progress in developing these constituent techniques. The team will ultimately incorporate them into a common imaging platform that can provide access to the many scales of interest in energy-relevant plant and microbial biology. The combined technique, Quantum Optical Microscopy of Biomolecules near Interfaces and Surfaces (QuOMBIS), will be especially powerful for tracking and imaging individual and few fluorescently labeled biomolecules in the context of nearby biological interfaces and surfaces. Upon development of the methods, researchers will apply the platform to unravel and harness the enzymatic conversion of biomass into renewable fuels.

Optical and X-ray Multimodal-Hybrid Microscope Systems for Imaging of Plant Stress Response and Microbial Interactions | Wakatsuki | SLAC National Accelerator Laboratory | Dowlatshahi | Bioimaging

The project is developing next-generation correlative X-ray, light, and electron tomography by incorporating caged fluorescent proteins and metal nanocrystals as both tracers for X-ray imaging/microscopy and electro-optical fluorescence lifetime imaging microscopy (EO-FLIM) and as fiducial markers for cryogenic electron tomography (Cryo-ET; see Fig. 1a). The initial application will be to study plant-bacterial pathogen interaction at the plant cell surface and transport of vesicles. In this early phase of development, researchers are highlighting plans and progress for a high-frequency EO-FLIM setup at the Stanford Synchrotron Radiation Lightsource (SSRL), caging of fluorescent fusion proteins for conjugation to metal nanocrystals with cage protein, and establishing plant systems to monitor membrane trafficking and transport.

X-ray-FLIM coupling. The team’s current EO-FLIM configuration uses 40 and 80 MHz resonant gating (Bowman et al. 2019, Bowman and Kasevich 2021). The plan is to double this to 158.8 MHz in wide-field mode at SSRL X-ray imaging beamline BL 6-2. Accomplishments in this area include: (1) sourcing components for a new FLIM microscope to be integrated into X-ray microscopy beamlines for hybrid, X-ray pulse-locked FLIM (see Fig. 1b) or used as a stand-alone instrument; (2) testing timing patterns at 158.72 MHz with uniform fill (124 bunches) on the SSRL synchrotron for machine safety and electron stability; and (3) developing alternative approaches to produce X-ray excitation “single” pulses using a novel X-ray chopper crystal monochromator that selects individual X-ray pulses from 476.3 MHz using a spinning silicon crystal at 75,000 rpm.

Novel biomarkers for multimodal imaging. The team’s previous work demonstrated the fusion of small proteins to the surface of a protein cage-like structure, referred to as the double-shell system, for cryo-EM structural determination of small biomolecules (Zhang et al. 2022). Modeling supported the feasibility of modifying the existing system to encapsulate cargo, for example by displaying an external affibody and encapsulating a fluorescent protein within the protein cage while leaving headroom for conjugation of metal-based nanocrystals (MNCs) approximately 3-5 nm in size that emit X-ray photoluminescence with compatible short lifetimes (see Fig. 1a).

Key accomplishments in this area include: (1) refining modeling work in AlphaFold; (2) designing protein expression constructs and developing purification strategies; and (3) Cryo-EM analysis of first-generation caged fluorescent proteins (see Fig. 1c).

The Brandizzi group used Arabidopsis thaliana as a model plant species to show that members of the VAMP-associated proteins (VAPs) family, VAP27-1 and VAP27-3, play a critical role in determining the topology of endocytosis in plant cells (Stefano et al. 2018). The goal is to analyze these processes using protein cages (see Fig. 1d) for hybrid multimodal X-ray imaging, Cryo-ET, and super-resolution imaging approaches. While the team develops these systems, members share some progress on establishing plant systems: (1) VAP27-YFP reporter strains grown and maintained for EO-FLIM measurements; and (2) design and generation of constructs for expression and purification of VAP27 proteins and mutants for in vitro characterization supporting future in vivo experiments.

Diffraction-Free Beams in Light-Sheet Microscopy | Vasdekis | University of Idaho | Luo | Bioimaging | University

To overcome the tradeoff between frame rates and levels of irradiance in Raman imaging, the team introduces a prudently constructed light-sheet microscope relying on the Airy beam. This unique propagation-invariant beam also enhances considerably the field of view and contrast in light-sheet microscopy compared to traditional Gaussian beams. Further, the bending properties of accelerating diffraction-free beams are explored to aid the rates and resolution of light-sheet microscopy.

Raman imaging represents only a modest fraction of all research and clinical microscopy to date even though it exhibits great potential. This is because most biomolecules present ultralow Raman scattering cross-sections, which impose low-light or photon-sparse conditions. Bioimaging under such conditions is suboptimal, as it either results in ultralow frame rates or requires increased levels of irradiance. Here, researchers overcome this tradeoff by introducing Raman imaging that operates at both video rates and 1,000-fold lower irradiance than state-of-the-art methods. To accomplish this, researchers deployed a judiciously designed Airy light-sheet microscope to efficiently image large specimen regions (Dunn et al., accepted, PNAS). The Airy beam is a special diffraction-free beam with self-healing and refocusing characteristics. As such, it can be utilized in light-sheet imaging (LSI) (Subedi et al. 2020, Subedi et al. 2021), including selective plane illumination imaging schemes (SPI). The diffraction-free behavior of Airy beams in these settings is investigated as a function of the cubic phase modulation ‘α’, both theoretically and experimentally, as displayed in Fig. 1.

Furthermore, light-sheet microscopy facilitates wide field-of-view (FOV), high-contrast imaging by employing propagation-invariant beams. To improve these properties, researchers are exploring the refocusing and self-bending properties of Airy beams and one-dimensional Bessel accelerating waves, as per Fig. 2.

Gatekeepers of Arctic Carbon Loss: Landscape-Scale Metabolites-to-Ecosystems Profiling to Mechanistically Map Climate Feedbacks | Varner | University of New Hampshire | Kuhn | Environmental Microbiome | University

The team proposes to identify and characterize microbes and metabolites critical for C transformation in high-latitude interconnected terrestrial and aquatic sediment systems, which are undergoing rapid climate change. Discontinuous permafrost in these regions is rapidly thawing and the potential climate feedbacks are substantial. These landscapes encompass various permafrost thaw stages and lesser-studied lakes, which can be the exit point for a significant fraction of CH4 lost post-thaw via ebullition (bubbling) and are projected to increase with warming. At the same time, thaw-initiated succession of plant communities can increase net soil C storage and potentially also plant-derived inhibitory compounds that dampen C processing. Accurate predictive understanding of the net effect of these simultaneous, coupled loss, gain, and stabilization processes under increasing temperatures is an area of urgent study.

The team will leverage a model terrestrial ecosystem with site-specific genomes, metabolite spectra, C gas emissions, and isotope profiles, including lake sediments with anaerobic CH4 oxidizers present at some of the highest abundances ever observed in a natural system. The team will compare integrated substrate-microbiome-emissions in the two habitats, recently thawed terrestrial fens and lake sediments, which together dominate climate feedbacks (via >90% of landscape CH4 emissions); quantify rates and controls; and distill insights into models to more accurately predict C cycling and climate feedbacks.

Populus Secondary Cell Walls: Atomistic Models and Natural Variability in Architecture Informed by Solid-State Nuclear Magnetic Resonance | Tuskan | CBI | Harman-Ware | Bioenergy | CBI

The Center for Bioenergy Innovation (CBI) vision is to accelerate domestication of bioenergy-relevant, nonmodel plants and microbes to enable high-impact innovations along the bioenergy and bioproduct supply chain while focusing on sustainable aviation fuels (SAF). CBI has four overarching innovation targets: (1) Develop sustainable, process-advantaged biomass feedstocks, (2) Refine consolidated bioprocessing with cotreatment to create fermentation intermediates, (3) Advance lignin valorization for bio-based products and aviation fuel feedstocks, and (4) Improve catalytic upgrading for SAF blendstocks certification.

Interactions and higher-order architecture of the biopolymers present in plant secondary cell walls (SCW) are poorly understood. Solid-state Nuclear Magnetic Resonance (ssNMR) measurements were used to infer refined details regarding the arrangements of the three major biopolymers in Populus cell walls: lignin, cellulose, and hemicellulose. Atomistic, macromolecular models representing multiple plant secondary cell wall arrangements were constructed in silico. Quantitative observables from molecular dynamics (MD) simulations (e.g., polymer-polymer distances, conformational analyses, quantified surface contact) were compared to ssNMR data to identify the most plausible polymeric arrangements. The results further the understanding of plant SCWs and enable the development of hypotheses regarding superstructural architecture and resulting physicochemical properties of biomass cell walls. Additionally, researchers evaluated the genotypic variability of biopolymer spatial proximities in Populus SCW from a selection of natural variants exhibiting a range of compositional phenotypes. Coupling understanding of variability in SCW architectural characteristics with predictive atomistic models corroborated by ssNMR measurements will enable optimized design and deconstruction of sustainable biomass feedstocks. A high-level summary of the process workflow is presented in the figure below.

AI-Informed Systems Biology: The Discovery of Cryptic Phenotypes and the Functional Networks that Control Them | Tuskan | CBI | Lagergren | Bioenergy | CBI

This project enables fast and accurate automated image-based plant phenotyping with minimal hand-annotated training data. Plant phenotyping is typically a time-consuming and expensive endeavor, requiring large groups of researchers to meticulously measure biologically relevant plant traits, and is one of the main bottlenecks in understanding plant adaptation and the genetic architecture underlying complex traits at population scale. Here the team addresses these challenges by leveraging few-shot learning with convolutional neural networks to segment the leaf body and visible venation of 2,906 P. trichocarpa leaf images obtained in the CBI common garden located at UC-Davis. In contrast to previous methods, the approach (1) does not require experimental or image pre-processing, (2) uses the raw RGB images at full resolution, and (3) requires very few samples for training (e.g., just eight images for vein segmentation). Traits relating to leaf morphology and vein topology were extracted from the resulting segmentations using traditional image-processing tools and validated using real-world physical measurements.

To better understand the relationship among leaf phenotypes, a predictive phenomics network has been created from the leaf phenotypes with the use of iRF-LOOP, an explainable-AI-based network creation approach. Genome-wide association studies (GWAS) have been performed on each leaf phenotype and network-based functional partitioning has been performed across the GWAS results to determine the shared and distinct functional interactions responsible for governing leaf traits.

In this way, the current work provides the plant community with (1) methods for fast and accurate image-based feature extraction that require minimal training data, (2) a new population-scale phenotype data set, including 68 different leaf phenotypes, (3) a new SNP dataset for 1,419 genotypes called against v4.1 of the P. trichocarpa genome, and (4) a unique view of the functional relationships governing these leaf phenotypes. All few-shot learning code, data, and results are publicly available. This is one of the largest single releases of new plant genotype and phenotype data [www.osti.gov/dataexplorer/biblio/dataset/1846744].

Metaproteomic Insights into Mechanisms Enabling Anaerobic, Thermophilic Microbiomes to Achieve Undiminished Fractional Carbohydrate Solubilization at High Solids Loading | Tuskan | CBI | Holwerda | Bioenergy | CBI

Economically viable production of cellulosic biofuels requires (1) operation at high solids loadings of lignocellulosic feedstocks—on the order of 150 g/L, and (2) complete utilization of solubilized carbohydrates. Around two-thirds of the mass content of lignocellulose is carbohydrate. An efficient sugar-to-liquid-biofuel microbial metabolism can achieve an end-product yield of 0.5 g ethanol/g sugar. Not considering product titer restrictions and solids handling issues, 150 g/L solids loading would result in a maximum biofuel titer for ethanol of ~50 g/L. However, the recalcitrant character of lignocellulosic feedstocks impedes biological conversion and represents a major cost barrier. To this end, the team characterized Nature’s ability to deconstruct and utilize lignocellulosic feedstocks at increasing solid loadings using defined cultures of anaerobic bacteria as well as anaerobic methanogenic microbiomes.
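
The titer estimate above can be reproduced with a short calculation. This is a minimal sketch using only the figures stated in the paragraph (150 g/L solids, two-thirds carbohydrate, 0.5 g ethanol per g sugar); the function name `max_ethanol_titer` and the optional `solubilization` parameter are hypothetical additions for illustration, with the default of 1.0 corresponding to complete solubilization and utilization.

```python
def max_ethanol_titer(solids_g_per_l, carbohydrate_fraction,
                      yield_g_per_g, solubilization=1.0):
    """Upper-bound ethanol titer (g/L) for a given solids loading.

    solubilization is the fraction of carbohydrate that is solubilized
    and utilized (1.0 = complete, the idealized case in the text).
    """
    sugar_g_per_l = solids_g_per_l * carbohydrate_fraction * solubilization
    return sugar_g_per_l * yield_g_per_g


titer = max_ethanol_titer(150, 2 / 3, 0.5)
print(round(titer, 1))  # 50.0
```

Passing a value below 1.0 for `solubilization` shows how incomplete solubilization or utilization would lower this ceiling proportionally.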

While the microbial community exhibits undiminished fractional carbohydrate solubilization near 0.7 at loadings ranging from 30 g/L to 150 g/L (Chirania et al. 2022), the defined culture shows decreasing solubilization at increasing solids loadings up to 80 g/L (Kubis et al. 2022). Note that fractional carbohydrate solubilization is defined as the portion of carbohydrate removed or solubilized per the original total amount of carbohydrate in a sample of biomass. Although the defined cultures reach high levels of solubilization, they also leave behind small but distinct amounts of solubilized yet unutilized oligosaccharides. An ideal lignocellulose-to-biofuel process would employ characteristics of both these biocatalyst systems: the ability to (1) make a single liquid biofuel or intermediate as the metabolic end-product and (2) maintain high solubilization and utilization at high solids loadings.
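
The definition of fractional carbohydrate solubilization given above can be written as a one-line calculation. The following is a minimal sketch; the function name and the example masses are hypothetical and are not measurements from the study.

```python
def fractional_solubilization(initial_carb_g, residual_carb_g):
    """Fraction of the original carbohydrate that was removed or solubilized."""
    if initial_carb_g <= 0:
        raise ValueError("initial carbohydrate mass must be positive")
    return (initial_carb_g - residual_carb_g) / initial_carb_g


# Hypothetical example: 10 g carbohydrate loaded, 3 g left in residual solids.
example = fractional_solubilization(10.0, 3.0)
print(round(example, 2))  # 0.7
```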

To gain insight into the differences between solubilization and utilization in these two biological systems, team members characterized microbiomes using metaproteomics, focusing particularly on carbohydrate-active enzymes (CAZymes), and diagnosed the defined cultures via fermentation-centered studies.

An anaerobic, thermophilic, semi-continuously fed, methanogenic microbial enrichment cultivated over an extended period (550 days), referred to as the lignocellulose-fermenting microbiome, was sampled at various solids loadings at steady state. The samples were fractionated to identify key microbes and enzymes. Researchers documented changes in the abundance of CAZymes across fractions and the details of the methanogenesis pathways. Significant enrichment of auxiliary activity family 6 enzymes at higher solids suggests a role for Fenton chemistry. Stress-response proteins accompanying these reactions are similarly upregulated at higher solids, as are β-glucosidases, xylosidases, carbohydrate-debranching, and pectin-acting enzymes—all of which indicate that removal of deconstruction inhibitors is important for observed undiminished solubilization.

The defined cultures reached solubilization levels as high as 75% on unpretreated feedstock in batch fermentations of corn stover and switchgrass. Cocultures consisting of cellulolytic and saccharolytic organisms showed increased solubilization and utilization over monocultures of cellulolytic bacteria. However, solubilization diminished at increasing solids loadings, and this was not recovered in diagnostic experiments: the addition of a concentrated cell pellet to a running high-solids fermentation did not increase solubilization compared to an equal volumetric addition of water only. The culture was also able to ferment additional pulses of model soluble (cellobiose) and insoluble (cellulose) substrates without issue, but this did not result in adverse or beneficial effects on the solubilization of the corn stover. Dilution with medium containing no carbon source did partly recover the high solubilization characteristics, as did supplementation of the defined cultures with microbiome isolates or co-inocula.

Our work provides insights into the mechanisms by which natural microbiomes effectively deconstruct and utilize lignocellulose at high solids loadings and sets us on a path for engineering bacterial strains in the development of defined cultures for efficient bioconversion.

Comprehensive Genome-Wide CRISPR Interference Library for High-Throughput Functional Genomic Studies in Pseudomonas putida | Tuskan | CBI | Eckert | Bioenergy | CBI

The current paradigm for microbial engineering utilizes the design-build-test-learn cycle (DBTL), where production chassis are iteratively engineered based on previous findings toward the development of more robust production strains. Advanced genetic tools can accelerate this cycle, enabling the construction of strains with desirable phenotypes at a faster pace. With the advent of next-generation sequencing and CRISPR-Cas9-mediated genome editing, high-throughput genome-wide functional genomics experiments are now possible, enabling the study of hundreds of thousands of mutations in a single experiment. This work adapts existing knowledge gained using these high-throughput CRISPR-Cas9 technologies in model organisms to the promising nonmodel lignin-degrading soil bacterium Pseudomonas putida KT2440 to enable novel experimental approaches in this host for genotype-phenotype discovery for engineering efforts.

The team has optimized CRISPR-interference (CRISPRi) for P. putida by screening inducible promoter systems that express catalytically dead spCas9, a variant of Cas9 that can still associate with guide RNA (gRNA) and bind targeted sites in the promoter or 5’ end of a gene, resulting in knockdown of gene expression. Researchers have quantified the dynamic range of repression by targeting a genomically integrated fluorescent reporter as well as key metabolic genes that compete with pathways to target products, with the aim of increasing titers (Fenster et al. 2022). In collaboration with the DOE Joint Genome Institute (JGI), team members have generated genome-wide gRNA libraries to identify functional guides for gene editing and repression via CRISPRi to expand this system for genome-scale studies. For CRISPRi, researchers designed a 78,932-member library targeting each gene in the genome with 10-15 gRNAs per gene as well as 798 non-targeting gRNAs (1%) as internal controls. This library was transformed into P. putida and grown under varied growth conditions including glucose, acetate, and 10- and 50-mM p-coumaric acid, one of the major components of lignin hydrolysates that can be valorized by P. putida. Following selective growth, Illumina sequencing was performed, with the gRNA plasmids serving as barcodes. Pre- and post-selection gRNA abundances were compared to determine enrichment and dropout in the population, with the non-targeting gRNAs serving as controls. These data represent a wealth of new knowledge, currently uncovering essential genes (including a number that are of unknown function) as well as gene knockdowns that enrich or inhibit growth under each of the selective conditions. Validation of these identified targets will lead to discovery of new gene functions and optimization of gene expression in production strains to further expand engineering efforts and continue to accelerate the DBTL cycle in support of CBI research needs.
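The pre- versus post-selection comparison described above can be illustrated with a minimal sketch. It is not the team's analysis pipeline: the guide names, the read counts, the pseudocount, and the choice to center scores on the median of the non-targeting controls are all assumptions made for illustration. Only the library size (78,932) and the number of non-targeting gRNAs (798) come from the abstract.

```python
import math
from statistics import median

# Figures stated in the abstract.
LIBRARY_SIZE = 78_932
N_NON_TARGETING = 798
control_percent = 100 * N_NON_TARGETING / LIBRARY_SIZE  # roughly 1% of the library


def log2_fold_changes(pre, post, controls, pseudocount=1.0):
    """Per-guide log2 fold change (post vs. pre), centered on control guides.

    pre, post: dicts mapping guide name -> read count (hypothetical data).
    controls:  names of non-targeting guides used as the neutral baseline.
    """
    pre_total = sum(pre.values())
    post_total = sum(post.values())
    raw = {}
    for guide, pre_count in pre.items():
        # Convert counts to frequencies so sequencing depth cancels out;
        # the pseudocount avoids log(0) for guides that dropped out.
        pre_freq = (pre_count + pseudocount) / pre_total
        post_freq = (post.get(guide, 0) + pseudocount) / post_total
        raw[guide] = math.log2(post_freq / pre_freq)
    # Non-targeting guides should be neutral, so their median defines zero.
    baseline = median(raw[g] for g in controls)
    return {guide: lfc - baseline for guide, lfc in raw.items()}


# Hypothetical toy counts: one depleted guide, one enriched guide, two controls.
pre_counts = {"geneA_g1": 100, "geneB_g1": 100, "nt_1": 100, "nt_2": 100}
post_counts = {"geneA_g1": 10, "geneB_g1": 400, "nt_1": 100, "nt_2": 100}
control_guides = ["nt_1", "nt_2"]

scores = log2_fold_changes(pre_counts, post_counts, control_guides)
```

In this toy example a negative score marks a guide that dropped out during selection (as expected for knockdown of a gene needed under that condition) and a positive score marks an enriched guide. A real screen would also aggregate the 10-15 guides per gene and apply a statistical test, which this sketch omits.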

Deciphering the Genetic Basis of S/G Lignin Variation in Switchgrass through QTL Mapping and Transgenic Approaches | Tuskan | CBI | Devos | Bioenergy | CBI

The Center for Bioenergy Innovation (CBI) vision is to accelerate domestication of bioenergy-relevant, nonmodel plants and microbes to enable high-impact innovations along the bioenergy and bioproduct supply chain while focusing on sustainable aviation fuels (SAF). CBI has four overarching innovation targets: (1) Develop sustainable, process-advantaged biomass feedstocks, (2) Refine consolidated bioprocessing with cotreatment to create fermentation intermediates, (3) Advance lignin valorization for biobased products and aviation fuel feedstocks, and (4) Improve catalytic upgrading for SAF blendstocks certification.

Switchgrass (Panicum virgatum) is being domesticated as a sustainable bioenergy crop due to its wide adaptability, high yield, and low agricultural inputs. Life cycle analysis (LCA) has shown that, overall, feedstock yield is the main driver of fuel cost. When only considering the highest yielding accessions, however, biomass quality becomes an important player in biofuel yield and, hence, cost. Because the lignin S/G ratio may affect ethanol production as well as lignin monomer yields, researchers determined lignin monomeric composition using both pyrolysis molecular-beam mass spectrometry (pyMBMS) and thioacidolysis in an F2 population derived from a cross between the lowland genotype AP13 and the upland genotype VS16 (Qi et al. 2021). Quantitative trait locus (QTL) mapping for the S/G lignin ratio obtained using both methods identified colocalizing QTL on chromosome 9N. The 9N QTL region harbors the genes PvBLH6, which encodes a BEL1-like homeodomain protein 6 transcription factor, and PvKNAT1, a member of the Arabidopsis TALE homeodomain transcription factor family. In Arabidopsis, BLH6 has been shown to interact with KNAT7 to affect secondary cell wall biosynthesis, including lignin content (Liu et al. 2014). Because lignin content was affected in the knat7 and blh6 knat7 loss-of-function mutants but not in the blh6 mutant, researchers transformed the AP13 (PvKNAT1AP13) and VS16 (PvKNAT1VS16) alleles of PvKNAT1, which differed by several nonsynonymous single nucleotide polymorphisms (SNPs), into a knat1-null Arabidopsis mutant. PvKNAT1VS16 but not PvKNAT1AP13 rescued the phenotype of the knat1 null mutant. These data suggest that only PvKNAT1VS16 is functional. T3 transgenic Arabidopsis plants homozygous for the presence of PvKNAT1VS16 and PvKNAT1AP13 are being grown for assessment of the lignin S/G ratio.
The study demonstrates how combining genetic mapping in switchgrass with transgenic analyses in Arabidopsis can help uncover critical variants in genes contributing to traits of importance to the bioeconomy.

Engineering Bacterial Microcompartments in Clostridium autoethanogenum to Overcome Bottlenecks in Sustainable Production of Synthetic Rubber | Tullman-Ercek | Northwestern University | Palmero | Bioenergy | University

To investigate bacterial microcompartments in Clostridium autoethanogenum and engineer them to compartmentalize synthetic metabolic pathways.

One promising route to sustainable bioproduction of fuels and chemicals is the engineering of organisms such as acetogens to convert abundant, low-cost gases containing carbon monoxide or carbon dioxide and hydrogen into desirable, value-added products at high efficiency. This approach not only provides an avenue for repurposing greenhouse gases (GHG), but also minimizes the use of harsh chemicals and the generation of hazardous byproducts common in petroleum-based processes. However, many biochemicals are not yet produced biologically due to roadblocks in the cellular biosynthesis process. These roadblocks can include intermediate toxicity, redox imbalances, and loss of product to off-pathway reactions. In nature, these issues are often alleviated using spatial organization strategies, such as sequestration in organelles. In bacteria, such organization often occurs in protein-based organelles known as bacterial microcompartments (MCPs).

The team will investigate the native regulation, assembly, and function of MCPs in the industrially relevant nonmodel host C. autoethanogenum. In the C. autoethanogenum genome, two unique gene clusters have been identified as putative MCP operons. These putative operons contain sequences encoding possible hexamers, trimers, pentamers, and enzyme encapsulation sequences. The team tested potential inducers of these operons and found that some of these small molecules were consumed by C. autoethanogenum. RNAseq data show that these same small molecules transcriptionally activate the MCP operons. MCP formation in these conditions was corroborated by electron microscopy of C. autoethanogenum, which shows distinctive polyhedral shapes within the cells, indicative of MCP formation.

Beyond understanding the native function of these putative MCP operons, the engineering goal is to sequester key biosynthesis enzymes from two distinct metabolic pathways into MCPs to make compounds involved in rubber production. Specifically, researchers aim to showcase the power of enzyme encapsulation in an MCP for reducing toxicity and product losses to side reactions for these pathways. Towards enabling heterologous enzyme encapsulation in these new MCP systems, 16 C. autoethanogenum reporter strains were generated with different putative encapsulation peptides fused to sfGFP. Fluorescence microscopy shows that 11 of these 16 sfGFP-encapsulation peptide fusions exhibit punctate fluorescence upon MCP induction, indicating successful encapsulation of the fluorescent reporter within MCPs. These results will pave the way for encapsulating biosynthesis enzymes for rubber production in future years and enable the cost-efficient production of chemicals currently derived from petroleum.

WINTR: Winter Transcriptome Regulation in Poplar | Tsai | University of Georgia | Tsai | Bioenergy | University

This project aims to advance understanding of how molecular control of the winter latent state is linked to perennial woody biomass productivity.

Woody biomass growth of trees comprises a significant contribution to the supply of renewable feedstocks for biofuels and biomaterials in the emerging bioeconomy. Genetic diversity in the molecular and physiological underpinnings of woody biomass growth continues to be explored for its potential adoption in tree improvement. Most of what is known about these underpinnings has come from the study of development and expansion growth during the summer or indoors. However, the woody bole of a field-grown tree is physiologically active year-round. In temperate deciduous tree species, physiological and metabolic adjustments are essential to confer winter protection in the wood-forming tissues. Mechanisms for the avoidance or tolerance of freeze-related intracellular or extracellular desiccation and for the protection and maintenance of plasma membrane and cell wall become critical.

Researchers focus on two complementary Populus experimental systems, P. trichocarpa with rich population genomic resources and the fast-growing hybrid P. tremula × P. alba INRA 717-1B4 (717) with proven transformation and genome editing efficiencies. A multipronged approach integrating stem RNA-Seq, genome-wide association studies (GWAS), expression quantitative trait loci (eQTL) mapping, high-precision CRISPR genome editing, and gene network modeling will be used to investigate transcriptome regulation in woody stem tissues during the winter. Of particular interest are genome duplicates that exhibit either winter-biased expression or divergent seasonal expression in both species. GWAS and eQTL predictions will be experimentally tested by CRISPR editing of coding or cis regulatory sequences for investigating their functional links to seasonal growth transitions and woody biomass accrual. Confirmed mutants will be field tested for seasonal growth transitions, transcript and metabolite profiling, and histological analysis. The transcriptomics data will feed back into regulatory network construction to improve inference of winter processes controlled by seasonally biased genes or their regulators. A key deliverable will be the contribution of winter stem transcriptomes to existing expression data obtained primarily from actively expanding tissues. The data will be integrated within the DOE Systems Biology KBase to promote further research efforts. Understanding how seasonal growth dynamics impact woody biomass productivity will offer new targets for bioenergy crop improvement.

Harnessing Robustness of Thermophilic Bacillus coagulans for Conversion of Switchgrass Hydrolysates to Designer Bioesters at Elevated Temperatures | Trinh | University of Tennessee | Ryu | Bioenergy | University

To fundamentally understand and redirect metabolism and regulation of thermophilic Bacillus coagulans for the efficient conversion of undetoxified lignocellulosic biomass hydrolysates into designer bioesters.

B. coagulans is a gram-positive thermophilic bacterium that is capable of growing at elevated temperatures, utilizing biomass hydrolysates, and producing lactate at high levels. Researchers aim to understand and harness its robustness for conversion of biomass hydrolysates to designer bioesters (e.g., acetate esters, lactate esters) that have broad use as solvents, flavors, fragrances, and advanced biofuels. Through comprehensive screening of diverse undomesticated B. coagulans strains isolated from different environmental niches against a wide range of temperatures (30–60 °C), either single or mixed C5/C6/C12 sugars, and lactate concentrations (0–60 g/L), researchers found that most of the strains could grow in all of the environments tested with different degrees of robustness. Some candidates could utilize all sugars with minimal diauxic growth, grow optimally at 55 °C, and tolerate high concentrations of lactate up to at least 40 g/L; these candidates serve as reference strains for elucidating cellular robustness and for metabolic engineering. Genome sequencing and proteomics of the representative strain UT-1 showed that it has a 10% larger genome than those reported in literature and exhibited metabolic capability to utilize sugars in biomass hydrolysates and produce lactate. To rewire the metabolism of these novel undomesticated B. coagulans strains to overproduce designer bioesters, researchers have been developing genetic tools (e.g., antibiotic selection, plasmid compatibility, transformation, and gene expression) to manipulate them. The team has identified and designed thermostable enzymes to build exchangeable ester production modules compatible with B. coagulans for biosynthesis of acetate and lactate esters. Overall, B. coagulans is a promising microbial manufacturing platform that will be advanced by a fundamental understanding of its robustness and the ability to harness it for production of designer bioesters from lignocellulosic hydrolysates.

From Bulk Organic Matter Profiling to Specific Metabolite Identification: Improving Metabolomics Data Analysis, Annotation, Interpretation, and Integration | Tfaily | University of Arizona | Ayala-Ortiz | Environmental Microbiome | University

Over the past decade, advances in different omics technologies such as metagenomics, metatranscriptomics, metaproteomics, and metabolomics have revolutionized biological research by enabling high-throughput monitoring of biological processes at the molecular and organismal level and their responses to environmental perturbation. Metabolomics is a newer and fast-emerging technology in systems biology that aims to profile small compounds within a biological system that are often end products of complex biochemical cascades. Although metabolomics has long been overshadowed by other omics and has not always been considered a standard tool in environmental and microbiome science, it can augment the power of genomics, transcriptomics, and proteomics by providing a functional snapshot of all upstream biological processes, thereby filling in gaps left behind by genomics and proteomics. The overarching goal of this project is to develop user-friendly and open-source tools (MetaboDirect and MetaboTandem) to optimize, streamline, and improve current data analysis pipelines for metabolomics data sets from complex and heterogeneous samples. All tools will be available to other researchers and practitioners, who can replicate or extend the work.

Metabolites constitute the chemical currency of environments as they are used, transformed, and exchanged by the microorganisms found within a system. Understanding how different perturbations change the metabolome will improve knowledge of the mechanisms used by the microbial communities that drive ecosystem functions. However, fully characterizing the metabolome is challenging due to its complexity and heterogeneity, often requiring multiple approaches with different objectives. One approach is the bulk characterization of metabolites or organic compounds in a system through the use of Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS). While the high resolving power of this instrument allows for molecular-level characterization of organic matter from diverse environments, it generates hundreds of millions of data points that need to be processed and visualized. In response, the comprehensive, open-source, command-line based tool MetaboDirect was developed based upon years of analytical expertise with diverse sample types. The current published version of MetaboDirect is fully compatible with the output of the molecular formula assignment software Formularity. Efforts are in place to integrate this pipeline with CoreMS, the comprehensive mass spectrometry framework being developed by EMSL at PNNL. Liquid chromatography tandem mass spectrometry (LC-MS/MS) is another approach that can be used for metabolome characterization of complex environmental samples and that, unlike FT-ICR MS, is commonly used to determine the identity of individual metabolites. However, LC-MS/MS suffers from additional challenges related to data processing and metabolite annotation and identification.
Researchers are developing MetaboTandem as a package and a Shiny app for the easy and fast analysis, annotation, and visualization of LC-MS/MS data, with a special emphasis on providing comprehensive metabolite annotation that goes beyond the use of an in-house database to include a combination of publicly available databases and in silico prediction tools.

While MetaboDirect and MetaboTandem were, or are being, developed to address two challenges with metabolomics data (Challenge 1: Easy-to-use software; Challenge 2: Effective data visualization), team members acknowledge that the metabolomics field continues to struggle with two additional challenges (Challenge 3: Metabolite annotation and identification and Challenge 4: Big data). To that end, researchers have been working on combining analytical chemistry, computer science (ML/AI), and statistics to develop bioinformatics tools that address these two additional challenges. The team is currently testing multiple ML-based algorithms to putatively identify unannotated metabolites. Accessing the information hidden in the unidentified metabolites will prove to be key to the effective integration of metabolomics with other omics data.

Deciphering Virocell Metabolic Dynamics and Ecosystem Outputs Using a Novel Integration of Machine Learning and Metabolomics | Sullivan | The Ohio State University | Rajakaruna | Environmental Microbiome | University

The overarching goal of this project is to establish ecological paradigms for how viruses alter soil microbiomes and nutrient cycles by developing foundational (eco)systems biology approaches for soil viruses. Here, the team used metabolomics to investigate phage-specific metabolic reprogramming in virus-infected cells (virocells) to build critically needed model systems and in silico resources and tools that can be extended to new soil model phage-host systems. Researchers used an already established marine phage-host model system as the base to (1) characterize metabolic dynamics of virocell infection, (2) assess the output of virocell metabolic reprogramming on ecosystem function, and finally (3) develop the analytics that will be directly transferable to soil systems.

Microorganisms, including bacteria and viruses, play a vital role in biogeochemical cycles and global ecosystem function. At present, virus contributions are largely assessed through community-scale geochemical measurements or through evolutionary inferences such as identifying horizontal gene transfer. However, viruses also metabolically reprogram their bacterial host cells towards virion synthesis during infection, which effectively makes the infected cells (virocells) ecologically, metabolically, and physiologically different from uninfected bacteria. Here the team investigated the endo- and exometabolomes of an ecologically important marine heterotroph (Cellulophaga baltica strain #18), independently infected by three viruses with different morphologies, genomes and, therefore, infection strategies, to understand phage-specific virocell metabolic reprogramming and ecosystem outputs in a highly resolved infection time course. Through numerous ordination and multivariate statistical analyses, researchers show not only that a virocell’s metabolite dynamics differ from those of an uninfected cell, but also that such metabolite dynamics differ temporally and between each virocell in ways that qualitatively associate with virus infection efficiencies, likely impacting ecosystem function. To assess this impact most comprehensively, researchers are optimizing metabolite annotation, a major limitation in metabolomics, by testing and implementing combined machine learning and deep learning algorithms to obtain probabilistic annotations for unknown metabolites. These efforts have enabled a 24% increase in total annotations so far, which ultimately would significantly improve the biological interpretation of metabolomics results. Given that soil microbes can be infected by viruses at any given time, collectively these findings suggest that viruses can play an important role in regulating present and future carbon cycling.

Diversity Patterns and Ecological Footprint of RNA Viruses Along a Permafrost Thaw Gradient | Sullivan | The Ohio State University | Pratama | Environmental Microbiome | University

The overarching goal of this project is to establish ecological paradigms for how viruses alter soil microbiomes and nutrient cycles by developing foundational (eco)systems biology approaches for soil viruses. Specifically, this study aims to establish a fundamental knowledge of RNA viruses in permafrost regions and provide a comprehensive framework for understanding their complex interactions with hosts and the environment. To achieve this, the team will investigate the diversity and distribution of RNA viruses. In addition, to investigate their ecological impact, team members will examine the virus-host dynamics and evolution and identify the occurrence of auxiliary metabolic genes in RNA viruses. Ultimately, this research can significantly contribute to the integration of RNA viruses into the ecosystem and climate-related models.

As global atmospheric temperatures rise, sequestered permafrost soil carbon may be released via microorganismal activity, which can lead to a positive feedback loop effectively speeding up greenhouse gas emissions and climate change. While data are emerging about how prokaryotes and their DNA viruses may respond to permafrost thaw, the role of RNA viruses that infect soil eukaryotes remains less clear even though they may be important for nutrient cycling as demonstrated previously for the global oceans (Zayed et al. 2022; Dominguez-Huerta et al. 2022) and a diversity of forest, mountain, semi-desert, agricultural and sedimentary soils (Chen et al. 2022). Here team members leverage 55 metatranscriptomes (630 gigabases) collected along a permafrost thaw gradient in the model ecosystem Stordalen Mire to identify, quantify, and ecologically contextualize RNA viruses in these soils. Application of analytical approaches optimized for maximal sensitivity and largely automated systematic classification to these data identified 2,651 species-like RNA virus taxa. Though most of these species derive from the five known established phyla (Wolf et al. 2018), including one vOTU in the recently suggested phylum Taraviricota (Zayed et al. 2022), nearly all represent novel species within these higher taxa and five likely represent a novel class in the phylum Lenarviricota (proposed name, “Stomiviricetes”). Ecological analyses of the approximately species-level taxa revealed strong habitat specificity as well as depth trends where RNA virus diversity decreased with depth. To assess the ecological footprint of Stordalen Mire RNA viruses, team members predicted hosts and evaluated gene content.
This revealed that most (68%) likely infect key nutrient cycling eukaryotes that span multiple levels in the food web (only a few percent of RNA viruses were predicted to infect prokaryotes), and that 96 carried virus-encoded auxiliary metabolic genes hinting at reprogramming of miscellaneous metabolic pathways and of cellular and molecular processes such as transport, transcription, countering antiviral defense, and coping with environmental stress. Together these findings provide baseline RNA virus diversity and ecology data and tantalizing hints at ecosystem impacts that establish fundamental knowledge for integrating them into predictive ecological models in the rapidly changing Arctic environment.

Genetic Determinants of Klebsiella Phage Infection | Sullivan | The Ohio State University | Gittrich | Environmental Microbiome | University

The overarching goal of this project is to establish ecological paradigms for how viruses alter soil microbiomes and nutrient cycles by developing foundational (eco)systems biology approaches for soil viruses. Here the team seeks to understand how phages infect bacteria, specifically, what bacterial genes are required for infection, to (1) see if researchers can predict the bacterial genes required for phage infection based on phage sequences, and (2) use these data to generate models to understand how phages and bacteria interact in an ecological setting.

Novel bacteriophages (phages) are being cataloged at unprecedented rates, and current research broadly credits them with driving nutrient and energy cycling across many of Earth’s ecosystems. However, little is known about the bacterial genes required for infection beyond a few model phage-host systems. Such data are critical for modeling phage-host interactions in complex communities. Here, the team mapped bacterial genetic determinants of phage infection using a randomly barcoded, genome-wide loss-of-function transposon mutant library (RB-TnSeq) of a plant growth-promoting rhizobacterium (Klebsiella sp. M5a1). This library was individually challenged by 25 diverse, double-stranded DNA phages spanning four known phage families at three multiplicities of infection (0.1, 1, and 10). The genetic screen uncovered a multitude of bacterial factors involved in phage infection, such as genes involved in receptor formation, transcription regulation, electron transport, and genes with unknown functions. When disrupted, some bacterial genes, such as those encoding putative glycosyltransferases involved in LPS biogenesis, conferred resistance to up to 50% of the phages across multiple phage families, potentially by preventing phage adsorption. Other bacterial genes involved in intracellular functions, such as the electron transport chain and transcriptional regulation, were phage-specific, indicating that such cellular processes are differentially required across the diverse phage set. This supports previous findings in Escherichia coli that genes involved in intracellular functions are phage-specific, while genes encoding receptors required for phage adsorption are more broadly required across phages. Additionally, the team found that some phages, although highly related, required a unique set of bacterial genes for phage infection.
Together these findings provide a foundation to develop predictive models of phage infection that can be applied to environmental systems.
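The contrast between broadly required and phage-specific host genes can be made concrete with a small sketch. It assumes a table of per-gene, per-phage fitness scores in which a high score means the disrupted mutant survived phage challenge. The gene labels, phage labels, scores, and the threshold of 2.0 are hypothetical and are not taken from the study.

```python
def resistance_breadth(fitness, threshold=2.0):
    """Fraction of phages for which disrupting each gene confers resistance.

    fitness:   dict mapping gene -> {phage: fitness score} (hypothetical data).
    threshold: score at or above which a mutant is called resistant (assumed).
    """
    breadth = {}
    for gene, by_phage in fitness.items():
        resistant = sum(1 for score in by_phage.values() if score >= threshold)
        breadth[gene] = resistant / len(by_phage)
    return breadth


# Hypothetical scores for two genes challenged with four phages.
toy_fitness = {
    # An LPS-biogenesis-like gene: resistance to several phages.
    "glycosyltransferase_like": {"phage1": 4.1, "phage2": 3.5, "phage3": 0.2, "phage4": -0.1},
    # An intracellular-function-like gene: resistance to a single phage.
    "electron_transport_like": {"phage1": 0.0, "phage2": 0.3, "phage3": 5.0, "phage4": 0.1},
}

toy_breadth = resistance_breadth(toy_fitness)
```

Genes with a high breadth value behave like the receptor-related genes described above, while genes with a low but nonzero value behave like the phage-specific intracellular factors.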

Portals of Discovery: The NMDC Infrastructure and Products for Microbiome Research | Eloe-Fadrosh | Lawrence Berkeley National Laboratory | Smith | Environmental Microbiome | NMDC

The vision of the National Microbiome Data Collaborative (NMDC) is to connect data, people, and ideas to advance microbiome innovation and discovery. The team is committed to creating the needed infrastructure to answer tomorrow’s research questions. With this vision in mind, the NMDC seeks to support a Findable, Accessible, Interoperable, and Reusable (FAIR) microbiome data sharing network—through infrastructure, data standards, and community building—that addresses pressing challenges in environmental sciences. The infrastructure and portals that the NMDC has developed provide the research community with a platform to share their microbiome research and data in accordance with the FAIR principles, thereby promoting data reuse and accelerating scientific discoveries.

The NMDC is committed to FAIR multiomics microbiome data. The NMDC infrastructure supports a collaborative, integrative science ecosystem that empowers the research community to contribute, explore and investigate microbiome data. This is accomplished through the three key NMDC products that are openly available to the research community: (1) the Submission Portal, (2) the Data Portal, and (3) NMDC EDGE. The NMDC Submission Portal provides users a place to contribute sample metadata in a standardized manner with in-sheet validation to ensure machine readability and findability. The Data Portal consumes this metadata and presents search tools to find and access information about the research studies and data generated from the samples. The Data Portal also provides links to the associated omics data processed through the NMDC’s standardized bioinformatics workflows. NMDC EDGE is a user-friendly interface for the NMDC standardized bioinformatics workflows. These three NMDC products are built specifically with the BER research community in mind and refined through a process of continual collaboration that is based on user-focused feedback. Leveraging each of the NMDC products enables environmental microbiome researchers to adhere to FAIR data principles, thus expanding potential research questions, comparisons, and scientific discovery.

The Context-Dependency of Plant-Microbial Interactions in the Bioenergy Resource Economy | Stuart | Lawrence Livermore National Laboratory | Hestrin | Bioenergy | Microbial Systems Biology for Bioenergy

Algal and plant systems have the unrivaled advantage of converting solar energy and CO2 into useful organic molecules. Their growth and efficiency are largely shaped by the microbial communities in and around them. The μBiospheres SFA seeks to understand phototroph-heterotroph interactions that shape productivity, robustness, the balance of resource fluxes, and the functionality of the surrounding microbiome. The team hypothesizes that different microbial associates not only have differential effects on host productivity but can change an entire system’s resource economy. The approach encompasses single cell analyses, quantitative isotope tracing of elemental exchanges, omics measurements, and multi-scale modeling to characterize microscale impacts on system-scale processes. The team aims to uncover crosscutting principles that regulate these interactions and their resource allocation consequences to develop a general predictive framework for system-level impacts of microbial partnerships.

The hyphosphere is a hotspot for multipartite interactions that shape critical terrestrial processes such as soil nutrient cycling, C distribution, and plant growth. To investigate how water limitation impacts microbial dynamics and biogeochemistry in both the rhizosphere and hyphosphere, researchers inoculated Panicum hallii with one of two functionally different mycorrhizal partners (Rhizophagus irregularis and Serendipita bescii) and grew the plants under either water-limiting or water-replete conditions in ¹³CO₂ labeling chambers to enable carbon tracking. After 3 months, researchers used H₂¹⁸O quantitative stable isotope probing (qSIP) to assess how water limitation impacted hyphosphere bacterial growth rates and diversity. The team found that both fungal partners helped sustain growth and diversity in hyphosphere bacterial communities exposed to water limitation relative to uninoculated controls. Of the bacterial taxa that responded positively to R. irregularis or S. bescii in water-limited soil, many belong to lineages that are considered drought-susceptible, including Bacteroidetes, Planctomycetes, Verrucomicrobia, Proteobacteria, and Acidobacteria. The size of soil C pools and ¹³CO₂ efflux from hyphosphere soil depended on soil moisture conditions, but exometabolite profiles and multimodal imaging suggest that the different mycorrhizal fungi also can influence C flow and soil biogeochemistry. Together, the findings indicate that mycorrhizal fungi can support biotic activity and resilience to water limitation.

In addition to mycorrhizal fungi, roots are often colonized by a diverse array of endophytic fungi. Historically, these fungi have been assumed to be largely commensal, but recent work by the NCSU team suggests that many confer nutritional benefits to the host plant. Using a panel of phylogenetically diverse root endophytes isolated from switchgrass, researchers demonstrated that these fungi broadly enable plant acquisition of organic N and P in soil. Compared to fungus-free controls, 30% of fungi (n=12) increased tissue N by 20-90% when provided with organic N, and 40% of fungi (n=16) increased tissue P by 25-80%. Most importantly, some fungi appear to substantially shift the N:P ratio (from N>>P to N=P). Fungi that aggressively consumed organic nutrients in culture were less beneficial for host acquisition of organic N or P (r = -0.39 to -0.57). Leveraging these results, team members are investigating C-nutrient trading in the root endophyte system. This work will substantially increase understanding of root endophyte contributions to host and ecosystem C and nutrient cycling.

In parallel with experimental work, team members are developing a plant-mycorrhizal-bacteria model to place cellular-scale processes within a systems-level context. This hybrid model combines a lattice-free hyphal network with a co-localized diffusive/advective grid. It is designed to enable interpretation of, and integration with, spatially resolved community flux balance simulations of mycorrhizal-bacterial communities using data from experimental studies. Environmental controls on plant, mycelium, and bacterial dynamics are represented by a set of coupled differential equations that preserve C and nutrient mass balances. The team will explore the traits and tradeoffs that promote plant growth and total system biomass growth under different environmental conditions and will test these predictions in future experimental studies.
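As a rough illustration of what "coupled differential equations that preserve C mass balance" means in practice, the sketch below integrates a toy three-pool carbon model with forward Euler steps. It is not the team's model: the pools, the rate constants, and the function name `simulate_carbon_pools` are all hypothetical, chosen only to show that every flux leaving one pool must enter another so that total carbon equals the initial stock plus fixed input.

```python
def simulate_carbon_pools(steps=1000, dt=0.01):
    """Toy plant -> fungus -> bacteria carbon model (all values hypothetical)."""
    # Carbon stocks in arbitrary units; 'respired' closes the mass balance.
    plant, fungus, bacteria, respired = 10.0, 1.0, 0.5, 0.0
    fixed_total = 0.0

    # Hypothetical rate constants (per unit time).
    fix_rate = 1.0   # photosynthetic C input to the plant
    k_pf = 0.2       # plant -> fungus transfer
    k_fb = 0.1       # fungus -> bacteria transfer (e.g., exudation)
    k_resp = 0.05    # respiration loss from each living pool

    for _ in range(steps):
        # Compute every flux from the state at the start of the step.
        fixed = fix_rate * dt
        to_fungus = k_pf * plant * dt
        to_bacteria = k_fb * fungus * dt
        resp_p = k_resp * plant * dt
        resp_f = k_resp * fungus * dt
        resp_b = k_resp * bacteria * dt

        # Each flux is subtracted from one pool and added to another.
        plant += fixed - to_fungus - resp_p
        fungus += to_fungus - to_bacteria - resp_f
        bacteria += to_bacteria - resp_b
        respired += resp_p + resp_f + resp_b
        fixed_total += fixed

    return {
        "plant": plant,
        "fungus": fungus,
        "bacteria": bacteria,
        "respired": respired,
        "fixed_total": fixed_total,
    }


if __name__ == "__main__":
    result = simulate_carbon_pools()
    total = sum(result[k] for k in ("plant", "fungus", "bacteria", "respired"))
    print("total C:", total, "expected:", 11.5 + result["fixed_total"])
```

Because the fluxes are paired, the balance holds for any step size; a spatially explicit model would add transport terms on the grid but keep the same bookkeeping.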

Identifying Genomic and Metabolic Underpinnings of Algal-Bacterial Interactions via Metatranscriptomics, NanoSIMS Isotope Tracing, and Genome-Scale Metabolic Modeling | Stuart | Lawrence Livermore National Laboratory | Mayali | Bioenergy | Microbial Systems Biology for Bioenergy

Algal and plant systems have the unrivaled advantage of converting solar energy and CO2 into useful organic molecules. Their growth and efficiency are largely shaped by the microbial communities in and around them. The μBiospheres science focus area (SFA) seeks to understand phototroph-heterotroph interactions that shape productivity, robustness, the balance of resource fluxes, and the functionality of the surrounding microbiome. The team hypothesizes that different microbial associates not only have differential effects on host productivity but can change an entire system’s resource economy. The approach encompasses single cell analyses, quantitative isotope tracing of elemental exchanges, omics measurements, and multiscale modeling to characterize microscale impacts on system-scale processes. Team members aim to uncover crosscutting principles that regulate these interactions and their resource allocation consequences to develop a general predictive framework for system-level impacts of microbial partnerships.

The team’s previous work has shown that heterotrophic bacteria influence algal productivity through complex metabolic interactions and resource utilization. However, those interactions, varying from mutualistic to antagonistic, may be context-dependent based on resource availability. Identifying these dependencies requires a better fundamental characterization of the interactions’ genomic and metabolic underpinnings. To accomplish this, researchers present a study system comprised of a previously uncharacterized and uncultivated antagonistic parasitic bacterium that attaches to and crashes bioenergy-relevant algae in days. Team members hypothesized that the acute phenotypic changes would coincide with bacterial uptake of algal-derived substrates, and that bacterial metabolism during infection would indicate key resources required by the bacterium. To address these hypotheses, the team used a combination of amplicon sequencing and genome-resolved metagenomics, fluorescent in situ hybridization (FISH), stable isotope probing, metatranscriptomics, and metabolic modeling.

To set up the system, team members first identified a bacterial enrichment community that killed the alga Phaeodactylum tricornutum. Using 16S amplicon sequencing in tandem with genome-resolved metagenomics, researchers initially annotated the presumed parasitic bacterium as a previously uncharacterized Rickettsiales sp. Then, using custom FISH probes, the team positively identified the Rickettsiales sp. directly attached to algal cells. Researchers next quantified the ability of the parasitic Rickettsiales sp. to incorporate algal-derived substrates using stable isotope probing with 13C bicarbonate and 15N nitrate and measured single-cell uptake with nanoscale secondary ion mass spectrometry (NanoSIMS). Researchers found that algal cells with attached parasitic bacteria had lower single-cell carbon fixation, and when attached to the algal host, bacterial cells were enriched in both 13C and 15N, with some bacteria more highly enriched in 15N compared to the host. This suggests that Rickettsiales sp. incorporated algal-derived substrates and may have the capacity to siphon newly metabolized N resources from its host.

To provide insight into the identity of these host resources, researchers used metatranscriptomics to quantify gene-level expression changes in P. tricornutum and Rickettsiales during infection. They found that, prior to attachment, Rickettsiales overexpressed genes for iron and trace metal scavenging, amino acid starvation, ribosomal hibernation, oxidative stress, and flagellar and pilin assembly, among others. Once attached, Rickettsiales upregulated genes for chemotaxis and signal transduction, antibiotic resistance, production of proteases and peptidoglycanases, type IV secretion system, gene transfer agents implicated in virulence, membrane transporters, and amino acid metabolism. Taken together, this suggests that free-living parasites are likely starved for nutrients, particularly amino acids and trace metals, and once attached can produce enzymes that degrade algal cell wall material, transfer virulence factors to the host, and potentially import and metabolize algal-derived N-rich amino acids and proteins. To better predict specific substrate utilization by bacteria, the team has curated a genome-scale model from the near-complete metagenome-assembled genome (MAG) of the Rickettsiales sp. bacterium and integrated the gene expression data into a metabolic flux balance analysis (GX-FBA). Optimization of the model is ongoing; however, researchers anticipate that it will putatively identify specific algal-derived substrates metabolized by the parasitic bacterium during the interaction, which can then be tested in subsequent experiments. To this end, the team has ongoing research to identify taxon-level resource partitioning of P. tricornutum exudates by its microbiome, using metabolomics, stable isotope probing, and the newly designed porous microplate incubation system.
Thus far, researchers have conducted exometabolomics studies to characterize metabolite uptake profiles and potential resource partitioning among a suite of algal associated bacterial isolates, and further investigated those interactions with pairwise sequential growth experiments. Following this sequential interaction study, the team has confirmed the bacterial competition by measuring each growth response to algal exudates in situ using a co-culture porous microplate. Moving forward, the team will expand upon these simplified approaches to quantify how intimate algal-bacterial interactions change the flow of carbon system-wide using more complex microbial communities.
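For readers unfamiliar with the flux balance analysis mentioned above, the generic problem can be written as a linear program. The formulation below is the standard textbook form, not the team's specific GX-FBA setup; the symbols (stoichiometric matrix S, flux vector v, objective weights c, and flux bounds) are the conventional ones and do not come from the abstract.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

Here S v = 0 expresses steady-state mass balance for each metabolite. Expression-informed variants generally use transcript data to adjust the bounds or the objective; how GX-FBA does so in this study is not described in the abstract.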

Improving Bioprocess Robustness by Cellular Noise Engineering | Stephanopoulos | Massachusetts Institute of Technology | Daletos | Bioenergy | University

The overall goal of this project is to enhance the robustness of biofuel production in adverse and fluctuating environments, such as media containing toxic hydrolysates, by introducing cellular noise engineering as a means of improving the production. The approach involves the identification of factors in the transcription process that increase cellular noise and the deployment of such factors to generate cells with increased noise. Researchers use modeling and a single-cell analysis workflow to engineer Yarrowia lipolytica variants that can tolerate, grow, and efficiently synthesize biofuel precursors under steady state, albeit stressful, conditions. Overall, the team anticipates that strains with optimal levels of cellular noise will also exhibit robustness that maintains production under time-varying stresses.

Robustness represents a system-level trait that allows cell populations to maintain function under adverse and fluctuating environments. When observed at the cellular or subcellular level, an isogenic cell population exhibits increased cell-to-cell variability, or noise, even under steady-state conditions. In this context, isogenic cells undergo division of labor with some expressing the pathways that enable them to continue functioning in the new environment. This concept guides this project in developing workflows for introducing and manipulating cellular noise to enhance cellular tolerance to environmental stressors. The focus has been placed on the construction of Yarrowia lipolytica strains with the double‐phenotype of tolerance and high lipid productivity. In the team’s first steps on cellular noise engineering, researchers refined gene editing toolboxes that can deterministically vary the level of cellular noise in protein expression levels. In this context, the team followed an approach previously demonstrated in Saccharomyces cerevisiae for designing a synthetic promoter library in which increasing numbers of transcription factor binding sites led to enhanced noise levels (Sharon et al. 2014). Similarly, the team introduced three to five tandem upstream activating sequences to the erythritol‐inducible pEYK1 promoter. The synthetic hybrid promoters were placed upstream of a red fluorescent protein, fused into plasmids, and stably integrated into the genome of Y. lipolytica. In the induction experiments, the team independently varied the concentration of erythritol and glucose to determine the relationship between expression level and noise. Each transformant bearing the erythritol‐inducible pEYK1 promoter fused to the upstream activating sequences was screened separately by flow cytometry. The results were categorized into expression and noise levels and compared to those of parental strains.
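The abstract does not say which noise metric was computed from the flow cytometry data. A common choice for expression noise is the squared coefficient of variation (variance divided by the squared mean), sometimes reported alongside the Fano factor (variance divided by the mean). The sketch below assumes those two metrics; the function name and the fluorescence values are hypothetical.

```python
from statistics import mean, pvariance


def noise_metrics(intensities):
    """Return mean expression, CV^2, and Fano factor for single-cell readings."""
    m = mean(intensities)
    v = pvariance(intensities, mu=m)  # population variance around the mean
    return {"mean": m, "cv2": v / m ** 2, "fano": v / m}


if __name__ == "__main__":
    # Hypothetical per-cell fluorescence for two transformants with the same
    # mean expression but different cell-to-cell spread.
    low_noise = [90.0, 100.0, 110.0]
    high_noise = [50.0, 100.0, 150.0]
    print("low :", noise_metrics(low_noise))
    print("high:", noise_metrics(high_noise))
```

Comparing CV² at matched mean expression is one way to separate "expression level" from "noise level" when categorizing promoter variants.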
As a next step, researchers introduced key genes that play a significant role in viability at varying inhibitor levels. To this end, rational design was applied to develop a cellulosic oil Y. lipolytica strain that is tolerant to the primary lignocellulosic inhibitor furfural. To enable tolerance to furfural, researchers constructed Y. lipolytica overexpressing an evolved reductase enzyme (GRE2evol), which was directly obtained from prior work with S. cerevisiae (Lam et al. 2021), or an endogenous aldehyde dehydrogenase that converts furfural to the less toxic furoic acid. The team finally evaluated front‐runner Y. lipolytica strains under both stressful and non‐stressful conditions to quantify the effects of noise and expression levels on furfural tolerance.

Developing Anaerobic Fungal Tools for Efficient Upgrading of Lignocellulosic Feedstocks | Solomon | University of Delaware | Pareek | Biosystems Design | University

This project develops genetic and epigenetic tools for emerging model anaerobic fungi to identify the genomic determinants of their powerful biomass-degrading capabilities, facilitate their study, and enable direct fungal conversion of untreated lignocellulose to bioproducts.

Deconstruction of plant cell walls is a significant bottleneck to the economical production of affordable biofuels and bioproducts from abundant and renewable plant biomass. Anaerobic fungi (Neocallimastigomycota) from the digestive tracts of large herbivores, however, have evolved unique abilities to degrade untreated fiber-rich plant biomass by combining hydrolytic strategies from the bacterial and fungal kingdoms (Haitjema et al. 2017). Anaerobic fungi secrete the largest known diversity of lignocellulolytic carbohydrate active enzymes (CAZymes) in the fungal kingdom, which unaided can degrade up to 60% of the ingested plant material within the animal digestive tract (Seppälä et al. 2017, Youssef et al. 2013). Unlike many other fungal systems, these CAZymes are tightly regulated and assembled in fungal cellulosomes to synergistically degrade plant material, including untreated agricultural residues, bioenergy crops, and woody biomass with comparable efficiency regardless of composition (Haitjema et al. 2017, Solomon et al. 2016, Solomon et al. 2018, Hooker et al. 2018).

The team’s efforts to characterize gut fungal CAZymes reveal industrially relevant properties such as remarkable stability and activity towards untreated plant biomass (Hooker et al. 2018, Hillman et al. 2021). Gut fungal CAZymes liberate sugars from cellulosic substrates for over a week after inoculation, with some sugar metabolized to organic acids. However, model bioproduction hosts such as K. marxianus can capture the carbon in these sugars and acids in a two-stage process to efficiently upgrade this carbon to high-value solvents, fragrances, and advanced fuels derived from esters and aromatic alcohols (e.g., ethyl acetate and 2-phenylethanol; Hillman et al. 2021). Similarly, anaerobic fungal biosynthetic enzymes possess unique cofactor substrate preferences that support higher catalytic efficiencies, which are easily overlooked via heterologous expression due to the extremely high AT content (~83%) of gut fungal genomes and biased codon preferences (Hillman et al. 2021). Thus, there is an unmet need to build genetic tools and methods to study these enzymes natively in anaerobic fungi.

As a first step to tool development, the team sequenced the genomes of three novel anaerobic fungal isolates to enable part mining. High quality genomic DNA isolations were paired with PacBio long-read sequencing and Hi-C (chromosomal conformation capture) sequencing to achieve the first chromosomally resolved genomes for isolates belonging to the genera Neocallimastix and Piromyces. The team’s assemblies incorporate more than 99% of the genome into 12-25 chromosomes with N50<10. Parts that regulate gene expression (e.g., promoters and terminators) were then identified via sequence homology to assemble a nascent genetic toolbox for heterologous expression in anaerobic fungi.
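Assembly contiguity statistics such as the one quoted above are straightforward to compute from a list of sequence lengths. The sketch below returns both N50 (the length of the sequence at which the running total, taken from longest to shortest, first reaches half the assembly) and L50 (how many sequences that takes). The abstract reports "N50<10" without units; if that figure is a count of chromosomes, it would correspond to what is usually called L50, but that reading is an interpretation rather than something the source states. The function name and example lengths are hypothetical.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is N50; 'count' sequences were needed, which is L50.
            return length, count
    raise ValueError("lengths must contain at least one sequence")


if __name__ == "__main__":
    # Hypothetical chromosome lengths in Mb.
    example = [8, 6, 5, 4, 3, 2, 2]
    n50, l50 = n50_l50(example)
    print(f"N50 = {n50} Mb, L50 = {l50} sequences")
```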

To introduce these parts, the team optimized delivery of DNA with fluorescently labelled oligonucleotides and circular plasmids. Both Piromyces and Neocallimastix isolates were naturally competent for double-stranded DNA, taking up any DNA supplemented to the growth media and localizing it to the nucleus. Transformation was observed in multiple life stages suggesting that natural competence is a robust property of anaerobic fungi and a facile method to introduce heterologous DNA. Using natural competency, team members then validated heterologous expression of anaerobic fluorescent reporters and codon-optimized antibiotic resistance markers expressed via identified constitutive promoters and terminators synthesized by the DOE Joint Genome Institute’s Biological and Environmental Research Support Science (JGI-BERSS) program. Researchers also identified and validated nuclear localization sequences (NLS) from anaerobic fungal histone proteins, which displayed distinct sequence motifs from conventional NLSs used in model organisms. However, due to a lack of known autonomously replicating sequences, heterologous DNA must be supplemented to growth media daily to achieve stable phenotypes. Efforts are underway to achieve stable transformants via the use of Cas9 ribonucleoproteins (RNPs) together with split-marker cassettes.

In conclusion, anaerobic fungi hold a wealth of potential for biocatalysis from renewable plant substrates. Leveraging natural competency to introduce exogenous DNA, researchers have achieved the first simple methods for targeted heterologous expression in anaerobic fungi. The team’s growing toolbox for anaerobic fungi forms a foundation for generating a deeper systems-level understanding of anaerobic fungal physiology while establishing fundamental knowledge about regulation of gut fungal CAZymes. Ultimately, the team enables predictive biology in anaerobic fungi and derives insight into microbial plant deconstruction to advance the development of economical biofuels and bioproducts.

Developing a High-Throughput Functional Bioimaging Capability for Rhizosphere Interactions Utilizing Sensor Cells, Microfluidics, Automation, and AI-Guided Analyses | Babnigg | Argonne National Laboratory | Babnigg | Bioimaging

The complex dynamics of root-microbe interactions in the rhizosphere drives recognizable spatial structures. However, knowledge of the specific factors that lead to their development and sustain them for plant health and productivity is sparse.

This project aims to develop a unique functional imaging technique that exploits native sense-and-respond circuits of plant growth–promoting rhizobacteria (PGPR) to monitor chemical exchange between the plant root and microbe during the different phases of colonization. Several native PGPRs will be turned into biosensor cells, and root colonization will be evaluated with Arabidopsis and Camelina. Genetic variants of Arabidopsis with gain or loss of function will provide drastically altered local environments, resulting in colonization patterns that differ from those observed previously. An orthogonal X-ray imaging approach will provide high resolution elemental analysis of the local environment, and imaging throughput in general will be accelerated by automation and artificial intelligence (AI)–driven analysis. In addition, the team aims to advance the throughput of current bioimaging capabilities that leverage imaging chips developed with BER funding with automation, and an AI-guided image analysis strategy.

This combined HTP-AI bioimaging capability, along with advanced analytical techniques offered by the Advanced Photon Source and Environmental Molecular Sciences Laboratory, will capture the dynamic chemical shifts and colonization patterns in the rhizosphere.

Optogenetic Control of a Dual Yeast-Yeast Consortia for Chemical Production | Avalos | Princeton University | Garcia Echauri | Bioenergy | University

The goal of this project is to develop optogenetic tools and applications—the use of light-responsive proteins to modulate biological processes—for the control of microbial consortia for biofuel and chemical production. Researchers have developed novel optogenetic circuits to control growth rates in several strains of yeast and bacteria; this allows the team to not only stabilize microbial consortia with light, but also optimize their population ratios for chemical production. The team will develop light-controlled co-culture fermentations and use mathematical models and feedback controls to advance basic understanding of these biological systems and optimize them for growth rate and chemical production. These technologies constitute a new paradigm for the engineering and control of microbial consortia, which could help to realize their promise for biofuel and chemical production.

Microbial co-culture fermentations can improve the production of chemicals and biofuels over single-strain fermentations; optimizing and segregating production modules among the consortia members can lower the metabolic burden from overexpression of metabolic enzymes (Zhou et al. 2015). Dynamically tuning the consortia composition is integral to optimizing multistep production processes, in which growth and production phases are uncoupled and therefore different for each consortia member. Optogenetics has enhanced the ability to seamlessly control gene expression with the input of light. Light as a gene inducer has many advantages when compared to traditionally used chemical inducers: it is inexpensive, can be applied and removed instantly, is highly tunable, is active in different media compositions, and has minimal cellular side-effects. The lab has developed optogenetic tools to control gene expression in yeast with blue light (Lalwani et al. 2021). Using this system, researchers engineered two Saccharomyces cerevisiae strains with opposite growth phenotypes: one requires blue light to grow and stops growing in darkness, while the other requires darkness to grow and does not grow under blue light. Using these strains, researchers demonstrate the control of synthetic yeast-yeast consortia to achieve desired set-points of cell densities in batch and continuous culture conditions.
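To make the set-point idea concrete, the sketch below simulates two strains with opposite light responses under a simple on/off (bang-bang) feedback rule: the light is switched on whenever the light-dependent strain falls below its target fraction and off otherwise. This is an illustrative toy, not the controller used in the project. The growth rate, time step, idealized all-or-nothing growth response, and the function name `simulate_consortium` are assumptions, and nutrient limitation and dilution are ignored.

```python
def simulate_consortium(setpoint=0.3, hours=48.0, dt=0.1, growth_rate=0.3):
    """Hold the light-dependent strain near `setpoint` fraction of the culture.

    Returns the final fraction and a history of (fraction, light_on) pairs.
    """
    light_strain = 1.0  # grows only under blue light (idealized)
    dark_strain = 1.0   # grows only in darkness (idealized)
    history = []

    for _ in range(int(round(hours / dt))):
        fraction = light_strain / (light_strain + dark_strain)
        # Bang-bang rule: illuminate only when the light strain is too scarce.
        light_on = fraction < setpoint
        if light_on:
            light_strain *= 1.0 + growth_rate * dt
        else:
            dark_strain *= 1.0 + growth_rate * dt
        history.append((fraction, light_on))

    final_fraction = light_strain / (light_strain + dark_strain)
    return final_fraction, history


if __name__ == "__main__":
    final_fraction, history = simulate_consortium()
    duty_cycle = sum(1 for _, on in history if on) / len(history)
    print(f"final fraction: {final_fraction:.3f}, light duty cycle: {duty_cycle:.2f}")
```

In this idealized setting the population ratio settles into a small oscillation around the target, and the fraction of time the light is on approaches the set-point itself. A model-based or proportional controller would be the natural refinement once measured growth responses are available.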

KBase Science and Infrastructure Updates | Arkin | Lawrence Berkeley National Laboratory | Wood-Charlson | Computational Biology | KBase

The Department of Energy Systems Biology Knowledgebase (KBase) is a knowledge creation and discovery environment designed for both biologists and bioinformaticians. KBase integrates a large variety of data and analysis tools, from DOE and other public services, into an easy-to-use platform that leverages scalable computing infrastructure to perform sophisticated systems biology analyses. KBase is a publicly available and developer extensible platform that enables scientists to analyze their own data within the context of public data and share their findings across the system.

Science Updates

Science Focus Areas (SFAs) and university collaborators have been testing and releasing new data and functionality in KBase, especially around improving genome quality, functional prediction of microbial communities, and making data and tools accessible to everyone.

The Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) SFA is integrating long-read sequencing and isolate polishing tools into KBase and is collaborating with KBase to host a training workshop on laboratory and bioinformatics methods that support the generation of high-quality isolate genomes. Next, a major collaborative development area includes model-driven phenotype prediction and mechanistic analysis within KBase. This pipeline starts with a wide range of new tools designed to predict potential functions from protein sequence. New genome annotation pipelines, Distilling and Refining Annotation of Metabolism (DRAM) and Snekmer, expand annotation to new specialty areas of metabolism and offer alternative function hypotheses for difficult to annotate genes. Improvements to the KBase infrastructure were made to support multiple alternative theoretical annotations for genes developed with the Systems Biology Approach to Interactions and Resource Allocation in Bioenergy-Relevant Microbial Communities SFA. Growth phenotype data offers a means of discerning which of these alternative annotations is correct, but this data is commonly not available for many genomes. The KBase Knowledge Engine (KE) team addressed this by developing machine-learning–based tools to predict phenotypes based on genome annotations. Following that, researchers determine which combinations of gene annotations lead to the best agreement with predicted phenotype. The Phenotypic Response of the Soil Microbiome to Environmental Perturbations SFA developed an algorithm for automatically fitting metabolic models to predicted (and observed) phenotype data. Further validation of proposed annotations can be done using protein structure–based evidence, which is now supported by a collection of KBase tools that import protein structure data for KBase proteins from the Research Collaboratory for Structural Bioinformatics Protein Data Bank. 
Finally, all of these model-driven workflows are enhanced by significant improvements to the ModelSEED metabolic model reconstruction and analysis tools in KBase. Together, these tools interoperate seamlessly, offering a greatly enhanced understanding of genome metabolism, more accurate and quantitative energy metabolism, and improved phenotype prediction accuracy, which rose from 56% on average for draft models to 72%.

This pipeline is being applied to a growing collection of high-quality datasets loaded into KBase from collaborators. For example, the Genome Resolved Open Watershed (GROW) project contains 178 metagenomes, 50 metatranscriptomes, and 2,093 metagenome-assembled genomes that are available and linked to rich sample metadata in KBase. Another example is the Plant-Microbe Interfaces SFA, working on KBase apps to simplify the isolate selection process for constructed community experiments, adding >550 isolate genomes to KBase, and supporting integration of high-value datasets (Biolog data, BacDive database).

Infrastructure Updates

KBase has also undergone some infrastructure improvements over the past year. KBase has continued to dramatically improve bulk upload support, which now enables upload of large datasets using a spreadsheet to specify object names, filenames, and metadata.

A central goal of KBase is to put users’ data in context of all data in the platform, allowing users to quickly find and prioritize all relevant data. To further this goal, the project is introducing “Collections,” high-quality curated datasets and an interface for rapidly matching and subselecting datasets based on their relationship to a user’s data. These collections will initially focus on high-quality, highly relevant datasets, especially from the DOE community, and will enable users to see relationships between their data and these collections to enable new insights and drive further analysis.

KBase is also working to ensure community contributions are tracked and credited appropriately. KBase assigns Digital Object Identifiers (DOIs) for Narrative workflows that have been made static and documented for publication, which connects KBase research products to the broader publishing infrastructure. KBase will soon enable researchers to have their KBase DOIs visible as part of their individual ORCID record, and KBase will soon be able to report data reuse numbers to DataCite for all Narratives that have received a DOI for publication.

Finally, in collaboration with the DOE Joint Genome Institute through a co-development effort, KBase and Integrated Microbial Genomes (IMG) have generated a mapping of non-redundant protein sequences to UniRef 100 clusters which enables IMG users to identify and link directly to identical sequences in KBase. These platforms continue to work together to improve data connections, which will guarantee that data ownership and embargo periods are honored, even after data has been transferred between platforms.
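One simple way to link identical protein sequences across two resources is to index them by a digest of the normalized sequence. The sketch below illustrates that general idea only; it is not a description of how KBase, IMG, or UniRef implement their mapping, and the identifiers, sequences, and function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def sequence_key(sequence):
    """Digest of a normalized protein sequence (case-folded, stop symbol removed)."""
    normalized = sequence.strip().upper().rstrip("*")
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def link_identical(source, target):
    """Map each source identifier to target identifiers with an identical sequence."""
    index = defaultdict(list)
    for identifier, sequence in target.items():
        index[sequence_key(sequence)].append(identifier)

    links = {}
    for identifier, sequence in source.items():
        key = sequence_key(sequence)
        if key in index:  # membership test does not create new entries
            links[identifier] = sorted(index[key])
    return links


if __name__ == "__main__":
    # Hypothetical identifiers and sequences for illustration.
    platform_a = {"a_001": "MKTAYIAK", "a_002": "MSSQLV"}
    platform_b = {"b_100": "mktayiak*", "b_101": "MKTAYIAK", "b_102": "MAAAA"}
    print(link_identical(platform_a, platform_b))
```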

The KBase Knowledge Engine: Ecosystem Classification Prototype | Arkin | Lawrence Berkeley National Laboratory | Dehal | Computational Biology | KBase

One of the primary goals of the Department of Energy Systems Biology Knowledgebase (KBase) is the generation and application of biological knowledge from analytical results. To that end, the KBase Knowledge Engine (KE) will leverage existing and novel machine learning and bioinformatics tools to build up such knowledge from the growing body of results from analysis done using KBase and made publicly available. Ultimately, the project seeks to predict the key taxa, functions, ecosystem features, and their interactions. To accomplish this, researchers begin by developing (1) classifiers for identification of key determinants of ecosystems; (2) phenotype and trait predictors; and (3) robust pangenomes and their relationships across the microbial tree of life.

Microbial life is a critical component of Earth’s ecosystems, and the taxonomic and functional information from environmental genomics can provide insights into microbial roles in the environment. However, comparing this data across metagenomes can be challenging, and furthermore, abundance differences may not reflect important functional differences between environments. As a first KBase KE prototype, researchers aimed to use machine learning to: (1) build and evaluate robust ecosystem classification models using standardized data from ~32,000 metagenomes; and (2) identify important classification features and how they relate to understanding of environments on Earth.

Relying on standardized data from the European Bioinformatics Institute MGnify resource, researchers constructed feature tables associating metagenome sample environment labels with Gene Ontology (GO) term abundance, InterPro (IPR) domain, and predicted taxonomy profiles. Through a series of reusable data preparation and cleaning techniques, input data was generated for reliable model training. Hyperparameter tuning was performed on top multiclass classification methods and model performance was assessed with cross-validation. With a permutation analysis to extract feature importance from the top models, researchers obtained features important for classification and used these to construct trees and networks relating different ecosystems. Using relationships from the environmental classification as well as sample ecosystem outliers, the team interpreted model errors, including misclassifications as hypernyms and hyponyms, and was able to account for most model errors, suggesting future improvements through better incorporation of classification semantics into model training. Researchers also identified a series of model predictions that directly suggest sample relabeling, for example by providing more specific terms for samples labeled as Environmental: Aquatic: Marine. Results provide a high-performance metagenome ecosystem classification model and enable model interpretability to learn important ecosystem indicator functions as well as ecosystem and function relationships.
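The hypernym and hyponym error categories can be illustrated with hierarchical labels written as colon-separated paths, like the "Environmental: Aquatic: Marine" example above. In the sketch below, a prediction that is a proper prefix of the true label counts as a hypernym (more general), and one that extends the true label counts as a hyponym (more specific). This is only one plausible way to operationalize the idea; the function names and all labels other than the one quoted from the abstract are hypothetical.

```python
from collections import Counter


def parse_label(label):
    """Split 'A: B: C' into the tuple ('A', 'B', 'C')."""
    return tuple(part.strip() for part in label.split(":"))


def error_relation(true_label, predicted_label):
    """Classify a prediction as correct, hypernym, hyponym, or other."""
    true_path = parse_label(true_label)
    pred_path = parse_label(predicted_label)
    if true_path == pred_path:
        return "correct"
    if len(pred_path) < len(true_path) and true_path[:len(pred_path)] == pred_path:
        return "hypernym"  # prediction is a more general ancestor
    if len(pred_path) > len(true_path) and pred_path[:len(true_path)] == true_path:
        return "hyponym"   # prediction is a more specific descendant
    return "other"


def summarize(pairs):
    """Count each relation over (true_label, predicted_label) pairs."""
    return Counter(error_relation(t, p) for t, p in pairs)


if __name__ == "__main__":
    marine = "Environmental: Aquatic: Marine"
    pairs = [
        (marine, marine),
        (marine, "Environmental: Aquatic"),            # more general
        (marine, "Environmental: Aquatic: Marine: Oceanic"),  # more specific
        (marine, "Environmental: Terrestrial: Soil"),  # unrelated branch
    ]
    print(summarize(pairs))
```

A hyponym prediction on a coarsely labeled sample is the case that could suggest relabeling with a more specific term.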

Learning and Training with KBase | Arkin | Lawrence Berkeley National Laboratory | Allen | Computational Biology | KBase

The Department of Energy Systems Biology Knowledgebase (KBase) is a knowledge creation and discovery environment designed for both biologists and bioinformaticians. KBase integrates a large variety of data and analysis tools, from DOE and other public services, into an easy-to-use platform that leverages scalable computing infrastructure to perform sophisticated systems biology analyses. KBase is a publicly available and developer extensible platform that enables scientists to analyze their own data within the context of public data and share their findings across the system.

The KBase user interface (UI) enables instructors to work with students to conduct hands-on data science research and analysis without the need for programming skills or computational resources. The KBase team works with instructors and researchers of varying skill and career levels to ensure the transfer of domain knowledge is accompanied by an understanding of bioinformatic tools and techniques. KBase supports learning and training through the KBase Educators Community and by hosting workshops and webinars on community-focused topics. The variety of programming targets different cross-sections of the BER research community, with the overall goal to improve and expand the next generation of data analysis using KBase.

KBase Educators

The KBase Educators program consists of biological and data science instructors, ranging from high school to graduate level, who have adapted the KBase platform to their curriculum needs by developing modular, adaptable, and customizable instructional units using KBase Narratives. These instructional modules contain teaching resources, data analysis tools, and Markdown cells for tailoring instructions and learning goals. Each module can be adapted for independent class concepts across Genomics, Metagenomics, Phylogenetics, Pangenomics, Metabolic Modeling, and Transcriptomics. The KBase Educators Organization provides access to resources in KBase, and a KBase Users Slack channel provides access to a community network of peers, supported by community-driven guidelines, instructional templates, and KBase staff.

Educators from community colleges, primarily undergraduate institutions, and doctoral research institutions make up the KBase Educators Community. There is also representation of diverse student populations from Minority-Serving Institutions, including Hispanic-Serving, Asian American and Native American Pacific Islander-Serving, Alaskan Native-Serving or Native Hawaiian-Serving, Native American-Serving Nontribal, Historically Black Colleges or Universities, and Predominantly Black Institutions. Program growth and expansion will continue, as the community identifies additional areas that are important to support educators and their students.

Come see how KBase can support Promoting Inclusive and Equitable Research (PIER) Plans and develop connections with new collaborators for Reaching a New Energy Sciences Workforce (RENEW) and Funding for Accelerated, Inclusive Research (FAIR).

Outreach

The KBase team hosts outreach and training events to help research groups, educators, and collaborators advance their research. Training events include workshops and webinars that demonstrate use of the platform and showcase popular workflows. Workshops are used to reach specific institutions and facilitate collaboration with researchers and students. Webinars reach a broader audience through web-based training to introduce new features, showcase workflows, and host speakers including KBase staff, community developers, and subject matter experts.

Webinars are posted on the KBase YouTube channel (https://www.youtube.com/DOEKBase) for anyone to revisit and view after the event and often include public Narratives for users to test out new tools and workflows.

Through each of these approaches, KBase empowers skilled researchers and inspires the next generation of biologists and data scientists by providing a platform that seamlessly enables users to integrate conceptual knowledge with sophisticated systems biology investigative tools.

Probing Lignin Deconstruction and Catabolism in Soil Pseudomonas Species | Aristilde | Northwestern University | Aristilde | Bioenergy | University

The overall goal of this project is to elucidate the metabolic reaction networks within outer membrane vesicles (OMVs) secreted from soil Pseudomonas species. In particular, this project aims to evaluate how OMVs catabolize lignin-derived aromatics in Pseudomonas strains, and in turn to maximize aromatic catabolic activity via engineered or synthetic systems. The results from this work will enhance understanding of carbon cycling by soil bacteria and have implications in the use of engineered pseudomonads for lignin valorization to value-added compounds to support the bioeconomy.

Valorization of lignin is an important component of a sustainable bioeconomy. Gram-negative soil Pseudomonas strains, which natively catabolize lignin-derived aromatics (LDAs), are commonly engineered for the conversion of LDAs to value-added compounds. It was recently shown that Pseudomonas putida secretes OMVs enriched with enzymes that catalyze LDA turnover (Salvachúa et al. 2020). However, the metabolic reaction networks of pseudomonad OMVs remain uncharacterized.

Towards characterizing the regulatory controls and bottlenecks of OMV-localized fluxes from LDAs, the team first established a method for OMV isolation and enumeration from Pseudomonas cultivations and applied it to compare OMV secretion rates across growth conditions and stages. For OMV isolation, affinity-based kits were compared to ultracentrifugation (UC). Nanoparticle tracking analysis (NTA) was used both to count OMVs and to measure their size. Affinity-based isolation offered higher throughput and shorter processing time but lower particle yields. The isolation method had no effect on OMV size distribution. OMV secretion was determined at different growth stages during growth on hydroxycinnamic acid, p-coumarate, or nutrient-rich medium. Proteomics analysis is underway to evaluate whether isolation strategies result in selective enrichment of certain OMV populations. The objective is to identify the optimal sampling strategy for P. putida cultivations on p-coumarate.

Regarding OMV function, researchers hypothesized that the differential abundance of enzymes packaged into OMVs from P. putida fed on different LDAs will give rise to OMVs with different reaction networks. To test this hypothesis, the OMV metabolic functionality is being characterized for a variety of LDAs using OMVs produced by cultivation in lignin-rich media prepared with alkaline pretreated lignin liquor. After an LDA and cofactors (i.e., ATP and NAD(P)H) were added to purified OMVs, preliminary metabolic profiling was conducted. Metabolic intermediates such as 4-hydroxybenzoate and protocatechuate were identified, demonstrating that OMVs actively catabolize the LDA substrate. High-resolution kinetic profiling and kinetic 13C-labeling experiments with several LDAs will be performed next, both to obtain direct evidence of the metabolic functionality of OMV-localized reaction networks and to quantify it.

Lastly, quantification of metabolic functionalities in the OMVs may reveal metabolic bottlenecks within the pathways for LDA catabolism. Overcoming these bottlenecks would be of interest in engineering improved biocatalysts for lignin valorization. However, genetic tools for OMV biogenesis and cargo packaging are not currently available in P. putida. To this end, a library of genetic mutants has been screened to identify mutations that induce vesiculation but do not significantly impact growth on LDAs. Current work is focused on developing a SpyCatcher-SpyTag system for targeting protein cargo into OMVs. Protocols for proteomics analysis of OMV preparations from wild-type and mutant strains are under refinement and will enable assessment of aberrant protein cargo sorting in hypervesiculating mutants. The project aims to enable OMV deployment as a standardized synthetic biology tool in pseudomonads.

A Systems Understanding of Nitrogen-Fixation on the Aerial Roots of Sorghum | Ané | University of Wisconsin–Madison | Vermerris | Bioenergy | University

This project aims to understand the molecular and cellular networks controlling biological nitrogen fixation in sorghum aerial roots using a combination of genetics, synthetic bacterial communities, and systems biology.

Biological nitrogen fixation (BNF) by crop plants via microbial symbiosis is an effective approach to lowering the economic and environmental costs of crop production by decreasing fertilizer dependence. BNF is commonly associated with legumes, but cereals have been reported to be able to support nitrogen-fixing bacteria in the mucilage of their aerial roots (adventitious nodal roots), such as indigenous landraces of maize (Zea mays L.) in Oaxaca, Mexico (Van Deynze et al. 2018). Sorghum (Sorghum bicolor (L.) Moench) is an attractive bioenergy crop due to its ability to produce high biomass yields with minimal inputs and to withstand biotic and abiotic stresses. Some sorghum accessions can host nitrogen-fixing bacteria in the mucilage produced by their aerial roots, and this project is investigating the mechanisms enabling the symbiotic interactions with nitrogen-fixing microbes. Since this interaction relies on the presence of aerial roots, researchers have performed a genome-wide association study (GWAS) of two panels of genetically diverse sorghum genotypes, the sorghum minicore (Upadhyaya et al. 2009), a collection of landraces, and the sorghum association panel (SAP; Casa et al. 2008), a collection of sorghum genotypes representing all major cultivated races and important U.S. breeding lines and their progenitors. Traits of interest include the number of nodes producing aerial roots, the total number of aerial roots, aerial root length, and aerial root diameter. Since the traits supporting efficient BNF are hypothesized to be under genetic control but influenced by the environment, the team has also analyzed the effect of the nitrogen fertilization level and location (Florida vs. Wisconsin) on these traits. The proportion of genotypes forming aerial roots was substantially greater in the minicore than in the SAP, suggesting the presence of aerial roots has been under negative selection in modern breeding programs. 
The GWAS resulted in several candidate loci associated with the number of nodes producing aerial roots consistently in both locations. The genetic analysis of segregating breeding populations derived from crosses between commercial bioenergy sorghums and landraces that form aerial roots will be used to validate these loci. In addition, backcross populations with inbred line RTx430 will be the basis for future transgenic validation experiments with an improved sorghum leaf-whorl transformation system (Silva et al. 2020).

Plastic Degradation by the Gut Microbiome of Yellow Mealworms | Solomon | University of Delaware | Klauer | Bioenergy | University

This project discovers and reconstructs the plastic degradation pathways distributed across the gut microbiome of yellow mealworms (larvae of Tenebrio molitor) to develop enhanced capabilities for biologically based polymer recycling.

Plastics, initially selected for their durability and environmental resiliency, pose a significant environmental challenge for modern economies. Polystyrene (PS), high- and low-density polyethylene (HDPE and LDPE), and polypropylene (PP) are produced at a rate of more than 228 million tonnes globally each year. However, none have robust infrastructures for mechanical or chemical recycling, and they ultimately become polluting waste streams. To address this need, the team will pursue biological strategies for plastics depolymerization. Researchers will focus on the microbiomes of insect larvae (colloquially called worms) because they degrade plastics more rapidly than microbial isolates and do not require clean plastics or pretreatment. In particular, the microbiome of yellow mealworms is unique in that its host does not appear to contribute to degradation of a wide range of plastics. While bacterial community members have been identified, the specific pathways responsible for biodegradation remain to be elucidated, and the potential contributions of fungal members are unexamined. Additionally, emerging evidence suggests that nutrient supplementation enhances plastic metabolism by up to 70% and gives rise to a gut community structure distinct from that without additional nutrients. However, it is unclear whether nutrient supplementation induces microbes to participate in plastic degradation or whether it supports an optimal community composition for function.

As a first step to address these gaps, researchers characterized the consumption rates of PS, LDPE, HDPE, and PP by T. molitor larvae in the presence and absence of co-fed oats as a nutritional supplement. The consumption rates of PS, LDPE, and HDPE were 20.4, 12, and 1.1 mg (100 larvae)⁻¹ d⁻¹, respectively, in agreement with established studies. However, oat supplementation enhanced plastics consumption by ~160, 60, and 230%, respectively. These studies establish the use of oats as a potent supplement for enhancement of PS and LDPE consumption rates, up to double that obtained with established supplements, and validate HDPE consumption by T. molitor.
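As a worked illustration of the figures above, the following sketch computes the oat-supplemented consumption rates implied by the reported baselines and enhancements. It assumes that "enhanced by X%" means an increase of X% over the plastic-only rate; the variable and function names are illustrative, and the results are only approximate because the enhancements themselves are reported as approximate.

```python
# Baseline consumption rates reported above, in mg per 100 larvae per day.
baseline_rates = {"PS": 20.4, "LDPE": 12.0, "HDPE": 1.1}

# Approximate enhancement with oat co-feeding, in percent (as reported).
oat_enhancement_pct = {"PS": 160, "LDPE": 60, "HDPE": 230}


def supplemented_rate(baseline, enhancement_pct):
    """Rate after an increase of `enhancement_pct` percent over baseline.

    Assumption: "enhanced by X%" is read as baseline * (1 + X / 100).
    """
    return baseline * (1 + enhancement_pct / 100)


# Implied rates with oats, rounded to two decimals for display.
implied_rates = {
    plastic: round(supplemented_rate(rate, oat_enhancement_pct[plastic]), 2)
    for plastic, rate in baseline_rates.items()
}

for plastic, rate in implied_rates.items():
    print(f"{plastic}: ~{rate} mg per 100 larvae per day with oats")
```

Under this reading, the implied supplemented rates are roughly 53, 19, and 3.6 mg per 100 larvae per day for PS, LDPE, and HDPE.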

Worm-consumed plastics were chemically modified beyond simple mechanical degradation, validating biological mechanisms for plastics depolymerization. Fourier transform infrared spectroscopy (FTIR) analysis of plastic extracted from the frass (excrement) of mealworms fed PS revealed incorporation of oxygen not found in untreated controls. Moreover, benzene ring cleavage was observed for treated PS samples. Similarly, FTIR spectra of plastic extracted from LDPE-fed mealworm frass revealed the incorporation of carbonyl and alcohol groups. Finally, gel permeation chromatography (GPC) of the ingested plastic from PS-fed T. molitor larvae confirmed a 40% decrease in polymer molecular weight, while LDPE-fed worms decreased the molecular weight by up to three orders of magnitude. Taken together, these results demonstrate that the plastics ingested by the larvae are actively depolymerized and chemically modified.

Microbiome community analysis via 16S and ITS sequencing revealed a rich consortium of bacteria and fungi. The bacterial community was more diverse than the fungal community, with observed taxa belonging to the bacterial phyla Firmicutes, Tenericutes, Proteobacteria, Actinobacteria, Spirochaetes, Bacteroidetes, and Fusobacteria, and the fungal phyla Ascomycota, Basidiomycota, and Mucoromycota. As expected, mealworm diet led to unique community structures adapted to degradation of the fed plastic substrate. However, oat co-supplementation frequently selected for taxa that were not observed in plastics-only or oats-only controls, suggesting currently unrecognized interactions. Despite these unique community structures, microcosms of communities in planktonic culture, selected for with LDPE, HDPE, PS, or PP diets, were all able to grow on LDPE as a primary carbon source. Finally, community analyses revealed many facultative and obligate anaerobic genera, such as Spiroplasma, associated with LDPE and PS degradation when supplemented with oats. Correspondingly, these communities were enriched in clusters of orthologous genes (COG) and protein families (Pfam) for iron-dependent anaerobic oxidation enzymes and pathways, which may serve as novel oxygen-independent pathways for plastics depolymerization.

To determine microbes and active enzymes responsible for the degradation of PS, PE, and PP, researchers have developed a suite of photoreactive chemical probes that resemble oligomers of these polymers. These probes are fluorescently labeled, providing an avenue to selectively isolate microbes that take up these molecules via fluorescence-activated cell sorting and enable subsequent proteomic characterization of the proteins acting upon them. Team members have begun using these probes to characterize enzymes that bind them strongly in a series of microbial isolates and identify the taxonomy of cells capable of transporting these plastics in plastic-degrading worm gut microbiomes.

In summary, ongoing work has characterized plastic consumption rates in T. molitor microbiomes, revealing novel strategies to structure gut microbial populations for enhanced degradation. Plastics were noted to be metabolized and not only mechanically degraded by both bacterial and fungal communities that contribute to plastic degradation even independent of the host mealworm. Team members have also developed chemical probe analogs of common plastics to isolate plastic-binding microbes and proteins for study. Through these parallel efforts, researchers aim to generate systems-level insight into the metabolic pathways of plastic-degrading microbiomes and to develop consortia enriched in plastic degradation activity.

The MoonTag Programmable Transcription Activator and Synthetic Promoters: Tools for Engineering Oil Biosynthesis in Camelina and Pennycress | Smanski | University of Minnesota–Saint Paul | Casas Mollano | Biosystems Design | University

This project leverages sequence-programmable transcriptional activators (PTAs) and synthetic promoters to coordinately fine-tune the expression of multiple endogenous genes and/or transgenes to facilitate targeted lipid production.

Programmable transcriptional activators (PTAs) are synthetic tools that enable tunable regulation of the expression of endogenous genes and transgenes in eukaryotic and prokaryotic organisms. PTAs are composed of activation domains fused to a dead Cas9 enzyme (dCas9) that lacks nuclease activity but can still bind DNA in a sequence-specific manner (Casas-Mollano et al. 2020). The main advantages of Cas9-based systems are that they can achieve high levels of gene activation and are very easy to program via base-pairing between the guide RNA (gRNA) and the DNA target strand (Casas-Mollano et al. 2020). The PTA described here, called MoonTag, is a second-generation system that uses a nanobody-antigen peptide interaction to recruit multiple copies of an activation domain to its target promoters (Casas-Mollano et al. 2023). MoonTag is capable of inducing high levels of transcription in reporter as well as endogenous genes in the monocot model plant Setaria. MoonTag is also able to efficiently activate genes in eudicot species such as Arabidopsis and tomato (Casas-Mollano et al. 2023). In addition to its activation capabilities, MoonTag components are expressed in transgenic plants at high levels without any deleterious effects resulting from the expressed components being incompatible with plant cells. Thus, MoonTag is a new activator that could be used to regulate the transcription of endogenous genes in many plant species.

The team has also created a set of synthetic promoters that could be used together with MoonTag to coordinate the expression of multiple genes. The synthetic promoters were designed using a modular approach. They comprise a minimal promoter (the region immediately upstream of the transcriptional initiation site, where the transcription pre-initiation complex and RNA pol II bind) and an upstream trans-activation (TA) region containing binding sequences for transcription factors that stimulate or repress transcription (Belcher et al. 2020). The TA region was designed to contain six gRNA binding sites for a Cas9-based PTA such as MoonTag. The designed synthetic promoters can be strongly activated by MoonTag and other Cas9-based activators. One advantage of these synthetic promoters is that they share a minimal amount of duplicated sequence, limited to the gRNA binding sites, allowing for the design of multigene vectors with limited sequence homology but robust expression levels.

As a proof of concept, researchers designed synthetic promoters driving the three genes from the betalain biosynthetic pathway (He et al. 2020). In the presence of the MoonTag PTA, these genes were simultaneously activated, leading to the accumulation of betalains. Researchers also demonstrated tissue-specific regulation of the synthetic promoters by expressing MoonTag from a seed-specific promoter, which led to the accumulation of betalains exclusively in the seeds. Thus, the use of synthetic promoters together with PTAs should allow for the deployment of multigene constructs, such as those for metabolic pathways or trait stacking, that can be directed to be expressed specifically in a particular tissue, organ, or developmental time, or in response to external cues.

In the context of the B5 (Bigger Better Brassicaceae Biofuels and Bioproducts) project, the team will use the MoonTag activator and the synthetic promoters to achieve precise coordination of the expression of transgenes as well as endogenous genes in order to facilitate oilseed engineering in pennycress and camelina. Overexpression of endogenous genes in seeds will be achieved by expressing MoonTag driven by seed-specific promoters together with gRNAs targeting the promoters of the genes of interest. When necessary, transgenes driven by synthetic promoters activated by MoonTag will be used so that both transgenes and endogenous genes can be coordinately regulated at a particular seed developmental stage, determined by the expression pattern of MoonTag. For overexpression, researchers will initially target the FatB gene to boost palmitic acid (16:0) content and WRI1, DGAT1, and GDP1 to increase oil content. Oilseed engineering will also require repression of lipid biosynthetic genes. To target genes for repression, team members will generate MoonTag PTRs (programmable transcriptional repressors) bearing the repressive SRDX domain (Hiratsu et al. 2003). Alternatively, researchers will use the synthetic promoters to drive expression of double-stranded RNAs that target endogenous genes for downregulation through RNA interference.

Predicting and Modeling Protein-Protein Complexes at Large-Scale with Deep Learning | Skolnick | Georgia Institute of Technology | Gao | Computational Biology | University

With the advances in next-generation sequencing technologies, the number of sequenced genomes is growing exponentially. This has resulted in a bottleneck for the translation of sequence information into functional hypotheses about each gene. Current gene annotation technologies are primarily based on evolutionary inference by sequence comparison; however, many proteins in a proteome remain uncharacterized. To address this challenge, this collaborative team is developing a suite of novel high-performance-computing (HPC), deep-learning methods that predict protein structures and interactions at unprecedented accuracy, making use of the Summit supercomputer at the DOE leadership computing facility at the Oak Ridge National Laboratory. The combination of deep learning, HPC, and structure-based analysis will help to understand molecular mechanisms of protein functions and enable rapid, accurate prediction of gene function on a genomic scale, such as novel protein-protein interactions important to life.

One key observation of proteins in a living cell is that they usually interact with each other to carry out their biological functions. The identification and characterization of these protein-protein interactions are therefore critical to understanding life. Very recently, researchers proposed a deep learning–based approach for the identification of protein-protein interactions. The approach, AF2Complex, is built on the success of AlphaFold 2. AF2Complex extends the idea of structure modeling of a single protein sequence to a complex made of multiple sequences and further predicts protein-protein interactions by using the confidence of its structural modeling. While AF2Complex has been successfully benchmarked in multiple tests, including 7,000 protein pairs from the bacterium E. coli, it is important to demonstrate its usefulness by applying it to real-world problems. For this purpose, team members investigated the pathway leading to the folding and assembly of outer membrane proteins (OMPs) in E. coli as a proof-of-concept illustration of the approach. OMPs serve essential functional roles, such as nutrient exchange with the cell's environment. The making of these barrel-like proteins is an elaborate process starting within the cytoplasm, where they are first manufactured by ribosomes. Coming out of the ribosomes are nascent, still largely unfolded peptide chains that must subsequently cross the inner membrane, travel through the periplasmic space, and finally land at their destination: the outer membrane. To ensure a successful journey, many other proteins provide vital help by forming functional protein complexes. However, these complexes are challenging to characterize experimentally because many of the proteins involved are membrane proteins and the interactions are often transient.
By applying the AF2Complex workflow established on Summit to several key proteins in the OMP biogenesis pathway, researchers have identified their functional partners within the top 1% ranking of ~1,500 proteins screened for protein-protein interactions (PPIs) per query. Thanks to the high-confidence structures underlying the top predictions, one can understand many experimental phenomena, particularly in vivo site-directed photo-cross-linking data. For example, cross-linked products found from the translocon SecYEG or the β-barrel assembly machine (BAM) supercomplexes may be explained by direct physical interactions revealed in predicted structures (Figure 1). An unexpected, biologically important interaction has been identified between the enzyme DsbA and the chaperone PpiD, which is associated with the SecYEG translocon. Moreover, previously speculated conformations are captured for SurA and BepA. Most importantly, these revealing atomic structures of various supercomplexes suggest mechanistic hypotheses for various steps of the OMP biogenesis pathways.
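The "top 1% of ~1,500 screened proteins" criterion above can be illustrated with a small ranking sketch. This is not the AF2Complex code or its scoring function: the candidate names and scores below are synthetic placeholders, and `top_percent_hits` is a hypothetical helper that only assumes each screened candidate has a single numeric confidence score where higher means a more confident predicted interaction.

```python
def top_percent_hits(scores, percent=1):
    """Return candidate names ranked in the top `percent` by score, best first.

    `scores` maps a candidate name to a confidence score (higher is better).
    At least one candidate is always returned.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, (len(ranked) * percent) // 100)
    return ranked[:n_keep]


# Synthetic placeholder scores for 1,500 candidates screened against one query.
# Real scores would come from the structure-prediction confidence metric.
scores = {f"candidate_{i:04d}": i / 1500 for i in range(1500)}

hits = top_percent_hits(scores, percent=1)
print(f"{len(hits)} candidates fall in the top 1% of {len(scores)} screened")
print("best-ranked candidate:", hits[0])
```

With about 1,500 candidates per query, a top-1% cutoff corresponds to roughly the 15 highest-scoring candidates.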

Feedstocks-to-Fuels Pipeline Demonstration: End-to-End Process Synthesis | Singh | CABBI | Banerjee | Bioenergy | CABBI

One goal of the “Feedstocks-to-Fuels Pipeline” project was to demonstrate pilot-scale processing of CABBI feedstocks, namely genetically modified sugarcane (oilcane) and Miscanthus x giganteus, for the recovery of vegetative lipids, microbial lipids, succinic acid, and anthocyanins as the main products. Overall, this work on end-to-end process synthesis tests the performance of the CABBI feedstocks, the processing methods, and the yeasts engineered to produce bioproducts at an industrially relevant scale using the Integrated Bioprocessing Research Laboratory. It also identifies technology gaps and develops enabling technologies.

The Feedstocks-to-Fuels pipeline synthesizes advances in feedstocks, bioprocessing technology, and engineered yeasts. The deconstruction of the CABBI feedstocks is the most crucial step in recovering “in planta” products such as oil, waxes, sugars, and pigments. The pilot scale processing of CABBI feedstocks, such as oilcane and purple-stemmed Miscanthus x giganteus, involves the development of biomass deconstruction strategies followed by their conversion into value-added products. Oilcane is produced by the metabolic engineering of sugarcane to accumulate lipids in vegetative tissues (Parajuli et al. 2020). The high biomass productivity of this transgenic bioenergy crop holds the potential to produce more oil per hectare of cultivated land than soybean (Huang et al. 2016). The vegetative lipids present in oilcane have the potential for biodiesel production while the oilcane juice, rich in sugars, can be used for the production of value-added biochemicals.

Oilcane stems were received from the University of Florida and were processed at the Integrated Bioprocessing Research Laboratory. About 218 kg of juice and 230 kg of wet bagasse were recovered by processing 466 kg of oilcane stems. The oilcane juice was used to produce succinic acid using a novel metabolically engineered Issatchenkia orientalis at an acidic pH of 3, achieving a titer of 54 g/L. In the downstream processing, 63.9% of the succinic acid was recovered from the fermentation broth with 98.5% purity through filtration, followed by decolorization and crystallization. The processed oilcane bagasse was pretreated through a continuous pilot-scale hydrothermal process at 50% (w/w) solids followed by mechanical refining (HMR) for the deconstruction of the lignocellulosic network. Fed-batch enzymatic hydrolysis of pretreated bagasse was performed to achieve industrially relevant cellulosic sugar concentrations. Processing with HMR did not change the in situ lipid profile of the oilcane bagasse and gave optimal recovery of lignocellulosic sugars for conversion to microbial lipids. The major fraction of vegetative lipids was thus recovered from the biomass residue following enzymatic saccharification of the pretreated bagasse (Maitra et al. 2022).
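The mass figures above can be checked with a short mass-balance sketch. The inputs are the reported values; the variable names are illustrative, and the per-liter succinic acid estimate assumes that the 63.9% recovery applies directly to the 54 g/L titer, which the abstract does not state explicitly.

```python
# Reported pilot-scale figures for oilcane processing.
stems_kg = 466.0        # oilcane stems processed
juice_kg = 218.0        # juice recovered
wet_bagasse_kg = 230.0  # wet bagasse recovered

# Mass balance across the juice extraction step.
recovered_kg = juice_kg + wet_bagasse_kg
unaccounted_kg = stems_kg - recovered_kg
recovery_pct = 100 * recovered_kg / stems_kg
juice_pct = 100 * juice_kg / stems_kg
bagasse_pct = 100 * wet_bagasse_kg / stems_kg

print(f"juice: {juice_pct:.1f}% of stem mass")
print(f"wet bagasse: {bagasse_pct:.1f}% of stem mass")
print(f"total recovered: {recovered_kg:.0f} kg ({recovery_pct:.1f}%)")
print(f"unaccounted: {unaccounted_kg:.0f} kg")

# Succinic acid: titer and downstream recovery as reported.
titer_g_per_l = 54.0
downstream_recovery = 0.639

# Assumption: recovery is applied to the titer, giving grams of succinic
# acid recovered per liter of fermentation broth.
recovered_g_per_l = titer_g_per_l * downstream_recovery
print(f"succinic acid recovered: ~{recovered_g_per_l:.1f} g per liter of broth")
```

On these numbers, juice and wet bagasse together account for about 96% of the stem mass, leaving about 18 kg unaccounted for in the reported totals.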

Purple-stemmed Miscanthus x giganteus is another CABBI feedstock and is known for its potential to accumulate natural colorants, namely anthocyanins. The overall yield of anthocyanins recovered per unit area is significant due to the high productivity of this bioenergy crop (Banerjee et al. 2023).

Preliminary studies showed that hydrothermal pretreatment of miscanthus could be used as a green approach to recover more than 90% of the total anthocyanins as an additional product stream and also enhanced the enzymatic digestibility of the biomass (Banerjee et al. 2022). For pilot-scale demonstration, 50 kg of purple-stemmed Miscanthus x giganteus, grown at the Energy Farms at the University of Illinois, was processed through a continuous pilot-scale hydrothermal pretreatment at 50% (w/w) solids. The pretreated biomass was further subjected to disc milling. The pretreatment led to a recovery of 94.3% w/w of the total anthocyanins present in miscanthus and also improved the enzymatic digestibility of cellulose leading to a 2.1-fold increase in the overall recovery of glucose. The cellulosic sugars thus obtained were converted into microbial lipids using an oleaginous yeast strain, which are a potential feedstock for biodiesel production.

Overall, the Feedstocks-to-Fuels pipeline successfully demonstrated the potential of two CABBI feedstocks, namely oilcane and purple-stemmed miscanthus. The valorization products were vegetative and microbial lipids for biofuel production, along with succinic acid and anthocyanins as additional value-added product streams.

Functional Analysis of Genes Encoding Ubiquitin Proteasome System Components Affecting Poplar Wood Traits | Shabek | University of California–Davis | Rodriguez-Zaccaro | Bioenergy | University

Wood vessel trait candidate genes coding for E3 ubiquitin ligase enzymes will be functionally characterized and examined through CRISPR-Cas9 genome editing, TurboID proximity labeling, and drought and ABA treatments. Specifically, the data generated in this project will be further used to study the transcriptome, proteome, interactome, and ubiquitinome in poplar wood-forming tissues.

Wood is the water-conducting tissue of tree stems. Like that of most angiosperm trees, poplar wood contains water-conducting vessel elements whose anatomical properties affect water transport and growth rates as well as susceptibility to cavitation and hydraulic failure during drought. Despite their key role in determining the hydraulic physiology of trees, the genetic regulation of vessel element morphological traits is poorly understood. In a preliminary study, a dosage-based genome-wide screen found significant associations (or dosage QTL regions) between wood vessel traits and specific regions of the genome. Poplar wood-forming tissues were then sampled to conduct a gene coexpression network analysis. Height-corrected vessel frequency was significantly correlated with a group of co-expressed genes that code for E3 ubiquitin ligase components of the ubiquitin proteasome system. The team found that some of these genes are located within a chromosome 9 dosage QTL region identified in its previous screen, suggesting that these could affect vessel trait variation in a dosage-dependent manner. From these genes, the team selected vessel trait-related candidates for further characterization, including the E3 ubiquitin ligases makorin (MKRN), SKP1-interacting protein 2 (SKIP2), and Big Brother (BB). Based on these preliminary findings, future aims involve the novel functional characterization of key components of ubiquitin-proteasome regulation in poplar wood-forming tissue. To meet this goal, researchers will generate CRISPR-Cas9 mutants for poplar ubiquitin E3 ligase candidate genes to determine changes in wood phenotype, gene expression, protein abundance, and ubiquitinomes. The team will also use a TurboID proximity labeling strategy in poplar to identify the interacting partners of candidate proteins and characterize their specific structure and function.
Similarly, the transcriptome, proteome, interactome, and ubiquitinome of trees that were grown under drought or treated with ABA will be determined. Ultimately, these strategies will shed light on the role of the ubiquitin proteasome system in wood formation, vessel trait variation, and tree responses to the environment.

532 Genomes Reveal Natural Variation and Local Adaptation History in Pennycress | Sedbrook | Illinois State University | Toro Arana | Bioenergy | University

This project employs evolutionary and computational genomic approaches to identify key genetic variants that have enabled Thlaspi arvense L. (Field Pennycress; pennycress) to locally adapt and colonize all temperate regions of the world. This, combined with knowledge of metabolic and cellular networks derived from first principles, guides precise laboratory efforts to create and select high-resilience lines, both from arrays of random mutagenesis and by employing cutting-edge CRISPR genome editing techniques. This project will deliver speed-breeding methods and high-resilience mutants inspired by natural adaptations and newly formulated biological principles into a wide range of commercial pennycress varieties to precisely adapt them to the desired local environments.

Pennycress is under development as an annual winter oilseed cover crop for the 80 million acre U.S. Midwest Corn Belt and other temperate regions including the Pacific Northwest. It has demonstrated unique attributes such as extreme cold resilience, rapid spring growth, and adaptation to various environments. Identifying genomic loci contributing to its resilience and local adaptation will significantly benefit its breeding programs and shed light on climate adaptation in other Brassicaceae bioenergy crops. Here, the team presents an analysis of 532 high-quality whole-genome resequenced wild pennycress accessions collected from their natural Eurasian and North American range (Pennycress Genome Portal, Nunn et al. 2022, Geng et al. 2021). The team identified 6.3 million single nucleotide polymorphisms (SNPs) using the grenepipe variant calling pipeline (Czech and Exposito-Alonso 2022). In-depth analyses of population structure and demography indicate multiple recent migrations of the North American accessions from Europe with similar extensive genetic variation structured latitudinally. Preliminary admixture results suggest that introgression may play an important role in the local adaptation of pennycress. Genome-wide scans of selection signals and climate GWAS provide candidate genomic regions responsible for local adaptation and cold tolerance. In conclusion, this study offers valuable genomic resources for pennycress breeding and helps elucidate the history of migration and local adaptation in pennycress.

Comparison of Physiological and Metabolomic Responses to Drought Across Pennycress CRISPR Mutants and Natural Accessions | Sedbrook | Illinois State University | Thomas | Bioenergy | University

This project employs evolutionary and computational genomic approaches to identify key genetic variants that have enabled Thlaspi arvense L. (Field Pennycress; pennycress) to locally adapt and colonize all temperate regions of the world. This, in combination with knowledge of metabolic and cellular networks derived from first principles, is guiding precise laboratory efforts to create and select high-resilience lines, both from arrays of random mutagenesis and by employing cutting-edge CRISPR genome editing techniques. This project will deliver speed-breeding methods and high-resilience mutants inspired by natural adaptations and newly formulated biological principles to be introduced into a wide range of commercial pennycress varieties to precisely adapt them to the desired local environments.

Field pennycress is an overwintering bioenergy cover crop that is rapidly being domesticated. It produces oilseeds that can be converted into various products from cooking oil to renewable diesel and sustainable aviation fuel (SAF). By replacing fossil fuels, pennycress directly combats climate change. However, abiotic stresses brought by climate change threaten stable production of many crops including pennycress. Pennycress’s drought response has not been studied in much detail. Here, researchers explored the drought responses of wild-type, gene-edited, and natural accessions by physiological assays, metabolic profiling, and data integration via a genome-scale metabolic pathway database for pennycress. Team members first measured phenotypes relevant to drought stress in seedlings and plants of the reference line Spring32-10, including yield, biomass, stomatal conductance, and photosynthetic efficiency, over various drought conditions. Researchers subjected plants to drought and control conditions and then collected above- and belowground tissues for metabolomic analyses via liquid chromatography coupled to mass spectrometry, which allows detection of changes in plant metabolism during drought stress. These data will be analyzed in the context of the PennycressCyc database the team created for the pennycress community. Additionally, team members generated pennycress single, double, and triple mutants using CRISPR-Cas9 mutagenesis targeting 10 genes important for drought responses in other species. These mutants, along with a subset of 800 worldwide natural pennycress accessions predicted to have varied levels of drought resilience based on the team’s climatype predictions, were subjected to drought conditions and phenotyped to identify relative differences in drought responses. Taken together, this work helps decipher how pennycress uniquely responds to drought stress and is identifying natural and induced genetic changes that could improve pennycress drought resilience.

Optimizing Biological Nitrogen Fixation on Sorghum by Manipulating Microbial Communities | Ané | University of Wisconsin–Madison | Palmer | Bioenergy | University

As part of a multidisciplinary collaboration, this project aims to address key sustainability challenges facing the cultivation of sweet sorghum, an important biofuel crop. In particular, this project aims to decrease reliance on synthetic nitrogen fertilizers by improving biological nitrogen fixation, focusing on sorghum aerial roots as the main sites for diazotrophic activity. In tandem with collaborators investigating this trait from the plant perspective, researchers evaluate the bacterial communities associated with sorghum responsible for biological nitrogen fixation. The team plans to (1) isolate and characterize diverse bacterial strains from aerial root mucilage, (2) determine the bacterial interspecies interactions impacting biological nitrogen fixation, and (3) develop and test synthetic communities with robust biological nitrogen fixation.

Cultivation of the key biofuel crop, sorghum, relies heavily on using natural gas-intensive and environmentally damaging nitrogen fertilizers (Rütting, Aronsson, and Delin 2018). As an alternative, biological nitrogen fixation (BNF) through the activity of plant-associated diazotrophic bacteria has the potential to improve crop production sustainability and reduce environmental damage (Pankievicz et al. 2019). Demonstrating the potential of BNF in cereals, some indigenous corn landraces from Central America can obtain 29%–82% of their nitrogen from BNF in the low-oxygen and sugar-rich mucilage secreted by aerial roots (Van Deynze et al. 2018). Several sorghum accessions also produce aerial roots and mucilage, but the diazotrophic activity of sorghum aerial root–associated communities has not been investigated. Following these observations, the team hypothesized that aerial root mucilage provides the ideal environment in which to improve BNF in sorghum.

The project uses a synthetic microbial community approach to investigate the bacterial interspecies interactions impacting BNF. Researchers drew upon a previous study that identified a seven-member community that stably and reproducibly assembled on corn roots (Niu et al. 2017) and added five diazotrophic strains of interest, yielding a 12-member nitrogen-fixing community referred to as PComm1. Community BNF and composition in low and high species richness subcommunities of PComm1 are evaluated using acetylene reduction assays followed by 16S sequencing. All possible 1-, 2-, 11-, and 12-member communities are investigated. Preliminary results suggest that most interspecies interactions negatively impact nitrogen fixation, with less nitrogenase activity observed in communities with more members. This data set will be used to build computational models to further elucidate interspecies interactions impacting community growth and fixation, as well as to design communities with enhanced BNF.
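The size of this experimental design can be checked with a short count. The Python sketch below is illustrative only: the member labels (`strain_01` to `strain_12`) are hypothetical placeholders, since the strain names are not given here, and the code simply enumerates every 1-, 2-, 11-, and 12-member subcommunity of a 12-member pool.

```python
from itertools import combinations

# Hypothetical placeholder labels; the abstract does not name the 12 strains.
members = [f"strain_{i:02d}" for i in range(1, 13)]

# Community sizes evaluated: low richness (1, 2) and high richness (11, 12).
sizes = (1, 2, 11, 12)

# Enumerate every subcommunity of each size and count them.
communities = {k: list(combinations(members, k)) for k in sizes}
counts = {k: len(v) for k, v in communities.items()}
total = sum(counts.values())

print(counts)  # {1: 12, 2: 66, 11: 12, 12: 1}
print(total)   # 91
```

Under these assumptions the design covers 91 distinct communities, a small fraction of the 4,095 non-empty subsets that a 12-member pool allows.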

In addition to investigating community BNF in a simplified model system, researchers are also working to isolate and characterize bacteria from aerial root mucilage. Previously, the team isolated a collection of ~90 individual strains with a range of plant growth–promoting traits, such as auxin production, siderophore production, phosphate solubilization, and 1-aminocyclopropane-1-carboxylate (ACC) degradation. However, less than 10% of these isolates have nitrogen-fixing capabilities. Therefore, focus is currently on expanding the diversity of nitrogen-fixing strains in the project’s collection. A semisolid nitrogen-free medium approach was used to obtain new bacterial strains and increase the genetic variability of the mucilage-borne strain collection. Researchers obtained more than 320 new bacterial strains and used DNA fingerprinting to narrow down the collection to ~200 strains, excluding redundant profiles. In the next steps, the team will identify these strains and determine their diazotrophic ability, aiming to select new strains for future community manipulation and genetic engineering.

Single-Cell Omics to Examine the Cell Type-Specific Gene Regulatory Programs of Mucilage Production in Sorghum bicolor | Ané | University of Wisconsin–Madison | Ané | Bioenergy | University

Specific accessions of sorghum (Sorghum bicolor) develop aerial roots and produce carbohydrate-rich mucilage. The presence of mucilage suggests potential nitrogen-fixing microbial associations that contribute to nitrogen nutrition, as reported in a maize (Zea mays) landrace. However, the specific cell types that produce mucilage and their underlying gene regulatory programs are poorly understood. To understand the molecular basis of mucilage production, researchers developed a protocol to isolate nuclei from cells of mucilage-producing aerial roots. The isolated nuclei were then sequenced using the 10x Genomics Chromium technology. In the preliminary single-nucleus dataset of approximately 7,000 cells, cell types in the aerial root were detected based on the expression of known marker genes. In parallel, the team has been developing novel analytical pipelines to identify cell types, cellular lineage structure, and cell-type-specific gene regulatory networks (GRNs). Applying the team’s approach to published single-cell RNA-seq datasets, researchers were able to recapitulate putative GRN components and regulators of cell fate specification. Applying these approaches to the sorghum single-nucleus datasets should advance understanding of mucilage production in the aerial root. This opens the possibility of identifying and validating candidate genes for breeding this important trait.

Developing a National Virtual Biosecurity for Bioenergy Crops Center | Schoonen | Brookhaven National Laboratory | Freimuth | Bioenergy | NVBBCC

The goal of this 18-month pilot project is to develop a roadmap for a new U.S. Department of Energy (DOE) Office of Science Biological and Environmental Research (BER) capability to address biothreats to bioenergy crops. The main deliverable of this effort is a roadmap toward a new National Virtual Biosecurity for Bioenergy Crops Center (NVBBCC) based on community input as well as the experience of conducting a limited study on anthracnose, a disease affecting Sorghum, one of the DOE’s leading biofuel feedstocks. A mature NVBBCC is envisioned to be a distributed, virtual center with multiple DOE laboratories at its core to maximize the use of unique facilities and expertise across the DOE complex. NVBBCC will support community-driven plant pathology research as well as broader BER-relevant plant biology research. The new NVBBCC capability could also support responses to biothreats to unmanaged ecosystems, and its techniques, workflows, and infrastructure could be readily pivoted to a wider range of biosecurity challenges.

Rationale: The development of resilient and sustainable bioenergy crops is an important part of U.S. government strategy to transition to a net-zero economy. An important consideration in developing the U.S. bioeconomy is the biosecurity of crops grown for bioenergy production. The most likely biosecurity threats to bioenergy crops are either known pests or pathogens that emerge in new areas, possibly driven by climate change, or new pests or pathogens that are genetically related to known ones. A robust biosecurity capability optimized to respond rapidly to biothreats to bioenergy crops requires an integrated and versatile platform that delivers rapid detection and targeted sampling, propagation prediction, and timely characterization of the interaction between a pest or pathogen and the bioenergy crop. These capabilities are needed to underpin the development of controls and solutions. Here, the team reports on a new pilot study funded by the DOE to develop a roadmap for a National Virtual Biosecurity for Bioenergy Crops Center (NVBBCC) organized around four interconnected modules: detection and sampling, biomolecular characterization, assessment, and mitigation.

Approach: The team will use a series of community planning meetings and experimental work on a known disease in Sorghum to develop a roadmap for establishing NVBBCC. The roadmap planning meetings, conducted within the first nine months, will identify partnerships within and outside DOE necessary to establish the full capability required for an end-to-end biosecurity platform and develop a network of experts and facilities that the center can draw upon when faced with a biothreat. A study on a fungal disease, anthracnose, which affects Sorghum, a leading energy crop, will be used to develop material, experimental, and data workflows as well as guide future investments. Anthracnose is caused by Colletotrichum sublineola and can reduce crop yield by up to 67% (Stutts and Vermerris 2020). The team has established the capability to work with the disease to study pathogen-host interactions. Biomolecular imaging capabilities at Brookhaven National Laboratory will be used to advance understanding of the interaction of C. sublineola with Sorghum.

Informed by lessons learned from DOE’s National Virtual Biotechnology Laboratory (NVBL), the team will develop a dedicated computing platform to support the NVBBCC pilot study. Its backbone is an integrated and flexible computational science software and hardware system to support persistent data storage; advanced AI/ML-enabled data analysis; data fusion; data sharing; and near-real-time visualizations of the geographic localization of disease as well as computational simulations and predictions. The pilot study intends to stand up an initial prototype of a computational platform that can be scaled up if needed. A separate contribution to this meeting will focus on that component of the pilot study.

Results: The presentation will be a status review of the pilot study and a preliminary report from two community planning meetings that will have been held by mid-April. The first meeting, in February, will focus on biomolecular characterization of biothreats to bioenergy crops. The second will focus on atmospheric dispersion pathways of diseases relevant to bioenergy crops.

Lipid Membrane Remodeling and Metabolic Response During Ethanol and Isobutanol Stress in Zymomonas mobilis | Donohue | GLBRC | Rivera Vazquez | Bioenergy | Early Career

Zymomonas mobilis, an ethanologenic gram-negative bacterium, is currently being bioengineered to produce isobutanol. However, it has been observed that exposure to isobutanol elicits detrimental physiological changes, including a reduction in growth rate and glucose consumption. This project aims to systematically investigate the physiological response of Z. mobilis to isobutanol with a particular emphasis on changes in lipid membrane composition and proteome allocation.

Despite being a proficient ethanol producer, Z. mobilis experiences growth inhibition at high ethanol titers and is highly sensitive to isobutanol. It is known that bacteria can modulate lipid membrane composition to increase their tolerance to environmental stressors. In this study, researchers used liquid chromatography–tandem mass spectrometry (LC-MS/MS)–based lipidomics to measure changes in lipid membrane composition that occur when Z. mobilis is exposed to increasing concentrations of ethanol and isobutanol. Exposure to ethanol and isobutanol resulted in significant but distinct changes to the lipid and fatty acid composition. Affected lipid classes included cardiolipins, phosphatidylcholines, and phosphatidylethanolamines. Most notably, a substantial increase in C19 cyclopropane fatty acid content was observed when cells were grown at high ethanol concentrations, suggesting that the changes comprise a defense mechanism in response to solvent stress. Previous evidence showed that cyclopropane-ringed fatty acids modify membrane fluidity and act as a barrier to prevent detrimental molecules from entering the cell. To test the hypothesis that C19 cyclopropane fatty acids and derived lipids contribute to solvent resistance in Z. mobilis, researchers engineered a strain that overexpressed the Cyclopropane Fatty Acyl Synthase (CFA synthase) protein (ZMO1033) responsible for transforming unsaturated fatty acids into cyclopropane fatty acids. Analysis of the lipid membrane composition of the CFA synthase overexpressing strain showed a significant increase in C19 cyclopropane fatty acid content for all lipid classes. This increase correlated with significantly improved growth rates in the presence of high ethanol and isobutanol concentrations. These data demonstrate the importance of cyclopropane fatty acids to solvent stress resistance and the effects of isobutanol on protein activity in Z. mobilis. These data will allow engineering of strains that are more resistant to high ethanol and isobutanol concentrations.

The Use of Deuterated Water as a Substrate-Agnostic and Cost-Effective Isotope Tracer for Investigating Reversibility and Thermodynamics of Reactions in Central Carbon Metabolism | Amador-Noguez | University of Wisconsin–Madison | Amador-Noguez | Bioenergy | Early Career

This project integrates advanced mass spectrometry, computational modeling, and metabolic engineering to develop an experimental-computational approach for the in vivo genome-scale determination of Gibbs free energies (ΔG) in metabolic networks, suitable for high-throughput thermodynamic profiling of engineered organisms and emerging model systems.

Successful manipulation of microbial systems for biotechnology applications requires a quantitative understanding of their metabolism. Stable isotope tracers (e.g., 13C, 15N, 18O, and 2H tracers), in combination with mass-spectrometry–based metabolomics, have become a widely used tool for the quantitative analysis of metabolism. Steady-state and dynamic isotope tracer experiments can provide information on metabolic network structure, metabolic fluxes, and thermodynamics of metabolic reactions and pathways.

The use of isotope tracers to estimate ΔG of reactions in central carbon metabolism constitutes a recent development. Rather than relying on measurements of product and reactant concentrations, this approach estimates ΔG from forward (J+) and backward (J−) reaction fluxes via the relation ΔG = −RT ln(J+/J−). The measurement of forward-to-backward flux ratios (J+/J−) can be performed using 2H-labeled and 13C-labeled tracers and relies on the generation of distinctive metabolite labeling patterns by reversible reactions within a pathway. Both the measurement of metabolic fluxes and the estimation of in vivo ΔG of reactions have been previously accomplished by placing the heavy isotopes (most commonly 13C, 15N, or 2H) into nutrient substrates such as 13C- or 2H-labeled sugars, 2H-labeled fatty acids, 15N-labeled amino acids, 15N-labeled ammonia, 13C-labeled CO2 or formic acid, and many others.
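As a rough numerical illustration of this relation, the Python sketch below converts a forward-to-backward flux ratio into ΔG. The function name, the example flux values, and the default temperature of 298.15 K are assumptions made for illustration; they are not taken from the project.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)

def delta_g_from_fluxes(j_forward, j_backward, temperature_k=298.15):
    """Return ΔG in kJ/mol from ΔG = -RT ln(J+/J-).

    temperature_k defaults to 298.15 K, an assumed value for illustration.
    """
    if j_forward <= 0 or j_backward <= 0:
        raise ValueError("fluxes must be positive")
    return -R * temperature_k * math.log(j_forward / j_backward) / 1000.0

# Illustrative fluxes in arbitrary units; only their ratio matters.
near_equilibrium = delta_g_from_fluxes(1.0, 1.0)  # ratio of 1 gives ΔG of zero
forward_driven = delta_g_from_fluxes(10.0, 1.0)   # ratio of 10, about -5.7 kJ/mol

print(f"{forward_driven:.2f} kJ/mol")
```

A ratio close to 1 indicates a reaction operating near equilibrium (ΔG near zero), whereas a large ratio indicates a strongly forward-driven step.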

For some applications, the use of isotopically labeled nutrient substrates may not be feasible due to limited availability and/or high cost. One salient example is metabolic flux analysis in industrially relevant cellulolytic microbes, such as Clostridium thermocellum, that metabolize complex substrates (i.e., lignocellulosic biomass). In this work, the use of deuterated water (also called “heavy water,” 2H2O, or D2O) is explored as a non-nutrient, substrate-agnostic, cost-effective isotope tracer for investigating reversibility of reactions in central carbon metabolism. Researchers reasoned that the use of 2H2O as a tracer (i.e., by growing bacteria in culture media containing a defined amount of 2H2O) can provide information on reversibility of dehydration/hydration reactions, isomerization reactions, aldol reactions, and transamination reactions that result in the incorporation of protons from water into C-H bonds, and thus allow characterization of pathway thermodynamics. The team reports the successful use of deuterated water to investigate the reversibility of glycolytic reactions in three bacterial species of industrial interest: the model bacterium Escherichia coli, the cellulolytic and ethanologenic bacterium C. thermocellum, and the ethanologenic bacterium Zymomonas mobilis, each harboring distinct versions of glycolysis. This work will aid in the construction of accurate metabolic models that incorporate thermodynamic constraints and guide fast rational engineering of microbial networks.

Integrases on Demand | Schoeniger | Sandia National Laboratories | Williams | Biosystems Design | InCoGenTEC

The Intrinsic Control for Genome and Transcriptome Editing in Communities (InCoGenTEC) project, sponsored by the BSSD Secure Biosystems Design initiative, conducts mechanistic studies of gene flows between bacteria encompassing broad phylogenetic diversity and evolutionary time, focusing on mechanisms that permit natural gene delivery and gene integration. In prokaryotes, DNA rearrangements are physiologically and ecologically important and are a potential source of genome instability and loss of biocontainment of engineered features. Goals include comprehensively mapping and classifying bacterial genomic islands, analyzing mechanisms of mobility, and identifying routes of horizontal gene transfer. The team will also use this information to identify prophages suitable as vectors for editing microbial community members and to improve design of synthetic genetic elements that can be integrated into target organisms.

Recent advances in genome editing have stimulated a renewed interest in microbial DNA integrases. Previously, the team published bioinformatic methods for precise high-throughput definition of bacterial and archaeal genomic islands that contain integrases (Mageeney et al. 2020). These methods enable pairing of integrases with their specific attachment site (Att) sequences. As part of InCoGenTEC, the team has mapped ~1 million genomic islands in ~350,000 bacterial genomes, yielding tens of thousands of unique integrase-Att pairs.

Researchers have implemented assays for high-throughput in vitro and in vivo characterization of integrase activity, both to verify activity at predicted Att sites and to investigate integrase dependence on host species factors that might limit the range of horizontal gene transfer. Considerable work has been published on the use of serine integrases for synthetic biology and genome editing of target organisms, typically requiring first the introduction of a heterologous Att site (“landing pad”) into the target genome. Less attention has been paid to the much more abundant tyrosine integrases because of their putative dependence on microbial host factors. Researchers have used the project’s assays to demonstrate activity in E. coli of both serine and tyrosine integrases drawn from broad phylogenetic spans. The team also demonstrated that its methods can be used to identify, from near neighbors of a target strain, integrases that utilize native Att sites in the target genome, simplifying bacterial genome editing and potentially greatly increasing its efficiency. Finally, team members have used modified versions of their assays to mine difficult-to-identify directionality factors (excisionases) from genomic islands. These methods have the potential to significantly improve genome editing efficiency and enable layered genome restructuring in diverse organisms, while improving understanding of the biochemical determinants of the specificity of integrase activity and mobile element host range.

Genomic Analyses and Enzyme Characterization Provide Insights into the Catabolism of Lignin-Related Aromatic Compounds in White-Rot Fungi | Salvachúa | National Renewable Energy Laboratory | Salvachúa | Bioenergy | Early Career

The overall goal of this project is to test the hypothesis that white-rot fungi (WRF) can simultaneously depolymerize lignin extracellularly and catabolize depolymerization products intracellularly as carbon and energy sources. The results from this project will lead to improved understanding of lignin utilization by WRF and enable identification of promising fungal strains and enzymes for lignin catabolism and valorization. To date, researchers have confirmed the utilization of lignin-related compounds as carbon sources by two WRF species and have initiated pathway elucidation via systems biology approaches. To continue this effort, the team is using enzymology approaches and comparative genomic and phylogenetic analyses to validate and broaden the knowledge on the catabolism of lignin-related compounds by WRF.

Plant-derived biomass is the most abundant biogenic carbon source on Earth. Despite this abundance of lignocellulose, only a small clade of organisms known as WRF can efficiently break down both the polysaccharide and lignin components of plant cell walls. This unique ability gives WRF a key role in global carbon cycling. To date, research on WRF has focused almost exclusively on the extracellular enzymes that depolymerize plant polymers, whereas their intracellular metabolism remains underexplored (Kijpornyongpan et al. 2022). This project aims to elucidate intracellular pathways in WRF with a particular focus on aromatic catabolic pathways. Recently, the team confirmed the utilization of two lignin-related compounds (4-hydroxybenzoic acid and vanillic acid) as carbon sources in two species of WRF (Trametes versicolor and Gelatoporia subvermispora). Additionally, the team proposed a catabolic route via the hydroxyquinol pathway from 4-hydroxybenzoic acid to central carbon metabolism, which was informed by differential transcriptomic, proteomic, and metabolomic analyses (del Cerro et al. 2021). This pathway has been considerably less studied than the β-ketoadipate pathway, which is well known in aromatic catabolic bacteria to channel 4-hydroxybenzoic acid into the tricarboxylic acid cycle but which proceeds through different catabolic intermediates. Based on this discovery, researchers were motivated to conduct mechanistic pathway validation and to seek better understanding of the distribution and prevalence of these pathways in WRF and the fungal kingdom generally.

For the enzyme validation work, team members focused on three main proposed steps in the catabolism of 4-hydroxybenzoic acid: oxidative decarboxylation, hydroxylation, and ring-cleavage (del Cerro et al. 2021). To date, five enzymes have been validated via in vitro approaches, and two additional enzymes have also been validated in vivo in a bacterial host. The former five enzymes have been purified, and enzymology analyses have been conducted to understand substrate and cofactor preference as well as inhibitory mechanisms in the presence of various lignin-related compounds. Structural biology efforts are also underway.

In parallel to its enzyme validation efforts, the team sought to understand the distribution of these catabolic activities in WRF and other microbes. For that purpose, researchers performed a large-scale comparative genomic and phylogenetic study across the bacterial and fungal kingdoms. Team members selected protein domains related to specific catabolic activities and sampled 255 bacterial genomes and 317 fungal genomes. Researchers have shown that some of these enzymes are highly conserved in certain fungal lineages and that substrate specificity of aromatic ring-cleavage enzymes has expanded during fungal evolution. In addition, the abilities to depolymerize lignin and catabolize lignin-related aromatic compounds seem to be independent. Lastly, these analyses have also revealed a series of unique enzymes from WRF that may broaden the spatial location of some catabolic reactions with aromatic compounds. These new enzymes are undergoing biochemical characterization. Overall, these studies will provide a deeper understanding of carbon turnover during wood decay and discover enzymes and pathways that could be exploited to convert the undervalued biopolymer lignin into value-added compounds.

RNA Phages: Under-Estimated Players in Soil Ecosystems? | Roux | JGI | Roux | Crosscutting | JGI

The overarching goal of this project is to establish an analytical and experimental framework for comprehensive characterization of viral-driven alteration of microbial metabolisms in soil. The specific results presented here focus on the unexpected diversity of RNA phages detected in soil microbiomes, reveal their specific activity patterns and likely impact on bacterial lysis rate in a model soil ecosystem, and highlight several potential avenues to further characterize these soil RNA phages and their impact on microbiome processes.

Bacteriophages are now recognized as key regulators of microbial communities and processes in virtually all ecosystems, from the human gut to the global oceans. The overwhelming majority of phages described and studied so far, either via cultivation or metagenomics approaches, are double-stranded DNA phages with head-tail virion morphology, i.e., “tailed phages” from the Caudoviricetes class. In contrast, RNA-based bacteriophages are rarely reported or isolated and are typically not considered as important components of environmental microbiomes.

Here, the team combined large-scale data mining and time-series analyses to highlight the unsuspected diversity and potential ecological importance of RNA phages in soils. As part of a global survey of RNA viruses across more than 3,500 metatranscriptomes, researchers identified more than 80,000 potential RNA phages, primarily related to the known leviviruses but also including at least eight proposed new families and genera across several phyla (Neri et al. 2022). This global survey also identified a clear enrichment for leviviruses in wastewater, soil, and rhizosphere samples, suggesting these ecosystems are the primary reservoirs of novel RNA phage diversity. Through the specific analysis of a multiomics time series from East River (Colorado) watershed soils, researchers now demonstrate that RNA phage populations in these soils are highly diverse and show activity levels and patterns similar to those of dsDNA phages. In particular, team members have observed an increase in RNA phage activity throughout the plant growing season, concomitant with an expected increase in microbial activity. Given the absence of known and predicted lysogenic cycles for RNA phages, they may contribute substantially to the overall bacterial cell lysis and nutrient cycling in the East River watershed, although their exact host range and infection dynamics remain to be characterized. Finally, based on these new RNA phage genomes, researchers identified novel protein families potentially associated with alternative mechanisms for host cell attachment and lysis. Ongoing computational and experimental characterization of these new attachment and lysis proteins should provide further insights into the potential host range of these RNA phages, and possibly reveal new molecules of biotechnological interest such as new single-gene lysis proteins.

Taken together with other recent surveys and studies (Callanan et al. 2018), these results suggest that RNA phages should be more broadly considered and included in viral ecology studies, especially in soils, and that they represent promising sources of novel genes and molecules for biotechnological applications.

Quantum Diamond EcoFAB Microscope for In Situ NMR of Root Exudate Molecules | Ajoy | University of California–Berkeley | Gilbert | Bioimaging

The scientific goal of this project is to understand the causal relationships between root exudation, rhizosphere processes such as microbial nutrient cycling, and plant health. To achieve this goal, this project will design and validate a prototype “quantum sensing microscope” that is integrated within Fabricated Ecosystems (EcoFABs) for the chemical imaging of rhizosphere processes (Sasse et al. 2019). Quantum control of electronic and nuclear spins in diamond will yield sensors that non-destructively measure nuclear magnetic resonance (NMR) spectra of 13C-labeled root exudates and natural abundance 31P species.

The team has designed, constructed, and optimized a confocal quantum sensing microscope for rhizosphere bioimaging at LBNL. This confocal microscope has a spatial resolution better than 200 nm and uses the nitrogen-vacancy (NV) centers in a single-crystal diamond as the quantum NMR sensors, which are controlled and interrogated using optically detected magnetic resonance (ODMR) spectroscopy. Researchers have achieved a long NV coherence time of over 5 ms and have observed strong hyperfine couplings with naturally abundant 15N nuclei, indicating the potential of enhanced sensitivity with hyperpolarization (Ajoy et al. 2018). Researchers have developed a software user interface with all the pulse sequence control protocols (Rabi oscillation, spin-echo, dynamical decoupling XY8-N, correlation spectroscopy, and NV-NMR coherently averaged synchronized readout) to perform a series of ODMR measurements of exemplary metabolite samples (Bucher et al. 2019; Glenn et al. 2018). The team has experimentally demonstrated the coherent control of NV electron spins for detection of paramagnetic ions and NMR-active nuclei. These studies were performed in aqueous solutions with sub-picoliter volumes at high throughput (in seconds) and with a negligible total power input, allowing the planned non-invasive rhizosphere observations. These results will be presented at Goldschmidt Lyon 2023, an international geochemistry conference.

Improving Candidate Gene Discovery by Combining Multiple Genetic Mapping Datasets | Rellán-Álvarez | North Carolina State University | Rellán-Álvarez | Bioenergy | University

(1) Perform an environmental GWAS in a panel of ~2000 sorghum accessions that have already been genotyped and georeferenced using phosphorus availability and early season cold stress as the phenotypes for the GWAS analysis.

(2) Characterize the genetic architecture of lipid content during the early stages of sorghum development using the SAP. The team will sequence the SAP accessions at 10–15X and make these data available. Researchers will perform a GWAS on lipid content under both stress conditions (low temperature and low phosphorus).

(3) Develop algorithms that incorporate all the different types of information collected (i.e., metabolite levels, GWAS candidate genes, selection signals) to improve the ability to detect signals of small effects and increase confidence in the selection of candidate genes. The algorithms and pipelines developed here will be made available to the community as R packages.

With a growing wealth of genetic datasets generated by next-generation sequencing, coupled with the advent of large plant phenotyping datasets, exciting new avenues of investigation have opened for understanding complex traits shaped by the environment. Genome-wide association studies (GWAS) identify associations between single nucleotide polymorphisms (SNPs) and a phenotype, but complex biological processes involve multiple phenotypes. Researchers have previously identified lipid variation associated with maize adaptation to the Mexican highlands, where maize has adapted to low phosphorus and cold. The team is now using high-dimensional Sorghum bicolor genetic datasets to (1) perform environmental GWAS for various soil phosphorus phenotypes (availability, concentration, and solubility) in the African region, (2) measure Fst in the same panel between accessions adapted to high and low phosphorus, and (3) profile various lipid concentrations (LC/MS) under low and high phosphorus in the Sorghum Association Panel (SAP) for subsequent metabolomic GWAS. Comprehensive research on the complete genetic architecture is laborious, costly, and time-intensive because of the overwhelming number of genes and their regulatory networks, the different phenotypes explaining the same adaptive process, and the multidimensional genomic datasets involved. Hence, there is a need for a robust statistical framework that can combine information from different experiments and genomic datasets at the level of individual p-values, aggregating multiple small and large gene effects and reprioritizing candidate genes. To this end, researchers use the Cauchy distribution to define a test statistic as a weighted sum of Cauchy transformations of individual p-values, which can be used to combine p-values across different datasets. Researchers are working towards creating such a framework through R packages that will be publicly available.
Finally, researchers hope to have an accurate description of the molecular mechanisms involving phosphorus. Team members also hope to test whether lipids in sorghum play a similar adaptive role in Africa, and whether there is convergence between the sorghum and maize in the plausible molecular mechanisms for such an adaptation.
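
The Cauchy-based combination of p-values described above can be illustrated with a minimal sketch. It assumes the standard form of the Cauchy combination test, in which each p-value is mapped to tan((0.5 - p)π), the transformed values are averaged with weights that sum to one, and the combined p-value is read from the standard Cauchy tail. The function name, the equal default weights, and the example p-values are illustrative assumptions and are not taken from the team's R packages.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with a weighted Cauchy combination statistic.

    Illustrative sketch only: names, default weights, and clipping are
    assumptions, not the project's published implementation.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")

    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm_w = [w / total for w in weights]  # normalize so weights sum to 1

    eps = 1e-15  # keep p strictly inside (0, 1) so tan() stays finite
    stat = 0.0
    for w, p in zip(norm_w, p_values):
        p = min(max(p, eps), 1.0 - eps)
        stat += w * math.tan((0.5 - p) * math.pi)

    # Upper tail of the standard Cauchy distribution at `stat`.
    return 0.5 - math.atan(stat) / math.pi


# Hypothetical p-values for one gene from three analyses
# (e.g., environmental GWAS, Fst scan, lipid GWAS).
combined = cauchy_combine([0.01, 0.20, 0.80])
print(f"combined p-value: {combined:.4f}")
```

One reason this form suits the use case is that the combined p-value stays approximately valid when the input p-values are correlated, which is likely when several analyses are run on the same panel.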

The Use of Synthetic Communities Reveals Disturbance of Process Partitioning Among Denitrifying Microbes Leads to Increased Nitrous Oxide Emissions | Adams | Lawrence Berkeley National Laboratory | Valenzuela | Environmental Microbiome | ENIGMA

The Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) Science Focus Area (SFA) uses a systems biology approach to understand the interaction between microbial communities and the ecosystems they inhabit. To link genetic, ecological, and environmental factors to the structure and function of microbial communities, ENIGMA integrates and develops laboratory, field, and computational methods. Thus, ENIGMA has been organized into several campaigns involving multiple institutes with varying expertise. An overarching goal of the Environmental Simulations and Modelling Campaign is to simulate, model, and predict the mechanistic underpinnings of field-observed phenomena. This includes characterizing the process partitioning of N2O emissions in varying ecological contexts (pH, metal availability, oxygen, and phage attack) using field isolates assembled into synthetic communities (SynComs).

The Field Research Center (FRC) at Oak Ridge, Tenn., has some of the highest subsurface nitrate concentrations (>10 g/L) ever recorded. This pool of subsurface nitrate, a remnant of legacy activities, can end up as the greenhouse gas nitrous oxide (N2O) via incomplete denitrification or as nitrogen gas (N2) when completely denitrified. Based on spatiotemporal field surveys of biogeochemistry, hydrology, metagenomics, and activity measurements, the ENIGMA team is formulating mechanistic hypotheses to explain ecologically important phenomena, such as the increased N2O emissions in wells that transiently transition from a neutral to an acidic pH. Here, using the N2O emission phenomenon as an example, researchers explain how such phenomena are being dissected through laboratory studies using synthetic communities assembled from field isolates of relevant organisms. The ultimate goal is to characterize the mechanism(s) responsible for each field phenomenon and test the generalizability of findings back at the field site.

Analysis of FRC field isolates with denitrifying capabilities showed that more than half of the isolates were missing at least one step in the denitrification pathway. Researchers, therefore, hypothesized that multiple microbes with complementary enzymatic steps likely work together in communities to complete denitrification through the exchange of nitrogen intermediates. Further, the team hypothesized that different abiotic and biotic factors that inhibit specific enzymatic reactions would likely decouple complete denitrification, contributing to significant levels of N2O emissions at the FRC. To test this hypothesis, the team established a synthetic community (SynCom) of two field isolates, Rhodanobacter sp. R12 and Acidovorax sp. 3H11, which can perform complete denitrification together but not independently. A cross-campaign initiative was then launched to elucidate different mechanisms of abiotic control, including pH shifts, microaerobic environments, and metal availability. Using time course experiments, researchers determined that a shift from pH 7 to pH 6 or an increase in nickel (Ni) concentration was enough to decouple the complete denitrification process of the SynCom, each by a different mechanism. Strikingly, both perturbations resulted in significant increases in N2O emissions. Transcriptome analysis of the SynCom at differing pH or Ni conditions suggests dynamic changes in community composition and physiological states. Current experiments are focused on shifts in pH at varying C/N ratios, oxygen, and even phage attack that may result in the decoupling of denitrification partitioning. Transmission electron microscope imaging suggests very different morphologies between the two field isolates that may play an essential role in carbon, nitrogen, and phosphorus fluxes between the organisms.
Together, all these efforts are leading to the construction of a context-specific gene regulatory network that can be used to predict how environmental fluctuations at the field site will impact emissions of N2O.
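
The idea of process partitioning can be made concrete with a small sketch. It treats denitrification as the four-step chain NO3- to NO2- to NO to N2O to N2, finds pairs of incomplete denitrifiers that jointly cover every step, and shows how blocking one step changes the end product. The isolate names and their step assignments are hypothetical placeholders, not the measured gene content of the FRC isolates, and the choice of which step is inhibited is purely illustrative.

```python
from itertools import combinations

# Nitrogen species along the denitrification chain, in order.
SPECIES = ["NO3-", "NO2-", "NO", "N2O", "N2"]
# Each step is a (substrate, product) pair between neighbouring species.
STEPS = list(zip(SPECIES, SPECIES[1:]))


def end_product(available_steps, inhibited=frozenset()):
    """Walk the chain from nitrate until a step is missing or inhibited."""
    active = set(available_steps) - set(inhibited)
    product = SPECIES[0]
    for step in STEPS:
        if step not in active:
            break
        product = step[1]
    return product


def complementary_pairs(isolates):
    """Return pairs that complete the pathway together but not alone."""
    pairs = []
    for a, b in combinations(sorted(isolates), 2):
        alone_a = end_product(isolates[a]) == "N2"
        alone_b = end_product(isolates[b]) == "N2"
        together = end_product(isolates[a] | isolates[b]) == "N2"
        if together and not alone_a and not alone_b:
            pairs.append((a, b))
    return pairs


# Hypothetical step inventories for three incomplete denitrifiers.
isolates = {
    "isolate_A": {("NO3-", "NO2-"), ("NO", "N2O"), ("N2O", "N2")},
    "isolate_B": {("NO2-", "NO"), ("NO", "N2O")},
    "isolate_C": {("NO3-", "NO2-")},
}

print(complementary_pairs(isolates))
community = isolates["isolate_A"] | isolates["isolate_B"]
print(end_product(community))
# Illustrative perturbation: the last step is blocked, so N2O accumulates.
print(end_product(community, inhibited={("N2O", "N2")}))
```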

Design and Omics Exploration of Synthetic Microbial Communities in KBase | Ranjan | Oak Ridge National Laboratory | Ranjan | Computational Biology | KBase

Simple constructed communities with desired biological functions can be used to study bacterial processes involved in community establishment and mimic the behavior of natural communities to increase plant growth and disease resistance. In this collaboration with the Plant-Microbe Interfaces (PMI) science focus area (SFA) at Oak Ridge National Laboratory, team members are adding datasets and apps to KBase to simplify the selection of isolates for constructed community experiments. For designing the community, researchers can use KBase Apps that annotate genomes with plant growth–promoting traits and secondary metabolism classes as well as an app that calculates metabolic dependencies between microbes of interest. The design process will be tested using experimental systems established in the PMI SFA for studying constructed communities. Further, the results from these experiments will be integrated back into KBase to improve the mechanistic understanding of interactions and iteratively improve the design process.

Simple constructed communities offer a key experimental platform for examining how environmental perturbations affect the structure of the microbiome and host physiology and productivity. Computational simulations are a necessary tool to guide the design of simplified constructed communities. KBase, a DOE BER-funded, public, and freely accessible software and data science platform with a rich user interface, is ideal for developing such computational tools since it already offers a large and increasing number of diverse tools: e.g., functional annotation, metabolic modeling, auxotrophy prediction, substrate utilization and production of byproducts, taxonomic information, and prediction of microbial traits. With these tools, it is possible to gain some insight into a genome’s biochemistry and general characteristics based only on its sequence. The following user interface applications (apps) are under development to expand the set of KBase tools in support of constructed community studies.

The first app, Annotate Genomes with Plant Growth Promoting Traits, annotates genes in a genome based on the Plant Growth Promoting Trait ontology (PGPT). PGPT is a literature- and omics-curated, comprehensive, and hierarchical collection containing 6,900 PGPTs associated with 6,965,955 protein sequences. The ontology has several categories, such as phytohormone production, plant signal production, bio-fertilization (potassium solubilization, iron acquisition), bioremediation (fluoride, heavy metal detoxification), colonizing plant system (chemotaxis, surface attachment, root colonization), plant immune response stimulation, stress control, and competitive exclusion (quorum sensing, bacterial fitness, cell envelope remodeling). This app can help a user prioritize and select genomes for constructed community experiments.

The app Annotate Genomes with Secondary Metabolism Classes uses antiSMASH (1), a popular genomics tool, to annotate genes in a genome with secondary metabolism classes and identify biosynthetic gene clusters within the microbial genomes of interest. The biosynthetic gene cluster profile can be used to generate hypotheses regarding certain organisms having a higher potential for antagonistic and antimicrobial activity. This information can then be used to form subsequent hypotheses regarding the membership of stable communities. The team will also explore outputs from the JGI Secondary Metabolite Collaboratory as the two groups work together to integrate the analysis workflow into KBase.

The app Calculate Metabolic Interaction Score builds pairwise metabolic models in a given environment and calculates several quantitative metrics that describe competitive and cooperative potential between the paired microbes. These metrics include: (1) the metabolic interaction potential (MIP), which approximates the potential for syntrophic cooperation between the organisms; (2) the metabolic resource overlap (MRO), which approximates competition for media substrates between the organisms; and (3) predicted growth rates of the isolates and community, which leverages other KBase Apps to approximate growth dynamics of the given community.
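
To give a feel for what the two interaction metrics capture, the sketch below computes toy versions of them from plain sets of compounds. This is a deliberately simplified illustration and not the algorithm used by the KBase app, which builds pairwise metabolic models. Here, resource overlap is assumed to be the Jaccard overlap of the two organisms' required substrates, and interaction potential is assumed to be the count of required compounds that the partner can supply. The class, function names, and compound lists are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organism:
    """Toy stand-in for a metabolic model (hypothetical structure)."""
    name: str
    requires: frozenset  # compounds the organism must take up
    secretes: frozenset  # compounds the organism can release


def resource_overlap(a: Organism, b: Organism) -> float:
    """Toy MRO: Jaccard overlap of required substrates (0 = none, 1 = identical)."""
    union = a.requires | b.requires
    if not union:
        return 0.0
    return len(a.requires & b.requires) / len(union)


def interaction_potential(a: Organism, b: Organism) -> int:
    """Toy MIP: number of required compounds that the partner can provide."""
    from_b = a.requires & b.secretes
    from_a = b.requires & a.secretes
    return len(from_b | from_a)


# Hypothetical compound sets for two isolates.
iso1 = Organism(
    "isolate_1",
    requires=frozenset({"glucose", "ammonium", "phosphate", "thiamine"}),
    secretes=frozenset({"acetate", "cobalamin"}),
)
iso2 = Organism(
    "isolate_2",
    requires=frozenset({"acetate", "ammonium", "phosphate", "cobalamin"}),
    secretes=frozenset({"thiamine"}),
)

print(f"toy MRO: {resource_overlap(iso1, iso2):.2f}")
print(f"toy MIP: {interaction_potential(iso1, iso2)}")
```

In this toy setting, a pair with low overlap and high interaction potential would be a candidate for a cooperative, stable pairing, whereas high overlap suggests competition for the same medium components.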

The aforementioned apps will be used by PMI researchers to prioritize genomes for constructed community experiments and generate testable hypotheses: e.g., the apps can identify microbial functions and possibly redundant community members, which can be tested by replacing one member with another microorganism whose functional and metabolic profiles are predicted to be similar or different. Team members have also uploaded 550 PMI isolate genomes to KBase and are exploring the design of an improved user interface for searching, filtering, and selecting genomes for constructed communities. These datasets and apps will expedite the design-build-test-learn cycles of PMI SFA projects and enable new scientific explorations.

Microbes Persist: Towards Quantitative Theory-Based Predictions of Soil Microbial Fitness, Interaction and Function in KBase | Pett-Ridge | Lawrence Livermore National Laboratory | Kimbrel | Environmental Microbiome | Microbes Persist SFA

Microorganisms play key roles in soil carbon turnover and stabilization of persistent organic matter via their metabolic activities, cellular biochemistry, and extracellular products. Microbial residues are the primary ingredients in soil organic matter (SOM), a pool critical to Earth’s soil health and climate. The team hypothesizes that microbial cellular chemistry, functional potential, and ecophysiology fundamentally shape soil carbon persistence, and team members are characterizing this via stable isotope probing (SIP) of genome-resolved metagenomes and viromes. Researchers are focusing on soil moisture as a master controller of microbial activity and mortality, since altered precipitation regimes are predicted across the temperate United States. This science focus area’s (SFA) ultimate goal is to determine how microbial soil ecophysiology, population dynamics, and microbe-mineral-organic matter interactions regulate the persistence of microbial residues under changing moisture regimes.

This SFA has pioneered methods that quantify element fluxes with taxonomic resolution and has proposed to integrate these into KBase. In particular, quantitative stable isotope probing (qSIP) allows researchers to evaluate the in situ activity of individual taxa in complex communities by adding isotope tracers such as 18O-enriched heavy water or 13C-enriched compounds. Researchers have refactored a computational workflow that accepts either amplicon or metagenomic sequence SIP input and calculates atom fraction excess (enrichment) as well as growth and mortality rates for individual amplicon sequence variants (ASVs) and genomes assembled from metagenomes (MAGs and viral OTUs). Experiments using 18O-H2O labeling and qSIP provide critical information on organism growth rates and mortality in situ. The analytical pipelines the team is developing within KBase establish a standard qSIP analytical workflow and a qSIP database suitable for robust cross-site comparisons and for model benchmarking. The workflow will enable uniform bioinformatics and calculations of qSIP data (e.g., a uniform approach to density shift calculations) within the existing quantitative insights into microbial ecology (QIIME) platform, and the database will facilitate robust comparisons across experiments. Integration within KBase will support analyses that compare traits of organisms with their performance in nature across environments.
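
The density shift calculation mentioned above can be sketched as follows. The sketch assumes the common qSIP approach of computing, for each taxon, an abundance-weighted mean buoyant density across the fractions of a density gradient, and then taking the difference between the labeled and unlabeled treatments. The function names and all numbers are hypothetical; converting the shift to atom fraction excess requires additional molecular-weight relationships that are not reproduced here.

```python
def weighted_mean_density(densities, abundances):
    """Abundance-weighted mean buoyant density (g/mL) of one taxon in one tube."""
    if len(densities) != len(abundances):
        raise ValueError("one abundance value is needed per fraction")
    total = sum(abundances)
    if total <= 0:
        raise ValueError("taxon was not detected in any fraction")
    return sum(d * a for d, a in zip(densities, abundances)) / total


def density_shift(unlabeled, labeled):
    """Shift in weighted mean density between labeled and unlabeled tubes.

    Each argument is a (densities, abundances) pair for the same taxon.
    """
    return weighted_mean_density(*labeled) - weighted_mean_density(*unlabeled)


# Hypothetical fraction densities (g/mL) and taxon copy numbers per fraction.
fractions = [1.68, 1.70, 1.72]
unlabeled_tube = (fractions, [2.0, 6.0, 2.0])  # natural-abundance water
labeled_tube = (fractions, [1.0, 3.0, 6.0])    # 18O-enriched water

shift = density_shift(unlabeled_tube, labeled_tube)
print(f"density shift: {shift:.4f} g/mL")
```

A taxon whose DNA shifts toward heavier fractions in the labeled treatment has incorporated the tracer, which is the signal that the downstream enrichment and growth calculations build on.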

The qSIP pipeline is fully integrated with a genomes-to-traits workflow (microTrait) and compatible with a dynamic energy budget–based trait-based model (DEBmicroTrait). With microTrait and DEBmicroTrait, team members have developed and tested a computational workflow to (1) infer ecologically relevant traits from microbial genomes, (2) systematically reduce the high-dimensionality of genome-level microbial trait data by inferring functional guilds (sets of organisms performing the same ecological function irrespective of their phylogenetic origin), (3) quantify within-guild trait variance and capture trait linkages in trait-based models, (4) explore trait-based simulations under different scenarios with varying levels of microbial community and environmental complexity, and (5) benchmark emergent model substrate utilization (digested as chemical abundance data) and qSIP-derived growth and mortality rates (from qSIP database).

Ongoing work to combine both the qSIP and DEBmicroTrait tools within KBase will provide a strong foundation for researchers who wish to use quantitative in situ measurements of microbial ecophysiology and population dynamics to benchmark models and build a predictive understanding of biological processes controlling material fluxes in complex environments.

ENIGMA Long Read Sequencing and Assembly for Microbial Genomes: KBase Integration for Assembly and LISA Workshop | Adams | Lawrence Berkeley National Laboratory | Lui | Environmental Microbiome | ENIGMA

Achieving a causal understanding of a microbial system requires mapping the mechanisms by which organisms grow, cooperate, and compete in complex environments. These mechanisms include ecological phenomena and abiotic factors that influence behavior and survival. One of the critical requirements for reaching this level of understanding is fully resolving the genomes of the community so that the functional roles specified by their genomes can be assayed and discovered. While the challenges of gene functional annotation and linking genotype and phenotype loom beyond simply obtaining genomes, the underlying challenge at present remains generating high-quality genomes for microbial isolates. The base genome, along with its relative abundance, constitutes the most important foundational data needed to infer and parameterize models of microbial system dynamics.

The Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) Science Focus Area (SFA) has developed pipelines for sequencing and assembly of long-read data from microbial isolates and metagenomes to help achieve the goal of causal microbial ecology. Researchers have developed the capability to isolate diverse organisms, extract the high-molecular weight DNA needed for single-molecule long-read sequencing, and perform the sequencing using Oxford Nanopore Technologies MinION and PromethION sequencers. To characterize the microbial diversity and activity at the Oak Ridge Reservation at Oak Ridge National Laboratory, ENIGMA anticipates isolating thousands of bacteria and archaea, as well as generating spatio-temporal series of fully resolved enrichments and metagenomes from the site. These sequencing projects assist the goals of linking genotype to phenotype and understanding the temporal, dynamic, and complex factors influencing microbial community structure and activity at the research site. ENIGMA uses isolates to help link genotype to phenotype by analyzing genomes in conjunction with transposon mutant libraries, metabolomics, and growth condition data. High-quality genomes are essential for these types of experiments and ENIGMA science.

Researchers are currently adding new functionality to the DOE Systems Biology KnowledgeBase (KBase) by implementing tools that use long-read data for isolate assembly and methylation detection. By developing workflows within KBase, these tools will be more broadly available across the ENIGMA SFA and to other scientists, especially those who do not specialize in computational methods. These apps and workflows will enable ENIGMA, as well as other DOE SFAs and microbiologists, to (1) address scientific questions that would otherwise be infeasible with isolate assemblies using only short reads, (2) track provenance of data and methods used for assembly, and (3) share assemblies across the SFA for collaborations. This new functionality will also provide a foundation for further extensions in KBase to support developments in long-read technology. Currently, the genome assembler Unicycler has been released as a KBase app and Flye is in development. Filtlong, a read quality tool, and Polypolish, an assembly polishing tool, are in beta.

In Summer 2023, the team will be holding the Long-read Isolate Sequencing and Assembly (LISA) Workshop at LBNL. This workshop will teach participants how to go from a microbial isolate to sequencing to assembly of a genome. The workshop will have a wet lab session to learn how to extract high-molecular weight DNA, make nanopore libraries, and run a MinION sequencer. A separate computational session will be held on base calling of nanopore data, genome assembly using long read sequencing data, and genome annotation using KBase. Scientists from computational, modeling, and bench backgrounds are encouraged to attend both sessions. This workshop will be designed to accommodate learners from these diverse technical backgrounds.

Three-Dimensional High Spatial Resolution Simulation for Groundwater Flow and Nitrogen Transport Under Rainfall Perturbations in the Subsurface of Area 3 | Adams | Lawrence Berkeley National Laboratory | Im | Environmental Microbiome | ENIGMA

The Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) Science Focus Area (SFA) uses a systems biology approach to understand the interaction between microbial communities and the ecosystems that they inhabit. To link genetic, ecological, and environmental factors to the structure and function of microbial communities, ENIGMA integrates and develops laboratory, field, and computational methods.

Uranium and nitrate contaminant transport in Area 3 near the S3 ponds at ORNL is investigated through 3D field-scale modeling and simulation. This project leverages a recently acquired Cone Penetration Testing (CPT) dataset, which provides the hydraulic conductivity field of Area 3 at high spatial resolution as input to the numerical subsurface model. The CPT data shed light on the local heterogeneity of subsurface materials, significantly decreasing the uncertainty due to limited sampling points. By further using survey data from 27 wells collected by the ENIGMA SFA (e.g., meteorological, hydrological, microbial, and geochemical datasets), a hydrogeological model is built in PFLOTRAN and run on the high-performance computing systems of the National Energy Research Scientific Computing Center. Generalized stoichiometries are used for biogeochemical reactions related to nitrogen cycling. Computational results of the 3D field-scale simulation show (1) different flow and transport regimes depending on subsurface materials, (2) impacts of rainfall events on nitrous oxide emission, and (3) influential controls of flow conditions identified through sensitivity analysis, enabling a full treatment of the ModEx approach to designing and implementing the Subsurface Observatory (SSO). The results help researchers understand nitrogen cycling in Area 3 and determine the location of the ENIGMA SSO site. Furthermore, the results will be compared to omics-informed modeling and simulation as planned in the Framework for Integrated, Conceptual, and Systematic Microbial Ecology (Lui et al. 2021).

The Path from Root Input to Mineral-Associated Soil Carbon is Dictated by Habitat-Specific Microbial Traits | Pett-Ridge | Lawrence Livermore National Laboratory | Foley | Environmental Microbiome | Microbes Persist SFA

Microorganisms play key roles in soil carbon turnover and stabilization of persistent organic matter via their metabolic activities, cellular biochemistry, and extracellular products. Microbial residues are the primary ingredients in soil organic matter (SOM), a pool critical to Earth’s soil health and climate. The team hypothesizes that microbial cellular-chemistry, functional potential, and ecophysiology fundamentally shape soil carbon persistence, and researchers are characterizing this via stable isotope probing (SIP) of genome-resolved metagenomes and viromes. The team is focusing on soil moisture as a master controller of microbial activity and mortality since altered precipitation regimes are predicted across the temperate United States. This science focus area’s ultimate goal is to determine how microbial soil ecophysiology, population dynamics, and microbe-mineral-organic matter interactions regulate the persistence of microbial residues under changing moisture regimes.

Soil microorganisms influence the global carbon balance by transforming plant inputs into mineral-associated organic matter (MAOM), but which microbial traits control mineral-associated SOC storage is widely debated. While current theory and biogeochemical models have settled on microbial carbon-use efficiency and growth rate as positive predictors of mineral-associated SOC accrual, empirical tests are sparse and show contradictory observations. To investigate the relationship between different microbial traits and MAOM, researchers conducted a 12-week 13C tracer study to track the movement of rhizodeposits and root detritus into microbial communities and SOM pools under moisture-replete (15 ± 4.2%) or water-limited (8 ± 2%) conditions. Using a continuous 13CO2-labeling growth chamber system, researchers grew the annual grass Avena barbata for 12 weeks and measured formation of 13C-MAOM from either 13C-enriched rhizodeposition or decomposing 13C-enriched root detritus. The team also measured active microbial community composition (via 13C-quantitative stable isotope probing; qSIP); a suite of microbial traits including carbon-use efficiency, growth rate, and turnover (via the 18O-H2O method), extracellular enzyme activity, bulk 13C-extracellular polymeric substances (EPS), and total microbial biomass carbon (13C-MBC); and the chemical composition of MAOM via 13C-nuclear magnetic resonance (NMR) and Fourier-transform ion cyclotron resonance mass spectrometry (FTICR-MS).

In the microbial habitat around living roots (rhizosphere), the activity of bacterial-dominated communities with fast growth, high biomass, and high production of extracellular polymeric substances was positively associated with the accrual of 13C-mineral-associated SOC under normal moisture conditions. However, under drought, the rhizosphere and the microbial habitat around decaying roots (detritusphere) had more fungal-dominated communities, with slower growth, lower carbon-use efficiency, and higher exoenzyme activity, that were positively associated with 13C-mineral-associated SOC. 13C-qSIP revealed that bacterial taxa from the families Bacillaceae, Bradyrhizobiaceae, and Comamonadaceae were particularly active in the rhizosphere, whereas filamentous fungi (families Ceratostomataceae, Lasiosphaeriaceae, and Pleosporaceae) were dominant decomposers in the detritusphere. FTICR-MS and 13C-NMR indicated that rhizosphere MAOM had a higher O/C ratio and greater amounts of lipids and carbohydrates than detritusphere MAOM, whereas detritusphere MAOM had a greater abundance of lignin-like compounds. Together, these results suggest a more microbially processed signature of MAOM in the rhizosphere and a more plant-derived signature in the detritusphere.

Overall, these findings emphasize that microbial traits linked with SOC storage vary with soil habitat and moisture conditions—a fact that emerging SOC models should explicitly reflect, since living versus decaying root ratios and moisture regimes will shift under a changing climate.

ENIGMA Environmental Atlas: An Integrated Approach to Linking Microbial Genotype to Phenotype in a Dynamic Subsurface Ecosystem | Adams | Lawrence Berkeley National Laboratory | Chakraborty | Environmental Microbiome | ENIGMA

The goal of the Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA) Science Focus Area (SFA) is to develop theoretical, technological, and scientific approaches to gain a predictive and mechanistic understanding of the biotic and abiotic factors that constrain microbial communities’ assembly and activity in dynamic environments. To link genetic, ecological, and environmental factors to the structure and function of microbial communities, ENIGMA uses a systems biology approach to integrate and develop laboratory, field, and computational methods.

Despite decades of research, there are still significant gaps in the fundamental understanding of microbes, their interactions to form communities, and their relationships with the environment. To address these gaps, this project is constructing an ‘ENIGMA Environmental Atlas,’ a resource that enables mapping genotype to phenotype for a significant number of diverse subsurface microbes at the research field site, the Oak Ridge Reservation Field Research Center. This Atlas includes a growing collection of over 2,200 microbial isolates representing 36 orders and 895 unique strains from the site. Genome sequencing of 750 isolates to date has revealed both macro- and microdiversity. High-resolution electron microscopy images reveal unique morphotypes and features. Researchers have established genetic toolkits and genome-wide mutant libraries in 25 diverse isolates to date and are using these resources to annotate genes of unknown function and interrogate physiological responses to environmental stressors.

The team has combined several statistical analyses using field environmental and sequencing-based metadata to identify high-priority targets for deeper isolation and characterization efforts based on abundance, community correlation, and other metrics. The team is developing diverse assays to investigate the physiology of these ‘most wanted’ microbes, including high-throughput carbon utilization, metal toxicity thresholds, biofilm formation, and exometabolomic profiling. Exometabolomic profiling of 135 isolates confirmed that substrate use is phylogenetically conserved. Together, these measurements help researchers understand how microbes interact with each other and determine whether field measurements of co-occurrence coincide with positive interactions among isolates, advancing the ability to predict community function from metagenomic and amplicon sequence variant data.

The team presents progress thus far on the development of this unique community-usable platform and highlights several instances where the Atlas can be used to better understand the complexities that govern microbial community structure and function in the environment.

Functional Succession of Growing Soil Microorganisms and Virus-Driven Mortality Following Rewetting in a California Grassland Soil | Pett-Ridge | Lawrence Livermore National Laboratory | Blazewicz | Environmental Microbiome | Microbes Persist SFA

Microorganisms play key roles in soil carbon turnover and stabilization of persistent organic matter via their metabolic activities, cellular biochemistry, and extracellular products. Microbial residues are the primary ingredients in soil organic matter (SOM), a pool critical to Earth’s soil health and climate. Researchers hypothesize that microbial cellular chemistry, functional potential, and ecophysiology fundamentally shape soil carbon persistence and are characterizing this via stable isotope probing (SIP) of genome-resolved metagenomes and viromes. The team focuses on soil moisture as a master controller of microbial activity and mortality since altered precipitation regimes are predicted across the temperate United States. This science focus area’s ultimate goal is to determine how microbial soil ecophysiology, population dynamics, and microbe-mineral-organic matter interactions regulate the persistence of microbial residues under changing moisture regimes.

Rewetting of soil stimulates a succession of microbial growth and mortality. This process could become more frequent because climate change in semi-arid zones is predicted to lead to fewer rain events, potentially allowing for soil dry-down between events. It has been hypothesized that certain microbial traits, such as degradation of carbohydrates and acquisition of nitrogen, underlie this succession and confer advantages for growth as both the soil microbial community and available resources change over time. Researchers presume that the initial burst of mortality following wet-up is driven by osmotic lysis due to the rapid change in osmotic pressure, while continued mortality after the first few hours is driven by viruses or other biological factors. Researchers also hypothesize that the summer dry-down would drive phages to integrate into host chromosomes and that wet-up of dry soil might serve as an environmental inducer of temperate phages.

To determine the mechanisms driving microbial growth and mortality during wet-up, researchers performed a wet-up experiment using soils that had been previously 13CO2-labeled and maintained under one of two precipitation regimes: the historical average precipitation (100%) and a 50% water reduction. Following the annual summer dry period, soils were collected and incubated with multiple isotopic treatments. Heavy water (18O-H2O) additions were used to specifically target the growing portion of the microbiome and virome. Samples were harvested at six times following rewetting (0, 3, 24, 48, 72, and 168 hr) for DNA–quantitative stable isotope probing (qSIP), metagenomics, metatranscriptomics, viromics, and CO2 production.

While total soil respiration did not vary between soils exposed to 100% versus 50% precipitation, respiration of new (labeled) rhizodeposits was higher in the 100% soils, implying functional differences between precipitation groups. This result was supported by large differences in taxon-specific responses for bacterial growth and mortality and differential abundance of traits found in growing (18O-labeled) microorganisms for the two precipitation treatments. Differential abundance of traits also revealed a large difference in the functional potential of the microbiome at the end of the dry season. Surprisingly, this legacy effect disappeared after one week, indicating that functional capacity converged regardless of prior conditions. Temporal changes were observed in the abundance of genes coding for carbohydrate-active enzymes in growing organisms, implying that substrate availability varied with time. Genes coding for synthesis and export of EPS were more abundant in growing organisms than in the total community, in alignment with the hypothesis that this function could provide a fitness advantage during both the dry-down and wet-up.

In comparison to temporal abundance patterns in microorganisms, viruses displayed higher spatial heterogeneity in addition to temporal community changes. Quantitative isotope tracing, time-resolved metagenomics and viromic analyses indicated that dry soil held a diverse but low biomass reservoir of virions, of which only a subset thrived following wet-up. Viral richness decreased by 50% within 24 hours post wet-up, while viral biomass increased at least four-fold within one week. Counter to recent hypotheses suggesting temperate viruses predominate in soil, the team’s evidence indicates that wet-up is dominated by viruses in lytic cycles. Researchers estimate that viruses drive a measurable and continuous rate of cell lysis with up to 46% of microbial death driven by viral lysis one week following wet-up. Results show that viruses contribute to a significant portion of soil microbial biomass turnover and the widely reported CO2 efflux following wet-up of seasonally dry soils.

In summary, the team observed temporal changes in growing microbial and viral communities following wet-up that were underpinned by succession of organic carbon degradation capabilities as well as by lytic infection by viruses.

Conversion of Lignocellulosic Plant Biomass into Industrial Chemicals via Metabolic Engineering of Two Extreme Thermophiles, Caldicellulosiruptor bescii and Pyrococcus furiosus | Adams | University of Georgia | Bing | Environmental Microbiome | University

This project aims to metabolically engineer two extreme thermophiles, Caldicellulosiruptor bescii (Tmax 90°C) and Pyrococcus furiosus (Tmax 103°C), to convert lignocellulosic plant biomass into industrial chemicals including acetone, 2,3-butanediol, 1-propanol, 3-hydroxypropionate, and ethanol. This work includes efforts to reincorporate CO2, which is generated by fermentation, into desired products powered by energy recovered from H2, which is also produced during fermentation. Some of the native enzymes of C. bescii that degrade lignocellulose will be expressed in P. furiosus to allow growth on cellulose and xylan. Additionally, system-wide metabolic and regulatory models for both organisms will be leveraged to optimize conversion, yield, and selectivity of plant biomass to industrial chemicals.

Conversion of lignocellulosic plant biomass into industrial chemicals has the potential to provide renewable, sustainable sources of non-electrifiable fuels and chemicals. Enzymatic degradation and conversion of lignocellulose into desirable chemicals offers possible energy and monetary savings compared to chemical or mechanical methods. Consolidated bioprocessing aims to combine biological deconstruction and conversion of plant biomass into a single step to further increase these savings. Some extreme thermophiles, like C. bescii, excel at deconstruction of plant biomass, which is aided in part by high temperatures (Straub et al. 2018; Bing et al. 2021). Studies have shown that extremely thermophilic fermentation temperatures (>70°C) offer additional specific advantages including reduced contamination risk and opportunities for novel product separations (Bing et al. 2022, 2023). Previously, C. bescii was metabolically engineered to produce ethanol, acetone, and various alcohols, but not yet at industrially relevant titers (Williams-Rhaesa et al. 2018; Straub et al. 2020; Rubinstein et al. 2020). As such, the project is currently working to improve selectivity, yield, and titers of these products in C. bescii, as well as looking at additional target products, such as 2,3-butanediol and 3-hydroxypropionate.

Additional work involving the hyperthermophilic archaeon P. furiosus is also underway, with the aim of leveraging its high thermophily, its efficient and established genetic system, and its unique CO2-fixing and energy conservation enzymes (Hawkins et al. 2015; Keller et al. 2015, 2017). This work not only includes similar efforts to produce industrial chemicals (3-hydroxypropionate, 1-propanol, and ethanol) but also leverages knowledge of C. bescii to engineer P. furiosus with (hemi)cellulases from C. bescii to enable growth on cellulose and xylan. Likewise, researchers are using enzymes from P. furiosus in C. bescii to improve production of target chemicals. Research is ongoing to engineer an NADPH-regenerating soluble hydrogenase I from P. furiosus into C. bescii to increase redox factors for different target chemicals, such as 3-hydroxypropionate. Throughout all this work, system-wide metabolic and regulatory models for the organisms were created to evaluate, guide, assist, and optimize the production of target chemicals. The C. bescii models that were created previously continue to be updated, and P. furiosus models were created more recently as part of this project (Rodionov et al. 2021; Zhang et al. 2021).

Towards Generalized Platforms for Functional Genomics in α-Proteobacteria | Peters | GLBRC | Hall | Bioenergy | GLBRC

Enable rational engineering of alphaproteobacteria that can synthesize biofuels and bioproducts from lignocellulosic biomass. This project develops and validates genomic insertion sites, inducible promoters, and inducible riboswitches for bacteria such as Novosphingobium aromaticivorans, Zymomonas mobilis, and Rhodobacter sphaeroides.

Numerous alphaproteobacterial species have promising traits for converting lignocellulosic biomass to useful biofuels and bioproducts. However, many of the genes required to carry out these roles have remained enigmatic, in part due to the limited genetic tools developed for these organisms. Next-generation genetic tools, such as CRISPRi-seq, are capable of systematically phenotyping all genes, but they have not been broadly deployed in alphaproteobacteria. Building on success in establishing genome-scale CRISPRi in Z. mobilis, the team seeks to develop generalized platforms for synthetic biology and functional genomics in alphaproteobacteria. Here, researchers lay the groundwork for these platforms by optimizing site-specific integration and developing synthetic, inducible promoters for N. aromaticivorans and R. sphaeroides. These enhanced genetic tools will enable basic and applied research such as delivery of CRISPRi systems to investigate gene function and expression of heterologous pathways to generate valuable bioproducts.

Improved Biofuel Production Through Discovery and Engineering of Terpene Metabolism in Switchgrass | Zerbe | University of California–Davis | Wyatt | Bioenergy | Early Career

Of the myriad specialized metabolites that plants form to adapt to environmental challenges, terpenoids form the largest group. In many major crops, unique terpenoid blends serve as key stress defenses that directly impact plant fitness and yield. In addition, select terpenes are used for biofuel manufacture. Thus, engineering of terpenoid metabolism can provide a versatile resource for advancing biofuel feedstock production but requires a system-wide knowledge of the diverse biosynthetic machinery and defensive potential of often species-specific terpenoid blends. This project merges genome-wide enzyme discovery with comparative omics and protein structural studies to define the biosynthesis and stress-defensive functions of switchgrass (Panicum virgatum) terpenoid metabolism. These insights would be combined with the development of genome editing tools to design plants with desirable terpene blends for improved biofuel production on marginal lands.

Diterpenoids form a diverse class of metabolites with critical functions in plant development, defense, and ecological adaptation. Major monocot crops, such as maize (Zea mays) and rice (Oryza sativa), deploy diverse blends of specialized diterpenoids as core components of biotic and abiotic stress resilience. This project reports the genome-wide discovery and functional characterization of the stress-related diterpenoid-metabolic network in switchgrass (P. virgatum). Mining of the allotetraploid switchgrass genome identified expansive diterpene synthase (diTPS) and cytochrome P450 monooxygenase (P450) enzyme families critical for the chemical diversity of bioactive diterpenoids. Tissue-specific transcriptome and metabolite analyses of drought-resistant (Alamo) and drought-susceptible (Cave-in-Rock) genotypes showed an earlier onset of transcriptomic changes and significantly more differentially expressed genes in response to drought in Cave-in-Rock. Diterpenoid-biosynthetic genes showed drought-inducible expression in Alamo roots, contrasting largely unaltered triterpenoid and phenylpropanoid pathways. In addition, metabolomic analyses identified common and genotype-specific terpenoids. Consistent with transcriptomic alterations, several root diterpenoids showed significant drought-induced accumulation. Structural analysis of drought-responsive root diterpenoids verified these metabolites as oxygenated furanoditerpenoids that are perhaps unique to switchgrass. Together, these findings support a role of diterpenoids in switchgrass drought stress tolerance and provide resources for understanding the molecular mechanisms underlying switchgrass environmental resilience.

Enhanced Resistance Pines for Improved Renewable Biofuel and Chemical Production | Peter | University of Florida | Morgan | Bioenergy | University

The goal is to genetically increase constitutive terpene defenses of loblolly and slash pine to enhance protection against pests and pathogens and simultaneously expand terpene supplies for renewable biofuels and chemicals.

The constitutive and inducible oleoresin defense network in loblolly (Pinus taeda) and slash (Pinus elliottii var. elliottii) pine provides physical and chemical resistance to insects and pathogens, and its chemical composition makes oleoresin usable as a renewable source of biofuels harvested directly from live tree stems. Increasing pine terpenes is well aligned with the needs of the developing bioeconomy: the southeastern United States hosts the world’s largest biomass supply chain, annually delivering 17% of global wood products, and has the potential to expand the U.S. pine chemicals industry by increasing biofuels from pine terpenes. That expansion is currently limited by relatively low average wood terpene content. The focus is to increase constitutive terpene production to enhance loblolly and slash pine resistance to pests and pathogens and to simultaneously increase biofuel feedstocks in these commercial pine species.

Pine terpenes evolved as a primary chemical and physical defense system and are a main component of a durable, quantitative defense mechanism against pests and pathogens. Previous research demonstrated that terpene defense traits are under genetic control and behave as quantitative traits, and genetic engineering was used to validate 12 genes that can significantly increase wood terpene content. In objective one, researchers are integrating existing and new genome-wide association study (GWAS) results with RNA expression, quantitative trait locus (QTL) mapping, and allele frequency information in known high oleoresin flow selections and the project’s breeding populations to discover and validate loblolly and slash pine alleles/genes that are important for resistance.

GWAS analyses of constitutive oleoresin flow, wood diterpenoid content, and resin canal number with ~83,000 biallelic single nucleotide polymorphisms (SNPs) were completed for the project’s CCLONES population. For the project’s ADEPT2 population, analyses of constitutive oleoresin flow and mono- and diterpene content are complete, and resin canal number is in progress. In the ADEPT2 population, researchers simultaneously measured constitutive and induced oleoresin flow after treating clones with methyl-jasmonate (MeJA). While the goal is to increase constitutive terpene defenses, researchers use MeJA to induce defense responses to identify the genes and genetic architecture of resinosis. In the ADEPT2 population, the team found the clonal repeatability of constitutive oleoresin flow and inducible oleoresin flow to be 0.31, suggesting these traits are under moderate genetic control.

The team’s estimate of clonal repeatability for constitutive oleoresin flow in the ADEPT2 population is consistent with what was previously published in the CCLONES population, and the estimate of clonal repeatability for inducible oleoresin flow in the ADEPT2 population is the first estimate of genetic control for this trait. Importantly, researchers observed a strong genetic correlation (0.82) between induced and constitutive oleoresin flow, suggesting the genetic architecture between these traits is shared.
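To make the term concrete, clonal repeatability is the fraction of phenotypic variance attributable to differences among clones. The sketch below estimates it from a balanced one-way ANOVA. It is a minimal illustration only: the abstract does not state how the team estimated repeatability (mixed models are common in practice), and the function name, the input layout, and the example values are assumptions introduced here.

```python
from statistics import mean


def clonal_repeatability(groups):
    """Estimate repeatability from a balanced one-way ANOVA.

    groups: dict mapping a clone ID to a list of measurements on its
    ramets (hypothetical layout; every clone needs the same number of ramets).
    Returns var_clone / (var_clone + var_residual).
    """
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes a balanced design")
    n = sizes.pop()   # ramets per clone
    k = len(groups)   # number of clones
    if n < 2 or k < 2:
        raise ValueError("need at least two clones and two ramets per clone")

    grand_mean = mean(x for values in groups.values() for x in values)
    # Mean square among clones and mean square within clones.
    ms_between = n * sum((mean(v) - grand_mean) ** 2 for v in groups.values()) / (k - 1)
    ms_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v) / (k * (n - 1))

    # Method-of-moments variance component, truncated at zero.
    var_clone = max((ms_between - ms_within) / n, 0.0)
    total = var_clone + ms_within
    if total == 0:
        raise ValueError("no variance in the data")
    return var_clone / total


# Made-up oleoresin flow values for two clones with two ramets each.
example = {"clone_1": [1.0, 2.0], "clone_2": [4.0, 5.0]}
print(round(clonal_repeatability(example), 3))
```

A value near 1 means ramets of the same clone resemble each other far more than ramets of different clones, while a value near 0 means clone identity explains little of the variation.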

Researchers conducted association analyses—with constitutive and inducible oleoresin flow, wood monoterpene content and composition, and diterpenoid content obtained in the ADEPT2 population—using linear mixed models and multi-locus linear mixed models in ASRgwas and GAPIT packages using two sets of SNP markers totaling ~2.28 million biallelic SNPs.

Significant SNPs are being mapped to genes and compared with those for constitutive oleoresin flow found in the CCLONES population. In the pseudo-backcross population between one F1 slash x loblolly hybrid genotype backcrossed to slash and loblolly genotypes, researchers collected constitutive and induced oleoresin flow and needle tissue for future QTL mapping.

To identify early, mid, and late genes expressed in differentiating resin ducts, team members induced axial resin canal formation in the cambial meristem by applying methyl-jasmonate (MeJA), which is a known inducer of traumatic resin canal formation in the Pinaceae family. The team conducted a time course experiment in which 78 RNAseq libraries were created from cambial zone tissue collected on days 0, 1 to 14, 17, and 21 after MeJA treatment. Researchers also constructed 43 RNAseq libraries from 10 transgenic pine lines, derived from four different constructs, with significantly elevated wood terpene content. They pooled all libraries from the time course and from the transgenic lines and sequenced them to a 30x read depth on the Illumina NovaSeq platform.

Team members mapped the reads to an improved de novo loblolly pine transcriptome that includes 64,671 genes composed of existing EST contigs, PacBio reads, and predicted transcripts from loblolly pine reference genome v2.01. Researchers used DESeq2 to identify thousands of significantly differentially expressed genes across the time course and in transgenic pines compared with wild type. With these differentially expressed genes, researchers created a Predictive Expression Network (PEN) using iterative Random Forest Leave-One-Out Prediction to illustrate higher-order interactions between genes and to determine the gene-to-gene relationships that are the most highly predictive of each other. To identify and prioritize genes across the PEN that are involved in axial resin canal formation, researchers applied random walk with restart (RWR) algorithms based on a set of literature-curated seed genes that included known orthologous regulators of xylem formation and development, which are suppressed while resin canal formation is increased. The RWR approaches allowed researchers to identify mechanistically associated genes that did not appear in GWAS due to a lack of statistical power or genetic variation but are still important components of resinosis. Researchers are continuing to annotate the network to identify genes whose expression supports involvement in resin canal formation and terpene synthesis.
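The idea behind random walk with restart can be shown on a toy network. The sketch below is a generic, minimal version, not the team's implementation: the edge list, gene names, restart probability, and function name are all hypothetical. A walker starts on the seed genes, moves along weighted edges, and at each step jumps back to a seed with a fixed probability. Genes with high steady-state scores are those most closely connected to the seeds.

```python
def random_walk_with_restart(edges, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    edges: iterable of (node_a, node_b, weight) for an undirected network
           with positive weights (hypothetical input format).
    seeds: iterable of seed node names; seeds absent from the network are ignored.
    Returns a dict mapping node -> steady-state visiting probability.
    """
    neighbors = {}
    for a, b, weight in edges:
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight

    seed_set = set(seeds) & set(neighbors)
    if not seed_set:
        raise ValueError("no seed is present in the network")

    nodes = sorted(neighbors)
    start = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    strength = {n: sum(neighbors[n].values()) for n in nodes}

    scores = dict(start)
    for _ in range(max_iter):
        # Restart term: return to the seeds with probability `restart`.
        updated = {n: restart * start[n] for n in nodes}
        # Walk term: spread each node's score to its neighbors by edge weight.
        for u in nodes:
            for v, weight in neighbors[u].items():
                updated[v] += (1.0 - restart) * scores[u] * weight / strength[u]
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Toy chain network: gene_a - gene_b - gene_c - gene_d, seeded at gene_a.
toy_edges = [("gene_a", "gene_b", 1.0), ("gene_b", "gene_c", 1.0), ("gene_c", "gene_d", 1.0)]
ranking = random_walk_with_restart(toy_edges, seeds=["gene_a"])
for gene, score in sorted(ranking.items(), key=lambda item: -item[1]):
    print(gene, round(score, 3))
```

In this toy chain the scores fall off with distance from the seed, which is the property that lets a seed-based ranking surface network neighbors that an association test alone might miss.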

In objective two, team members are using information from objective one to accelerate breeding for increased resistance in loblolly and slash pine through marker-assisted introgression and will develop and test genomic selection models to accelerate breeding of resistant slash pine.

Functional Characterization of Glycosyltransferases in Duckweed to Enable Predictive Biology | Urbanowicz | University of Georgia | Urbanowicz | Bioenergy | University

Glycosyltransferases (GTs) catalyze the formation of glycosidic linkages to produce complex carbohydrates. This project uses a multidisciplinary, high-throughput biochemical and computational biology approach to study carbohydrate metabolic processes in duckweed, a promising energy crop. The role of enzymatic microenvironments is being assessed through a combined proteomic and computational biology approach. These combined data will be used to populate a deep-learning framework to predict plant GT function.

Functional validation achieved through this research will be used to assign gene function and study plant processes at the systems level to efficiently link genome sequence with gene function in a feedstock agnostic manner.

Secure Ecosystem Engineering and Design (SEED) to Enable Safe Biodesign of Novel Plant-Microbe Interactions | Abraham | Oak Ridge National Laboratory | Yang | Biosystems Design | SEED

The Secure Ecosystem Engineering and Design (SEED) Science Focus Area (SFA), led by ORNL, combines unique resources and expertise in the biochemistry, genetics, and ecology of plant-microbe interactions with new approaches for analysis and manipulation of complex biological systems. The long-term objective is to develop a foundational understanding of how non-native and engineered microorganisms establish, spread, and impact ecosystems critical to U.S. Department of Energy missions. This knowledge will guide biosystems design for ecosystem engineering while providing the baseline understanding needed for risk assessment and decision-making across biodefense enterprises.

Advancements in plant engineering are necessary to address future challenges associated with climate change and food security. CRISPR/Cas9-based genome engineering now provides novel methods for accelerating high-precision engineering in non-model plants. Yet nearly all genome edits are generated through tissue culture–based plant transformation systems, which are often poorly developed in non-model plant species. Moreover, it is currently difficult to predict the activity of CRISPR using existing bioinformatic methods. Virus-mediated delivery of CRISPR/Cas systems has great potential to improve delivery and thereby expedite and maximize the usefulness of this technology. Because it is a challenge to deliver an entire CRISPR/Cas tool using RNA viruses, researchers recently developed an intein-mediated split-nCRISPR/Cas9 technology to deliver an entire base-editing CRISPR/Cas system into plants (Yuan et al. 2021). Biocontainment of these advanced genome engineering tools is important to mitigate risks of unwanted genome engineering. Therefore, the team developed a biosensor for real-time detection of active CRISPR/Cas tools in planta and an anti-CRISPR (Acr) protein countermeasure to limit unwanted CRISPR/Cas9-based genome editing activity in planta (Yuan et al. 2022; Liu et al. 2023). These advancements are important steps towards safe, high-throughput plant biodesign and genome engineering.

Targeted genome editing of plants alone may not facilitate the advancements necessary to achieve the Department of Energy’s climate and economic competitiveness goals. Emerging research on plant holobiont theory and microbial invasion ecology emphasizes the importance of plant-microbe interactions. However, researchers currently lack the knowledge necessary to successfully introduce beneficial alterations, prevent undesired modifications, or assess the risks of proposed ecosystem engineering efforts. Therefore, advancements are being made to detect and control novel plant-microbe interactions for safe biodesign. Researchers are currently developing plant-based biosensors to detect the establishment of fungi on poplar. In addition, the Plasminogen-Apple-Nematode (PAN) domain was recently recognized for its important role in plant host cell invasion, making it a useful target for engineering plants to control microbial invasion. Lastly, plant-delivered in situ engineering is being developed to control root-associated microbes through the delivery of small secreted proteins. Preliminary results indicate these advancements have potential for engineering plants to detect and control associated microbes, thus creating new opportunities for safe ecosystem engineering.

The Root Microbiome of Camelina in the Dryland Wheat Production Areas of Eastern Washington | Paulitz | Washington State University | Peng | Bioenergy | University

Identify microbial diversity of the soil, rhizosphere, and root of the oilseed crop, Camelina sativa; Determine the effects of cropping zone and precipitation on microbial composition; Characterize the core bacterial and fungal microbiome of Camelina.

Camelina sativa L. is a broadleaf member of the Brassicaceae family. A short-season crop (85 to 100 days), Camelina is fairly disease- and pest-resistant, drought- and frost-tolerant, and can be grown under low-input conditions (Gao et al. 2018; Zanetti et al. 2017; Matteo et al. 2020; Neupane et al. 2022; Séguin-Swartz et al. 2009). An important part of the success of this plant under low-input conditions may be attributed to its rhizosphere and endosphere microbiome, but to date there have been no publications on the microbiome of Camelina. In this study, soil was collected from 33 sites across a 200 km gradient in the dryland wheat production area in eastern Washington. This included four main cropping zones based on precipitation: wheat/summer fallow (< 300 mm/yr), intermediate (300 to 450 mm/yr), annual cropping zone (450 to 600 mm/yr), and south (400 to 500 mm/yr). Camelina was planted into the soils in the greenhouse. After six weeks, plants were harvested and DNA was extracted from root, rhizosphere, and bulk soil. Bacterial 16S rRNA and fungal internal transcribed spacer (ITS) regions were amplified and sequenced with Illumina MiSeq (Gohl et al. 2016). Researchers then identified the core Camelina microbiome using an abundance-occupancy model (Shade and Handelsman 2012; Shade and Stopnisek 2019). They found that microbiome diversity decreased from the soil to the endosphere and, for the soil microbiome, increased with average precipitation. Plant compartment, zone, previous crop, and site all significantly affected composition of the Camelina microbiome, with the largest differences seen between the annual cropping zone and wheat/summer fallow zone. Several Actinobacteriota and Alphaproteobacteria (e.g., Actinoplanes, Aeromicrobium, Mycobacterium, Rhizobium, Caulobacter, and Sphingomonas) and two fungi (Pseudogymnoascus and one unidentified amplicon sequence variant [ASV]) were differentially abundant in the plant-associated core microbiome. Fitting the Sloan neutral model of community assembly to the abundance-occupancy distribution provided additional evidence that these genera are deterministically selected by the plant, suggesting that these core genera may possess traits that would make them good candidates for efficient plant colonization.
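The two quantities behind an abundance-occupancy analysis can be sketched as follows. Occupancy is the fraction of samples in which a taxon is detected, and abundance is its mean relative abundance across samples. This is a simplified illustration under stated assumptions: the count table, the function names, and the fixed occupancy cutoff are hypothetical, and the cited approach may rank and select core taxa differently (for example, by their contribution to between-sample dissimilarity).

```python
def abundance_occupancy(counts):
    """Compute occupancy and mean relative abundance for each taxon.

    counts: dict mapping sample ID -> dict of taxon -> read count
            (hypothetical layout for an ASV or genus table).
    Returns a dict mapping taxon -> (occupancy, mean_relative_abundance).
    """
    taxa = sorted({taxon for sample in counts.values() for taxon in sample})
    n_samples = len(counts)
    stats = {}
    for taxon in taxa:
        detected = 0
        relative_sum = 0.0
        for sample in counts.values():
            total = sum(sample.values())
            reads = sample.get(taxon, 0)
            if reads > 0:
                detected += 1
            if total > 0:
                relative_sum += reads / total
        stats[taxon] = (detected / n_samples, relative_sum / n_samples)
    return stats


def core_candidates(stats, min_occupancy=1.0):
    """Return taxa whose occupancy meets an illustrative cutoff."""
    return sorted(t for t, (occupancy, _) in stats.items() if occupancy >= min_occupancy)


# Made-up read counts for two root samples.
table = {
    "root_1": {"Rhizobium": 50, "Caulobacter": 50},
    "root_2": {"Rhizobium": 25, "Sphingomonas": 75},
}
stats = abundance_occupancy(table)
print(stats["Rhizobium"])
print(core_candidates(stats))
```

Plotting occupancy against log mean relative abundance for every taxon gives the distribution to which a neutral model can then be compared; taxa that occur more often than their abundance alone would predict are candidates for selection by the host.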

Integrating Functional Genomics with Molecular-Level Experimentation to Understand Adaptation to Nutrient Stress in Poplar and Sorghum | Paape | Brookhaven National Laboratory | Paape | Bioenergy | QPSI

The Quantitative Plant Science Initiative (QPSI) is a capability that aims to bridge the knowledge gap between genes and their functions. A central aspect of QPSI strategy is combining genome-wide experimentation and comparative genomics with molecular-level experimentation. In this way, researchers leverage the scalability of omics data and bioinformatic approaches to capture system-level information while generating sequence-specific understanding of gene and protein function. By incorporating molecular-level experimentation into the workflow, team members are addressing the question of how a protein functions and establishing mechanistic insight into how sequence variation impacts phenotype. This knowledge serves as a touchstone for accurate genome-based computational propagation across sequenced genomes and forms the foundation for robust predictive modeling of plant productivity in diverse environments.

Micro- and macronutrient stress is of growing importance in maximizing bioenergy/bioproduction crop yield in marginal soil. Nutrient bioavailability in the soil is dynamic and variable, and yield-impacting deficiencies are poorly understood. Because micronutrients are essential for the proper assimilation and metabolism of macronutrients such as nitrogen, metal deficiencies and other soil stresses can result in poor macronutrient availability. To support the development of bioenergy crops with improved nutrient stress resilience, the goal during the current 3-year period is to develop a genome-based, molecular-level and system-level understanding of adaptation to micronutrient stress. Focusing on the bioenergy crops poplar and sorghum, researchers have completed a large-scale transcriptomics time-course experiment to understand how these plants respond to different nutrient stresses in their environment. Team members are also employing an interdisciplinary approach to provide a layer of experimentally grounded, sequence-specific understanding of molecular-level functions for major players involved in plant homeostasis. Comparative genomics provides an in silico platform to generate protein function hypotheses. Hypotheses are tested with reverse genetics in model organisms and biochemical assays of protein family members. Structure-function studies supply mechanistic insight into how sequence space translates into molecular function. While the current phase focuses on micronutrient stresses, there will be subsequent opportunities to incorporate other real-world conditions by adding field experiments that address the impact of soil geochemistry, the microbiome, and the rhizosphere, and by studying macro- and micronutrient interactions.

Functional Characterization of bHLH Transcription Factors Coordinating Abiotic Stress Response, Secondary Cell Wall Biosynthesis, and Metal Homeostasis in Populus | Xie | Brookhaven National Laboratory | Xie | Bioenergy | QPSI

The Quantitative Plant Science Initiative (QPSI) is a capability that aims to bridge the knowledge gap between genes and their functions. A central aspect of QPSI strategy is combining genome-wide experimentation and comparative genomics with molecular-level experimentation. In this way, the team leverages the scalability of omics data and bioinformatic approaches to capture system-level information while generating sequence-specific understanding of gene and protein function. By incorporating molecular-level experimentation into the workflow, researchers are addressing the question of how a protein functions and establishing mechanistic insight into how sequence variation impacts phenotype. This knowledge serves as a touchstone for accurate genome-based computational propagation across sequenced genomes and forms the foundation for robust predictive modeling of plant productivity in diverse environments.

Populus is one of DOE’s flagship bioenergy crops as a source of renewable energy and biobased products. Gene regulatory networks (GRNs) that describe the hierarchical regulatory relationships between transcription factors (TFs), associated proteins, and their target genes are fundamental for coordinating genome-wide gene expression responses to environmental and developmental signals. The complex and dynamic behavior of plant TFs is crucial for the GRN plasticity that enables sensitive responses. However, it also poses a fundamental challenge in understanding the molecular principles underlying GRN dynamics. Researchers previously identified two basic Helix-Loop-Helix (bHLH) TFs (PtrbHLH038 and PtrbHLH011) whose expression was oppositely regulated in Populus leaves under iron deficiency treatment. Using a transactivation assay in protoplasts, researchers found that both are transcriptional repressors. Using their novel protoplast-based transient chromatin immunoprecipitation-sequencing (transient ChIP-seq) approach for mapping genome-wide binding targets of TFs in vivo, team members found and validated that PtrbHLH038 directly regulates PtrbHLH011 and abiotic stress-responsive genes. In contrast, PtrbHLH011 seems to have broader regulatory functions because its targets include metal transporters, growth regulators, and master regulators of secondary cell wall biosynthesis. More interestingly, protoplast-based approaches enabled the discovery that iron deficiency treatment can eliminate PtrbHLH011’s binding to and repression of its target genes. By performing TurboID-based proximity labeling in protoplasts, the team identified protein cofactors that form complexes with PtrbHLH011. Based on the results described above, researchers hypothesize that PtrbHLH038 and PtrbHLH011 form a regulatory hierarchy to coordinate abiotic stress responses, secondary cell wall biosynthesis, and iron homeostasis in Populus. Transgenic plants overexpressing PtrbHLH038 and PtrbHLH011 have been generated to test the team’s hypothesis and study the biological impacts of these two transcription factors.

Recent Developments at the Center for Structural Molecular Biology at Oak Ridge National Laboratory | O’Neill | Oak Ridge National Laboratory | Pingali | Structural Biology

The Center for Structural Molecular Biology (CSMB) at ORNL is a national user facility funded to support and develop the user access and science research program of the Biological Small-Angle Neutron Scattering (Bio-SANS) instrument at the High Flux Isotope Reactor (HFIR). Bio-SANS is dedicated to the analysis of the structure, function, and dynamics of complex biological systems. The CSMB also operates a Bio-Deuteration Laboratory for expression and purification of deuterium-labeled biomacromolecules and for synthesis of small molecules and ligands in support of the biology neutron scattering program. This resource complements capabilities at other Department of Energy (DOE) Biological and Environmental Research (BER) program facilities for structural biology. The CSMB supports a vibrant biological research community from academia, industry, and government laboratories.

The Bio-SANS instrument is ideally suited for studies of biomacromolecules including proteins, DNA/RNA, lipid membranes, and other hierarchical complexes. The dual detector system of Bio-SANS allows simultaneous access to a wide spatial range that enables utilization of the full potential of the high neutron flux from the ORNL HFIR cold source. The Bio-SANS detector system is being upgraded to include an additional mid-range detector bank that will greatly benefit time-resolved measurements of a variety of biological systems.

The team has developed a series of new sample environment capabilities that open untapped opportunities for the study of biological systems using neutrons. A robotic sample changer has been installed that supports measurement of a range of sample types including solutions, suspensions, powders, and solid materials. It can maintain stored samples (up to 58) at a desired temperature between 10 and 70 °C. A Peltier heating block at the sample position allows rapid temperature changes between 10 and 100 °C for in operando measurements. Another example is combined size-exclusion chromatography and SANS for fractionation of biomacromolecules in the beam. A novel aspect of this capability is the ability to perform continuous flow measurements as well as fractionation of complex mixtures of biomacromolecules. The flow cell design was improved to ensure reliable and reproducible cell thickness, and the flow cell holder has been expanded to accommodate four cells to minimize downtime during sequential purifications of multiple proteins. The Bio-SANS data acquisition system and data reduction algorithms have been updated and now include the ability to perform wedge reduction for anisotropic systems such as biomass and the ability to time-slice data files, which has been invaluable for analysis of time-resolved SANS measurements.
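
As a rough illustration of what time-slicing means for time-resolved data, the sketch below bins a list of event timestamps into consecutive windows and counts the events in each. It is a generic toy example, not the Bio-SANS reduction software; the function name `time_slice`, the event list, and the 10-second window are all hypothetical.

```python
from collections import Counter


def time_slice(event_times_s, slice_width_s):
    """Count events in consecutive time windows of width slice_width_s (seconds)."""
    # Map each timestamp to the index of the window it falls in.
    counts = Counter(int(t // slice_width_s) for t in event_times_s)
    n_slices = max(counts) + 1 if counts else 0
    # Return one count per window, including empty windows.
    return [counts.get(i, 0) for i in range(n_slices)]


# Hypothetical event timestamps (seconds since the start of the run).
events = [0.5, 1.2, 9.9, 10.0, 25.3, 29.9]
slices = time_slice(events, slice_width_s=10.0)
print(slices)
```

Each slice could then be reduced separately, so that a single long acquisition yields a series of scattering curves over time.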

Researchers have expanded the Bio-Deuteration Laboratory to develop small molecule deuteration capabilities. Researchers established the ability to extract and purify deuterated lipid extracts from E. coli and to fractionate the lipid extracts to obtain purified phosphatidylethanolamine and phosphatidylglycerol. Furthermore, the team has produced deuterated phosphatidylcholine from an engineered strain of E. coli. In addition, the team successfully synthesized coniferyl alcohol-d5 as a precursor for deuterated lignin. In the future, this synthetic route will be used to prepare coniferyl alcohol with varying levels of deuterium incorporation and to deuterate other monolignols. Other new laboratory capabilities include preparative high-performance chromatography for separation and purification of different lipids and other small ligands. An analytical ultracentrifuge has recently been acquired and will provide structural information complementary to biological solution X-ray scattering (SAXS) and SANS studies.

To broaden the impact of the CSMB and catalyze the synergy between BER-funded structural biology resources, the team established collaborative programs with the National Synchrotron Light Source II for joint access to SANS and SAXS and with the BER Facilities Integrating Collaborations for User Science (FICUS) program between the Joint Genome Institute (Lawrence Berkeley National Laboratory) and the Environmental Molecular Sciences Laboratory (Pacific Northwest National Laboratory).

Cell Free Conversion of Pyruvate to 2,3-Butanediol Using Co-Substrate Feed as pH Control Strategy | Olson | Dartmouth College | Jilani | Bioenergy | University

Cell-free systems are promising tools for production of chemicals and to improve understanding of biochemical pathways. Researchers are interested in developing well-characterized modules whose behavior can be mathematically predicted, allowing them to be combined into larger systems either as production modules or indicator reactions. The conversion of pyruvate to 2,3-butanediol is a good model system due to interest in 2,3-butanediol as a commodity chemical, thermodynamic favorability of all reaction steps, and prior demonstration of high titer production both in vitro and in vivo.

In the present work, the team demonstrates development of a high-performance pyruvate to 2,3-butanediol conversion system. Researchers start by characterizing the individual enzymes in the pathway and demonstrate that none of them is subject to significant inhibition by substrates or products. Researchers then demonstrate conversion of 1063.1 (±19.0) mM (~93.7 g/L) acetoin to 1017.3 (±2.0) mM (~91.7 g/L) 2,3-butanediol, which represents 95.7% of the theoretical maximum yield, using a 2-enzyme system consisting of butanediol dehydrogenase and formate dehydrogenase. Team members subsequently extended the system to allow conversion of pyruvate to 2,3-butanediol using a 4-enzyme system. Researchers were able to convert 2045.5 (±30.7) mM (~225.0 g/L) pyruvate to 929.1 (±20.4) mM (~83.7 g/L) 2,3-butanediol, which represents 90.8% of the theoretical maximum yield. Achieving high titer production required careful attention to proton recycling. Further increases in product titer were limited by practical constraints (substrate solubility, foaming due to gas formation, etc.) rather than intrinsic limitations of the enzymatic pathway.
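
The reported yields can be checked with a short calculation. The sketch below assumes a 1:1 molar stoichiometry from acetoin to 2,3-butanediol and a 2:1 stoichiometry from pyruvate (two pyruvate per 2,3-butanediol). The abstract does not state these ratios explicitly, but they reproduce the reported percentages. The function names are illustrative, and the molar masses are standard values for C4H8O2 and C4H10O2.

```python
# Molar masses in g/mol (standard values for acetoin, C4H8O2, and 2,3-butanediol, C4H10O2).
MW_ACETOIN = 88.11
MW_BUTANEDIOL = 90.12


def molar_yield_percent(substrate_mM, product_mM, substrate_per_product):
    """Percent of theoretical maximum yield for a given molar stoichiometry."""
    theoretical_product_mM = substrate_mM / substrate_per_product
    return 100.0 * product_mM / theoretical_product_mM


def grams_per_liter(conc_mM, molar_mass):
    """Convert a millimolar concentration to g/L."""
    return conc_mM * molar_mass / 1000.0


# 2-enzyme system: 1 acetoin -> 1 2,3-butanediol (assumed stoichiometry).
acetoin_yield = molar_yield_percent(1063.1, 1017.3, substrate_per_product=1)

# 4-enzyme system: 2 pyruvate -> 1 2,3-butanediol (assumed stoichiometry).
pyruvate_yield = molar_yield_percent(2045.5, 929.1, substrate_per_product=2)

print(round(acetoin_yield, 1))   # 95.7
print(round(pyruvate_yield, 1))  # 90.8
print(round(grams_per_liter(1063.1, MW_ACETOIN), 1))     # 93.7
print(round(grams_per_liter(1017.3, MW_BUTANEDIOL), 1))  # 91.7
print(round(grams_per_liter(929.1, MW_BUTANEDIOL), 1))   # 83.7
```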

The team subsequently developed mechanistic kinetic models for each enzyme individually and showed that these models (1) can be combined to predict the behavior of the 4-enzyme system, or (2) can be used to predict targeted modifications to minimize enzyme concentration (while maintaining the overall conversion rate) or to minimize concentration of a particular metabolic intermediate.
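
To illustrate how per-enzyme rate laws can be combined into a pathway-level prediction, the sketch below chains three irreversible Michaelis-Menten steps and integrates them with explicit Euler steps. This is a simplified stand-in, not the team's mechanistic models: the Michaelis-Menten form, the intermediate names, the 2:1 stoichiometry of the first step, and every parameter value are assumptions, and cofactor recycling (the formate dehydrogenase step) is omitted.

```python
def michaelis_menten(vmax, km, s):
    """Irreversible Michaelis-Menten rate (mM/min) at substrate level s (mM)."""
    return vmax * s / (km + s)


# Hypothetical per-enzyme parameters: (Vmax in mM/min, Km in mM).
PARAMS = {
    "step1": (10.0, 1.0),  # 2 pyruvate -> 1 acetolactate (assumed)
    "step2": (10.0, 1.0),  # acetolactate -> acetoin (assumed)
    "step3": (10.0, 1.0),  # acetoin -> 2,3-butanediol
}


def simulate(pyruvate0, t_end, dt=0.01, params=PARAMS):
    """Integrate the combined pathway with explicit Euler steps (times in min)."""
    pyr, alac, acetoin, bdo = pyruvate0, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        # Each rate depends only on its own enzyme's parameters and substrate.
        v1 = michaelis_menten(*params["step1"], pyr)
        v2 = michaelis_menten(*params["step2"], alac)
        v3 = michaelis_menten(*params["step3"], acetoin)
        # Mass balances couple the individual rate laws into one system.
        pyr += -2.0 * v1 * dt
        alac += (v1 - v2) * dt
        acetoin += (v2 - v3) * dt
        bdo += v3 * dt
    return {"pyruvate": pyr, "acetolactate": alac,
            "acetoin": acetoin, "butanediol": bdo}


final = simulate(pyruvate0=100.0, t_end=60.0)
print(final)
```

Because each step has its own parameter entry, a single enzyme's Vmax (which scales with enzyme concentration) can be lowered and the simulation rerun to see whether the overall conversion rate is maintained, which mirrors the kind of targeted modification described above.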

Engineering Synthetic Anaerobic Consortia Inspired by the Rumen for Biomass Breakdown and Conversion | O’Malley | University of California–Santa Barbara | Blair | Biosystems Design | University

This project will leverage a synthetic rumen consortium composed of anaerobic fungi and chain-elongating bacteria to study which metabolites are shared and exchanged between microbes and identify strategies to bolster lignocellulose conversion to value-added products. The project’s approach will develop high-throughput systems and synthetic biology approaches to realize stable synthetic consortia that route lignocellulosic carbon into short and medium chain fatty acids (SCFAs/MCFAs) rather than methane. Key research objectives are to (1) design and predict anaerobic fungal and bacterial consortia that efficiently convert lignocellulosic biomass into MCFAs, (2) understand how fermentation parameters and microbe-microbe interactions regulate and drive microbiome metabolic fluxes, and (3) use genomic editing to alter the fermentation byproducts of anaerobic fungi and bolster MCFA titers and yields.

Lignocellulose deconstruction and conversion in nature is driven by mixed microbial partnerships rather than the action of a single microbe. For example, microbes are particularly well optimized to recycle organic matter in anaerobic habitats, ranging from landfills to intestinal tracts, via interspecies H2 transfer and methane release. Compared to aerobic processes, anaerobic digestion can be far more efficient in converting substrate to chemical products, largely because far less carbon is funneled to cell growth, resulting in higher yields, and because far fewer energy inputs are required, since pre-treatment, aeration, mixing, and heat removal are greatly reduced. Compartmentalizing difficult biomass deconstruction and production steps among specialist anaerobes is an exciting new route to convert biomass into value-added products, especially if consortia can be built predictively and engineered for stability.

Previously, the team established a model bacterial consortium enriched from the rumen that converts lignocellulosic biomass into high titers of C4 volatile fatty acids (VFAs; butyrate) based on a chain elongation process that inhibits archaeal methanogenesis. Metagenomic and metatranscriptomic analysis identified key chain-elongating bacteria in these consortia that maintain high expression of the reverse β-oxidation pathway responsible for C4-C8 VFA production. This analysis also revealed several other bacterial species in the consortia that compete with chain elongators and reduce overall C4-C8 VFA yields by diverting carbon to unwanted products. Therefore, building synthetic consortia that eliminate these competing bacteria would bolster product yields and enable greater control over VFA chain length. In parallel, the team also demonstrated that anaerobic rumen fungi within the Neocallimastix genus are superior biomass degraders compared to anaerobic bacteria from these enrichments. Moreover, the biomass degradation products lactate, acetate, and ethanol from Neocallimastix are optimal substrates for chain elongators.

Accordingly, partnering anaerobic fungi and chain-elongating bacteria in synthetic consortia represents a novel strategy for maximizing lignocellulose conversion to C4-C8 VFAs (Fig. 1).

Recently, the team screened multiple chain-elongating bacteria and identified candidates that grow robustly in culture media with known fungal metabolites and produce VFAs. Two of these strains are Megasphaera elsdenii and Pseudoramibacter alactolyticus. The team paired Neocallimastix spp. with each of these strains and cultivated them for numerous passages on reed canary grass. These cocultures show promise in terms of stability because both members were present after multiple transfers. Lactate is the key metabolic intermediate in these consortia: the fungi make lactate as they degrade grass, and the chain elongators produce VFAs from lactate. High-performance liquid chromatography measurements showed that lactate from fungal cultures was consumed and butyrate was produced after chain elongators were added. Future plans include optimizing inoculation ratios of the different strains in consortia to maximize stability and VFA production. Researchers will employ qPCR to evaluate abundances of the community members over time and thus gain insight into the culture’s stability. Here, researchers further describe efforts to systematically characterize and model the production of VFAs from synthetic bacterial and fungal communities that have been grown on representative lignocellulosic grass substrates.

Crosstalk: Interkingdom Interactions in the Mycorrhizal Hyphosphere and Ramifications for Soil Carbon Cycling | Nuccio | Lawrence Livermore National Laboratory | Nuccio | Environmental Microbiome | Early Career

Arbuscular mycorrhizal fungi (AMF) are ancient symbionts that form root associations with most plants. AMF play an important role in global nutrient and carbon cycles, and understanding their biology is crucial to predict how carbon is stored and released from soil. This Early Career research investigates the mechanisms that underpin synergistic interactions between AMF and microbes that drive nitrogen and carbon cycling, addressing DOE’s mission to understand and predict the roles of microbes in Earth’s nutrient cycles. By coupling isotope-enabled technologies with next-generation DNA sequencing techniques, this project investigates soil microbial interactions in situ using natural levels of soil complexity. This work will provide a greater mechanistic understanding needed to determine how mycorrhizal fungi influence organic matter decomposition and will shed light on nutrient cycling processes in terrestrial ecosystems.

Background: The arbuscular mycorrhizal association between Glomeromycota fungi and land plants is ancient and widespread; 72% of all land plants form symbiotic associations with AMF. While AMF are obligate symbionts that depend on host plants for C and cannot decompose soil organic matter (SOM), AMF can stimulate the decomposition of SOM and dead plant tissue. The team’s prior research strongly suggests that AMF partner with their microbiome in the zone surrounding hyphae (or “hyphosphere”) to encourage decomposition. The molecular mechanisms that underpin interactions between AMF and the microbial community during N uptake from SOM are a key knowledge gap. Researchers examine AMF-microbial interactions in both reduced complexity microcosms and in the field to assess the impact of AMF on terrestrial C and N cycling processes. In the field, team members assess how a deeply rooted perennial grass impacts soil C stocks and alters the zone of influence of AMF in soil depth profiles relative to shallow-rooted annuals.

Approach: AMF serve important roles in the soil microbial food web by stimulating soil organic matter decomposition and providing plant C to the soil community. To identify the genomes of actively growing bacteria and archaea in the AMF hyphosphere, researchers tracked plant-fixed 13CO2 through AMF hyphae into the 13C-hyphosphere microbiome using high-throughput stable isotope probing (HT-SIP) combined with high-resolution SIP-metagenomics (14 metagenomes per gradient). To separate the hyphosphere-C from the rhizosphere-C, the microcosms contained an airgap between plant and hyphal compartments that excluded roots but permitted fungal hyphae into living soil. SIP showed that the AMF Rhizophagus intraradices and associated metagenome-assembled genomes (MAGs) were highly enriched (10-33 atom% 13C), even though bulk soil enrichment was low (1.8 atom% 13C). Of the 212 assembled 13C-hyphosphere MAGs, the taxa that assimilated the most AMF-13C were from the phyla Myxococcota, Fibrobacterota, and Verrucomicrobiota and from the ammonia-oxidizing archaeal genus Nitrososphaera. The phylogenetic composition and gene content of the highly 13C-enriched MAGs highlight the potential for cross-kingdom trophic interactions in the AMF hyphosphere, including predation, decomposition of fungal necromass or plant detritus, and archaeal ammonia oxidation (that may utilize ammonium or CO2 released from the aforementioned processes). In combination with other omics technologies, such as metatranscriptomics or proteomics, these MAGs will provide an important genomic resource for future experiments exploring interactions between AMF and their native microbiome.

To facilitate metabolomics and mechanistic studies of the hyphosphere, researchers have developed a sterile plant-mycorrhizal microcosm (called MycoChip, based on the EcoFAB platform) that can be used to interrogate hyphal-microbial interactions in situ. The MycoChip is intended to allow both destructive and nondestructive resampling of hyphosphere communities over time, and it is optically clear to permit microscopic investigation. This system has a raised airgap flanked by two 20 µm mesh barriers to create a hyphosphere zone isolated from the rhizosphere, which permits hyphae to enter the hyphal chamber but blocks root entry. The raised airgap contains a dam that prevents solute exchange between chambers in either vertical or horizontal positions. In the most recent design, the team created larger rectangular chambers that stand upright and accommodate more experimental soil in both chambers. Researchers are testing these chambers so they can be used in future experimental studies.

Most knowledge about the physiology and ecology of AMF (and most soil organisms) has been learned from surface soils that are less than 20 cm deep. In a national field study, researchers assessed how the rhizosphere of a deep-rooted perennial bioenergy grass, switchgrass (Panicum virgatum), impacts soil C stocks and alters the zone of influence of AMF and surface soil bacteria in depth profiles. Rhizosphere and bulk samples from paired switchgrass and shallow-rooted fields were collected from 2.5 m deep soil cores across nine field sites in the eastern United States (TX, MS, NC, NY, MI, WI, IL, SD). The team characterized the impact of switchgrass on soil microbial communities (AMF and bacteria), soil organic carbon (SOC), radiocarbon (14C), root abundance, and a range of soil physical and chemical properties. Switchgrass standing root biomass was significantly greater than annual standing root biomass. Across sites, radiocarbon and natural abundance 13C data suggest that switchgrass-derived C was present to a 1 m depth and ~95% of the roots were in the top 1 m. Differences in the SOC stock were highly variable, but the effect of switchgrass on SOC was more consistently positive in southerly sites featuring Alamo. The team used amplicon sequencing to characterize the AMF and bacterial communities throughout the rooting depth profiles using WANDA and 16S primer sets, respectively; this analysis will show if deeply rooted switchgrass extends the habitat of AMF down the soil profile, thus increasing their zone of influence and contribution to subsoil C cycling and weathering processes. The project’s C results indicate that standing root biomass may be a significant contributor to belowground C stocks in the first decades following conversion of annual cropland to perennial cover.

The Twin Ecosystems Project: A New Capability for Field and Laboratory Ecosystems Coupled by Sensor Networks and Autonomous Controls | Northen | Lawrence Berkeley National Laboratory | Tringe | Environmental Microbiome | Other

The goal of the TWIN ecosystem project (TWINS) is to pilot self-driving labs, positron emission tomography (PET), microbial biosensors, and laboratory and field twin ecosystems to gain insights into above- and belowground plant dynamics and interactions. Autonomous experiments are used to gain novel insights into grass responses to nutrient stress, and PET is being used to identify hot spots for omics analyses. These efforts are being used to study compositional changes in root exudates and rhizosphere communities following harvest. Here the field twin defines the climate conditions for the lab twin, which provides powerful environmental controls and measurements that are essentially impossible in the field.

This project has integrated computer vision software and autonomous experimental design software (gpCAM) developed by the Center for Advanced Mathematics for Energy Research Applications (CAMERA) with an automated experimental system for performing fabricated ecosystem experiments (the EcoBOT). Three rounds of experiments have now been performed to map the nutritional landscape of the JGI flagship model grass, Brachypodium distachyon, and the hyperspectral signatures of plants under combinations of nutrient stresses. The resulting model will be a valuable tool in interpreting ongoing remote multispectral image data from a field site in Prosser, Washington. At this field site, researchers have leveraged an existing field experiment to define climate conditions for a controlled laboratory twin that replicates field conditions for plants that have been transplanted from the field into large-scale mesocosm environments (EcoPOD, the lab twin).

The laboratory twin enables detailed control and characterization of the composition and dynamics of microbes and exudates under baseline conditions and in response to perturbation. In a pilot experiment, TWINS is investigating how plant biomass harvest alters the soil microbial community structure and function in response to tall wheatgrass (Thinopyrum ponticum) exudates. Samples were collected from both the field and the EcoPOD under twinned environmental conditions in July 2022, three days prior to harvest of the plant’s aboveground biomass and again three days after the harvest. Bulk and rhizosphere soils were isolated and extracted for nucleic acids and polar metabolites. Amplicon sequencing of the 16S V3–V4 region was used to investigate microbial community responses to plant harvest. Root, rhizosphere, and bulk soil samples have been analyzed using liquid chromatography–tandem mass spectrometry including both reverse phase and hydrophilic interaction liquid chromatography. Untargeted metabolomic analysis has been used to compare thousands of unique chemical features to identify changes in root and rhizosphere metabolites following harvest and to compare these findings between the lab and field twins.

Engineered microorganisms could potentially be used to monitor in situ processes in fabricated ecosystems. To investigate whether information can be transmitted across a soil using a rare volatile metabolite, researchers evaluated whether one member of the Model Soil Consortium-2 (MSC-2; Variovorax) can be programmed to produce a unique methyl halide signal. By expressing a methyl halide transferase in Variovorax, researchers showed that this microbe can synthesize a signal that is more than 100-fold higher than the basal level produced by other soil microbes in MSC-2 including Dyadobacter, Ensifer, Rhodococcus, and Streptomyces. The signal generated by Variovorax could be read out directly using gas chromatography or using a Methylorubrum biosensor that produces a fluorescent output. Using soil habitats ranging in size from 1 to 50 grams of soil, researchers found that methyl halide cell-cell signaling could be achieved under environmentally relevant water holding conditions. Synthetic methyl halide signaling is expected to simplify fundamental studies of gene expression in hard-to-image materials containing microbiomes, and it should be useful for programming soil consortia to convert information sensed in subterranean settings into overt aboveground visual or gas signals in fabricated ecosystems.

Examine Plant-Microbe-Soil Interactions Using Fabricated Ecosystems Across Scales | Northen | Lawrence Berkeley National Laboratory | Lin | Environmental Microbiome | m-CAFEs

The goal is to understand the interactions, localization, and dynamics of grass rhizosphere communities at the molecular level (genes, proteins, metabolites) in order to predict responses to perturbations and to understand the persistence and fate of engineered genes and microbes for secure biosystems design. To do this, advanced fabricated ecosystems are used in combination with gene-editing technologies such as CRISPR-Cas and bacterial virus (phage)-based approaches for interrogating gene and microbial functions in situ, addressing key challenges highlighted in recent DOE reports. This work is integrated with the development of predictive computational models that are iteratively refined through simulations and experimentation to gain critical insights into the functions of engineered genes and interactions of microbes within soil microbiomes as well as the biology and ecology of uncultivated microbes. Together, these efforts lay a critical foundation for developing secure biosystems design strategies, harnessing beneficial microbiomes to support sustainable bioenergy, and improving understanding of nutrient cycling in the rhizosphere.

Studying plant-microbe-soil interactions, including those mediated by root glycans, is challenging due to the complexity and variability found in natural ecosystems. Fabricated ecosystems offer an opportunity to recapitulate aspects of these systems in reduced complexity and controlled laboratory settings. However, it is important to benchmark their performance against established systems and identify specific protocols for these studies. Here, researchers compared the colonization and persistence of a field-derived microbiome on the model grass Brachypodium distachyon grown in sterile devices called EcoFABs with that in conventional containers such as pots and tubes (Acharya et al. 2023). Comparable plant growth and microbial community composition were obtained between conventional containers and the EcoFAB. The team also observed distinct microbiome profiles for the rhizosphere (root tip or root base) and the bulk soil. Researchers then used a synthetic community (SynCom) to study how different growth media and inoculation methods affect microbial community assembly (Coker et al. 2022). The results showed that sample type (sand, rhizosphere, and root), but not the inoculation method, significantly affects microbial community assembly. Next, the team studied how plant cell wall composition and root exudates affect microbial community assembly in the rhizosphere by using a lignin biosynthetic B. distachyon transgenic line that has an altered root cell wall composition.

Researchers observed higher Rhizobium colonization in the transgenic line than in the wild-type line Bd21-3, suggesting a possible connection between root aromatics and root colonization. To explore this in more detail, the team performed metabolomic analysis of a diverse collection of Brachypodium accessions to identify lines with elevated aromatic compounds in the exudates. Researchers then used a meter-scale fabricated ecosystem to compare SynCom colonization between two lines with dramatically different aromatic exudate compositions and again observed altered rhizosphere community compositions. Together, these results highlight the potential for using fabricated ecosystems to study the molecular ecology of plant-microbe interactions.

Secure Ecosystem Engineering and Design (SEED) to Mitigate the Impacts of Non-Native Fungal Pathogens on Managed Ecosystems | Abraham | Oak Ridge National Laboratory | Tannous | Biosystems Design | SEED

The Secure Ecosystem Engineering and Design (SEED) Science Focus Area (SFA), led by ORNL, combines unique resources and expertise in the biochemistry, genetics, and ecology of plant-microbe interactions with new approaches for analysis and manipulation of complex biological systems. The long-term objective is to develop a foundational understanding of how non-native and engineered microorganisms establish, spread, and impact ecosystems critical to U.S. Department of Energy missions. This knowledge will guide biosystems design for ecosystem engineering while providing the baseline understanding needed for risk assessment and decision-making across biodefense enterprises.

Invasion by non-native fungal species is acknowledged as one of the major external drivers altering the structure, biodiversity, and function of ecosystems (Pyšek and Richardson 2010). Understanding the mechanisms by which these invaders establish and developing mitigation approaches to manage them are critical to sustaining native biodiversity and normal ecosystem functions. The fungal pathogen Sphaerulina musiva is a well-characterized example of an invasive species spread unintentionally by human activities (Abraham et al. 2018). Originally native to eastern North America, S. musiva was only recently introduced into and established in the Pacific Northwest of North America, with deleterious effects on susceptible species and genotypes of Populus, which is both a foundational bioenergy crop and a keystone tree species in forested ecosystems (Herath et al. 2016; Sakalidis et al. 2016).

Within the SEED SFA, a goal is to identify genetic determinants that alter S. musiva establishment, spread, and overall impact in DOE-managed Populus ecosystems to guide engineering and risk mitigation strategies for more sustainable and productive systems. As a critical first step, researchers are developing an S. musiva pangenome representing isolates collected across the U.S. and generating genome-wide association study resources for high-throughput genotype-to-phenotype discovery. These resources have already uncovered genetic associations for several important establishment and pathogenicity traits that can be exploited by developed genomic engineering approaches, such as CRISPR-enabled gene drives. To enable future biodesign on S. musiva, the team developed the first transformation system using a protein-based version of the CRISPR-Cas9 genome editing system.

Ultimately, the ongoing research will address fundamental knowledge gaps related to anthropogenic-assisted microbial invasions and guide future biosystems design strategies that safely prevent undesired modifications in DOE-managed ecosystems.