Genomic Science Program
U.S. Department of Energy | Office of Science | Biological and Environmental Research Program

GTL and Beyond: Data Standards Workshop

September 10–11, 2003
San Francisco, California
Hotel Nikko

Adam Arkin and Nancy Slater, Berkeley GTL Project

The overall goal of the two days of workshop collaborations is to produce a report that clearly outlines the short-, mid- and long-term theoretical, analytical and computational needs of systems microbiology projects. Specifically, we will develop a roadmap towards Facility IV: Analysis and Modeling of Cellular Systems. This workshop complements the modeling and simulation workshop hosted by ORNL, PNNL, OBER and OASCR in July. The GTL and Beyond workshop will benefit from the results of the July meeting, and we will expand the scope to address the biological needs driving Facility IV and systems biology.

The mission of Facility IV is to be the integrative facility for understanding cellular physiology and population dynamics, and it is the closest Facility in spirit and operation to the existing GTL projects. Thus, the first day will be dedicated to understanding the current and growing needs of the GTL projects. This workshop will specifically address the management, curation and distribution of data and results; the need to determine how each project’s results work together with those of the other GTL projects; and the shaping of the facilities to meet the community’s needs. The current GTL leaders attending the Day 1 workshop will provide practical, experienced advice to the workshop on Day 2, in addition to defining near-term needs for data standardization, data sharing, analysis, and centralization.

The second day will be open to the larger community, and we will discuss methods for progressing from individual GTL projects to the larger Facility plan. In particular, the issues to be addressed are: the design of a systems biology facility like Facility IV; requirements of facility-like capabilities and services; the facility’s impact on the practice of experimental and practical systems microbiology and applied microbiology; the development and deployment of these services; and the open theoretical, computational and experimental challenges that need to be met for such efforts to be successful.

Throughout the workshops, a professional note-taker will be present to compile the minutes of the meetings, and a captain will be charged with creating a PowerPoint summary of each section. At the end of the workshop, groups will be charged with writing short sections of a meeting report to guide development of computational and systems biology facilities such as Facility IV and the data management/analysis/serving pieces of the other three facilities.

GTL ON THE GROUND: Data and Computational Needs Workshop

The purpose of this first workshop is for the current GTL performers to collate what they have learned about performing large-scale systems microbiology projects with regard to experimental design, laboratory information management, data management, data analysis and modeling. Each project will be expected to present two slides outlining its current means of generating, organizing, analyzing and deducing models from its data (if appropriate) and the projected scale of this operation over the next few years. The near-term needs for data standards, data sharing and data analysis will be identified. The group will then split up to develop focused recommendations on data management, bioinformatics and data analysis, and experimental designs for systems biology and modeling. Leaders of these groups will return to present their conclusions to the workshop, and these notes will then be combined into a presentation for Day 2.

September 10, 2003
8:00 a.m. Continental Breakfast
8:30 a.m. Welcome, Introductions, and Workshop Goals
8:45 a.m. State of the GTL Nation Address: OBER
9:15 a.m. Talk 1: What is Systems Biology? What is necessary before models become the dominant activity?
9:45 a.m. Talk 2: Data management, Analysis and Publication. How should data from these projects be captured and preserved? Is there a network effect to combining and serving GTL results?
10:15 a.m. Break
10:45 a.m. GTL Scientists present their two slides on their

  1. current data generation capabilities, data rates, management approaches, scale-ups and roadblocks
  2. current analytical approaches, software used and what they hope to get from their data
12:00 p.m. Working Lunch (provided)
1:00 p.m. Breakout Sessions

  1. Database Design, Federation, Curation and Data Transport for Systems Biology Information
  2. Bioinformatic analyses: Annotation, comparative genomics, functional genomic analysis towards pathways, modules, and models
  3. (Multicellular) Network analysis: From regulatory networks to populations; from experimental designs to interaction with data and models.
3:00 p.m. Break
3:30 p.m. Discussion of action items: Should there be a scientific face for the GTL program where data and results are systematized and served? How do we work together to accomplish this or ensure other means of preserving, synergizing and publicizing results from the projects and beyond? What do we see as the role of all four facilities in serving the current GTL needs?
6:00-8:00 p.m. Dinner hosted by LBNL

GTL AND BEYOND: Applied Systems Microbiology Towards Facility IV

On this day of the workshop, the wide community of invitees, spanning the National Laboratories, academia, industry and representatives from the DOE, will meet to assess the needs of the applied and environmental microbiology community that will drive Facility IV: Analysis and Modeling of Cellular Systems. We will define the capabilities and services needed by the community that are best served by location in a large Facility, describe the operation of the Facility in both research and service, and provide a detailed outline of the technological needs and challenges that such a Facility will face.

The day will start with a summary of the current thinking at DOE on Facility IV and related problems in computation. We will include a summary of the July workshop and a summary of Day 1. The first focused workshop talk will aim at reaching a consensus on the biological drivers for Facility IV efforts. A roundtable discussion will then integrate these biological needs with an assessment of which experimental designs are becoming most useful in this multiscale systems biology and which experimental needs do not yet have high-value, high-throughput designs. The second roundtable will define the data management, data analysis and modeling technologies that are most proven, most promising and most needed. Both groups will outline outstanding challenges in the area. The workshop will then split into two working groups that will write the outline of the report, one focusing on biological planning and the other on computational planning. Finally, the groups will come together to discuss the interaction between the two planning activities and to plan for the production of an integrated Facility IV report.

September 11, 2003
8:00 a.m. Continental Breakfast
8:30 a.m. Welcome, Introductions, and Workshop Goals
8:45 a.m. Summary of Genomes to Life Program: Computation and Experiment and Facility IV
9:15 a.m. Biological drivers for Facility IV efforts and summary of GTL I day 1 workshop
9:45 a.m. Roundtable discussion of state of the art and potential near term goals for GTL systems biology experiments
10:30 a.m. Break
10:45 a.m. Roundtable discussion of state of the art and potential near term goals for GTL systems biology computation
11:30 a.m. Panel Led Summary of Roundtable Discussion
12:00 p.m. Working Lunch (provided)
1:00 p.m. Breakouts

  • Computation Planning Summary
  • Biological Planning Summary
3:00 p.m. Break
3:30 p.m. Discussion of process for development of workshop report, assignments for workshop participants
4:45 p.m. Summary remarks
5:00 p.m. Adjourn

We welcome suggestions for other invitees and comments on the program.

Please send any questions or correspondence to Nancy Slater at [email protected].