Genomes of Sea Microbes

By Mary Ann Moran and E. Virginia Armbrust

Mary Ann Moran (mmoran@uga.edu) is Professor, Department of Marine Sciences, University of Georgia, Athens, GA, USA. E. Virginia Armbrust is Professor, School of Oceanography, University of Washington, Seattle, WA, USA.

Genomics is the study of the genetic information encoded by all the nucleotides possessed by an organism. For oceanographers, genomics provides detailed information on the many genes that drive biogeochemical activities of ocean-dwelling microbes. Carbon fixation, nitrogen fixation, sulfur gas formation, CO2 production, and many other critical processes are underlain by the collective action of genes inside individual microbial cells in the ocean environment. In essence, genomics provides access to those genes and serves as an important step toward understanding their role in the ocean environment.

The study of genes in the ocean had its start more than a decade ago when the new field of molecular ecology first allowed oceanographers to measure the diversity and distribution of selected protein-encoding genes. Polymerase chain reaction (PCR) amplification and sequencing of nifH genes led the way, providing information on the distribution of nitrogen-fixing organisms in marine environments and investigating the factors that limit their activity (Zehr and Capone, 1996). Similarly, studies of expression patterns of rbcL, the gene encoding a subunit of the major enzyme for carbon fixation, provided some of the first glimpses into cellular-level regulation of a major biogeochemical process in the ocean (Paul, 1996). These molecular oceanography studies were important predecessors to genomic oceanography studies, but they differ in two significant ways. First, the scale of genomics is grander, focusing on all the genes harbored by a marine organism simultaneously rather than just a few genes at a time. Second, genomics includes a large measure of discovery and does not require that the target genes be identified beforehand. Thus, it is not actually necessary to know what you are looking for in order to find genes that are novel or informative.

The difficulty of culturing every microbe in the sea and the dilemma of maintaining millions of microbial cultures simultaneously mean that only a small fraction of marine microbial diversity can be studied as pure cultures. Selected model organisms have therefore begun to play an increasingly important role in oceanography. Because they can be grown in the laboratory and readily incorporated into experiments, ecologically relevant, culturable microbes provide a window into the ecology and physiology of their uncultured relatives. Put simply, they allow us to develop hypotheses about the uncultured multitude. The recent availability of genome sequences has greatly enhanced the value of model marine microbes, allowing manipulative experiments to be designed and interpreted much more specifically and insightfully. A few examples of sequenced marine microbes emerging as important model organisms in oceanography are: Prochlorococcus marinus (an abundant marine cyanobacterium; Rocap et al., 2003), Pelagibacter ubique (a member of the SAR11 clade; Giovannoni et al., 2005), Silicibacter pomeroyi (a Roseobacter; Moran et al., 2004), Rhodopirellula baltica (a representative of the marine planctomycetes; Glöckner et al., 2003), Thalassiosira pseudonana (a marine diatom; Armbrust et al., 2004), and Ostreococcus tauri (a marine prasinophyte; Derelle et al., 2006).

"Pure-culture genomics" is distinct from but complementary to "metagenomics." In the latter (see Edwards and Dinsdale, this issue), organisms are sequenced without culturing and in the context of the other members of the community. In this chapter, we will focus on the former, addressing both the value of genomes of cultured marine microbes for tackling major questions in oceanography and the intrinsic synergism between pure-culture genomics and metagenomics.

WHICH GENOMES SHOULD BE SEQUENCED?

DNA sequencing is becoming ever more rapid and inexpensive, putting marine microbial genomes within a few days' reach of a major sequencing center. Nonetheless, nontrivial postsequencing investments in genome analysis and annotation make organism selection a very important task.

Decisions about which marine prokaryotes to sequence began over a decade ago, with metabolic novelty serving as a major criterion. The hydrothermal-vent-dwelling, methane-generating Methanocaldococcus jannaschii was the first marine prokaryote sequenced (Bult et al., 1996), followed by the hyperthermophilic, sulfate-reducing Archaeoglobus fulgidus (Klenk et al., 1997) [...] and sense their environment. Many of the next wave of marine prokaryote genomes, however, were selected with ecological relevance rather than novelty as a primary criterion. Molecular taxonomic surveys of 16S rRNA genes revealed the identity of the major marine prokaryotic taxa (Giovannoni and Rappé, 2000; Suzuki and DeLong, 2002), and obtaining a genome sequence of representatives from as many of these groups as possible became a priority. This resulted in genome sequences from the Cyanobacteria, Roseobacter, SAR11, Flavobacteria, and Planctomycetes groups (Rocap et al., 2003; Moran et al., 2004; Giovannoni et al., 2005; Bauer et al., 2006; Glöckner et al., 2003) that are now being used to explore the genetic basis for ecological strategies and biogeochemically relevant activities of marine prokaryotes (Moran, in press).

The selection of marine eukaryotes for sequencing has generated much more heated discussion than selection of prokaryotes because of the larger scientific investment per genome. Relatively strict selection criteria had been developed to identify which medically relevant eukaryotes should be sequenced. For example, only organisms with a long history of being easily maintained and manipulated in numerous laboratories using molecular or genetic techniques were considered. But the marine environment is characterized by incredibly diverse microbial communities, and thus relatively few representative model marine microbes existed that could be easily manipulated in the laboratory. Instead, the strongest motivators were to identify the most ecologically important organisms and then put them in prior- [...]

In the past couple of years, sequenc- [...] photosynthetic organisms have made their way into sequencing queues, and their whole genome sequences have either been completed or are well underway. The list includes:
• three additional diatoms: (1) Phaeodactylum tricornutum, for which the most advanced tools for molecular biology and genetics are available, (2) Pseudo-nitzschia multiseries, which can produce the neurotoxin domoic acid and [...] (3) Fragilariopsis cylindrus, which is restricted to polar environments.
• the haptophyte Emiliania huxleyi, which plays a critical role in the production of calcium carbonate (Table 2).

[...] on nitrogen metabolism in marine phytoplankton. A second example comes from analysis of genomes of the green algae Ostreococcus tauri (Derelle et al., 2006) and Ostreococcus lucimarinus (Palenik et al., 2007). Computer-based analyses indicate that both organisms possess a chromosome with sequence features completely unlike those of other chromosomes, suggesting possible transfer of genes to both species from different organisms. This stunning result suggests that DNA may be more readily exchanged between eukaryotic organisms than previously suspected. A last example for marine eukaryotes has to do with the manner by which photosynthetic microorganisms convert carbon dioxide (CO2) into organic carbon. In the ocean, most inorganic carbon is present as bicarbonate rather than CO2. Because the enzyme that catalyzes the first step in the generation of organic carbon is specific for CO2, most phytoplankton must somehow concentrate CO2 intracellularly (Giordano et al., 2005). Careful analyses of multiple phytoplankton genomes now suggest that the C4 pathway, an alternative pathway for carbon fixation present in some land plants, may provide a mechanism for concentrating CO2 (Derelle et al., 2006). This pathway is unexpected in marine phytoplankton, but may enhance their ability to fix carbon; thus, it has implications for understanding the global carbon cycle. This last example acts as a reminder that new discoveries about how marine microbes work can be found even in the most familiar places. Although these examples came from single eukaryotic organisms maintained in culture, they provide new avenues to explore in natural communities.

The value of genome sequences for revealing unexpected traits of microorganisms in the ocean has been demonstrated for prokaryotes as well. Rhodopsin genes in surface-water bacterioplankton (Béjà et al., 2000) and ammonia oxidation by marine archaea (Könneke et al., 2005) are two of the most dramatic examples of unanticipated discoveries of significant biogeochemical importance, although many others can be found. For example, discovering Type IV secretion systems in marine bacterial genomes was surprising (Moran et al., in press) because these systems are known to encode DNA export to eukaryotic cells (Dolowy et al., 2005), such as for the initiation of gall formation in the plant pathogen Agrobacterium tumefaciens (Christie et al., 2005). Because about half of the Roseobacter genomes contain Type IV secretion homologs, could these systems indicate widespread ability in this taxon [...] with marine dinoflagellates (Alavi et al., 2001; Strompl et al., 2003), suggesting bacterial-dinoflagellate interactions as a possible role for these Type IV secretion systems (Worden et al., 2006).

ENRICHING THE DETAILS

Along with the discovery of the unexpected, genome sequences of cultured marine microbes also provide new details about well-recognized processes. In a recent example, discovery of the genes mediating two critical steps in the sulfur cycle was made possible largely by pure-culture genomics. In these steps, an organic sulfur compound produced by marine phytoplankton (dimethylsulfoniopropionate, or DMSP) is converted by marine bacteria via one of two competing pathways (Kiene et al., 2000; Sievert et al., this issue). The first pathway leads to dimethylsulfide (DMS), a volatile sulfur compound that is readily transferred from the ocean to the atmosphere; the second leads to less-volatile compounds that are assimilated by bacteria (Figure 2). Which pathway dominates in marine surface waters has major implications for global temperature regulation because DMS exchanged across the ocean-atmosphere boundary affects cloud formation and global temperature, while non-DMS degradation products provide both sulfur and carbon to the marine microbial food web. Although the biochemical basis for the two competing pathways has heretofore been unknown, genome sequences of two cultured marine bacteria have recently led to identification of the DMSP demethylase gene (dmdA) that mediates DMSP degradation to non-DMS fates (Howard et al., 2006) and the DMSP cleavage gene (dddD) that mediates DMSP degradation to DMS (Todd et al., 2007) (Figure 2). With these two gene sequences in hand, understanding the regulation of this critical step in the marine sulfur cycle is on the horizon.

The silicon cycle is another important biogeochemical cycle, but the biology underlying silicon utilization remains poorly understood. Diatoms require silicon to produce elaborately patterned cell walls composed primarily of silicon, and they process about 7 billion metric tons of silicon on a yearly basis in doing so. Early insights into how diatoms use silicon to create their cell walls relied on painstaking biochemical analyses that by necessity were conducted with just one diatom, Cylindrotheca fusiformis. This work led to discovery of a novel class of lysine- and serine-rich phosphorylated proteins known as silaffins, which are involved in the precipitation of silica to create the diatom cell wall (Kröger et al., 1999). Analysis of the T. pseudonana genome not only identified additional silaffin genes (Poulsen and Kröger, 2004) but comparative analyses also highlighted important features of the encoded proteins. All the genes identified thus far that play a role in silicon utilization in diatoms are unique to these organisms, and identification of additional silicon-related genes will rely increasingly on the use of postgenomic techniques.

These two examples, one from a prokaryote and another from a eukaryote, illustrate how pure-culture genomics opens windows into understanding the critical details of global-scale processes such as the sulfur and silicon cycles. Pure-culture genomics has indeed begun to fundamentally change our understanding of who is doing what in the ocean, and how they are doing it.

LINKAGES TO METAGENOMICS

As the value of genomics to biological oceanography is becoming apparent, the next goal is to blend pure-culture genomics with metagenomics. The potential synergism of these two approaches is evident in the work of Coleman et al. (2006), who used the genome sequence of cultured Prochlorococcus strain MIT9312 to align sequence fragments from wild Prochlorococcus populations from the Sargasso Sea metagenome (Figure 3). Although most of the MIT9312 genome was well represented in the Sargasso Sea sequences, obvious gaps were found at "genomic islands." These may represent our first glimpse at the genetic basis for niche differentiation among Prochlorococcus species because the islands can provide a reservoir of interchangeable genes encoding ecologically important functions such as photoinhibition, nutrient uptake (amino acids, manganese/iron), nutrient stress response, and phage resistance, among others (Coleman et al., 2006). A similar fragment recruitment approach was carried out with sequences from the larger Global Ocean Sampling expedition metagenome against several [...]

Figure 1. Hypothesized nitrogen and phosphorus utilization abilities for marine microbial taxa based on genes with sequence similarity to those whose role in N or P uptake and processing is known. Dashed line indicates utilization pattern may be variable within the group.

Figure 2. The availability of genome sequences of two cultured marine bacteria (Silicibacter pomeroyi and Marinomonas sp. MWYL1) led to the discovery of two key genes mediating dimethylsulfoniopropionate (DMSP) degradation. The genes encode the first enzyme in two competing pathways with vastly different ecological fates. The dmdA routes carbon and sulfur from DMSP to the marine microbial food web. The dddD degrades DMSP to a volatile sulfur compound that plays a critical role in the atmospheric sulfur pool.

Figure 3. Synergism between pure-culture genomics and metagenomics is evident in genomic scaffolding analyses, in which metagenomic fragments are aligned against the genome sequence of a cultured marine microbe. Cultured organisms provide complete genome sequences and access to physiology, while metagenomic sequences show the abundance and distribution of genes and genome fragments from organisms that have not been cultured. Top: Metagenomic sequences from the Sargasso Sea were aligned to the genome sequence of cultured Prochlorococcus strain MIT9312 to show regions of shared genes typical of most Prochlorococcus as well as regions of variable genes (genomic islands indicated by shading) that may be important in defining the ecological niche of strains (redrawn from Coleman et al., 2006). Bottom: Metagenomic sequences from the Global Ocean Sampling expedition are aligned to the genome sequence of Pelagibacter ubique HTCC1062 to show population differences in coastal North American samples (yellow colors at left) versus Sargasso Sea samples (red colors at right) (redrawn from Rusch et al., 2007).
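The fragment recruitment idea behind Figure 3 (aligning metagenomic fragments to the genome of a cultured strain and looking for poorly covered stretches) can be illustrated with a short sketch. This is not the procedure used by Coleman et al. (2006) or Rusch et al. (2007); it is a minimal illustration that assumes the alignments have already been computed elsewhere and reduced to (start, end) positions on the reference genome. The function names, the window size, and the 20% threshold are all hypothetical choices made for the example.

```python
from typing import List, Tuple


def window_coverage(genome_length: int,
                    hits: List[Tuple[int, int]],
                    window: int) -> List[float]:
    """Mean read depth in consecutive windows along a reference genome.

    `hits` are half-open (start, end) alignment intervals in reference
    coordinates (an assumed input format, not a standard file format).
    """
    # Difference array: +1 where a fragment starts, -1 where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in hits:
        start, end = max(0, start), min(genome_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    coverage = []
    depth = 0   # running per-base depth (prefix sum of diff)
    total = 0   # summed depth inside the current window
    for pos in range(genome_length):
        depth += diff[pos]
        total += depth
        full = (pos + 1) % window == 0
        if full or pos == genome_length - 1:
            size = window if full else genome_length % window
            coverage.append(total / size)
            total = 0
    return coverage


def candidate_islands(coverage: List[float],
                      window: int,
                      genome_length: int,
                      fraction: float = 0.2) -> List[Tuple[int, int]]:
    """Reference ranges whose coverage is below `fraction` of the
    (upper) median window coverage; adjacent windows are merged."""
    median = sorted(coverage)[len(coverage) // 2]
    threshold = fraction * median
    islands: List[Tuple[int, int]] = []
    for i, cov in enumerate(coverage):
        if cov < threshold:
            start = i * window
            end = min((i + 1) * window, genome_length)
            if islands and islands[-1][1] == start:
                islands[-1] = (islands[-1][0], end)  # extend previous range
            else:
                islands.append((start, end))
    return islands


# Toy example: fragments recruit to positions 0-39 and 60-99 of a
# 100-base "genome", leaving an uncovered stretch in between.
toy_hits = [(s, s + 10) for s in range(0, 31, 2)] + \
           [(s, s + 10) for s in range(60, 91, 2)]
toy_cov = window_coverage(100, toy_hits, 10)
print(candidate_islands(toy_cov, 10, 100))  # [(40, 60)]
```

In this toy case the uncovered stretch plays the role of a genomic island: a region present in the cultured reference strain but rarely matched by fragments from the wild population. A real analysis would also need to account for alignment identity and uneven sequencing depth, which this sketch ignores.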
[...] figures, T. Mock and M. Parker for useful discussion, and the Gordon and Betty Moore Foundation for grant support.

Table 2. Marine eukaryotes with genome sequences as of April 2007.