Genome sequence of the exopolysaccharide-producing Salipiger mucosus type strain (DSM 16094T), a moderately halophilic member of the Roseobacter clade

Salipiger mucosus Martínez-Cànovas et al. 2004 is the type species of the genus Salipiger, a moderately halophilic and exopolysaccharide-producing representative of the Roseobacter lineage within the alphaproteobacterial family Rhodobacteraceae. Members of this family were shown to be the most abundant bacteria especially in coastal and polar waters, but were also found in microbial mats and sediments. Here we describe the features of the S. mucosus strain DSM 16094T together with its genome sequence and annotation. The 5,689,389-bp genome sequence consists of one chromosome and several extrachromosomal elements. It contains 5,650 protein-coding genes and 95 RNA genes. The genome of S. mucosus DSM 16094T was sequenced as part of the activities of the Transregional Collaborative Research Center 51 (TRR51) funded by the German Research Foundation (DFG).


Introduction
The Roseobacter clade is a very heterogeneous group of marine Alphaproteobacteria that plays an important role in the global carbon cycle and other biogeochemical processes [1]. Members of this group form an allegedly monophyletic, physiologically heterogeneous, and metabolically versatile group of bacterioplankton [1]. They are known to live in the open ocean, especially in coastal areas, where they have frequently been found in symbiosis with algae, in microbial mats, in sediments, or associated with invertebrates; representatives of this lineage were also isolated from marine environments such as polar waters or sea ice [1][2][3][4], which is also reflected in their genome sequences [2]. Whereas some members of the Roseobacter clade contain the pigment bacteriochlorophyll a and are capable of aerobic anoxygenic photophosphorylation, other members were found to transform dimethylsulfoniopropionate into dimethylsulfide [4][5][6]. Some representatives of the Roseobacter lineage such as Salipiger mucosus A3T are also known to be moderate halophiles, which are adapted to a wide range of salinities and were found to produce special compounds such as compatible solutes, halophilic enzymes or exopolysaccharides [7][8][9]. Strain A3T (= DSM 16094T = LMG 22090T = CECT 5855T) represents the type strain of S. mucosus (initially proposed as 'S. muscescens') in the monotypic genus Salipiger [10] and was isolated from saline soil bordering a saltern on the Mediterranean Sea coast at Calblanque (Spain) [7]. The genus name Salipiger was derived from the Latin noun sal, salis ('salt') and the Latin adjective piger ('lazy') [10]. The species epithet mucosus refers to the Latin adjective mucosus ('slimy, mucous') [10]. Current PubMed records do not indicate any follow-up research with strain A3T after the initial description of S. mucosus [7] and the characterization of its exopolysaccharide [11]. In this study we analyzed the genome sequence of S. mucosus DSM 16094T, which was selected for sequencing under the auspices of the German Research Foundation (DFG) Transregio-SFB51 Roseobacter grant because of its phylogenetic position [12] and was also a candidate for the Genomic Encyclopedia of Bacteria and Archaea [13]. We describe the genome sequencing and annotation and present a summary classification together with a set of features for strain DSM 16094T, including novel aspects of its phenotype.

Classification and features

16S rRNA gene analysis
The single genomic 16S rRNA gene sequence of S. mucosus DSM 16094T was compared with the Greengenes database to determine the weighted relative frequencies of taxa and (truncated) keywords as previously described [14]. The most frequently occurring genera were Salipiger (21.2%), Pelagibaca (17.1%), Roseovarius (17.0%), Marinovum (13.1%) and Roseobacter (9.5%) (30 hits in total). For the single hit to a sequence from a member of the species, the average identity within high-scoring pairs (HSPs) was 100.0%, whereas the average coverage by HSPs was 97.2%. For the two hits to sequences from other members of the genus, the average identity within HSPs was 98.4%, whereas the average coverage by HSPs was 99.0%. Among all other species, the one yielding the highest score was 'Salipiger bermudensis' (DQ178660), which corresponded to an identity of 96.8% and an HSP coverage of 99.9%. (Note that the Greengenes database uses the INSDC (= EMBL/NCBI/DDBJ) annotation, which is not an authoritative source for nomenclature or classification.) The highest-scoring environmental sequence was AB302369 (Greengenes short name 'Hydrocarbon-Degrading Indonesian Seawater seawater isolate B44-2B44-2 str. B44-2'), which showed an identity of 97.4% and an HSP coverage of 95.4%. The most frequently occurring keywords within the labels of all environmental samples that yielded hits were 'aquat, rank' (4.9%), 'microbi' (3.7%), 'harbour, newport' (3.3%), 'water' (2.7%) and 'seawat' (2.4%) (219 hits in total), which is in line with the habitat from which strain A3T was isolated. Environmental samples that yielded hits of a higher score than the highest-scoring species were not found.
Figure 1 shows the phylogenetic neighborhood of S. mucosus strain DSM 16094T in a 16S rRNA gene sequence-based tree. The sequence of the single 16S rRNA gene copy in the genome does not differ from the previously published 16S rRNA gene sequence (AY527274), which contains two ambiguous base calls.

Morphology and physiology
Cells of strain A3T are pleomorphic and stain Gram-negative (Figure 2). They are 1 µm in width and 2.0-2.5 µm in length. Motility was not observed. They live a strictly aerobic and chemoheterotrophic lifestyle. Colonies grown on MY solid medium are circular, convex, cream-colored and mucoid, whereas growth in liquid medium is uniform. Cells are encapsulated. They are moderately halophilic and capable of growth in a mixture of sea salts from 0.5 to 20% (w/v), with an optimum between 3 and 6% (w/v) sea salts. When NaCl is used instead of sea salts, optimal growth occurs at a salt concentration of 9-10% (w/v). Cells of strain A3T grow within a temperature range of 20-40°C and a pH range of 6-10. Growth does not occur under anaerobic conditions, either by fermentation, fumarate or nitrate reduction, or photoheterotrophy. Cells are cytochrome oxidase and catalase positive. Polyhydroxyalkanoates (PHA) are stored as reserve material within the cells. H2S is produced from L-cysteine. Selenite reduction, gluconate oxidation and phosphatase activity were observed. Urea and Tween 20 are hydrolyzed. A variety of tested organic compounds were neither metabolized nor sustained growth; for details see [7]. The utilization of carbon compounds by S. mucosus DSM 16094T grown at 28°C was also determined for this study using Generation-III microplates in an OmniLog phenotyping device (BIOLOG Inc., Hayward, CA, USA). The microplates were inoculated with a cell suspension at a cell density of 95-96% turbidity and dye IF-A. Further additives were vitamin, micronutrient and sea-salt solutions, which had to be added for dealing with such marine bacteria [29]. The plates were sealed with parafilm to avoid a loss of fluid. The exported measurement data were further analyzed with the opm package for R [30,31], using its functionality for statistically estimating parameters from the respiration curves, such as the maximum height, and automatically translating these values into negative, ambiguous, and positive reactions. The reactions were recorded in three individual biological replicates.
An important physiological property, the halophilic lifestyle, could be confirmed by the OmniLog measurements, which showed that S. mucosus is able to grow in up to 8% NaCl. According to [7], the salt tolerance of this strain exceeds 10% NaCl.
The only detected respiratory lipoquinone was ubiquinone 10, which is a well-known characteristic of alphaproteobacterial representatives (all data from [7]).
Figure 1. Phylogenetic tree highlighting the position of S. mucosus relative to the type strains of the type species of the other genera within the family Rhodobacteraceae. The tree was inferred from 1,331 aligned characters of the 16S rRNA gene sequence under the maximum likelihood (ML) criterion as previously described [14]. Rooting was done initially using the midpoint method [15] and then checked for its agreement with the current classification (Table 1). The branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 650 ML bootstrap replicates (left) and from 1,000 maximum-parsimony bootstrap replicates (right) if larger than 60% [14]. (That is, the backbone of the tree is largely unresolved.) Lineages with type strain genome sequencing projects registered in GOLD [16] are labeled with one asterisk, those also listed as 'Complete and Published' with two asterisks [3,17].
Table 1 notes. Altitude not reported. Evidence codes: TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). Evidence codes are from the Gene Ontology project [28].

Genome sequencing and annotation

Genome project history
The genome of S. mucosus DSM 16094T was sequenced as part of the DFG-funded project TRR51 "Ecology, Physiology and Molecular Biology of the Roseobacter clade: Towards a Systems Biology Understanding of a Globally Important Clade of Marine Bacteria". The strain was chosen for genome sequencing according to the Genomic Encyclopedia of Bacteria and Archaea (GEBA) criteria [12,13]. Project information can be found in the Genomes OnLine Database [16]. The genome sequence is deposited in GenBank and the Integrated Microbial Genomes database (IMG) [38]. A summary of the project information is shown in Table 2.
Standards in Genomic Sciences

Growth conditions and DNA isolation
A culture of S. mucosus DSM 16094T was grown aerobically in DSMZ medium 512 [39] supplemented with 2.5% NaCl at a temperature of 30°C. Genomic DNA was isolated using the Jetflex Genomic DNA Purification Kit (GENOMED 600100) following the standard protocol provided by the manufacturer, modified by an incubation time of 60 min, incubation on ice overnight on a shaker, the use of an additional 50 μl proteinase K, and the addition of 100 μl protein precipitation buffer. The DNA is available from the Leibniz-Institute DSMZ through the DNA Bank Network [40].

Genome sequencing and assembly
The genome was sequenced using a combination of two genomic libraries (

Genome annotation
Genome annotation was carried out using the JGI genome annotation pipeline as previously described [17].

Genome properties
The genome statistics are provided in Table 3 and Figure 3. The genome of strain DSM 16094T has a total length of 5,689,389 bp and a G+C content of 67.1%. Of the 5,745 genes predicted, 5,650 were protein-coding genes and 95 were RNA genes. The majority of the protein-coding genes (76.0%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.

Insights into the genome

Plasmids and phages
Genome sequencing of S. mucosus DSM 16094T resulted in 30 scaffolds. In the species description, it was reported that this strain contains at least seven plasmids (550, 467, 184, 140.8, 110.6, 98.2 and 30.8 kb) [7]. However, the identification of plasmids in the genome was difficult because typical replication modules comprising the characteristic replicase and the adjacent parAB partitioning operon are missing [41]. Nevertheless, comprehensive BLASTP searches with plasmid replicases from Rhodobacterales revealed the presence of three RepA and two RepB genes, whereas RepABC-type and DnaA-like replicases were absent from the genome sequence. General genomic features of the chromosome and four putative extrachromosomal replicons are listed in Table 5, whereas locus tags of the replicases and the large virB4 and virD4 genes of type IV secretion systems are presented in Table 6. The localization of the chromosomal replication initiator DnaA documents that (at least) scaffold 3 represents the chromosome. The 350 kb scaffold 5 contains a RepA-a type replicase [42] and a characteristic type IV secretion system (T4SS) comprising the relaxase VirD2 and the coupling protein VirD4 as well as the complete virB gene cluster for the transmembrane channel (Table 6) [43]. It probably represents a mobilizable extrachromosomal element. Scaffolds 18 and 28 contain RepA-b and RepB-I type replicases, respectively (Table 5), and may represent two additional plasmids of this species. However, the presence of plasmid replicases does not unequivocally correlate with extrachromosomal elements, as these genes may also reflect inactivated orphans or pseudogenes. Thus, the total number of S. mucosus plasmids cannot be exactly determined based on the draft genome sequence.
A potential fourth plasmid is represented by the large 702 kb scaffold 2, which contains a RepA-c as well as a RepB-II replicase (salmuc_01514; salmuc_01780). However, the presence of typical CRISPRs representing the defense system against phage attacks [44] favors a chromosomal affiliation for scaffold 2. The genome sequence of S. mucosus DSM 16094T suggests that this strain is frequently attacked by phages. Regions of genes related to prophages are found at several sites throughout the genome (e.g., salmuc_02795-02809 and salmuc_02619-02632). Several genes encoding Cas proteins (salmuc_01330, salmuc_01331 and salmuc_01333) indicate that a CRISPR defense system is functional in this strain. The large number of phage-related genes integrated into the genome could mediate frequent rearrangements of the DNA structure in this strain. Furthermore, this could indicate a possible exchange of genes with other species attacked by similar phages.
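As a rough plausibility check on the plasmid content discussed in this section, the seven plasmid sizes reported in the species description [7] can be summed and related to the length of the draft genome. The following is only an illustrative sketch in Python; the function name is hypothetical, and the calculation assumes that the reported size estimates and the draft assembly length are directly comparable, which may not hold for a draft genome of 30 scaffolds.

```python
# Plasmid sizes (kb) as reported in the species description [7].
PLASMID_SIZES_KB = [550, 467, 184, 140.8, 110.6, 98.2, 30.8]

# Total length of the draft genome sequence (bp) as given in this report.
GENOME_BP = 5_689_389


def plasmid_fraction(sizes_kb, genome_bp):
    """Return (total plasmid size in kb, fraction of the genome length).

    Hypothetical helper for illustration only; it simply sums the
    reported sizes and divides by the assembly length.
    """
    total_kb = sum(sizes_kb)
    return total_kb, total_kb * 1000 / genome_bp


total_kb, fraction = plasmid_fraction(PLASMID_SIZES_KB, GENOME_BP)
print(f"Reported plasmids: {total_kb:.1f} kb in total, "
      f"{fraction:.1%} of the draft genome length")
```

Under these assumptions, the reported plasmids would add up to roughly 1.58 Mb, which illustrates why replicon assignment matters when interpreting the scaffolds.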

Morphological traits reflected in the genome
Analysis of the genome sequence of S. mucosus DSM 16094T revealed the presence of a high number of genes putatively associated with the biosynthesis and production of exopolysaccharides (salmuc_00030, salmuc_00724, salmuc_01174, salmuc_01693, salmuc_02911, salmuc_03919, salmuc_04853 and salmuc_05511). This finding is in accord with a study by Llamas and colleagues, who characterized the exopolysaccharides produced by strain A3T in detail [11]. Interestingly, genes putatively associated with cellulose synthesis (salmuc_02978 and salmuc_02979) were also found. Surprisingly, many genes involved in flagellar motility (e.g., salmuc_02151-salmuc_02191, salmuc_04184-04236) and chemotaxis (e.g., salmuc_03613-03617) were observed, although this strain was described as non-motile in the original species description [7]. Genes associated with the synthesis of polyhydroxyalkanoates as storage compounds (e.g., salmuc_03738, salmuc_03739 and salmuc_05206) as well as phasin (salmuc_03343) were also found.

Nutrient limitation
Many saline environments, e.g., the central oceans, are characterized by a limitation of the essential nutrients iron and phosphorus. S. mucosus appears to possess several mechanisms to overcome growth limitation caused by depletion of both elements. Iron is mainly acquired by the synthesis of siderophores and transported into the cell in its chelated form. Genes encoding ABC transporters for siderophores of the hydroxamate type (salmuc_02461-02463 and salmuc_02667-02669), as well as for hemin-bound iron (salmuc_00710-00712), were found. To satisfy the need for phosphorus, strain DSM 16094T is able to mobilize organic phosphonates as alternatives to phosphate, which is indicated by a continuous array of 22 genes (salmuc_00786-00807) involved in the uptake and utilization of phosphonates.
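The locus tag ranges quoted in this report can be expanded programmatically, for example to confirm that the phosphonate cluster spans 22 tags. The sketch below is a minimal Python illustration; the helper name and the regular expression are hypothetical, and it assumes that locus tags are numbered consecutively, so that a range covers every tag between its endpoints.

```python
import re

# Matches ranges such as "salmuc_00786 -00807", "salmuc_00786-00807"
# or "salmuc_02151 -salmuc_02191" (prefix optionally repeated).
RANGE = re.compile(
    r"(?P<prefix>[A-Za-z]+)_(?P<start>\d+)\s*-\s*(?:(?P=prefix)_)?(?P<end>\d+)"
)


def expand_locus_range(text):
    """Expand a locus tag range into the list of individual locus tags.

    Hypothetical helper; assumes consecutive numbering of locus tags.
    """
    match = RANGE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a locus tag range: {text!r}")
    width = len(match["start"])  # keep the zero padding of the source tag
    start, end = int(match["start"]), int(match["end"])
    if end < start:
        raise ValueError(f"range end precedes start: {text!r}")
    return [f"{match['prefix']}_{i:0{width}d}" for i in range(start, end + 1)]


phosphonate_cluster = expand_locus_range("salmuc_00786 -00807")
print(len(phosphonate_cluster), phosphonate_cluster[0], phosphonate_cluster[-1])
```

The same helper can be applied to the other ranges given in the text, such as the siderophore transporter genes, provided the consecutive-numbering assumption holds for them.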

Metabolic plasticity
In contrast to the published description of S. mucosus [7], which suggests a strictly aerobic and chemoheterotrophic metabolism, the genome reveals an astonishing metabolic versatility. Besides genes for the degradation of organic substrates, we also found genes encoding enzymes for the utilization of alternative electron donors enabling facultative lithotrophic growth: a Sox multienzyme complex encoded by the genes salmuc_00587-00597 could be utilized for the oxidation of thiosulfate to sulfate, while molecular hydrogen may be utilized as electron donor by a multimeric uptake hydrogenase of the [NiFe]-type (salmuc_04814-04830). A further potential substrate is carbon monoxide, which might be oxidized by an aerobic-type carbon monoxide dehydrogenase encoded by the genes salmuc_05576-05578. Additional genes encoding subunits of carbon monoxide dehydrogenase were found dispersed at several sites in the genome. The metabolic plasticity of this species is further reflected in a multiply branched electron transport chain. The cascade starts from an NADH dehydrogenase (salmuc_03065-03088) or succinate dehydrogenase complex (salmuc_00519-00521), where ubiquinone is reduced to ubiquinol. Electrons can either be transferred from ubiquinol via a terminal cytochrome bd ubiquinol oxidase (salmuc_05386-05387) directly to oxygen, or transferred to a cytochrome bc1 complex reducing cytochromes. Reduced cytochromes can then interact with terminal oxidases reducing oxygen. Genes for at least two different cytochrome c oxidases were detected, being either of a putative caa3-type (salmuc_05284-05285) or cbb3-type (salmuc_00548-00551). The chemiosmotic gradient generated in the electron transport chain can be used for the synthesis of ATP by an ATP synthase complex of the FoF1 type (salmuc_01101-01110). According to the genome sequence, there is also the possibility that nitrate could be used as an alternative electron acceptor in the absence of oxygen.
In addition to a periplasmic nitrate reductase of the Nap-type (salmuc_04127-04129), genes for a copper-containing dissimilatory nitrite reductase (salmuc_05547), a nitric oxide reductase (salmuc_05554 and salmuc_05555) and a nitrous oxide reductase (salmuc_04123) were detected, resulting in a complete pathway for denitrification of nitrate to molecular nitrogen. Interestingly, the genome sequence of S. mucosus DSM 16094T further revealed the presence of a high number of genes associated with putative photoautotrophy. Next to a photosynthesis gene cluster (salmuc_05125-05164), RuBisCO-associated genes (salmuc_03532-03534) involved in the fixation of CO2 via the Calvin-Benson cycle (salmuc_03531-03539) were observed. The presence of such genes indicates putative photoautotrophic growth under certain conditions. The genome of this strain also encodes a blue light-activated photosensor (BLUF, salmuc_00318) that may play a role in the light-dependent regulation of photosynthesis genes. It is tempting to speculate that a genetic inventory allowing photoautotrophy reflects an evolutionary position at the root of the Roseobacter clade. Several members of this lineage are known to be capable of an aerobic photoheterotrophic metabolism, whereas photoautotrophic growth has not been reported yet. By analogy with the scenario proposed for the evolution of aerobic photoheterotrophic Gammaproteobacteria [45,46], representatives of the Roseobacter clade may have lost genes for CO2 fixation following adaptation to aerobic environments characterized by electron donor limitation, thereby preventing utilization of the Calvin-Benson cycle, which demands an abundant supply of reducing power and energy. However, none of the novel metabolic traits predicted from the genome sequence could be verified experimentally in our laboratory so far. One explanation may be that under unfavorable growth conditions, e.g., anaerobiosis, lysogenic phages become activated, so that growth does not become apparent.