Complete genome sequence of the sulfate-reducing firmicute Desulfotomaculum ruminis type strain (DLT)

Desulfotomaculum ruminis Campbell and Postgate 1965 is a member of the large genus Desulfotomaculum which contains 30 species and is contained in the family Peptococcaceae. This species is of interest because it represents one of the few sulfate-reducing bacteria that have been isolated from the rumen. Here we describe the features of D. ruminis together with the complete genome sequence and annotation. The 3,969,014 bp long chromosome with a total of 3,901 protein-coding and 85 RNA genes is the second completed genome sequence of a type strain of the genus Desulfotomaculum to be published, and was sequenced as part of the DOE Joint Genome Institute Community Sequencing Program 2009.


Introduction
Strain DL T (= DSM 2154 = ATCC 23193 = NCIMB 8452) is the type strain of the species Desulfotomaculum ruminis [1], one out of currently 30 species with validly published names in the paraphyletic genus Desulfotomaculum [2,3]. Strain DL T was initially isolated by G. S. Coleman in the 1950s from the rumen of hay-fed sheep [4]. Dissimilatory reduction of sulfate to sulfide in the rumen was first demonstrated by Lewis [5], who dosed fistulated sheep with sulfate and determined the amount of sulfide produced. As high amounts of sulfide may be toxic to animals, bacterial sulfate-reduction in ruminants was a concern due to the presence of sulfate in grass and hay. D. ruminis represented the first pure culture of a sulfate-reducing bacterium isolated from the rumen. The genus name was derived from the Latin words 'de', from, 'sulfur', sulfur, and 'tomaculum', a kind of sausage, meaning 'a sausage-shaped sulfate reducer' [2,6]. The species epithet is derived from the Latin word 'rumen', throat, first stomach (rumen) of a ruminant, meaning of a rumen [1,2]. Here, we present a summary classification and a set of features for D. ruminis strain DL T , together with the description of the complete genomic sequencing and annotation. The complete genome sequence of strain DL T will provide valuable information for defining a more adequate description of the currently paraphyletic genus Desulfotomaculum.

Classification and features
A representative genomic 16S rRNA sequence of D. ruminis DSM 2154 T was compared using NCBI BLAST [7,8] under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent release of the Greengenes database [9] and the relative frequencies of taxa and keywords (reduced to their stem [10]) were determined, weighted by BLAST scores. The most frequently occurring genera were Desulfotomaculum (88.3%), Pelotomaculum (7.9%), Cryptanaerobacter (2.8%) and 'Catabacter' (1.0%) (60 hits in total). Regarding the four hits to sequences from members of the species, the average identity within HSPs was 99.1%, whereas the average coverage by HSPs was 86.1%. Regarding the 41 hits to sequences from other members of the genus, the average identity within HSPs was 93.2%, whereas the average coverage by HSPs was 90.7%. Among all other species, the one yielding the highest score was Desulfotomaculum putei (HM228397), which corresponded to an identity of 94.1% and an HSP coverage of 98.5%. (Note that the Greengenes database uses the INSDC (= EMBL/NCBI/DDBJ) annotation, which is not an authoritative source for nomenclature or classification.) The highestscoring environmental sequence was EU307084 ('Changes microbial metabolic and along hydrogeochemically variable profile unsaturated horizon soil aggregate clone A Ac-2 12'), which showed an identity of 97.5% and an HSP coverage of 98.4. D. ruminis has not only been found in the rumen of animals, but also in other environments [11,12]. Therefore, the distribution of D. ruminis is not restricted to the rumen of animals. Hence, it is likely that this species enters the digestive tract of ruminants via food contaminated by spores. Figure 1 shows the phylogenetic neighborhood of D. ruminis in a 16S rRNA based tree of type strains. The sequences of the five 16S rRNA gene copies in the genome differ from each other by up to two nucleotides, and differ by up to three nucleotides from the previously published 16S rRNA sequence (Y11572), which contains three ambiguous base calls. Cells of D. ruminis DL T are slightly curved rods with rounded ends 2-6 µm in length and 0.5-0.7 µm in width (Table 1 and Figure 2) [1,4]. Cells stain Gram-negative and form oval subterminal spores that slightly swell the cells. A slight tumbling motility is due to peritrichous flagellation [1]. Strain DL T grows optimally at 37°C, but not above 48°C [1]. The pH range for growth is 6.0-8.5 with an optimum between pH 6.0 and 7.0 [4]. D. ruminis strains are obligately anaerobic and can grow chemoheterotrophically with lactate, pyruvate, ethanol or alanine as well as mixotrophically with hydrogen or formate as electron donor and acetate as carbon source. In contrast to the distantly related D. acetoxidans, strains of D. ruminis oxidize substrates incompletely to acetate and cannot grow autotrophically [4]. Suitable electron acceptors are sulfate, thiosulfate and sulfite, but not elemental sulfur or nitrate [1,26]. Fermentative growth with pyruvate as sole substrate is also possible [26].

Chemotaxonomy
In cells of D. ruminis cytochromes of the b-type dominate [1], which is a typical trait of sulfatereducing bacteria belonging to the genus Desulfotomaculum. Respiratory lipoquinones are also present and are comprised mainly of the menaquinone MK-7 and some small amounts of MK-6 [27]. The whole-cell fatty acid pattern of the type strain of D. ruminis was determined by Hagenauer et al. [26], who found a dominance of branched-chain iso-and anteiso-fatty acids in addition to unsaturated fatty acids, whereas saturated unbranched fatty acids were of less importance. The predominant fatty acids were: iso-C 17:1 c7 , iso-C 15:0 , iso-C 17:0 , C 17:0 cyc and C 16:0 . Although, in the study of Hagenauer et al. [26] a large amount of the extracted cellular fatty acids (37.3%) remained unidentified, the fatty acid pattern of D. ruminis can be clearly distinguished from other distantly related Desulfotomaculum species like D. acetoxidans, which has a pattern dominated by straight-chain saturated fatty acids, thus further illustrating the paraphyletic origin of this genus. Standards in Genomic Sciences

Genome project history
This organism was selected for sequencing on the basis of the DOE Joint Genome Institute Community Sequencing Program 2009 proposal 300132_795700 'Exploring the genetic and physiological diversity of Desulfotomaculum species'. The genome project is deposited in the Genomes OnLine Database (Gc01775) [28] and the complete genome sequence is deposited in GenBank (CP002780). Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Growth conditions and DNA isolation
D. ruminis strain DL T , DSM 2154, was grown anaerobically in DSMZ medium 63 (Desulfovibrio medium) [29] at 37°C. DNA was isolated from 0.5-1.0 g of cell paste using Jetflex Genomic DNA Purification Kit (GENOMED 600100) following the manufacturer's instructions, with a modified protocol for cell lysis (modification st/LALMP) as described in Wu et al. 2009 [30]. DNA is available through the DNA Bank Network [31].  [13] and the NamesforLife database [3].

Genome annotation
Genes were identified using Prodigal [37] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [38].
The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation was performed within the Integrated Microbial Genomes -Expert Review (IMG-ER) platform [39].

Genome properties
The genome consists of one circular chromosome of 3,969,014 bp with a G+C content of 47.2% (Table 3 and Figure 3). Of the 3,986 genes predicted, 3,901 are protein-coding genes, and 85 are RNAs; 105 pseudogenes were also identified. The majority of the protein-coding genes (67.3%) were assigned with a putative function while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.   (Desru_1652), and class V (Desru_0021), which indicates that besides alanine, other amino acids might be substrates for this species. The oxidative deamination step of the amino acid degradation is probably catalyzed by an alanine dehydrogenase, which exists in two copies (Desru_0588 and Desru_2884) or a glutamate dehydrogenase (Desru_0556), confirming previous physiological studies [40].

Not in COGs
Interestingly, a taurine degradation pathway was also detected in the annotated genome. In habitats that are depleted of sulfate, like rumen or freshwater sediments, the amino sulfonic acid taurine could represent a key substrate for D. ruminis. Taurine is widely distributed in animal tissue, especially the large intestine, and can be converted by a taurinepyruvate aminotransferase (Desru_0589) to sulfoacetaldehyde, which in turn is cleaved by the enzyme sulfoacetaldehyde acetyltransferase (Desru_0590) into sulfite and acetyl-phosphate.
Sulfite can then be used as electron acceptor and reduced to sulfide. Several genes encoding dehydrogenases were detected that catalyze the oxidation of organic acids (e.g., lactate), or alcohols (e.g., ethanol). The main metabolic intermediate resulting from the oxidation of organic carbon compounds in incomplete oxidizing sulfate-reducing bacteria is pyruvate, which in D. ruminis can be degraded by the action of several enzymes: pyruvate dehydrogenase (Desru_0066 -0067), pyruvate-ferredoxin oxidoreductase (Desru_0099 -0102) and pyruvate-Standards in Genomic Sciences formate lyase (Desru_2143 and Desru_2090). The former two enzymes are decarboxylating and yield acetyl-CoA, CO 2 and reducing equivalents, and the pyruvate-formate lyase produces acetyl-CoA and formate. The latter enzyme is preferentially used during fermentative metabolism, when pyruvate is the main carbon and energy source. Acetyl-CoA is either used for biosynthetic reactions or can be transferred into acetyl-phosphate by a phosphate acetyltransferase. However, in the annotated genome of D. ruminis only a phosphate butyryltransferase (Desru_2256) was found. It could be that such enzymes are not specific for butanoyl-CoA and also use acetyl-CoA. An alternative pathway for the recycling of CoA could be catalyzed by the acetyl-coenzyme A synthetase (Desru_0761). This enzyme may use AMP and pyrophosphate that are formed in the ATP-sulfurylase and APS reductase reaction, respectively, for the production of acetate, CoA and ATP, though it is not clear if this acetyl-CoA synthetase is reversible. It may also be involved in the activation of acetate during mixotrophic growth. Acetyl-phosphate, which is also produced in the degradation of taurine, is converted to ATP and acetate by the enzyme acetate kinase (Desru_1705). Three genes involved in the acetyl-CoA pathway were not detected. These are the acetyl-CoA synthase gene (acsB), and the genes for the large and small subunit of the corrinoid iron sulfur protein.
Due to the absence of these genes, D. ruminis is unable to perform complete oxidation of organic compounds via the acetyl-CoA pathway, which is consistent with the published species description [41].

Mixotrophic growth
Based on genes identified within the genome sequence, data hydrogen, formate and carbon monoxide could be potential substrates for mixotrophic growth in D. ruminis. As observed for other clostridial sulfate reducers [11] the genome of D. ruminis encodes several copies of [FeFe] hydrogenases, including three copies of a trimeric NAD(P)-dependent hydrogenase (Desru_2398 -2396, Desru_2393 -2391, Desru_0516 -0514), and two copies of a membrane-associated hydrogenase (Desru_3431-3433 and Desru_0447 -0445) that includes a TAT signal peptide that is not predicted to be cleaved off using SignalP [42], but to form a transmembrane helix that anchors the protein to the extracytoplasmic side of the membrane.
A monomeric [FeFe] hydrogenase (Desru_2180) and a hydrogenase encoding a PAS-sensing domain (Desru_2509), similar to HsfB [43] are also present. In addition, the utilization of hydrogen may also be catalyzed by a Ni,Fe hydrogenase encoded by the genes Desru_2370 -2372. This enzyme is bound to the membrane by a cytochrome b, but seems to be cytoplasmic as no signal peptides are predicted. Two gene loci encoding formate dehydrogenases are located adjacently in the genome. Genes Desru_3012 -3008 code for a membraneassociated enzyme in which the catalytic subunit is coded by three genes (Desru_3012 -3010), as observed in other organisms. The first gene (Desru_3012) includes a TAT signal peptide, so the localization of the enzyme relative to the membrane will depend on whether this peptide associates with the catalytic subunit (Desru_3010) or not. The gene Desru_3011 encodes for the FeS domain of the catalytic subunit. The second formate dehydrogenase (Desru_3002-3005) is a tetrameric NAD(P)-dependent enzyme. Potential genes encoding the catalytic subunit of anaerobic-type carbon monoxide dehydrogenases (cooS) were identified at Desru_0859 and Desru_3320. However, no other CODH complex genes were found near either of the two cooS genes, except for cooC at Desru_0860. While growth with hydrogen and formate with acetate as carbon source was confirmed in laboratory experiments, no growth was obtained with 5% (v/v) of carbon monoxide in the headspace gas atmosphere [23]. This is in contrast with the cooS present in the genome and brings into question the function of this gene in D. ruminis. In a study about the fermentation burst in Desulfovibrio vulgaris Hildenborough it was found that CO is produced at low levels during growth on pyruvate or lactate [41]. It was hypothesized that the catalytic subunit of carbon monoxide dehydrogenase could be involved in an internal metabolism or cycling of carbon monoxide. In D. vulgaris, the cooS gene (YP_011311.1) is downstream of a transcriptional regulator (YP_011310.1) and upstream of the cooC gene (YP_011312.1). This localization is similar to what we find in D. ruminis, Desru_0859, Desru_0858 and Desru_0860, respectively. Thus, carbon-monoxide dehydrogenases could play a role in the internal metabolism or cycling of carbon monoxide during growth of D. ruminis on organic acids. However, in contrast to D. vulgaris, no COinduced hydrogenase (coo) is present in D. ruminis.

Energy metabolism
The genome of D. ruminis encodes the full set of genes necessary for dissimilatory sulfate reduction as well as several membrane complexes, which deliver electrons from membrane electron carriers like menaquinol to cytosolic sulfatereducing enzymes. The following genes encoding cytoplasmic enzymes for dissimilatory sulfate reduction were detected in the D. ruminis genome: Sulfate adenylyltransferase (ATP-sulfurylase, Desru_3378), adenosine-5'-phosphosulfate (APS) reductase (Desru_3376 -3377) and dissimilatory sulfite reductase (Desru_0386 -0387 and Desru_3723 -3724). In D. ruminis, like in most sulfate reducers, ATP-sulfurylase and APS reductase are present as one copy. However, in contrast to most other sulfate reducers, D. ruminis contains two copies of the dsrABD genes. As observed in other Desulfotomaculum species, the alpha subunit of the APS reductase appears to be membraneanchored. The generation of a proton gradient across the cytoplasmic membrane is thought to be the main mechanism for generation of energy in sulfatereducing bacteria. The coupling of the reduction of sulfate to sulfide, which occurs exclusively in the cytoplasm with a membrane-bound electron transport chain, and a vectorial proton transport across the membrane is still far from being understood. Electrons and protons required for the generation of a chemiosmotic gradient in Grampositive sulfate reducers could be generated by the oxidation of small intermediate metabolites, like hydrogen, COor formate at the cytoplasmic membrane. Several membrane-bound enzyme complexes were recently identified that could play a role in this process. A membrane-bound pyrophosphatase (Desru_3593) could use the energy generated from the cleavage of pyrophosphate, which is formed in the activation of sulfate by the ATPsulfurylase, for proton translocation. The QmoAB complex (Desru_3374 -3375) is suggested to play a role in donating electrons to the APS reductase [44], and the genes of both enzymes are located next to each other in the D. ruminis genome. The gene for the membrane subunit QmoC is absent, as in other clostridial sulfate reducers. The QmoA subunit is predicted to contain a signal peptide, but this likely forms a transmembrane helix, as it is still present in the mature protein [45], which indicates a localization at the inner surface of the cytoplasmic membrane. In Gram-negative sulfate-reducing bacteria, a transmembrane DsrMKJOP complex is conserved, which probably transfers electrons from the periplasmic space to the dissimilatory sulfite reductase. In Gram-positive sulfate reducers, only a truncated DsrMK complex seems to be present [46], which is encoded adjacent to a small soluble protein designated DsrC that is proposed to have a function in shuttling electrons from DsrK to the cytoplasmic DsrAB sulfite reductase [47]. In D. ruminis this enzyme system is encoded by the genes Desru_3734 -3736, and a second copy of the dsrMK genes is present (Desru_2596 -2597) The two c-type cytochromes present in "D. reducens" and annotated as a nitrite reductase are absent in D. ruminis, which is consistent with the other Desulfotomaculum species sequenced to date (except D. nigrificans) [46]. An energy-conserving NADH-quinone oxidoreductase (Complex I, Desru_1808 -1818 and Desru_0514 -516) is present, which will couple NADH oxidation to proton translocation. Furthermore, a multimeric membrane-bound complex was identified at Desru_3260 -3265, that belongs to the family of Ehr complexes (for energyconserving hydrogenase related complex) first identified in Geobacter spp., but present in many microorganisms [48,49]. The subunits of Ehr complexes are related to subunits of complex I and the Ech energy conserving hydrogenases, but in most cases the cysteines binding the NiFe cluster are absent, so these complexes are not real hydrogenases. In D. ruminis EhrL (Desru_3264) the four Cys required to coordinate the catalytic center are present, so this complex may be a true energy-conserving hydrogenase. The proton gradient resulting from the above-mentioned reactions is used by a F0F1-type ATP synthase complex encoded by the genes Desru_3687 -3694. There are number of heterodisulfide reductases in the genome: three loci were identified which contained hdrA (Desru_0205, Desru_0212 and Desru_3375) and hdrB (Desru_3379 and Desru_2699) and hdrC-like (Desru_3380 and Desru_2700) heterodisulfide reductases. In addition, a fused hdrA with mvhD (methyl-viologen reducing hydrogenase delta subunit) was identified (Desru_3374).

Comparative genomics
We analyzed the fraction of shared genes in three genomes of Desulfotomaculum species with validly published names. The genomes of D. acetoxidans [50] and D. ruminis [10] are complete, whereas the genome of the type species D. nigrificans is only available as a draft sequence. D. nigrificans has the smallest genome with 3,014 protein coding sequences. The resulting data are illustrated in the Venn diagram shown in [ Figure 4]. The largest overlap is found between the strains D. nigrificans DSM 574 T and D. ruminis DSM 2154 T , which share 2,359 homologous proteins corresponding to 78.3% of the DSM 574 T genes and 60.5% of the DSM 2154 T genes. Thus, a closer relationship between D. ruminis DSM 2154 T and D. nigrificans DSM 574 T than between D. acetoxidans DSM 771 T and D. nigrificans DSM 574 T , as suggested by the 16S rRNA based phylogenetic tree, is confirmed by whole-genome data. Figures 5A and 5B show the organization of dsrAB (A), qmoBA, aprAB and hdrBC (B) and neighboring genes for D. ruminis, "D. reducens" and D. acetoxidans. In Figure 5 dsrD is upstream of dsrAB in all three strains. However, no other neighboring genes are similar to each other. In contrast, Figure  5B shows remarkable homology in gene organization for the aprAB gene neighborhood for D. ruminis and "D. reducens". Gene sequence is also very similar for that region (53-94% identity for the genes displayed including hypothetical proteins) which suggests horizontal gene transfer from a common ancestor. The dsrAB and aprBA proteins of D. ruminis are more closely related to "D. reducens" than to D. acetoxidans [ Figure 6A and 6B]. This is in accordance with the 16SrRNA based phylogenetic tree and the whole-genome data.    . Phylogenetic tree of the dsrAB protein sequences. The trees (6A and 6B) were inferred from proteins sequences using RAxML (maximum-likelihood) in the software program ARB. The sequences of Archaeoglobus fulgidus, A. profundus, and A. veneficus were used as outgroup, but were pruned from the tree. The sequence of D. ruminis is written in bold. The black circles are bootstrap values between 100-75%, the white circles are values between 75-50%. The scale bar corresponds to 10% estimated sequence divergence. Figure 6B. Phylogenetic tree of the aprBA protein sequence.