Genome analysis of Desulfotomaculum gibsoniae strain GrollT, a highly versatile Gram-positive sulfate-reducing bacterium

Desulfotomaculum gibsoniae is a mesophilic member of the polyphyletic spore-forming genus Desulfotomaculum within the family Peptococcaceae. This bacterium was isolated from a freshwater ditch and is of interest because it can grow with a large variety of organic substrates, in particular several aromatic compounds, short-chain and medium-chain fatty acids, which are degraded completely to carbon dioxide coupled to the reduction of sulfate. It can grow autotrophically with H2 + CO2 and sulfate and slowly acetogenically with H2 + CO2, formate or methoxylated aromatic compounds in the absence of sulfate. It does not require any vitamins for growth. Here, we describe the features of D. gibsoniae strain GrollT together with the genome sequence and annotation. The chromosome has 4,855,529 bp organized in one circular contig and is the largest genome of all sequenced Desulfotomaculum spp. to date. A total of 4,666 candidate protein-encoding genes and 96 RNA genes were identified. Genes of the acetyl-CoA pathway, possibly involved in heterotrophic growth and in CO2 fixation during autotrophic growth, are present. The genome contains a large set of genes for the anaerobic transformation and degradation of aromatic compounds, which are lacking in the other sequenced Desulfotomaculum genomes.


Introduction
Desulfotomaculum gibsoniae strain Groll T (DSM 7213) is a mesophilic sulfate-reducing bacterium isolated from a freshwater ditch in Bremen, Northern Germany [1,2]. It grows with a wide range of substrates, including organic acids, such as medium-chain fatty acids, short-chain fatty acids, and several aromatic compounds [1]. These substrates are degraded to CO2 coupled to sulfate reduction. The strain is also able to grow autotrophically with H2/CO2 and sulfate, and is able to ferment pyruvate and crotonate. In the absence of sulfate, it grows slowly on H2/CO2, formate, and methoxylated aromatic compounds. D. gibsoniae does not require vitamins for growth.
The genus Desulfotomaculum is a heterogeneous group of anaerobic spore-forming sulfate-reducing bacteria, with thermophilic, mesophilic, and psychrophilic members that grow at neutral or alkaline pH values [3]. Their cell wall stains Gram-negative, but the ultrastructure of the cell wall is characteristic of Gram-positive bacteria [4]. They are physiologically very diverse. In contrast to Gram-negative sulfate-reducing bacteria and closely related Clostridia, very little is known about their physiology, but members of this genus are known to play an important role in the carbon and sulfur cycle in diverse habitats.
The Desulfotomaculum genus is divided phylogenetically into different subgroups [1]. A thorough understanding of the evolutionary relationships among the Desulfotomaculum subgroups and of the physiology of the individual species requires genome sequence information. Here, we present a summary of the features of D. gibsoniae strain Groll T, together with a description of the complete genome sequencing and annotation. Special emphasis is placed on the ability of this strain to grow on a large variety of aromatic compounds, on the genes responsible, and on its capacity for acetogenic growth in the absence of sulfate.

[Figure caption fragment] ... Desulfotomaculum intricatum (cluster 1f), Desulfotomaculum peckii (cluster 1e), and Desulfotomaculum varum (cluster 1a) and the entire cluster 1g are not included in the tree. A set of Thermotogales species was used as outgroup, but was pruned from the tree. Closed circles represent bootstrap values between 75 and 100%. The scale bar represents 10% sequence difference.

Classification and features
D. gibsoniae is a member of the phylum Firmicutes. Phylogenetic analysis of the 16S rRNA genes of D. gibsoniae shows that it clusters in Desulfotomaculum cluster 1, subgroup b (Figure 1) [1]. Other species in this subgroup are D. geothermicum, D. arcticum, D. alcoholivorax, D. thermosapovorans, D. sapomandens and the non-Desulfotomaculum species Sporotomaculum hydroxybenzoicum and S. syntrophicum. D. gibsoniae is a mesophilic sulfate reducer with an optimum growth temperature of 35-37°C [1,2]. Fermentative and acetogenic growth was shown with pyruvate, crotonate, formate, H2 + CO2, and methoxylated aromatic compounds as substrates. In the presence of an electron acceptor it can completely oxidize substrates to CO2. Suitable electron acceptors are sulfate, thiosulfate and sulfite. The cells of D. gibsoniae are straight or slightly curved rods (1.0-2.5 × 4-7 μm) with pointed ends (Figure 2). Spores of D. gibsoniae are spherical and located in the center of the cells, causing swelling. A summary of the classification and general features of D. gibsoniae is presented in Table 1.

Genome sequencing and annotation
Genome project history
D. gibsoniae was selected for sequencing in the DOE Joint Genome Institute Community Sequencing Program 2009, proposal 300132_795700 'Exploring the genetic and physiological diversity of Desulfotomaculum species', because of its phylogenetic position in one of the Desulfotomaculum subgroups and its ability to use aromatic compounds for growth.
The genome project is listed in the Genomes OnLine Database (GOLD) [18] as project Gi07572, and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation of the D. gibsoniae genome were performed by the DOE Joint Genome Institute (JGI) using state-of-the-art sequencing technology [19]. A summary of the project information is shown in Table 2.

Table 1. Classification and general features of D. gibsoniae strain Groll T (DSM 7213) according to the MIGS recommendations [5].

Genome sequencing and assembly
The genome was sequenced using a combination of Illumina and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website [21]. Pyrosequencing reads were assembled using the Newbler assembler (Roche). The initial Newbler assembly, consisting of 139 contigs in one scaffold, was converted into a phrap [22] assembly by making fake reads from the consensus, to collect the read pairs in the 454 paired end library. Illumina GAii sequencing data (2,432 Mb) was assembled with Velvet [23], and the consensus sequences were shredded into 1.5 kb overlapped fake reads and assembled together with the 454 data. The 454 draft assembly was based on 220 Mb of 454 draft data and all of the 454 paired end data. Newbler parameters are -consed -a 50 -l 350 -g -m -ml 21. The Phred/Phrap/Consed software package [22] was used for sequence assembly and quality assessment in the subsequent finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution [22], Dupfinisher [24], or sequencing cloned bridging PCR fragments with subcloning. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks (J.-F. Chang, unpublished). A total of 132 additional reactions were necessary to close some gaps and to raise the quality of the final contigs. Illumina reads were also used to correct potential base errors and increase consensus quality using the Polisher software developed at JGI [25]. The error rate of the final genome sequence is less than 1 in 100,000. Together, the combination of the Illumina and 454 sequencing platforms provided 506.2× coverage of the genome. The final assembly is based on 2,347 Mb of Illumina draft data and 133 Mb of pyrosequence draft data.
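The shredding of consensus sequences into overlapping fake reads can be illustrated with a minimal Python sketch. It uses the 1.5 kb read length stated above; the 500-bp overlap and the function name shred_consensus are illustrative assumptions, since the overlap used in the JGI pipeline is not given here.

```python
def shred_consensus(consensus: str, read_len: int = 1500, overlap: int = 500) -> list[str]:
    """Cut a consensus sequence into overlapping pseudo-reads.

    read_len follows the 1.5 kb stated in the text; overlap is an
    assumed value chosen only for illustration.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be non-negative and smaller than read_len")
    if not consensus:
        return []
    if len(consensus) <= read_len:
        return [consensus]

    step = read_len - overlap
    reads = []
    # Stop once the remaining tail would add no bases beyond the overlap,
    # so the final (possibly shorter) read still reaches the sequence end.
    for start in range(0, len(consensus) - overlap, step):
        reads.append(consensus[start:start + read_len])
    return reads


if __name__ == "__main__":
    toy_contig = "ACGT" * 1000  # 4,000 bp toy consensus, not real data
    for i, read in enumerate(shred_consensus(toy_contig)):
        print(i, len(read))
```

Consecutive pseudo-reads share their last and first 500 bases, which is what allows an assembler to re-join them alongside the 454 reads.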

Genome annotation
Genes were identified using Prodigal [26] as part of the DOE-JGI genome annotation pipeline [27], followed by a round of manual curation using the JGI GenePRIMP pipeline [28]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [29].

Genome properties
The genome consists of one circular chromosome of 4,855,529 bp (45.49% GC content) and includes no plasmids. A total of 4,762 genes were predicted, of which 4,666 are protein-coding genes. In addition, 3,464 of the protein-coding genes (72.7%) were assigned a putative function, with the remainder annotated as hypothetical proteins. The statistics of the genome are summarized in Table 3. In total, 70.24% of the genes were assigned to the COG functional categories (Table 4 and Figure 3).

[...] (Figure 4) [1,2]. Other bacteria capable of growth via anaerobic degradation of aromatic compounds linked to nitrate reduction, Fe(III) reduction, or sulfate reduction are much more restricted [30,31].
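As a quick consistency check on the figures given under Genome properties, the reported counts can be recomputed in a few lines of Python. The sketch below uses only numbers stated in the text (the 96 RNA genes are taken from the abstract); the constant and function names are illustrative. With these numbers, the reported 72.7% corresponds to 3,464 as a fraction of all 4,762 predicted genes.

```python
# Values as reported in the text; all names are illustrative.
GENOME_SIZE_BP = 4_855_529
GC_CONTENT_PCT = 45.49
TOTAL_GENES = 4_762
PROTEIN_CODING_GENES = 4_666
RNA_GENES = 96  # taken from the abstract
GENES_WITH_FUNCTION = 3_464


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def genome_summary() -> dict:
    """Recompute derived statistics from the reported counts."""
    return {
        # Protein-coding plus RNA genes should equal the total gene count.
        "counts_add_up": PROTEIN_CODING_GENES + RNA_GENES == TOTAL_GENES,
        "function_pct_of_total_genes": round(
            percent(GENES_WITH_FUNCTION, TOTAL_GENES), 1),
        "function_pct_of_protein_coding": round(
            percent(GENES_WITH_FUNCTION, PROTEIN_CODING_GENES), 1),
        # Approximate number of G or C bases implied by the GC content.
        "approx_gc_bases": round(GENOME_SIZE_BP * GC_CONTENT_PCT / 100),
    }


if __name__ == "__main__":
    for key, value in genome_summary().items():
        print(f"{key}: {value}")
```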
In sulfate-reducing bacteria (e.g. Desulfobacula toluolica), methylated aromatic compounds such as toluenes, xylenes or cresols are thought to be degraded via an initial fumarate addition to the methyl group followed by β-oxidation-like reactions [32][33][34]. The genes putatively coding for the enzyme catalyzing the fumarate addition reaction (hbsABC) are present in two copies in the genome of D. gibsoniae. The two copies might have different substrate specificities for the growth substrates m- and p-cresol, since D. toluolica possesses one set of these genes and can grow only with p-cresol [34,35]. In D. gibsoniae, m- and p-cresol are expected to be converted to 3- or 4-hydroxybenzylsuccinate. The genes coding for enzymes involved in the subsequent β-oxidation (bhsABCDEFGH), yielding 3- or 4-hydroxybenzoyl-CoA, are also present in two copies. In growth experiments, toluene degradation was not observed for D. gibsoniae [1,2], and the genome provides no evidence to the contrary. All genes for the degradation of the growth substrates phenylacetate and phenol are present, including the type of phenylphosphate carboxylase typically found in strict anaerobes [36].

All genes encoding enzymes of the upper benzoyl-CoA degradation pathway were identified in D. gibsoniae. The growth substrate benzoate is activated to benzoyl-CoA either via an ATP-dependent CoA-ligase (bcl) or a succinyl-CoA-dependent CoA-transferase (bct) [37,38]. There are two classes of dearomatizing benzoyl-CoA reductases (BCRs) [39]. Class I BCRs are ATP-dependent FeS enzymes composed of four different subunits [40]. There are two subclasses of ATP-dependent BCRs, the Thauera-type and the Azoarcus-type. ATP-independent class II BCRs contain eight subunits and harbor a tungsten-containing cofactor in the active site [41]. The ATP-independent class II BCR is characteristic of strictly anaerobic bacteria that degrade aromatic compounds [42]. In D. gibsoniae, the genes of the catalytic subunit (bamB) of the class II BCR are present in six copies. All seven predicted genes for the putative electron-activating subunits of class II BCR (bamCDEFGHI) were identified in at least two copies and arranged next to each other. Surprisingly, genes of a class I BCR with high similarity (47-68% amino acid identity) to class I BCRs of the Azoarcus-type (bzdNOPQ) were found, but these were not located in a single transcriptional unit. It is unclear which of the putative BCR-encoding genes is used for benzoyl-CoA and/or 3-OH-benzoyl-CoA reduction. The genes necessary to convert the product of BCRs, a cyclic conjugated dienoyl-CoA, to 3-OH-pimelyl-CoA via modified β-oxidation (dch, had, oah) are present in one copy each. It is unclear whether these genes are also involved in 3-OH-benzoyl-CoA degradation.

One of the more unusual growth substrates of D. gibsoniae is catechol, a substrate metabolized by only a very limited number of anaerobic bacteria. The pathway of catechol metabolism via protocatechuate was outlined 20 years ago [2] and is now confirmed by the genome analysis. For the degradation of lignin monomers, the side chains are degraded and the methoxy group is removed by O-demethylation. The genes responsible for this mechanism are present in the genome (Desgi_0674 to Desgi_0676). The resulting compounds can then be degraded by the pathways outlined in Figure 4.

Phylogenetic trees based on hbsA, a homolog of bssA (Figure 5A), and hbsC, a homolog of bssC (Figure 5B), show deeply branching lineages for the Desulfotomaculum gibsoniae genes and no clear affiliation to other sulfate-reducing bacteria except, in the case of the hbsC gene, to alkane-oxidizing species. Interestingly, similar genes were also found in the genomes of Desulfotignum balticum and Desulfotignum phosphitoxidans. Both are only known to use benzoate or its hydroxyl derivatives, whereas the only other species of this genus, Desulfotignum toluenicum, can grow very well on toluene [44][45][46]. Using the bamB and bamC genes for phylogenetic tree construction (Figure 6A and Figure 6B), the picture is even more heterogeneous. The different genes are affiliated with genes found in sulfate-reducing and other bacteria; hence, a clear clustering cannot be seen. Again, genome data provide some interesting insights. Desulfospira joergensenii is not described as a benzoate-utilizing bacterium, but seems to have some similar genes [47].

[...] (Figure 7). However, D. gibsoniae is not able to grow on acetate with or without sulfate.

Complete substrate oxidation, autotrophic growth and homoacetogenic growth
The acetyl-CoA pathway in D. gibsoniae does not perform acetate oxidation, as it does in D. kuznetsovii [48], but facilitates complete oxidation of substrates leading to acetyl-CoA, autotrophic growth on H2 + CO2 (or formate) in the presence of sulfate as electron acceptor, and slow homoacetogenic growth on pyruvate, crotonate, formate, hydrogen plus carbon dioxide, and methoxylated aromatic compounds [1]. Three putative acetyl-CoA synthase encoding genes can be found in the D. gibsoniae genome (Figure 8). All three genes have a putative carbon monoxide dehydrogenase catalytic subunit encoding gene (cooS) downstream. However, only Desgi_2051 is part of an operon structure containing other genes coding for enzymes involved in the acetyl-CoA pathway.

C1 compound degradation
In addition to the three cooS genes downstream of the genes coding for the acetyl-CoA synthase, D. gibsoniae has two other cooS genes in its genome, Desgi_2753 and Desgi_3080. The latter has a transcriptional regulator gene (Desgi_3081) downstream and genes for a ferredoxin (Desgi_3079) and a nitrite reductase (Desgi_3078) upstream. Growth tests on CO have not yet been performed. However, the presence of multiple cooS genes with neighboring genes such as those for ferredoxin and nitrite reductase, or genes coding for the acetyl-CoA pathway, indicates that D. gibsoniae may grow on CO.

D. gibsoniae can grow on formate coupled to sulfate reduction. In the genome, two putative formate dehydrogenases (FDHs) were found. One FDH (Desgi_1522-23) is translocated over the membrane and bound to a polysulfide reductase (NrfD)-like protein containing 10 transmembrane helices (Desgi_1524). The alpha subunit contains a twin-arginine translocation (tat) motif, and genes encoding proteins of the Tat system, TatA (Desgi_1521) and TatC (Desgi_1526), were found near the gene coding for the alpha subunit. The second FDH (Desgi_2136-2139) might be a confurcating FDH. Desgi_2138 shows similarity with the NADH-binding 51 kDa subunit of NADH:ubiquinone oxidoreductase, and Fe-S cluster binding motifs were found in all subunits.
No methanol methyltransferase genes can be found in the genome of D. gibsoniae, which correlates with the absence of growth on methanol [1]. Other methyltransferase genes that might point to growth with methylated amines were not found, except for a possible dimethylamine methyltransferase beta subunit (Desgi_3904) and a cobalamin binding protein (Desgi_3903). However, another methyltransferase gene, mtbA, which is necessary for growth with dimethylamine, is absent from the genome.

Figure 7. Acetyl-CoA pathway in D. gibsoniae based on genomic data. Enzymes are depicted in bold italic. Next to these enzymes are the possible encoding genes, and their locus tags. Genes with the locus tags Desgi_2048 and Desgi_2050 putatively code for the small subunit and the large subunit of the iron-sulfur protein, respectively. This protein is involved in transferring the methyl from tetrahydrofolate to acetyl-CoA. Abbreviations: A-CoA S, acetyl-CoA synthetase; AcsA, carbon monoxide dehydrogenase; AcsB, acetyl-CoA synthase; CFeSP, iron-sulfur protein; CH3, methyl; THF, tetrahydrofolate; MeTr, methyltransferase.

[Figure caption fragment] ... gibsoniae. Abbreviations: acsA, carbon monoxide dehydrogenase; acsB, acetyl-CoA synthase; acsE, methyltetrahydrofolate methyltransferase; CFeSP, iron-sulfur protein; cooC, carbon monoxide dehydrogenase maturation factor; Fe-S, iron sulfur; hp, hypothetical protein; hyd, hydrogenase; metF, methylene-tetrahydrofolate reductase; up, uncharacterized protein.

Propionate and butyrate oxidation
The genome of D. gibsoniae contains at least one copy of genes putatively encoding enzymes involved in propionate oxidation via the methylmalonyl-CoA pathway (Figure 9A). This includes genes in a methylmalonyl-CoA (mmc) cluster (Desgi_1951-1961), which have a genetic organization similar to that seen in D. kuznetsovii (Desku_1358-1369) and Pelotomaculum thermopropionicum (Pth_1355-1368) [48][49][50]. However, a few differences were found. The genome of D. gibsoniae lacks genes coding for the methylmalonyl-CoA decarboxylase epsilon and gamma subunits. Moreover, the mmc cluster of D. gibsoniae contains a single gene encoding the alpha subunit of succinyl-CoA synthase (Desgi_1955), whereas the mmc clusters of D. kuznetsovii and P. thermopropionicum contain two such genes. Bifurcating hydrogenases may be used to re-oxidize, with the formation of hydrogen, both the ferredoxin generated by pyruvate:ferredoxin oxidoreductase and the NADH generated by malate dehydrogenation. The membrane-anchored extracellular formate dehydrogenases and hydrogenases may be involved in generating a proton motive force for succinate reduction. Genes putatively coding for butyrate β-oxidation enzymes are also present in the genome of D. gibsoniae. One complete cluster of genes putatively encoding all the enzymes required to convert butyrate is present (Desgi_4671-4675, Figure 9B). Gene organization in this cluster is similar to that found in D. reducens (Dred_1493-1489), which can also utilize butyrate (Figure 10A). In D. gibsoniae another gene cluster (Desgi_1916-1925) is present, which lacks only one gene, coding for butyryl-CoA:acetate CoA-transferase (Figure 11). Desgi_1918 and Desgi_1920-1925 have a similar organization to genes found in D. acetoxidans (Dtox_1697-1703) [51].
In addition to the genes encoding enzymes involved in butyrate β-oxidation, these clusters contain genes for electron transfer flavoproteins (Desgi_1920-1921 and Dtox_1698-1699) and for Fe-S oxidoreductases (Desgi_1922 and Dtox_1700). Although Dtox_1700 is annotated as a cysteine-rich unknown protein, a protein BLAST of this ORF against the D. gibsoniae genome revealed 53.65% identity (E-value = 0.0) with the putative Fe-S oxidoreductase encoded by Desgi_1922. Two genes encoding acyl-CoA synthetases (Desgi_1916-1917) are present upstream of the acetyl-CoA dehydrogenase gene in D. gibsoniae (Desgi_1918), but these are not found near this cluster in D. acetoxidans. However, these genes are present in the same gene cluster location in other butyrate-degrading sulfate-reducing bacteria (SRB), namely D. alcoholivorax (H569DRAFT_00537-00530), D. kuznetsovii (Desku_1226-1234) and Desulfurispora thermophila (B064DRAFT_00829-00837).
Acyl-CoA synthetases are most likely involved in the biosynthesis of coenzyme A [52]. Several other clusters of genes containing at least three genes encoding enzymes involved in butyrate conversion can be found in the genome of D. gibsoniae.

Standards in Genomic Sciences

Sulfate reduction
The genome contains single copies of the genes for sulfate adenylyltransferase (Desgi_3703), adenosine-5'-phosphosulfate (APS) reductase (Desgi_3701-3702) and dissimilatory sulfite reductase (Desgi_4661-4662), as found in most of the other members of the genus [18][19][20]. A membrane-bound pyrophosphatase (Desgi_4294) is used for energy regeneration as in other Desulfotomaculum spp. The QmoABC complex contains only the A and B subunits (Desgi_3699-3700); the C subunit is lacking. In all members of the genus Desulfotomaculum the QmoAB is followed by HdrCB (Desgi_3697-3698). This arrangement is identical to that seen in the closely related species "Desulforudis audaxviator", Desulfurispora thermophila and the Gram-negative Desulfarculus baarsii and strain NaphS2, which possess a Gram-positive AprBA [53]. Interestingly, the same organization is also found in some phototrophic sulfur-oxidizing bacteria, such as Thiobacillus denitrificans, Thiothrix nivea and Sedimentibacter selenatireducens [54]. Other closely related Gram-positive SRB, like Desulfovirgula thermocuniculi and Ammonifex degensii, have a complete QmoABC system like all other SRB and the green sulfur bacteria, or have QmoAB linked to an Fe-S oxidoreductase/HdrD, as seen in Desulfosporosinus spp. This latter modification is also seen in Gram-negative SRB that have a Gram-positive AprBA, like Desulfomonile tiedjei and Syntrophobacter fumaroxidans [55]. It seems that both Desulfotomaculum spp. and Desulfosporosinus spp. have been the source of the entire APS reductase/QmoA complex for members of the Gram-negative Syntrophobacterales [55]. The genomes of Syntrophobacter fumaroxidans and of Desulfovirgula thermocuniculi have two different systems that can be linked to the APS reductase. In D. gibsoniae, dsrAB (Desgi_4661-4662) is linked to the same truncated dsr operon, coding only for dsrC and dsrMK (Desgi_4648-4649), as in other Desulfotomaculum spp. [48,51,56].

D. gibsoniae has six [FeFe] and three [NiFe] hydrogenases, suggesting a lower redundancy in the case of [FeFe] enzymes than in other members of the genus. The [FeFe] hydrogenases include one membrane-associated protein (Desgi_0926-0928) that contains a tat motif in the alpha subunit (Desgi_0926), suggesting an extracellular localization; one monomeric hydrogenase (Desgi_0935) encoded close to the membrane-bound enzyme, which suggests the possibility of co-regulation; two copies of trimeric NAD(P)-dependent bifurcating hydrogenases (Desgi_4669-4667 and Desgi_3197-3195); one enzyme (Desgi_0771) that is part of a multi-gene cluster encoding two flavin-dependent oxidoreductases that is also present in other Desulfotomaculum spp.; and one HsfB-type hydrogenase (Desgi_3194) encoding a PAS-sensing domain that is likely involved in sensing and regulation, possibly in conjunction with the bifurcating Desgi_3195 hydrogenase. The [NiFe] hydrogenases include one enzyme (Desgi_1398-1397) that may also be bound to the membrane by a cytochrome b (Desgi_1402); one simple dimeric enzyme (Desgi_1231-1230); and one trimeric group 3 hydrogenase (Desgi_1166-1164), similar to methyl-viologen reducing hydrogenases from methanogens, which is encoded next to an HdrA-like protein (Desgi_1163).
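Locus tags in this report are often given as ranges, sometimes in descending order (e.g. Desgi_4669-4667) or with an abbreviated end (e.g. Desgi_1522-23). The following Python sketch expands such ranges into individual tags, for example to look them up in an annotation table. The helper name expand_locus_range is hypothetical, and the sketch assumes that tags within a range are numbered consecutively, which should be checked against the actual annotation.

```python
import re

# Prefix, underscore, first number, and an optional "-last" part.
_LOCUS_RANGE = re.compile(r"^([A-Za-z0-9]+)_(\d+)(?:\s*-\s*(\d+))?$")


def expand_locus_range(text: str) -> list[str]:
    """Expand a locus tag range such as 'Desgi_4669-4667' into single tags.

    Assumes consecutive numbering within the range (an illustrative
    simplification). The order of the written range is preserved.
    """
    match = _LOCUS_RANGE.match(text.strip())
    if match is None:
        raise ValueError(f"not a locus tag or locus tag range: {text!r}")
    prefix, first, last = match.groups()
    width = len(first)
    start = int(first)
    if last is None:
        end = start
    elif len(last) < width:
        # Abbreviated end: 'Desgi_1522-23' means 1522 to 1523.
        end = int(first[: width - len(last)] + last)
    else:
        end = int(last)
    step = 1 if end >= start else -1
    # Zero-pad to the width of the first number, e.g. 926 -> '0926'.
    return [f"{prefix}_{n:0{width}d}" for n in range(start, end + step, step)]


if __name__ == "__main__":
    for example in ("Desgi_4669-4667", "Desgi_1522-23", "Desgi_3194"):
        print(example, "->", expand_locus_range(example))
```

Keeping the written order (ascending or descending) avoids discarding whatever ordering the authors intended when listing the genes.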

Nitrogenases
A cluster of nitrogenase genes (Desgi_2428-2419) was detected within the annotated genome sequence. It comprises genes encoding nitrogenase iron protein; nitrogen regulatory protein PII; nitrogenase molybdenum-iron protein alpha chain; nitrogenase molybdenum-iron protein beta chain; nitrogenase molybdenum-iron cofactor biosynthesis protein NifE; nitrogenase molybdenum-iron protein, alpha and beta chains; nitrogenase cofactor biosynthesis protein NifB; ferredoxin; and iron-only nitrogenase protein AnfO (AnfO_nitrog). Thus, D. gibsoniae probably has the capacity for nitrogen fixation. However, the fixation of molecular nitrogen has not been analyzed in this species so far.