Genome analysis of Desulfotomaculum kuznetsovii strain 17T reveals a physiological similarity with Pelotomaculum thermopropionicum strain SIT.

Desulfotomaculum kuznetsovii is a moderately thermophilic member of the polyphyletic spore-forming genus Desulfotomaculum in the family Peptococcaceae. This species is of interest because it originates from deep subsurface thermal mineral water at a depth of about 3,000 m. D. kuznetsovii is a rather versatile bacterium as it can grow with a large variety of organic substrates, including short-chain and long-chain fatty acids, which are degraded completely to carbon dioxide coupled to the reduction of sulfate. It can grow methylotrophically with methanol and sulfate and autotrophically with H2 + CO2 and sulfate. For growth it does not require any vitamins. Here, we describe the features of D. kuznetsovii together with the genome sequence and annotation. The chromosome has 3,601,386 bp organized in one contig. A total of 3,567 candidate protein-encoding genes and 58 RNA genes were identified. Genes of the acetyl-CoA pathway, possibly involved in heterotrophic growth with acetate and methanol, and in CO2 fixation during autotrophic growth are present. Genomic comparison revealed that D. kuznetsovii shows a high similarity with Pelotomaculum thermopropionicum. Genes involved in propionate metabolism of these two strains show a strong similarity. However, main differences are found in genes involved in the electron acceptor metabolism.


Introduction
Desulfotomaculum kuznetsovii strain 17T (VKM B-1805; DSM 6115) is a moderately thermophilic sulfate-reducing bacterium isolated from deep subsurface thermal mineral water [1]. It grows with a wide range of substrates, including organic acids, such as long-chain fatty acids, short-chain fatty acids (butyrate, propionate, acetate), lactate, pyruvate, fumarate and succinate, as well as ethanol and methanol. These substrates are degraded to CO2 coupled to sulfate reduction. The strain is also able to grow autotrophically with H2/CO2 and sulfate and to ferment pyruvate and fumarate. For growth, D. kuznetsovii has no vitamin requirement.

Standards in Genomic Sciences
Desulfotomaculum is a genus of Gram-positive, spore-forming anaerobes that is phylogenetically and physiologically very diverse. The genus is poorly studied physiologically, while its members are known to play an important role in the carbon and sulfur cycle in a variety of often adverse environments. The genus is divided phylogenetically into different sub-groups [2,3]. To get a thorough understanding of the evolutionary relationship of the different Desulfotomaculum sub-groups and the physiology of the individual species, it is important to have genome sequence information.
Here, we present a summary of the features of D. kuznetsovii strain 17T, together with the description of the complete genomic sequencing and annotation. Moreover, we describe a physiological and genomic comparison of D. kuznetsovii strain 17T and Pelotomaculum thermopropionicum strain SIT, because phylogenetically P. thermopropionicum is the closest related organism with a validly published name that has a completely sequenced genome. However, the two strains have different physiological traits. For example, P. thermopropionicum is not able to grow by sulfate reduction, but is able to grow in syntrophy with methanogens. D. kuznetsovii lacks this ability. By comparing the genomes of the two bacteria we were able to identify the main similarities and differences.

Classification and features
D. kuznetsovii is a member of the phylum Firmicutes. Phylogenetic analysis of the 16S rRNA genes of D. kuznetsovii shows that it clusters in Desulfotomaculum cluster 1. This cluster not only contains Desulfotomaculum species, but also members of the genera Sporotomaculum, Cryptanaerobacter and Pelotomaculum. D. kuznetsovii is part of sub-group 1c together with D. solfataricum, D. luciae, D. thermosubterraneum, D. salinum, D. australicum, and D. thermocisternum, while Pelotomaculum species belong to sub-group 1h (Figure 1) [2]. D. kuznetsovii cells are rod-shaped (1.0-1.4 x 3.5-5 μm) with rounded ends and peritrichous flagella (Figure 2). Spores of D. kuznetsovii are spherical (1.3 μm in diameter) and centrally located, causing swelling of the cells. D. kuznetsovii grows between 50 and 85°C, but the optimal growth temperature is 60-65°C. The substrates D. kuznetsovii can grow with are completely oxidized to CO2. Suitable electron acceptors are sulfate, thiosulfate and sulfite. D. kuznetsovii is also able to grow by fermentation of pyruvate and fumarate. A summary of the classification and general features of D. kuznetsovii is presented in Table 1 [1].

Genome sequencing and annotation
Genome project history
D. kuznetsovii was selected for sequencing in the DOE Joint Genome Institute Community Sequencing Program 2009, proposal 300132_795700 'Exploring the genetic and physiological diversity of Desulfotomaculum species', because of its phylogenetic position in one of the Desulfotomaculum sub-groups, its important role in bioremediation, and its ability to use propionate, acetate and methanol for growth. The genome project is listed in the Genome OnLine Database (GOLD) [20] as project Gc01781, and the complete genome sequence was deposited in GenBank. Sequencing, finishing and annotation of the D. kuznetsovii genome were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Growth conditions and DNA isolation
D. kuznetsovii was grown anaerobically at 60°C in bicarbonate-buffered medium with propionate and sulfate as substrates [1]. DNA was isolated from cell pellets using the standard CTAB method recommended by the DOE Joint Genome Institute (JGI, Walnut Creek, CA, USA). In short, cells were resuspended in TE (10 mM Tris; 1 mM EDTA, pH 8.0). Subsequently, cells were lysed using lysozyme and proteinase K, and DNA was extracted and purified using CTAB and phenol:chloroform:isoamyl alcohol extractions. After precipitation in 2-propanol and washing in 70% ethanol, the DNA was resuspended in TE containing RNase. DNA quality and quantity were then checked by agarose gel electrophoresis in the presence of ethidium bromide and by spectrophotometric measurement using a NanoDrop ND-1000 spectrophotometer (NanoDrop® Technologies, Wilmington, DE, USA).

Genome sequencing and assembly
The genome was sequenced using a combination of Illumina and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website [21]. Pyrosequencing reads were assembled using the Newbler assembler (Roche).
The initial Newbler assembly, consisting of 81 contigs in five scaffolds, was converted into a phrap [22] assembly by making fake reads from the consensus, to collect the read pairs in the 454 paired end library. Illumina GAii sequencing data (570.2 Mb) were assembled with Velvet [23], and the consensus sequences were shredded into 1.5 kb overlapped fake reads and assembled together with the 454 data. The 454 draft assembly was based on 134.6 Mb 454 draft data and all of the 454 paired end data. Newbler parameters were -consed -a 50 -l 350 -g -m -ml 20. The Phred/Phrap/Consed software package [22] was used for sequence assembly and quality assessment in the subsequent finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution [21], Dupfinisher [24], or by sequencing cloned bridging PCR fragments with subcloning. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks (J.-F. Chang, unpublished). A total of 400 additional reactions and one shatter library were necessary to close gaps and to raise the quality of the finished sequence. Illumina reads were also used to correct potential base errors and increase consensus quality using the Polisher software developed at JGI [25]. The error rate of the completed genome sequence is less than 1 in 100,000. Together, the combination of the Illumina and 454 sequencing platforms provided 188.8× coverage of the genome. The final assembly contained 323,815 pyrosequence and 15,594,144 Illumina reads.
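The shredding step mentioned above can be illustrated with a minimal Python sketch. It is not the JGI implementation: the function name `shred` is hypothetical, and the step size of 750 bp (that is, the amount of overlap between neighbouring fake reads) is an assumption, since the text only gives the 1.5 kb read length.

```python
def shred(consensus, read_len=1500, step=750):
    """Cut a consensus sequence into overlapping pseudo-reads.

    read_len follows the 1.5 kb stated in the text; step (and hence the
    overlap of read_len - step) is an assumed value, not from the source.
    """
    if read_len <= 0 or not 0 < step <= read_len:
        raise ValueError("need read_len > 0 and 0 < step <= read_len")
    if len(consensus) <= read_len:
        # A short contig is kept as a single read (or nothing if empty).
        return [consensus] if consensus else []
    reads = [consensus[i:i + read_len]
             for i in range(0, len(consensus) - read_len + 1, step)]
    # Add one final read anchored at the contig end so the tail is covered.
    if (len(consensus) - read_len) % step != 0:
        reads.append(consensus[-read_len:])
    return reads
```

Every returned read has the full length unless the contig itself is shorter than `read_len`, which keeps the pseudo-reads comparable in size to long 454 reads when both are assembled together.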

Genome annotation
Genes were identified using Prodigal [26] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [27]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [28].

Genome properties and genome comparison with other strains
The genome of D. kuznetsovii consists of a circular chromosome of 3,601,386 bp with 54.88% GC content (Table 3 and Figure 3). Pseudogenes comprise 4.66% of the genes identified. Of the 3,625 genes predicted, 3,567 are protein-coding genes, of which 2,560 are assigned to COG functional categories. The distribution of these genes into COG functional categories is presented in Table 4.

Table footnote. Evidence codes: TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). Evidence codes are from the Gene Ontology project [19].

The genome of D. kuznetsovii has 58 RNA genes, of which three are 16S rRNA genes. This is one more than the previously described rrnA and rrnB [29]. These two rRNA genes contained two large inserts, one at the variable 5'-terminal region and one at the variable 3'-terminal region. The main differences between the two rRNA genes were found in these inserts. These inserts were hypothesized to be involved in the operation of ribosomes at high temperatures. However, more research is needed to assess the function of these inserts. All three rRNA genes of D. kuznetsovii have a size of approximately 1,700 nucleotides. This suggests that the third rRNA gene might also contain inserts. Alignment of the 16S rRNA genes confirmed the presence of inserts in all three 16S rRNA genes (data not shown). BLAST analysis [30,31] of the genes of D. kuznetsovii against genes in the KEGG Sequence Similarity DataBase revealed similarity with other Desulfotomaculum strains (Table 5), D. acetoxidans, D. carboxydivorans, "D. reducens" and D. ruminis, but interestingly also with non-Desulfotomaculum strains. D. kuznetsovii contains 873 genes with high similarity to genes of Pelotomaculum thermopropionicum, which is more than to any of the sequenced Desulfotomaculum species.
Moreover, we identified the conserved proteins of D. kuznetsovii across three related fully sequenced species (Table 6). The bidirectional best BLAST hits showed that, despite its smaller genome, P. thermopropionicum contained more predicted proteins homologous to those of D. kuznetsovii (1,406) than D. acetoxidans (1,309) and "D. reducens" (1,330). This suggests a strong physiological similarity between D. kuznetsovii and P. thermopropionicum.
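As an illustration of how bidirectional best BLAST hits can be derived, the following Python sketch pairs proteins that are each other's best hit. It is a sketch under stated assumptions, not the procedure used for Table 6: it assumes the standard 12-column BLAST tabular output, ranks hits by bit score (an assumption), and applies the filter reported with Table 6 (at least 40% identity over an alignment of at least 75 amino acids). The function names are hypothetical.

```python
import csv
import io

MIN_IDENTITY = 40.0  # percent identity threshold reported with Table 6
MIN_ALN_LEN = 75     # minimum alignment length in amino acids, same source


def best_hits(tabular_text):
    """Map each query to its highest-bit-score subject that passes the filter.

    Expects 12-column tabular BLAST output: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity, aln_len, bitscore = float(row[2]), int(row[3]), float(row[11])
        if identity < MIN_IDENTITY or aln_len < MIN_ALN_LEN:
            continue
        # Ranking by bit score is an assumption of this sketch.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best filtered hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

Counting the pairs returned for two proteomes gives a bidirectional figure of the kind shown in Table 6, whereas the size of `best_hits` alone corresponds to a unidirectional count.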

Insights into the genome
Involvement of the acetyl-CoA pathway in growth with acetate and methanol
D. kuznetsovii oxidizes acetate completely to CO2. The pathway of acetate degradation has not been studied yet, but sulfate reducers may employ the tricarboxylic acid (TCA) cycle or the acetyl-CoA pathway for acetate degradation, as exemplified by Desulfobacter postgatei and Desulfobacca acetoxidans, respectively [32]. Most genes predicted to code for enzymes of the TCA cycle are present in the genome of D. kuznetsovii, but genes with similarity to those coding for an ATP-dependent citrate synthase and isocitrate dehydrogenase are missing. This suggests that the TCA cycle is not complete and that the TCA cycle enzymes have mainly an anabolic function or a function in other catabolic pathways, such as the propionate degradation pathway. Genes with similarity to those coding for enzymes involved in the acetyl-CoA pathway are all present in the genome of D. kuznetsovii (Figure 4), which suggests its involvement in acetate oxidation. However, there are no genes similar to those that code for acetate kinase and phosphate acetyltransferase present in the genome. The reaction from acetate to acetyl-CoA is likely performed by acetyl-CoA synthetase (Desku_1241).

D. acetoxidans is an acetate-oxidizing Desulfotomaculum species, positioned in sub-group 1e (Figure 1), that also uses the acetyl-CoA pathway for acetate oxidation to CO2 [33]. The genes involved in acetate oxidation in D. acetoxidans are similar to those in D. kuznetsovii, but there are some exceptions. In contrast to D. kuznetsovii, the genome of D. acetoxidans does not contain a gene that putatively codes for acetyl-CoA synthetase, but it contains genes that putatively code for an acetate kinase and a phosphate acetyltransferase [34]. Additionally, the putative carbon-monoxide dehydrogenase complex genes involved in the acetyl-CoA pathway differ between the two Desulfotomaculum species. D. kuznetsovii lacks a ferredoxin-coding gene between cooC (Desku_1493) and acsE (Desku_1487), whereas such a gene is present in the genome of D. acetoxidans (Dtox_1273). Moreover, three genes similar to heterodisulfide reductase encoding genes (Desku_1486-1484) are located upstream of acsE in D. kuznetsovii, which is not the case in the genome of D. acetoxidans.

Table 6 footnote: † BLAST analyses were performed using standard settings and best hits were filtered for 40% identity over an alignment length of 75 amino acids as a minimum requirement. The values show the number of predicted proteins that are homologous to the query species in each row. The number of similar proteins obtained with a unidirectional BLAST is indicated in light blue. Bidirectional best BLAST hits are indicated in dark blue. Proteomes were obtained from ftp.ncbi.nih.gov/Bacteria/. Accession numbers are in parentheses: Desulfotomaculum acetoxidans (NC_013216); Desulfotomaculum kuznetsovii (NC_015573); "Desulfotomaculum reducens" (NC_009253); Pelotomaculum thermopropionicum (NC_009454).

Figure 4 legend (partial): Genes with the locus tags Desku_1488 and Desku_1490 putatively code for the small subunit and the large subunit of the iron-sulfur protein, respectively. This protein is involved in transferring the methyl from acetyl-CoA to tetrahydrofolate. Abbreviations: A-CoA S, acetyl-CoA synthetase; AcsA, carbon-monoxide dehydrogenase; AcsB, acetyl-CoA synthase; CFeSP, iron-sulfur protein; CH3, methyl; THF, tetrahydrofolate; MeTr, methyltransferase.

Methanol metabolism
Growth of D. kuznetsovii with methanol and sulfate was studied [35]. In that study the activity of methyltransferase, an enzyme that is involved in methanol metabolism in methanogens and acetogens [36,37], could not be assessed, while low activities of an alcohol dehydrogenase could be measured. An alcohol dehydrogenase with a molecular mass of 42 kDa was partially purified and showed activity with methanol [35]. The genome of D. kuznetsovii contains several alcohol dehydrogenase genes (Desku_0165, 0619, 0624, 0628, 2955, 3082) that each code for an enzyme with a size of approximately 42 kDa. In the genome, genes with similarity to those coding for a methanol methyltransferase mtaA (Desku_0050, 0055, 0060), mtaB (Desku_0051) and mtaC (Desku_0048, 0049, 0052, 0056) were also found, suggesting a methanol metabolism as described in Moorella thermoacetica [36]. Further studies are needed to obtain information about the diversity of the methanol-degradation pathways in D. kuznetsovii.

Comparison of D. kuznetsovii and P. thermopropionicum genomes
Genomic comparison revealed that a large number of D. kuznetsovii genes show similarity to genes of Pelotomaculum thermopropionicum, a syntrophic propionate-oxidizing thermophile (Tables 5 and 6). Interestingly, among them are genes that putatively code for enzymes involved in propionate metabolism (Table 7). Moreover, the genetic organization of the methylmalonyl-CoA (mmc) cluster in the genomes of both bacteria is similar (Figure 5). However, D. kuznetsovii lacks tps, mmcA and mmcM in the mmc cluster. mmcA codes for a response regulator and mmcM for pyruvate ferredoxin oxidoreductase.
Based on 16S rRNA gene sequences, D. kuznetsovii and P. thermopropionicum group in sub-groups 1c and 1h of Desulfotomaculum cluster 1, respectively (see Figure 1). P. thermopropionicum is known for its ability to grow with propionate and ethanol in syntrophic association with methanogens. It is not able to grow by sulfate respiration, despite the presence of sulfate reduction genes in the genome [38]. In contrast, D. kuznetsovii is able to grow with propionate (Figure 6) and ethanol with sulfate. However, in the absence of sulfate, it cannot grow in syntrophic association with methanogens. Therefore, differences are expected in genes coding for hydrogenases, formate dehydrogenases, and those involved in sulfate reduction. Figure 7 depicts the sulfate reduction pathway of the two strains. In the genome of D. kuznetsovii two genes (Desku_2103; Desku_3527) are annotated as phosphoadenosine phosphosulfate reductase encoding genes whose corresponding proteins might be involved in assimilatory sulfate metabolism. The P. thermopropionicum genome lacks these genes [39]. Instead, the P. thermopropionicum genome contains an adenylylsulfate kinase gene (PTH_0238). In the dissimilatory sulfate reduction pathway, the two strains both have genes that code for enzymes to reduce sulfate to H2S. However, P. thermopropionicum is missing the gene that codes for an adenylylsulfate reductase beta subunit, which is present in the D. kuznetsovii genome (Desku_1073). Moreover, the gene labeled as a dissimilatory sulfite reductase (dsr) alpha and beta subunit in the P. thermopropionicum genome (PTH_0242) is not similar to dsrA or dsrB from D. kuznetsovii or any other Desulfotomaculum strain. However, it has high similarity to the dsrC gene from D. kuznetsovii, indicating that it is not a dsrA or dsrB gene but a dsrC gene (data not shown). Therefore, the inability of P. thermopropionicum to grow by sulfate respiration is most likely caused by the absence of an adenylylsulfate reductase beta subunit encoding gene and the dsrAB genes.
The genome of D. kuznetsovii was screened for hydrogenase and formate dehydrogenase encoding gene clusters with BLAST analysis. Pfam search [44] was used to identify motifs in the amino acid sequences and the TMHMM Server v. 2.0 [43] was used to screen for transmembrane helices. The TatP 1.0 Server was used to screen for twin-arginine translocation (Tat) motifs in the N-terminus to predict protein localization in the cell [45]. The incorporation of selenocysteine (SeCys) was examined by RNA loop predictions with Mfold version 3.2 [46,47]. The predicted RNA loop in the 50-100 bp region downstream of the UGA codon was compared with the consensus loop described earlier [48].

Figure legend (partial): Gene locus tag numbers and α-, β-, and γ-subunits are depicted. Moreover, predicted iron-sulfur clusters and metal-binding sites are indicated.
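The first step of the selenocysteine screen described above, locating in-frame UGA codons and collecting the sequence downstream of them for RNA folding, might look like the following Python sketch. This is an illustration only: the function name is hypothetical, the default window of 100 nucleotides is one reading of the "50-100 bp region" and therefore an assumption, and the folding itself (done with Mfold in the text) is not reproduced here.

```python
def inframe_uga_windows(cds, window=100):
    """Find internal in-frame UGA/TGA codons and return downstream sequence.

    cds is a coding sequence starting at the first base of the start codon;
    it may be given as DNA or RNA. The final codon is skipped because it is
    the regular stop codon. window is an assumed default, not from the text.
    """
    seq = cds.upper().replace("U", "T")
    # Start position of the last complete codon in the reading frame.
    last_codon_start = len(seq) - len(seq) % 3 - 3
    hits = []
    for pos in range(0, last_codon_start, 3):
        if seq[pos:pos + 3] == "TGA":
            # Keep the codon position and up to `window` bases after it.
            hits.append((pos, seq[pos + 3:pos + 3 + window]))
    return hits
```

Each returned downstream fragment could then be passed to an RNA secondary-structure predictor and the resulting loop compared with a consensus selenocysteine insertion element. In practice the input would need to include sequence beyond the annotated stop codon when the UGA lies close to the 3' end of the gene.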
Apart from a possible involvement in the acetate oxidation pathway (Figure 4), it remains unclear for which purpose D. kuznetsovii uses its confurcating formate dehydrogenase and hydrogenases because our genome analysis indicates that pyruvate oxidation during propionate degradation generates formate instead of ferredoxin ( Figure 6).

Vitamin synthesis
D. kuznetsovii is able to grow in medium without vitamins [1]. This indicates that D. kuznetsovii is able to synthesize all the vitamins that are required for its metabolism and that vitamin synthesis genes should be present in the genome. Vitamin B12 is essential for the methylmalonyl-CoA pathway and the acetyl-CoA pathway. The biosynthesis of cobalamin (vitamin B12) is known to occur from uroporphyrinogen-III to adenosylcobalamin via two possible pathways, the aerobic and anaerobic pathway of the corrinoid ring [49,50] [...] (vitamin B5) synthesis (Desku_3262). The genes involved in coenzyme A production from pantothenate are also present in the D. kuznetsovii genome (Desku_1254, 1307, 3145, 3200). Moreover, genes involved in the biosynthesis of pyridoxine (vitamin B6) via the deoxyxylulose 5-phosphate (DXP) independent route were found in the genome (Desku_0007, 0008). These genes code for two enzymes that facilitate the conversion of glutamine to the active form of vitamin B6, pyridoxal 5'-phosphate [51].
Menaquinone (vitamin K) and ubiquinone (coenzyme Q10) biosynthesis is important because of the electron transport function in the membranes. The genes that code for the biosynthesis enzymes from polyprenyldiphosphate to menaquinone and ubiquinone are present in the D. kuznetsovii genome (Desku_0124, 0126, 0629, 1551-1554, 1829, 2629 and 3525), except for the genes that code for a 2-polyprenyl-6-methoxyphenol 4-monooxygenase (UbiH) and 2-polyprenyl-3-methyl-6-methoxy-1,4-benzoquinone hydroxylase (UbiF). Additionally, three genes (Desku_1548-1550) could be identified as putative menaquinone biosynthesis genes and are part of a menaquinone biosynthesis gene cluster (Desku_1548-1554). The products of those three genes could be involved in the reactions of the missing UbiH and UbiF encoding genes.
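The abbreviated locus-tag notation used in this section (for example "Desku_0124, 0126, 0629, 1551-1554, 1829, 2629 and 3525") can be expanded into full identifiers before looking the genes up in an annotation table. The Python sketch below is a hypothetical helper, not part of the published analysis, and it assumes that a range such as 1551-1554 denotes consecutively numbered locus tags with four-digit zero padding.

```python
import re


def expand_locus_tags(text, prefix="Desku_", width=4):
    """Expand shorthand such as 'Desku_0124, 0126, 1551-1554 and 3525'.

    Assumes ranges are inclusive and that tags are numbered consecutively;
    numbers are zero-padded to `width` digits.
    """
    tags = []
    # Remove the prefix, then read bare numbers or number ranges in order.
    body = text.replace(prefix, "")
    for match in re.finditer(r"(\d+)(?:\s*-\s*(\d+))?", body):
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        low, high = min(first, last), max(first, last)
        tags.extend(f"{prefix}{n:0{width}d}" for n in range(low, high + 1))
    return tags
```

A range written in descending order, such as Desku_1486-1484, is returned in ascending order by this sketch.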
Folate (vitamin B9) biosynthesis is also of great importance for D. kuznetsovii, because it is an essential part of the acetyl-CoA pathway. It is involved in the transfer of one-carbon compounds and can be biosynthesized from chorismate and guanosine triphosphate (GTP) [52-55]. Both pathways use a dihydropteroate synthase to produce dihydropteroate. The genome of D. kuznetsovii contains the genes encoding the enzymes involved in the pathway from chorismate to dihydropteroate (Desku_0219, 2268-2269) and from GTP to dihydropteroate (Desku_0210, 0219-0221 and 1419). The gene encoding a phosphatase (Desku_0210) in the D. kuznetsovii genome is probably involved in the removal of phosphate groups from dihydropterine triphosphate as a substitute for an alkaline phosphatase encoding gene, which is not present in the genome. Additionally, the genome contains a bifunctional protein encoding gene (Desku_404) that is expected to be responsible for the production of dihydrofolate (DHF) and the addition of multiple glutamate moieties to DHF or tetrahydrofolate (THF). However, the D. kuznetsovii genome lacks the DHF reductase encoding gene, which is required to reduce DHF to THF. The DHF reductase encoding gene appears to be absent in many microorganisms [56]. Levin et al. (2004) propose that in Halobacterium salinarum a dihydrofolate synthase and a dihydropteroate synthase domain are able to replace the function of the DHF reductase. Additionally, the authors show that, when using a BLAST search, homologs of these polypeptides can be found in organisms that lack a DHF reductase [56]. However, BLAST results showed no homologous protein encoding gene in the genome of D. kuznetsovii (data not shown). How DHF is reduced to THF in D. kuznetsovii cannot currently be deduced from the genome sequence.