Open Access

Complete genome sequence of Dehalogenimonas lykanthroporepellens type strain (BL-DC-9T) and comparison to “Dehalococcoides” strains

  • Shivakumara Siddaramappa
  • Jean F. Challacombe
  • Susana F. Delano
  • Lance D. Green
  • Hajnalka Daligault
  • David Bruce
  • Chris Detter
  • Roxanne Tapia
  • Shunsheng Han
  • Lynne Goodwin
  • James Han
  • Tanja Woyke
  • Sam Pitluck
  • Len Pennacchio
  • Matt Nolan
  • Miriam Land
  • Yun-Juan Chang
  • Nikos C. Kyrpides
  • Galina Ovchinnikova
  • Loren Hauser
  • Alla Lapidus
  • Jun Yan
  • Kimberly S. Bowman
  • Milton S. da Costa
  • Fred A. Rainey
  • William M. Moe

DOI: 10.4056/sigs.2806097

Received: 15 May 2012

Published: 25 May 2012

Abstract

Dehalogenimonas lykanthroporepellens is the type species of the genus Dehalogenimonas, which belongs to a deeply branching lineage within the phylum Chloroflexi. This strictly anaerobic, mesophilic, non-spore-forming, Gram-negative staining bacterium was first isolated from chlorinated solvent-contaminated groundwater at a Superfund site located near Baton Rouge, Louisiana, USA. D. lykanthroporepellens was of interest for genome sequencing for two reasons: (a) an unusual ability to couple growth with reductive dechlorination of environmentally important polychlorinated aliphatic alkanes and (b) a phylogenetic position that is distant from previously sequenced bacteria. The 1,686,510 bp circular chromosome of strain BL-DC-9T contains 1,720 predicted protein coding genes, 47 tRNA genes, a single large subunit rRNA (23S-5S) locus, and a single, orphan, small subunit rRNA (16S) locus.

Keywords:

reductive dechlorination, groundwater, strictly anaerobic, hydrogen utilization, contamination, Chloroflexi

Introduction

Strain BL-DC-9T (=JCM 15061, =ATCC BAA-1523) is the type strain of the species Dehalogenimonas lykanthroporepellens, which is the type species of the genus Dehalogenimonas [1]. At the time of publication, D. lykanthroporepellens is the only validly named species in this genus. The type strain was isolated from moderately acidic groundwater (pH=5.1) collected at a waste recovery well at the Petro-Processors of Louisiana, Inc. Superfund Site, near Baton Rouge, Louisiana (USA), in an area contaminated by high concentrations of several chlorinated alkanes and alkenes [1-3]. The strain is able to reductively dehalogenate a variety of environmentally important polychlorinated alkanes but not monochlorinated alkanes, chlorinated alkenes, or chlorinated benzenes [1,3]. Quantitative real-time PCR experiments indicate that bacteria closely related or identical to D. lykanthroporepellens are present throughout the contaminated site from which strain BL-DC-9T was first isolated [4]. In this report, we present a summary classification and a set of features for D. lykanthroporepellens BL-DC-9T together with the description of the complete genomic sequencing and annotation.

Classification and features

At present, D. lykanthroporepellens strain BL-DC-9T is phylogenetically isolated within the domain Bacteria, with no other species assigned to the genus Dehalogenimonas. On the basis of 16S rRNA gene sequences, strain BL-DC-9T clusters within the phylum Chloroflexi. Based on 16S rRNA gene sequences, the closest related type strains are Caldilinea tarbellica D1-25-10-4T [5] and Caldilinea aerophila STL-6-O1T [6], with sequence identities of 81.7% and 81.5%, respectively [7]. Aside from the closely related strain BL-DC-8 that was isolated from the same groundwater source as strain BL-DC-9T, the closest previously cultured phylogenetic relatives of strain BL-DC-9T are “Dehalococcoides” strains [1,3]. Although some variable regions of the 16S rRNA genes of D. lykanthroporepellens and “Dehalococcoides” strains are highly homologous [4], the overall identity of these genes is ~90%, indicating a distant relationship [1].
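The identity values quoted above are pairwise comparisons of aligned 16S rRNA gene sequences. The source does not state exactly how gapped positions were treated, so the following is only a minimal sketch, assuming identity is counted over alignment columns in which neither sequence has a gap. The function name and the toy alignment are hypothetical and are not real 16S rRNA data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are ignored; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, for illustration only
seq_a = "ACGTACGT-A"
seq_b = "ACGAACGTTA"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Different gap conventions can shift the reported value slightly, which is one reason identity figures from different tools are not always directly comparable.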

Figure 1 shows the phylogenetic neighborhood of D. lykanthroporepellens strain BL-DC-9T in a 16S rRNA gene based phylogenetic dendrogram. The sequence of the lone 16S rRNA gene copy in the genome differs from the previously published 16S rRNA gene sequence (EU679419) by a single nucleotide position.

Figure 1

Phylogenetic tree showing the position of D. lykanthroporepellens BL-DC-9T relative to other strains within the phylum Chloroflexi. The tree was inferred from 1,214 aligned nucleotide positions of the 16S rRNA gene sequence using the Neighbor-Joining method within the MEGA v4.0.2 package [8]. Truepera radiovictrix RQ-24T is shown as an outgroup and was used to root the tree. Scale bar represents 2 substitutions per 100 nucleotide positions. Numbers at branching points denote support values from 1,000 bootstrap replicates if larger than 70%. Lineages with genome sequencing projects registered in GOLD [9] are shown in blue, published genomes in bold: “Dehalococcoides” GT (CP001924), “Dehalococcoides” BAV1 (CP000688), “Dehalococcoides” CBDB1 (AJ965256), “Dehalococcoides ethenogenes” 195 (CP000027), “Dehalococcoides” VS (CP001827), Dehalogenimonas lykanthroporepellens BL-DC-9T (CP002084), Caldilinea aerophila STL-6-O1T (AP012337), Anaerolinea thermophila UNI-1T (AP012029), “Thermobaculum terrenum” YNP1 (CP001825), Sphaerobacter thermophilus DSM 20745T (CP001823), Thermomicrobium roseum DSM 5159T (CP001275), Herpetosiphon aurantiacus DSM 785T (CP000875), Roseiflexus castenholzii DSM 13941T (CP000804), Chloroflexus aurantiacus J-10-flT (CP000909), and Chloroflexus aggregans MD-66T (CP001337).

The cells of D. lykanthroporepellens stain Gram-negative, and are non-spore-forming, irregular cocci with a diameter of 0.3-0.6 µm (Figure 2). Strains of D. lykanthroporepellens were isolated in liquid medium using a dilution-to-extinction approach. Growth was not observed on agar plates or on medium solidified with gellan gum, even after long-term (2 months) incubation [3]. The temperature range for growth of strain BL-DC-9T is between 20°C and 37°C with an optimum between 28°C and 34°C [3]. The pH range for growth is 6.0 to 8.0 with an optimum of 7.0 to 7.5 [3]. The organism grows in the presence of 2% (w/v) NaCl and is resistant to ampicillin and vancomycin at concentrations of 1.0 and 0.1 g/l, respectively [3].

Figure 2

Scanning electron micrograph of cells of D. lykanthroporepellens strain BL-DC-9T

D. lykanthroporepellens strain BL-DC-9T is a strictly anaerobic chemotroph, coupling utilization of H2 as an electron donor and several environmentally important polychlorinated aliphatic alkanes as electron acceptors for growth (Table 1). The chlorinated compounds known to be reductively dehalogenated include 1,2,3-trichloropropane, 1,2-dichloropropane, 1,1,2,2-tetrachloroethane, 1,1,2-trichloroethane, and 1,2-dichloroethane [3]. In all of the reductive dechlorination reactions characterized to date, strain BL-DC-9T appears to exclusively utilize vicinally halogenated alkanes as electron acceptors via dihaloelimination reactions (i.e., simultaneous removal of two chlorine atoms from adjacent carbon atoms with concomitant formation of a carbon-carbon double bond) [1,3]. Strain BL-DC-9T does not utilize 1-chlorobenzene, 1-chloropropane, 2-chloropropane, 1,2-dichlorobenzene, cis-1,2-dichloroethene, trans-1,2-dichloroethene, tetrachloroethene, or vinyl chloride as electron acceptors for growth [1,3]. Growth is not supported by acetate, butyrate, citrate, ethanol, fructose, fumarate, glucose, lactate, lactose, malate, methanol, methyl ethyl ketone, propionate, pyruvate, succinate, or yeast extract in the absence of H2 [1,3].

Table 1

Classification and general features of D. lykanthroporepellens strain BL-DC-9T according to the MIGS recommendations [10]

MIGS ID   Property                Term                                          Evidence code
          Classification          Domain Bacteria                               TAS [11]
                                  Phylum Chloroflexi                            TAS [12,13]
                                  Class not reported                            TAS [1]
                                  Order not reported                            TAS [1]
                                  Family not reported                           TAS [1]
                                  Genus Dehalogenimonas                         TAS [1]
                                  Species Dehalogenimonas lykanthroporepellens  TAS [1]
                                  Type strain BL-DC-9                           TAS [1]
          Gram stain              Negative                                      TAS [3]
          Shape                   Coccoid, irregular                            TAS [1]
          Motility                Not motile                                    TAS [1]
          Sporulation             Nonsporulating                                TAS [3]
          Temperature range       20-37°C                                       TAS [3]
          Optimum temperature     28-34°C                                       TAS [3]
          Salinity                ≤0.1% - ≥2% (optimum ≤1%)                     TAS [1,3]
MIGS-22   Oxygen requirement      Obligate anaerobe                             TAS [1,3]
          Carbon source           Not reported
          Energy source           Chemotrophic                                  TAS [1,3]
MIGS-6    Habitat                 Groundwater                                   TAS [1,3,4]
MIGS-15   Biotic relationship     Free-living                                   NAS
MIGS-14   Pathogenicity           None                                          NAS
          Biosafety level         1                                             NAS
          Isolation               Solvent contaminated groundwater              TAS [1,3]
MIGS-4    Geographic location     Louisiana, USA                                TAS [1,3,4]
MIGS-5    Sample collection time  2005                                          IDA
MIGS-4.1  Latitude                30.581
MIGS-4.2  Longitude               -91.244                                       IDA
MIGS-4.3  Depth                   14 m                                          IDA
MIGS-4.4  Altitude                16.5 m                                        IDA

Evidence codes - IDA: Inferred from Direct Assay (first time in publication); TAS: Traceable Author Statement (i.e. a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e. not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [14]. If the evidence code is IDA, then the property was directly observed for a living isolate by one of the authors or an expert mentioned in the acknowledgements.

Chemotaxonomy

The major cellular fatty acids of strain BL-DC-9T, as identified and quantified with the Sherlock MIS v. 6.0 system (Microbial Identification, Inc.) using the Aerobe (TSBA) and MOORE libraries, are C18:1ω9c, C16:1ω9c, C16:0, and C14:0 [1]. The same fatty acids were also present in the closely related strain BL-DC-8 [1]. Cellular fatty acids present in lower proportions include C18:0, C12:0, and the summed features listed in the MIDI Sherlock system as summed feature 5 (C18:2ω6,9c and/or anteiso-C18:0) and summed feature 3 (one or more of C16:1ω7c, C16:1ω6c, iso-C15:0 3OH) [1].

Genome sequencing and annotation

Genome project history

D. lykanthroporepellens strain BL-DC-9T was selected for sequencing on the basis of its phylogenetic position and the importance of reductive dechlorination in the field of environmental microbiology and bioremediation. A detailed understanding of the metabolic capabilities of chloroalkane-dehalogenating bacteria such as D. lykanthroporepellens has the potential to impact decision-making regarding site clean-up at thousands of DOE and non-DOE sites across the USA and around the world. The D. lykanthroporepellens strain BL-DC-9T genome project is deposited in the Genomes OnLine Database [9] and the complete genome sequence is available from GenBank. Sequencing, finishing, and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Table 2

Genome sequencing project information

MIGS ID    Property                     Term
MIGS-31    Finishing quality            Finished
MIGS-28    Libraries used               Three genomic libraries: one 454 standard library, one 454 paired end (19.8 kb insert size), and one Illumina library
MIGS-29    Sequencing platforms         Illumina GAii, 454 GS FLX Titanium
MIGS-31.2  Sequencing coverage          90× Illumina GAii, 80× pyrosequence
MIGS-30    Assemblers                   Newbler, Velvet, phrap
MIGS-32    Gene calling method          Prodigal 1.4, GenePRIMP
           INSDC ID                     CP002084, NC_014324
           GenBank Date of Release      June 25, 2010
           GOLD ID                      Gc01367
           NCBI project ID              40221
           Database: IMG                648028022
MIGS-13    Source material identifier   BL-DC-9 (=JCM 15061 =ATCC BAA-1523)
           Project relevance            Environmental, Tree of Life

Growth conditions and DNA isolation

D. lykanthroporepellens strain BL-DC-9T (=JCM 15061, =ATCC BAA-1523) was cultured in liquid anaerobic basal medium supplemented with 1,1,2-trichloroethane as described previously [1]. Cells were harvested from 2.0 L of culture medium by centrifugation (10,000×g, 10 min, 4°C). Total DNA was extracted from the resulting cell pellet using a combination of lysozyme/SDS/proteinase K treatment, followed by purification using hexadecyltrimethyl ammonium bromide (CTAB) in conjunction with phenol-chloroform-isoamyl alcohol purification, and ethanol precipitation [15].

Genome sequencing and assembly

The genome of D. lykanthroporepellens BL-DC-9T was sequenced at the JGI using a combination of Illumina [16] and 454 technologies [17]. An Illumina GAii shotgun library yielding 152.4 Mb of reads, a 454 Titanium draft library with an average read length of 356.7±167 bases, and a paired end 454 library with an average insert size of 19,767±4,941 bp were generated for this genome. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website [18]. Illumina sequencing data were assembled with VELVET [19], and the consensus sequences were shredded into 1.5 kb overlapped fake reads and assembled together with the 454 data. Draft assemblies were based on 103.1 Mb of 454 standard draft data and all of the 454 paired end data (38,136 reads that were both mapped and non-redundant). The Newbler parameters were -consed -a 50 -l 350 -g -m -ml 20.

The initial Newbler assembly contained 64 contigs in 1 scaffold. The initial 454 assembly was converted into a phrap [20] assembly by making fake reads from the consensus, and collecting the read pairs in the 454 paired end library. The Phred/Phrap/Consed software package [18] was used for sequence assembly and quality assessment [21-23] during the finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution [Cliff Han, unpublished], Dupfinisher [24], or sequencing cloned bridging PCR fragments with subcloning (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks. A total of 296 additional reactions were necessary to close gaps and to raise the quality of the finished sequence.

The error rate of the completed genome sequence is less than 1 in 100 kb. Together, the Illumina and 454 sequencing platforms provided 170× coverage of the genome. The final assembly contained 36,976 reads.
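The coverage figures can be checked with simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome length. The sketch below uses the 152.4 Mb Illumina yield and the 1,686,510 bp chromosome length reported in this article; the function and variable names are illustrative assumptions, and the 80× pyrosequence figure is taken from Table 2 rather than recomputed.

```python
GENOME_SIZE_BP = 1_686_510   # finished chromosome length reported in this article
ILLUMINA_BASES = 152.4e6     # Illumina GAii shotgun yield, 152.4 Mb


def fold_coverage(total_bases: float, genome_size_bp: int) -> float:
    """Average depth of coverage: sequenced bases divided by genome length."""
    return total_bases / genome_size_bp


illumina_cov = fold_coverage(ILLUMINA_BASES, GENOME_SIZE_BP)
print(f"Illumina coverage: about {illumina_cov:.1f}x")

# Table 2 lists 90x Illumina and 80x pyrosequence coverage;
# their sum gives the combined figure quoted in the text.
combined_cov = 90 + 80
print(f"Combined coverage: {combined_cov}x")
```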

Genome annotation

Genes were identified using Prodigal [25] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [26]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE [27], RNAmmer [28], Rfam [29], TMHMM [30], and SignalP [31]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [32].

Genome properties

The genome of D. lykanthroporepellens strain BL-DC-9T comprises a single circular chromosome of 1,686,510 bp with 55.04% G+C content (Table 3 and Figure 3). Of the 1,771 genes predicted, 1,720 were protein-coding genes and 51 were RNAs; 61 pseudogenes were also identified. The majority of the protein-coding genes (68.8%) were assigned a putative function and those remaining were annotated as hypothetical proteins. The distribution of the predicted protein coding genes into COG functional categories is presented in Table 4.

Table 3

Genome Statistics

Attribute                          Value        % of Total
Genome size (bp)                   1,686,510    100.00%
DNA coding region (bp)             1,479,636    87.73%
DNA G+C content (bp)               928,329      55.04%
Number of replicons                1
Extrachromosomal elements          0
Total genes                        1,771        100.00%
RNA genes                          51           2.65%
rRNA operons                       1(a)
Protein-coding genes               1,720        97.33%
Pseudo genes                       61           3.44%
Genes with function prediction     1,219        68.83%
Genes in paralog clusters          240          13.55%
Genes assigned to COGs             1,257        70.98%
Genes assigned Pfam domains        1,290        72.84%
Genes with signal peptides         417          23.55%
Genes with transmembrane helices   347          19.59%
CRISPR repeats                     0

(a) The genome contains a single large subunit rRNA (23S-5S) locus and a single, orphan, small subunit rRNA (16S) locus.
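Several of the percentages in Table 3 follow directly from the base-pair and gene counts. As a minimal sketch (the helper name is an assumption, and only three rows are recomputed here), the G+C content, coding density, and fraction of genes with a function prediction can be derived as follows.

```python
GENOME_SIZE_BP = 1_686_510
TOTAL_GENES = 1_771


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


gc_percent = percent(928_329, GENOME_SIZE_BP)        # G+C bases / genome size
coding_percent = percent(1_479_636, GENOME_SIZE_BP)  # coding bases / genome size
function_percent = percent(1_219, TOTAL_GENES)       # genes with predicted function

print(f"G+C content:         {gc_percent}%")
print(f"Coding density:      {coding_percent}%")
print(f"Function prediction: {function_percent}%")
```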

Figure 3

Graphical circular map of the chromosome of D. lykanthroporepellens strain BL-DC-9T. From outside to the center: Genes on the forward strand (color by COG categories), Genes on the reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content (peaks outside the circle indicate above average and peaks inside the circle indicate below average GC content), GC skew [calculated as (G − C) / (G + C); purple indicates values less than 0 and olive indicates values greater than 0]. Numbers outside the map denote nucleotide positions within the chromosome.
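The GC content and GC skew tracks in a map of this kind are typically computed over windows along the chromosome. The following is a minimal sketch of that calculation using the (G − C) / (G + C) formula given in the caption; the window size, step, function names, and toy sequence are assumptions for illustration and are not the settings used to draw Figure 3. The skew is positive where G exceeds C and negative where C exceeds G.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); defined as 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full-length window."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


# Hypothetical toy sequence; a real analysis would read the chromosome from a FASTA file
toy = "GGGGCCATATGCGCCCCC"
for start, win in sliding_windows(toy, window=6, step=6):
    print(start, win, f"GC={gc_content(win):.2f}", f"skew={gc_skew(win):+.2f}")
```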

Table 4

Number of genes associated with the general COG functional categories*

Code  Value  %age   Description
J     126    9.2    Translation, ribosomal structure and biogenesis
A     0      0.0    RNA processing and modification
K     73     5.3    Transcription
L     148    10.8   Replication, recombination and repair
B     2      0.2    Chromatin structure and dynamics
D     13     1.0    Cell cycle control, cell division, chromosome partitioning
Y     0      0.0    Nuclear structure
V     19     1.4    Defense mechanisms
T     84     6.2    Signal transduction mechanisms
M     25     1.8    Cell wall/membrane biogenesis
N     13     1.0    Cell motility
Z     0      0.0    Cytoskeleton
W     0      0.0    Extracellular structures
U     30     2.2    Intracellular trafficking, secretion, and vesicular transport
O     64     4.7    Posttranslational modification, protein turnover, chaperones
C     127    9.3    Energy production and conversion
G     45     3.3    Carbohydrate transport and metabolism
E     126    9.2    Amino acid transport and metabolism
F     47     3.4    Nucleotide transport and metabolism
H     84     6.2    Coenzyme transport and metabolism
I     34     2.5    Lipid transport and metabolism
P     56     4.1    Inorganic ion transport and metabolism
Q     11     0.8    Secondary metabolites biosynthesis, transport and catabolism
R     142    10.4   General function prediction only
S     97     7.1    Function unknown
-     514    29.0   Not in COGs

*detailed COG categorization is available at the IMG web site [15].

Insights from the genome sequence

Analysis of the complete genome sequence of strain BL-DC-9T and its comparison with the genomes of “Dehalococcoides” strains sequenced previously provide several insights into the evolution and adaptation of the organism to its niche. Transposon-mediated horizontal gene transfer appears to have played a major role in creating the genomic diversity and metabolic versatility in strain BL-DC-9T.

Horizontal gene transfer

Strain BL-DC-9T contains a prophage region (1,604,159 to 1,672,879 bp, ~60% GC), which accounts for ~4% of the chromosome. Of the 76 ORFs identified within this region, 45 have been annotated as encoding hypothetical proteins and the vast majority of these have no homologs in the public databases. In addition to the prophage region, manual curation indicated that ~4.3% of the genome of strain BL-DC-9T (~73,000 bp) consists of insertion sequence (IS) elements encoding 74 full-length or truncated transposases. These IS elements are scattered throughout the chromosome and their GC content varies from 47% to 57%. The IS elements of strain BL-DC-9T belong to the families IS256 (29 of 74), IS3/IS911 (14 of 74), IS3/IS600 (10 of 74), IS4/IS5 (7 of 74), IS4/IS5/ISMca7 (5 of 74), IS1182 (4 of 74), IS116/IS110/IS902 (2 of 74), IS204/IS1001/ISL3 (2 of 74), and IS6/ISCpe7 (1 of 74).
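The proportions quoted in this paragraph can be reproduced from the coordinates and counts given. The sketch below is a worked arithmetic check only; the variable names are illustrative, and the prophage length assumes the reported coordinates are inclusive.

```python
GENOME_SIZE_BP = 1_686_510
PROPHAGE_START, PROPHAGE_END = 1_604_159, 1_672_879

# Assumption: both coordinates are inclusive, hence the +1
prophage_len = PROPHAGE_END - PROPHAGE_START + 1
prophage_percent = 100.0 * prophage_len / GENOME_SIZE_BP

# Approximate IS element content reported in the text (~73,000 bp)
is_percent = 100.0 * 73_000 / GENOME_SIZE_BP

# Transposase counts per IS family, as listed in the text
IS_FAMILY_COUNTS = {
    "IS256": 29,
    "IS3/IS911": 14,
    "IS3/IS600": 10,
    "IS4/IS5": 7,
    "IS4/IS5/ISMca7": 5,
    "IS1182": 4,
    "IS116/IS110/IS902": 2,
    "IS204/IS1001/ISL3": 2,
    "IS6/ISCpe7": 1,
}
total_transposases = sum(IS_FAMILY_COUNTS.values())

print(f"Prophage: {prophage_len:,} bp ({prophage_percent:.1f}% of chromosome)")
print(f"IS elements: {is_percent:.1f}% of chromosome")
print(f"Transposases across families: {total_transposases}")
```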

tRNAs and Selenocysteine utilization

The chromosome of strain BL-DC-9T contains 47 tRNA genes, including those for all 20 standard amino acids as well as the unusual amino acid selenocysteine. Proteins containing selenocysteine are found in all three domains of life and many organisms contain genes encoding the complex molecular machinery required for the incorporation of this modified amino acid during the translation process [33,34]. Strain BL-DC-9T contains an operon (selCDAB) putatively involved in selenocysteine biosynthesis. selC encodes a selenocysteine-inserting tRNA (tRNAsec), which contains the complementary UCA anticodon for the internal UGA stop codon (Dehly_R0051). A gene that is not part of this operon encodes a seryl-tRNA synthetase (Dehly_0621), which catalyzes the aminoacylation of tRNAsec with serine. selD encodes a selenophosphate synthetase (Dehly_1500), an enzyme that produces monoselenophosphate using selenide and ATP as substrates. selA encodes a selenocysteine synthase (Dehly_1501), which utilizes monoselenophosphate as the selenium donor during the conversion of serine-acylated tRNAsec into selenocysteine-tRNAsec. selB encodes a GTP-dependent selenocysteine-specific elongation factor (Dehly_1502), which forms a quaternary complex with selenocysteine-tRNAsec and the selenocysteine insertion sequence (SECIS), which is a hairpin loop found immediately downstream of the UGA codon in the selenoprotein-encoding mRNA molecule [35]. This complex ensures reading through the UGA codon and incorporation of selenocysteine, instead of termination of translation [36]. Consistent with the presence of the genes encoding the synthesis and incorporation of selenocysteine, strain BL-DC-9T also contains a gene encoding a selenocysteine-containing formate dehydrogenase (Dehly_0033). This gene has an internal in-frame UGA stop codon (574 bp from the AUG start codon), which is followed by a 48 bp putative SECIS element.
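Candidate selenoprotein genes are often flagged by looking for a UGA codon that lies in the reading frame but before the end of the coding sequence. The following is a minimal sketch of such a scan, not the method used for this annotation; the function name and the toy sequence are hypothetical, and a real screen would also need to check for a downstream SECIS element.

```python
from typing import List


def internal_inframe_tga(cds: str) -> List[int]:
    """Return 0-based nucleotide offsets of in-frame TGA/UGA codons,
    excluding the terminal codon of the coding sequence."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    offsets = []
    # Stop before the last codon so the normal stop codon is not reported
    for i in range(0, len(cds) - 3, 3):
        if cds[i:i + 3] == "TGA":
            offsets.append(i)
    return offsets


# Hypothetical toy CDS: ATG AAA TGA CCC TAA
# The out-of-frame "TGA" at offset 1 (inside ATGA) is ignored.
toy_cds = "ATGAAATGACCCTAA"
print(internal_inframe_tga(toy_cds))
```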

Strain BL-DC-9T contains a putative IS256 element immediately downstream of the selCDAB operon (Dehly_1503, transposase). Previous phylogenetic analysis has provided evidence for horizontal transfer of these traits [37]. The presence of an IS element adjacent to the sel genes in strain BL-DC-9T suggests horizontal transfer and may explain the absence of this locus in “Dehalococcoides” strains sequenced previously [38-40].

Comparative genomics

The chromosome of strain BL-DC-9T is 216,790 bp larger than that of “Dehalococcoides ethenogenes” strain 195, which has the largest genome among “Dehalococcoides” strains sequenced to date. The difference in the size of the chromosomes of strain BL-DC-9T and “Dehalococcoides” strains is partly due to the presence of multiple IS elements and IS element-associated genes in the former. These putative horizontally transferred genes appear to have played a major role in creating genomic diversity and phenotypic variability of strain BL-DC-9T vis-à-vis “Dehalococcoides” strains.

Although the chromosomes of strain BL-DC-9T and “Dehalococcoides” strains contain a similar number of rRNA and tRNA encoding genes, they differ in their GC content, gene density, and percentage of sequence that encodes proteins. The lack of synteny between the chromosomes as well as the observed differences in the general features of the genomes of strain BL-DC-9T and “Dehalococcoides” strains suggest highly divergent evolutionary paths. This divergent evolutionary past is further supported by a phylogenetic tree constructed based on 432 core orthologous protein encoding genes shared between D. lykanthroporepellens BL-DC-9T and “Dehalococcoides” (meta)genomes [41]. In contrast, the chromosomes of four “Dehalococcoides” strains obtained from diverse geographic regions share a conserved core (1,029 orthologous groups of protein encoding genes conserved across all four genomes, with genes generally sharing the same order, orientation and synteny) that is interrupted by two high plasticity regions, indicating comparatively recent divergence from a common ancestor [40].

BLAST comparisons of protein sets of strain BL-DC-9T and “Dehalococcoides ethenogenes” strain 195 revealed that the two strains contain ~950 protein coding genes in common (bidirectional best hits, 20-90% identity at the predicted protein level). Pairwise BLAST comparisons indicated that strain BL-DC-9T contains ~700 protein-coding genes with no homologs in strain 195. Strain 195, in turn, contained ~600 protein-coding genes with no homologs in BL-DC-9T. Genome-specific genes identified in strains BL-DC-9T and 195 encoded transposases, DNA endonucleases/methylases, heterodisulfide reductases, acetyltransferases, kinases, phosphatases, and dehalogenases. Some of these strain-specific genes were found within prophage-like regions or were associated with IS elements.
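The bidirectional best hit criterion mentioned above pairs two proteins only when each is the other's top-scoring match. The sketch below illustrates that logic on pre-computed scores; it is not the pipeline used in this study, the protein identifiers and scores are hypothetical, and ties or score thresholds are not handled.

```python
from typing import Dict, Set, Tuple

Scores = Dict[Tuple[str, str], float]  # (query, subject) -> alignment score


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


# Hypothetical scores between proteins of genome A and genome B
a_vs_b = {("A1", "B1"): 200.0, ("A1", "B2"): 50.0,
          ("A2", "B2"): 120.0, ("A3", "B2"): 150.0}
b_vs_a = {("B1", "A1"): 198.0, ("B2", "A3"): 149.0, ("B2", "A2"): 118.0}

print(sorted(bidirectional_best_hits(a_vs_b, b_vs_a)))
```

In this toy case A2 is left unpaired because B2's best hit is A3, which is how genes can have a homolog in the other genome without being counted among the bidirectional best hits.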

Biosynthesis and transport of compatible solutes

A number of microorganisms accumulate low molecular weight organic compounds alternately referred to as osmolytes, osmoprotectants, or “compatible solutes” to convey the notion that the compounds help the microorganisms survive osmotic stress but do not interfere with metabolism [42]. Ectoine, glycine-betaine, and proline are compatible solutes of many mesophilic bacteria capable of survival at high salt concentrations [42]. Many thermophilic organisms accumulate compatible solutes, such as mannosylglycerate and di-myo-inositol phosphate, which generally do not occur in mesophilic organisms [43].

Strain BL-DC-9T contains an operon (ectABC) encoding putative homologs of the enzymes involved in ectoine biosynthesis and regulation (Dehly_1306, Dehly_1307, Dehly_1308). The closest homologs of strain BL-DC-9T ectABC are found in Halomonas elongata, Wolinella succinogenes, and Desulfococcus oleovorans (48-75% identity at the predicted protein level). At least two putative transport systems for the compatible solutes proline/glycine-betaine have been identified in strain BL-DC-9T (proVWX and opuABCD). proV, proW, and proX encode an ATPase subunit (Dehly_0378), a permease protein (Dehly_0377), and a periplasmic subunit (Dehly_0376), respectively. opuA, opuB, opuC, and opuD encode a periplasmic substrate-binding protein (Dehly_0909), a permease protein (Dehly_0908), an ATPase subunit (Dehly_0907), and a permease protein (Dehly_0906), respectively. Although the permeases encoded by opuB, opuD, and proW as well as the ATPase subunits encoded by opuC and proV appear to be related to each other (34-40% identity at the predicted protein level), the periplasmic proteins encoded by opuA and proX are unrelated. The closest homologs of proVWX are found in Trichodesmium erythraeum, Marinomonas sp. MED121, and Fulvimarina pelagi (50% identity at the predicted protein level), whereas those of opuABCD are found in Pseudovibrio sp. JE062, Chromohalobacter salexigens DSM 3043, and Denitrovibrio acetiphilus DSM 12809 (44-60% identity at the predicted protein level). Strain BL-DC-9T also contains genes involved in the biosynthesis of proline (Dehly_0299, Dehly_0308). “Dehalococcoides” strains lack homologs of ectABC, proVWX, and opuABCD, but contain homologs of Dehly_0299 and Dehly_0308 (57 and 68% protein identity, respectively).

Homologs of a gene encoding a bifunctional mannosylglycerate synthase (mgsD) are found in “Dehalococcoides” strains (e.g., DET1363), an unusual occurrence for mesophilic bacteria [43]. Although the synthesis and accumulation of mannosylglycerate could not be proven to occur in “D. ethenogenes” because of insufficient biomass, the role of the bifunctional mgsD was confirmed by cloning and expression in Saccharomyces cerevisiae [43]. Comparative analysis revealed that BL-DC-9T contains a homologous gene (Dehly_0877, 54% protein identity). This expands the range of species containing genes putatively involved in the biosynthesis of compatible solutes and may offer D. lykanthroporepellens a stress response mechanism that allows growth under conditions of changing osmolarity.

Reductive dehalogenases

Genes encoding the enzymes that are involved in catalyzing the reductive dehalogenation of chlorinated solvents are organized in rdhAB operons encoding two components: a 50-65 kDa protein (RdhA) that functions as a reductive dehalogenase and a ~10 kDa hydrophobic protein with transmembrane helices (RdhB) that is thought to anchor the RdhA to the cytoplasmic membrane [44-53]. Comparative genomic analyses revealed that strain BL-DC-9T contains several loci related to rdhA and/or rdhB genes scattered throughout the chromosome. The multiple rdhA and rdhB ORFs of strain BL-DC-9T have 24-74% and 24-65% identities at the predicted protein level, respectively. The closest homologs of rdhA ORFs are found among “Dehalococcoides” strains (31-78% identity at the predicted protein level). A twin-arginine motif, with the predicted amino acid sequence SRRXFMK followed by a stretch of hydrophobic amino acids, was identified in the N-terminus of a large majority (19 of 25) of predicted RdhA sequences. Consistent with the presence of the twin-arginine sequence in the N-terminus of most of its RdhA sequences, strain BL-DC-9T contains an operon encoding proteins that constitute a putative twin-arginine translocation (TAT) system (Dehly_1346-1349). This specialized system is involved in the secretion of folded proteins across the bacterial inner membrane into the periplasmic space [54,55]. “Dehalococcoides” strains also contain an operon encoding an analogous TAT system that is partially related to the TAT system of strain BL-DC-9T (38 and 64% protein identity).

Two conserved motifs, each containing four cysteine residues, a feature associated with binding iron-sulfur clusters [56], were identified near the C-terminus of 22 of the predicted RdhA sequences of strain BL-DC-9T. The first of these motifs had a consistent number of amino acids between the cysteine residues (CX2CX2CX3C). In the second motif, there were variable numbers of intervening amino acids after the first and second cysteine residues (CX2-21CX2-7CX3C). If a “full-length” rdhA is predicted to encode a protein containing a twin-arginine sequence in the N-terminus, two iron-sulfur cluster binding motifs in the C-terminus, and an intervening sequence of ~450-500 aa, then strain BL-DC-9T contains 17 such genes. Two rdhA genes (Dehly_0069 and 1582) appear to be truncated and are predicted to encode proteins lacking the twin-arginine sequences in their N-termini. Five rdhA genes appear to be substantially truncated and are predicted to encode proteins consisting of only the N-terminus (Dehly_0479 and 1520) or the C-terminus (Dehly_0075, 1523, and 1534).
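The motif notation used here (SRRXFMK, CX2CX2CX3C, CX2-21CX2-7CX3C) maps naturally onto regular expressions, where X is any residue and the subscript is a repeat count. The following sketch shows one way to screen a protein sequence for the three features; it is an illustration rather than the procedure used in this study. The function name and toy sequences are hypothetical, and the choice to search for the second motif only downstream of the first is an assumption made so that a single cysteine cluster is not counted twice.

```python
import re
from typing import Dict

TWIN_ARGININE = re.compile(r"SRR.FMK")               # SRRXFMK
FES_MOTIF_1 = re.compile(r"C.{2}C.{2}C.{3}C")        # CX2CX2CX3C
FES_MOTIF_2 = re.compile(r"C.{2,21}C.{2,7}C.{3}C")   # CX2-21CX2-7CX3C


def rdha_features(protein: str) -> Dict[str, bool]:
    """Report which of the three sequence features are present."""
    protein = protein.upper()
    first = FES_MOTIF_1.search(protein)
    # Assumption: the second motif lies after the end of the first one
    second = FES_MOTIF_2.search(protein, first.end()) if first else None
    return {
        "twin_arginine": TWIN_ARGININE.search(protein) is not None,
        "fes_motif_1": first is not None,
        "fes_motif_2": second is not None,
    }


# Hypothetical toy sequences, not real RdhA proteins
full_length_like = ("MSRRDFMKAAAA" + "L" * 10 + "CAACGGCTTTC"
                    + "LLLL" + "CAAAAACGGGGCPPPC" + "KK")
n_terminus_only = "MSRRDFMKAAAA"

print(rdha_features(full_length_like))
print(rdha_features(n_terminus_only))
```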

Within strain BL-DC-9T, only six of the rdhA ORFs have a cognate rdhB, and an additional rdhB gene (Dehly_1504) appears to be an orphan with no cognate rdhA ORF nearby. The predicted RdhB sequences of strain BL-DC-9T contain two or three transmembrane helices. Similar features have been observed among the predicted RdhB sequences of “Dehalococcoides” strains [38-40]. Furthermore, the predicted RdhB sequences of strain BL-DC-9T and the “Dehalococcoides” strains share 40-72% identity. In at least seven loci (Dehly_0069, 0075, 0479, 1504, 1530, 1534, and 1541), it appears that transposon insertion has truncated one or both of the rdh genes. Interestingly, genes involved in the regulation of rdhAB operons (e.g., MarR-type or two-component transcriptional regulators) were present only in seven loci (Dehly_0121, 0274, 0479, 1148, 1355, 1530, and 1582). In addition to the genes encoding reductive dehalogenases, the strain BL-DC-9T genome contains two genes encoding putative haloacid dehalogenases (Dehly_0588, Dehly_1126) that have homologs among the “Dehalococcoides” strains (40-44% identity at the predicted protein level).

The presence of IS elements adjacent to some rdhA/rdhB loci in strain BL-DC-9T indicates that these loci were acquired from an unknown host. Previous studies of “Dehalococcoides” strains have also suggested horizontal transfer of reductive dehalogenase genes [41,57]. It remains to be determined whether the strain BL-DC-9T rdhA genes lacking a downstream rdhB ORF encode functional reductive dehalogenases and whether/how they are membrane-bound. It is possible that a non-cognate or non-contiguous rdhB (e.g., the orphan Dehly_1504) could complement one or more of the strain BL-DC-9T rdhA genes lacking a downstream rdhB ORF. Alternatively, some of these genes may encode reductive dehalogenases that function by an unknown mechanism. An enzyme involved in the reductive dehalogenation of tetrachloroethene by Dehalospirillum multivorans was found in the cytoplasmic fraction [58], suggesting that some reductive dehalogenases are either loosely membrane-bound or soluble entities. The same may be the case for the majority of reductive dehalogenases of strain BL-DC-9T. Regardless, the repertoire of rdhA/rdhB loci identified by complete genome sequencing sets the stage for future efforts to elucidate the mechanism of reductive dehalogenation by strain BL-DC-9T and other Dehalogenimonas strains.

Declarations

Acknowledgements

The work conducted by the U.S. Department of Energy Joint Genome Institute was supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. The authors gratefully acknowledge Xiao Ying for assistance with microscopy.

References

  1. Moe WM, Yan J, Nobre MF, da Costa MS and Rainey FA. Dehalogenimonas lykanthroporepellens gen. nov., sp. nov., a reductive dehalogenating bacterium isolated from chlorinated solvent contaminated groundwater. Int J Syst Evol Microbiol. 2009; 59:2692-2697
  2. Bowman KS, Moe WM, Rash BA, Bae HS and Rainey FA. Bacterial diversity of an acidic Louisiana groundwater contaminated by DNAPL containing chloroethanes and other solvents. FEMS Microbiol Ecol. 2006; 58:120-133
  3. Yan J, Rash BA, Rainey FA and Moe WM. Isolation of novel bacteria within the Chloroflexi capable of reductive dechlorination of 1,2,3-trichloropropane. Environ Microbiol. 2009; 11:833-843
  4. Yan J, Rash BA, Rainey FA and Moe WM. Detection and quantification of Dehalogenimonas and “Dehalococcoides” populations via PCR-based protocols targeting 16S rRNA genes. Appl Environ Microbiol. 2009; 75:7560-7564
  5. Grégoire P, Bohli M, Cayol JL, Joseph M, Guasco S, Dubourg K, Cambar J, Michotey V, Bonin P, Fardeau ML and Ollivier B. Caldilinea tarbellica sp. nov., a filamentous, thermophilic, anaerobic bacterium isolated from a deep hot aquifer in the Aquitaine Basin. Int J Syst Evol Microbiol. 2011; 61:1436-1441
  6. Sekiguchi Y, Yamada T, Hanada S, Ohashi A, Harada H and Kamagata Y. Anaerolinea thermophila gen. nov., sp. nov. and Caldilinea aerophila gen. nov., sp. nov., novel filamentous thermophiles that represent a previously uncultured lineage of the domain Bacteria at the subphylum level. Int J Syst Evol Microbiol. 2003; 53:1843-1851
  7. Chun J, Lee JH, Jung Y, Kim M, Kim S, Kim BK and Lim YW. EzTaxon: a web-based tool for the identification of prokaryotes based on 16S ribosomal RNA gene sequences. Int J Syst Evol Microbiol. 2007; 57:2259-2261
  8. Tamura K, Dudley J, Nei M and Kumar S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 2007; 24:1596-1599
  9. Liolios K, Chen IM, Mavromatis K, Tavernarakis N, Hugenholtz P, Markowitz VM and Kyrpides NC. The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res. 2010; 38:D346-D354
  10. Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Tompson N, Allen MJ and Angiuoli SV. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008; 26:541-547
  11. Woese CR, Kandler O and Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA. 1990; 87:4576-4579
  12. Garrity GM, Holt JG. Phylum BVI. Chloroflexi phy. nov. In: Garrity GM, Boone DR, Castenholz RW (eds), Bergey's Manual of Systematic Bacteriology, Second Edition, Volume 1, Springer, New York, 2001, p. 427-446.
  13. Hugenholtz P and Stackebrandt E. Reclassification of Sphaerobacter thermophilus from the subclass Sphaerobacteridae in the phylum Actinobacteria to the class Thermomicrobia (emended description) in the phylum Chloroflexi (emended description). Int J Syst Evol Microbiol. 2004; 54:2049-2051
  14. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS and Eppig JT. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000; 25:25-29
  15. . Web Site
  16. Bennett S. Solexa Ltd. Pharmacogenomics. 2004; 5:433-438
  17. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ and Chen Z. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005; 437:376-380
  18. . Web Site
  19. Zerbino DR and Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008; 18:821-829
  20. The Phred/Phrap/Consed software package. Web Site
  21. Ewing B, Hillier L, Wendl MC and Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 1998; 8:175-185
  22. Ewing B and Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 1998; 8:186-194
  23. Gordon D, Abajian C and Green P. Consed: a graphical tool for sequence finishing. Genome Res. 1998; 8:195-202
  24. Han CS, Chain P. Finishing repeat regions automatically with Dupfinisher. in Proceedings of the 2006 international conference on bioinformatics & computational biology. Arabnia HR, Valafar H (eds), CSREA Press. June 26-29, 2006: 141-146.
  25. Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW and Hauser LJ. Prodigal: Prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010; 11:119
  26. Pati A, Ivanova N, Mikhailova N, Ovchinikova G, Hooper SD, Lykidis A and Kyrpides NC. GenePRIMP: A Gene Prediction Improvement Pipeline for microbial genomes. Nat Methods. 2010; 7:455-457
  27. Lowe TM and Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997; 25:955-964
  28. Lagesen K, Hallin PF, Rødland E, Stærfeldt HH, Rognes T and Ussery DW. RNammer: consistent annotation of rRNA genes in genomic sequences. Nucleic Acids Res. 2007; 35:3100-3108
  29. Griffiths-Jones S, Bateman A, Marshall M, Khanna A and Eddy SR. Rfam: an RNA family database. Nucleic Acids Res. 2003; 31:439-441
  30. Krogh A, Larsson B, von Heijne G and Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J Mol Biol. 2001; 305:567-580
  31. Bendtsen JD, Nielsen H, von Heijne G and Brunak S. Improved prediction of signal peptides: SignalP 3.0. J Mol Biol. 2004; 340:783-795
  32. Markowitz VM, Mavromatis K, Ivanova NN, Chen IM, Chu K and Kyrpides NC. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics. 2009; 25:2271-2278
  33. Böck A. Selenium proteins containing selenocysteine. In: Crabtree RH (ed.), Encyclopedia of inorganic chemistry 2006. John Wiley & Sons, Ltd, New York.
  34. Hendrickson TL. Easing selenocysteine into proteins. Nat Struct Mol Biol. 2007; 14:100-101
  35. Donovan J and Copeland PR. Evolutionary history of selenocysteine incorporation from the perspective of SECIS binding proteins. BMC Evol Biol. 2009; 9:229
  36. Zavacki AM, Mansell JB, Chung M, Klimovitsky B, Harney JW and Berry MJ. Coupled tRNA(Sec)-dependent assembly of the selenocysteine decoding apparatus. Mol Cell. 2003; 11:773-781
  37. Romero H, Zhang Y, Gladyshev VN and Salinas G. Evolution of selenium utilization traits. Genome Biol. 2005; 6:R66
  38. Kube M, Beck A, Zinder SH, Kuhl H, Reinhardt R and Adrian L. Genome sequence of the chlorinated compound-respiring bacterium Dehalococcoides species strain CBDB1. Nat Biotechnol. 2005; 23:1269-1273
  39. Seshadri R, Adrian L, Fouts DE, Eisen JA, Phillippy AM, Methe BA, Ward NL, Nelson WC, Deboy RT and Khouri HM. Genome sequence of the PCE-dechlorinating bacterium Dehalococcoides ethenogenes. Science. 2005; 307:105-108
  40. McMurdie PJ, Behrens SF, Muller JA, Göke J, Ritalahti KM, Wagner R, Goltsman E, Lapidus A, Holmes S, Löffler FE and Spormann AM. Localized plasticity in the streamlined genomes of vinyl chloride respiring Dehalococcoides. PLoS Genet. 2009; 5:e1000714
  41. McMurdie PJ, Hug LA, Edwards EA, Holmes S and Spormann AM. Site-specific mobilization of vinyl chloride respiration islands by a mechanism common in Dehalococcoides. BMC Genomics. 2011; 12:287
  42. Santos H and da Costa MS. Compatible solutes in organisms that live in hot saline environments. Environ Microbiol. 2002; 4:501-509
  43. Empadinhas N, Albuquerque L, Costa J, Zinder SH, Santos MA, Santos H and da Costa MS. A gene from the mesophilic bacterium Dehalococcoides ethenogenes encodes a novel mannosylglycerate synthase. J Bacteriol. 2004; 186:4075-4084
  44. Adrian L, Rahnenfuhrer J, Gobom J and Holscher T. Identification of a chlorobenzene reductive dehalogenase in Dehalococcoides sp. strain CBDB1. Appl Environ Microbiol. 2007; 73:7717-7724
  45. Fung JM, Morris RM, Adrian L and Zinder SH. Expression of reductive dehalogenase genes in Dehalococcoides ethenogenes strain 195 growing on tetrachloroethene, trichloroethene, or 2,3-dichlorophenol. Appl Environ Microbiol. 2007; 73:4439-4445
  46. Hölscher T, Gorisch H and Adrian L. Reductive dehalogenation of chlorobenzene congeners in cell extracts of Dehalococcoides sp. strain CBDB1. Appl Environ Microbiol. 2003; 69:2999-3001
  47. Magnuson JK, Romine MF, Burris DR and Kingsley MT. Trichloroethene reductive dehalogenase from Dehalococcoides ethenogenes: sequence of tceA and substrate range characterization. Appl Environ Microbiol. 2000; 66:5141-5147
  48. Müller JA, Rosner BM, von Abendroth G, Meshulam-Simon G, McCarty PL and Spormann AM. Molecular identification of the catabolic vinyl chloride reductase from Dehalococcoides sp. strain VS and its environmental distribution. Appl Environ Microbiol. 2004; 70:4880-4888
  49. Maillard J, Schumacher W, Vazquez F, Regeard C, Hagen WR and Holliger C. Characterization of the corrinoid iron-sulfur protein tetrachloroethene reductive dehalogenase of Dehalobacter restrictus. Appl Environ Microbiol. 2003; 69:4628-4638
  50. Neumann A, Wohlfarth G and Diekert G. Tetrachloroethene dehalogenase from Dehalospirillum multivorans: cloning, sequencing of the encoding genes, and expression of the pceA gene in Escherichia coli. J Bacteriol. 1998; 180:4140-4145
  51. Nijenhuis I and Zinder SH. Characterization of hydrogenase and reductive dehalogenase activities of Dehalococcoides ethenogenes strain 195. Appl Environ Microbiol. 2005; 71:1664-1667
  52. Suyama A, Yamashita M, Yoshino S and Furukawa K. Molecular characterization of the PceA reductive dehalogenase of Desulfitobacterium sp. strain Y51. J Bacteriol. 2002; 184:3419-3425
  53. Thibodeau J, Gauthier A, Duguay M, Villemur R, Lepine F, Juteau P and Beaudet R. Purification, cloning, and sequencing of a 3,5-dichlorophenol reductive dehalogenase from Desulfitobacterium frappieri PCP-1. Appl Environ Microbiol. 2004; 70:4532-4537
  54. Stanley NR, Palmer T and Berks BC. The twin arginine consensus motif of Tat signal peptides is involved in Sec-independent protein targeting in Escherichia coli. J Biol Chem. 2000; 275:11591-11596
  55. Sargent F, Berks BC and Palmer T. Pathfinders and trailblazers: a prokaryotic targeting system for transport of folded proteins. FEMS Microbiol Lett. 2006; 254:198-207
  56. Bruschi M and Guerlesquin F. Structure, function and evolution of bacterial ferredoxins. FEMS Microbiol Rev. 1988; 4:155-175
  57. Krajmalnik-Brown R, Sung Y, Ritalahti KM, Saunders FM and Löffler FE. Environmental distribution of the trichloroethene reductive dehalogenase gene (tceA) suggests lateral gene transfer among Dehalococcoides. FEMS Microbiol Ecol. 2007; 59:206-214
  58. Miller E, Wohlfarth G and Diekert G. Studies on tetrachloroethene respiration in Dehalospirillum multivorans. Arch Microbiol. 1996; 166:379-387