Complete genome sequence of Meiothermus silvanus type strain (VI-R2T)

Meiothermus silvanus (Tenreiro et al. 1995) Nobre et al. 1996 belongs to a thermophilic genus whose members share relatively low degrees of 16S rRNA gene sequence similarity. Meiothermus constitutes an evolutionary lineage separate from members of the genus Thermus, from which they can generally be distinguished by their slightly lower temperature optima. M. silvanus is of special interest as it causes colored biofilms in the paper making industry and may thus be of economic importance as a biofouler. This is the second completed genome sequence of a member of the genus Meiothermus and only the third genome sequence to be published from a member of the family Thermaceae. The 3,721,669 bp long genome with its 3,667 protein-coding and 55 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.


Introduction
Strain VI-R2T (= DSM 9946 = ATCC 700542 = BCRC 17112) was first described as 'Thermus silvanus' by Tenreiro et al. in 1995 [1]. One year later it was formally named and transferred from the genus Thermus into the then novel genus Meiothermus by Nobre et al. [2]. Currently, there are nine species within the genus Meiothermus [3,4]. The genus name derives from the Greek words 'meion' and 'thermos' meaning 'lesser' and 'hot' to indicate an organism in a less hot place [2,3]. The species name was given in honor of Manuel T. Silva, a Portuguese microbiologist and immunologist [1]. Strain VI-R2T was isolated from the hot spring (vent temperature, 56°C; pH 8.9) located at the end of a 450 m tunnel and from thermal water (temperature 33°C; pH 8.8) piped to a spa at Vizela in northern Portugal [1]. Members of the genus Meiothermus have been isolated from natural hot springs and artificial thermal environments [2,5] in Russia [6], Central France [7], Northern and Central Portugal [1,8], North-Eastern China [9], Northern Taiwan [10], Iceland [11] and the Azores [4].
Interestingly, the genus Meiothermus is heterogeneous with respect to pigmentation. The yellow pigmented species also form a distinct group on the basis of 16S rRNA gene sequence similarity, whereas the red/orange pigmented strains form two groups, one comprising M. silvanus and the other the remaining species [8,9]. As in all members of the class Deinococci, the lipid composition of the cell membrane of members of the genus Meiothermus contains unusual and characteristic structures.
M. silvanus is well known to form colored biofilms in the paper industry, which makes this species an economic threat [12,13]. M. silvanus uses threadlike organelles for adhesion and biofilm formation to grow on stainless steel [14]. However, coating of stainless steel with diamond-like carbon or certain fluoropolymers reduced or almost eliminated adhesion and biofilm growth of M. silvanus [14]. Other strategies to combat M. silvanus in the paper industry include electrochemical inactivation (oxidation) using different levels of chloride concentration [15]. Here, the inactivation was mainly due to the electrochemically generated chlorine/hypochlorite [15]. A patent based on different natural plant extracts inhibiting biofilm formation of thermophilic species in paper or board machines, amongst them M. silvanus, has recently been issued [16].
The 16S rRNA genes of the seven other type strains in the genus Meiothermus share between 88.5% (Meiothermus chliarophilus [1]) and 89.8% (Meiothermus cerbereus [11]) sequence identity with strain VI-R2T, whereas the other type strains from the family Thermaceae share 85.8 to 87.8% sequence identity [17]. In addition to being found on paper and board machines [12], uncultured clone 16S rRNA gene sequences very similar to that of M. silvanus VI-R2T (X84211) have also been detected in the gut of an invasive wood-boring beetle (98% identity, EU148672) [18] and in seawater adjacent to a Pocillopora meandrina coral colony at Palmyra Atoll (99% identity, EU249942). Environmental samples and metagenomic surveys do not surpass 84% sequence similarity to the 16S rRNA gene sequence of strain VI-R2T (status May 2010). Here we present a summary classification and a set of features for M. silvanus VI-R2T, together with the description of the complete genomic sequencing and annotation.

Figure 2.
Phylogenetic tree highlighting the position of M. silvanus VI-R2T relative to the type strains of the other species within the genus and to the other type strains within the family Thermaceae. The tree was inferred from 1,442 aligned characters [30,31] of the 16S rRNA gene sequence under the maximum likelihood criterion [32] and rooted in accordance with the current taxonomy [33]. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 900 bootstrap replicates [34] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [35] are shown in blue, published genomes in bold, i.e. Thermus thermophilus (AP008226) and the type species of the genus, M. ruber [36].
Based on mass spectral data it appears that there may be three distinct derivatives, differing in the fatty acid amide linked to the galactosamine [37]. These may be divided into one compound containing exclusively 2-hydroxylated fatty acids (mainly 2-OH iso-17:0) and a mixture of two compounds that cannot be fully resolved by thin layer chromatography, carrying either 3-hydroxylated fatty acids or unsubstituted fatty acids. The basic glycolipid structure dihexosyl-N-acyl-hexosaminyl-hexosyl-diacylglycerol is a feature common to all members of the genera Thermus and Meiothermus examined to date. There is currently no evidence that members of the family Thermaceae (as currently defined) produce significant amounts of polar lipids containing only two aliphatic side chains. The consequences for membrane structure of having polar lipids containing three aliphatic side chains have yet to be examined. Such peculiarities also indicate the value of membrane composition in helping to unravel evolution at a cellular level [36]. The major fatty acids of the total polar lipids are anteiso-C15:0 (22.4%), iso-C15:0 (16.8%) and iso-C18:0 (12.2%), followed by iso-C17:0-2OH (10.5%) and iso-C17:0 and anteiso-C17:0 (each 8.5%) [37]. The glycolipid GL-la is characterized by a large amount of the fatty acid iso-C17:0-2OH (19.2%), which is nearly completely absent from GL-lb and the phospholipid PL-2 [37]. Menaquinone 8 was the only respiratory lipoquinone detected in all strains [1]. The structure of the red pigment has not been characterized, in contrast to that of M. ruber [39].

[...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [29]. If the evidence code is IDA, then the property was directly observed by one of the authors or an expert mentioned in the acknowledgements.

Genome sequencing and annotation

Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position [40], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [41]. The genome project is deposited in the Genome OnLine Database [35] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2. DNA was isolated from 0.5-1 g of cell paste using the Qiagen Genomic 500 DNA Kit (Qiagen, Hilden, Germany) following the standard protocol as recommended by the manufacturer, with modification st/LALMP as described in Wu et al. [41].

Genome sequencing and assembly
The genome was sequenced using a combination of Sanger and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website. Pyrosequencing reads were assembled using the Newbler assembler version 1.1.02.15 (Roche). Large Newbler contigs were broken into 3,908 overlapping fragments of 1,000 bp and entered into the assembly as pseudo-reads. The sequences were assigned quality scores based on Newbler consensus q-scores, with modifications to account for overlap redundancy and to adjust inflated q-scores. A hybrid 454/Sanger assembly was made using the Arachne assembler. Possible misassemblies were corrected and gaps between contigs were closed by editing in Consed and by custom primer walks from sub-clones or PCR products. A total of 323 Sanger finishing reads were produced to close gaps, to resolve repetitive regions, and to raise the quality of the finished sequence. A total of 9,068,515 Illumina reads were used to improve the final consensus quality using an in-house developed tool (the Polisher) [43]. The error rate of the completed genome sequence is less than 1 in 100,000. Together, the combination of the Sanger and 454 sequencing platforms provided 26.9× coverage of the genome. The final assembly contains 42,181 Sanger reads and 335,557 pyrosequencing reads.
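
The shredding step described above, in which large contigs are cut into overlapping 1,000 bp pseudo-reads, can be illustrated with a minimal Python sketch. The function name `shred_contig` and the 100 bp overlap are assumptions made purely for illustration; the text states only the fragment length, and this is not the tooling actually used at JGI.

```python
def shred_contig(sequence, fragment_length=1000, overlap=100):
    """Split one contig into overlapping fragments ("pseudo-reads").

    fragment_length follows the 1,000 bp stated in the text;
    overlap is a hypothetical value chosen for this sketch.
    """
    if overlap < 0 or overlap >= fragment_length:
        raise ValueError("overlap must be >= 0 and smaller than fragment_length")

    # Contigs no longer than one fragment are returned unchanged.
    if len(sequence) <= fragment_length:
        return [sequence]

    step = fragment_length - overlap
    fragments = []
    start = 0
    # Emit full-length windows while another window still fits after this one.
    while start + fragment_length < len(sequence):
        fragments.append(sequence[start:start + fragment_length])
        start += step

    # Anchor the last fragment at the contig end so the tail is fully covered.
    fragments.append(sequence[-fragment_length:])
    return fragments


if __name__ == "__main__":
    toy_contig = "ACGT" * 625  # 2,500 bp toy sequence, not real data
    pseudo_reads = shred_contig(toy_contig)
    print(len(pseudo_reads), "pseudo-reads of", len(pseudo_reads[0]), "bp")
```

Anchoring the final window at the contig end keeps every pseudo-read at full length, at the cost of a larger overlap between the last two fragments.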

Genome annotation
Genes were identified using Prodigal [44] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [45]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [46].

Genome properties
The genome consists of a 3,249,394 bp long chromosome and two plasmids of 347,854 bp and 124,421 bp, respectively, with a total G+C content of 62.7% (Figure 3 and Table 3). Of the 3,722 genes predicted, 3,667 were protein-coding genes and 55 were RNA genes; 158 pseudogenes were also identified. The majority of the protein-coding genes (64.5%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
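
As a quick consistency check, the figures reported above can be recombined in a few lines of Python. This is only a sketch: the dictionary keys and the function name `summarize_genome` are illustrative labels rather than official replicon names, and the count of genes with a putative function is an approximation derived from the rounded 64.5% figure.

```python
# Replicon lengths in bp, as reported in the text.
# The keys are illustrative labels, not official replicon names.
REPLICON_LENGTHS_BP = {
    "chromosome": 3_249_394,
    "plasmid_1": 347_854,
    "plasmid_2": 124_421,
}

PROTEIN_CODING_GENES = 3_667
RNA_GENES = 55
FRACTION_WITH_FUNCTION = 0.645  # 64.5% of protein-coding genes


def summarize_genome():
    """Recompute genome totals from the per-replicon and per-class figures."""
    total_bp = sum(REPLICON_LENGTHS_BP.values())
    plasmid_bp = total_bp - REPLICON_LENGTHS_BP["chromosome"]
    return {
        "total_bp": total_bp,
        "total_genes": PROTEIN_CODING_GENES + RNA_GENES,
        "plasmid_fraction": plasmid_bp / total_bp,
        # Approximate, because the published percentage is itself rounded.
        "genes_with_function": round(PROTEIN_CODING_GENES * FRACTION_WITH_FUNCTION),
    }


if __name__ == "__main__":
    for name, value in summarize_genome().items():
        print(name, value)
```

The replicon lengths sum to the 3,721,669 bp genome size given in the abstract, and the two gene classes sum to the 3,722 predicted genes.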