Non-contiguous finished genome sequence of Anoxybacillus flavithermus subsp. yunnanensis type strain (E13T), a strictly thermophilic and organic solvent-tolerant bacterium

Anoxybacillus flavithermus subsp. yunnanensis is the only strictly thermophilic bacterium known to tolerate a broad range of toxic solvents at its optimal temperature of 55-60°C. The type strain E13T was isolated from water-sediment slurries collected from a hot spring. This study presents the draft genome sequence of A. flavithermus subsp. yunnanensis E13T and its annotation. The 2,838,393 bp genome (67 contigs) contains 3,035 protein-coding genes and 85 RNA genes, including 10 rRNA genes, and no plasmids. The genome was compared with the genomes of A. flavithermus subsp. flavithermus strains.


Introduction
Solvent-tolerant bacteria are a relatively new group of extremophilic microorganisms. They are able to overcome the toxic and destructive effects of organic solvents through unique adaptive mechanisms. Most of the reported solvent-tolerant bacteria are mesophiles with an optimal temperature between 25 and 37°C [1]. So far, Anoxybacillus flavithermus subsp. yunnanensis is the only strictly thermophilic bacterium known to tolerate a broad range of solvents at its optimal temperature of 55-60°C [2,3]. Its strains show unusual physiological features in the presence of solvents, such as a higher cell yield [2], an observable thickening of electron-transparent intracellular material and a distorted cytoplasm [3]. However, mechanisms of solvent tolerance in thermophilic species have not been proposed. The type strain E13T (= CCTCC AB2010187T = KCTC 13759T) and the additional strain PGDY12 were isolated in our lab from water-sediment slurries collected from a hot spring in Yunnan Province, China. They are most closely related to A. flavithermus subsp. flavithermus, which was first discovered in a hot spring in New Zealand [4]. At present, a total of 19 species and two subspecies of Anoxybacillus with validly published names have been reported [5]. None of these Anoxybacillus strains has been reported to tolerate solvents except A. flavithermus subsp. yunnanensis. To understand the molecular basis of solvent tolerance at high temperature, we sequenced and annotated a draft genome of the type strain E13T of A. flavithermus subsp. yunnanensis.

Classification and features
A. flavithermus subsp. yunnanensis E13T (Table 1) was isolated in 2008 by static cultivation in rich Luria-Bertani (LB) medium supplemented with 10% ethanol [2]. This strain is a facultatively aerobic, Gram-positive, motile, spore-forming rod that is capable of utilizing a wide range of carbon sources, such as arabinose, cellobiose, galactose, maltose, trehalose and xylose. Strain E13T not only grows in ethanol concentrations of up to 13% at 55°C, but also tolerates highly toxic solvents including toluene, benzene, xylene, chloroform and cyclohexane. Because A. flavithermus subsp. yunnanensis is the only strictly thermophilic bacterium that is able to tolerate toxic solvents, the effect of temperature on solvent tolerance has not yet been studied. Reports on ethanol, a much less toxic solvent, indicate that ethanol tolerance decreases with increasing temperature [20,21]. A comparison of the growth of strain E13T at different temperatures showed that a temperature increase of 20°C, from 45 to 65°C, lowered the critical inhibitory toluene concentration from 0.56 to 0.31%. A similar sharp decrease occurred with benzene, xylene, chloroform and cyclohexane. These results suggest that temperature plays a vitally important role in determining solvent tolerance in bacteria, which may explain why solvent-tolerant thermophilic bacteria are rare in nature.
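The size of the temperature effect on toluene tolerance can be made explicit with a short calculation. The sketch below uses only the two values reported above (0.56% at 45°C and 0.31% at 65°C). The function name is illustrative, and the per-degree figure is an average over the 20°C interval, not a claim that the relationship is linear.

```python
def relative_decrease(before: float, after: float) -> float:
    """Fractional drop from `before` to `after` (0.25 means a 25% drop)."""
    return (before - after) / before


# Critical inhibitory toluene concentrations (percent), as reported in the text.
toluene_at_45c = 0.56
toluene_at_65c = 0.31

# Relative loss of tolerance over the 20 degree interval.
drop = relative_decrease(toluene_at_45c, toluene_at_65c)

# Average change per degree Celsius; an assumption-free average only,
# since just two temperatures are reported.
per_degree = (toluene_at_45c - toluene_at_65c) / (65 - 45)

print(f"relative decrease: {drop:.1%}")
print(f"average decrease per degree: {per_degree:.4f} percentage points")
```

In other words, the critical toluene concentration falls by a little under half across this temperature range.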
Currently, more than 30 solvent-tolerant mesophilic bacteria have been reported, and 8 genomes are available in GenBank. The phylogenetic position of A. flavithermus subsp. yunnanensis E13T among these typical solvent-tolerant bacteria is shown in Figure 1. This strain is most closely related to Bacillus species. The genomes of B. cereus strain E33L and strain ATCC 10987 might provide valuable guidance in a genetic analysis of the solvent tolerance of A. flavithermus subsp. yunnanensis E13T.
not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [19]. If the evidence code is IDA, then the property was directly observed by one of the authors or an expert mentioned in the acknowledgements.

Genome sequencing information
Genome project history
The organism was selected for its unique characteristics as a solvent-tolerant thermophile, in order to investigate new mechanisms of solvent tolerance. The genome was sequenced at BGI-Shenzhen (Shenzhen, China) and deposited in GenBank under the accession number AVGH00000000. The version described in this paper is version AVGH01000000. To our knowledge, it was the first genome of A. flavithermus subsp. yunnanensis, the 8th genome of an Anoxybacillus species and the 9th genome of a solvent-tolerant bacterium to be sequenced. A summary of the project information associated with MIGS version 2.0 compliance [6] is shown in Table 2.

Growth conditions and DNA isolation
A. flavithermus subsp. yunnanensis strain E13T was grown in LB medium at 60°C for 8 h. The cells were harvested by centrifugation at 12,000 × g and washed twice with distilled water. Genomic DNA was extracted with a Genomic DNA Mini Preparation Kit (Beyotime, Shanghai, China) according to the protocol for Gram-positive bacteria. The quality and concentration of the genomic DNA were measured spectrophotometrically using a BioPhotometer Plus (Eppendorf, Germany).

Genome annotation
Genes were predicted by merging the results obtained from the RAST (Rapid Annotation using Subsystem Technology) server [23] and the Glimmer modeling software package [24]. The predicted coding sequences (CDSs) were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database and the KEGG, Clusters of Orthologous Groups (COG), Swiss-Prot and TrEMBL databases. RNAmmer [25] was used to find rRNA genes, and tRNAscan-SE [26] was used to find tRNA genes. Other non-coding RNAs were identified by searching the genome for Rfam profiles using INFERNAL (v0.81) [27]. Signal peptides and numbers of transmembrane helices were predicted using SignalP [28] and TMHMM [29], respectively.
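The text does not state how the RAST and Glimmer predictions were merged. As a rough illustration only, the sketch below shows one common heuristic: two predictions on the same contig and strand that share a 3' end (stop position) are treated as the same gene, and the prediction from the first list is kept. The `CDS` record, the `stop_key` and `merge_predictions` helpers, and the coordinates are all hypothetical and are not taken from the authors' pipeline.

```python
from typing import List, NamedTuple, Tuple


class CDS(NamedTuple):
    """A predicted coding sequence (hypothetical record, 1-based inclusive)."""
    contig: str
    start: int
    end: int
    strand: str  # "+" or "-"
    source: str  # name of the tool that made the prediction


def stop_key(cds: CDS) -> Tuple[str, str, int]:
    """Identify a gene by contig, strand and the coordinate of its 3' end."""
    three_prime = cds.end if cds.strand == "+" else cds.start
    return (cds.contig, cds.strand, three_prime)


def merge_predictions(primary: List[CDS], secondary: List[CDS]) -> List[CDS]:
    """Union of two prediction sets; on a shared 3' end, `primary` wins."""
    merged = {stop_key(c): c for c in primary}
    for c in secondary:
        # Only add a secondary prediction if no primary one shares its 3' end.
        merged.setdefault(stop_key(c), c)
    return sorted(merged.values(), key=lambda c: (c.contig, c.start, c.end))


# Made-up coordinates, for illustration only.
rast = [
    CDS("contig1", 100, 400, "+", "RAST"),
    CDS("contig1", 500, 900, "-", "RAST"),
]
glimmer = [
    CDS("contig1", 130, 400, "+", "Glimmer"),    # same stop as the first RAST call
    CDS("contig1", 1000, 1300, "+", "Glimmer"),  # found by Glimmer only
]

for cds in merge_predictions(rast, glimmer):
    print(cds)
```

Keying on the 3' end reflects the fact that gene finders tend to agree on stop codons more often than on start codons; other merge rules, such as a minimum overlap fraction, would be equally plausible here.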

Genome properties
The genome is 2,838,393 bp long (1 chromosome, no plasmids) with a 41.4% G+C content (Figure 2 and Table 3). Of the 3,120 predicted genes, 3,035 were protein-coding genes and 85 were RNA genes. The RNA genes comprise ten rRNA genes (two 16S rRNA, one 23S rRNA and seven 5S rRNA) and 75 predicted tRNA genes. A total of 2,267 genes (72.66%) were assigned a putative function. The remaining genes were annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Table 3. The distribution of genes into COGs and KEGG functional categories is presented in Table 4.
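The figures in this paragraph can be cross-checked, and the basic assembly statistics recomputed, with a few lines of code. The sketch below first verifies that the reported gene counts add up, then shows how contig number, total length and G+C content could be derived from a multi-FASTA file of contigs. The function names and the toy FASTA record are hypothetical; applying `assembly_stats` to the deposited contigs is an assumption about usage and has not been run here.

```python
# Gene counts as reported in the text.
REPORTED = {
    "total_genes": 3120,
    "protein_coding": 3035,
    "rna_genes": 85,
    "rrna_genes": 10,
    "trna_genes": 75,
    "with_function": 2267,
}


def percent_with_function(counts):
    """Check that the reported counts sum correctly, then return the
    percentage of genes with a putative function."""
    assert counts["protein_coding"] + counts["rna_genes"] == counts["total_genes"]
    assert counts["rrna_genes"] + counts["trna_genes"] == counts["rna_genes"]
    return round(100.0 * counts["with_function"] / counts["total_genes"], 2)


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_stats(text):
    """Contig count, total length and G+C percentage (over A/C/G/T only)."""
    seqs = [seq for _, seq in parse_fasta(text)]
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    acgt = sum(s.count(base) for s in seqs for base in "ACGT")
    return {
        "contigs": len(seqs),
        "total_bp": sum(len(s) for s in seqs),
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
    }


# Toy two-contig FASTA, made up for illustration.
toy_fasta = ">contig1\nATGCGC\nAT\n>contig2\nGGCCAATT\n"

print(percent_with_function(REPORTED))
print(assembly_stats(toy_fasta))
```

Ambiguous bases such as N are excluded from the G+C denominator in this sketch; whether the reported 41.4% was computed the same way is not stated in the text.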