Draft genome sequence of Dyadobacter tibetensis type strain (Y620-1) isolated from glacial ice

Dyadobacter tibetensis Y620-1 is the type strain of the species Dyadobacter tibetensis, isolated from ice at a depth of 59 m in a high-altitude glacier in China (5670 m above sea level). It is psychrotolerant, with a growth temperature range of 4 to 35°C. Here we describe the features of this organism, together with the draft genome sequence and annotation. The 5,313,963 bp genome contains 4,828 protein-coding genes and 39 RNA genes. To the best of our knowledge, this is the first Dyadobacter strain isolated from glacial ice. This study provides genetic information on this organism that can be used to identify genes linked to its specific mechanisms of adaptation to the extreme glacial environment.


Introduction
Strain Y620-1 (= JCM 18589 = CGMCC 1.12215T) is the type strain of the species Dyadobacter tibetensis [1]. The genus Dyadobacter was first proposed by Chelius and Triplett in 2000, with D. fermentans as the type species [2], and currently comprises 12 species. These species were isolated from diverse environments, including glacial ice, soil from the Arctic, the Colorado Plateau, a farm and a ginseng field, desert sand, freshwater and seawater, and plant material [1][2][3][4][5][6][7][8][9][10][11][12]. So far, however, genome sequences have been determined for only three Dyadobacter strains (D. alkalitolerans DSM 23607 (GCA_000428845), D. fermentans DSM 18053 (GCA_000023125) and D. beijingensis DSM 21582 (GCA_000382205)), and only the complete genome sequence of D. fermentans DSM 18053 has been published [13]. D. tibetensis strain Y620-1 was isolated from the 59 m depth section of a 122 m ice core drilled from Yuzhufeng Glacier at 5670 m above sea level on the Tibetan Plateau, China [1]. Glacier ice is an extreme environment characterized by low temperature, low nutrient availability and high UV radiation, and it is a huge reservoir of extremophilic microorganisms that have accumulated over hundreds of years [14]. Diverse isolates have been recovered from glacial ice, but few genomes of bacteria from this extreme environment are available [15,16]. Here, we present the genome sequence of the psychrotolerant D. tibetensis strain Y620-1 isolated from an ice core. This is the first genome sequence of a bacterium isolated from deep, high-altitude glacier ice.

Genome sequencing information

Genome project history
The organism was selected for sequencing because it was isolated from a deep ice core of a high-altitude glacier, an extreme environment. The shotgun genome sequencing project was completed in December 2012, and the sequence has been deposited at DDBJ/EMBL/GenBank under the accession number AZQN00000000. The version described here is the first version, AZQN01000000. Genome sequencing was carried out at Shanghai Majorbio Bio-pharm Technology Co., Ltd (Shanghai, China). A summary of the project information is shown in Table 2.

Genome sequencing and assembly
The genome of strain Y620-1 was sequenced using an Illumina GAIIx instrument with two paired-end libraries (170 bp and 800 bp insert sizes). The raw sequencing data were filtered to discard reads containing adaptor sequences, reads with a high rate of ambiguous bases, and low-quality reads. A total of 2,041 Mb of high-quality Illumina data was obtained, providing approximately 384-fold coverage. The high-quality reads were assembled in silico using SOAPdenovo v1.05, resulting in 33 contigs (> 200 bp) with an N50 length of 797,100 bp.
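The coverage figure follows from the data volume and the genome size, and N50 is a standard assembly statistic. The following is a minimal sketch in Python, assuming that the 5,313,963 bp draft genome length reported in this paper is the denominator for coverage. The function names are illustrative, and the contig lengths in the N50 example are hypothetical, not those of the Y620-1 assembly.

```python
def fold_coverage(total_bases: float, genome_size: int) -> float:
    """Average sequencing depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def n50(contig_lengths: list[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Values reported in the text: 2,041 Mb of high-quality data, 5,313,963 bp genome.
coverage = fold_coverage(2_041e6, 5_313_963)
print(f"{coverage:.0f}-fold coverage")

# Hypothetical contig lengths, for illustration only.
example_n50 = n50([100, 80, 50, 30, 20])
print(example_n50)
```

With the reported values, the computed depth rounds to the 384-fold coverage stated above.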

Genome annotation
The coding sequences (CDSs) were predicted using Glimmer 3.02 [24], while tRNAscan-SE [25] and RNAmmer [26] were used to identify tRNA and rRNA genes, respectively. The genome sequence was also uploaded to the Rapid Annotation using Subsystem Technology (RAST) system [27] to check the annotated sequences. The functions of the predicted protein-coding genes were then annotated through comparisons with the NCBI-NR [28], COG [29] and KEGG [30] databases. The programs TMHMM [31] and SignalP [32] were used to identify putative transmembrane helices and signal peptides, respectively.

Genome properties
The Y620-1 draft genome sequence has a total length of 5,313,963 bp with an average GC content of 43.44%. There are 4,867 predicted genes, of which 4,828 are protein-coding genes and 39 are RNA genes. A total of 2,844 genes (58.91%) were assigned a putative function. The remaining genes were annotated as either hypothetical proteins or proteins of unknown function. Using COG functional assignment, 70.55% of the protein-coding genes could be classified into 20 COG categories. The properties and statistics of the genome are summarized in Tables 3 and 4. According to the subsystem-based annotation generated by RAST, ~32% of the protein-coding genes of strain Y620-1 could be assigned to 358 metabolic subsystems. The most abundant subsystems are related to carbohydrates (n=323, 7.2% of total protein-coding genes), followed by amino acids and derivatives (n=272, 6.0%), cofactors, vitamins, prosthetic groups, pigments (n=184, 4.1%), protein metabolism (n=143, 3.2%), membrane transport (n=132, 2.9%) and respiration (n=131, 2.9%).

In Tables 3 and 4, the totals are based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome, as appropriate.
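The gene counts and the share of genes with a putative function can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 58.91% figure is taken relative to the protein-coding genes rather than to all predicted genes; the variable names are illustrative.

```python
# Counts reported in the text.
protein_coding = 4_828
rna_genes = 39
with_function = 2_844

# Total predicted genes is the sum of protein-coding and RNA genes.
total_genes = protein_coding + rna_genes

# Assumed denominator: protein-coding genes only.
pct_with_function = 100 * with_function / protein_coding

print(total_genes)
print(f"{pct_with_function:.2f}%")
```

Under that assumption, both the total of 4,867 predicted genes and the reported percentage are reproduced.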

Discussion
Although 12 species are currently assigned to Dyadobacter, genome sequences are available for only three other strains: D. alkalitolerans DSM 23607, D. fermentans DSM 18053 and D. beijingensis DSM 21582. Strain Y620-1 has a smaller genome than each of these strains (6.29 Mbp, 6.97 Mbp and 7.37 Mbp, respectively). The GC content of strain Y620-1 is comparable to that of D. alkalitolerans DSM 23607 (45.66%), but lower than those of D. fermentans DSM 18053 (51.54%) and D. beijingensis DSM 21582 (52.09%). To estimate the similarity among the sequenced Dyadobacter strains, average nucleotide identity (ANI) and genome-to-genome distances were calculated using JSpecies v1.2 [33] and the Genome-to-Genome Distance Calculator (GGDC) v2.0 [34], respectively. Table 5 shows the results of the ANI and GGDC analyses. ANI analysis showed that strain Y620-1 shared a low degree of similarity with the other Dyadobacter species (< 69% ANIb and < 84% ANIm), whereas relatively higher ANI values were obtained among D. alkalitolerans DSM 23607, D. fermentans DSM 18053 and D. beijingensis DSM 21582 (Table 5). Although the core concept of GGDC is based on 'genome blast distance phylogeny', which differs from ANI [35], the GGDC analysis gave similar results. In both analyses, the highest similarity values were observed in the comparison of D. fermentans DSM 18053 with D. beijingensis DSM 21582. These results are in line with the phylogenetic analysis based on the 16S rRNA gene, which shows that D. fermentans DSM 18053 and D. beijingensis DSM 21582 form a cluster with Dyadobacter soli MJ20. Moreover, comparison of the distribution of COG categories in the genomes of the four Dyadobacter strains revealed significant correlations between the distribution in strain Y620-1 and those in the other strains (r = 0.970-0.979). However, relatively higher correlation coefficients were observed for D. alkalitolerans

Five cold-shock proteins were found in this genome, including CspA, GyrA, RbfA and NusA. The proteins encoded by the genes RecA, RecF, RecG, RecN, RecO, RecQ, RadA and RadC, which play a critical role in recombinational repair of damaged DNA, were also found [36]. The single-stranded-DNA-specific exonuclease RecJ, which is required for many types of recombination events [37], and the CRISPR-associated protein Cas1, which interacts with components of the DNA repair systems, were also found [38]. Phage shock protein C is encoded in the genome and may play a significant role in the competition for survival under nutrient- or energy-limited conditions [39]. When bacteria are deposited on the glacier, low temperature, high UV radiation and desiccation could induce the cold-shock proteins and the proteins involved in recombinational repair of damaged DNA. Additionally, the oligotrophic conditions of glacial ice may induce phage shock protein C. The genome sequence of strain Y620-1 provides genetic information for identifying the genes linked to its specific mechanisms of adaptation to the extreme glacial environment.
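The comparison of COG category distributions reported in the Discussion rests on a correlation coefficient between per-category gene counts of two genomes. The text does not state which coefficient was used; the sketch below assumes a Pearson correlation. The function name and the gene counts are hypothetical and serve only as an illustration, not as data from the four Dyadobacter genomes.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical gene counts per COG category for two genomes (illustration only).
strain_a = [210, 95, 180, 60, 320, 150]
strain_b = [250, 110, 200, 75, 400, 160]
r = pearson_r(strain_a, strain_b)
print(f"r = {r:.3f}")
```

A value of r close to 1 indicates that the two genomes allocate their genes across COG categories in similar proportions, which is how the range r = 0.970-0.979 reported above can be read.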