Complete genome sequence of Thalassolituus oleivorans R6-15, an obligate hydrocarbonoclastic marine bacterium from the Arctic Ocean

Strain R6-15 belongs to the genus Thalassolituus, in the family Oceanospirillaceae of the Gammaproteobacteria. Representatives of this genus are known to be obligate hydrocarbonoclastic marine bacteria. Thalassolituus oleivorans R6-15 is of special interest because of its dominance in crude oil-degrading consortia enriched from the surface seawater of the Arctic Ocean. Here we describe the complete genome sequence and annotation of this strain, together with its phenotypic characteristics. The 3,764,053 bp genome comprises one chromosome and no plasmids, and contains 3,372 protein-coding and 61 RNA genes, including 12 rRNA genes.


Introduction
Thalassolituus spp. belong to the family Oceanospirillaceae of the Gammaproteobacteria. The genus was first described by Yakimov et al. (2004) and currently comprises two species, T. oleivorans and T. marinus [1,2]. Bacteria of this genus are known as obligate hydrocarbonoclastic marine bacteria [3]. Previous reports showed that Thalassolituus-related species were among the most dominant members of petroleum hydrocarbon-enriched consortia at low temperature [4][5][6][7]. In addition to oil-enriched consortia, Thalassolituus spp. have been detected in a variety of cold environments [8][9][10]. Strain R6-15 was isolated from the surface seawater of the Arctic Ocean after enrichment with crude oil during the fourth Chinese National Arctic Research Expedition of the "Xulong" icebreaker in the summer of 2010. Its 16S rRNA gene sequence shared 99.86% and 96.39% similarity with those of T. oleivorans MIL-1T and T. marinus IMCC1826T, respectively. Pyrosequencing (16S rRNA gene V3 region) of fifteen oil-degrading consortia from across the Arctic Ocean showed that the dominant member in most of the consortia shared an identical sequence with this strain, comprising 8.4-99.6% of the total reads (unpublished data). Here, we describe the complete genome sequence and annotation of T. oleivorans strain R6-15, together with its phenotypic characteristics. In addition, we briefly compare strain R6-15 with the type strains of the two validly named species of this genus in both phenotypic and genomic terms.

Classification and features
T. oleivorans R6-15 is closely related to T. oleivorans MIL-1T (Figure 1, Table 1). The strain is aerobic, Gram-negative and motile by a single polar flagellum, with a characteristic curved rod-shaped cell morphology (Figure 2). Strain R6-15 can utilize only a restricted spectrum of carbon substrates for growth, including sodium acetate, Tween-40, Tween-80 and C12-C36 aliphatic hydrocarbons. It grows at temperatures from 4 to 32°C, with an optimum of 25°C.

The Genomic Standards Consortium
Figure 1. Phylogenetic tree highlighting the position of T. oleivorans strain R6-15 relative to other type and non-type strains with finished or non-contiguous finished genome sequences within the family Oceanospirillaceae. Accession numbers of 16S rRNA gene sequences are indicated in brackets. Sequences were aligned using DNAMAN version 6.0, and a neighbor-joining tree was obtained using the maximum-likelihood method within MEGA version 5.0 [11]. Numbers adjacent to the branches represent percentage bootstrap values based on 1,000 replicates.

Table 1 evidence codes: ... NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [20]. If the evidence code is IDA, then the property should have been directly observed, for the purpose of this specific publication, for a live isolate by one of the authors, or by an expert or reputable institution mentioned in the acknowledgements.
Compared with other Thalassolituus strains, strain R6-15 differed from type strain MIL-1T [1] in catalase, urease and acid phosphatase activities, and in the utilization of n-alkanes, pyruvic acid methyl ester, D-mannitol and D-sorbitol (Table 2). It also differed from type strain IMCC1826T [2] in growth temperature range, in catalase, nitrate reductase, urease and leucine arylamidase activities, and in the utilization of n-alkanes, pyruvic acid methyl ester, β-hydroxybutyric acid and D,L-lactic acid (Table 2).

Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position and its dominance in the crude oil-degrading consortia enriched from the surface seawater of the Arctic Ocean. The complete genome sequence was deposited in GenBank under accession number CP006829. Sequencing, finishing and annotation of the T. oleivorans R6-15 genome were performed by the Chinese National Human Genome Center (Shanghai). Table 3 presents the project information and its association with MIGS version 2.0 compliance [21].

Growth conditions and DNA isolation
Strain R6-15 was grown aerobically in ONR7a medium [22] with sodium acetate as the sole carbon and energy source. Genomic DNA was extracted from the cells, concentrated and purified using the AxyPrep bacterial genomic DNA miniprep kit (Axygen), as detailed in the kit manual.

Genome sequencing and assembly
The genome was sequenced using massively parallel pyrosequencing technology (454 GS FLX) [23]. A total of 140,550 reads, amounting to 78,223,504 bases, were obtained, covering the genome 21.1-fold. The Newbler V2.7 [24] software package was used for sequence assembly and quality assessment. Assembly yielded 64 contigs ranging from 500 bp to 304,980 bp, and the relationships between the contigs were determined by multiplex PCR [25]. Gaps were then filled by sequencing the PCR products on ABI 3730xl capillary sequencers. A total of 284 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. Finally, the sequences were assembled using the Phred, Phrap and Consed software packages [26], and low-quality regions of the genome were resequenced. The final sequence accuracy was approximately 99.999%.
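The sequencing statistics above can be related to one another with a few lines of arithmetic. The following is a minimal sketch, not the authors' procedure: it assumes that fold coverage is total sequenced bases divided by the final chromosome length, and that accuracy converts to a Phred-scaled quality as Q = -10·log10(1 - accuracy). The function names are hypothetical. Dividing by the finished genome size gives a value slightly below the reported 21.1-fold, so the reported figure was presumably computed against a different denominator (for example, the draft assembly length); that explanation is an assumption.

```python
import math

# Values reported in the text.
total_reads = 140_550
total_bases = 78_223_504
genome_size_bp = 3_764_053   # finished circular chromosome
accuracy = 0.99999           # "approximately 99.999%"


def fold_coverage(bases: int, genome_bp: int) -> float:
    """Average number of times each genome position is covered by reads."""
    return bases / genome_bp


def phred_quality(acc: float) -> float:
    """Phred-scaled quality for a given per-base accuracy."""
    return -10 * math.log10(1 - acc)


mean_read_length = total_bases / total_reads
coverage = fold_coverage(total_bases, genome_size_bp)
quality = phred_quality(accuracy)

print(f"mean read length: {mean_read_length:.1f} bp")
print(f"coverage vs. finished genome: {coverage:.1f}-fold")
print(f"Phred-equivalent quality: Q{quality:.0f}")
```

An accuracy of 99.999% corresponds to roughly one error per 100,000 bases.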

Genome annotation
Protein-coding genes, structural RNAs (5S, 16S, 23S), tRNAs and small non-coding RNAs were predicted using the online NCBI Prokaryotic Genome Annotation Pipeline (PGAP) server [27]. Functional annotation of the predicted ORFs was performed using RPS-BLAST [28] against the Clusters of Orthologous Groups (COG) database [29] and the Pfam database [30]. The TMHMM program was used to predict genes encoding transmembrane helices [31], and the SignalP program was used to predict genes encoding signal peptides [32].
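As an illustration of how RPS-BLAST results can be turned into one functional assignment per ORF, the sketch below keeps the highest-scoring hit for each query. It is not the pipeline used in this study. It assumes the search output is in the standard 12-column BLAST tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score); the E-value cutoff of 1e-5, the function name `best_hits`, and the ORF and domain identifiers in the example are all hypothetical.

```python
import csv
import io
from typing import Dict, Tuple


def best_hits(tabular_text: str,
              max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float, float]]:
    """Return, for each query ORF, the (subject, evalue, bitscore) of its top hit.

    Hits with an E-value above max_evalue (a hypothetical cutoff) are ignored.
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        # Keep the hit with the highest bit score for each query.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Made-up example rows; identifiers are placeholders, not real annotations.
example = "\n".join([
    "\t".join(["orf_0001", "COG_A", "45.0", "200", "110", "0",
               "1", "200", "1", "200", "1e-50", "180.0"]),
    "\t".join(["orf_0001", "COG_B", "30.0", "150", "105", "0",
               "10", "160", "5", "155", "1e-10", "60.0"]),
    "\t".join(["orf_0002", "COG_C", "25.0", "80", "60", "0",
               "1", "80", "1", "80", "0.5", "20.0"]),
])

hits = best_hits(example)
for orf, (domain, evalue, bitscore) in hits.items():
    print(orf, domain, evalue, bitscore)
```

In this example the second ORF receives no assignment because its only hit fails the E-value cutoff; such ORFs would typically be reported as hypothetical proteins.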

Genome properties
The properties and statistics of the genome are summarized in Table 4. The genome consists of one circular chromosome of 3,764,053 bp (46.6% GC content). In total, 3,489 genes were predicted, comprising 3,372 protein-coding genes, 61 RNA genes and 56 pseudogenes. The majority of the protein-coding genes (67.07%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 5.
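The gene counts above can be cross-checked with a short calculation. This is a minimal sketch using only the numbers given in the text. The derived quantities (each category's share of all predicted genes, and genes per kilobase) are illustrative and are not values reported by the authors, and the variable names are hypothetical.

```python
# Counts reported in the text.
genome_size_bp = 3_764_053
total_genes = 3_489
counts = {
    "protein-coding": 3_372,
    "RNA": 61,
    "pseudogene": 56,
}

# The three categories should add up to the total number of predicted genes.
categories_sum = sum(counts.values())
print("categories sum to total:", categories_sum == total_genes)

# Share of each category among all predicted genes, in percent.
shares = {name: 100 * n / total_genes for name, n in counts.items()}
for name, pct in shares.items():
    print(f"{name}: {counts[name]} genes ({pct:.2f}% of total)")

# Overall gene density, in genes per kilobase of chromosome.
genes_per_kb = total_genes / (genome_size_bp / 1000)
print(f"gene density: {genes_per_kb:.2f} genes per kb")
```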

Insights from the genome sequence
Until now, the only genome sequence available within the genus Thalassolituus was that of the type strain T. oleivorans MIL-1T [9]. Here, we compared the genome of strain R6-15 with that of strain MIL-1T [33,34]. The estimated DNA-DNA hybridization (DDH) value between them was 84.5% ± 2.57%, which is above the standard species threshold (70%) [35]. This result confirms that strain R6-15 belongs to the species Thalassolituus oleivorans.

Conclusion
Strain R6-15 is the first strain of the genus Thalassolituus isolated from the Arctic Ocean to have its complete genome sequenced. These genomic data will provide insights into the mechanisms by which this bacterium thrives on crude oil in polar marine environments.