Draft genome sequence of Rubidibacter lacunae strain KORDI 51-2T, a cyanobacterium isolated from seawater of Chuuk lagoon

Rubidibacter lacunae, a photoautotrophic cyanobacterium, was first reported in 2008. The type strain, KORDI 51-2T, was isolated from seawater of Chuuk lagoon, located in a tropical area. Although phylogenetic analyses place it in a clade otherwise composed exclusively of extremely halotolerant strains, R. lacunae is known to be incapable of growth at salt concentrations above 10%. Here we report the main features of the genome of R. lacunae strain KORDI 51-2T. The genome contains a gene cluster for phosphonate utilization encoding three transporters, one regulator and eight C-P lyase subunits.


Introduction
Rubidibacter lacunae type strain KORDI 51-2T (= KCTC 40015T = UTEX L2944T) is a photoautotrophic cyanobacterium isolated from lagoon seawater of Chuuk, Micronesia [1]. At present, the genus Rubidibacter comprises a single isolate. Furthermore, only three environmental 16S rRNA gene sequences in the NCBI database show relatively high sequence similarity (ca. 96%) to the 16S rRNA gene of the strain. Thus, the genus seems either to be numerically rare or to exploit specific environments such as microbial mats. Indeed, the sequences most similar to Rubidibacter (GenBank accession nos. DQ861063 and DQ861117) were obtained from microbial mats of a coastal hypersaline pool. Nonetheless, strain KORDI 51-2T is a non-extremely halotolerant member of the Halothece cluster, which is otherwise composed exclusively of extremely halophilic/halotolerant bacteria. Given this contrasting phenotypic trait, genomic information on KORDI 51-2T could provide a useful clue to understanding the genomic adaptation of cyanobacteria to extreme salt conditions. Here we present a summary of the genomic features of R. lacunae strain KORDI 51-2T.

Classification and features
Phylogenetic analysis of 16S ribosomal RNA genes (Figure 1) placed R. lacunae KORDI 51-2T in the Halothece cluster. Four Euhalothece strains belonging to the cluster were isolated from a hypersaline pond (strains MPI 96N303 and MPI 96N304) or a solar evaporation pond (strains MPI95AH10 and MPI95AH13) in Mexico [2]. These strains showed sustained growth at salinities between 6 and 16%, and several could grow even in NaCl-saturated brine, suggesting that they are at least extremely halotolerant cyanobacteria [2]. Dactylococcopsis salina and other Halothece strains belonging to the cluster were also isolated from various hypersaline environments, such as a solar lake in Egypt, a solar evaporation pond in Spain and a hypersaline lagoon in Australia [2,3]. In contrast, R. lacunae KORDI 51-2T was isolated from natural seawater and is able to grow at salinities between 2 and 7% (Table 1). In addition, R. lacunae KORDI 51-2T contains phycoerythrin, which differentiates it from the other strains belonging to the Halothece cluster [1]. An epifluorescence micrograph of the cells and other classification and general features are shown in Figure 2 and Table 1, respectively.

Table 1 note: [...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [9].

Standards in Genomic Sciences

Genome sequencing and annotation

Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position. The genome project was deposited in the Genomes OnLine Database [10], and the draft genome sequence was deposited in the GenBank database (accession number ASSJ00000000). Genome sequencing was carried out at Macrogen Inc. (Seoul, Korea) using GS-FLX Titanium sequencing technology. Table 2 presents the project information and its association with MIGS version 2.0 compliance [4].

Growth conditions and DNA isolation
R. lacunae KORDI 51-2T was grown in a 50 ml culture flask filled with 50 ml of modified f/2 medium in which silicate was omitted and ammonium chloride was supplemented (final concentration of 100 μM). The inoculated culture flask was incubated at 25 °C at about 20 μE m⁻² s⁻¹ (light:dark = 14:10) for 3 weeks. Genomic DNA was isolated using a Qiagen Genomic-tip 100/G (Qiagen) according to the manufacturer's instructions.

Genome sequencing and assembly
The genome was sequenced by pyrosequencing (GS-FLX Titanium). A shotgun library was constructed according to the GS FLX Titanium Sequencing Method Manual. The 291,414 pyrosequencing reads obtained had an average length of 442.12 bp and were assembled using the Newbler assembler (version 2.3; Roche) with default options. The final assembly yielded 126 contigs of at least 500 bp, with a total length of 4,215,105 bp. After removal of 27 short, low-coverage contigs to minimize possible contamination, the remaining 99 contigs were used for further analyses (Table 3).
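
The read and assembly figures above allow a rough estimate of sequencing depth. The following Python sketch is illustrative only: the function name, the constant names and the choice of the 126-contig assembly length as the denominator are assumptions made here, not part of the reported workflow.

```python
# Figures quoted in the text above.
N_READS = 291_414          # pyrosequencing reads
MEAN_READ_LEN_BP = 442.12  # average read length (bp)
ASSEMBLY_BP = 4_215_105    # summed length of the 126 contigs >= 500 bp
N_CONTIGS_INITIAL = 126
N_CONTIGS_REMOVED = 27     # short, low-coverage contigs discarded


def estimate_depth(n_reads: int, mean_read_len_bp: float, assembly_bp: int) -> float:
    """Nominal depth = total sequenced bases / assembly length.

    Assumption: every read contributes to the assembly, so the result
    is an upper bound on the depth actually achieved.
    """
    if assembly_bp <= 0:
        raise ValueError("assembly_bp must be positive")
    return n_reads * mean_read_len_bp / assembly_bp


total_read_bases = N_READS * MEAN_READ_LEN_BP
depth = estimate_depth(N_READS, MEAN_READ_LEN_BP, ASSEMBLY_BP)
contigs_retained = N_CONTIGS_INITIAL - N_CONTIGS_REMOVED

print(f"Total read bases: {total_read_bases / 1e6:.1f} Mbp")
print(f"Nominal depth:    {depth:.1f}x")
print(f"Contigs retained: {contigs_retained}")
```

Under these assumptions the nominal depth is roughly 30-fold. The true value depends on how many reads Newbler actually placed in the assembly, which is not stated here.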

Genome annotation
Gene prediction and functional annotation of the genome sequence were performed mainly within the Integrated Microbial Genomes-Expert Review (IMG-ER) platform [11]. tRNA genes were identified using tRNAscan-SE [12]. Ribosomal RNA genes and other ncRNAs were predicted using RNAmmer [13] and Infernal [14] with Rfam models [15], respectively. Protein-coding genes were identified using Prodigal [16], followed by a round of manual curation using the JGI GenePRIMP pipeline [17]. The predicted CDSs were searched against the TIGRFAM, Pfam and COG databases implemented in the IMG system.

Genome properties
The draft genome of R. lacunae KORDI 51-2T comprises 4.15 Mbp in 99 contigs and has a G+C content of 56.22% (Figure 3 and Table 3). A total of 3,790 genes were predicted. Of these, 283 were pseudogenes; of the remainder, 3,457 were annotated as protein-coding genes and 50 as RNA genes (3 rRNA, 41 tRNA and 6 other ncRNA). The properties and statistics of the genome are summarized in Table 3. The distribution of genes into COG functional categories is presented in Table 4.

a) The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.
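
The gene tallies above can be cross-checked arithmetically, and a G+C percentage like the one reported can be computed from contig sequences. The Python sketch below is a hedged illustration: the function names and the toy sequences are hypothetical, and the treatment of ambiguous bases (ignored) is an assumption that may differ from the method used by IMG-ER.

```python
from typing import Iterable

# Gene tallies quoted in the text above.
TOTAL_GENES = 3790
GENE_COUNTS = {"protein_coding": 3457, "pseudogenes": 283, "rna": 50}
RNA_COUNTS = {"rRNA": 3, "tRNA": 41, "other_ncRNA": 6}


def gc_percent(seq: str) -> float:
    """G+C percentage among unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are ignored.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def pooled_gc_percent(contigs: Iterable[str]) -> float:
    """Length-weighted G+C percentage over all contigs taken together."""
    return gc_percent("".join(contigs))


# Consistency checks on the reported numbers.
tallies_consistent = sum(GENE_COUNTS.values()) == TOTAL_GENES
rna_consistent = sum(RNA_COUNTS.values()) == GENE_COUNTS["rna"]
print(f"Gene categories sum to total: {tallies_consistent}")
print(f"RNA classes sum to RNA genes: {rna_consistent}")

# Toy contigs (hypothetical, not from the actual assembly).
toy_contigs = ["GGCC", "AATT"]
print(f"Pooled G+C of toy contigs: {pooled_gc_percent(toy_contigs):.2f}%")
```

Pooling the contigs before computing the percentage weights each contig by its length, which is the usual convention for a genome-wide figure; averaging per-contig percentages would overweight short contigs.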

Insights from the genome sequence
Genome analysis of R. lacunae KORDI 51-2T revealed that it contains a gene cluster involved in organic phosphonate utilization. Like the marine nitrogen-fixing cyanobacterium Trichodesmium erythraeum IMS101 [18], strain KORDI 51-2T has orthologs of phnC-E (transporters) and phnG-M (C-P lyase complex) (Figure 4A). Additionally, an ortholog of phnF (transcriptional regulator) is found in strain KORDI 51-2T but not in T. erythraeum IMS101. Phylogenetic analysis of PhnJ proteins from various bacterial strains showed that the PhnJ proteins of cyanobacteria form polyphyletic lineages (Figure 4B), suggesting that the phn gene cluster of cyanobacteria might have been acquired by horizontal gene transfer. As KORDI 51-2T can grow in media supplemented with a variety of organic phosphonate substrates (2-aminoethylphosphonate, methylphosphonate, phosphonoacetic acid and phosphonoformic acid) as the sole P source (data not shown), the strain must be able to cleave the C-P bonds of organic phosphonates via the C-P lyase pathway and use them as a P source.