High quality draft genome sequence of the slightly halophilic bacterium Halomonas zhanjiangensis type strain JSM 078169T (DSM 21076T) from a sea urchin in southern China

Halomonas zhanjiangensis Chen et al. 2009 is a member of the genus Halomonas, family Halomonadaceae, class Gammaproteobacteria. Representatives of the genus Halomonas are halophilic bacteria that are often isolated from saline environments. The type strain H. zhanjiangensis JSM 078169T was isolated from a sea urchin (Hemicentrotus pulcherrimus) collected from the South China Sea. The genome of strain JSM 078169T is the fourteenth sequenced genome in the genus Halomonas and the fifteenth in the family Halomonadaceae. The other thirteen genomes from the genus Halomonas are those of H. halocynthiae, H. venusta, H. alkaliphila, H. lutea, H. anticariensis, H. jeotgali, H. titanicae, H. desiderata, H. smyrnensis, H. salifodinae, H. boliviensis, H. elongata and H. stevensii. Here, we describe the features of strain JSM 078169T, together with the draft genome sequence and annotation from a culture of DSM 21076T. The 4,060,520 bp long draft genome consists of 17 scaffolds with 3,659 protein-coding and 80 RNA genes and is part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project.


Introduction
Strain JSM 078169T (= DSM 21076 = KCTC 22279 = CCTCC AB 208031) is the type strain of the species Halomonas zhanjiangensis [1], one of 84 species with a validly published name in the genus Halomonas [2], family Halomonadaceae [3]. The family Halomonadaceae currently comprises thirteen genera (Aidingimonas, Carnimonas, Chromohalobacter, Cobetia, Halomonas, Halotalea, Halovibrio, Kushneria, Marinospirillum, Modicisalibacter, Candidatus Portiera, Salinicola and Zymobacter), with Halomonas being the largest genus in this family [3][4][5][6]. Members of the genus Halomonas have been isolated from various saline environments and show halophilic characteristics [7][8][9][10][11]. Strain JSM 078169T was originally isolated from a sea urchin (Hemicentrotus pulcherrimus) that was collected from the South China Sea. The genus name was derived from the Greek words 'halos' meaning 'salt' and 'monas' meaning 'monad', yielding the Neo-Latin word 'halomonas' [2]; the species epithet was derived from the Latin word 'zhanjiangensis', of Zhanjiang, a city in China near where the sample was collected [1]. Strain JSM 078169T was found to assimilate several mono- and disaccharides and to produce numerous acid and alkaline phosphatases, leucine arylamidase, naphthol-AS-BI-phosphohydrolase and valine arylamidase [1]. There are no PubMed records that document the use of this strain for any biotechnological studies; only comparative analyses performed for the description of later members of the genus Halomonas are recorded. However, the NamesforLife [12] database reports at least 70 patents in which Halomonas spp. are referenced.
Here we present a summary classification and a set of features for H. zhanjiangensis JSM 078169T, together with the description of the genomic sequencing and annotation of DSM 21076.

Classification and features
16S rRNA analysis
The original genome assembly did not contain any long stretches of 16S rRNA gene sequence. Therefore, a 1,413 bp long fragment of the 16S rRNA gene was later patched into the genome sequence assembly. This almost full-length version of the 16S rRNA sequence was compared using NCBI BLAST [13,14] under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent version of the Greengenes database [15], and the relative frequencies of taxa and unidentified clones (or strains) were calculated by BLAST scores. The most frequently occurring genus was Halomonas (74.8%), and the unidentified clones or isolates represented 25.5% of the total BLAST results. Except for sequences of representatives of the genus Halomonas, no sequences from other genera were observed in the BLAST search. The highest degree of sequence similarity was reported with H. alkaliantarctica str. CRSS. Figure 1 shows the phylogenetic neighborhood of H. zhanjiangensis JSM 078169T in a tree based on 16S rRNA genes. The 1,413 bp long sequence fragment of the 16S rRNA gene differs by three nucleotides from the previously published 16S rRNA sequence (FJ429198). The tree provided a precise insight into the nomenclature and classification of members of the genus Halomonas. The phylogenetic analysis showed that H. zhanjiangensis strain JSM 078169T was most closely related to H. nanhaiensis YIM M 13059T, with 98.3% sequence similarity.
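The two calculations mentioned in this paragraph can be illustrated with a minimal Python sketch. It assumes that "calculated by BLAST scores" means each taxon's share of the summed hit scores, and that the three nucleotide differences are counted over an ungapped comparison of all 1,413 positions; neither assumption is confirmed by the text. The function names and the hit list are hypothetical and are not data from this study.

```python
from collections import defaultdict


def score_weighted_frequencies(hits):
    """Return each taxon's share (in percent) of the summed BLAST scores.

    `hits` is an iterable of (taxon_label, score) pairs, one per hit.
    Weighting by score is an assumption about how the frequencies were derived.
    """
    totals = defaultdict(float)
    for taxon, score in hits:
        totals[taxon] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {taxon: 100.0 * s / grand_total for taxon, s in totals.items()}


def percent_identity(compared_length, differences):
    """Percent identity for an ungapped comparison of `compared_length` positions."""
    return 100.0 * (compared_length - differences) / compared_length


# Hypothetical hit list for illustration only (not the Greengenes results).
example_hits = [
    ("Halomonas", 2500.0),
    ("Halomonas", 2400.0),
    ("unidentified", 2300.0),
    ("Halomonas", 2000.0),
]
frequencies = score_weighted_frequencies(example_hits)

# Three differences over the 1,413 bp fragment, assuming an ungapped comparison.
identity_to_published = percent_identity(1413, 3)
```

Under these assumptions, three differences over 1,413 positions correspond to roughly 99.8% identity between the patched fragment and the previously published sequence.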

Morphology and physiology
H. zhanjiangensis JSM 078169T is a Gram-negative, non-sporulating, strictly aerobic (Table 1), catalase-positive, oxidase-negative and slightly halophilic bacterium that reduces nitrate [1]. Cells of JSM 078169T are short rods (0.4-0.7 μm × 0.6-1.0 μm) and are motile by means of peritrichous flagella (not visible in Figure 2). Colonies are yellow-pigmented, flat and non-translucent with glistening surfaces and circular to slightly irregular margins, 2-3 mm in diameter after incubation on Marine Agar (MA) at 28 °C for 3-5 days. No diffusible pigments are produced. Growth occurs at 4-40 °C, with an optimum at 25-30 °C, and at pH 6.0-10.5, with an optimum at pH 7.5. The salinity range suitable for growth is 1.0-20.0% (w/v) total salts, with an optimum of 3.0-5.0% (w/v) total salts. No growth occurs in the absence of NaCl or with NaCl as the sole salt. Strain JSM 078169T grows on Marine Agar, which contains the following: 5.0 g peptone, 1.0 g yeast extract, 0.1 g ferric citrate, 19.45 g NaCl, 8.8 g MgCl2, 3.24 g Na2SO4, 1.8 g CaCl2, 0.55 g KCl, 0.16 g NaHCO3, 0.08 g KBr, 0.034 g SrCl2, 0.022 g H3BO3, 0.004 g sodium silicate, 0.0024 g sodium fluoride, 0.0016 g ammonium nitrate, 0.008 g disodium phosphate and 15 g agar.
Figure 1. All 16S rRNA gene sequences of the type strains within the genus Halomonas were included and combined with the representative 16S rRNA gene sequences of the type species of other genera, according to the most recent release of the EzTaxon database. The tree was inferred from 1,381 aligned characters [16] using the neighbor-joining (NJ) [17] and maximum-likelihood (ML) [18] methods with 1,000 bootstrap replicates in MEGA version 5.2 [19]. The branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 1,000 NJ bootstrap (left) and 1,000 ML bootstrap (right) replicates [20] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [21] are labeled with one asterisk, those also listed as 'Complete and Published' with two asterisks [22].
Table 1 (fragment). Depth: not reported.
Evidence codes: TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). Evidence codes are from the Gene Ontology project [37].
Standards in Genomic Sciences

Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position [42,43]. Sequencing of strain JSM 078169T is part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project [44], a follow-up of the GEBA project [45], which aims to increase the sequencing coverage of key reference microbial genomes. The genome project is deposited in the Genomes OnLine Database [21] and the permanent draft genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI) using state-of-the-art sequencing technology [46]. A summary of the project information is shown in Table 2.
[...] ) [47] at 28 °C. DNA was isolated from 0.5-1.0 g of cell paste using the MasterPure Gram-positive DNA purification kit (Epicentre MGP04100), following the standard protocol recommended by the manufacturer with modification st/DL for cell lysis as described by Wu et al. [45]. DNA is available through the DNA Bank Network [48].

Genome annotation
Genes were identified using Prodigal [55] as part of the DOE-JGI genome annotation pipeline [56], followed by a round of manual curation using the JGI GenePRIMP pipeline [57]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes-Expert Review (IMG-ER) platform [58].

Genome properties
The assembly of the draft genome sequence consists of 17 scaffolds amounting to 4,060,520 bp, and the G+C content is 54.5% (Table 3 and Figure 3). Of the 3,739 genes predicted, 3,659 were protein-coding genes and 80 were RNA genes. The majority of the protein-coding genes (87.1%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4 and Figure 3.
[...] C4-dicarboxylate transport system related proteins account for 1.67%, 1.37%, 1.42%, 0.27% and 1.18% of the total protein-coding genes, respectively. Therefore, H. zhanjiangensis has the highest percentage of genes encoding TRAP-type C4-dicarboxylate transport system related proteins in this group of bacteria to date. Among the signal transduction mechanisms, methyl-accepting chemotaxis proteins (MCPs) are transmembrane sensor proteins of bacteria. The MCPs allow bacteria to detect concentrations of molecules in the extracellular matrix so that they may smoothly swim or tumble accordingly [62,63]. Various environmental conditions give rise to diversity in bacterial signaling receptors, and consequently there are many genes encoding MCPs [64]. A number of MCPs (23)