Genome sequence of the marine bacterium Corynebacterium maris type strain Coryn-1T (= DSM 45190T)

Corynebacterium maris Coryn-1T Ben-Dov et al. 2009 is a member of the genus Corynebacterium which contains Gram-positive, non-spore forming bacteria with a high G+C content. C. maris was isolated from the mucus of the Scleractinian coral Fungia granulosa and belongs to the aerobic and non-haemolytic corynebacteria. It displays tolerance to salts (up to 10%) and is related to the soil bacterium Corynebacterium halotolerans. As this is a type strain in a subgroup of Corynebacterium without complete genome sequences, this project, describing the 2.78 Mbp long chromosome and the 45.97 kbp plasmid pCmaris1, with their 2,584 protein-coding and 67 RNA genes, will aid the Genomic Encyclopedia of Bacteria and Archaea project.


Introduction
Strain Coryn-1T (= DSM 45190T) is the type strain of the species Corynebacterium maris, originally isolated from the mucus of the coral Fungia granulosa from the Gulf of Eilat (Red Sea, Israel) [1]. The genus Corynebacterium comprises Gram-positive bacteria with a high G+C content. It currently contains over 80 members [2] isolated from diverse sources such as human clinical samples [3] and animals [4], but also from soil [5] and ripening cheese [6]. Within this diverse genus, C. maris has been proposed to form a distinct lineage with C. halotolerans YIM 70093T, with which it shares 94% 16S rRNA gene sequence similarity [1]. Like its closest phylogenetic relative C. halotolerans, which displays the highest resistance to salt described for the genus Corynebacterium to date, C. maris Coryn-1T is able to live under conditions of high salinity. The strain grows on LB agar plates with salinity ranging between 0 and 10%; optimal growth was detected between 0.5 and 4.0% [1]. In addition, Coryn-1T is an alkaline-tolerant bacterium, which grows well at pH 7.2-9.0 (optimum pH 7.2) [1]. Here we present a summary classification and a set of features for C. maris DSM 45190T, together with the description of the genomic sequencing and annotation.

Classification and features
A representative genomic 16S rRNA sequence of C. maris DSM 45190T was compared to the Ribosomal Database Project database [7], confirming the initial taxonomic classification. C. maris shows highest similarity to C. halotolerans (94%). Because sequence similarity greater than 97% was not obtained with any member of the genus Corynebacterium, it was suggested that C. maris forms a novel species, a hypothesis that is backed by other taxonomic classifiers [1]. Figure 1 shows the phylogenetic neighborhood of C. maris in a 16S rRNA based tree. Within the larger group that also contains the species C. marinum 7015T [10] and C. humireducens MFC-5T [11], the two strains C. maris and C. halotolerans YIM 70093T [1] were clustered in a common subgroup. C. maris Coryn-1T is a Gram-positive coccobacillus, which is 0.8-1.5 μm long and 0.5-0.8 μm wide (Table 1, Figure 2). Because C. maris contains a thick peptidoglycan layer, the cells commonly do not separate after cell division and stay diplocellular [1], the so-called snapping division. C. maris is described as non-motile [1], which is consistent with a complete lack of genes associated with 'cell motility' (functional category N in the COGs table).

Figure 1. Species with at least one publicly available genome sequence (not necessarily the type strain) are highlighted in bold face. The tree is based on sequences aligned by the RDP aligner and utilizes the Jukes-Cantor corrected distance model to construct a distance matrix based on alignment model positions without alignment inserts, using a minimum comparable position of 200. The tree is built with RDP Tree Builder, which utilizes Weighbor [8] with an alphabet size of 4 and length size of 1,000. The building of the tree also involves a bootstrapping process repeated 100 times to generate a majority consensus tree [9]. Rhodococcus equi (X80614) was used as an outgroup.

Table 1. Classification and general features of C. maris Coryn-1T according to the MIGS recommendations [12].
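The Jukes-Cantor correction named in the description of the 16S rRNA tree can be illustrated with a minimal sketch. This is not the RDP Tree Builder implementation; the function names `p_distance` and `jukes_cantor_distance` are hypothetical, and the sketch only assumes the standard Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3), where p is the fraction of differing sites among comparable aligned positions.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of mismatches among aligned positions where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip positions that are not comparable
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed mismatch fraction p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Illustration only: a 94% sequence similarity corresponds to p = 0.06.
example_distance = jukes_cantor_distance(0.06)
```

Under these assumptions, an observed difference of 6% gives a corrected distance of roughly 0.063 substitutions per site, slightly larger than the raw value because the model accounts for multiple substitutions at the same site. The 94% figure reported in [1] was not necessarily computed over the same alignment positions, so this number is only indicative.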

Genome sequencing and annotation

Genome project history
Because of its phylogenetic position and interesting capabilities, i.e. high salt tolerance, C. maris Coryn-1T was selected for sequencing as part of a project to define the core genome and pan genome of the non-pathogenic corynebacteria. Although this work is not part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) project [23], sequencing of the type strain will nonetheless aid the GEBA effort. The genome project is deposited in the Genomes OnLine Database [24] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the Center of Biotechnology (CeBiTec). A summary of the project information is shown in Table 2.

Growth conditions and DNA isolation
C. maris strain Coryn-1T, DSM 45190, was grown aerobically in LB broth (Carl Roth GmbH, Karlsruhe, Germany) at 37 °C. DNA was isolated from ~10^8 cells using the protocol described by Tauch et al. 1995 [25]. Reads were assembled using the Newbler assembler v2.6 (Roche). The initial Newbler assembly consisted of 26 contigs in seven scaffolds. Analysis of the seven scaffolds revealed one to be an extrachromosomal element (plasmid pCmaris1) and five to make up the chromosome; the remaining scaffold contained the four copies of the rrn operon, which caused the scaffold breaks.
The scaffolds were ordered based on alignments to the complete genome of C. halotolerans [26] and subsequent verification by restriction digestion, Southern blotting and hybridization with a 16S rDNA specific probe. The Phred/Phrap/Consed software package [27][28][29][30] was used for sequence assembly and quality assessment in the subsequent finishing process. After the shotgun stage, gaps between contigs were closed by editing in Consed (for repetitive elements) and by PCR with subsequent Sanger sequencing (IIT Biotech GmbH, Bielefeld, Germany). A total of 67 additional reactions were necessary to close gaps not caused by repetitive elements.

Genome properties
The genome (2,833,547 bp in total) includes one circular chromosome of 2,787,574 bp (66.67% G+C content) and one plasmid of 45,973 bp (61.32% G+C content; Figure 3). For chromosome and plasmid together, a total of 2,653 genes were predicted, 2,584 of which are protein-coding genes. A total of 1,494 (57.82%) of the protein-coding genes were assigned to a putative function; the remaining protein-coding genes were annotated as hypothetical proteins. Of the protein-coding genes, 1,067 belong to 350 paralogous families in this genome, corresponding to a gene content redundancy of 41.29%. The properties and the statistics of the genome are summarized in Tables 3 and 4.

a) The total is based on either the size of the genome in base pairs or the total number of genes in the annotated genome.
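The arithmetic behind these genome statistics can be checked with a short sketch. All input values are taken from the text above; the variable names are illustrative, and the percentages are assumed to be computed relative to the number of protein-coding genes, which reproduces the reported figures.

```python
# Replicon sizes in base pairs, as reported in the text.
chromosome_bp = 2_787_574
plasmid_bp = 45_973
genome_bp = chromosome_bp + plasmid_bp

# Gene counts, as reported in the text.
protein_coding = 2_584
with_putative_function = 1_494
in_paralogous_families = 1_067

# Percentages relative to the protein-coding gene count (an assumption
# that matches the values quoted in the text).
pct_with_function = round(100 * with_putative_function / protein_coding, 2)
pct_paralogous = round(100 * in_paralogous_families / protein_coding, 2)

# Protein-coding genes without a putative function (hypothetical proteins).
hypothetical_proteins = protein_coding - with_putative_function

print(f"Genome size: {genome_bp:,} bp")
print(f"Assigned a putative function: {pct_with_function}%")
print(f"In paralogous families: {pct_paralogous}%")
print(f"Hypothetical proteins: {hypothetical_proteins:,}")
```

The sum of the two replicons gives the stated genome size of 2,833,547 bp, and the two ratios give 57.82% and 41.29%, in agreement with the values in the text.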