Non contiguous-finished genome sequence and description of Enterobacter massiliensis sp. nov.

Enterobacter massiliensis strain JC163T sp. nov. is the type strain of E. massiliensis sp. nov., a new species within the genus Enterobacter. This strain, whose genome is described here, was isolated from the fecal flora of a healthy Senegalese patient. E. massiliensis is an aerobic rod. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 4,922,247 bp long genome (1 chromosome but no plasmid) exhibits a G+C content of 55.1% and contains 4,644 protein-coding and 80 RNA genes, including 5 rRNA genes.


Introduction
Enterobacter massiliensis strain JC163T (= CSUR P161 = DSM 26120) is the type strain of E. massiliensis sp. nov. This bacterium is a Gram-negative, aerobic, flagellated, indole-positive bacillus that was isolated from the feces of a healthy Senegalese patient as part of a study aiming to cultivate all bacterial species in human feces [1]. The current classification of prokaryotes, known as polyphasic taxonomy, relies on a combination of phenotypic and genotypic characteristics [2]. However, as more than 3,000 bacterial genomes have been sequenced [3] and the cost of genomic sequencing is decreasing, we recently proposed integrating genomic information into the description of new bacterial species [4-15].
Here we present a summary classification and a set of features for E. massiliensis sp. nov. strain JC163T (= CSUR P161 = DSM 26120), together with the description of the complete genomic sequencing and annotation. These characteristics support the circumscription of the species E. massiliensis. The genus Enterobacter (Hormaeche and Edwards, 1960) was created in 1960 [16]. To date, this genus comprises 25 species [17-35] and 2 subspecies. Members of the genus have been isolated mostly from the environment, in particular from plants and fruits, but are also frequently isolated from humans, notably in health-care-associated infections, where they cause bacteremia, pneumonia or urinary tract infections [36]. In addition, many Enterobacter spp. have been isolated from the normal fecal flora.

Classification and features
A stool sample was collected from a healthy 16-year-old male Senegalese volunteer patient living in Dielmo (a rural village in the Guinean-Sudanian zone of Senegal), who was included in a research protocol. Written assent was obtained from this individual. No written consent was needed from his guardians for this study because he was older than 15 years (in accordance with the previous project approved by the Ministry of Health of Senegal and the assembled village population, and as published elsewhere [37]) [4-15]. The fecal specimen was preserved at -80°C after collection and sent to Marseille. Strain JC163T (Table 1) was isolated in April 2011 by aerobic cultivation on Brain-Heart Infusion (BHI) agar at 37°C after preincubation of the stool specimen with lytic E. coli T1 and T4 phages [1,50]. This strain exhibited a 16S rRNA nucleotide sequence similarity with Enterobacter species ranging from 95.74% with E. pyrinus (Chung et al., 1993) to 97.33% with E. cloacae subsp. cloacae (Jordan, 1980) (Figure 1). The latter value was lower than the 98.7% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new species without carrying out DNA-DNA hybridization [51].

Table 1 note (evidence codes): [...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence. These evidence codes are from the Gene Ontology project [49]. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.

Strain JC163T exhibited catalase activity but not oxidase activity. Using the API 20E system, positive reactions were obtained for indole production, β-galactosidase and glucose, mannitol, sorbitol and rhamnose fermentation. E. massiliensis is susceptible to ticarcillin, imipenem, trimethoprim/sulfamethoxazole, gentamicin, amikacin, and colimycin but resistant to fosfomycin and nitrofurantoin.
By comparison with E. arachidis, its phylogenetically closest neighbor, E. massiliensis differed in arginine dihydrolase, ornithine decarboxylase, citrate and succinate fermentation [19]. Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [52]. Briefly, a pipette tip was used to pick one isolated bacterial colony from a culture agar plate and spread it as a thin film on an MTP 384 MALDI-TOF target plate (Bruker Daltonics, Germany). Twelve distinct deposits were made for strain JC163T from twelve isolated colonies. Each smear was overlaid with 2 µL of matrix solution (a saturated solution of α-cyano-4-hydroxycinnamic acid) in 50% acetonitrile, 2.5% trifluoroacetic acid, and allowed to dry for five minutes. Measurements were performed with a Microflex spectrometer (Bruker). Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of acquisition was between 30 seconds and 1 minute per spot. The twelve JC163T spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 3,769 bacteria, including 34 spectra from validly published Enterobacter species that were used as reference data in the BioTyper database (updated March 15th, 2012). The identification method uses the m/z range from 3,000 to 15,000 Da. For every spectrum, a maximum of 100 peaks was taken into account and compared with the spectra in the database.
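The peak restriction described above (m/z window of 3,000 to 15,000 Da, at most 100 peaks per spectrum) can be illustrated with a minimal sketch. The text does not say how the BioTyper software chooses which 100 peaks to keep; the sketch assumes the most intense peaks are retained, and the function name `select_peaks` and the `(m/z, intensity)` tuple representation are hypothetical, not part of the BioTyper software.

```python
from typing import List, Tuple

# A peak is (m/z in Da, intensity in arbitrary units). Hypothetical representation.
Peak = Tuple[float, float]


def select_peaks(
    peaks: List[Peak],
    mz_min: float = 3000.0,
    mz_max: float = 15000.0,
    max_peaks: int = 100,
) -> List[Peak]:
    """Keep peaks inside the m/z window, then at most `max_peaks` of them.

    Assumption: when more than `max_peaks` peaks fall in the window,
    the most intense ones are kept. The source text does not state this rule.
    """
    in_window = [p for p in peaks if mz_min <= p[0] <= mz_max]
    strongest = sorted(in_window, key=lambda p: p[1], reverse=True)[:max_peaks]
    # Return the retained peaks ordered by m/z, as they would appear in a spectrum.
    return sorted(strongest, key=lambda p: p[0])
```

For example, calling `select_peaks` on a list containing peaks at 2,500 Da and 16,000 Da drops both, whatever their intensity, because they lie outside the window.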
A score enabled the presumptive identification and discrimination of the tested species from those in the database: a score > 2 with a validly published species enabled identification at the species level; a score > 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain JC163T, the score obtained was 1.4, suggesting that our isolate was not a member of a known species. We added the spectrum from strain JC163T to our database (Figure 4). In addition, the gel view highlights spectral differences from other members of the Enterobacteriaceae family (Figure 5).
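The score interpretation rule can be written as a short function. This is a sketch only: the function name `interpret_score` is hypothetical, and the text does not say how scores of exactly 2 or exactly 1.7 are handled, so the boundary behavior below is an assumption.

```python
def interpret_score(score: float) -> str:
    """Map a MALDI-TOF matching score to an identification level.

    Thresholds follow the text: > 2 species level, > 1.7 but < 2 genus level,
    < 1.7 no identification. Assumption: a score of exactly 2.0 is treated as
    genus level and exactly 1.7 as no identification, since the text leaves
    these boundary cases unspecified.
    """
    if score > 2.0:
        return "species"
    if score > 1.7:
        return "genus"
    return "none"


# Score reported for strain JC163T in the text.
jc163_level = interpret_score(1.4)
```

With the reported score of 1.4, `jc163_level` is `"none"`, which matches the conclusion that the isolate could not be assigned to a known species.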

Figure 5.
Gel view comparing Enterobacter massiliensis JC163T spectra with 23 other members of the Enterobacteriaceae family. The gel view displays the raw spectra of all loaded spectrum files arranged in a pseudo-gel-like format. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray-scale scheme. The color bar and the right y-axis indicate the relation between the color in which a peak is displayed and the peak intensity in arbitrary units.

Genome sequencing information

Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the genus Enterobacter, and is part of a study of the human digestive flora aiming to isolate all bacterial species within human feces. It was the 10th genome of an Enterobacter species (including genomes from 5 validly published species) and the first genome of E. massiliensis sp. nov. The genome sequence, deposited in GenBank under accession number CAEO00000000, consists of 224 contigs. Table 2 shows the project information and its association with MIGS version 2.0 compliance [53].

Growth conditions and DNA isolation
E. massiliensis strain JC163T (= CSUR P161 = DSM 26120) was grown aerobically on BHI agar at 37°C. Four Petri dishes were inoculated and the growth was resuspended in 3 × 100 µL of G2 buffer. A first mechanical lysis was performed with glass powder on the FastPrep-24 device (Sample Preparation system, MP Biomedicals, USA) for 2 × 20 seconds. DNA was then treated with 2.5 µg/µL lysozyme (30 minutes at 37°C) and extracted using the BioRobot EZ1 Advanced XL (Qiagen). The DNA was then concentrated and purified on a QIAamp kit (Qiagen). The yield and concentration were measured using the Quant-iT PicoGreen kit (Invitrogen) on the Genios Tecan fluorometer; the concentration was 118 ng/µL.

Genome annotation
Open Reading Frames (ORFs) were predicted using Prodigal [54] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database [55] and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAScan-SE tool [56] was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [57] and BLASTN against the GenBank database. Signal peptides and numbers of transmembrane helices were predicted using SignalP [58] and TMHMM [59], respectively.
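The exclusion of ORFs that span a sequencing gap can be sketched as an interval-overlap filter. The sketch assumes that gaps appear as runs of `N` in the assembled sequence and that ORF coordinates are 0-based, half-open intervals; neither convention is stated in the text, and the helper names `find_gaps` and `drop_gap_spanning_orfs` are hypothetical rather than part of Prodigal.

```python
import re
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open [start, end); an assumed convention


def find_gaps(sequence: str, min_run: int = 1) -> List[Interval]:
    """Locate runs of N/n of at least `min_run` bases (assumed gap encoding)."""
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


def drop_gap_spanning_orfs(orfs: List[Interval], gaps: List[Interval]) -> List[Interval]:
    """Keep only ORFs that do not overlap any gap interval."""
    kept = []
    for start, end in orfs:
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(start < g_end and g_start < end for g_start, g_end in gaps)
        if not overlaps:
            kept.append((start, end))
    return kept


# Toy contig: 6 bases, a 4-base gap, then 6 more bases.
contig = "ATGAAANNNNTTTTAA"
gaps = find_gaps(contig)
kept_orfs = drop_gap_spanning_orfs([(0, 6), (3, 12), (10, 16)], gaps)
```

In the toy example only the middle ORF, which runs across the `N` stretch, is discarded; the two ORFs that merely abut the gap are kept.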
To estimate the mean level of nucleotide sequence similarity at the genome level between E. massiliensis strain JC163T, E. aerogenes strain KCTC 2190 (GenBank accession number CP002824), E. asburiae strain LF7a (CP003026), E. cancerogenus strain ATCC35316 (ABWM00000000), E. cloacae subsp. cloacae strain ATCC13047 (CP001918), E. cloacae subsp. dissolvens strain SDM (CP003678) and E. hormaechei strain ATCC49162 (AFHR00000000), we compared the ORFs only, using BLASTN and the following parameters: a query coverage of > 70% and a minimum nucleotide length of 100 bp.

Standards in Genomic Sciences

Figure 6.
Graphical circular map of the chromosome. From outside to the center: genes on both the forward and reverse strands, genes on forward strand, genes on reverse strand, genes colored by COG categories, RNA genes (tRNAs and rRNAs) and blast of the genome vs itself.
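The ORF-level comparison described in this section (BLASTN hits filtered on query coverage and minimum length, then averaged) might be sketched as follows. Several points are assumptions rather than the authors' stated method: coverage is taken as alignment length divided by query length, the 100 bp minimum is applied to the alignment length, and the mean is a plain average of percent identities over retained hits. The `Hit` class and function names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Hit:
    """One best BLASTN hit for a query ORF (hypothetical record layout)."""
    query_id: str
    query_length: int       # length of the query ORF in bp
    alignment_length: int   # length of the aligned region in bp
    percent_identity: float


def passes_filter(hit: Hit, min_coverage: float = 0.70, min_length: int = 100) -> bool:
    """Apply the thresholds from the text: coverage > 70%, length >= 100 bp."""
    coverage = hit.alignment_length / hit.query_length
    return coverage > min_coverage and hit.alignment_length >= min_length


def mean_similarity(hits: List[Hit]) -> Optional[float]:
    """Average percent identity over retained hits; None if nothing passes."""
    kept = [h.percent_identity for h in hits if passes_filter(h)]
    return mean(kept) if kept else None


example_hits = [
    Hit("orf_a", 1000, 900, 90.0),  # kept: coverage 0.90, length 900
    Hit("orf_b", 1000, 500, 99.0),  # dropped: coverage 0.50
    Hit("orf_c", 120, 90, 80.0),    # dropped: alignment shorter than 100 bp
    Hit("orf_d", 200, 180, 80.0),   # kept: coverage 0.90, length 180
]
example_mean = mean_similarity(example_hits)
```

An unweighted mean treats every ORF equally; weighting by alignment length would be an equally plausible design choice that the text does not rule out.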

Genome properties
The genome of E. massiliensis sp. nov. strain JC163T is 4,922,247 bp long (1 chromosome but no plasmid) with a 55.1% G+C content (Figure 6 and Table 3). Of the 4,724 predicted genes, 4,644 were protein-coding genes and 80 were RNAs, including 1 complete rRNA operon, 2 additional 5S rRNAs and 75 tRNAs. A total of 3,181 genes (68.5%) were assigned a putative function. The remaining genes were annotated as hypothetical or unknown proteins. The properties and statistics of the genome are summarized in Table 3, and the distribution of genes into COG functional categories is presented in Table 4.
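The gene counts above can be cross-checked with a few lines of arithmetic. The sketch assumes that one complete rRNA operon contributes three rRNA genes (16S, 23S and 5S), which is the usual bacterial arrangement but is not spelled out in the text, and that the 68.5% figure is computed relative to the protein-coding genes.

```python
# Counts taken from the text.
total_genes = 4724
protein_coding = 4644
rna_genes = 80
complete_rrna_operons = 1
additional_5s_rrnas = 2
trna_genes = 75
genes_with_function = 3181

# Assumption: one complete operon = 16S + 23S + 5S, i.e. three rRNA genes.
rrna_genes = complete_rrna_operons * 3 + additional_5s_rrnas

# Assumption: the percentage is relative to protein-coding genes.
percent_with_function = round(100 * genes_with_function / protein_coding, 1)

print(f"rRNA genes: {rrna_genes}")
print(f"RNA genes: {rrna_genes + trna_genes}")
print(f"Protein-coding + RNA: {protein_coding + rna_genes}")
print(f"Genes with putative function: {percent_with_function}%")
```

Under these assumptions the totals agree with the 5 rRNA genes, 80 RNA genes, 4,724 predicted genes and 68.5% quoted in the text.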

Comparison with other Enterobacter species genomes
Here, we compared the genome of E. massiliensis strain JC163T [...]

Table 3 note: The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.

Table 4 note: The total is based on the total number of protein-coding genes in the annotated genome.

Description of Enterobacter massiliensis sp. nov.
Enterobacter massiliensis (mas.il.i.en′sis. L. gen. masc. n. massiliensis, of Massilia, the Latin name of Marseille where strain JC163T was first isolated and cultivated). Colonies are 2 mm in diameter on Brain-Heart Infusion agar and are convex, opaque, light-cream colored and circular with regular margins. Cells are rods with tufts of polar flagella, a mean diameter of 1.02 µm and a mean length of 1.90 µm. Optimal growth is achieved in an aerobic atmosphere supplemented with 5% CO2. Weak growth is observed in microaerophilic conditions. No growth is observed under anaerobic conditions in the absence of CO2. Growth occurs between 25 and 45°C, with optimal growth occurring between 30 and 37°C. Cells stain Gram-negative, are non-endospore-forming and are motile. Cells are positive for catalase and indole production. β-galactosidase and glucose, mannitol, sorbitol and rhamnose fermentation activities are present. Nitrate reduction, urease and oxidase activities are absent. Cells are susceptible to ticarcillin, imipenem, trimethoprim/sulfamethoxazole, gentamicin, amikacin, and colimycin, but resistant to fosfomycin and nitrofurantoin. The G+C content of the genome is 55.1%. The 16S rRNA and genome sequences are deposited in GenBank and EMBL under accession numbers JN657217 and CAEO00000000, respectively. The type strain JC163T (= CSUR P161 = DSM 26120) was isolated from the fecal flora of a healthy patient in Senegal.