Non-contiguous finished genome sequence and description of Brevibacillus massiliensis sp. nov.

Brevibacillus massiliensis strain phRT sp. nov. is the type strain of B. massiliensis sp. nov., a new species within the genus Brevibacillus. This strain was isolated from the fecal flora of a woman suffering from morbid obesity. B. massiliensis is a Gram-positive aerobic rod-shaped bacterium. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 5,051,018 bp long genome (1 chromosome but no plasmid) contains 5,051 protein-coding and 84 RNA genes, and exhibits a G+C content of 53.1%.


Introduction
Brevibacillus massiliensis strain phRT (= CSUR P177 = DSM 25447) is the type strain of B. massiliensis sp. nov. This bacterium is a Gram-positive, spore-forming, indole-negative, aerobic and motile bacillus that was isolated from the stool of a 26-year-old woman suffering from morbid obesity. The strain was isolated as part of a study aiming at individually cultivating all species within human feces [1]. The current approach to classification of prokaryotes, often referred to as polyphasic taxonomy, relies on a combination of phenotypic and genotypic characteristics [2]. However, as more than 3,000 bacterial genomes have been sequenced to date [3] and the cost of genomic sequencing is decreasing, we recently proposed to integrate genomic information in the description of new bacterial species [4-15]. The genus Brevibacillus (Shida et al. 1996) was created in 1996 by reclassification of 10 Bacillus species, on the basis of 16S rDNA gene sequence analysis [16]. To date, the genus includes, among others, B. invocatus [18], B. limnophilus [19], B. levickii [20], B. ginsengisoli [21], B. panacihumi [22], B. fluminis [23], and B. nitrificans [24]. Members of the genus Brevibacillus are environmental bacteria and were mostly isolated from soil [22,25]. In addition, B. brevis and B. centrosporus were isolated from indoor dust in schools, day care centers for children and animal sheds [26], and fecal flora of children, respectively [27]. However, several Brevibacillus species are also frequently isolated from humans, notably in nosocomial infections, causing breast abscess, pneumonia [18], peritonitis [28] and endophthalmitis [29]. Here we present a summary classification and a set of features for B. massiliensis sp. nov. strain phRT (= CSUR P177 = DSM 25447), together with the description of the complete genomic sequencing and annotation. These characteristics support the circumscription of the B. massiliensis species.

Classification and features
A stool sample was collected from a 26-year-old woman living in Marseille (France). She suffered from morbid obesity and had a body mass index of 48.2 (118.8 kg, 1.57 m). At the time of stool sample collection she was not under medication or on a diet. The patient gave informed and signed consent. This study and the assent procedure were approved by the Ethics Committee of the Institut Fédératif de Recherche IFR48, Faculty of Medicine, Marseille, France (agreement 11-017). The fecal specimen was preserved at -80°C after collection. Strain phRT (Table 1) [18], B. panacihumi [22], B. levickii [20], (Figure 1). This latter value was lower than the 98.7% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new species without carrying out DNA-DNA hybridization [40]. Different growth temperatures (25, 30, 37, 45°C [Table 2]) were tested; no growth occurred at 25°C, growth occurred between 30 and 45°C, and optimal growth was observed at 37°C. Grey colonies were 0.8 mm to 1 mm in diameter on blood-enriched Columbia agar and Brain Heart Infusion (BHI) agar. Growth of the strain was tested under anaerobic and microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively (BioMerieux), and in the presence of air, with or without 5% CO2. Growth was obtained aerobically. Weak growth was observed with 5% CO2, but no growth occurred under microaerophilic or anaerobic conditions. Gram staining showed Gram-positive rods (Figure 2). The motility test was positive. By electron microscopy, cells measured 0.61 µm to 0.80 µm in diameter (mean 0.74 µm) and 2.60 µm to 7.30 µm in length (mean 4.3 µm). Peritrichous flagella were also observed (Figure 3).
Table 1 (footnote). Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [39]. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.

Figure 1.
Phylogenetic tree highlighting the position of Brevibacillus massiliensis strain phRT relative to other type strains within the Brevibacillus genus. GenBank accession numbers are indicated in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained using the maximum-likelihood method within the MEGA program. Numbers at the nodes are percentages of bootstrap values obtained by repeating the analysis 500 times to generate a majority consensus tree. Alicyclobacillus acidocaldarius was used as the outgroup. The scale bar represents a 2% nucleotide sequence divergence.
B. massiliensis is susceptible to penicillin G, amoxicillin, amoxicillin + clavulanic acid, ceftriaxone, imipenem, erythromycin, doxycycline, rifampicin, vancomycin, ciprofloxacin, gentamicin and nitrofurantoin, and resistant to metronidazole and trimethoprim + sulfamethoxazole. By comparison with B. borstelensis, its phylogenetically closest neighbor, B. massiliensis differed in fumarate, phenylacetate and glutamate activities [18]. By comparison with B. brevis, B. massiliensis differed in alkaline and acid phosphatase production, nitrate reductase, esterase, esterase lipase, leucine arylamidase, cystine arylamidase and valine arylamidase production. By comparison with B. agri, B. massiliensis differed in oxidase production.
Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [41]. Briefly, a pipette tip was used to pick an isolated bacterial colony from a culture agar plate and spread it as a thin film on an MTP 384 MALDI-TOF target plate (Bruker Daltonics, Germany). Twelve distinct deposits were made for strain phRT from twelve isolated colonies. Each smear was overlaid with 2 µL of matrix solution (saturated solution of alpha-cyano-4-hydroxycinnamic acid) in 50% acetonitrile, 2.5% trifluoroacetic acid, and allowed to dry for five minutes. Measurements were performed with a Microflex spectrometer (Bruker). Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of acquisition was between 30 seconds and 1 minute per spot.
The twelve phRT spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 3,769 bacteria, including spectra from nine validly published Brevibacillus species that were used as reference data in the BioTyper database (updated March 15th, 2012). The identification method uses m/z values from 3,000 to 15,000 Da. For every spectrum, at most 100 peaks were taken into account and compared with the spectra in the database. A score enabled the presumptive identification and discrimination of the tested species from those in the database: a score > 2 with a validated species enabled identification at the species level; a score > 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain phRT, no significant score was obtained, suggesting that our isolate was not a member of a known species. We incremented our database with the spectrum from strain phRT (Figure 4). Finally, the gel view highlights the spectral differences with other members of the genus Brevibacillus (Figure 5).

Figure 5.
The gel view displays the raw spectra of all loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray-scale scheme code. The color bar and the right y-axis indicate the relation between the color a peak is displayed with and the peak intensity in arbitrary units.
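The score thresholds used for MALDI-TOF identification can be sketched as a small decision function. This is a minimal illustration only: the function name `interpret_biotyper_score` is hypothetical, and because the text does not say how scores of exactly 2 or exactly 1.7 are treated, assigning them to the lower category is an assumption made here.

```python
def interpret_biotyper_score(score: float) -> str:
    """Map a MALDI BioTyper score to an identification level.

    Thresholds follow the text: > 2 species level, > 1.7 but < 2 genus
    level, < 1.7 no identification. Scores of exactly 2 or 1.7 are not
    covered by the text; placing them in the lower category is an
    assumption of this sketch.
    """
    if score > 2.0:
        return "species"
    if score > 1.7:
        return "genus"
    return "none"


# Example scores (illustrative values, not measurements from the study).
for s in (2.3, 1.85, 1.2):
    print(s, interpret_biotyper_score(s))
```

Under this scheme, an isolate whose best match falls in the "none" category, as reported for strain phRT, is not assigned to any species in the reference database.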

Genome sequencing information

Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the Brevibacillus genus, and is part of a study of the human digestive flora aiming at isolating all bacterial species within human feces. It was the fifth genome of a Brevibacillus species and the first genome of Brevibacillus massiliensis sp. nov. The genome is deposited in GenBank under accession number CAGW00000000 and consists of 132 contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance [42].

Genome sequencing and assembly
A 3 kb paired-end sequencing strategy (Roche, Meylan, France) was used. Five µg of DNA was mechanically fragmented on the Hydroshear device (Digilab, Holliston, MA, USA) with an enrichment size of 3-4 kb. The DNA fragmentation was visualized through an Agilent 2100 BioAnalyzer on a DNA labchip 7500 with an optimal size of 3.2 kb. The library was constructed according to the 454 GS FLX Titanium paired-end protocol. Circularization and nebulization were performed and generated a pattern with an optimum at 555 bp. After PCR amplification through 17 cycles followed by double size selection, the single-stranded paired-end library was quantified with the Quant-it Ribogreen kit (Invitrogen) on the Genios_Tecan fluorometer at 21 pg/µL. The library concentration equivalence was calculated as 6.94e+07 molecules/µL. The library was stored at -20°C until further use. The 3 kb paired-end library was amplified in 9 emPCR reactions at 1 cpb, and in 2 emPCRs at 0.5 cpb, with the GS Titanium SV emPCR Kit (Lib-L) v2 (Roche). The yield of the 2 types of paired-end emPCR reactions was 7.8% and 11.2%, respectively, within the quality range of 5 to 20% expected from the Roche procedure. Both libraries were loaded onto GS Titanium PicoTiterPlates (PTP Kit 70×75, Roche) and pyrosequenced with the GS Titanium Sequencing Kit XLR70 and the GS FLX Titanium sequencer (Roche). The run was performed overnight and then analyzed on the cluster through the gsRunBrowser and Newbler assembler (Roche). A total of 969,014 passed filter wells were obtained and generated 274 Mb with an average length of 286 bp. The passed filter sequences were assembled using Newbler with 90% identity and 40 bp as overlap. The final assembly identified 31 scaffolds and 129 contigs (>1,500 bp) and generated a genome size of 5.05 Mb, which corresponds to 54.2× coverage.
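The two derived figures in this paragraph, the library concentration in molecules/µL and the sequencing coverage, can be checked with a short calculation. The sketch below is an illustration under stated assumptions, not the authors' documented procedure: it assumes an average mass of 328.3 g/mol per nucleotide of single-stranded DNA and takes the 555 bp optimum as the mean fragment length. The function names are hypothetical.

```python
AVOGADRO = 6.022e23        # molecules per mole
NT_MASS_G_PER_MOL = 328.3  # assumed average mass of one ssDNA nucleotide


def molecules_per_ul(conc_pg_per_ul: float, mean_length_nt: float) -> float:
    """Convert a mass concentration (pg/uL) of single-stranded DNA
    into a number of molecules per microliter."""
    grams_per_ul = conc_pg_per_ul * 1e-12
    return grams_per_ul * AVOGADRO / (NT_MASS_G_PER_MOL * mean_length_nt)


def coverage(total_bases: float, genome_size_bp: float) -> float:
    """Mean sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size_bp


# Values taken from the text: 21 pg/uL, 555 nt, 274 Mb, 5,051,018 bp.
library = molecules_per_ul(21, 555)
depth = coverage(274e6, 5_051_018)

print(f"{library:.2e} molecules/uL")  # about 6.94e+07
print(f"{depth:.1f}x coverage")       # about 54.2x
```

With these assumptions the calculation gives values consistent with the reported 6.94e+07 molecules/µL and 54.2× coverage.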

Genome annotation
Open Reading Frames (ORFs) were predicted using Prodigal [43] with default parameters, but the predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database [44] and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAScanSE tool [45] was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [46] and BLASTN against the GenBank database. Lipoprotein signal peptides and numbers of transmembrane helices were predicted using SignalP [47] and TMHMM [48], respectively. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment length greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have already been used in previous works to define ORFans. To estimate the mean level of nucleotide sequence similarity at the genome level between B. massiliensis strain phRT, B. laterosporus strain LMG15441 (GenBank accession number AFRV00000000), B. brevis strain NBRC100599 (GenBank accession number AP008955) and B. agri strain BAB-2500, we compared genomes two by two and determined the mean percentage of nucleotide sequence identity among orthologous ORFs using BLASTN. Orthologous genes were detected using the Proteinortho software [49].

Genome properties
The genome of B. massiliensis strain phRT is 5,051,018 bp long (1 chromosome but no plasmid) with a G+C content of 53.1% (Figure 6 and Table 4). Of the 5,135 predicted genes, 5,051 were protein-coding genes, and 84 were RNAs. Three rRNA genes (one 16S rRNA, one 23S rRNA and one 5S rRNA) and 81 predicted tRNA genes were identified in the genome. A total of 3,793 genes (73.86%) were assigned a putative function. Three hundred and seventy-eight genes were identified as ORFans (7.36%). The remaining genes were annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Table 4. The distribution of genes into COGs functional categories is presented in Table 5.
Table 4 (footnote). The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.
Table 6 (caption, partial). Average percentage similarity of nucleotides corresponding to orthologous proteins shared between genomes (below diagonal) and the numbers of proteins per genome (bold) [49].
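The gene counts and percentages reported in this section can be cross-checked with simple arithmetic. The sketch below uses only numbers from the text; the variable names are illustrative, and the observation that the percentages are computed against the total of 5,135 predicted genes is an inference from the arithmetic rather than a statement in the source.

```python
# Counts reported in the text.
total_genes = 5135
protein_coding = 5051
rna_genes = 84
rrna_genes = 3
trna_genes = 81
with_function = 3793
orfans = 378


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


# The component counts should add up to the reported totals.
print(protein_coding + rna_genes == total_genes)  # protein-coding + RNA
print(rrna_genes + trna_genes == rna_genes)       # rRNA + tRNA

# Percentages relative to all predicted genes.
print(f"{percent(with_function, total_genes):.3f}%")  # reported as 73.86%
print(f"{percent(orfans, total_genes):.3f}%")         # reported as 7.36%
```

Both percentages agree with the reported values to within rounding of the second decimal when the denominator is the 5,135 predicted genes.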

Conclusion
On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Brevibacillus massiliensis sp. nov., which currently contains strain phRT as its sole member. This bacterial strain was originally isolated in Marseille, France. Colonies are grey and 0.8 mm to 1 mm in diameter on blood-enriched Columbia agar. Cells are rod-shaped with a mean diameter of 0.74 µm and a mean length of 4.3 µm by electron microscopy. Optimal growth is achieved aerobically. Weak growth was observed when cultures were grown under 5% CO2. No growth is observed under microaerophilic or anaerobic conditions. Growth occurs between 30 and 45°C, with optimal growth occurring at 37°C. Cells stain Gram-positive, form endospores and are motile. Cells are positive for catalase, oxidase, alkaline phosphatase, cystine arylamidase, acid phosphatase, naphthol-AS-BI-phosphohydrolase and pyrazinamidase. Asaccharolytic. Cells are susceptible to penicillin G, amoxicillin, amoxicillin + clavulanic acid, ceftriaxone, imipenem, erythromycin, doxycycline, rifampicin, vancomycin, ciprofloxacin, gentamicin and nitrofurantoin, and resistant to metronidazole and trimethoprim/sulfamethoxazole. The G+C content of the genome is 53.1%. The 16S rRNA and genome sequences are deposited in GenBank and EMBL under accession numbers JN837488 and CAGW00000000, respectively. The type strain phRT (= CSUR P177 = DSM 25447) was isolated from the fecal flora of an obese patient in Marseille, France.