Non-contiguous finished genome sequence and description of Bacillus massilioanorexius sp. nov.

Bacillus massilioanorexius strain AP8T sp. nov. is the type strain of B. massilioanorexius sp. nov., a new species within the genus Bacillus. This strain, whose genome is described here, was isolated from the fecal flora of a 21-year-old Caucasian French female suffering from a severe form of anorexia nervosa since the age of 12 years. B. massilioanorexius is a Gram-positive aerobic bacillus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 4,616,135 bp long genome (one chromosome but no plasmid) contains 4,432 protein-coding and 87 RNA genes, including 8 rRNA genes.


Introduction
Bacillus massilioanorexius strain AP8T (= CSUR P201 = DSM 26092) is the type strain of B. massilioanorexius sp. nov. This bacterium is a Gram-positive, non-spore-forming, aerobic and motile bacillus that was isolated from the stool of a 21-year-old Caucasian French female who had suffered from a severe form of anorexia nervosa since the age of 12 years. It was isolated as part of a "culturomics" study aiming to cultivate individually all species within human feces [1][2][3], and it was one of the 11 new bacterial species isolated from this single stool sample [3].
The current classification of Bacteria and Archaea remains a subject of debate and currently relies on a combination of phenotypic and genotypic characteristics [4]. Genomic data have not yet been routinely incorporated into descriptions. However, as more than 6,000 bacterial genomes have been sequenced, including 982 type strains [5,6], and another 15,000 genome projects are ongoing, including 2,120 type strains [5,6], we recently proposed integrating genomic information into the description of new bacterial species.
The genus Bacillus (Cohn 1872) was created in 1872 [29]. It consists mainly of Gram-positive, motile, spore-forming bacteria classified within 251 species and 3 subspecies with validly published names [30]. Members of the genus Bacillus are ubiquitous bacteria isolated from various environments including soil, fresh and sea water, and food. In humans, Bacillus species may be opportunists in immunocompromised patients [31] or pathogens, such as B. anthracis [32] and B. cereus. In addition to these two species, various Bacillus species may be involved in a variety of nonspecific human infections, including cutaneous, ocular, central nervous system or bone infections, pneumonia, endocarditis and bacteremia [33].
Here we present a summary classification and a set of features for B. massilioanorexius sp. nov. strain AP8T (= CSUR P201 = DSM 26092), together with the description of the complete genomic sequence and its annotation. These characteristics support the circumscription of the species B. massilioanorexius.

Classification and information
A stool sample was collected from a 21-year-old Caucasian French female who had suffered from a severe restrictive form of anorexia nervosa since the age of 12 years. She was hospitalized in the nutrition unit of our hospital for recent aggravation of her medical condition. At the time of hospitalization, her weight and height were 27.7 kg and 1.63 m, respectively (BMI: 10.4 kg/m²). The patient gave informed and signed consent. This study and the assent procedure were approved by the Ethics Committee of the Institut Fédératif de Recherche IFR48, Faculty of Medicine, Marseille, France (agreement 09-022). The fecal specimen was preserved at -80°C after collection. Strain AP8T (Table 1) was isolated in March 2012 by aerobic cultivation on Columbia agar (BioMerieux, Marcy l'Etoile, France) after one month of preincubation of the stool sample in a blood culture bottle with the addition of 5 mL of sheep rumen. This strain exhibited 97% 16S rRNA gene sequence similarity with B. simplex [34], the phylogenetically closest validated Bacillus species (Figure 1). This value was lower than the 98.7% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new species without carrying out DNA-DNA hybridization [35].
Table 1 footnote, evidence codes: [...] (not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [46]. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.
Figure 1. Phylogenetic tree highlighting the position of Bacillus massilioanorexius strain AP8T (JX101689) relative to a selection of type strains of validly published species of the genus Bacillus. GenBank accession numbers are indicated in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained using the maximum-likelihood method within the MEGA program. Numbers at the nodes are percentages of bootstrap values obtained by repeating the analysis 500 times to generate a majority consensus tree. Clostridium botulinum was used as the outgroup. The scale bar represents a 2% nucleotide sequence divergence.
Different growth temperatures (25, 30, 37, 45°C) were tested. Growth was observed between 25 and 45°C, with optimal growth at 37°C after 24 hours of incubation. Colonies were 3 mm in diameter and 0.5 mm in thickness, gray in color, with a coarse appearance on blood-enriched Columbia agar. Growth of the strain was tested under anaerobic and microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively (BioMerieux), and under aerobic conditions.
Strain AP8T exhibited catalase and oxidase activity. Substrate oxidation and assimilation were examined with an API 50CH strip (BioMerieux) at the optimal growth temperature. Positive reactions were observed for D-glucose, D-fructose, D-saccharose, ribose, mannose, mannitol and D-trehalose, and weak reactions were observed for L-rhamnose, esculin, salicin, D-cellobiose and gentiobiose. Using an API 20E strip (BioMerieux, Marcy l'Etoile), positive reactions were observed for tryptophan deaminase, acetoin and gelatinase production. Negative reactions were found for urease and indole production. B. massilioanorexius is susceptible to amoxicillin, rifampicin, ciprofloxacin, gentamicin, doxycycline and vancomycin but resistant to trimethoprim/sulfamethoxazole and metronidazole. When compared with representative species from the genus Bacillus, B. massilioanorexius strain AP8T exhibited the phenotypic differences detailed in Table 2.

Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [47] using a Microflex spectrometer (Bruker Daltonics, Leipzig, Germany). Twelve individual colonies were deposited on an MTP 384 MALDI-TOF target plate (Bruker). The twelve AP8T spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 3,769 bacteria, including 129 spectra from 98 validly named Bacillus species, used as reference data in the BioTyper database. A score enabled the presumptive identification and discrimination of the tested species from those in the database: a score > 2 with a validated species enabled identification at the species level, and a score < 1.7 did not enable any identification. For strain AP8T, no significant score was obtained, suggesting that our isolate was not a member of any known species (Figures 4 and 5).
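The score thresholds stated above can be expressed as a small decision rule. The following is a minimal sketch in Python, not the BioTyper software's own logic: the function names and the example scores are hypothetical, and because the text does not say how scores between 1.7 and 2 are handled, that range is returned as unspecified rather than guessed.

```python
from typing import Iterable


def interpret_score(score: float) -> str:
    """Map a pattern-matching score to the categories given in the text."""
    if score > 2.0:
        # Text: a score > 2 with a validated species allows species-level identification.
        return "species-level identification"
    if score < 1.7:
        # Text: a score < 1.7 does not allow any identification.
        return "no identification"
    # Assumption: the text is silent on scores from 1.7 to 2.0 inclusive.
    return "not specified in the text"


def interpret_best_score(scores: Iterable[float]) -> str:
    """Interpret the highest score among replicate spectra (hypothetical helper)."""
    return interpret_score(max(scores))


if __name__ == "__main__":
    # Hypothetical scores for twelve replicate spectra, for illustration only.
    example_scores = [1.21, 1.35, 1.18, 1.40, 1.29, 1.33,
                      1.25, 1.38, 1.22, 1.31, 1.27, 1.36]
    print(interpret_best_score(example_scores))
```

Taking the best score across replicates is one possible convention; a stricter reading would require every replicate to exceed the threshold.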

Genome sequencing information
Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the genus Bacillus, and is part of a "culturomics" study of the human digestive flora aiming at isolating all bacterial species within human feces. It was the twenty-seventh genome of a Bacillus species and the first genome of Bacillus massilioanorexius sp. nov. The GenBank accession number is CAPG00000000, and the assembly consists of 120 contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance [48].
[Figure caption, spectra gel view] The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray scale scheme code. The color bar and the right y-axis indicate the relation between the color a peak is displayed with and the peak intensity in arbitrary units. Displayed species are indicated on the left.

Genome annotation
Open Reading Frames (ORFs) were predicted using Prodigal [49] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database [50] and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAscan-SE tool [51] was used to find tRNA genes, whereas ribosomal RNAs were found using RNAmmer [52] and BLASTn against the GenBank database. Lipoprotein signal peptides and the number of transmembrane helices were predicted using SignalP [53] and TMHMM [54], respectively. ORFans were identified if their BLASTP E-value was lower than 1e-03 for an alignment length greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an [...] [56] was used for data management and DNA Plotter [57] was used for visualization of genomic features. The Mauve alignment tool was used for multiple genomic sequence alignment and visualization [58].

Genome properties
The genome of B. massilioanorexius strain AP8T is 4,616,135 bp long (1 chromosome, but no plasmid) with a 34.10% G+C content (Figure 6 and Table 4). Of the 4,519 predicted genes, 4,432 were protein-coding genes and 87 were RNAs. Eight rRNA genes (one 16S rRNA, one 23S rRNA and six 5S rRNA) and 79 predicted tRNA genes were identified in the genome. A total of 3,290 genes (72.80%) were assigned a putative function. Three hundred fifty-four genes were identified as ORFans (7.98%). The remaining genes were annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Table 4 and Table 5. The distribution of genes into COGs functional categories is presented in Table 5.
Table 3 (recoverable rows). GOLD ID: Gi20708. MIGS-13, Project relevance: Study of the human gut microbiome.
Table footnote: The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.
Table 6 summarizes the numbers of orthologous genes and the average percentage of nucleotide sequence identity between the different genomes studied.
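The gene counts and percentages reported under Genome properties can be cross-checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the choice of denominator for each percentage is an assumption, since the text does not state it. Under that assumption, 72.80% is reproduced with the total gene count as denominator, while the ORFan figure lies closest to the value obtained with the protein-coding gene count (7.99% when rounded, against the reported 7.98%).

```python
# Counts as reported in the text.
GENOME_LENGTH_BP = 4_616_135
TOTAL_GENES = 4_519
PROTEIN_CODING_GENES = 4_432
RNA_GENES = 87
RRNA_GENES = {"16S": 1, "23S": 1, "5S": 6}
TRNA_GENES = 79
GENES_WITH_PUTATIVE_FUNCTION = 3_290
ORFANS = 354


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def counts_are_consistent() -> bool:
    """Check that the reported sub-totals add up to the reported totals."""
    genes_ok = PROTEIN_CODING_GENES + RNA_GENES == TOTAL_GENES
    rna_ok = sum(RRNA_GENES.values()) + TRNA_GENES == RNA_GENES
    return genes_ok and rna_ok


def genes_per_kb() -> float:
    """Protein-coding genes per kilobase of genome sequence."""
    return PROTEIN_CODING_GENES / (GENOME_LENGTH_BP / 1000)


if __name__ == "__main__":
    print("counts consistent:", counts_are_consistent())
    # Assumed denominator: all predicted genes.
    print("putative function, % of all genes:",
          round(percent(GENES_WITH_PUTATIVE_FUNCTION, TOTAL_GENES), 2))
    # Assumed denominator: protein-coding genes.
    print("ORFans, % of protein-coding genes:",
          round(percent(ORFANS, PROTEIN_CODING_GENES), 2))
    print("protein-coding genes per kb:", round(genes_per_kb(), 3))
```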

Conclusion
On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Bacillus massilioanorexius sp. nov., which contains the strain AP8T. The strain was isolated in Marseille, France.

Description of Bacillus massilioanorexius sp. nov.
Bacillus massilioanorexius (ma.si.li.o.a.no.rex'i.us. L. masc. adj. massilioanorexius, combination of Massilia, the Latin name of Marseille, France, where the type strain was isolated, and anorexia, the disease presented by the patient from whom the strain was cultivated). Colonies are 3 mm in diameter and 0.5 mm in thickness, gray in color, with a coarse appearance on blood-enriched Columbia agar. Cells are rod-shaped with a mean diameter of 0.77 µm. Optimal growth occurs aerobically; weak growth is observed under anaerobic conditions. Growth occurs between 25 and 45°C, with optimal growth observed at 37°C. Cells stain Gram-positive, are non-endospore-forming and motile, and are catalase-positive and oxidase-positive. D-glucose, D-fructose, D-saccharose, D-trehalose, ribose, mannitol and mannose are used as carbon sources. Positive reactions are observed for tryptophan deaminase, acetoin and gelatinase production. Weak reactions are obtained for L-rhamnose, esculin, salicin, D-cellobiose and gentiobiose. Cells are susceptible to amoxicillin, rifampicin, ciprofloxacin, gentamicin, doxycycline and vancomycin but resistant to trimethoprim/sulfamethoxazole and metronidazole. The G+C content of the genome is 34.10%. The 16S rRNA and genome sequences are deposited in GenBank under accession numbers JX101689 and CAPG00000000, respectively. The type strain AP8T (= CSUR P201 = DSM 26092) was isolated from the fecal flora of a female suffering from anorexia nervosa in Marseille, France.