Non contiguous-finished genome sequence and description of Alistipes obesi sp. nov.

Alistipes obesi sp. nov. strain ph8T is the type strain of A. obesi, a new species within the genus Alistipes. This strain, whose genome is described here, was isolated from the fecal flora of a 26-year-old woman suffering from morbid obesity. A. obesi is an obligately anaerobic rod. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,162,233 bp long genome (1 chromosome but no plasmid) contains 2,623 protein-coding and 49 RNA genes, including three rRNA genes.


Introduction
Alistipes obesi strain ph8T (CSUR = P186, DSMZ = 25724) is the type strain of A. obesi sp. nov. This bacterium is a Gram-negative, anaerobic, indole-negative bacillus that was isolated from the stool of a French patient suffering from morbid obesity, as part of a culturomics study aiming to cultivate individually all species within human feces [1]. We recently proposed that genomic and proteomic data, which do not suffer from the lack of reproducibility and inter-laboratory comparability that affects the "gold standard" DNA-DNA hybridization and G+C content determination [2], may be included in the official description of new bacterial species [3][4][5][6][7][8][9][10][11][12][13][14]. The genus Alistipes [15] currently comprises five species: A. finegoldii [15], A. indistinctus (Nagai et al. 2010) [16], A. onderdonkii (Song et al. 2006) [17], A. putredinis [15], and A. shahii (Song et al. 2006) [17]. In addition, we recently described two new species, A. senegalensis [6] and A. timonensis [7], that were isolated from the digestive microbiota of an asymptomatic Senegalese patient [1]. Members of the genus Alistipes are strictly anaerobic Gram-negative rods that are closely related to the Bacteroides fragilis group, with which they share bile resistance and indole positivity. Most Alistipes species have been isolated from human specimens, including the normal intestinal flora [17] and cases of bacteremia, appendicitis, and perirectal and brain abscess [18][19][20]. A 16S rRNA phylogenetic analysis revealed that A. obesi was closely related to A. shahii, A. senegalensis and A. timonensis. To the best of our knowledge, A. obesi sp. nov. is the first Alistipes species isolated from the digestive flora of an obese patient. Here we present a summary classification and a set of features for A. obesi sp. nov. strain ph8T, together with the description of the complete genome sequencing and annotation.
These characteristics support the circumscription of the species A. obesi.

Classification and features
A stool sample was collected from a 26-year-old woman living in Marseille, France, who suffered from morbid obesity: BMI = 48.2 (118.8 kg, 1.57 m). At the time of stool sample collection she was not a drug user and was not on a diet. The patient gave informed and signed consent, and the agreement of the local ethics committee of the IFR48 (Marseille, France) was obtained under agreement 11-017. The fecal specimen was preserved at -80°C after collection. Strain ph8 (Table 1) was isolated in 2011 by anaerobic cultivation at 37°C on 5% sheep blood-enriched Columbia agar (BioMerieux, Marcy l'Etoile, France), after 11 days of preincubation of the stool sample with addition of rumen fluid in an anaerobic blood culture bottle.
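The reported BMI can be reproduced from the weight and height given above. The following is a minimal sketch, assuming the standard definition of BMI as weight in kilograms divided by the square of height in meters; the function name is illustrative and not part of the original study.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return the body mass index in kg/m^2 (weight divided by height squared)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Values reported for the patient: 118.8 kg and 1.57 m.
bmi = body_mass_index(118.8, 1.57)
print(f"BMI = {bmi:.1f} kg/m^2")
```

Rounded to one decimal place, the result matches the value of 48.2 stated in the text.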

Standards in Genomic Sciences
This strain exhibited 93.5% 16S rRNA sequence similarity with A. shahii (Song et al. 2006) [17], the phylogenetically closest validated Alistipes species (Figure 1), and 94.26% and 93.38% with A. senegalensis [6] and A. timonensis [7], respectively. Among validly published Alistipes species [31], 16S rRNA sequence similarity ranges from 90.5% between A. indistinctus (Nagai et al. 2010) [16] and A. shahii (Song et al. 2006) [17] to 96.8% between A. finegoldii [8] and A. onderdonkii (Song et al. 2006) [17]. As a consequence, and although strain ph8 exhibited a 16S rRNA sequence similarity with the nearest validly published species lower than the 95.0% cutoff usually regarded as a threshold for the creation of a new genus [32], we considered it a new species within the genus Alistipes.
Evidence codes (table footnote): [...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence. These evidence codes are from the Gene Ontology project [30]. If the code is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.
Growth of the strain was tested under anaerobic and microaerophilic conditions using the GENbag anaer and GENbag microaer systems, respectively (BioMerieux), and in the presence of air, with or without 5% CO2. Optimal growth was achieved anaerobically. No growth was observed under aerobic or microaerophilic conditions. Gram staining showed Gram-negative rods (Figure 2). A motility test was positive. Cells grown on agar had a diameter ranging from 0.44 µm to 0.76 µm, with a mean diameter of 0.61 µm, as determined by electron microscopy (Figure 3). A comparison of seven Alistipes strains is presented in Table 2.
Strain ph8T exhibited catalase activity but was oxidase-negative. Using the API RAPID ID 32A system (BioMerieux), positive reactions were obtained for α-galactosidase, β-galactosidase, N-acetyl-β-glucosaminidase, alkaline phosphatase, leucyl glycine arylamidase, and alanine arylamidase. All other tested reactions were negative, notably nitrate reduction, indole formation, urease, arginine dihydrolase, α- and β-glucosidase, 6-phospho-β-galactosidase, arginine arylamidase, proline arylamidase, phenylalanine arylamidase, leucine arylamidase, pyroglutamic acid arylamidase, tyrosine arylamidase, glycine arylamidase, histidine arylamidase, glutamyl glutamic acid arylamidase, serine arylamidase, and mannose and raffinose fermentation. Using the API ZYM system (BioMerieux), esterase, esterase lipase, acid phosphatase, naphthol-AS-BI-phosphohydrolase and α-galactosidase activities were positive. A. obesi is susceptible to imipenem, ciprofloxacin, metronidazole, nitrofurantoin and rifampicin, but resistant to penicillin G, amoxicillin, amoxicillin-clavulanic acid, erythromycin, vancomycin, gentamicin 15 and gentamicin 500, doxycycline, ceftriaxone and trimethoprim/sulfamethoxazole. By comparison with A. senegalensis, A. obesi differed in motility, α-galactosidase, β-galactosidase, indole production, β-glucuronidase, arginine arylamidase, glycine arylamidase, proline arylamidase and mannose fermentation [6]. By comparison with A. timonensis, A. obesi differed in motility, indole production, β-glucuronidase and N-acetyl-β-glucosaminidase [7]. By comparison with A. putredinis, A. obesi differed in motility, α-galactosidase, β-galactosidase, N-acetyl-β-glucosaminidase and indole production [15]. By comparison with A. finegoldii, A. obesi differed in catalase, α-glucosidase and indole production [15]. Finally, A. obesi differed from A. shahii in indole production, catalase, esterase, esterase lipase and α-glucosidase [17], and from A. indistinctus in α-glucosidase, esterase, esterase lipase and acid phosphatase [16].
Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [33]. Briefly, a pipette tip was used to pick one isolated bacterial colony from a culture agar plate and to spread it as a thin film on an MTP 384 MALDI-TOF target plate (Bruker Daltonics, Leipzig, Germany). Four distinct deposits were made for strain ph8 from four isolated colonies. Each smear was overlaid with 2 µL of matrix solution (saturated solution of alpha-cyano-4-hydroxycinnamic acid) in 50% acetonitrile and 2.5% trifluoroacetic acid, and allowed to dry for five minutes. Measurements were performed with a Microflex spectrometer (Bruker). Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of acquisition was between 30 seconds and 1 minute per spot. The four ph8 spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 3,769 bacteria in the BioTyper database, including the spectra from A. finegoldii, A. onderdonkii, A. shahii, A. senegalensis and A. timonensis used as reference data. The method of identification included the m/z range from 3,000 to 15,000 Da. For every spectrum, 100 peaks at most were taken into account and compared with spectra in the database. A score enabled either identification or non-identification among the tested species: a score ≥ 2 with a validly published species enabled a presumed identification at the species level; a score ≥ 1.7 but < 2 enabled a presumed identification at the genus level; and a score < 1.7 did not enable an identification.
For strain ph8, the obtained score was 1.1, suggesting that this isolate was not a member of a known species. We added the spectrum from strain ph8 to our database (Figure 4). Finally, the gel view highlights the spectral differences between strain ph8 and other members of the genus Alistipes (Figure 5).
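The score interpretation rule used for the MALDI-TOF identification can be sketched as a small function. This is an illustrative sketch rather than part of the BioTyper software: the function name and return labels are hypothetical, and scores of exactly 2 and exactly 1.7 are assigned here to the higher category, which is an assumption about the boundary cases.

```python
def interpret_identification_score(score: float) -> str:
    """Map a pattern-matching score to an identification level.

    Thresholds follow the text: 2 for species level, 1.7 for genus level.
    Treating the boundary values as inclusive is an assumption.
    """
    if score >= 2.0:
        return "species-level identification"
    if score >= 1.7:
        return "genus-level identification"
    return "no identification"


# Score reported for strain ph8.
print(interpret_identification_score(1.1))
```

With the score of 1.1 obtained for strain ph8, the function returns "no identification", consistent with the conclusion that the isolate did not match a known species.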

Genome sequencing information
The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the genus Alistipes, and is part of a study of the human digestive flora aiming at isolating all bacterial species contained within human feces. It was the seventh genome of an Alistipes species and the first genome of Alistipes obesi sp. nov. The genome sequence is deposited in EMBL under accession number CAHA00000000 and consists of 59 contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance [21].
Figure 5 (gel view). The gel view displays the raw spectra of all loaded spectrum files arranged in a pseudo-gel-like look. The x-axis records the m/z value. The left y-axis displays the running spectrum number originating from subsequent spectra loading. The peak intensity is expressed by a gray scale scheme. The color bar and the right y-axis indicate the relation between the color a peak is displayed with and the peak intensity in arbitrary units.

Genome annotation
Open Reading Frames (ORFs) were predicted using Prodigal [34] with default parameters, but predicted ORFs were excluded if they spanned a sequencing gap. The predicted bacterial protein sequences were searched against the GenBank database and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAscan-SE tool [35] was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [36] and BLASTn against GenBank. Signal peptides and transmembrane helices were predicted using SignalP [37] and TMHMM [38], respectively. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment lengths greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have already been used in previous works to define ORFans. To estimate the mean level of nucleotide sequence similarity at the genome level between Alistipes obesi strain ph8T and other members of the genus Alistipes, we compared genomes two by two and determined the mean percentage of nucleotide sequence identity among orthologous ORFs using BLASTn. Orthologous genes were detected using the Proteinortho software [39]. We compared A. obesi strain ph8T with A. finegoldii strain AHN 2437 (GenBank accession number CP003274), A. indistinctus strain YIT 12060 (ADLD00000000), A. putredinis strain DSM 17216 (ABFK00000000), A. senegalensis strain JC50T (CAHI00000000), A. shahii strain WAL 8301 (FP929032) and A. timonensis strain JC136T (CAEG00000000).
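The length-dependent E-value thresholds described above can be illustrated with a short sketch. It only encodes the two cutoffs stated in the text; the function names are hypothetical, the handling of an alignment of exactly 80 amino acids (treated here with the stricter 1e-05 cutoff) is an assumption, and how the test result feeds into the final ORFan call is not specified in the text, so the sketch stops at the threshold comparison.

```python
def evalue_cutoff(alignment_length_aa: int) -> float:
    """Return the BLASTP E-value cutoff for a given alignment length.

    1e-03 for alignments longer than 80 amino acids, 1e-05 otherwise.
    Treating a length of exactly 80 as 'short' is an assumption.
    """
    return 1e-3 if alignment_length_aa > 80 else 1e-5


def evalue_below_cutoff(evalue: float, alignment_length_aa: int) -> bool:
    """Check whether an E-value is lower than the length-dependent cutoff."""
    return evalue < evalue_cutoff(alignment_length_aa)


# The same E-value passes for a long alignment but not for a short one.
print(evalue_below_cutoff(1e-4, 120))
print(evalue_below_cutoff(1e-4, 60))
```

The stricter cutoff for short alignments reflects that short matches are more likely to reach a moderate E-value by chance.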

Genome properties
The genome is 3,162,233 bp long (1 chromosome, but no plasmid) with a 58.6% G+C content (Table 4, Figure 6). Of the 2,672 predicted genes, 2,623 were protein-coding genes and 49 were RNAs. A total of 1,409 genes (52.75%) were assigned a putative function. One hundred twenty-seven genes were identified as ORFans (4.8%). The remaining genes were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 5. The properties and the statistics of the genome are summarized in Tables 4 and 5. In these tables, the total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome. (Table 6).
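For readers unfamiliar with the G+C statistic, the following minimal sketch shows how a G+C percentage is typically computed from a nucleotide sequence. It is not the pipeline used in this study: the function name is illustrative, and excluding ambiguous bases such as N from the denominator is an assumption.

```python
def gc_content_percent(sequence: str) -> float:
    """Return the percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Toy sequence: 4 of the 6 unambiguous bases are G or C.
print(f"{gc_content_percent('ATGCGCNN'):.1f}%")
```

For a draft genome split across several contigs, the same calculation would be applied to the concatenated contig sequences.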

Conclusion
On the basis of phenotypic, phylogenetic and genomic analyses (taxono-genomics), we formally propose the creation of A. obesi sp. nov., which contains strain ph8T. This bacterium was cultivated from an obese patient in Marseille, France.