Non-contiguous finished genome sequence and description of Alistipes timonensis sp. nov.

Alistipes timonensis strain JC136T sp. nov. is the type strain of A. timonensis sp. nov., a new species within the genus Alistipes. This strain, whose genome is described here, was isolated from the fecal flora of a healthy patient. A. timonensis is an obligate anaerobic rod. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,497,779 bp long genome (one chromosome but no plasmid) contains 2,742 predicted genes: 2,692 protein-coding genes and 50 RNA genes, including three rRNA genes.


Introduction
Alistipes timonensis strain JC136T (= CSUR P148 = DSM 25383) is the type strain of A. timonensis sp. nov. This bacterium is a Gram-negative, anaerobic, indole-positive bacillus and was isolated from the stool of a healthy Senegalese patient as part of a "culturomics" study aiming at cultivating individually all species within human feces.
With more than 3,000 genome sequences available, bacterial genomics has revolutionized several aspects of microbiology. To date, however, taxonomy has remained unaffected by this progress, despite the debate around the definition of bacterial species. Despite its high cost and its poor reproducibility and inter-laboratory comparability, DNA-DNA hybridization remains the "gold standard" criterion [1]. Even the application of internationally validated cutoff values for 16S rRNA sequence similarity, which enabled the taxonomic classification or reclassification of hundreds of taxa, is debated [2]. High-throughput genome sequencing and mass spectrometric analyses of bacteria provide access to a wealth of genetic and proteomic information [3]. We propose to use a polyphasic approach [4] to describe new bacterial taxa that includes their genome sequence, MALDI-TOF spectrum and main phenotypic characteristics (habitat, Gram-stain reaction, culture and metabolic characteristics, and when applicable, pathogenicity).
Here we present a summary classification and a set of features for A. timonensis sp. nov. strain JC136T together with the description of the complete genomic sequencing and annotation. These characteristics support the circumscription of the species A. timonensis.
The genus Alistipes (Rautio et al. 2003) was created in 2003 [5]. To date, this genus, composed of bile-resistant, strictly anaerobic and Gram-negative bacilli, contains five species: A. finegoldii (Rautio et al. 2003) [5], A. indistinctus (Nagai et al. 2010) [6], A. onderdonkii (Song et al. 2006) [7], A. putredinis (Weinberg et al. 1937) Rautio et al. 2003 [5], and A. shahii (Song et al. 2006) [7]. Pigment production, initially considered characteristic of Alistipes species, was recently demonstrated to be inconstant [8]. Alistipes species are members of the normal human intestinal microbiota, but have also been reported in urine and the mouth [7], and have occasionally been isolated from abdominal, appendiceal and rectal abscesses, blood cultures from colon cancer patients [9], and feces from children with irritable bowel syndrome [10]. A. putredinis was also demonstrated to be associated with cruciferous vegetable intake [11]. In addition, A. finegoldii has been suspected to act as a growth promoter in chickens [12].

Classification and features
A stool sample was collected from a healthy 16-year-old male Senegalese volunteer patient living in Dielmo (a rural village in the Guinean-Sudanian zone in Senegal), who was included in a research protocol. The patient gave informed and signed consent, and the approval of the National Ethics Committee of Senegal and of the local ethics committee of the IFR48 (Marseille, France) was obtained under agreement 09-022. The fecal specimen was preserved at -80°C after collection and sent to Marseille. Strain JC136 (Table 1) was isolated in June 2011 by anaerobic cultivation on 5% sheep blood-enriched Columbia agar (BioMerieux, Marcy l'Etoile, France). This strain exhibited 16S rRNA gene sequence similarities of 96.98% and 98.13% with A. shahii (Song et al. 2006) and A. senegalensis (Mishra et al. 2012), respectively, the phylogenetically closest validated Alistipes species (Figure 1) [7]. Both values were lower than the 98.7% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new species without carrying out DNA-DNA hybridization [2]. It should be noted that both A. senegalensis strain JC50T and strain JC136 were cultivated from the same individual.
Note to Table 1: the evidence codes are from the Gene Ontology project [20]. If the evidence is IDA, the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements; a further code covers properties not directly observed for the living, isolated sample but based on a generally accepted property for the species, or anecdotal evidence.
Different growth temperatures (25, 30, 37, 45°C) were tested; no growth occurred at 25°C and 45°C, growth occurred at 30°C, and optimal growth was observed at 37°C. Colonies were 0.2 mm to 0.3 mm in diameter on blood-enriched Columbia agar and Brain Heart Infusion (BHI) agar.
Growth of the strain was tested under anaerobic and microaerophilic conditions using GENbag anaer and GENbag microaer systems, respectively (BioMerieux), and aerobically, with or without 5% CO2. Optimal growth was achieved anaerobically. No growth was observed in aerobic, microaerophilic or 5% CO2 atmospheres. Gram staining showed Gram-negative rods (Figure 2). A motility test was negative. Cells grown on agar have a mean diameter of 0.62 µm (Figure 3) and produce brown pigment. Strain JC136T exhibited catalase activity but no oxidase activity, and was resistant to 20% bile. Using API Rapid ID 32A, a positive reaction was obtained for α-galactosidase, β-galactosidase, β-glucuronidase, glutamic acid decarboxylase, leucyl glycine arylamidase and alanine arylamidase. Weak reactions were obtained for indole production and N-acetyl-β-glucosaminidase. No mannose or raffinose fermentation was observed. A. timonensis is susceptible to penicillin G, amoxicillin + clavulanic acid, imipenem, clindamycin and metronidazole, and resistant to vancomycin. By comparison with A. senegalensis, strain JC136T differed in mannose fermentation and in proline arylamidase, arginine arylamidase and glycine arylamidase activities. By comparison with A. shahii, strain JC136T differed in catalase activity and in mannose and raffinose fermentation [7].
Matrix-assisted laser-desorption/ionization time-of-flight (MALDI-TOF) MS protein analysis was carried out as previously described [21]. Briefly, a pipette tip was used to pick one isolated bacterial colony from a culture agar plate and to spread it as a thin film on a MTP 384 MALDI-TOF target plate (Bruker Daltonics, Leipzig, Germany). Four distinct deposits were made for strain JC136 from four isolated colonies. Each smear was overlaid with 2 µL of matrix solution (saturated solution of alpha-cyano-4-hydroxycinnamic acid) in 50% acetonitrile, 2.5% trifluoroacetic acid, and allowed to dry for five minutes.
Measurements were performed with a Microflex spectrometer (Bruker). Spectra were recorded in the positive linear mode for the mass range of 2,000 to 20,000 Da (parameter settings: ion source 1 (IS1), 20 kV; IS2, 18.5 kV; lens, 7 kV). A spectrum was obtained after 675 shots at a variable laser power. The time of acquisition was between 30 seconds and 1 minute per spot. The four JC136 spectra were imported into the MALDI BioTyper software (version 2.0, Bruker) and analyzed by standard pattern matching (with default parameter settings) against the main spectra of 2,843 bacteria in the BioTyper database, including the spectra from A. finegoldii, A. onderdonkii and A. shahii, which were used as reference data. The method of identification included the m/z range from 3,000 to 15,000 Da. For every spectrum, 100 peaks at most were taken into account and compared with spectra in the database. The resulting score determined whether the isolate could be identified: a score > 2 with a validated species enabled identification at the species level; a score > 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain JC136, the obtained score was 1.2, suggesting that our isolate was not a member of a known species. We added the spectrum from strain JC136 to our database (Figure 4). The spectrum was made available online in our free-access URMS database [22].
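The score interpretation described above can be sketched as a small function. This is only an illustration of the stated cutoffs, not part of the BioTyper software; the function name is hypothetical, and because the text does not say how scores of exactly 2 or exactly 1.7 are handled, the sketch assumes they fall into the lower category.

```python
def interpret_biotyper_score(score: float) -> str:
    """Map a pattern-matching score to the identification level given in the text.

    Assumption: boundary values (exactly 2.0 or 1.7) are assigned to the
    lower category, since the text leaves them unspecified.
    """
    if score > 2.0:
        return "species"   # identification at the species level
    if score > 1.7:
        return "genus"     # identification at the genus level only
    return "none"          # no reliable identification


# Score reported for strain JC136
print(interpret_biotyper_score(1.2))
```

Under these cutoffs, the score of 1.2 obtained for strain JC136 falls in the category where no identification is possible.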

Genome sequencing information
Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the genus Alistipes, and is part of a "culturomics" study of the human digestive flora aiming at isolating all bacterial species within human feces. It was the third genome of an Alistipes species and the first genome of Alistipes timonensis sp. nov. The genome, deposited in EMBL under accession number CAEG00000000, consists of 23 contigs. Table 2 summarizes the project information and its association with MIGS version 2.0 compliance [5].

Genome sequencing and assembly
Both shotgun and 3-kb paired-end sequencing were performed. The shotgun library was constructed with 500 ng of DNA with the GS Rapid library Prep kit (Roche). For the paired-end sequencing, 5 µg of DNA was mechanically fragmented on a Hydroshear device (Digilab, Holliston, MA, USA) with an enrichment size of 3-4 kb.
The DNA fragmentation was visualized using the 2100 BioAnalyzer (Agilent, Massy, France) on a DNA labchip 7500 with an optimal size of 3.393 kb. The library was constructed according to the 454 GS FLX Titanium paired-end protocol. Circularization and nebulization were performed and generated a pattern with an optimal size of 423 bp. After PCR amplification through 15 cycles followed by double size selection, the single-stranded paired-end library was quantified using the Genios fluorometer (Tecan) at 205 pg/µL. The library concentration equivalence was calculated as 8.87E+08 molecules/µL. The library was stored at -20°C until further use. The shotgun and paired-end libraries were clonally amplified with 3 cpb and 1 cpb, respectively, in 2×8 emPCR reactions with the GS Titanium SV emPCR Kit (Lib-L) v2 (Roche
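The conversion from a mass concentration to a library concentration equivalence can be illustrated with a short calculation. The sketch below is not taken from the protocol used here: it assumes a single-stranded library and an average mass of 328.3 g/mol per nucleotide, and the function name is hypothetical. With the reported 205 pg/µL and 423 bp it gives a value close to the reported 8.87E+08 molecules/µL.

```python
AVOGADRO = 6.022e23        # molecules per mole
NT_MASS_G_PER_MOL = 328.3  # assumed average mass of one nucleotide (ssDNA)


def molecules_per_ul(conc_pg_per_ul: float, fragment_length_nt: float) -> float:
    """Estimate library molecules per microliter from a mass concentration."""
    grams_per_ul = conc_pg_per_ul * 1e-12                    # pg -> g
    grams_per_mole = fragment_length_nt * NT_MASS_G_PER_MOL  # mass of one mole of fragments
    return grams_per_ul / grams_per_mole * AVOGADRO


# Values reported in the text: 205 pg/uL and an optimal fragment size of 423 bp
estimate = molecules_per_ul(205, 423)
print(f"{estimate:.2e} molecules/uL")
```

The small difference from the reported figure would be expected if a slightly different per-nucleotide mass or fragment size was used in the original calculation.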

Genome annotation
Open Reading Frames (ORFs) were predicted using Prodigal [23] with default parameters; predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database and the Clusters of Orthologous Groups (COG) databases using BLASTP. The tRNAscan-SE tool [24] was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer [25] and BLASTN against GenBank.
ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment lengths greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have already been used in previous works to define ORFans. To estimate the mean level of nucleotide sequence similarity at the genome level between Alistipes species, we compared only the ORFs, using BLASTN with the following parameters: a query coverage of > 70% and a minimum nucleotide length of 100 bp.
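The exclusion of ORFs that span a sequencing gap can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes that gaps appear as runs of N in the assembled sequence, that ORF coordinates are 0-based half-open intervals, and that any overlap with a gap counts as spanning it. All function names are hypothetical.

```python
import re
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open: [start, end)


def find_gaps(sequence: str) -> List[Interval]:
    """Return the coordinates of every run of N (assumed to mark a gap)."""
    return [(m.start(), m.end()) for m in re.finditer(r"[Nn]+", sequence)]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def drop_orfs_spanning_gaps(orfs: List[Interval], gaps: List[Interval]) -> List[Interval]:
    """Keep only the ORFs that do not overlap any gap."""
    return [orf for orf in orfs if not any(overlaps(orf, gap) for gap in gaps)]


# Toy scaffold: two short ORFs separated by a 10-base gap
scaffold = "ATGAAATAA" + "N" * 10 + "ATGCCCTAA"
orfs = [(0, 9), (6, 22), (19, 28)]  # the middle ORF runs into the gap
kept = drop_orfs_spanning_gaps(orfs, find_gaps(scaffold))
print(kept)
```

In this toy example only the ORF that overlaps the run of N is removed.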
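One way to read the E-value thresholds above is as a length-dependent test for a significant BLASTP hit, with an ORF treated as an ORFan when none of its hits passes the test. That reading is an assumption, since the text does not spell it out; the class and function names below are hypothetical, and an alignment of exactly 80 amino acids is arbitrarily treated as short.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BlastpHit:
    evalue: float
    alignment_length: int  # in amino acids


def is_significant(hit: BlastpHit) -> bool:
    """Apply the length-dependent E-value cutoff described in the text."""
    if hit.alignment_length > 80:
        return hit.evalue < 1e-3
    # Assumption: alignments of 80 amino acids or fewer use the stricter cutoff
    return hit.evalue < 1e-5


def is_orfan(hits: Iterable[BlastpHit]) -> bool:
    """Assumed reading: an ORF is an ORFan if it has no significant hit."""
    return not any(is_significant(h) for h in hits)


print(is_orfan([]))                                   # no hit at all
print(is_orfan([BlastpHit(1e-4, 50)]))                # short alignment, weak E-value
print(is_orfan([BlastpHit(1e-4, 120)]))               # long alignment, passes 1e-03
```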

Genome properties
The genome is 3,497,779 bp long (one chromosome, no plasmid) with a 58.82% GC content (Table 3, Figure 5). Of the 2,742 predicted genes, 2,692 were protein-coding genes, and 50 were RNAs. A total of 1,885 genes (70.02%) were assigned a putative function. Seventy-eight genes were identified as ORFans (2.9%). The remaining genes were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4. The properties and the statistics of the genome are summarized in Tables 3 and 4.
Note to Table 3: a) The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.
Note to Table 4: The total is based on the total number of protein-coding genes in the annotated genome.
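The percentages above can be reproduced from the gene counts when the number of protein-coding genes is taken as the denominator, as the table notes indicate. The short check below uses only figures given in the text; the helper name is hypothetical.

```python
total_genes = 2742
rna_genes = 50
protein_coding = total_genes - rna_genes  # 2,692 protein-coding genes


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


with_function = percent(1885, protein_coding)  # genes with a putative function
orfans = percent(78, protein_coding)           # genes identified as ORFans

print(protein_coding, with_function, orfans)
```

Both values agree with the reported 70.02% and 2.9%.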

Conclusion
On the basis of phenotypic, phylogenetic and genomic analyses, we formally propose the creation of Alistipes timonensis sp. nov. that contains the strain JC136T. This bacterium was cultivated from a healthy Senegalese individual, from whom A. senegalensis strain JC50T was also cultivated, suggesting that the human fecal flora may contain several undescribed bacterial species that may be isolatable through diversification of culture conditions.
Colonies are 0.2 to 0.3 mm in diameter and produce brown pigment on blood-enriched Columbia agar and Brain Heart Infusion (BHI) agar. Cells are rod-shaped with a mean diameter of 0.62 µm. Optimal growth is achieved anaerobically. No growth is observed in aerobic or microaerophilic conditions.
Growth occurs between 30 and 37°C, with optimal growth observed at 37°C, in BHI medium + 5% NaCl. Cells stain Gram-negative and are non-motile. Catalase, α-galactosidase, β-galactosidase, β-glucuronidase, glutamic acid decarboxylase, leucyl glycine arylamidase, N-acetyl-β-glucosaminidase and alanine arylamidase activities are present. Indole production is also present. Oxidase activity is absent. Cells are susceptible to penicillin G, amoxicillin + clavulanic acid, imipenem, clindamycin and metronidazole. The G+C content of the genome is 58.82%. The 16S rRNA and genome sequences are deposited in GenBank under accession numbers JF824799 and CAEG00000000, respectively.
A. timonensis is an obligate anaerobic Gram-negative bacterium. Grows on axenic medium at 37°C in an anaerobic atmosphere. Not motile.
The type strain JC136T (= CSUR P148 = DSM 25383) was isolated from the fecal flora of a healthy patient in Senegal.