Complete genome sequence of Ilyobacter polytropus type strain (CuHbu1T)

Ilyobacter polytropus Stieb and Schink 1984 is the type species of the genus Ilyobacter, which belongs to the fusobacterial family Fusobacteriaceae. The species is of interest because its members are able to ferment quite a number of sugars and organic acids. I. polytropus has a broad versatility in using various fermentation pathways. Also, its members do not degrade poly-β-hydroxybutyrate but only the monomeric 3-hydroxybutyrate. This is the first completed genome sequence of a member of the genus Ilyobacter and the second sequence from the family Fusobacteriaceae. The 3,132,314 bp long genome with its 2,934 protein-coding and 108 RNA genes consists of two chromosomes (2 and 1 Mbp long) and one plasmid, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.


Introduction
Strain CuHbu1T (= DSM 2926 = ATCC 51220 = LMG 16218) is the type strain of I. polytropus, which is the type species of the genus Ilyobacter [1,2]. Currently, there are four species placed in the genus Ilyobacter [1]. The generic name derives from the Greek word 'ilus' meaning 'mud' and the Neo-Latin word 'bacter' meaning 'a rod', referring to a mud-inhabiting rod [2]. The species epithet is derived from the Neo-Latin word 'polytropus' meaning 'versatile', referring to the metabolic versatility of the species [2]. I. polytropus strain CuHbu1T was isolated from marine anoxic mud in Cuxhaven, Germany, and described by Stieb and Schink in 1984 [2]. No further isolates have been obtained for I. polytropus. Members of the genus Ilyobacter were isolated from anoxic marine sediments in Germany [2] and Italy [3,4], and from sediments of estuarine origin [5]. Here we present a summary classification and a set of features for I. polytropus CuHbu1T, together with the description of the complete genomic sequencing and annotation.

Classification and features
The 16S rRNA gene sequence of I. polytropus shares the highest degree of sequence similarity with the type strains of two other members of the genus, I. insuetus (97.3%) and I. tartaricus (98.3%); the latter was isolated from anoxic marine sediment of Canal Grande and Rio Martin in Venice, Italy. The degree of sequence identity with the type strains of the other members of the family Fusobacteriaceae varies between 89.5% and 97.8%, with Propionigenium modestum as the most similar species [6] (Figure 1). The genome survey sequence database (gss) contains the 16S rRNA gene sequence of human gut metagenome clone 5192b-5192b-A-con-04 (FI579563) as the best hit, which is 91% identical to the 16S rRNA gene sequence of strain CuHbu1T. No phylotypes from the environmental samples database (env_nt) could be linked to the species I. polytropus or even the genus Ilyobacter, indicating a rather rare occurrence of these in the habitats screened so far (as of October 2010). A representative genomic 16S rRNA sequence of I. polytropus was compared using NCBI BLAST under default values (e.g., considering only the best 250 hits) with the most recent release of the Greengenes database [17], and the relative frequencies of taxa and keywords, weighted by BLAST scores, were determined. The four most frequent genera were Fusobacterium (70.2%), Ilyobacter (13.8%), Propionigenium (12.4%) and Clostridium (3.6%). Regarding hits to sequences from other members of the genus, the average identity within HSPs (high-scoring segment pairs) was 96.4%, whereas the average coverage by HSPs was 98.4%. The species yielding the highest score was I. tartaricus. The five most frequent keywords within the labels of environmental samples which yielded hits were 'microbiome' (6.5%), 'fecal' (6.1%), 'feces' (5.7%), 'calves/microorganisms/neonatal/shedding' (5.5%) and 'evolution/gut/mammals/microbes' (2.5%). These keywords suggest further, animal-associated habitats for I. polytropus, beyond the anaerobic muds of marine origin stated in the original description [2]. Environmental samples which yielded hits of a higher score than the highest scoring species were not found. Figure 1 shows the phylogenetic neighborhood of I. polytropus CuHbu1T in a 16S rRNA based tree. The sequences of the eight 16S rRNA gene copies in the genome of I. polytropus differ from each other by up to three nucleotides, and differ by up to three nucleotides from the previously published 16S rRNA sequence (AJ307981), which contains two ambiguous base calls.
The cells of I. polytropus are generally rod-shaped (0.7 × 1.5-3.0 µm) with rounded ends (Figure 2). Cells appear as irregularly elongated rods when grown on glucose- and fructose-containing media [2]. The cells are usually arranged in pairs or chains [2]. I. polytropus is a Gram-negative, non-spore-forming bacterium (Table 1). The organism is nonmotile, and no flagellar genes have been found in the genome. If at all, active movement by twitching motility could be possible, as some genes related to this phenotype were identified (this paper, see below). Interestingly, the original description states that "the originally motile rods lost motility after several transfers" [2]. The organism is a strictly anaerobic chemoorganotroph [2]. I. polytropus requires 1% NaCl in media for good growth [25]. The selective medium for I. polytropus is a NaCl-containing mineral medium with 3-hydroxybutyrate as the sole carbon and energy source [2]. The organism also grows in salt water medium or brackish water medium containing 1% NaCl and 0.15% MgCl2·6H2O [2]. Vitamins are not required in the enrichment media for at least five subsequent transfers [2]. Phosphate (up to 50 mM) does not inhibit growth of I. polytropus when grown on 3-hydroxybutyrate [2]. The temperature range for growth is between 10°C and 35°C, with an optimum at 30°C [2]. The organism does not grow at 4°C or at 40°C [2]. The pH range for growth is 6.5-8.5, with an optimum at pH 7.0-7.5 [2]. No cytochromes are detected in I. polytropus [2].
I. polytropus is able to utilize 3-hydroxybutyrate, crotonate, glycerol, pyruvate, citrate, oxaloacetate, glucose, fructose, malate and fumarate and to ferment a variety of sugars and organic acids [2]. The organism does not utilize lactose, sucrose, mannitol, sorbitol, xylitol, 1,2-butanediol, 1,3-butanediol, 2,3-butanediol, maltose, arabinose, cellobiose, mannose, melezitose, raffinose, sorbose, rhamnose, trehalose, xylose, acetone, diacetyl, acetoin, acetoacetyl ethylester, acetoacetyl amide, peptone, casamino acids, yeast extract, glyoxylate, glycolate, lactate, succinate, L-tartrate, poly-β-hydroxybutyrate, starch, methanol plus acetate, or formate plus acetate [2]. I. polytropus is able to ferment 3-hydroxybutyrate and crotonate to acetate and butyrate [2]. Glycerol is fermented to 1,3-propanediol and 3-hydroxypropionate [2]. Malate and fumarate are fermented to acetate, formate and propionate [2]. I. polytropus is able to ferment glucose and fructose to acetate, formate and ethanol [2]. The organism does not reduce sulfate, sulfur, thiosulfate or nitrate [2]. I. polytropus grows in mineral media with a reductant [2]. It does not hydrolyze gelatin or urea and does not produce indole [2]. I. polytropus shows acetate kinase, phosphate acetyl transferase and hydrogenase activities that are sufficient for involvement in dissimilatory metabolism [2]. Pyruvate formate lyase activity was also shown in crude cell extracts; however, the activity was extremely low, which was ascribed to a potential instability of this enzyme if traces of oxygen are present during the enzyme activity measurement [2]. I. polytropus maintains its energy metabolism exclusively by substrate-linked phosphorylation reactions [2]. I. polytropus differs from other anaerobes because the organism exhibits broad versatility in its use of various fermentation pathways. However, pathway regulation was reported as enigmatic because neither propionate nor butyrate was formed during glucose or fructose fermentation, although the necessary enzymes are present [2]. I. polytropus is of ecological interest because the organism does not degrade poly-β-hydroxybutyrate but only the monomeric 3-hydroxybutyrate [2]. Metabolism of the polymer appears to be confined to aerobic microbial communities [26].
Figure 1. Phylogenetic tree showing the neighborhood of I. polytropus CuHbu1T, inferred from aligned characters [7,8] of the 16S rRNA gene sequence under the maximum likelihood criterion [9] and rooted in accordance with the current taxonomy. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 900 bootstrap replicates [10] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [11] are shown in blue, published genomes in bold [12][13][14][15]. Note that Ilyobacter appears as polyphyletic in the tree [16], but none of the relevant branches obtains any bootstrap support. Thus, the current classification is not in significant conflict with our phylogenetic analysis.
Standards in Genomic Sciences
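The score-weighted frequencies reported for the Greengenes comparison in the classification section can be illustrated with a minimal sketch. It assumes each BLAST hit has already been reduced to a (label, score) pair; the function name `weighted_frequencies` and the example hits are hypothetical and are not the data or code used in this study.

```python
from collections import defaultdict


def weighted_frequencies(hits):
    """Return {label: percent}, where each hit contributes its BLAST score.

    `hits` is an iterable of (label, score) pairs, e.g. (genus, bit score).
    """
    totals = defaultdict(float)
    for label, score in hits:
        totals[label] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: 100.0 * s / grand_total for label, s in totals.items()}


# Hypothetical hits for illustration only; not the actual Greengenes results.
example_hits = [
    ("Fusobacterium", 2000.0),
    ("Fusobacterium", 1500.0),
    ("Ilyobacter", 1000.0),
    ("Propionigenium", 500.0),
]

frequencies = weighted_frequencies(example_hits)
for genus, percent in sorted(frequencies.items(), key=lambda kv: -kv[1]):
    print(f"{genus}: {percent:.1f}%")
```

Weighting by score rather than counting hits means that a taxon with a few strong matches can outrank one with many weak matches, which is presumably why a score-weighted measure was used here.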

Chemotaxonomy
No chemotaxonomic data are currently available for I. polytropus or for the genus Ilyobacter.

Genome sequencing and annotation
Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position [27], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [28]. The genome project is deposited in the Genome OnLine Database [11,29] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Growth conditions and DNA isolation
I. polytropus CuHbu1T, DSM 2926, was grown anaerobically in medium 314 (Ilyobacter polytropus medium) [30] at 30°C. DNA was isolated from 0.5-1 g of cell paste using the MasterPure Gram-positive DNA purification kit (Epicentre MGP04100) following the standard protocol as recommended by the manufacturer, with modification st/LALM for cell lysis as described in Wu et al. [28].

Genome sequencing and assembly
The genome was sequenced using a combination of Illumina and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website [31]. Pyrosequencing reads were assembled using the Newbler assembler version 2.0.00.20-PostRelease-10-28-2008-g++-3.4.6 (Roche). The initial Newbler assembly, consisting of 85 contigs in 1 scaffold, was converted into a phrap assembly [32] by making fake reads from the consensus, to collect the read pairs in the 454 paired end library. Illumina GAii sequencing data (387 Mb) was assembled with Velvet [33], and the consensus sequences were shredded into 1.5 kb overlapped fake reads and assembled together with the 454 data. The 454 draft assembly was based on 284.7 Mb 454 draft data and all of the 454 paired end data. Newbler parameters are -consed -a 50 -l 350 -g -m -ml 20.
Table 1 (fragment). Altitude: sea level (evidence code NAS).
Evidence codes - IDA: Inferred from Direct Assay (first time in publication); TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [24]. If the evidence code is IDA, then the property was directly observed by one of the authors or an expert mentioned in the acknowledgements.
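The shredding of Velvet consensus sequences into 1.5 kb overlapping fake reads, mentioned in the assembly description above, can be sketched as a sliding window. This is only an illustration: the function name `shred` is hypothetical, and the 500 bp overlap is an assumption, since the text gives the read length but not the overlap or the tool used.

```python
def shred(consensus, read_len=1500, overlap=500):
    """Split a consensus sequence into overlapping fixed-length pseudo-reads.

    read_len=1500 follows the 1.5 kb stated in the text; overlap=500 is an
    assumed value chosen for illustration.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be non-negative and smaller than read_len")
    if not consensus:
        return []
    step = read_len - overlap
    reads = []
    start = 0
    while True:
        reads.append(consensus[start:start + read_len])
        if start + read_len >= len(consensus):
            break  # the last window reached the end of the consensus
        start += step
    return reads


# Hypothetical 4 kb contig used only to exercise the function.
contig = "ACGT" * 1000
fake_reads = shred(contig)
print(len(fake_reads), [len(r) for r in fake_reads])
```

Because consecutive windows share `overlap` bases, an overlap-based assembler can rejoin the pseudo-reads with the reads from the other platform; the final window may be shorter than `read_len`.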
The Phred/Phrap/Consed software package [32] was used for sequence assembly and quality assessment in the subsequent finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution [31], Dupfinisher, or sequencing cloned bridging PCR fragments with subcloning or transposon bombing (Epicentre Biotechnologies, Madison, WI) [34]. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks (J.-F. Chang, unpublished). A total of 719 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. Illumina reads were also used to correct potential base errors and increase consensus quality using the software Polisher developed at JGI [35]. The error rate of the completed genome sequence is less than 1 in 100,000. Together, the combination of the Illumina and 454 sequencing platforms provided 215.8× coverage of the genome. The final assembly contained 656,481 pyrosequence and 10,750,000 Illumina reads.
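As a rough plausibility check of the coverage and error-rate figures above, the following sketch divides the stated sequence yields by the genome size and converts the error rate into a Phred-scaled quality value. It assumes that "Mb" means 10^6 bases and it omits the 454 paired end data, whose yield is not stated, so the result is expected to fall slightly below the reported 215.8×. The function names are hypothetical.

```python
import math

GENOME_SIZE_BP = 3_132_314   # total genome length from the abstract
ILLUMINA_MB = 387.0          # Illumina GAii data, as stated in the text
PYROSEQ_DRAFT_MB = 284.7     # 454 draft data, as stated in the text


def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def phred_quality(error_rate):
    """Phred-scaled quality, Q = -10 * log10(error probability)."""
    return -10.0 * math.log10(error_rate)


# Assumption: 1 Mb = 1e6 bases; 454 paired end data are not included.
sequenced_bases = (ILLUMINA_MB + PYROSEQ_DRAFT_MB) * 1e6
coverage = fold_coverage(sequenced_bases, GENOME_SIZE_BP)
quality = phred_quality(1 / 100_000)

print(f"Coverage without 454 paired end data: {coverage:.1f}x")
print(f"Error rate of 1 in 100,000 corresponds to Q{quality:.0f}")
```

The two stated yields alone account for about 214× of the reported 215.8×, and an error rate below 1 in 100,000 corresponds to a consensus quality above Q50.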

Genome annotation
Genes were identified using Prodigal [36] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [37].
The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation was performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [38].

Genome properties
The genome consists of a 2,046,464 bp long chromosome I with a GC content of 35%, a 961,624 bp long chromosome II with 34% GC content, and a 124,226 bp long plasmid with 32% GC content (Table 3 and Figures 3a-c). Of the 3,042 genes predicted, 2,934 were protein-coding genes and 108 were RNA genes; 108 pseudogenes were also identified. The majority of the protein-coding genes (73.3%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
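The replicon and gene counts above can be cross-checked with a few lines of arithmetic. The sketch below sums the replicon lengths, adds up the gene classes, and estimates the genome-wide GC content as a length-weighted mean. The GC estimate is only approximate because the per-replicon values are given as rounded percentages; the variable names are illustrative.

```python
# (length in bp, GC content in percent) for each replicon, as stated in the text
REPLICONS = {
    "chromosome I": (2_046_464, 35),
    "chromosome II": (961_624, 34),
    "plasmid": (124_226, 32),
}
PROTEIN_CODING_GENES = 2_934
RNA_GENES = 108

total_length = sum(length for length, _ in REPLICONS.values())
total_genes = PROTEIN_CODING_GENES + RNA_GENES

# Length-weighted mean of rounded percentages, so only an approximation.
weighted_gc = sum(length * gc for length, gc in REPLICONS.values()) / total_length

print(f"Total genome length: {total_length:,} bp")
print(f"Total predicted genes: {total_genes:,}")
print(f"Approximate overall GC content: {weighted_gc:.1f}%")
```

The summed lengths and gene counts agree with the 3,132,314 bp and 3,042 genes reported in the text.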