Complete genome sequence of Aminobacterium colombiense type strain (ALA-1T)

Aminobacterium colombiense Baena et al. 1999 is the type species of the genus Aminobacterium. This genus is of great interest because of its isolated phylogenetic location in the family Synergistaceae, its strictly anaerobic lifestyle, and its ability to grow by fermentation of a limited range of amino acids but not carbohydrates. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the second completed genome sequence of a member of the family Synergistaceae and the first genome sequence of a member of the genus Aminobacterium. The 1,980,592 bp long genome with its 1,914 protein-coding and 56 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.


Introduction
Strain ALA-1T (= DSM 12261) is the type strain of the species Aminobacterium colombiense, which is the type species of the genus Aminobacterium [1,2]. The name of the genus relates to its ability to ferment amino acids and the species name refers to the origin of the isolate, Colombia [1]. Currently, the genus Aminobacterium consists of only two species [1,3,4]. Strain ALA-1T was isolated from an anaerobic dairy wastewater lagoon in 1998 or before [1]. At the moment, strain ALA-1T is the only known isolate of this species. Highly similar (98%), nearly complete (>1,400 bp) uncultured 16S rRNA gene clone sequences were frequently obtained from anaerobic habitats, e.g., from anaerobic municipal solid waste samples in France [5], from a biogas fermentation enrichment culture in China (GU476615), from a swine wastewater anaerobic digestion in a UASB reactor in China (FJ535518), and from a mesophilic anaerobic BSA digester in Japan [6], suggesting quite a substantial contribution of Aminobacterium to anaerobic prokaryotic communities. The type strain of the only other species in the genus, A. mobile [3], shares 95% 16S rRNA sequence identity with A. colombiense, whereas the type strains of the other species in the family Synergistaceae share between 84.3 and 88.3% 16S rRNA sequence identity [7]. Environmental samples and metagenomic surveys detected only one significantly similar phylotype (BABF01000111, 92% sequence similarity) in a human gut microbiome [7], with all other phylotypes sharing less than 84% 16S rRNA gene sequence identity, indicating a rather limited general ecological importance of the members of the genus Aminobacterium (status April 2010). Here we present a summary classification and a set of features for A. colombiense ALA-1T, together with the description of the complete genomic sequencing and annotation. Figure 1 shows the phylogenetic neighborhood of A. colombiense ALA-1T in a 16S rRNA based tree.
The three identical copies of the 16S rRNA gene in the genome differ by 14 nucleotides (0.9%) from the previously published 16S rRNA sequence generated from DSM 12261 (AF069287), which contains three ambiguous base calls. These differences are most likely due to sequencing errors in AF069287.
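A figure such as "14 nucleotides (0.9%)" comes from counting mismatched columns in a pairwise alignment and dividing by the number of compared positions. The following is only a minimal sketch of that calculation, not the procedure used by the authors: the function names are hypothetical, the sequences are made-up toy strings rather than the real 16S rRNA data, and skipping gap columns and ambiguous base calls (N) is an assumption of this example.

```python
from typing import Tuple


def count_differences(seq_a: str, seq_b: str, skip: str = "N-") -> Tuple[int, int]:
    """Return (mismatches, compared positions) for two pre-aligned sequences.

    Columns containing a gap or an ambiguous base call are skipped; this is
    an assumption of the sketch, not a rule stated in the text.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    return mismatches, compared


def percent_difference(mismatches: int, compared: int) -> float:
    """Mismatches as a percentage of the compared positions."""
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy, made-up aligned sequences (NOT the real AF069287 or genome data).
genome_copy = "ACGTACGTAC"
older_entry = "ACGTNCGTTC"
mm, n = count_differences(genome_copy, older_entry)
print(mm, n)  # 1 mismatch over 9 compared positions
```

With 14 mismatches, a reported value of 0.9% implies on the order of 1,500 compared positions, which is consistent with a nearly full-length 16S rRNA gene.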

Figure 1.
Phylogenetic tree highlighting the position of A. colombiense ALA-1T relative to the other type strains within the phylum Synergistetes. The tree was inferred from 1,282 aligned characters [8,9] of the 16S rRNA gene sequence under the maximum likelihood criterion [10] and rooted in accordance with the current taxonomy [11]. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 250 bootstrap replicates [12] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [13] are shown in blue, published genomes in bold, e.g. the recently published GEBA genome of Thermanaerovibrio acidaminovorans [14].

Table 1) [1]. The colonies are up to 1.0 mm in diameter and are round, smooth, lens-shaped, and white [1]. Strain ALA-1T requires yeast extract for growth and ferments serine, glycine, threonine, and pyruvate in its presence [1]. Poor growth is obtained on casamino acids, peptone, biotrypcase, cysteine and α-ketoglutarate [1]. The fermentation end-products include acetate and H2, and also propionate in the case of α-ketoglutarate fermentation. Carbohydrates (such as glucose, saccharose, ribose, xylose, cellobiose, melibiose, maltose, galactose, mannose, arabinose, rhamnose, lactose, sorbose and mannitol), gelatin, casein, glycerol, ethanol, acetate, propionate, butyrate, lactate, citrate, fumarate, malate, succinate and the other amino acids tested are not utilized [1]. As is typical in anoxic habitats, strain ALA-1T is engaged in syntrophic interactions: alanine, glutamate, valine, isoleucine, leucine, methionine, aspartate and malate are oxidized only in the presence of the hydrogenotroph Methanobacterium formicicum strain DSM 1525 [1]. In addition, the utilization of cysteine, threonine and α-ketoglutarate is also improved in the presence of M. formicicum [1].
An 80% hydrogen atmosphere (supplied as H2-CO2 (80:20) at 2 bar pressure) inhibits growth of strain ALA-1T on threonine and α-ketoglutarate, whereas glycine degradation is not affected [1]. Serine and pyruvate degradation are partially affected by the presence of hydrogen. Sulfate, thiosulfate, elemental sulfur, sulfite, nitrate, and fumarate are not utilized as electron acceptors [1]. Strain ALA-1T does not perform the Stickland reaction when alanine is provided as an electron donor and glycine, serine, arginine or proline are provided as electron acceptors. As noted above, alanine is oxidized only in the presence of the hydrogenotroph M. formicicum, which utilizes the produced H2 [1]. In the absence of an H2-consuming organism, the H2 partial pressure would rapidly reach a level that thermodynamically inhibits further fermentation [21]. Adams and colleagues used an H2-purging culture vessel to replace the H2-consuming syntrophic partner, in order to study in detail the energetic characteristics of alanine consumption by strain ALA-1T in pure culture [21]. Strain ALA-1T is non-motile [1], whereas, interestingly, the other species in the genus, A. mobile, is motile by means of lateral flagella [3]. A parallel situation exists in the genus Anaerobaculum (Figure 1), where Anaerobaculum thermoterrenum is non-motile [22] but Anaerobaculum mobile is motile by means of lateral flagella [23]. In fact, the phenotype of non-motility versus motility by means of lateral flagella is heterogeneously distributed among the organisms depicted in Figure 1. This may suggest that the last common ancestor of the group shown in Figure 1 was motile by flagella and that the selection pressure for functioning flagella might currently be more relaxed in this group, leading in individual strains to mutational inactivation of the flagella. Interestingly, the annotation of the genome does not give any indication of the presence of any genes related to flagellar assembly.
The only genes related to cellular motility refer to the type II secretory pathway and to pilus assembly. This is surprising, as it is unlikely that strain ALA-1T lost all genes for flagellar assembly after the evolutionary separation of strain ALA-1T and its closely related sister species A. mobile from their last common ancestor. A similar situation has been observed in the non-motile strain Alicyclobacillus acidocaldarius 104-IAT in comparison to several motile sister species in the genus Alicyclobacillus [24]. Here, the genome of the non-motile strain A. acidocaldarius 104-IAT still contains most of the genes needed for flagellar assembly [24]. Thus, the genotypic status of flagellar motility in the genus Aminobacterium remains unclear.

, not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [20]. If the evidence code is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.

Chemotaxonomy
Ultrathin sections of strain ALA-1T revealed a thick cell wall with an external S-layer similar to that of Gram-positive type cell walls [1]. Unfortunately, no chemotaxonomic data have been published for the genus Aminobacterium. Among the organisms depicted in Figure 1, chemotaxonomic data are available for Dethiosulfovibrio peptidovorans, Jonquetella anthropi, Pyramidobacter piscolens, Cloacibacillus evryensis, and Synergistes jonesii, though the data are not always present in the original species description publications [25][26][27].

Genome sequencing and annotation

Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position [28], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [29]. The genome project is deposited in the Genome OnLine Database [13] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Genome sequencing and assembly
The genome was sequenced using a combination of Illumina and 454 technologies. An Illumina GAii shotgun library with reads totaling 909 Mb, a 454 Titanium draft library with an average read length of 283 bases, and a paired end 454 library with an average insert size of 12 kb were generated for this genome. All general aspects of library construction and sequencing can be found at http://www.jgi.doe.gov/. Draft assemblies were based on 169 Mb of 454 draft data and 454 paired end data (543,550 reads).
Newbler (version 2.0.0-PostRelease-10/28/2008) was used with the parameters -consed -a 50 -l 350 -gm -ml 20. The initial Newbler assembly contained 18 contigs in 1 scaffold. The initial 454 assembly was converted into a phrap assembly by making fake reads from the consensus and collecting the read pairs in the 454 paired end library. Illumina sequencing data were assembled with VELVET [31], and the consensus sequences were shredded into 1.5 kb overlapped fake reads and assembled together with the 454 data. The Phred/Phrap/Consed software package was used for sequence assembly and quality assessment in the following finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution, Dupfinisher, or sequencing cloned bridging PCR fragments with subcloning or transposon bombing [32]. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks (J-F.Cheng, unpublished). A total of 113 additional Sanger reactions were necessary to close gaps and to raise the quality of the finished sequence. The error rate of the completed genome sequence is less than 1 in 100,000.
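To make the "shredding" step more concrete, the sketch below cuts a consensus sequence into overlapping fixed-length fragments ("fake reads") and, separately, converts an error rate into a Phred-scaled quality value. It is an illustration only and not the JGI pipeline: the function names are hypothetical, and the 500 bp overlap is an assumed value, since the text gives the fragment length (1.5 kb) but not the overlap.

```python
import math
from typing import List


def shred(consensus: str, read_len: int = 1500, overlap: int = 500) -> List[str]:
    """Cut a consensus sequence into overlapping fragments of read_len bases.

    The overlap default is an assumption of this sketch; the last fragment
    may be shorter than read_len.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be non-negative and smaller than read_len")
    if not consensus:
        return []
    step = read_len - overlap
    reads = []
    start = 0
    while True:
        reads.append(consensus[start:start + read_len])
        if start + read_len >= len(consensus):
            break  # this fragment already reaches the end of the consensus
        start += step
    return reads


def phred_quality(error_rate: float) -> float:
    """Phred-scaled quality: Q = -10 * log10(error probability)."""
    return -10.0 * math.log10(error_rate)


# Toy example on a short made-up string instead of a real contig.
print(shred("ABCDEFGHIJ", read_len=4, overlap=2))
# An error rate of 1 in 100,000 corresponds to a Phred quality of about 50.
print(round(phred_quality(1e-5), 1))
```

Overlapping fragments let the downstream assembler re-join the Illumina-derived consensus with the 454 reads, because neighbouring fake reads share sequence at their ends.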

Genome annotation
Genes were identified using Prodigal [33] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [34]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation was performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [35].

Genome properties
The genome consists of a 1,980,592 bp long chromosome with an overall GC content of 45.3% (Table 3 and Figure 3). Of the 1,970 genes predicted, 1,914 were protein-coding genes and 56 were RNA genes; 38 pseudogenes were also identified. The majority of the protein-coding genes (77.2%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
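The summary figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the text; the derived counts (genes with a putative function, hypothetical proteins, gene density) are approximations back-calculated from the rounded percentage and are not values taken from Table 3. The helper gc_percent is a hypothetical illustration of how a GC content is computed and is applied to a toy sequence only.

```python
# Figures quoted in the text.
genome_length_bp = 1_980_592
protein_coding = 1_914
rna_genes = 56
fraction_with_function = 0.772  # 77.2% of protein-coding genes

# The total gene count should match the 1,970 predicted genes.
total_genes = protein_coding + rna_genes

# Approximate counts derived from the rounded percentage (not from Table 3).
with_function = round(protein_coding * fraction_with_function)
hypothetical = protein_coding - with_function

# Gene density in genes per kilobase of chromosome.
genes_per_kb = total_genes / (genome_length_bp / 1000)


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in a sequence (toy helper)."""
    seq = sequence.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


print(total_genes, with_function, hypothetical, round(genes_per_kb, 3))
print(gc_percent("ATGC"))  # toy sequence, not genome data
```

The result of roughly one gene per kilobase is in the range typically seen for compact bacterial genomes.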