High quality draft genome sequence of Olivibacter sitiensis type strain (AW-6T), a diphenol degrader with genes involved in the catechol pathway

Olivibacter sitiensis Ntougias et al. 2007 is a member of the family Sphingobacteriaceae, phylum Bacteroidetes. Members of the genus Olivibacter are phylogenetically diverse and of significant interest. They occur in diverse habitats, such as rhizosphere and contaminated soils, viscous wastes, composts, biofilter clean-up facilities on contaminated sites and cave environments, and they are involved in the degradation of complex and toxic compounds. Here we describe the features of O. sitiensis AW-6T, together with the permanent-draft genome sequence and annotation. The organism was sequenced under the Genomic Encyclopedia of Bacteria and Archaea (GEBA) project at the DOE Joint Genome Institute and is the first genome sequence of a species within the genus Olivibacter. The genome is 5,053,571 bp long and comprises 110 scaffolds with an average GC content of 44.61%. Of the 4,565 genes predicted, 4,501 were protein-coding genes and 64 were RNA genes. Most protein-coding genes (68.52%) were assigned to a putative function. The identification of 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase-coding genes indicates involvement of this organism in the catechol catabolic pathway. In addition, genes encoding β-1,4-xylanases and β-1,4-xylosidases reveal the xylanolytic action of O. sitiensis.


Introduction
The genus Olivibacter currently contains six species with validly published names, all of which are aerobic and heterotrophic, non-motile, rod-shaped Gram-negative bacteria [1][2][3]. Strain AW-6T (= DSM 17696T = CECT 7133T = CIP 109529T) is the type strain of Olivibacter sitiensis [1], which is the type species of the genus Olivibacter. The strain was isolated from alkaline alperujo, an olive mill sludge-like waste produced by two-phase centrifugal decanters located in the vicinity of Toplou Monastery, Sitia, Greece [1]. The genus name is derived from the Latin term oliva and the Neo-Latin bacter, meaning a rod-shaped bacterium living in olives/olive processing by-products [1]. The Neo-Latin species epithet sitiensis pertains to the region Sitia (Crete, Greece) where the olive mill is operating [1]. The other five species of the genus are O. soli, O. ginsengisoli, O. terrae, O. oleidegradans and O. jilunii [2][3][4]. O. soli and O. ginsengisoli were isolated from soil of a ginseng field [2], O. terrae from a compost prepared of cow manure and rice straw [2], O. oleidegradans from a biofilter clean-up facility in a hydrocarbon-contaminated site [3] and O. jilunii from a DDT-contaminated soil [4].
O. sitiensis can be distinguished from O. soli, O. ginsengisoli and O. terrae on the basis of its temperature and NaCl concentration ranges for growth, its ability to assimilate N-acetyl-D-glucosamine, L-histidine, maltose and sorbitol, its expression of naphthol-AS-BI-phosphohydrolase, the presence/absence of iso-C15:1 F, C16:1 2-OH, anteiso-C17:1 B and/or iso-C17:1 I, and its DNA G+C content [1,2,4]. Moreover, it differs from O. soli in terms of L-arabinose assimilation and valine arylamidase expression, from O. ginsengisoli in terms of inositol, mannitol and salicin assimilation and in the oxidase reaction test, and from O. terrae in terms of L-arabinose and mannitol assimilation, and β-glucuronidase and valine arylamidase expression [1,2,4]. O. sitiensis can be differentiated from O. oleidegradans on the basis of DNA G+C content, the upper pH limit for growth, the ability to assimilate D-adonitol, L-arabinose, N-acetyl-D-glucosamine, L-histidine, D-lyxose, maltose, melezitose, salicin and turanose, the expression of esterase, β-galactosidase, α-mannosidase, urease and valine arylamidase, and the presence/absence of some minor fatty acid components of membrane lipids, menaquinone-6 (as minor respiratory quinone) and aminophospholipids (as cellular polar lipids) [1,3,4]. In addition, O. sitiensis can be distinguished from O. jilunii on the basis of DNA G+C content, the upper pH, temperature and NaCl concentration limits for growth, lactose fermentation, the ability to assimilate acetate, L-arabinose, N-acetyl-D-glucosamine, L-histidine, malonate, maltose, D-mannose, salicin and L-serine, the expression of α-mannosidase, oxidase and valine arylamidase, and the presence/absence of some minor fatty acid components of membrane lipids, menaquinone-8 (as minor respiratory quinone) and aminophospholipids (as cellular polar lipids) [1,4]. Here we present a summary classification and a set of features for O. sitiensis AW-6T, together with the description of the permanent-draft genome sequencing and annotation.

Classification and features
The 16S rRNA gene sequence of O. sitiensis AW-6T was compared using NCBI BLAST under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent release of the Greengenes database [5], and the relative frequencies of taxa and keywords (reduced to their stem [6]) were determined and weighted by BLAST scores. The frequency of genera that belonged to the family Sphingobacteriaceae was 61.8%. The most fre- [7] and Olivibacter ginsengisoli Gsoil 060T (AB267716) [2], both showing a 16S rRNA gene sequence similarity of 90.1%, with HSP coverages of 99.8% and 99.9%, respectively. It is noteworthy that the Greengenes database uses the INSDC (=EMBL/NCBI/DDBJ) annotation, which is not an authoritative source for nomenclature or classification. The highest-scoring environmental sequence was AM114441 ['Interactions U(VI) added natural dependence on various incubation conditions soil uranium mining waste pile clone JG35+U2A-AG9'], which showed an identity of 90.3% with an HSP coverage of 86.1%. The most frequently occurring keywords within the labels of all environmental samples that yielded hits were 'rumen' (23.1%), 'oil' (10.8%), 'water' (9.7%), 'soil' (9.7%), 'fluid' (9.1%) and 'gut' (9.1%) (186 hits in total). The most frequently occurring keywords within the labels of those environmental samples that yielded hits of a higher score than the highest-scoring species were 'waste' (50.0%) and 'soil' (50.0%) (4 hits in total), which are keywords with biological meaning fitting the environment from which O. sitiensis AW-6T was isolated. Figure 1 shows the phylogenetic neighborhood of O. sitiensis in the 16S rRNA gene sequence-based trees constructed. Regardless of the clustering method applied, all Olivibacter species together with Pseudosphingobacterium domesticum and 'Sphingobacterium' sp. 21 fell into a distinct cluster, indicating the unique phylogenetic position of the genus Olivibacter and the necessity for reconsidering the taxonomic status of the genus Pseudosphingobacterium.
In addition, 'Sphingobacterium' sp. 21 should be assigned to the genus Olivibacter, and not to the genus Sphingobacterium. In the ML tree, members of the genus Parapedobacter branched together with O. sitiensis, although the unique topology of the genus was established by applying a character-based (parsimony) method. As previously stated by Ntougias et al. [1], S. antarcticum should be reassigned to the genus Pedobacter.
Figure 1. Phylogenetic trees showing the neighborhood of O. sitiensis, inferred from the 16S rRNA gene sequence [8,9] under (A) the maximum likelihood (ML) [10] and (B) the maximum-parsimony criterion. In the ML tree, the branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 100 ML bootstrap replicates (A) and from 1,000 maximum-parsimony bootstrap replicates (B) [11]. Lineages with strain genome sequencing projects registered in GOLD [12] are labeled with one asterisk, while those listed as 'Complete and Published' are labeled with two asterisks (e.g. Pedobacter heparinus [13] and P. saltans [14]).
(Figure 2). The temperature range for growth is 5-45°C, with an optimum at 28-32°C [1]. O. sitiensis is neutrophilic, showing no growth at 30 g L⁻¹ NaCl [1]. The pH for growth ranges between 5 and 8, with pH values of 6-7 being the optimum [1]. O. sitiensis is strictly aerobic and chemoorganotrophic; it assimilates mostly D(+)-glucose, protocatechuate and D(+)-xylose, while L-cysteine, D(-)-fructose, D(+)-galactose, L-histidine, lactose, sorbitol and sucrose are also utilized by strain AW-6T [1]. O. sitiensis was found to be sensitive to ampicillin, bacitracin, chloramphenicol, penicillin, rifampicin, tetracycline and trimethoprim, and resistant to kanamycin, polymyxin B and streptomycin (antibiotic concentration of 50 μg ml⁻¹) [1].
Evidence codes - NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [23]. If the evidence code is IDA, then the property was directly observed for a living isolate by one of the authors or an expert mentioned in the acknowledgements.
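The score-weighted keyword frequencies reported at the start of this section can be illustrated with a minimal sketch. The exact weighting scheme is not spelled out in the text, so the version below is an assumption: each keyword is credited with the BLAST score of every hit whose label contains it, and the result is expressed as a percentage of the summed scores. The function name and the example hits are hypothetical and are not data from this study; stemming of keywords [6] is omitted.

```python
from collections import defaultdict


def weighted_keyword_frequencies(hits):
    """Return {keyword: percent of the summed BLAST scores}.

    `hits` is an iterable of (score, keywords) pairs. Each keyword of a hit
    is credited with that hit's score once, so percentages need not sum to
    100 when a label carries several keywords.
    """
    totals = defaultdict(float)
    grand_total = 0.0
    for score, keywords in hits:
        grand_total += score
        for keyword in set(keywords):  # count a keyword once per hit
            totals[keyword] += score
    if grand_total == 0:
        return {}
    return {kw: 100.0 * s / grand_total for kw, s in totals.items()}


# Hypothetical hits (score, label keywords); illustrative values only.
example_hits = [
    (1200.0, ["soil", "waste"]),
    (1100.0, ["soil"]),
    (900.0, ["rumen", "fluid"]),
    (800.0, ["rumen"]),
]

freqs = weighted_keyword_frequencies(example_hits)
for keyword, percent in sorted(freqs.items(), key=lambda kv: -kv[1]):
    print(f"{keyword}: {percent:.1f}%")
```

Because one label can contribute several keywords, the percentages are not expected to sum to 100, which is consistent with the keyword frequencies listed above.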

Genome project history
This microorganism was selected for sequencing on the basis of its phylogenetic position [24,25], and is part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project [26], which aims to increase the sequencing coverage of key reference microbial genomes. The genome project is deposited in the Genomes OnLine Database [12] and the genome sequence is available from GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI) using state-of-the-art sequencing technology [27]. A summary of the project information is presented in Table 2.

Growth conditions and DNA isolation
O. sitiensis strain AW-6T was grown aerobically in DSMZ medium 92 (trypticase soy yeast extract medium) [28] at 28°C. DNA was isolated from 0.5-1 g of cell paste using the Jetflex Genomic DNA purification kit (Genomed_600100) following the standard protocol recommended by the manufacturer, but with a modified cell lysis procedure (1 hour incubation at 58°C with an additional 50 µl proteinase K, followed by overnight incubation on ice with an additional 200 µl PPT-buffer). DNA is available via the DNA Bank Network [29].

Genome annotation
Genes were identified using Prodigal [34] as part of the DOE-JGI Annotation pipeline [35]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes Expert Review (IMG-ER) platform [36].
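The translation step mentioned above can be illustrated with a minimal sketch. It is not the pipeline's own code; it assumes the bacterial genetic code (NCBI translation table 11), in which the codon assignments match the standard code and the alternative start codons GTG and TTG are read as methionine at initiation. The function name `translate_cds` and the example sequences are hypothetical.

```python
# Codon table built from the conventional T, C, A, G ordering of the bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(cds):
    """Translate a coding sequence (5'->3', in frame) into a protein string.

    Translation stops at the first stop codon; codons containing ambiguous
    bases are rendered as 'X'.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    protein = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    # Assumption (table 11): GTG/TTG used as start codons encode Met.
    if protein and cds[:3] in ("GTG", "TTG"):
        protein[0] = "M"
    return "".join(protein)


# Hypothetical toy sequences, not genes from the O. sitiensis genome.
print(translate_cds("ATGAAAGGTTAA"))
print(translate_cds("GTGGCTTGA"))
```

The resulting protein strings are the kind of query that would then be searched against the databases listed above.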

Genome properties
The genome is 5,053,571 bp long and comprises 110 scaffolds with an average GC content of 44.61% (Table 3). Of the 4,565 genes predicted, 4,501 were protein-coding genes and 64 were RNA genes. Most protein-coding genes (68.52%) were assigned to a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
Table 4, footnote a: The total is based on the total number of protein-coding genes in the annotated genome.
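As an illustration of how summary figures such as scaffold count, total length and GC content are obtained from an assembly, a minimal sketch is given below. It is not the pipeline used for this genome: the function name `assembly_stats` and the toy FASTA text are hypothetical, and GC content is computed over all sequence characters, whereas a production tool may exclude ambiguous bases (N). The final lines only check that the gene counts quoted in the text are internally consistent.

```python
def assembly_stats(fasta_text):
    """Return (scaffold_count, total_length_bp, gc_percent) for FASTA text."""
    scaffolds = 0
    total = 0
    gc = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):  # header line starts a new scaffold
            scaffolds += 1
            continue
        seq = line.upper()
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
    gc_percent = 100.0 * gc / total if total else 0.0
    return scaffolds, total, gc_percent


# Toy two-scaffold assembly (hypothetical, for illustration only).
toy_fasta = """>scaffold_1
ATGCGC
ATAT
>scaffold_2
GGCCAATT
TT
"""

n_scaffolds, length_bp, gc_percent = assembly_stats(toy_fasta)
print(n_scaffolds, length_bp, f"{gc_percent:.2f}")

# Consistency check of the gene counts quoted in the text.
protein_coding, rna_genes, total_genes = 4501, 64, 4565
assert protein_coding + rna_genes == total_genes
```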
Based on genomic analysis of the metabolic features, O. sitiensis is an auxotroph for L-alanine, L-arginine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-phenylalanine, L-proline, L-serine, L-tyrosine, L-tryptophan and L-valine, and a prototroph for L-aspartate, L-glutamate and glycine. Selenocysteine and biotin cannot be synthesized by O. sitiensis. Strain AW-6T can utilize L-arabinose and maltose (via orthophosphate activation), whereas no maltose hydrolysis is achieved [1]. Genome analysis revealed the genetic and molecular bases of the degradation of recalcitrant compounds by O. sitiensis. The ability of O. sitiensis to degrade phenolic compounds is supported by the presence of genes encoding oxidoreductases that act on diphenols and related substances and by the 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase-coding genes that are involved in the catechol pathway. Genes encoding β-1,4-xylanases and β-1,4-xylosidases were also identified in the genome of strain AW-6T, indicating that O. sitiensis is a xylanolytic bacterium involved in the cleavage of β-1,4-xylosidic bonds in hemicelluloses. The existence of protocatechuate 3,4-dioxygenase (dioxygenase_C)-coding genes is indicative of the ability of this bacterium to degrade benzoate and 2,4-dichlorobenzoate.
Genes encoding carboxymethylenebutenolidase were found in the genome of O. sitiensis, indicating its potential for hexachlorocyclohexane and 1,4-dichlorobenzene degradation. Oxidoreductases related to aryl-alcohol dehydrogenases were predicted, showing that O. sitiensis may also be involved in biphenyl and toluene/xylene degradation. This is further supported by the identification of an uncharacterized protein, possibly involved in aromatic compound catabolism. Moreover, putative multicopper oxidases with possible laccase-like activity were identified. Mercuric reductase- and arsenate reductase-coding genes, as well as organic solvent tolerance and chromate transport proteins encoded in the genome, indicate possible resistance of O. sitiensis to the presence of heavy metals and organic solvents.