Complete genome sequence of Haliangium ochraceum type strain (SMP-2T)

Haliangium ochraceum Fudou et al. 2002 is the type species of the genus Haliangium in the myxococcal family ‘Haliangiaceae’. Members of the genus Haliangium are the first halophilic myxobacterial taxa described. The cells of the species follow a multicellular lifestyle in highly organized biofilms, called swarms, and decompose bacterial and yeast cells as most myxobacteria do. The fruiting bodies contain particularly small coccoid myxospores. H. ochraceum encodes the first actin homologue identified in a bacterial genome. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the myxococcal suborder Nannocystineae, and the 9,446,314 bp long single replicon genome with its 6,898 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.


Introduction
Strain SMP-2 T (DSM 14365 = CIP 107738 = JCM 11303) is the type strain of the species Haliangium ochraceum and was first described in 2002 by Fudou et al. [1]. In 1998 strain SMP-2 T was described as a swarming myxobacteria-like microorganism isolated from a dry seaweed sample (Laminariales) with optimum growth at NaCl concentrations of 2%. The attempt to isolate halophilic myxobacteria was initiated by the detection of myxobacterial phylotypes in marine sediments [2]. A second species of the genus Haliangium, H. tepidum, was described along with H. ochraceum [1]. Only two other genera of marine myxobacteria, each comprising one species, have been described to date: Plesiocystis pacifica and Enhygromyxa salina [3,4]. All marine myxobacteria are phylogenetically grouped within one of the three suborders within the order Myxococcales, the Nannocystineae. INSDC databases indicate (as of December 2009) that members of Haliangium are very rare in the environment, with the most closely related 16S rRNA gene sequences from uncultured bacteria being less than 94% similar to H. ochraceum SMP-2 T.
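The similarity figure quoted above (less than 94%) is a pairwise identity between aligned 16S rRNA gene sequences. As a minimal sketch of how such a value can be computed, the Python function below counts matching positions over all alignment columns in which neither sequence has a gap. The function name, the gap-handling convention and the toy sequences are assumptions made for illustration; they are not the procedure or data used by the authors.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (one common convention; others exist)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration (not real 16S rRNA data).
seq_a = "ACGTACGT-ACGTACGTAC"
seq_b = "ACGTTCGTAACG-ACGTAC"
identity = percent_identity(seq_a, seq_b)

# Compare against the 94% figure mentioned in the text.
below_94 = identity < 94.0
```

Published similarity values depend on the alignment and on how gaps and ambiguous bases are treated, so a simple count like this one may differ slightly from database-derived figures.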

Classification and features
At the time of species description of the two Haliangium species, the most similar 16S rRNA gene sequence from cultivated strains originated from strain Pl vt1 T. This strain was published with the name Polyangium vitellinum [5], hence the accession entry of its sequence (AJ233944) was also registered with this species name up to November 2009. However, Reichenbach perceived that these organisms perfectly match Kofler's description of "Polyangium flavum", but do not conform to the description of the genus Polyangium. Thus Reichenbach revived Kofler's "Polyangium flavum" in a new genus, Kofleria, and designated strain Pl vt1 T the type strain of the species Kofleria flava [6]. Subsequently, the species name was changed in the GenBank entry for AJ233944. The 16S rRNA gene sequences of the two Haliangium species were less than 94% similar to this nearest neighbor [1], and thus far no sequences of cultivated or uncultivated bacteria with higher similarities to SMP-2 T have been deposited in GenBank. In 2005, the family Kofleriaceae was created by Reichenbach, containing the single species K. flava [6], and the author mentioned in a note added during the edition of Bergey's Manual that he regarded the two Haliangium species as members of the family Kofleriaceae. This family name has standing in nomenclature [7]. Nevertheless, Haliangium ochraceum is listed in the Taxonomic Outline of the Prokaryotes [8] as a member of the family "Haliangiaceae", which has no standing in nomenclature. From a phylogenetic point of view, the genera Kofleria (terrestrial) and Haliangium (marine) should be members of a single family. Myxobacteria are distinct because of two exceptional features. The first is their high potential to produce secondary metabolites, most of them affecting prokaryotic or eukaryotic cells and hence awaiting exploitation for pharmaceutical applications or in plant protection. 
They encode genes for key enzymes in the biosynthesis of polyketide and peptide metabolites, polyketide synthases and nonribosomal peptide synthetases, respectively [9]. Their second distinctive characteristic is their morphogenesis, i.e. the formation of fruiting bodies and development of myxospores, which is based on cell-to-cell signaling among the single cells of the population in a swarm. The genetic background of the so-called 'social motility' and morphogenesis is understood best for Myxococcus xanthus [10]. It is no surprise that these phenomena are regulated by sophisticated networks including two-component regulatory systems [11]. Figure 1 shows the phylogenetic neighborhood of H. ochraceum SMP-2 T in a 16S rRNA based tree. The sequences of the two 16S rRNA gene copies in the genome do not differ from each other, and do not differ from the previously published 16S rRNA sequence of DSM 14365 (AB016470). Vegetative cells of H. ochraceum stain Gram-negative and form cylindrical rods with blunt ends (Table 1). They are embedded in an extracellular matrix and measure 0.5-0.6 by 3-8 µm (Figure 2). This cell form is characteristic for members of the suborder Nannocystineae [6]. The colonies spread on solid surfaces such as agar as film-like layers and thus are called 'swarms'. The extending motion is propelled by gliding. On aging culture plates, the cells not only spread to explore new substrates (so-called adventurous or A motility) but also gather at specific points of the swarms to form fruiting bodies (social or S motility) [10]. The fruiting bodies of strain SMP-2 T are light yellow to yellowish-brown, irregular, sessile knobs with a diameter of 50-200 µm and contain one or more oval-shaped sporangioles, each 20-60 µm in size [1,2]. The spherical to ovoid myxospores within the sporangioles measure 0.5-0.7 µm. Thus they resemble the myxospores of Nannocystis species in being very tiny [1]. 
The myxospores tolerate heat treatment at 55-60°C for 5 minutes and storage in a desiccated state for at least 3 months [23]. The strain requires NaCl for growth with an optimum concentration of 2% and good growth in the range of 0.5-4% NaCl in agar or in liquid medium [1,2,23]. Fruiting body formation was observed at salt concentrations corresponding to 40-100% sea water concentration but not at lower salt concentrations [23]. Media supporting growth are CY medium, diluted 1:5, (DSMZ medium 67) or VY/2 medium (DSMZ medium 9) [26], each supplemented with seawater salts. No growth was obtained in tryptic soy broth with seawater salts [1]. Corresponding to the multicellular lifestyle, new agar or liquid cultures of strain SMP-2 T can only be successfully started with very high inocula. The minimum cell load on a plate in order to induce a swarm is 10^5 [23]. The temperature range for growth is 20-40°C with an optimum at 30-34°C [1].

Figure 1.
Phylogenetic tree highlighting the position of H. ochraceum SMP-2 T relative to the other type strains within the genus and the type strains of the other genera within the order Myxococcales. The tree was inferred from 1,463 aligned characters [12,13] of the 16S rRNA gene sequence under the maximum likelihood criterion [14] and rooted in accordance with the current taxonomy. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 1,000 bootstrap replicates if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [15] are shown in blue, published genomes in bold.
Cells of strain SMP-2 T are strictly aerobic with weak oxidase and catalase reactions. They do not grow in mineral media with carbohydrates or organic acids but are specialized decomposers of macromolecules such as starch, DNA, casein, chitin or gelatin. Cellulose, however, is not cleaved. The cells are equipped to decompose cells of other bacteria or yeasts [1]. Correspondingly, enzymes such as lipase (C14), trypsin, chymotrypsin, valine or leucine arylamidases are active [1]. Whether or not H. ochraceum actively hunts for prey bacteria, as shown for M. xanthus [27], has not been studied yet.

Chemotaxonomy
The fatty acid profile of strain SMP-2 T reveals saturated straight chain C16:0 (38.3%) and branched chain iso-C16:0 (15.3%) acids as the major fatty acids. No hydroxylated fatty acids were detected, a feature shared with members of the genera Nannocystis, Sorangium [1], Plesiocystis [4] and Enhygromyxa [3]. While the two Haliangium species also contain anteiso-branched fatty acids as distinctive compounds [1], the specific feature of the two other marine genera Plesiocystis and Enhygromyxa is the presence of polyunsaturated C20:4 acids [3,4]. A novel pathway for the biosynthesis of iso-even fatty acids (by α-oxidation of iso-odd fatty acids) was detected for the myxobacterium Stigmatella aurantiaca [28]. In members of the genera Nannocystis and Polyangium, true steroids were detected, a very unusual trait among prokaryotes [6,29]. It would be interesting to study whether these pathways are also found in H. ochraceum. MK-8 is the predominant menaquinone in SMP-2 T as it is in all terrestrial myxobacterial taxa studied [1,29]. It is noteworthy that the members of the other marine genera Plesiocystis and Enhygromyxa contain MK-8(H2) and MK-7, respectively [3,4]. The compositions of polyamines and the polar lipids of Haliangium strains have not been analyzed.

Standards in Genomic Sciences

Table 1 (footnote only; beginning truncated): … not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [25]. If the evidence code is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.

Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position, and is part of the Genomic Encyclopedia of Bacteria and Archaea project [30]. The genome project is deposited in the Genome OnLine Database [15] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Genome sequencing and assembly
The genome was sequenced using a combination of Sanger and 454 sequencing platforms. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website (http://www.jgi.doe.gov/). 454 pyrosequencing reads were assembled using the Newbler assembler version 1.1.02.15 (Roche). Large Newbler contigs were broken into 10,273 overlapping fragments of 1,000 bp and entered into the final assembly as pseudo-reads. The sequences were assigned quality scores based on Newbler consensus q-scores with modifications to account for overlap redundancy and to adjust inflated q-scores. A hybrid 454/Sanger assembly was made using the parallel phrap assembler (High Performance Software, LLC). Possible mis-assemblies were corrected with Dupfinisher or transposon bombing of bridging clones [31]. Gaps between contigs were closed by editing in Consed, custom primer walk or PCR amplification. A total of 2,013 Sanger finishing reads were produced to close gaps, to resolve repetitive regions, and to raise the quality of the finished sequence. The error rate of the completed genome sequence is less than 1 in 100,000. Together all sequence types provided 24.3× coverage of the genome. The final assembly contains 90,757 Sanger and 689,516 pyrosequencing reads.
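Two steps in the paragraph above lend themselves to a short illustration: breaking contigs into overlapping 1,000 bp pseudo-reads, and expressing the stated error rate as a Phred-style quality value. The Python sketch below is a hypothetical illustration, not the JGI pipeline. The function names are invented here, and the 500 bp overlap is an assumed value because the text does not state the overlap used.

```python
import math


def shred_contig(contig: str, fragment_len: int = 1000, overlap: int = 500):
    """Split a contig into overlapping fragments (pseudo-reads).

    fragment_len follows the 1,000 bp given in the text; overlap is an
    assumption, since the text does not report it.
    """
    if not 0 <= overlap < fragment_len:
        raise ValueError("overlap must be in [0, fragment_len)")
    if not contig:
        return []
    step = fragment_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(contig[start:start + fragment_len])
        if start + fragment_len >= len(contig):
            break  # this fragment reached the end of the contig
        start += step
    return fragments


def phred_quality(error_probability: float) -> float:
    """Phred-scaled quality: Q = -10 * log10(p)."""
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(error_probability)


# A toy 2,500 bp contig yields fragments starting at 0, 500, 1000 and 1500.
pseudo_reads = shred_contig("A" * 2500)

# An error rate of 1 in 100,000 corresponds to a Phred quality of about 50.
finished_quality = phred_quality(1 / 100_000)
```

With these assumed settings the final fragment of a contig may be shorter than 1,000 bp; a production pipeline might instead shift the last window so that every pseudo-read has full length.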

Genome annotation
Genes were identified using Prodigal [32] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline (http://geneprimp.jgi-psf.org/) [33]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes - Expert Review (http://img.jgi.doe.gov/er) platform [34].

Genome properties
The genome is 9,446,314 bp long and comprises one main circular chromosome with a 69.5% GC content (Table 3 and Figure 3). Of the 6,951 genes predicted, 6,898 were protein-coding genes and 53 were RNA genes. Fifty-three pseudogenes were also identified. The majority of the protein-coding genes (62.1%) were assigned a putative function while the remaining ones were annotated as hypothetical proteins. The percentage of genes which were not assigned to COGs is relatively high, 42%, a proportion similar to that in the genome of Sorangium cellulosum So ce56 [11]. This fact suggests that the genome harbors many as yet unknown genes. The distribution of genes into COGs functional categories is presented in Table 4. Starting from one of the conspicuous features of the myxobacteria, the diversity of secondary metabolism, the number of known genes putatively assigned to the COG category "Secondary metabolites biosynthesis, transport and catabolism" is not exceptionally high: 174 genes in comparison to, for example, 136 genes in Pseudomonas putida F1. The number of COG genes involved in "Replication, recombination and repair", however, is remarkably increased: in H. ochraceum, 439 genes were assigned to this category and in S. cellulosum there are 541, whereas P. putida contains only 157 genes assigned to this category.
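The figures in this section can be cross-checked with simple arithmetic. The Python sketch below, offered as an illustration only, confirms that the protein-coding and RNA gene counts add up to the stated total, compares the "Replication, recombination and repair" counts quoted above, and shows one common way to compute a GC percentage from a sequence string. The function name and the toy sequence are assumptions; the gene counts are taken from the text.

```python
def gc_content(sequence: str) -> float:
    """GC percentage over unambiguous bases (A, C, G, T) only."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Gene counts as stated in the text.
protein_coding_genes = 6898
rna_genes = 53
total_genes = protein_coding_genes + rna_genes  # should equal the stated 6,951

# Genes assigned to "Replication, recombination and repair" (from the text).
repair_genes = {"H. ochraceum": 439, "S. cellulosum": 541, "P. putida F1": 157}
fold_over_p_putida = repair_genes["H. ochraceum"] / repair_genes["P. putida F1"]

# Toy sequence for illustration only (not genome data).
toy_gc = gc_content("GGCCAT")
```

On these numbers H. ochraceum carries roughly 2.8 times as many genes in this category as P. putida F1, which is the increase the text describes as remarkable.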

Insights from genome sequence
The genomes of two other myxobacteria, M. xanthus DK1622 and S. cellulosum strain So ce56, were analyzed in depth [11,35-37] and may serve as a roadmap to explore the genome of strain SMP-2 T. Sixteen genes of strain SMP-2 T were putatively assigned to the COG category 'Cytoskeleton'. Given that almost no other bacteria harbor genes assigned to this category, it is worth mentioning that all myxobacterial genomes studied so far include several copies in this category. Fifteen of the cytoskeleton genes of SMP-2 T belong to COG 5184 'Alpha-tubulin suppressor and related RCC1 domain-containing proteins'. Strain SMP-2 T and P. pacifica, another rare marine myxobacterium, together with Salinispora arenicola are the prokaryotes with the highest degree of similarity of these genes, 15, 12, 14 and 15, respectively. Whereas RCC1 was known as a eukaryotic cell cycle regulator, RCC1-like repeats were recently also detected in several prokaryotic genomes [38]. Future studies will have to elucidate whether the SMP-2 T sequences, automatically assigned to an RCC1 domain, are related to these repeats in particular. As the genes most similar to the H. ochraceum RCC1-like proteins, as determined by protein BLAST with the NCBI database, derive exclusively from other myxobacteria such as P. pacifica or Stigmatella aurantiaca, it seems plausible that they form a myxobacterial branch within the RCC1 superfamily. The most striking finding in the H. ochraceum genome was a sequence coding for a protein of the actin family (COG 5277) within the Cytoskeleton category [30]. Only eight years ago, it became obvious that bacterial cells contain a cytoskeleton at least as active as in eukaryotic cells. The bacterial functional and structural homologues of eukaryotic actin are the proteins MreB and ParM [39]. However, the prokaryotic and eukaryotic genes coding for these proteins, or their amino acid sequences, are not related on the sequence level. 
In contrast, the sequence detected in H. ochraceum shows a striking sequence similarity to actin and is the very first report of an actin homolog in a bacterial genome. The protein was called BARP, bacterial actin-related protein. The genomic context of barP, its sequence, the putative structure of the protein and evidence that the gene is expressed were recently described by Wu et al. [30]. Interestingly, several hits for proteins of the actin family are given for Archaea by IMG. Myxobacteria became known for their potential to synthesize a vast array of secondary metabolites. Polyketide synthases (PKS) and nonribosomal peptide synthetases play the key role in the biosynthetic pathways [37]. PKS multidomain complexes are listed in COG 3321 in the category 'Secondary metabolites'. The sum of automatic assignments to this category is not extraordinarily increased for H. ochraceum in comparison to other bacteria (174 hits as compared to, e.g., 136 in P. putida strain F1), and the search for the gene product 'polyketide synthase' does not find any gene for H. ochraceum. However, the genome of H. ochraceum, like the other myxobacteria studied, contains a high number of stretches assigned to COG 3321. The number of hits in this COG is less than 10 in bacteria other than the myxobacteria, Burkholderia mallei, B. pseudomallei, Mycobacterium spp. and members of the genus Streptomyces. The annotations in COG 3321 for H. ochraceum identify the homologues as known domains of PKS (for example acyltransferases or ketoreductases) or of a distinct PKS synthesizing the aglycone precursor of erythromycin B. A search for PKS in different myxobacteria using PCR unfortunately did not include H. ochraceum but it included a strain representing the second species in the genus, H. tepidum [40]. The authors found the highest percentage of as yet undescribed PKS sequences (50% of all newly detected PKS sequences) in the marine myxobacteria (as compared to terrestrial myxobacteria). In H. tepidum, all 10 PKS sequences represented novel PKS genes (threshold 70% identity to known sequences). These findings suggest that an in-depth search for novel genes coding for isoprenoid metabolites in H. ochraceum has a very good prospect of success.
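The novelty criterion cited above (a 70% identity threshold to known sequences) can be expressed as a simple classification rule. The Python sketch below is a hypothetical illustration rather than the method of the cited PCR study: the function name is invented, the identity values are made up, and treating a sequence at exactly 70% identity as known is an assumption, since the text does not say how the boundary case was handled.

```python
def flag_novel(best_identities, threshold=70.0):
    """Flag each sequence as novel if its best identity to any known
    sequence is below the threshold (boundary handling is an assumption)."""
    return [identity < threshold for identity in best_identities]


# Made-up best-hit identities (percent) for four hypothetical PKS fragments.
hypothetical_identities = [45.2, 69.9, 70.0, 88.1]
novel_flags = flag_novel(hypothetical_identities)

# Share of sequences that would count as novel under this rule.
novel_fraction = sum(novel_flags) / len(novel_flags)
```

Under this rule, the statement that all 10 H. tepidum sequences were novel means that each had a best identity to known PKS sequences below the threshold.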
Other promising fields of gene mining in H. ochraceum, as a representative of the marine myxobacteria, most likely are the genes of energy metabolism and the genes coding for the coordinated movement of cells during fruiting body and myxospore formation. This morphogenesis is conducted by cell-to-cell cross-talk, signal transduction and induction of 'social motility' [10,41].