Complete genome sequence of the motile actinomycete Actinoplanes missouriensis 431T (= NBRC 102363T)

Actinoplanes missouriensis Couch 1963 is a well-characterized member of the genus Actinoplanes, which is of morphological interest because its members typically produce sporangia containing motile spores. The sporangiospores are motile by means of flagella and exhibit chemotactic properties. It is of further interest that members of Actinoplanes are prolific sources of novel antibiotics, enzymes, and other bioactive compounds. Here, we describe the features of A. missouriensis 431T, together with the complete genome sequence and annotation. The 8,773,466 bp genome contains 8,125 protein-coding and 79 RNA genes.


Introduction
Strain 431 T [= NBRC 102363 T = DSM 43046 T = ATCC 14538 T and other culture collections] is the type strain of the species Actinoplanes missouriensis [1,2], which is a well-characterized member of the genus Actinoplanes [3]. The genus is of morphological interest because its members typically produce spherical, subspherical, cylindrical, or very irregular sporangia arising from vegetative mycelia [1,4]. Sporangiospores, which are released following the immersion of sporangia in water, are motile by means of polar or peritrichous flagella [5] and display a positive chemotactic response to a number of amino acids, aromatic compounds, sugars and inorganic ions [6,7]. Recently, the flagellin gene of various Actinoplanes strains was successfully amplified and sequenced to reveal the evolutionary relationships between flagellar genes [8]. As members of the genus Actinoplanes, including A. missouriensis, produce a variety of antibiotics, enzymes and other bioactive compounds [9-12], the genomes of Actinoplanes species are potentially useful genetic resources for discovering secondary metabolites and enzymes. The genome analysis of the acarbose producer Actinoplanes sp. SE50/110 was most recently reported [13]. Here, we present a summary classification and a set of features for A. missouriensis strain 431 T, together with a description of the complete genome sequencing and annotation.

Classification and features
A. missouriensis strain 431 T was originally isolated from barnyard soil near Hamilton, Missouri, USA using the pollen-baiting technique and was first described by Couch in 1963 [1]. Couch also isolated additional strains of A. missouriensis from ten soil collections obtained from the Mississippi Valley to the West Coast. The range of 16S rRNA gene sequence similarities between strain 431 T (= NBRC 102363 T ) (AB711914) and valid members of the genus Actinoplanes was 95.7-97.9%. The highest sequence similarities were to Actinoplanes utahensis NBRC 13244 T . Figure 1 shows the phylogenetic neighborhood of A. missouriensis 431 T in a 16S rRNA gene-based tree. The sequences of five of the six 16S rRNA gene copies in the genome of A. missouriensis 431 T are identical (one sequence differed by 4 nucleotides), but differ by 55 nucleotides (3.8%) from the previously published 16S rRNA gene sequence of NBRC 13243 T (AB037008). The differences between the genome data and the reported 16S rRNA gene sequence are likely due to sequencing errors in the previously reported sequence data. The 16S rRNA sequence of strain 431 T did not match significantly with any 16S rRNA gene sequences from environmental genomic samples and surveys available at the NCBI BLAST server (March 2012).
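The nucleotide-difference figures quoted above (for example, 55 differences reported as 3.8%) are simple ratios over the compared alignment positions. The following is a minimal sketch of that arithmetic, not the authors' procedure: the function names are hypothetical, the input is assumed to be two pre-aligned sequences of equal length, and skipping gapped columns is an assumption of this sketch.

```python
from typing import Tuple


def count_differences(seq_a: str, seq_b: str) -> Tuple[int, int]:
    """Return (mismatches, compared_positions) for two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped. This is an
    assumption of the sketch, not necessarily how the paper counted them.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    return mismatches, compared


def percent_difference(mismatches: int, compared: int) -> float:
    """Express a mismatch count as a percentage of the compared positions."""
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # Toy alignment, for illustration only (not real 16S rRNA data).
    mism, comp = count_differences("ACGT-A", "ACCTGA")
    print(mism, comp, percent_difference(mism, comp))
```

Under these assumptions, 55 differences at 3.8% would correspond to roughly 1,450 compared positions, which is consistent with a near full-length 16S rRNA gene comparison.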

Figure 1.
Phylogenetic tree highlighting the position of A. missouriensis 431 T relative to Actinoplanes sp. strain SE50/110 and other type strains within the genus Actinoplanes. The tree was inferred from 1,351 aligned characters [14,15] of the 16S rRNA gene sequence under the maximum likelihood criterion [16] and rooted with the type strain of the neighboring genus Dactylosporangium. Only bootstrap values above 50% are shown (1,000 resamplings) at branching points. Lineages with type strain genome sequencing projects registered in GOLD [17] are labeled with an asterisk, those also listed as 'Complete and Published' with two asterisks.

Standards in Genomic Sciences

The color of the substrate mycelia of strain 431 T is ochraceous salmon on Czapek agar. The strain does not form aerial mycelium. On Czapek agar, the strain produces a soluble pale lavender pigment [1]. Optimum growth occurs at 28°C. Strain 431 T produces globose spores (1-1.2 µm) arranged in irregular coils within a terminal sporangium. Sporangia are globose to subglobose in shape and 6 to 14 µm in diameter (Figure 2) [1]. Sporangia release motile spores, called zoospores, after unwrapping of the membranous sheath covering sporangia (Table 1) [5]. Released zoospores have chemotactic properties for several substrate types, including sugars, amino acids, aromatic compounds and mineral ions [6,7]. Flagella of the zoospores consist of a 44-kDa flagellar protein (FliC) [5] encoded by the fliC gene, which has been applied as a molecular marker in a taxonomic study of the genus Actinoplanes [8]. Strain 431 T utilizes D-xylose, L-arabinose, D-glucose, D-fructose, D-mannose, L-rhamnose, D-mannitol, sucrose, mannose, dextrin, D-galactose, D-lactose, methyl β-D-glucoside and L-rhamnose (1%, w/v), but not myo-inositol, raffinose, cellulose, adonitol, D-arabinose, D-melezitose, methyl α-D-glucoside, D-ribose or D-sorbitol (1%, w/v) [4,29].
Strain 431 T also utilizes p-hydroxybenzoic acid, sodium acetate and sodium fumarate (0.1%, w/v), but not m-hydroxybenzoic acid, sodium succinate, syringic acid, or vanillin (0.1%, w/v) [4,29]. Strain 431 T degrades chitin, DNA, elastin, lecithin, RNA and tyrosine (26), and also actively decomposes natural rubber [32] and flavonoids, such as quercetin, rutin and hesperidin [33]. Strain 431 T produces glucose isomerase (xylose isomerase) with a molecular weight of approximately 8,000 daltons [34] and also produces 6-alkyl-4-O-dihydrogeranyl-2-methoxyhydroquinones as novel phenolic lipids [35]. Antimicrobial activity against Streptomyces murinus ISP 5091 is positive, but is negative against Aspergillus niger LIV 131, Bacillus subtilis NCIB 3610 and Staphylococcus aureus NCTC 8532 [29]. Additional physiological and drug susceptibility data are described in detail elsewhere [29].

Evidence codes. IDA: Inferred from Direct Assay (first time in publication); TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [31]. If the evidence code is IDA, then the property was directly observed by one of the authors or an expert mentioned in the acknowledgements.

Genome sequencing information

Genome project history
A consortium consisting of universities, research institutions and private companies in Japan was organized to accelerate cooperative research and development utilizing genome sequences of actinomycetes. The consortium successfully accomplished the genome projects of Kitasatospora setae NBRC 14216 T [38] and A. missouriensis 431 T (= NBRC 102363 T ), as reported here. The complete genome sequences of these strains have been deposited in the INSDC database and are also available from the DOGAN genome database [39]. A summary of the genome sequencing project information of A. missouriensis is shown in Table 2.

Growth conditions and DNA isolation
A. missouriensis strain 431 T (= NBRC 102363 T ) was grown in NBRC 231 medium (Maltose-Bennett's Agar) at 28°C. DNA was isolated from 3.76 g of mycelial paste using the CTAB method and fragmented with the HydroShear device (Genomic Solutions), as recommended by the manufacturer.

Genome sequencing and assembly
The genome sequence of A. missouriensis 431 T was determined using a whole-genome shotgun sequencing approach together with the Sanger method. DNA shotgun libraries with average insert sizes of 1.6 and 6 kbp were constructed in pUC118, and a library with an average insert size of 36 kbp was constructed in pCC1FOS (Epicentre). A total of 96,384 reads were trimmed at a threshold quality value of 20 and assembled using the Phrap and CONSED assembly tools [40,41]. A total of 141 gaps (18,532 bp in total) were closed by sequencing PCR products, and 236 low-quality regions were re-sequenced by primer walking. In the final assembly step, we confirmed that each base of the genome was sequenced from multiple clones, either in both directions with a Phrap quality score of >70 or in a single direction with a Phrap quality score of >40. The contig alignment was validated using the Argus Optical Mapping System (OpGen, Inc.).
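To make the quality thresholds above concrete, the sketch below converts a quality value to an error probability and trims low-quality read ends. It assumes that the quality values follow the usual Phred scale (Q = -10 log10 P), and the end-trimming rule is a deliberately simple stand-in, not the algorithm used by Phrap or by the authors; all function names are hypothetical.

```python
from typing import List, Tuple


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-q / 10)


def trim_read(bases: str, quals: List[int], threshold: int = 20) -> Tuple[str, List[int]]:
    """Trim bases below `threshold` from both ends of a read.

    Simplified illustration: internal low-quality bases are kept, and only
    the leading and trailing runs below the threshold are removed.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    start, end = 0, len(quals)
    while start < end and quals[start] < threshold:
        start += 1  # drop low-quality bases at the 5' end
    while end > start and quals[end - 1] < threshold:
        end -= 1  # drop low-quality bases at the 3' end
    return bases[start:end], quals[start:end]


if __name__ == "__main__":
    for q in (20, 40, 70):
        print(q, phred_to_error_probability(q))
    # Toy read, for illustration only.
    print(trim_read("ACGTAC", [5, 10, 30, 40, 25, 8]))
```

On this scale, a quality value of 20 corresponds to an expected error rate of 1 in 100, while scores above 40 and above 70 correspond to fewer than 1 error in 10^4 and 10^7 bases, respectively.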

Table 2 (excerpt). Genome sequencing project information
GOLD ID: Gc02182
Project relevance: Biotechnological

Genome annotation
Gene prediction was performed using Glimmer3 [42] and tRNAscan-SE [43], followed by manual inspection of each translation start site using the Frameplot program [44]. Similarity search results against the NCBI nr and Pfam databases were used for functional prediction. Manual functional annotation was performed within an in-house platform developed by J.I. (unpublished).

Genome properties
The genome of strain 431 T consists of one circular chromosome with a length of 8,773,466 bp and a G+C content of 70.8% (Table 3 and Figure 3). A. missouriensis DSM 43046 T was reported to contain a linear plasmid, pAM1 [33]; however, the corresponding plasmid was not found in strain 431 T (= NBRC 102363 T ) by CHEF electrophoresis analysis. It is possible that the linear plasmid was cured from strain 431 T by repeated subculturing in the laboratory. In comparison to other members of the family Micromonosporaceae, such as Salinispora and Micromonospora, A. missouriensis has a larger genome size. Of the total of 8,204 predicted genes, 8,125 were protein-coding genes and 79 were RNA genes. More than half of the protein-coding genes (4,539, 55.9%) were assigned a putative function, while the remaining predicted genes were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.

a) The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.

n/a, not available. a) The total is based on the total number of protein-coding genes in the annotated genome.
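The summary statistics in this section (G+C content, gene totals and the fraction of genes with a putative function) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, the example sequence is a toy string rather than the genome, and ignoring ambiguous bases in the denominator is an assumption of this sketch.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted in the denominator;
    this handling of ambiguity codes is an assumption of the sketch.
    """
    seq = sequence.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt


# Counts as reported in the text.
PROTEIN_CODING = 8125
RNA_GENES = 79
WITH_FUNCTION = 4539

total_genes = PROTEIN_CODING + RNA_GENES
function_pct = round(100.0 * WITH_FUNCTION / PROTEIN_CODING, 1)

if __name__ == "__main__":
    print(total_genes)    # sum of protein-coding and RNA genes
    print(function_pct)   # percentage of protein-coding genes with a putative function
    print(gc_content("GGCCAT"))  # toy sequence, not genome data
```

Applied to the full chromosome sequence, `gc_content` would be expected to return a value close to the reported 70.8%, provided that ambiguous bases are handled the same way as in the original analysis.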