Complete genome sequence of Intrasporangium calvum type strain (7 KIP T)

Intrasporangium calvum Kalakoutskii et al. 1967 is the type species of the genus Intrasporangium, which belongs to the actinobacterial family Intrasporangiaceae. The species is a Gram-positive bacterium that forms a branching mycelium, which tends to break into irregular fragments. The mycelium of this strain may bear intercalary vesicles but does not contain spores. The strain described in this study is an airborne organism that was isolated from a school dining room in 1967. One particularly interesting feature of I. calvum is that its menaquinone type differs from that of all other representatives of the family Intrasporangiaceae. This is the first completed genome sequence from a member of the genus Intrasporangium and also the first sequence from the family Intrasporangiaceae. The 4,024,382 bp long genome with its 3,653 protein-coding and 57 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.


Introduction
Strain 7 KIP T (= DSM 43043 = ATCC 23552 = JCM 3097) is the type strain of the species Intrasporangium calvum, which is the type species of its genus Intrasporangium [1,2]. The generic name is derived from the Latin word intra, meaning within, and the Greek word spora, meaning a seed. The name Intrasporangium was selected to emphasize the possibility of intercalary formation of sporangia in mycelial filaments [3]. Intrasporangium is the type genus of the family Intrasporangiaceae and one of currently nineteen genera in that family [4][5][6]. Strain 7 KIP T was first described in 1967 by Kalakoutskii et al. as an airborne organism, which was isolated under nonselective conditions on plates of meat-peptone agar exposed to the atmosphere of a school dining room [1,7,8]. I. calvum is of particular interest because the type of its menaquinones differs from that of all other representatives of the family Intrasporangiaceae [8].
Here we present a summary classification and a set of features for I. calvum 7 KIP T, together with the description of the complete genomic sequencing and annotation.

Classification and features
The 16S rRNA gene of strain 7 KIP T shares 92. .7% sequence identity with the sequences of the type strains from the other members of the family Intrasporangiaceae [9], with Humihabitans oryzae as the closest relative. The 16S rRNA gene sequence of 7 KIP T is 99% identical to those of the uncultured Intrasporangiaceae clone HT06Ba24, isolated from soil of a former coal gasification site in Gliwice, Poland [10,11], and clone AKAU4164, isolated from uranium-contaminated soil in Oak Ridge, USA [10,12]. The environmental samples database (env_nt) contains the marine metagenome clone 1096626841081 (AACY020552144) from surface water (92% sequence identity with 7 KIP T). The genomic survey sequences database (gss) contains the metagenomic clone 1061002660518 from Floreana island in Punta Cormorant, Ecuador [10], which shares 93% sequence identity with 7 KIP T (as of July 2010). One of the 16S rRNA sequences of strain 7 KIP T was compared using NCBI BLAST under default values (e.g., considering only the best 250 hits) with the most recent release of the Greengenes database [13], and the relative frequencies of taxa and keywords, weighted by BLAST scores, were determined. The five most frequent genera were Janibacter (29.6%), Terrabacter (19.8%), Sanguibacter (8.4%), Dermacoccus (7.7%) and Tetrasphaera (6.2%). The five most frequent keywords within the labels of environmental samples which yielded hits were 'skin' (9.1%), 'human' (4.7%), 'microbiome/temporal/topographical' (4.5%), 'sludge' (4.4%) and 'heel/plantar' (3.1%). The single most frequent keyword within the labels of environmental samples which yielded hits of a higher score than the highest scoring species was 'contaminated/soil/uranium' (33.3%). Figure 1 shows the phylogenetic neighborhood of I. calvum 7 KIP T in a 16S rRNA based tree.
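The score-weighted frequencies reported above can be illustrated with a minimal Python sketch. It assumes each BLAST hit has already been reduced to a (label, score) pair; the function name `weighted_frequencies` and the example hits are hypothetical and are not data or code from this study.

```python
from collections import defaultdict


def weighted_frequencies(hits):
    """Return {label: percent}, with each hit contributing its BLAST score."""
    totals = defaultdict(float)
    for label, score in hits:
        totals[label] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: 100.0 * s / grand_total for label, s in totals.items()}


# Hypothetical (genus, score) pairs for illustration only.
example_hits = [
    ("Janibacter", 300.0),
    ("Janibacter", 300.0),
    ("Terrabacter", 400.0),
]
genus_percent = weighted_frequencies(example_hits)
for genus, percent in sorted(genus_percent.items(), key=lambda kv: -kv[1]):
    print(f"{genus}: {percent:.1f}%")
```

Weighting by score rather than counting hits means that a genus represented by a few high-scoring hits can outrank one with many weak hits. The same function could be applied to keywords taken from sample labels.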
The sequences of the two 16S rRNA gene copies in the genome differ by only one nucleotide from each other and by up to one nucleotide from the previously published sequence generated from DSM 43043 (AJ566282).
Figure 1. ... [14,15] of the 16S rRNA gene sequence under the maximum likelihood criterion [16] and rooted with the type strains of the genera within the family Kineosporiaceae [17]. The branches are scaled in terms of the expected number of substitutions per site. Numbers above branches are support values from 650 bootstrap replicates [18] if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [19] are shown in blue, published genomes in bold [20].
Strain 7 KIP T forms a branching mycelium, which tends to break into irregular fragments, i.e., it is typically nocardioform [1,8]. The mycelium may bear intercalary vesicles that do not contain spores (Table 1, Figure 2) [7,26]. The vesicles of strain 7 KIP T are ovoid and lemon-shaped (5-15 µm in diameter) [1,7]. Several round or oval bodies (1.2-1.5 µm in diameter) may be observed in the vesicles of older cultures [1,7]. The oval bodies in the vesicles of strain 7 KIP T are nonmotile but may undergo Brownian movement (in mature vesicles) [1,7]. No aerial mycelium was observed for strain 7 KIP T [1,7,8]. The mycelial filaments penetrate the agar and form compact, small colonies (1-5 mm in diameter) [1]. These colonies are round, glistening and whitish (cream-whitish in old colonies) when the cells are grown on meat-extract peptone agar [1]. Strain 7 KIP T is aerobic, Gram-positive (Gram-variable in old cultures) and not acid-fast [1].
Strain 7 KIP T is rather fastidious in its nutritional requirements [1]. Growth is seemingly dependent on some unidentified substances present in the peptone used in the growth medium [1]. The strain prefers complex media for growth, especially those containing peptone and yeast extract [1,7].
Accordingly, the growth characteristics on a variety of media such as meat-extract peptone, blood serum broth, oatmeal agar, Sauton medium agar and other media, also in combination with different atmospheric gases and their concentrations, have been studied in detail [1]. Strain 7 KIP T grows between 28°C and 37°C, faster at 37°C than at 28°C, but does not grow at 45°C [1]. It grows slowly on meat-extract peptone medium, and the first signs of macroscopic growth appear after 3-5 days of incubation at 28°C [1]. Strain 7 KIP T does not grow on the majority of synthetic mineral media that are routinely used for actinomycetes [1,7]. Strain 7 KIP T is able to reduce nitrate to nitrite when KNO3 is added to the growth medium (meat-extract peptone broth) [1]. Gelatin is not liquefied when strain 7 KIP T is grown on meat-extract peptone gelatin [1]. Strain 7 KIP T has no antibiotic activity against Micrococcus luteus, Staphylococcus aureus, Escherichia coli, Bacillus subtilis, Candida albicans and Mycobacterium sp. v-5 [1].
Altitude: not reported. Evidence codes - IDA: Inferred from Direct Assay (first time in publication); TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [28]. If the evidence code is IDA, then the property was directly observed by one of the authors or an expert mentioned in the acknowledgements.

Genome sequencing and annotation

Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position [31], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [32]. The genome project is deposited in the Genome OnLine Database [19] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.

Growth conditions and DNA isolation
I. calvum 7 KIP T was grown in medium 65 (GYM Streptomycetes medium) supplemented with one third of BHI (medium 215) [33] at 28°C. DNA was isolated from 0.5-1 g of cell paste using the Qiagen Genomic 500 DNA Kit (Qiagen, Hilden, Germany) following the standard protocol as recommended by the manufacturer, with modification st/LALMP for cell lysis as described by Wu et al. [32].

Genome sequencing and assembly
The genome was sequenced using a combination of Illumina and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website [34]. Pyrosequencing reads were assembled using the Newbler assembler version 2.0.0-PostRelease-11/04/2008 (Roche). The initial Newbler assembly consisted of 28 contigs in two scaffolds and was converted into a phrap assembly by making fake reads from the consensus and collecting the read pairs in the 454 paired end library. Illumina GAii sequencing data (309MB) was assembled with Velvet [35], and the consensus sequences were shredded into 1.5 kb overlapped fake reads and assembled together with the 454 data. The 454 draft assembly was based on 226.2 Mb of 454 draft data and all of the 454 paired end data. The Newbler parameters were -consed -a 50 -l 350 -g -m -ml 20. The Phred/Phrap/Consed software package [36] was used for sequence assembly and quality assessment in the following finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution [34], Dupfinisher, or sequencing cloned bridging PCR fragments with subcloning or transposon bombing (Epicentre Biotechnologies, Madison, WI) [20]. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks (J.-F. Chang, unpublished). A total of 139 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. Illumina reads were also used to correct potential base errors and increase consensus quality using the software Polisher developed at JGI [37]. The error rate of the completed genome sequence is less than one error in 100,000. Together, the combination of the Illumina and 454 sequencing platforms provided 154.9× coverage of the genome. The final assembly contains 847,906 pyrosequencing and 11,758,818 Illumina reads.
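The shredding of consensus sequences into overlapping 1.5 kb fake reads can be sketched in Python as follows. This is only an illustration of the general idea, not the JGI tooling: the function name `shred` is hypothetical, and the 750 bp step (i.e., the amount of overlap) is an assumption, since the overlap used is not stated here.

```python
def shred(consensus, read_len=1500, step=750):
    """Cut a consensus sequence into overlapping pseudo-reads.

    read_len follows the 1.5 kb figure in the text; step is an assumed value.
    """
    if read_len <= 0 or not 0 < step <= read_len:
        raise ValueError("need read_len > 0 and 0 < step <= read_len")
    if len(consensus) <= read_len:
        # Short contigs are kept whole (empty input yields no reads).
        return [consensus] if consensus else []
    starts = list(range(0, len(consensus) - read_len + 1, step))
    # Add one final window so the end of the contig is always covered.
    if starts[-1] + read_len < len(consensus):
        starts.append(len(consensus) - read_len)
    return [consensus[s:s + read_len] for s in starts]


# Hypothetical 4 kb contig for illustration only.
contig = "ACGT" * 1000
fake_reads = shred(contig)
print(len(fake_reads), "fake reads of", len(fake_reads[0]), "bp")
```

Because consecutive windows overlap, an overlap-based assembler such as phrap can rejoin the fake reads and merge them with reads from the other platform.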

Genome annotation
Genes were identified using Prodigal [38] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [39].
The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation was performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [40].

Genome properties
The genome consists of a 4,024,382 bp long chromosome with a 70.7% GC content (Table 3 and Figure 3). Of the 3,710 genes predicted, 3,653 were protein-coding genes and 57 were RNA genes; 90 pseudogenes were also identified. The majority of the protein-coding genes (71.3%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
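As a quick consistency check on these figures, the arithmetic can be written out in a few lines of Python. The input values are those quoted above; the derived counts are approximations, because the 71.3% figure is itself rounded, and the variable names are chosen only for this illustration.

```python
# Values quoted in the text.
GENOME_LENGTH_BP = 4_024_382
TOTAL_GENES = 3_710
PROTEIN_CODING = 3_653
RNA_GENES = 57
FUNCTION_ASSIGNED_FRACTION = 0.713  # rounded percentage from the text

# Protein-coding and RNA genes should add up to the total gene count.
assert PROTEIN_CODING + RNA_GENES == TOTAL_GENES

# Approximate split of protein-coding genes (derived, not reported values).
with_function = round(PROTEIN_CODING * FUNCTION_ASSIGNED_FRACTION)
hypothetical = PROTEIN_CODING - with_function

# Gene density in genes per megabase.
genes_per_mb = TOTAL_GENES / (GENOME_LENGTH_BP / 1_000_000)

print(f"approx. genes with putative function: {with_function}")
print(f"approx. hypothetical proteins: {hypothetical}")
print(f"gene density: {genes_per_mb:.1f} genes per Mb")
```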