High quality draft genome sequence of Streptomyces sp. strain AW19M42 isolated from a sea squirt in Northern Norway

Here we report the 8 Mb high quality draft genome of Streptomyces sp. strain AW19M42, together with specific properties of the organism and the generation, annotation and analysis of its genome sequence. The genome encodes 7,727 putative open reading frames, of which 6,400 could be assigned to COG categories. In addition, 62 tRNA genes and 8 rRNA operons were identified. The genome harbors several gene clusters involved in the production of secondary metabolites. Functional screening of the isolate was positive for several enzymatic activities, and some candidate genes coding for those activities are listed in this report. We find that this isolate shows biotechnological potential and is an interesting target for bioprospecting.


Introduction
Members of the filamentous, Gram-positive genus Streptomyces, which belongs to the phylum Actinobacteria [1], are attractive organisms for bioprospecting, as this is the largest antibiotic-producing genus discovered in the microbial world so far [2]. These species have also been exploited for heterologous expression of a variety of secondary metabolites [3]. Additionally, they harbor genes coding for enzymes that are applicable in industry and biotechnology [4,5]. Since the first complete Streptomyces genome was published [6], the genomes of a number of strains isolated from terrestrial environments have been reported [7][8][9][10][11]. Genomic investigations of Streptomyces from marine sources have, however, only recently begun [12][13][14][15][16]. Here, we present the draft genome sequence of Streptomyces sp. strain AW19M42, isolated from a marine source, together with a description of its genome properties and annotation. Results from functional enzyme screening of the bacterium are also reported.

Classification and features
Streptomyces sp. strain AW19M42 was identified in a biota sample collected from the internal organs of a sea squirt (class Ascidiacea, subphylum Tunicata, phylum Chordata). The tunicate was collected using an Agassiz trawl at a depth of 77 m in Hellmofjorden, in the sub-Arctic region of Norway (Table 1). The trawling was done during a research cruise with R/V Jan Mayen in April 2010. The bacterium was isolated during four weeks of incubation at 4-15°C on humic acid-containing agar medium that is selective for the growth of actinomycetes [29,30]. For isolation and nucleic acid extraction, the bacterium was cultivated in autoclaved medium containing 0.1% (w/v) malt extract, 0.1% (v/v) glycerol, 0.1% (w/v) peptone, 0.1% (w/v) yeast extract and 2% (w/v) agar in 50% (v/v) natural sea water and 50% (v/v) distilled water, pH 8.2 [29]. The gene encoding 16S rRNA was amplified using two universal primers, 27F (5′-AGAGTTTGATCCTGGCTCAG) and 1492R (5′-GGTTACCTTGTTACGACTT) [31], in a standard Taq polymerase-driven PCR (VWR) on crude genomic DNA prepared using InstaGene Matrix (BioRad). Following PCR purification with PureLink PCR Purification (Invitrogen), sequencing was carried out with the BigDye terminator kit version 3.1 (Applied Biosystems) and a universal 515F primer (5′-GTGCCAGCMGCCGCGGTAA) [32]. A BLAST [33] homology search with the 16S rRNA sequence indicated that the isolate belongs to the genus Streptomyces, in the family Streptomycetaceae of the Actinobacteria. A phylogenetic tree was reconstructed from the 16S rRNA gene sequence together with other Streptomyces homologues (Figure 1) using the MEGA 5.10 software suite [34]. The evolutionary history was inferred using the UPGMA method [35], and the evolutionary distances were computed using the Maximum Composite Likelihood method [36]. The phylogenetic analysis confirmed that isolate AW19M42 belongs to the genus Streptomyces.
The closest neighbor with a reported, complete genome sequence is Streptomyces griseus subsp. griseus [7]; however, the phylogenetic tree indicates that Streptomyces sp. strain AW19M42 belongs to a closely related but separate clade. Draft genomes have not previously been reported for this clade.
Evidence codes: IDA, Inferred from Direct Assay (first time in publication); TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [28]. If the evidence code is IDA, then the property was directly observed for a live isolate by one of the authors or by an expert mentioned in the acknowledgements.
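The UPGMA clustering step used for the 16S rRNA tree can be illustrated with a minimal Python sketch. It is not the MEGA implementation, and the four taxa and their pairwise distances are hypothetical values chosen only to show how the closest clusters are merged and how distances are averaged by cluster size.

```python
from itertools import combinations


def upgma(labels, dist):
    """Cluster `labels` by UPGMA.

    `dist` maps frozenset({a, b}) -> pairwise distance. The result is a
    nested tuple (left, right, height); leaves are plain label strings.
    """
    # Active clusters, each mapped to the number of leaves it contains.
    clusters = {label: 1 for label in labels}
    d = dict(dist)
    while len(clusters) > 1:
        # Find the closest pair of active clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        # The new node sits at half the distance between the two clusters.
        merged = (a, b, d[frozenset((a, b))] / 2.0)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance to every remaining cluster is the size-weighted mean.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    (tree,) = clusters
    return tree


# Hypothetical distances between four placeholder taxa (not data from this study).
toy_distances = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 6,
    frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 10,
    frozenset(("C", "D")): 10,
}
tree = upgma(["A", "B", "C", "D"], toy_distances)
```

In this toy case A and B are joined first, C joins that pair next, and D attaches last at the root. UPGMA assumes a constant rate of change across lineages, so the resulting tree is ultrametric.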

Genome sequencing and annotation
The organism was selected for genome sequencing on the basis of its phylogenetic position. The genome project is part of a Norwegian bioprospecting project called Molecules for the Future (MARZymes), which aims to search Arctic and sub-Arctic regions for marine bacterial isolates that might serve as producers of novel secondary metabolites and enzymes. High quality genomic DNA for sequencing was isolated with the GenElute Bacterial Genomic DNA Kit (Sigma) according to the protocol for extraction of nucleic acids from Gram-positive bacteria. A 700 bp paired-end library was prepared and sequenced using HiSeq 2000 (Illumina) paired-end technology (Table 2). This generated 13.94 million paired-end reads that were assembled into 670 contigs larger than 500 bp using the CLC Genomics Workbench 5.0 software package [37]. Gene prediction was performed using Glimmer 3 [38], and gene functions were annotated using an in-house genome annotation pipeline.
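As a rough illustration of how sequencing depth relates to the read count and genome size given here, the sketch below applies the usual estimate: coverage = number of reads × read length / genome size. The read length is not stated in this section, so the 100 bp used below is an assumption, as is treating the 13.94 million figure as individual reads rather than read pairs. Table 2 remains the authoritative source for the actual sequencing statistics.

```python
def estimated_coverage(n_reads, read_length_bp, genome_size_bp):
    """Return the expected average sequencing depth (fold coverage)."""
    return n_reads * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 8_008_851      # total genome size reported in this paper
N_READS = 13_940_000            # reads reported in this paper
ASSUMED_READ_LENGTH_BP = 100    # ASSUMPTION: not stated in the text

coverage = estimated_coverage(N_READS, ASSUMED_READ_LENGTH_BP, GENOME_SIZE_BP)
print(f"Estimated coverage under these assumptions: {coverage:.0f}x")
```

If the figure counts read pairs instead of single reads, the estimate doubles, so the function is best used with the values confirmed from the sequencing run.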

Genome properties
The genome has a total size of 8,008,851 bp and a GC content of 70.57% (Table 3), similar to that of other sequenced Streptomyces isolates. A total of 7,727 coding DNA sequences (CDSs) were predicted (Table 3). Of these, 6,400 could be assigned to a COG number (Table 4). In addition, 62 tRNAs and 8 copies of the rRNA operon were identified.
The total is based on the total number of protein-coding genes in the annotated genome.
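The summary statistics above are simple ratios. The following sketch shows how a GC percentage can be computed from a nucleotide sequence and how the share of CDSs with a COG assignment follows from the two counts reported here. The short sequence is a toy placeholder, not part of the AW19M42 assembly, and ignoring ambiguous bases is a choice made for this sketch.

```python
def gc_content(sequence):
    """Return the GC percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N, is ignored in both numerator and denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def percent(part, total):
    """Return `part` as a percentage of `total`."""
    return 100.0 * part / total


# Toy sequence for illustration only.
toy_gc = gc_content("GGCCAT")

# Counts reported in this paper: 6,400 of 7,727 CDSs have a COG assignment.
cog_share = percent(6_400, 7_727)
print(f"CDSs with a COG assignment: {cog_share:.1f}%")
```

For a draft assembly, the same `gc_content` function would be applied to the concatenated contigs, so the result depends on which contigs are included.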
All putative protein-coding sequences were assigned KEGG orthology [39] and mapped onto pathways using the KEGG Automatic Annotation Server (KAAS) [40]. The analysis revealed that Streptomyces sp. strain AW19M42 harbors several genes related to the biosynthesis of secondary metabolites. We identified genes that map to the streptomycin biosynthesis pathway: glucose-1-phosphate thymidylyltransferase (EC 2.7.7.24), dTDP-glucose 4,6-dehydratase (EC 4.2.1.46) and dTDP-4-dehydrorhamnose reductase (EC 1.1.1.133). Several genes also map to the pathways for biosynthesis of siderophore group nonribosomal peptides, biosynthesis of type II polyketide products and polyketide sugar unit biosynthesis. Interestingly, two clusters, each comprising five genes, both mapped to the pathway for biosynthesis of the type II polyketide backbone. These gene clusters comprise genes STREP_3146-3150 and STREP_4370-4374. This suite of genes may contribute to a distinct profile of secondary metabolite production.
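When working with the annotation, the two locus tag ranges given above usually need to be expanded into individual gene identifiers. The helper below is a hypothetical convenience function, not part of the annotation pipeline used in this study. It assumes that the ranges are inclusive and that the tags are numbered consecutively in the form `STREP_` followed by a number.

```python
import re


def expand_locus_range(spec):
    """Expand a range such as 'STREP_3146-3150' into individual locus tags.

    ASSUMPTION: the range is inclusive and tags are consecutively numbered.
    """
    match = re.fullmatch(r"([A-Za-z0-9]+)_(\d+)-(\d+)", spec)
    if match is None:
        raise ValueError(f"Unrecognised locus tag range: {spec!r}")
    prefix, start, end = match.group(1), match.group(2), match.group(3)
    width = len(start)  # keep any zero padding used in the first tag
    return [f"{prefix}_{i:0{width}d}" for i in range(int(start), int(end) + 1)]


# The two type II polyketide clusters named in the text.
pks_clusters = {
    spec: expand_locus_range(spec)
    for spec in ("STREP_3146-3150", "STREP_4370-4374")
}
for spec, tags in pks_clusters.items():
    print(spec, "->", ", ".join(tags))
```

Each range expands to five tags, consistent with the five genes per cluster described in the text.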

Conclusion
The 8 Mb draft genome of Streptomyces sp. strain AW19M42, originally isolated from a sea squirt in the sub-Arctic region of Norway, has been deposited at ENA/DDBJ/GenBank under accession number CBRG000000000. The isolate tested positive in screens for several enzymatic activities that are applicable in biotechnology, and candidate genes coding for these activities were identified in the genome. Streptomyces sp. strain AW19M42 will serve as a source of functional enzymes and other bioactive chemicals in future bioprospecting projects.