Draft genome sequence of Arthrospira platensis C1 (PCC9438)

Arthrospira platensis is a cyanobacterium that is cultivated outdoors on a large commercial scale as food for humans and animals. It can be grown in monoculture under highly alkaline conditions, making it attractive for industrial production. Here we describe the complete genome sequence of A. platensis strain C1 and its annotation. The A. platensis C1 genome contains 6,089,210 bp, including 6,108 protein-coding genes and 45 RNA genes, and no plasmids. The genome information has been used for further comparative analysis, particularly of metabolic pathways, photosynthetic efficiency and barriers to gene transfer.


Introduction
Arthrospira platensis is a cyanobacterium that contains large amounts of proteins, vitamins, lipids and pigments [1]. It is widely used as a human food and an animal feed. In addition, its extracts can enhance the immune system and promote health [1,2]. Because their natural habitat is soda lakes, Arthrospira spp. are cultivated under highly alkaline conditions in open ponds on a large commercial scale. These conditions minimize, and sometimes prevent, contamination of the culture [3]. Unlike many plant food products, whose nutritional value deteriorates rapidly at high temperatures, Arthrospira products retain their nutritional value even when the cells are processed at high temperatures [4]. In contrast to many cyanobacteria, there is no report of toxicity of Arthrospira to humans, animals or the environment [4]. The genome sequences of Arthrospira spp. have been the subject of immense interest because of the beneficial properties of these organisms in the biotechnology and environmental fields [5,6]. A. platensis C1 is the fifth complete genome report for a member of the genus Arthrospira. A. platensis C1 has long been used as a laboratory strain for physiological and molecular studies because it does not glide, which allows single colonies to form. This property facilitates studies at the molecular level and strain improvement, particularly the development of a transformation system. A successful transformation system for Arthrospira has not yet been established; the genome sequence may therefore help to identify barriers responsible for the instability of the transformants. Here, we present a summary classification and a set of features of A. platensis C1 together with the complete genomic sequence and its annotation.

Classification and features
Historically, the classification of the Arthrospira and Spirulina genera (Figure 1) was a subject of controversy. For commercial strains, the names Arthrospira and Spirulina were used interchangeably. Arthrospira and Spirulina have similar morphological characters: both are cylindrical, multicellular, filamentous cyanobacteria with an open, left-handed helical shape (Table 1). They both belong to the Phylum Cyanobacteria, Order Oscillatoriales and Family Oscillatoriaceae [13]. However, they can be differentiated by the presence of cell septa: Arthrospira possess septa, whereas Spirulina do not [14].
Evidence codes (Table 1): IDA, Inferred from Direct Assay; TAS, Traceable Author Statement; NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [15]. If the evidence code is IDA, then the property was directly observed for a living isolate by one of the authors or an expert mentioned in the acknowledgements.
Figure 1. The phylogenetic tree of 51 cyanobacterial concatenated ribosomal proteins. The main topology is in agreement with earlier inferences of the phylogeny of this taxon with the 16S rRNA based on the GTR+G+I substitution model [7]. The tree was built using the Neighbor-Joining method, with 1,000 re-samplings to calculate bootstrap values. A. platensis C1 clustered together with other strains in the order Oscillatoriales and was clearly separated from related species in the order Nostocales. The conserved, concatenated ribosomal protein phylogenetic tree indicated the monophyly of the genus Arthrospira.
Standards in Genomic Sciences, http://standardsingenomics.org

Chemotaxonomy
Arthrospira platensis C1, designated Arthrospira sp. PCC 9438 by the Pasteur Institute, Paris, France, on the basis of its morphology, was originally classified as Spirulina platensis C1 [1,3]. This reclassification is in agreement with the presence of γ-linolenic acid (GLA) in its fatty acid profile: GLA is a chemotaxonomic marker of Arthrospira and is absent in Spirulina [16][17][18].
The phylogenetic tree of cyanobacteria was reconstructed with evolutionary information embedded in conserved, concatenated ribosomal proteins. [...] A. platensis Paraca and A. platensis C1 (this study), respectively). In this study, the genome of A. platensis C1 has been sequenced, and the results provide data that can be used for the further study of its biological functions.
The genome project is deposited in the GenBank Database (NCBI ID 67617 and accession number AFXD00000000). DNA sequencing and finishing were performed in collaboration between the Genome Institute, BIOTEC-NSTDA, Thailand and Kazusa DNA Research Institute, Japan. The genome assembly and annotation steps were performed in collaboration between KMUTT, the Genome Institute, BIOTEC-NSTDA, Chiang Mai University, Thailand and Kazusa DNA Research Institute. The summary of the project information is shown in Table 2.

Genome sequencing and assembly
The genome of A. platensis C1 was sequenced using a hybrid approach that combined 454 Life Sciences pyrosequencing on the Genome Sequencer (GS) FLX System with BigDye Terminator v3.1 cycle sequencing (Sanger). Pyrosequencing reads were assembled using the Newbler de novo sequence assembly software version 2.0.0 (Roche). The Phred/Phrap/Consed software package [20] was used for sequence assembly and quality assessment in the finishing process. The remaining gaps between contigs were closed by custom primer walking or PCR amplification, followed by editing in Consed. The final assembly contains 739,684 reads from pyrosequencing and 45,959 reads from Sanger sequencing, resulting in 28× coverage of the genome. Using A. maxima CS-328 contigs [21] as a reference, the A. platensis C1 circular genome was constructed with a total size of 6.08 Mb, in 1 scaffold with 63 gaps.
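The relationship between read counts, genome size and fold coverage reported above can be illustrated with a short calculation. This is a minimal sketch, not part of the original analysis: the function names are hypothetical, and because the assembly mixes 454 and Sanger reads of different lengths, the mean read length it implies is only a rough average rather than a value reported in this study.

```python
"""Back-of-the-envelope coverage arithmetic (illustrative sketch only)."""

GENOME_SIZE_BP = 6_089_210   # genome size reported in the text
PYRO_READS = 739_684         # 454 pyrosequencing reads in the final assembly
SANGER_READS = 45_959        # Sanger reads in the final assembly
REPORTED_COVERAGE = 28.0     # fold coverage reported in the text


def fold_coverage(total_read_bases: float, genome_size_bp: int) -> float:
    """Average depth = total sequenced bases / genome length."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_read_bases / genome_size_bp


def implied_mean_read_length(coverage: float, genome_size_bp: int, n_reads: int) -> float:
    """Invert the coverage formula to get the average read length it implies."""
    if n_reads <= 0:
        raise ValueError("read count must be positive")
    return coverage * genome_size_bp / n_reads


# Hypothetical helper values derived only from the numbers quoted in the text.
total_reads = PYRO_READS + SANGER_READS
mean_len = implied_mean_read_length(REPORTED_COVERAGE, GENOME_SIZE_BP, total_reads)

if __name__ == "__main__":
    print(f"total reads: {total_reads}")
    print(f"implied mean read length: {mean_len:.0f} bp")
```

Feeding the implied total base count back into `fold_coverage` returns the reported 28×, which is a quick consistency check on the three published figures.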

Genome annotation
In agreement with the result from the Integrated Microbial Genomes Expert Review (IMG-ER) platform [22], all the genes in the A. platensis C1 genome were identified using the GLIMMER 3.0 program in our Microbial Inhouse Annotation Pipeline. Initial criteria for automated functional assignment required a minimum of 50% residue identity with over 80% length match for BLASTP alignments to the NCBI nonredundant database, InterPro, SwissProt, SignalP, COG, and KEGG databases. The tRNAscan-SE tool [23] was used to find tRNA genes, whereas ribosomal RNAs were found using RNAmmer [24]. Additional gene prediction analysis and functional annotation were performed within the IMG-ER platform [22], followed by a round of manual curation that included confirmation with proteomic data [25,26].
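The automated assignment criteria above (at least 50% residue identity and more than 80% length match) can be expressed as a simple filter. The sketch below is hypothetical and is not the authors' pipeline: the `BlastHit` record, the function names and the example identifiers are invented for illustration, and "length match" is assumed here to mean alignment length divided by query length, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_IDENTITY_PCT = 50.0      # "minimum of 50% residue identity"
MIN_LENGTH_FRACTION = 0.80   # "over 80% length match" (strictly greater)


@dataclass(frozen=True)
class BlastHit:
    """Hypothetical container for one BLASTP alignment."""
    query_id: str
    subject_id: str
    percent_identity: float   # 0 to 100
    alignment_length: int     # aligned columns
    query_length: int         # full length of the query protein


def length_match(hit: BlastHit) -> float:
    """Assumed definition: fraction of the query covered by the alignment."""
    return hit.alignment_length / hit.query_length


def passes_assignment_criteria(hit: BlastHit) -> bool:
    """Apply the identity and length-match thresholds quoted in the text."""
    return (hit.percent_identity >= MIN_IDENTITY_PCT
            and length_match(hit) > MIN_LENGTH_FRACTION)


def filter_hits(hits: Iterable[BlastHit]) -> List[BlastHit]:
    """Keep only hits that qualify for automated functional assignment."""
    return [h for h in hits if passes_assignment_criteria(h)]


# Toy data with made-up identifiers, for demonstration only.
example_hits = [
    BlastHit("orf_0001", "subject_A", 72.5, 290, 300),  # passes both thresholds
    BlastHit("orf_0002", "subject_B", 45.0, 295, 300),  # identity too low
    BlastHit("orf_0003", "subject_C", 88.0, 150, 300),  # length match too low
]
kept = filter_hits(example_hits)
```

Hits that fail either threshold would be left for the later IMG-ER and manual curation steps rather than receiving an automated functional assignment.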

Genome properties
The genome of A. platensis C1, with a total of 6.08 Mbp (6,089,210 base pairs), contains 44.68% G+C (Table 3) and, in agreement with the findings of Fujisawa et al. (2010) [6], no plasmid DNA. Our results confirm the presence of a single genome in A. platensis C1. The A. platensis C1 circular genome of 6.08 Mbp was compared with the A. maxima CS-328 and A. platensis NIES-39 genomes [6]. A total of 6,153 open reading frames (ORFs) were predicted. Of these, 3,757 were annotated as coding for proteins of known function and 45 as RNA genes (6 rRNA and 39 tRNA). The distribution of genes into COGs is presented in Table 4.
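For readers who want to reproduce summary statistics of this kind, the sketch below shows how a G+C percentage is computed from a sequence and checks that the gene counts quoted above are mutually consistent. It is an illustrative example only: the function name is hypothetical, the test sequence is a toy string rather than genome data, and ambiguous bases are simply ignored, which is an assumption rather than the method used for Table 3.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Gene counts quoted in the text.
TOTAL_PREDICTED = 6_153   # predicted open reading frames / genes
RRNA_GENES = 6
TRNA_GENES = 39

RNA_GENES = RRNA_GENES + TRNA_GENES           # 45 RNA genes
PROTEIN_CODING = TOTAL_PREDICTED - RNA_GENES  # 6,108, as stated in the abstract

if __name__ == "__main__":
    print(gc_percent("GGCCAATT"))  # toy sequence, half G+C
    print(RNA_GENES, PROTEIN_CODING)
```

The tally shows that the 6,153 predicted genes split into the 6,108 protein-coding genes and 45 RNA genes given in the abstract.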

Genomic comparison
The genome sequences from all Arthrospira strains (A. maxima CS-328, A. platensis NIES-39, A. platensis PCC8005, A. platensis Paraca, and A. platensis C1) provided the data for a comparative genome analysis of these strains. Based on the genome statistics comparison (Table 5), the Arthrospira spp. genomes are highly conserved in sequence. Interestingly, the number of signal peptides in A. platensis C1 has been reported to be the lowest among the Arthrospira spp. Further study of these primary sequences may reveal the importance of signal peptides in protein targeting in this cyanobacterial species compared with other laboratory and commercial strains.

Insights from the genome sequence
Analysis of the genomes of Arthrospira spp. compared with those of other cyanobacteria confirmed that the Arthrospira are non-nitrogen-fixing, filamentous cyanobacteria. They are nontoxic, which might be due to the absence of polyketide and nonribosomal peptide-producing genes. Genes involved in gliding motility in A. platensis NIES-39 (a vigorously motile strain utilizing type IV pili as the major mechanism for gliding) [6] have been compared with those in A. platensis C1 (a nonmotile strain). Interestingly, all the genes involved in type IV pili are present in the A. platensis C1 genome; the reason for the lack of gliding ability in A. platensis C1 therefore remains unknown. Further studies are needed to elucidate this mechanism of cell motility.

Cellular defense mechanism
Like other genomes of Arthrospira spp., the genome of A. platensis C1 contains highly interspersed repetitive sequences that account for 9% of its genome. Genome comparisons among cyanobacteria revealed unusual genes involved in defense mechanisms, including restriction and modification enzymes, group II introns, insertion elements and CRISPR. These genes are considered to be major barriers for stable transformation. Therefore, these genes have been targeted for the development of a stable gene transformation system for Arthrospira spp. Because of the nonmotile property of A. platensis C1, single colonies can be selected and used for further strain improvement and genetic manipulation experiments. By combining a stable transformation system with the advantage of colony-forming ability, we should be able to harness A. platensis C1 for many biotechnological applications using gene manipulation and systems biology.

Transporter characteristics of A. platensis C1
Because of the ability of cyanobacteria to adapt to extremely different habitats, the relationship between their membrane transporter proteins and their habitats has been a focus of interest. Membrane transporters are proteins that allow cell membranes to deliver essential nutrients, eject waste products, and help the cell sense environmental conditions around it [28]. Interestingly, all Arthrospira species contain genes for Na+/H+ antiporters. The NapA-type Na+/H+ antiporter homolog, which is reported to be involved in salt tolerance at alkaline pH in some cyanobacteria, is found in all Arthrospira genomes. Arthrospira species live in high-alkalinity environments, and there are many alkali transporters in their genomes. These results revealed the relationship between the transporters and the lifestyles and niche adaptations of cyanobacteria.