Non contiguous-finished genome sequence of Pseudomonas syringae pathovar syringae strain B64 isolated from wheat

The Gram-negative gammaproteobacterium Pseudomonas syringae is one of the most widespread plant pathogens and has been repeatedly reported to cause significant damage to crops. Research on this pathogen is intensive, but most of it is done on isolates that are pathogenic to Arabidopsis, tomato, and bean. Here, we announce a high-quality draft genome sequence of Pseudomonas syringae pv. syringae B64, which is the first genome of a P. syringae strain isolated from wheat to be published to date. The genome sequence will assist in gaining insights into the basic virulence mechanisms of this pathogen, which has a relatively small complement of type III effectors.


Introduction
Pseudomonas syringae strains have been isolated from more than 180 host species [1] across the entire plant kingdom, including many agriculturally important crops, such as bean, tomato, and cucumber, as well as kiwi, stone fruit, and olive trees. Strains are divided into more than 50 pathovars, primarily on the basis of host specificity, disease symptoms, and biochemical profiles [2][3][4]. The first strain of this species was isolated from a lilac tree (Syringa vulgaris), from which the species name is derived [5]. The observed wide host range is reflected in a relatively large genetic heterogeneity among different pathovars. This is most pronounced in the complement of virulence factors, which is also assumed to be the key factor defining host specificity [6]. For successful survival and reproduction, both epiphytic and endophytic P. syringae strains deploy different sets of type III and type VI secretion system effectors, phytotoxins, exopolysaccharides (EPS), and other types of secreted molecules [6][7][8][9][10][11]. Currently, three completely sequenced P. syringae genomes have been published: pathovar syringae strain B728a, which causes brown spot disease of bean [12]; pathovar tomato strain DC3000, which is pathogenic to tomato and Arabidopsis [13]; and pathovar phaseolicola strain 1448A, the causal agent of halo blight on bean [14]. A number of incomplete genomes of varying quality are also available for other strains. Pseudomonas syringae pv. syringae strain B64 was isolated from hexaploid wheat (Triticum aestivum) in Minnesota, USA [15]. The strain has been used in several studies, mainly addressing the phylogenetic diversity of P. syringae varieties [15][16][17][18], but never as an infection model for wheat. Sequencing the genome of strain B64 and comparing it with the other published genomes should reveal wheat-specific adaptations and give insights into virulence strategies for colonizing monocot plants.

Classification and features
Pseudomonas syringae belongs to the class Gammaproteobacteria. The detailed classification of this species is still debated. Young and colleagues have proposed grouping all plant-pathogenic, oxidase-negative, fluorescent Pseudomonas strains into a single species, P. syringae, which is further subdivided into pathovars [4,19]. Several DNA hybridization studies have shown a large genetic heterogeneity among the groups; however, biochemical characteristics, with a few exceptions, did not allow these groups to be elevated to distinct species [20,21]. Currently, the species is divided into five phylogenetic clades based on MLST analysis. P. syringae pv. syringae (Pss) strains belong to group II within this nomenclature [22]. The basic characteristics of Pss B64 are summarized in Table 1, while its phylogenetic position is depicted in Figure 1.

Table 1 (excerpt). Altitude: not reported. Footnote a, evidence codes: IDA: Inferred from Direct Assay (first time in publication); TAS: Traceable Author Statement (a direct report exists in the literature); NAS: Non-traceable Author Statement (not directly observed for the living, isolated sample, but based on a generally accepted property of the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [39].

Standards in Genomic Sciences

Pss B64 has physiological properties similar to those of other representatives of its genus. It can grow in complex media such as LB [42] or King's B [43], as well as in various defined minimal media: HSC [44], MG-agar [45], PMS [46], AB-agar [47], and SRM AF [48]. Although the optimal growth temperature is 28°C, the bacterium can also replicate at 4°C. Growth is completely inhibited above 35°C. Pss B64 is capable of endophytic growth in the wheat leaf mesophyll, but does not seem to cause any symptoms unless a very high inoculation dose is applied. The bacterium has a weak resistance to ampicillin (25 mg/L) and chloramphenicol (10 mg/L). It is also possible to obtain spontaneous rifampicin-resistant mutants. In addition, the genomic sequence predicts this strain to be insensitive to polymyxin B due to the presence of the arn gene cluster.

Genome sequencing information

Genome project history
The organism was selected for sequencing because it was found to carry a syringolin biosynthesis gene cluster [49]. Syringolin is a proteasome inhibitor produced by some strains of pathovar syringae. As a consequence of proteasome inactivation, a number of plant intracellular pathways are inhibited, including the entire salicylic acid-dependent defense pathway, which promotes the entry of bacteria into leaf tissue and subsequent endophytic growth [9]. Since it has so far not been possible to establish an infection model for syringolin in the model plant Arabidopsis, it was decided to explore another common research target and one of the most important crop plants, bread wheat (Triticum aestivum). The genome project has been deposited in the GenBank database (ID 180994) and the genome sequence is available under accession number ANZF00000000. The version described in this paper is the first version, ANZF01000000. The details of the project are shown in Table 2.

Growth conditions and DNA isolation
P. syringae pv. syringae strain B64 was grown in 40 mL of LB medium at 28°C and 220 rpm to an OD 600 of ~1.0. Genomic DNA was isolated from the pelleted cells using a Qiagen Genomic-tip 100/G column (Qiagen, Hilden, Germany) according to the manufacturer's instructions.

Note that the GenBank record contains 42 contigs because one of the contigs was split into two parts so that the assembly starts with the dnaA gene. While closing gaps, it became possible to determine the positions of all ribosomal operons by sequence overlap and thus to incorporate the largest of the minor scaffolds. However, it was not possible to map the remaining two minor scaffolds precisely. These must be located within the two distinct remaining large gaps, but because they were not significant for the project they have been excluded from the assembly.
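The contig split mentioned above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the toy sequences, and the assumption that the gene lies on the forward strand of an ordered, circular set of contigs are all hypothetical. It only shows why splitting one contig at the start of a gene raises the contig count by one.

```python
def split_at_gene_start(contig: str, gene_start: int) -> tuple[str, str]:
    """Split a contig so that the second piece begins at gene_start (0-based)."""
    if not 0 <= gene_start <= len(contig):
        raise ValueError("gene_start is outside the contig")
    return contig[:gene_start], contig[gene_start:]


def reorder_contigs(contigs: list[str], index: int, gene_start: int) -> list[str]:
    """Return the contigs in circular order, beginning at the gene start.

    The contig at `index` is split in two, so the result holds one more
    entry than the input (assumes the contigs are already ordered and the
    gene is on the forward strand).
    """
    upstream, downstream = split_at_gene_start(contigs[index], gene_start)
    return [downstream] + contigs[index + 1:] + contigs[:index] + [upstream]


# Toy data: the hypothetical gene starts at offset 2 of the second contig.
contigs = ["AAAC", "GGATGTT", "CCC"]
ordered = reorder_contigs(contigs, index=1, gene_start=2)
print(ordered)                        # ['ATGTT', 'CCC', 'AAAC', 'GG']
print(len(contigs), "->", len(ordered))  # 3 -> 4
```

The total sequence length is unchanged by the split; only the number of records and their order differ.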

Genome annotation
Initial open reading frame (ORF), tRNA, and rRNA prediction and functional annotation were performed using the RAST (Rapid Annotation using Subsystem Technology) server [50]. For the purpose of comparison, the genome was also annotated using Prokka [51], which uses Prodigal [52] for ORF prediction (the RAST server uses a modified version of Glimmer [53]). Start codons of all predicted ORFs were further verified manually, using the position of potential ribosomal binding sites and BLASTP [54] alignments with homologous ORFs from other P. syringae strains as a reference. Functional annotations were also refined for every ORF using BLASTP searches against the non-redundant protein sequence database (nr) and the NCBI Conserved Domain search engine [55]. Functional category assignment and signal peptide prediction were done using the Integrated Microbial Genomes/Expert Review (IMG/ER) system [56].
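Two gene callers often agree on the stop codon of an ORF but disagree on its start, which is one reason start codons benefit from manual review. The sketch below is an illustration under stated assumptions, not the workflow used here: the `Orf` record, the function name, and the toy coordinates are hypothetical, and the inputs stand in for ORF calls already parsed from two annotation outputs.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    contig: str
    start: int   # coordinate of the first base of the start codon
    stop: int    # coordinate of the last base of the stop codon
    strand: str  # "+" or "-"


def start_disagreements(calls_a: list[Orf], calls_b: list[Orf]) -> list[tuple[Orf, Orf]]:
    """Pair ORFs that share contig, strand, and stop but differ in start."""
    by_stop = {(o.contig, o.strand, o.stop): o for o in calls_a}
    flagged = []
    for orf in calls_b:
        match = by_stop.get((orf.contig, orf.strand, orf.stop))
        if match is not None and match.start != orf.start:
            flagged.append((match, orf))
    return flagged


# Toy calls standing in for two annotation runs (coordinates are made up).
calls_a = [Orf("c1", 100, 400, "+"), Orf("c1", 900, 600, "-")]
calls_b = [Orf("c1", 130, 400, "+"), Orf("c1", 900, 600, "-"), Orf("c1", 1000, 1300, "+")]

for a, b in start_disagreements(calls_a, calls_b):
    print(f"{a.contig} {a.strand} stop={a.stop}: start {a.start} vs {b.start}")
```

ORFs flagged this way would still need to be resolved by inspecting ribosomal binding sites and homolog alignments, as described above.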

Genome properties
The genome of strain B64 is estimated to comprise 5,930,035 base pairs with an average GC-content of 58.55 % (Table 3 and Figure 2), which is similar to what is observed in other P. syringae strains [12,13,53]. Of the 5,021 predicted genes, 4,947 were protein-coding genes, 4 ribosomal RNA operons, and 61 tRNA genes; 78 were identified as pseudogenes. The majority of the protein-coding genes (83.65 %) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4. The genome contains a complete canonical type III secretion system and ten known effector proteins: AvrE1, HopAA1, HopI1, HopM1, HopAH1, HopAG1, HopAI1, HopAZ1, HopBA1, and HopZ3. Of these ten, the first five are present in all other sequenced P. syringae strains, thereby constituting the effector core, whereas the latter five could be host determinants for wheat. Such a small number of effectors is not unusual and is also seen in other strains of clade II [22]. In addition, there are two complete type VI secretion system gene clusters and nine putative effector proteins belonging to the VgrG and Hcp1 families. The Pss B64 genome also encodes gene clusters for the biosynthesis of four phytotoxins: syringomycin, syringopeptin, syringolin, and mangotoxin. All of the above-mentioned genome components have previously been demonstrated to be involved in virulence and epiphytic fitness of P. syringae, as well as in competition with other microbial species [7][8][9][10][57][58][59]. 
Additional virulence-associated traits identified are: biosynthesis of the exopolysaccharides alginate, Psl, and levan; the surfactant syringofactin; type IV pili; large surface adhesins; the siderophores pyoverdine and achromobactin; proteases and other secreted hydrolytic enzymes; and RND-type transporters (including putative mexAB, mexCD, mexEF, and mexMN homologs [60,61]), all of which are found in other P. syringae strains. It is also notable that the inaZ gene, which encodes the ice-nucleation protein, is truncated by a frameshift, which would make this strain ice-negative. This contradicts the results of a previous study by Hwang and colleagues [16], in which Pss B64 was identified as ice-positive. The discrepancy could be due to an assembly error, or the frameshift could have been introduced at a later point during propagation.
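The summary figures reported in this section (average GC-content, share of protein-coding genes with a putative function) are simple ratios. The following Python sketch shows how such values can be computed; it is an illustration only. The function name and toy sequences are hypothetical, and excluding ambiguous bases from the denominator is an assumption, not necessarily the convention used for Table 3.

```python
def gc_content(sequences: list[str]) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    gc = total = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        total += sum(s.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


# Toy contigs, not real data: 6 G/C out of 10 unambiguous bases.
toy_contigs = ["GGCC", "ATGC", "NNAT"]
print(f"{gc_content(toy_contigs):.2f} %")  # 60.00 %

# Gene counts implied by the percentages quoted in the text.
protein_coding = 4947
with_function = round(protein_coding * 83.65 / 100)
hypothetical = protein_coding - with_function
print(with_function, hypothetical)  # 4138 809
```

Under these assumptions, 83.65 % of 4,947 protein-coding genes corresponds to roughly 4,138 genes with a putative function and about 809 annotated as hypothetical proteins.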