Draft genome sequence of Francisella tularensis subsp. holarctica BD11-00177

Francisella tularensis is a facultative intracellular bacterium in the class Gammaproteobacteria. This strain is of interest because it is the etiologic agent of tularemia and a highly virulent category A biothreat agent. Here we describe the draft genome sequence and annotation of Francisella tularensis subsp. holarctica BD11-00177, isolated from the first case of indigenous tularemia detected in The Netherlands since 1953. Whole genome DNA sequence analysis assigned this isolate to the genomic group B.FTNF002-00, which has previously been reported exclusively from Spain, France, Italy, Switzerland and Germany. Automatic annotation of the 1,813,372 bp draft genome revealed 2,103 protein-coding and 46 RNA genes.


Introduction
Francisella tularensis is a Gram-negative, nonmotile, non-spore-forming, facultative intracellular bacterium appearing as short rods or coccoid forms [1]. F. tularensis is the etiologic agent of tularemia, a zoonotic infection also known as rabbit fever and deer-fly fever. Transmission to humans has been reported by direct contact with infected animals, arthropod bites, inhalation of contaminated dust or ingestion of contaminated food or water. This pathogen is highly infectious, as inhalation of as few as 10 cells can cause infection. This extremely low infectious dose makes transmission via aerosols easy, and previous attempts to weaponize this microorganism have led to its recognition as a category A biothreat agent (CDC classification) [2,3]. F. tularensis contains three subspecies that are infectious to humans: the highly virulent Francisella tularensis subsp. tularensis, which often causes a lethal multisystemic disease with a fatality rate of up to 30%, and the less virulent Francisella tularensis subsp. holarctica and Francisella tularensis subsp. mediasiatica, which both seldom cause infections in humans. Here we present a summary classification together with the description of the draft genome sequence and annotation of Francisella tularensis subsp. holarctica BD11-00177, which was isolated from a vesicle on the forehead of a 72-year-old male living in The Netherlands. As the patient had not been abroad for years, this was the first documented case of indigenous tularemia in The Netherlands since 1953.

Classification and features
Francisella is the only genus within the family Francisellaceae and is a member of the order Thiotrichales and the class Gammaproteobacteria [4,17, Figure 1]. Only rare human infections with F. hispaniensis, F. novicida and F. philomiragia have been described, often following near-drowning events [18,19]. F. tularensis is capable of infecting hundreds of different vertebrate and invertebrate hosts [20]. The most widely distributed subspecies is F. tularensis subsp. holarctica, which is found throughout much of the Northern Hemisphere and is the only subspecies naturally occurring in Europe [21].

Genome sequencing information
Genome project history
Strain BD11-00177 was sequenced because of its relevance to biodefense. The draft genome sequence was finished in August 2012. The GenBank accession number for the project is 177784. The genome project is listed in the Genome OnLine Database (GOLD) [22] as project Gi21611. Sequencing was carried out at the Dutch Organization for Applied Scientific Research (TNO) and the Swedish Defense Research Agency (FOI). Initial automatic annotation was performed using the DOE-JGI Microbial Annotation Pipeline (DOE-JGI MAP). Table 2 shows the project information and its association with MIGS 2.0 compliance.

Growth conditions and DNA isolation
For DNA preparation, strain BD11-00177 was grown on 5% sheep blood agar plates for 72 h at 35°C in the presence of 5% CO2. DNA was extracted using the QIAamp DNA Micro Kit according to the manufacturer's guidelines (Qiagen, Westburg b.v., Leusden, The Netherlands). [24]. Standards in Genomic Sciences

Genome annotation
Open Reading Frames (ORFs) were predicted using the Prodigal gene prediction algorithm [23] as part of the DOE-JGI Microbial Annotation Pipeline (DOE-JGI MAP) using default parameters, followed by a round of manual curation. CRISPR elements were predicted using CRT and PILERCR [25], and the predictions from both methods were concatenated. Identification of tRNAs was performed using tRNAscan-SE. Ribosomal RNA genes (5S, 16S, 23S) were predicted using the program RNAmmer [26].
With the exception of tRNA and rRNA, all models from Rfam [27] were used to search the genome sequence. For faster detection, sequences were first compared, using BLAST with a very loose cutoff, to a database containing all the ncRNA genes in the Rfam database. Subsequently, sequences with hits to any gene belonging to an Rfam model were searched using the program INFERNAL [27].
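The two-stage ncRNA search described above (a fast, permissive prefilter followed by a sensitive search on the survivors only) can be illustrated with a minimal sketch. This is not the DOE-JGI MAP implementation: the function `two_stage_search`, the toy scoring functions and the cutoffs are all hypothetical stand-ins, and no real BLAST or INFERNAL call is made.

```python
from typing import Callable, Iterable, List, Tuple


def two_stage_search(
    sequences: Iterable[str],
    cheap_score: Callable[[str], float],
    expensive_score: Callable[[str], float],
    loose_cutoff: float,
    strict_cutoff: float,
) -> List[Tuple[str, float]]:
    """Run a cheap, permissive prefilter, then a costly check on survivors."""
    hits = []
    for seq in sequences:
        # Stage 1: permissive prefilter (stands in for the loose BLAST pass).
        if cheap_score(seq) < loose_cutoff:
            continue
        # Stage 2: sensitive search, run only on prefiltered candidates
        # (stands in for the covariance-model search).
        score = expensive_score(seq)
        if score >= strict_cutoff:
            hits.append((seq, score))
    return hits


# Hypothetical toy scorers, for illustration only.
def toy_cheap(seq: str) -> float:
    return float(seq.count("GC"))


def toy_expensive(seq: str) -> float:
    return float(seq.count("GGCC"))


candidates = ["ATATAT", "AGCTAGCA", "TTGGCCAA"]
hits = two_stage_search(
    candidates, toy_cheap, toy_expensive, loose_cutoff=1, strict_cutoff=1
)
print(hits)
```

In this toy run the first sequence is rejected by the prefilter, the second passes the prefilter but fails the sensitive stage, and only the third is reported. The loose first cutoff keeps false negatives low while sparing the expensive step most of the input.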
Protein coding genes were compared to protein families (e.g., COGs, Pfam, KEGG) and the proteome of selected "core" genomes, which are publicly available, and the product names were assigned based on the results of these comparisons.

Genome properties
The genome was assembled into 95 large (>1,000 bp) contigs and includes one circular chromosome with a total size of 1,813,372 bp (32.23% GC content). A total of 2,149 genes were predicted, 2,103 of which are protein-coding genes. Of the protein-coding genes, 1,592 were assigned a putative function, with the remainder annotated as hypothetical proteins. The properties and the statistics of the genome are summarized in Tables 3 and 4.
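The gene counts reported above can be cross-checked with simple arithmetic. The following sketch uses only the figures given in the text; the helper `gc_content` and the toy sequence passed to it are illustrative assumptions, not part of the annotation pipeline.

```python
# Gene counts as reported in the text for the BD11-00177 draft genome.
TOTAL_GENES = 2_149
PROTEIN_CODING = 2_103
WITH_PUTATIVE_FUNCTION = 1_592

# Derived quantities.
rna_genes = TOTAL_GENES - PROTEIN_CODING
hypothetical_proteins = PROTEIN_CODING - WITH_PUTATIVE_FUNCTION
pct_with_function = round(100 * WITH_PUTATIVE_FUNCTION / PROTEIN_CODING, 1)


def gc_content(sequence: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


print(rna_genes, hypothetical_proteins, pct_with_function)
print(gc_content("ATGC"))  # toy sequence, not genome data
```

The difference between total and protein-coding genes reproduces the 46 RNA genes stated in the abstract, and the remainder of the protein-coding genes (511) corresponds to those annotated as hypothetical proteins.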
A BLAST Ring Image Generator (BRIG) analysis comparing the F. tularensis subsp. holarctica BD11-00177 genome against the F. tularensis subsp. holarctica genomes of F92, LVS, and FTNF002-00 revealed that the BD11-00177 draft genome shows considerable resemblance to FTNF002-00 (Figure 2).
The evolutionary history of F. tularensis subsp. holarctica strain BD11-00177 was inferred using publicly available whole genome sequences. The trees in Figure 3A and B are drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the number of differences method and are in units of the number of base differences per sequence. The genus-level overview of Francisella involved 52 public genome sequences, using Piscirickettsia salmonis as outgroup (Figure 3A). The detailed analysis involved 14 F. tularensis subsp. holarctica genome sequences, using F. tularensis subsp. tularensis strain SCHU S4 as outgroup (Figure 3B) [17,30,33,36-41]. All positions containing gaps and missing data were eliminated. There were a total of 1,599,589 positions in the final dataset.
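The distance measure described above (number of base differences per sequence pair, after eliminating all positions with gaps or missing data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used for Figure 3: the function names, the set of characters treated as missing (`-`, `N`, `?`) and the toy alignment are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

# Characters treated as gap or missing data (an assumption for this sketch).
MISSING = set("-N?")


def complete_deletion(alignment: Dict[str, str]) -> Dict[str, str]:
    """Drop every column in which any sequence has a gap or missing base."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        i for i in range(length)
        if all(s[i].upper() not in MISSING for s in seqs)
    ]
    return {name: "".join(s[i] for i in keep) for name, s in alignment.items()}


def number_of_differences(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x.upper() != y.upper())


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Pairwise number-of-differences distances after complete deletion."""
    cleaned = complete_deletion(alignment)
    return {
        (n1, n2): number_of_differences(cleaned[n1], cleaned[n2])
        for n1, n2 in combinations(cleaned, 2)
    }


# Toy alignment, for illustration only.
toy = {
    "strain_A": "ACGTACGT",
    "strain_B": "ACGTACGA",
    "strain_C": "AC-TTCGA",
    "outgroup": "TCGNTCGA",
}
cleaned = complete_deletion(toy)
distances = distance_matrix(toy)
print(len(cleaned["strain_A"]), distances)
```

In the toy alignment, the two columns containing `-` or `N` are removed for every sequence before counting, which mirrors the statement that only positions present in all genomes contribute to the final dataset.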

Conclusion
Here we have presented the draft genome of the first member of the FTNF002-00 genomic group of F. tularensis subsp. holarctica. As more genetic information on members of this genomic group becomes available, a better understanding of the evolution and biogeography of this pathogen will be gained. This knowledge may help us to understand the epidemiology and potential expansion of the geographical distribution of this genomic group. Despite potential biases associated with discontinuous draft genomes, we would like to emphasize the added value of draft bacterial genome sequencing. Taking advantage of low-cost, high-throughput sequencing platforms allows us to probe the vast microbial diversity present in nature and to respond rapidly to clinical outbreaks and acute biosecurity hazards. From an evolutionary ecology perspective, increased sequencing efforts allow us to characterize the biogeography of microbial taxa and to differentiate between neutral and conserved genome contents.