Non-contiguous finished genome sequence of Staphylococcus capitis CR01 (pulsetype NRCS-A)

Staphylococcus capitis is a coagulase-negative staphylococcus (CoNS) commonly found in the human microflora. Recently, a clonal population of Staphylococcus capitis (designated NRCS-A) was found to be a major cause of late-onset sepsis (LOS) in several neonatal intensive care units in France. Here, we report the complete genome sequence and annotation of the prototype Staphylococcus capitis NRCS-A strain CR01. The 2,504,472 bp genome (1 chromosome and no plasmids) has a G+C content of 32.81% and contains 2,468 protein-coding genes, 59 tRNA genes and 4 rRNA genes.


Introduction
Late-onset sepsis (LOS), defined as sepsis occurring after 3 days of age, is a frequent cause of mortality and morbidity among low-weight newborns in neonatal intensive care units (NICUs). The most frequently encountered pathogens are coagulase-negative staphylococci (CoNS), among which Staphylococcus epidermidis has been shown to be the most prevalent [1,2]. However, a few studies have reported the emergence of Staphylococcus capitis as a main CoNS and LOS-causing pathogen in NICU settings [2][3][4]. A study in French NICUs [2] demonstrated the spread of a single clonal population of methicillin-resistant S. capitis (pulsotype NRCS-A) associated with reduced susceptibility to vancomycin, the first-line antibiotic used in cases of LOS. Moreover, this clone has recently been identified in NICUs in Belgium, the United Kingdom and Australia, which suggests a worldwide distribution. In contrast, S. capitis is rarely found in adult bacteremia and, when detected, shows greater diversity in genotypes and antimicrobial susceptibility profiles than in neonatal bacteremia. To elucidate the molecular mechanisms behind the wide spread of the S. capitis NRCS-A clone in NICUs throughout the world, we sequenced a prototype strain (CR01).

Classification and information
A strain belonging to the clonal population of Staphylococcus capitis pulsotype NRCS-A (Table 1) was isolated from the blood culture of a preterm infant with LOS hospitalized in the NICU of the Northern Hospital Group Center (Hospices Civils de Lyon, Lyon, France). Species identification of the bacterial isolates and antimicrobial susceptibility testing (AST) were performed using Vitek MS (bioMérieux, Marcy l'Etoile), 16S rDNA sequencing, the automated BD Phoenix system (Becton Dickinson, Sparks, MD) and a Shimadzu MALDI-TOF MS system (Shimadzu Corporation), as described in [21]. The strain was identified as Staphylococcus capitis with a score of 99.9% by VITEK MS and 93.7% by MALDI-TOF MS, using the Shimadzu Launchpad software program and the SARAMIS database application (AnagnosTec GmbH) for automatic measurement and identification (Figure 1). According to the manufacturer, an identification with a score ≥70% is considered to be of high confidence.
The antimicrobial susceptibility test (AST) results were analyzed according to the recommendations of the French Microbiology Society [22]. The S. capitis bacteremia was considered positive based on a single positive blood culture [2,23]. The S. capitis NRCS-A isolate CR01, like all isolates from this clone, is resistant to penicillin, methicillin, gentamicin and rifampicin, hetero-resistant to vancomycin, and sensitive to fusidic acid and fluoroquinolones. Table 1, Figure 2 and Figure 3 give detailed information on the general features of Staphylococcus capitis strain CR01 and its position within the genus Staphylococcus.
Table 1 footnote (fragment): ... not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence. These evidence codes are from the Gene Ontology project [20].
Standards in Genomic Sciences
Phylogenetic tree (figure caption): All 16S rRNA sequences were obtained from the RDP database, keeping only sequences longer than 1200 nt and classified as "good" quality.
The 16S rRNA sequences were aligned using MUSCLE with default parameters, as implemented in Seaview version 4 [24]. A tree was inferred from 1285 sites using the observed-divergence distance model with the BioNJ algorithm, with 500 bootstrap replicates. The final tree was rooted using the 16S rRNA sequence of the Macrococcus equipercicus type strain, which belongs to a closely related sister genus.
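To make the distance model concrete, the following minimal Python sketch computes the observed divergence (the proportion of differing sites, often called the p-distance) between two aligned sequences. It is an illustration only, not the Seaview implementation: the function name `observed_divergence`, the choice to skip gapped columns, and the toy sequences are assumptions made for this example. Tree building with BioNJ and bootstrapping are not covered.

```python
def observed_divergence(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption of
    this sketch; tools differ in how they treat gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA data).
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGT-C"
print(observed_divergence(toy_a, toy_b))
```

Applying this function to every pair of aligned sequences would give the distance matrix that a neighbor-joining method such as BioNJ takes as input.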

Genome sequencing information
The genome sequence of S. capitis strain CR01 was determined by high-throughput sequencing on a Genome Sequencer FLX+ system (454 Life Sciences/Roche) using FLX Titanium reagents according to the manufacturer's protocols, with approximately 47-fold coverage of the genome. This platform provides longer reads than other sequencing platforms. De novo assemblies were performed using the Roche Newbler software package (v2.7). Table 2 presents the project information and its compliance with MIGS version 2.0 [5].
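Fold coverage relates the total number of sequenced bases to the genome length. As a rough illustration, the sketch below applies that relation to the figures given in this report. The function names are hypothetical, and the read yield is back-calculated from the approximate 47-fold value rather than taken from the sequencing run.

```python
GENOME_LENGTH_BP = 2_504_472  # chromosome length reported for strain CR01


def fold_coverage(total_sequenced_bases: int, genome_length: int) -> float:
    """Average depth: total sequenced bases divided by genome length."""
    if genome_length <= 0:
        raise ValueError("genome length must be positive")
    return total_sequenced_bases / genome_length


def bases_for_coverage(coverage: float, genome_length: int) -> int:
    """Approximate number of sequenced bases implied by a given coverage."""
    return round(coverage * genome_length)


# Back-calculation from the approximate coverage stated in the text.
implied = bases_for_coverage(47, GENOME_LENGTH_BP)
print(f"~{implied / 1e6:.1f} Mb of sequence for 47-fold coverage")
```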

Growth conditions and DNA isolation
The sample was prepared for sequencing by growing S. capitis CR01 aerobically at 37°C on blood agar for 24-48 hours. Genomic DNA was extracted using the PureLink™ Genomic DNA kit (Invitrogen™) according to the manufacturer's recommended protocol. The quantity of DNA obtained was determined using a NanoVue™ Plus (HVD Life Sciences), and 1 µg of DNA was used for whole-genome sequencing of this strain.

Genome sequencing and assembly
The

Genome annotation
An automatic syntactic and functional annotation of the draft genome was performed using the MicroScope platform pipeline [26,27]. The syntactic analysis combines a set of programs, including AMIGene [28], tRNAscan-SE [29], RNAmmer [30], Rfam scan [31] and Prodigal [32], to predict genomic objects, mainly CDSs and RNA genes. More than 20 bioinformatics methods are then used for functional and relational analyses: homology searches in the generalist databank UniProt [33] and in more specialized databases such as COG [34] and InterPro [35], PRIAM profiles for enzymatic classification [36], and prediction of protein localization using the TMHMM [37], SignalP [38] and PsortB [39] tools.

Genome properties
The genome consists of one circular chromosome of 2,504,472 bp (32.81% GC content). A total of 2,565 genes were predicted, of which 2,453 are protein-coding genes, 59 tRNA-encoding genes, 4 rRNA-encoding genes (2 copies of 5S rRNA and 1 copy each of the large- and small-subunit rRNAs, 23S and 16S, respectively) and 34 other RNA-related ORFs. No plasmid was detected. Of the 2,453 protein-coding genes, 1,892 (76.7%) were assigned a putative function and the remainder were annotated as hypothetical proteins. The predicted coding density of S. capitis strain CR01 is 86%. Table 3 and Figure 4 give a detailed description of the properties and statistics of the Staphylococcus capitis strain CR01 genome. The distribution of genes into COG functional categories is presented in Table 4.
Table 4 footnote: a) The total is based on the total number of protein-coding genes in the annotated genome.
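For readers who want to reproduce summary statistics of this kind from a sequence and its CDS coordinates, a minimal sketch follows. It is not the MicroScope pipeline: the function names, the handling of ambiguous bases, the 1-based inclusive coordinate convention, and the toy inputs are all assumptions made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / (gc + at)


def coding_density(cds_intervals, genome_length: int) -> float:
    """Percentage of the genome covered by at least one CDS.

    cds_intervals: iterable of (start, end) pairs, 1-based inclusive.
    Overlapping intervals are merged so shared bases are counted once.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(cds_intervals):
        if current_end is None or start > current_end:
            # Close the previous merged interval before starting a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return 100.0 * covered / genome_length


# Toy inputs (hypothetical): a 20 bp "genome" with two overlapping CDSs.
print(gc_content("ATGCGCATTAATGCGCATTA"))
print(coding_density([(1, 10), (8, 15)], 20))
```

Merging overlapping intervals before summing avoids counting bases twice where neighboring genes overlap, which would otherwise inflate the density.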

Conclusion
Here, we describe a new genome sequence of Staphylococcus capitis (strain CR01, belonging to the NRCS-A clone) as a first step toward comparing its content with other sequenced Staphylococcus capitis genomes as well as with genomes of other CoNS species associated with late-onset sepsis. Detailed analyses are in progress to identify virulence factors and mobile genetic elements (MGE), such as the staphylococcal cassette chromosome mec (SCCmec) [18], potentially related to the high specificity of the NRCS-A clone for the NICU environment.