Non-contiguous finished genome sequence and description of Paucisalibacillus algeriensis sp. nov.

Paucisalibacillus algeriensis strain EB02T is the type strain of Paucisalibacillus algeriensis sp. nov., a new species within the genus Paucisalibacillus. This strain, whose genome is described here, was isolated from a soil sample from the hypersaline lake Ezzemoul Sabkha in northeastern Algeria. Paucisalibacillus algeriensis is a Gram-positive, strictly aerobic bacterium. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 4,006,766 bp long genome (1 chromosome but no plasmid) exhibits a low G+C content of 36% and contains 3,956 protein-coding and 82 RNA genes, including 9 rRNA genes.


Introduction
Strain EB02T (= CSUR P858 = DSM 27335) is the type strain of Paucisalibacillus algeriensis sp. nov. It is a strictly aerobic, Gram-positive, spore-forming rod that is motile by means of peritrichous flagella. It was isolated from a soil sample from the hypersaline lake Ezzemoul Sabkha in the Oum-El-Bouaghi region of northeastern Algeria, which is the largest nesting area of Mediterranean flamingos. This lake is a Ramsar site (http://www.ramsar.org).
The genus Paucisalibacillus belongs to the family Bacillaceae and was created by Nunes et al. in 2006 [1]. To date, the genus contains only one validly published species, Paucisalibacillus globulus, whose type strain B22T was isolated from potting soil in Portugal [1]. It has been described as a Gram-positive, rod-shaped bacterium that is strictly aerobic, spore-forming and motile by means of two polar flagella at one end. It grows in the absence of NaCl, but a low NaCl concentration (1% w/v) improves growth [1]. The current bacterial taxonomy relies on a combination of various phenotypic, chemotaxonomic and genetic criteria [2-4]. The essential genetic criteria are DNA-DNA hybridization, which is the 'gold standard' criterion for defining bacterial species [3,5,6], G+C content, and 16S rRNA gene sequence-based phylogeny [7]. However, these criteria have several drawbacks, and their cutoffs cannot be applied to all bacterial genera [8]. As the number of available bacterial genomes increases and the cost of whole genome sequencing decreases, it has been proposed that genomic information and MALDI-TOF spectra [9] be included alongside the main phenotypic characteristics of a strain, in a polyphasic approach (taxono-genomics) to the description of new bacterial taxa [8,10-23].
Here we present a summary classification and a set of features for Paucisalibacillus algeriensis sp. nov. strain EB02T, together with the description of the complete genome sequence and annotation. These characteristics support the circumscription of the species Paucisalibacillus algeriensis.

Classification and features
Paucisalibacillus algeriensis strain EB02T was isolated accidentally in July 2012 during research work on the isolation of halophilic actinomycetes, and was further characterized. The source of the isolate was a hypersaline soil sample from the northwestern periphery of the hypersaline lake Ezzemoul Sabkha in the Oum-El-Bouaghi region of northeastern Algeria. This part of the lake is bounded by halophilic vegetation. Samples were taken aseptically at a depth of 10 cm, transferred to sterile containers, and transported in a cooler (4°C) to our lab in Algeria. Ten grams of hypersaline soil were suspended in 90 ml of sterile saline water (0.9% NaCl) and vigorously vortexed. Tenfold serial dilutions of the soil suspension, up to 10⁻⁵, were plated on ISP (International Streptomyces Project) medium 2 (dextrose 4 g/l, malt extract 10 g/l, yeast extract 4 g/l, agar 20 g/l) [24], and the plates were incubated at 30°C for 21 days. Strain EB02T was obtained after 24 h of incubation. In order to obtain a pure culture, colonies were transferred after microscopic examination to Nutrient Agar (NA) medium (meat extract 1 g/l, peptone 5 g/l, yeast extract 2 g/l, sodium chloride 5 g/l, agar 15 g/l). Paucisalibacillus algeriensis sp. nov. strain EB02T (Table 1) was isolated by cultivation under aerobic conditions at 30°C. When compared to sequences available in the GenBank database using the BLAST program through the National Center for Biotechnology Information (NCBI) server, the 16S rRNA gene sequence of Paucisalibacillus algeriensis strain EB02T (GenBank accession number HG315680) exhibited the highest identity (98.2%) with Paucisalibacillus globulus type strain DSM 18846T (Figure 1), the phylogenetically closest validly published Paucisalibacillus species. This value was lower than the 98.7% 16S rRNA gene sequence threshold recommended by Stackebrandt and Ebers to delineate a new species without carrying out DNA-DNA hybridization [7].
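The comparison of a 16S rRNA identity value with the 98.7% threshold can be illustrated with a minimal Python sketch. The function names and the toy alignment are hypothetical and serve only as an illustration; identity is computed here as matching columns divided by alignment length, which is an assumption and may differ in detail from the value reported by BLAST.

```python
SPECIES_THRESHOLD = 98.7  # 16S rRNA identity cutoff (%) cited in the text [7]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Both inputs must already be aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def below_species_threshold(identity: float, threshold: float = SPECIES_THRESHOLD) -> bool:
    """True when the identity is lower than the species-delineation cutoff."""
    return identity < threshold


# Toy alignment (made-up sequences, not the real 16S genes): 9 of 10 columns match.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")

# Identity reported in the text for strain EB02T versus P. globulus.
reported_is_below = below_species_threshold(98.2)
```

With the reported identity of 98.2%, `below_species_threshold` returns `True`, which is the condition used in the text to support a separate species.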
Table 1 footnote: [...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [34]. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgements.
Figure 1. [...] GenBank accession numbers are displayed in parentheses. Sequences were aligned using CLUSTALW, and phylogenetic inferences were obtained using the neighbor-joining method [35] in the MEGA 5 software package [36]. Numbers above the nodes are percentages of bootstrap values obtained from 1,000 replicates that support the node. Paenibacillus polymyxa was used as the outgroup. The scale bar represents 0.01 substitutions per nucleotide position.
Six growth temperatures (25, 30, 37, 45, 50 and 55°C), ten pH values (5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 10 and 11) and nine NaCl concentrations (0, 2.5, 5, 7.5, 10, 15, 20, 25 and 30%) were tested. Growth occurred between 25°C and 50°C, with optimal growth between 30°C and 37°C. The strain was able to grow at NaCl concentrations between 0% and 5% and at pH values in the range of 6.5-9 (optimum at pH 7). After 24 h of aerobic incubation under optimal growth conditions on sheep blood agar (BioMerieux), strain EB02T formed light beige, circular, slightly elevated colonies 1 mm to 2 mm in diameter. Growth of the strain was tested in anaerobic and microaerophilic atmospheres using the GasPak EZ Anaerobe Pouch (Becton, Dickinson and Company) and CampyGen Compact (Oxoid) systems, respectively, and in an aerobic atmosphere, with or without 5% CO2. Growth was achieved under aerobic (with and without CO2) and microaerophilic conditions, but no growth was observed under anaerobic conditions. Gram staining showed Gram-positive rods (Figure 2). Cells grown on agar sporulate. A motility test was positive. The presence of peritrichous flagella and the size of cells were determined by negative staining transmission electron microscopy on a Tecnai G2 Cryo (FEI) at an operating voltage of 200 kV. The rods have a length ranging from 2.1 μm to 3.2 μm (mean 2.6 μm) and a diameter ranging from 0.4 μm to 0.6 μm (mean 0.5 μm) (Figure 3). Strain EB02T exhibited catalase activity, but oxidase activity was negative. Using the commercially available API ZYM system (BioMerieux), positive reactions were observed for alkaline phosphatase, esterase (C4), trypsin and α-glucosidase, and a weak positive reaction was observed for esterase lipase (C8); the other tests were negative. Using the API 50CH system (BioMerieux) according to the manufacturer's instructions, a weak positive reaction was observed for D-glucose, D-fructose, N-acetylglucosamine, D-saccharose, amygdalin, esculin and salicin.
The remaining tests were negative. Indole production, β-galactosidase, urease, and hydrolysis of gelatin and starch were negative, but the nitrate reduction reaction was positive. Paucisalibacillus algeriensis was resistant to nalidixic acid, but susceptible to amoxicillin, nitrofurantoin, erythromycin, doxycycline, rifampicin, vancomycin, gentamicin, imipenem, trimethoprim-sulfamethoxazole, ciprofloxacin, ceftriaxone and amoxicillin-clavulanic acid. When compared to other Paucisalibacillus, Ornithinibacillus, Oceanobacillus and Virgibacillus species [1,37-43], Paucisalibacillus algeriensis sp. nov. strain EB02T exhibited the phenotypic differences detailed in Table 2.
A score enabled the identification, or not, of the isolate among the tested species: a score > 2 with a validated species enabled identification at the species level; a score > 1.7 but < 2 enabled identification at the genus level; and a score < 1.7 did not enable any identification. For strain EB02T, the scores obtained ranged from 1.0 to 1.4, thus suggesting that our isolate was a new species. We added the spectrum from strain EB02T (Figure 4) to our database. Spectrum differences with those of related Ornithinibacillus and Oceanobacillus species are shown in Figure 5.
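The score interpretation described above can be summarized as a small lookup. This is a minimal sketch in Python; the function name is hypothetical, and because the text does not state how scores of exactly 2 or exactly 1.7 are treated, assigning each boundary value to the higher category is an assumption.

```python
def interpret_score(score: float) -> str:
    """Map an identification score to the level of identification it supports.

    Boundary handling (>= instead of >) is an assumption; the text only gives
    "> 2", "> 1.7 but < 2" and "< 1.7".
    """
    if score >= 2.0:
        return "species"
    if score >= 1.7:
        return "genus"
    return "none"


# Range of scores reported in the text for strain EB02T.
reported_range = (1.0, 1.4)
levels = [interpret_score(s) for s in reported_range]
```

Both ends of the reported range fall in the "none" category, consistent with the absence of any identification for this isolate.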

Genome sequencing information
Genome project history
The organism was selected for sequencing on the basis of its phylogenetic position and 16S rDNA sequence similarity to other members of the genus Paucisalibacillus, and is part of a study of the microbial diversity of the hypersaline lakes in northeastern Algeria. It was the 2nd genome of a Paucisalibacillus species and the first genome of Paucisalibacillus algeriensis sp. nov. The genome sequence, deposited in EMBL under accession number CBYO000000000, consists of 23 contigs. Table 3 shows the project information and its association with MIGS version 2.0 compliance [46].
[...51] and BLASTn against the GenBank database, whereas the tRNAscan-SE tool [52] was used to find tRNA genes. Transmembrane helices and lipoprotein signal peptides were predicted using the Phobius web server [53]. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment lengths greater than 80 amino acids; if alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Artemis [54] was used for data management and DNA Plotter [55] was used for visualization of genomic features. To estimate the mean level of nucleotide sequence similarity at the genome level between Paucisalibacillus algeriensis sp. nov. strain EB02T and Paucisalibacillus globulus, Ornithinibacillus scapharcae, Oceanobacillus iheyensis and Virgibacillus halodenitrificans, we used the Average Genomic Identity of Orthologous gene Sequences (AGIOS) home-made software. Briefly, this software uses the Proteinortho software [56] to detect orthologous proteins between genomes compared two by two, then retrieves the corresponding genes and determines the mean percentage of nucleotide sequence identity among orthologous ORFs using the Needleman-Wunsch global alignment algorithm.
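The last step of this procedure, a mean percentage of nucleotide identity over orthologous gene pairs aligned globally, can be sketched as follows. This is not the AGIOS software itself, whose implementation is not described beyond the summary above. The scoring values (match +1, mismatch -1, gap -1), the function names and the toy sequences are all assumptions, and ortholog detection with Proteinortho is not reproduced: the pairs are simply given as input.

```python
from statistics import mean


def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align a and b with linear gap costs; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identity (%) over the columns of the global alignment of a and b."""
    if not a and not b:
        raise ValueError("both sequences are empty")
    aln_a, aln_b = needleman_wunsch(a, b)
    matches = sum(1 for x, y in zip(aln_a, aln_b) if x == y)
    return 100.0 * matches / len(aln_a)


def mean_ortholog_identity(pairs) -> float:
    """Mean identity (%) over an iterable of (gene_a, gene_b) ortholog pairs."""
    return mean(percent_identity(a, b) for a, b in pairs)


# Toy ortholog pairs (made-up sequences for illustration only).
toy_pairs = [("ACGT", "ACGT"), ("ACGTAC", "ACGAAC")]
toy_mean = mean_ortholog_identity(toy_pairs)
```

The quadratic dynamic-programming table is adequate for a sketch; real gene sets would normally be aligned with an optimized library, and the choice of scoring scheme affects the resulting identity values.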

Genome properties
The genome is 4,006,766 bp long (1 chromosome but no plasmid) with a G+C content of 36% (Figure 6 and Table 4). It is composed of 23 contigs. Of the 4,038 predicted genes, 3,956 were protein-coding genes and 82 were RNAs (7 5S rRNA genes, 1 16S rRNA gene, 1 23S rRNA gene, and 73 tRNA genes). A total of 2,691 genes (68.02%) were assigned a putative function (by COGs or by NR BLAST), and 179 genes (4.52%) were identified as ORFans. The remaining genes were annotated as hypothetical proteins (821 genes, 20.75%). The distribution of genes into COGs functional categories is presented in Table 5. The properties and statistics of the genome are summarized in Tables 4 and 5.
Table 4 (fragment). Genes with transmembrane helices: 1,067 (26.97% of total a).
a The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome.
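The percentages quoted in this section follow from the gene counts when the number of protein-coding genes is used as the denominator. The short Python check below uses only counts given in the text; the variable and function names are illustrative.

```python
PROTEIN_CODING = 3956  # protein-coding genes reported in the text
RNA_GENES = 7 + 1 + 1 + 73  # 5S, 16S and 23S rRNA genes plus tRNA genes

# Gene counts reported in the text for each category.
counts = {
    "putative function": 2691,
    "ORFans": 179,
    "hypothetical proteins": 821,
    "transmembrane helices": 1067,
}


def percent_of(count: int, total: int = PROTEIN_CODING) -> float:
    """Share of `total` represented by `count`, as a percentage with two decimals."""
    return round(100.0 * count / total, 2)


percentages = {name: percent_of(n) for name, n in counts.items()}
total_genes = PROTEIN_CODING + RNA_GENES
```

The computed values reproduce the reported 68.02%, 4.52%, 20.75% and 26.97%, and the protein-coding and RNA genes sum to the 4,038 predicted genes.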

Conclusion
On the basis of phenotypic (Table 2), phylogenetic and genomic analyses (taxonogenomics) (Table 6, Table 7), we formally propose the creation of Paucisalibacillus algeriensis sp. nov., which contains strain EB02T. This strain was isolated from a hypersaline lacustrine soil sample collected in Algeria.

Description of Paucisalibacillus algeriensis sp. nov. EB02T
Paucisalibacillus algeriensis (al.ge.ri.en'sis. N.L. masc. adj. algeriensis, of Algeria, where strain EB02T was isolated). Strain EB02T is a strictly aerobic, Gram-positive, endospore-forming rod that is motile by means of peritrichous flagella. Growth occurs aerobically between 25°C and 50°C, with optimal growth between 30°C and 37°C. The strain grows at NaCl concentrations between 0% and 5% and at pH values in the range of 6.5-9 (optimum at pH 7). Growth is also observed under a microaerophilic atmosphere, but not under anaerobic conditions. After 24 h of growth on 5% sheep blood-enriched Columbia agar (BioMerieux) at 30°C, colonies are light beige, circular, slightly elevated and 1 mm to 2 mm in diameter. Cells have a length ranging from 2.1 μm to 3.2 μm (mean 2.6 μm) and a diameter ranging from 0.4 μm to 0.6 μm (mean 0.5 μm). Catalase activity is positive but oxidase activity is negative. Using the commercially available API ZYM system (BioMerieux), positive reactions were observed for alkaline phosphatase, esterase (C4), trypsin and α-glucosidase, and a weak positive reaction was observed for esterase lipase (C8). The other tests were negative. Using the API 50CH system (BioMerieux) according to the manufacturer's instructions, a weak positive reaction was observed for D-glucose, D-fructose, N-acetylglucosamine, D-saccharose, amygdalin, esculin and salicin. The remaining tests were negative. Indole production, β-galactosidase, urease, and hydrolysis of gelatin and starch were negative, but the nitrate reduction reaction was positive. Paucisalibacillus algeriensis was resistant to nalidixic acid, but susceptible to amoxicillin, nitrofurantoin, erythromycin, doxycycline, rifampicin, vancomycin, gentamicin, imipenem, trimethoprim-sulfamethoxazole, ciprofloxacin, ceftriaxone and amoxicillin-clavulanic acid. The G+C content of the genome is 36%.
The 16S rRNA gene and genome sequences are deposited in GenBank under accession number HG315680 and in the EMBL database under accession number CBYO000000000, respectively. The type strain EB02T (= CSUR P858 = DSM 27335) was isolated from a soil sample from the margin of the hypersaline lake Ezzemoul Sabkha in the Oum-El-Bouaghi region of northeastern Algeria.