Here, we report the draft genome sequence and annotation of Nocardia farcinica TRH1, a petroleum hydrocarbon-degrading actinobacterium isolated from the coastal water of Trindade Island, Brazil.
Several species of the genus Nocardia are known to degrade hydrocarbons.1 The strain Nocardia farcinica TRH1 was isolated from the coastal water of Trindade Island, a pristine oceanic island in Brazil.2 It can grow on several petroleum hydrocarbons as the sole source of carbon and energy, including phenanthrene, pyrene, anthracene, eicosane, pentacontane, triacontane, tetracosane, naphthalene, hexadecane, octane, toluene and xylene.2
Genome sequencing of N. farcinica TRH1 was performed using the Ion Torrent PGM platform (ThermoFisher Scientific). Briefly, the genomic DNA was fragmented using the Bioruptor UCD-200. The template library was prepared with the Ion Plus fragment library kit and clonally amplified in the One Touch System with the Ion PGM template OT2 400 kit. The amplified library was sequenced using the Ion PGM sequencing 400 kit on the 318 v2 chip. A total of 2531733 reads were obtained, ranging from 25 to 492 bp in length. The reads were filtered for length (minimum, 100 bp) and quality (minimum score, Q20) and used for de novo assembly with CLC Genomics Workbench version 6.5.1 (CLC bio). The assembly yielded 321 contigs totaling 5230013 bp, with an N50 of 9853 bp, a longest contig of 122221 bp, a G+C content of 68.0% and genome coverage of 52.53X. Genes were predicted from the contigs using GeneMarkS,3 which revealed 4946 coding sequences (CDSs). The predicted proteins were functionally annotated using BLAST (http://blast.ncbi.nlm.nih.gov/), and approximately 68.6% of the proteins were assigned to Clusters of Orthologous Groups (COG) families.4
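The read filter and the assembly statistics reported above can be illustrated with a minimal Python sketch. It is not the CLC Genomics Workbench procedure used in this work. The function names are hypothetical, and the quality criterion is assumed here to be a mean Phred score of at least 20 per read, since the text does not state whether the Q20 threshold was applied per base or per read.

```python
from typing import Iterable, List


def passes_filter(seq: str, quals: List[int],
                  min_len: int = 100, min_q: float = 20.0) -> bool:
    """Keep a read if it is at least min_len bp long and its mean
    Phred quality is at least min_q (assumed criterion, see lead-in)."""
    if len(seq) < min_len or not quals:
        return False
    return sum(quals) / len(quals) >= min_q


def n50(lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L together contain
    at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return ordered[-1]  # not reached for positive lengths


def gc_percent(contigs: Iterable[str]) -> float:
    """G+C content of a set of contigs, as a percentage of all bases."""
    total = 0
    gc = 0
    for contig in contigs:
        upper = contig.upper()
        total += len(upper)
        gc += upper.count("G") + upper.count("C")
    if total == 0:
        raise ValueError("no sequence data given")
    return 100.0 * gc / total
```

For toy contig lengths of 2, 3, 4, 5 and 6 bp (20 bp in total), the two longest contigs already cover 11 bp, so n50 returns 5. This also shows that N50 is a length-weighted statistic and not the arithmetic mean contig length.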
Genome annotation was performed using BlastKOALA5 and revealed 1869 protein-coding sequences, including 106 related to xenobiotics biodegradation and metabolism and 309 unclassified. The KEGG Automatic Annotation Server (KAAS)6 was used for pathway analysis, which identified 516 genes related to metabolic pathways, including 21 genes related to the biodegradation of aromatic compounds.
KAAS also identified the presence of catA and dmpC, which belong to the ortho- and meta-cleavage pathways of catechol catabolism, respectively. Catechol is a toxic intermediate generated during the biodegradation of polycyclic aromatic hydrocarbons.7 Furthermore, several genes related to the biodegradation of aliphatic hydrocarbons were identified, such as the alkB gene. We also found the aldH, paaF, fadB and fadJ genes, related to the biodegradation of the nylon precursor caprolactam, as well as the atzD gene, related to the degradation of the pesticide atrazine, which suggests further possible biotechnological applications of this strain.
The abundance of genes involved in biodegradation pathways in the genome of N. farcinica TRH1 implies high metabolic plasticity in this strain, which is consistent with the results obtained during the screening of bacteria for hydrocarbon biodegradation.2 The genome sequencing data from this study will support a better understanding of the metabolism of N. farcinica TRH1 and of its potential applications in biotechnological processes, such as the bioremediation of hydrocarbons and xenobiotics.
Nucleotide sequence accession numbers: This WGS BioProject has been deposited at DDBJ/EMBL/GenBank under the accession number PRJNA322144 and the sequences under the accession number LYCQ00000000. The versions described in this paper are the first versions.
Conflicts of interest: The authors declare no conflicts of interest.
We thank the Brazilian Navy and Rodrigo Otoch Chaves for logistic support while collecting samples and for providing the essential structure to transport and store samples. CNPq grant 405544/2012-0 (PROTRINDADE), FAPEMIG, and CAPES (PROEX) funded this work. This work is also supported by the Brazilian Microbiome Project (http://brmicrobiome.org).