VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models

Research output: Contribution to journal, Journal article, Research, peer-review

Standard

VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models. / Rangel-Pineros, Guillermo; Almeida, Alexandre; Beracochea, Martin; Sakharova, Ekaterina; Marz, Manja; Muñoz, Alejandro Reyes; Hölzer, Martin; Finn, Robert D.

In: PLOS Computational Biology, Vol. 19, No. 8, e1011422, 2023.

Harvard

Rangel-Pineros, G, Almeida, A, Beracochea, M, Sakharova, E, Marz, M, Muñoz, AR, Hölzer, M & Finn, RD 2023, 'VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models', PLOS Computational Biology, vol. 19, no. 8, e1011422. https://doi.org/10.1371/journal.pcbi.1011422

APA

Rangel-Pineros, G., Almeida, A., Beracochea, M., Sakharova, E., Marz, M., Muñoz, A. R., Hölzer, M., & Finn, R. D. (2023). VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models. PLOS Computational Biology, 19(8), [e1011422]. https://doi.org/10.1371/journal.pcbi.1011422

Vancouver

Rangel-Pineros G, Almeida A, Beracochea M, Sakharova E, Marz M, Muñoz AR et al. VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models. PLOS Computational Biology. 2023;19(8). e1011422. https://doi.org/10.1371/journal.pcbi.1011422

Author

Rangel-Pineros, Guillermo; Almeida, Alexandre; Beracochea, Martin; Sakharova, Ekaterina; Marz, Manja; Muñoz, Alejandro Reyes; Hölzer, Martin; Finn, Robert D. / VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models. In: PLOS Computational Biology. 2023; Vol. 19, No. 8.

Bibtex

@article{8e9702545a834f1ba9269bf467772b59,
title = "VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models",
abstract = "The study of viral communities has revealed the enormous diversity and impact these biological entities have on various ecosystems. These observations have sparked widespread interest in developing computational strategies that support the comprehensive characterization of viral communities based on sequencing data. Here we introduce VIRify, a new computational pipeline designed to provide a user-friendly and accurate functional and taxonomic characterization of viral communities. VIRify identifies viral contigs and prophages from metagenomic assemblies and annotates them using a collection of viral profile hidden Markov models (HMMs). These include our manually-curated profile HMMs, which serve as specific taxonomic markers for a wide range of prokaryotic and eukaryotic viral taxa and are thus used to reliably classify viral contigs. We tested VIRify on assemblies from two microbial mock communities, a large metagenomics study, and a collection of publicly available viral genomic sequences from the human gut. The results showed that VIRify could identify sequences from both prokaryotic and eukaryotic viruses, and provided taxonomic classifications from the genus to the family rank with an average accuracy of 86.6\%. In addition, VIRify allowed the detection and taxonomic classification of a range of prokaryotic and eukaryotic viruses present in 243 marine metagenomic assemblies. Finally, the use of VIRify led to a large expansion in the number of taxonomically classified human gut viral sequences and the improvement of outdated and shallow taxonomic classifications. Overall, we demonstrate that VIRify is a novel and powerful resource that offers an enhanced capability to detect a broad range of viral contigs and taxonomically classify them.",
author = "Guillermo Rangel-Pineros and Alexandre Almeida and Martin Beracochea and Ekaterina Sakharova and Manja Marz and Mu{\~n}oz, {Alejandro Reyes} and Martin H{\"o}lzer and Finn, {Robert D.}",
note = "Publisher Copyright: {\textcopyright} 2023 Rangel-Pineros et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.",
year = "2023",
doi = "10.1371/journal.pcbi.1011422",
language = "English",
volume = "19",
journal = "PLOS Computational Biology",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "8",
pages = "e1011422",
}

RIS

TY - JOUR

T1 - VIRify

T2 - An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models

AU - Rangel-Pineros, Guillermo

AU - Almeida, Alexandre

AU - Beracochea, Martin

AU - Sakharova, Ekaterina

AU - Marz, Manja

AU - Muñoz, Alejandro Reyes

AU - Hölzer, Martin

AU - Finn, Robert D.

N1 - Publisher Copyright: © 2023 Rangel-Pineros et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

PY - 2023

Y1 - 2023

N2 - The study of viral communities has revealed the enormous diversity and impact these biological entities have on various ecosystems. These observations have sparked widespread interest in developing computational strategies that support the comprehensive characterization of viral communities based on sequencing data. Here we introduce VIRify, a new computational pipeline designed to provide a user-friendly and accurate functional and taxonomic characterization of viral communities. VIRify identifies viral contigs and prophages from metagenomic assemblies and annotates them using a collection of viral profile hidden Markov models (HMMs). These include our manually-curated profile HMMs, which serve as specific taxonomic markers for a wide range of prokaryotic and eukaryotic viral taxa and are thus used to reliably classify viral contigs. We tested VIRify on assemblies from two microbial mock communities, a large metagenomics study, and a collection of publicly available viral genomic sequences from the human gut. The results showed that VIRify could identify sequences from both prokaryotic and eukaryotic viruses, and provided taxonomic classifications from the genus to the family rank with an average accuracy of 86.6%. In addition, VIRify allowed the detection and taxonomic classification of a range of prokaryotic and eukaryotic viruses present in 243 marine metagenomic assemblies. Finally, the use of VIRify led to a large expansion in the number of taxonomically classified human gut viral sequences and the improvement of outdated and shallow taxonomic classifications. Overall, we demonstrate that VIRify is a novel and powerful resource that offers an enhanced capability to detect a broad range of viral contigs and taxonomically classify them.

AB - The study of viral communities has revealed the enormous diversity and impact these biological entities have on various ecosystems. These observations have sparked widespread interest in developing computational strategies that support the comprehensive characterization of viral communities based on sequencing data. Here we introduce VIRify, a new computational pipeline designed to provide a user-friendly and accurate functional and taxonomic characterization of viral communities. VIRify identifies viral contigs and prophages from metagenomic assemblies and annotates them using a collection of viral profile hidden Markov models (HMMs). These include our manually-curated profile HMMs, which serve as specific taxonomic markers for a wide range of prokaryotic and eukaryotic viral taxa and are thus used to reliably classify viral contigs. We tested VIRify on assemblies from two microbial mock communities, a large metagenomics study, and a collection of publicly available viral genomic sequences from the human gut. The results showed that VIRify could identify sequences from both prokaryotic and eukaryotic viruses, and provided taxonomic classifications from the genus to the family rank with an average accuracy of 86.6%. In addition, VIRify allowed the detection and taxonomic classification of a range of prokaryotic and eukaryotic viruses present in 243 marine metagenomic assemblies. Finally, the use of VIRify led to a large expansion in the number of taxonomically classified human gut viral sequences and the improvement of outdated and shallow taxonomic classifications. Overall, we demonstrate that VIRify is a novel and powerful resource that offers an enhanced capability to detect a broad range of viral contigs and taxonomically classify them.

U2 - 10.1371/journal.pcbi.1011422

DO - 10.1371/journal.pcbi.1011422

M3 - Journal article

C2 - 37639475

AN - SCOPUS:85170244121

VL - 19

JO - PLOS Computational Biology

JF - PLOS Computational Biology

SN - 1553-734X

IS - 8

M1 - e1011422

ER -
