Rapid discovery of novel prophages using biological feature engineering and machine learning

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Rapid discovery of novel prophages using biological feature engineering and machine learning. / Sirén, Kimmo; Millard, Andrew; Petersen, Bent; Gilbert, M. Thomas P.; Clokie, Martha R. J.; Sicheritz-Ponten, Thomas.

In: NAR Genomics and Bioinformatics, Vol. 3, No. 1, lqaa109, 2021.

Harvard

Sirén, K, Millard, A, Petersen, B, Gilbert, MTP, Clokie, MRJ & Sicheritz-Ponten, T 2021, 'Rapid discovery of novel prophages using biological feature engineering and machine learning', NAR Genomics and Bioinformatics, vol. 3, no. 1, lqaa109. https://doi.org/10.1093/nargab/lqaa109

APA

Sirén, K., Millard, A., Petersen, B., Gilbert, M. T. P., Clokie, M. R. J., & Sicheritz-Ponten, T. (2021). Rapid discovery of novel prophages using biological feature engineering and machine learning. NAR Genomics and Bioinformatics, 3(1), [lqaa109]. https://doi.org/10.1093/nargab/lqaa109

Vancouver

Sirén K, Millard A, Petersen B, Gilbert MTP, Clokie MRJ, Sicheritz-Ponten T. Rapid discovery of novel prophages using biological feature engineering and machine learning. NAR Genomics and Bioinformatics. 2021;3(1):lqaa109. https://doi.org/10.1093/nargab/lqaa109

Author

Sirén, Kimmo ; Millard, Andrew ; Petersen, Bent ; Gilbert, M. Thomas P. ; Clokie, Martha R. J. ; Sicheritz-Ponten, Thomas. / Rapid discovery of novel prophages using biological feature engineering and machine learning. In: NAR Genomics and Bioinformatics. 2021 ; Vol. 3, No. 1.

Bibtex

@article{0567e549108a4d18ba41bcf402ba3776,
title = "Rapid discovery of novel prophages using biological feature engineering and machine learning",
abstract = "Prophages are phages that are integrated into bacterial genomes and which are key to understanding many aspects of bacterial biology. Their extreme diversity means they are challenging to detect using sequence similarity, yet this remains the paradigm and thus many phages remain unidentified. We present a novel, fast and generalizing machine learning method based on feature space to facilitate novel prophage discovery. To validate the approach, we reanalyzed publicly available marine viromes and single-cell genomes using our feature-based approaches and found consistently more phages than were detected using current state-of-the-art tools while being notably faster. This demonstrates that our approach significantly enhances bacteriophage discovery and thus provides a new starting point for exploring new biologies.",
author = "Kimmo Sir{\'e}n and Andrew Millard and Bent Petersen and Gilbert, {M. Thomas P.} and Clokie, {Martha R. J.} and Thomas Sicheritz-Ponten",
year = "2021",
doi = "10.1093/nargab/lqaa109",
language = "English",
volume = "3",
journal = "NAR Genomics and Bioinformatics",
issn = "2631-9268",
publisher = "Oxford University Press",
number = "1",
}

RIS

TY  - JOUR
T1  - Rapid discovery of novel prophages using biological feature engineering and machine learning
AU  - Sirén, Kimmo
AU  - Millard, Andrew
AU  - Petersen, Bent
AU  - Gilbert, M. Thomas P.
AU  - Clokie, Martha R. J.
AU  - Sicheritz-Ponten, Thomas
PY  - 2021
Y1  - 2021
N2  - Prophages are phages that are integrated into bacterial genomes and which are key to understanding many aspects of bacterial biology. Their extreme diversity means they are challenging to detect using sequence similarity, yet this remains the paradigm and thus many phages remain unidentified. We present a novel, fast and generalizing machine learning method based on feature space to facilitate novel prophage discovery. To validate the approach, we reanalyzed publicly available marine viromes and single-cell genomes using our feature-based approaches and found consistently more phages than were detected using current state-of-the-art tools while being notably faster. This demonstrates that our approach significantly enhances bacteriophage discovery and thus provides a new starting point for exploring new biologies.
AB  - Prophages are phages that are integrated into bacterial genomes and which are key to understanding many aspects of bacterial biology. Their extreme diversity means they are challenging to detect using sequence similarity, yet this remains the paradigm and thus many phages remain unidentified. We present a novel, fast and generalizing machine learning method based on feature space to facilitate novel prophage discovery. To validate the approach, we reanalyzed publicly available marine viromes and single-cell genomes using our feature-based approaches and found consistently more phages than were detected using current state-of-the-art tools while being notably faster. This demonstrates that our approach significantly enhances bacteriophage discovery and thus provides a new starting point for exploring new biologies.
U2  - 10.1093/nargab/lqaa109
DO  - 10.1093/nargab/lqaa109
M3  - Journal article
C2  - 33575651
VL  - 3
JO  - NAR Genomics and Bioinformatics
JF  - NAR Genomics and Bioinformatics
SN  - 2631-9268
IS  - 1
M1  - lqaa109
ER  -