Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. / Pipes, Lenore; Chen, Zihao; Afanaseva, Svetlana; Nielsen, Rasmus.

In: Cell Reports Methods, Vol. 2, No. 10, 100313, 2022.

Harvard

Pipes, L, Chen, Z, Afanaseva, S & Nielsen, R 2022, 'Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples', Cell Reports Methods, vol. 2, no. 10, 100313. https://doi.org/10.1016/j.crmeth.2022.100313

APA

Pipes, L., Chen, Z., Afanaseva, S., & Nielsen, R. (2022). Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. Cell Reports Methods, 2(10), Article 100313. https://doi.org/10.1016/j.crmeth.2022.100313

Vancouver

Pipes L, Chen Z, Afanaseva S, Nielsen R. Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. Cell Reports Methods. 2022;2(10):100313. https://doi.org/10.1016/j.crmeth.2022.100313

Author

Pipes, Lenore ; Chen, Zihao ; Afanaseva, Svetlana ; Nielsen, Rasmus. / Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. In: Cell Reports Methods. 2022 ; Vol. 2, No. 10.

Bibtex

@article{82ed76c19e66442bac17f9c9a242f49e,
title = "Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples",
abstract = "Wastewater surveillance has become essential for monitoring the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The quantification of SARS-CoV-2 RNA in wastewater correlates with the coronavirus disease 2019 (COVID-19) caseload in a community. However, estimating the proportions of different SARS-CoV-2 haplotypes has remained technically difficult. We present a phylogenetic imputation method for improving the SARS-CoV-2 reference database and a method for estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. The phylogenetic imputation method uses the global SARS-CoV-2 phylogeny and imputes based on the maximum of the posterior probability of each nucleotide. We show that the imputation method has error rates comparable to, or lower than, typical sequencing error rates, which substantially improves the reference database and allows for accurate inferences of haplotype composition. Our method for estimating relative proportions of haplotypes uses an initial step to remove unlikely haplotypes and an expectation maximization (EM) algorithm for obtaining maximum likelihood estimates of the proportions of different haplotypes in a sample. Using simulations with a reference database of >3 million SARS-CoV-2 genomes, we show that the estimated proportions reflect the true proportions given sufficiently high sequencing depth.",
keywords = "COVID-19, expectation maximization, imputation, SARS-CoV-2, wastewater surveillance, wastewater-based epidemiology",
author = "Lenore Pipes and Zihao Chen and Svetlana Afanaseva and Rasmus Nielsen",
note = "Publisher Copyright: {\textcopyright} 2022 The Author(s)",
year = "2022",
doi = "10.1016/j.crmeth.2022.100313",
language = "English",
volume = "2",
journal = "Cell Reports Methods",
issn = "2667-2375",
publisher = "Cell Press",
number = "10",

}

RIS

TY - JOUR

T1 - Estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples

AU - Pipes, Lenore

AU - Chen, Zihao

AU - Afanaseva, Svetlana

AU - Nielsen, Rasmus

N1 - Publisher Copyright: © 2022 The Author(s)

PY - 2022

Y1 - 2022

N2 - Wastewater surveillance has become essential for monitoring the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The quantification of SARS-CoV-2 RNA in wastewater correlates with the coronavirus disease 2019 (COVID-19) caseload in a community. However, estimating the proportions of different SARS-CoV-2 haplotypes has remained technically difficult. We present a phylogenetic imputation method for improving the SARS-CoV-2 reference database and a method for estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. The phylogenetic imputation method uses the global SARS-CoV-2 phylogeny and imputes based on the maximum of the posterior probability of each nucleotide. We show that the imputation method has error rates comparable to, or lower than, typical sequencing error rates, which substantially improves the reference database and allows for accurate inferences of haplotype composition. Our method for estimating relative proportions of haplotypes uses an initial step to remove unlikely haplotypes and an expectation maximization (EM) algorithm for obtaining maximum likelihood estimates of the proportions of different haplotypes in a sample. Using simulations with a reference database of >3 million SARS-CoV-2 genomes, we show that the estimated proportions reflect the true proportions given sufficiently high sequencing depth.

AB - Wastewater surveillance has become essential for monitoring the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The quantification of SARS-CoV-2 RNA in wastewater correlates with the coronavirus disease 2019 (COVID-19) caseload in a community. However, estimating the proportions of different SARS-CoV-2 haplotypes has remained technically difficult. We present a phylogenetic imputation method for improving the SARS-CoV-2 reference database and a method for estimating the relative proportions of SARS-CoV-2 haplotypes from wastewater samples. The phylogenetic imputation method uses the global SARS-CoV-2 phylogeny and imputes based on the maximum of the posterior probability of each nucleotide. We show that the imputation method has error rates comparable to, or lower than, typical sequencing error rates, which substantially improves the reference database and allows for accurate inferences of haplotype composition. Our method for estimating relative proportions of haplotypes uses an initial step to remove unlikely haplotypes and an expectation maximization (EM) algorithm for obtaining maximum likelihood estimates of the proportions of different haplotypes in a sample. Using simulations with a reference database of >3 million SARS-CoV-2 genomes, we show that the estimated proportions reflect the true proportions given sufficiently high sequencing depth.

KW - COVID-19

KW - expectation maximization

KW - imputation

KW - SARS-CoV-2

KW - wastewater surveillance

KW - wastewater-based epidemiology

U2 - 10.1016/j.crmeth.2022.100313

DO - 10.1016/j.crmeth.2022.100313

M3 - Journal article

C2 - 36159190

AN - SCOPUS:85139179174

VL - 2

JO - Cell Reports Methods

JF - Cell Reports Methods

SN - 2667-2375

IS - 10

M1 - 100313

ER -
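
Illustrative sketch

The abstract describes an expectation maximization (EM) algorithm that yields maximum likelihood estimates of haplotype proportions. The Python sketch below shows the generic EM update for mixture proportions only; it is not the authors' implementation. It assumes a precomputed matrix `lik` in which `lik[i][k]` is the probability of read `i` given haplotype `k`. The function name `em_proportions`, the parameters `n_iter` and `tol`, and the toy input are hypothetical. The paper's filtering step for unlikely haplotypes and its read likelihood model are not reproduced here.

```python
def em_proportions(lik, n_iter=500, tol=1e-12):
    """Estimate mixture proportions by EM.

    lik[i][k] is an assumed, precomputed P(read i | haplotype k).
    Every read must have positive likelihood under at least one haplotype.
    """
    n_reads = len(lik)
    n_haps = len(lik[0])
    # Start from uniform proportions (an arbitrary illustrative choice).
    p = [1.0 / n_haps] * n_haps
    for _ in range(n_iter):
        totals = [0.0] * n_haps
        for row in lik:
            # E-step: posterior probability that this read came from each haplotype.
            weighted = [l * q for l, q in zip(row, p)]
            s = sum(weighted)
            for k in range(n_haps):
                totals[k] += weighted[k] / s
        # M-step: new proportions are the average posterior responsibilities.
        new_p = [t / n_reads for t in totals]
        converged = max(abs(a - b) for a, b in zip(new_p, p)) < tol
        p = new_p
        if converged:
            break
    return p


# Toy input: two reads compatible only with haplotype 0, one only with haplotype 1.
toy_lik = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
proportions = em_proportions(toy_lik)
print(proportions)
```

With unambiguous reads as in the toy input, the estimate is simply the fraction of reads assigned to each haplotype. With ambiguous reads, the E-step splits each read across haplotypes in proportion to the current estimates.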
