Bayes Empirical Bayes Inference of Amino Acid Sites Under Positive Selection

Research output: Contribution to journal › Journal article › Research › peer-review

Ziheng Yang
Wendy Shuk Wan Wong
Rasmus Nielsen

Codon-based substitution models have been widely used to identify amino acid sites under positive selection in comparative analysis of protein-coding DNA sequences. The nonsynonymous-synonymous substitution rate ratio ($d_N/d_S$, denoted $\omega$) is used as a measure of selective pressure at the protein level, with $\omega > 1$ indicating positive selection. Statistical distributions are used to model the variation in $\omega$ among sites, allowing a subset of sites to have $\omega > 1$ while the rest of the sequence may be under purifying selection with $\omega < 1$. An empirical Bayes (EB) approach is then used to calculate posterior probabilities that a site comes from the site class with $\omega > 1$. Current implementations, however, use the naive EB (NEB) approach and fail to account for sampling errors in maximum likelihood estimates of model parameters, such as the proportions and $\omega$ ratios for the site classes. In small data sets lacking information, this approach may lead to unreliable posterior probability calculations. In this paper, we develop a Bayes empirical Bayes (BEB) approach to the problem, which assigns a prior to the model parameters and integrates over their uncertainties. We compare the new and old methods on real and simulated data sets. The results suggest that in small data sets the new BEB method does not generate false positives as did the old NEB approach, while in large data sets it retains the good power of the NEB approach for inferring positively selected sites.
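The contrast between the two approaches can be illustrated with a minimal sketch. It assumes that the per-site likelihoods under each site class are already available, and it replaces the integral over parameter uncertainty with a weighted sum over a small grid of parameter values. The function names, the grid, and all numbers are hypothetical and chosen only for illustration; this is not necessarily the computational scheme used in the paper.

```python
import numpy as np

def neb_posterior(site_lik, props):
    """Naive EB: plug point estimates (e.g. MLEs) into Bayes' rule.

    site_lik: (n_sites, n_classes) likelihood of each site's data given each class
    props:    (n_classes,) estimated proportion of sites in each class
    Returns (n_sites, n_classes) posterior class probabilities.
    """
    joint = np.asarray(site_lik, dtype=float) * np.asarray(props, dtype=float)
    return joint / joint.sum(axis=1, keepdims=True)

def beb_posterior(site_lik_grid, props_grid, prior):
    """BEB-style: average the per-site posteriors over parameter uncertainty.

    site_lik_grid: (n_grid, n_sites, n_classes) site likelihoods at each grid point
    props_grid:    (n_grid, n_classes) class proportions at each grid point
    prior:         (n_grid,) positive prior weight of each grid point
    """
    lik = np.asarray(site_lik_grid, dtype=float)
    props = np.asarray(props_grid, dtype=float)
    joint = lik * props[:, None, :]                # (grid, site, class)
    site_marg = joint.sum(axis=2)                  # f(x_h | theta) per grid point
    # Posterior weight of each grid point: prior times likelihood of all sites.
    log_w = np.log(np.asarray(prior, dtype=float)) + np.log(site_marg).sum(axis=1)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    cond = joint / site_marg[:, :, None]           # P(class | x_h, theta)
    return np.einsum("g,gsk->sk", w, cond)

# Hypothetical toy data: 2 sites, 2 classes (class 1 plays the role of omega > 1).
site_lik = np.array([[0.20, 0.10],
                     [0.05, 0.40]])
props_hat = np.array([0.9, 0.1])                   # assumed point estimates
neb = neb_posterior(site_lik, props_hat)

# Hypothetical grid over the proportion of the omega > 1 class, flat prior.
# For simplicity the site likelihoods are held fixed across grid points;
# in a real analysis they would change with omega at each grid point.
p1 = np.array([0.05, 0.10, 0.20])
props_grid = np.column_stack([1.0 - p1, p1])
lik_grid = np.broadcast_to(site_lik, (len(p1),) + site_lik.shape)
beb = beb_posterior(lik_grid, props_grid, np.ones(len(p1)))
```

With a single grid point the BEB-style average reduces to the NEB calculation, which makes explicit that NEB treats the parameter estimates as if they were known without error.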

Original language	English
Journal	Molecular Biology and Evolution
Volume	22
Issue number	4
Pages (from-to)	1107-1118
ISSN	0737-4038
DOIs	https://doi.org/10.1093/molbev/msi097
Publication status	Published - 2005

Bibliographical note

Key Words: positive selection • codon-substitution models • Bayes empirical Bayes

ID: 87244

Globe Institute
