Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference. / Vankan, Mezzalina; Ho, Simon Y. W.; Duchene Garzon, David Alejandro.

In: Systematic Biology, Vol. 71, No. 2, 2022, p. 490-500.


Harvard

Vankan, M, Ho, SYW & Duchene Garzon, DA 2022, 'Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference', Systematic Biology, vol. 71, no. 2, pp. 490-500. https://doi.org/10.1093/sysbio/syab051

APA

Vankan, M., Ho, S. Y. W., & Duchene Garzon, D. A. (2022). Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference. Systematic Biology, 71(2), 490-500. https://doi.org/10.1093/sysbio/syab051

Vancouver

Vankan M, Ho SYW, Duchene Garzon DA. Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference. Systematic Biology. 2022;71(2):490-500. https://doi.org/10.1093/sysbio/syab051

Author

Vankan, Mezzalina ; Ho, Simon Y. W. ; Duchene Garzon, David Alejandro. / Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference. In: Systematic Biology. 2022 ; Vol. 71, No. 2. pp. 490-500.

Bibtex

@article{6a4bf572939f4d4ab4051d0cf39ca732,
title = "Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference",
abstract = "Phylogenetic analyses of genomic data provide a powerful means of reconstructing the evolutionary relationships among organisms, yet such analyses are often hindered by conflicting phylogenetic signals among loci. Identifying the signals that are the most influential to species-tree estimation can help to inform the choice of data for phylogenomic analysis. We investigated this in an analysis of 30 phylogenomic data sets. For each data set, we examined the association between several branch-length characteristics of gene trees and the distance between these gene trees and the corresponding species trees. We found that the distance of each gene tree to the species tree inferred from the full data set was positively associated with variation in root-to-tip distances and negatively associated with mean branch support. However, no such associations were found for gene-tree length, a measure of the overall substitution rate at each locus. We further explored the usefulness of the best-performing branch-based characteristics for selecting loci for phylogenomic analyses. We found that loci that yield gene trees with high variation in root-to-tip distances have a disproportionately distant signal of tree topology compared with the complete data sets. These results suggest that rate variation across lineages should be taken into consideration when exploring and even selecting loci for phylogenomic analysis.",
author = "Mezzalina Vankan and Ho, {Simon Y. W.} and {Duchene Garzon}, {David Alejandro}",
year = "2022",
doi = "10.1093/sysbio/syab051",
language = "English",
volume = "71",
pages = "490--500",
journal = "Systematic Biology",
issn = "1063-5157",
publisher = "Oxford University Press",
number = "2",

}

RIS

TY - JOUR

T1 - Evolutionary Rate Variation Among Lineages in Gene Trees has a Negative Impact on Species-Tree Inference

AU - Vankan, Mezzalina

AU - Ho, Simon Y. W.

AU - Duchene Garzon, David Alejandro

PY - 2022

Y1 - 2022

AB - Phylogenetic analyses of genomic data provide a powerful means of reconstructing the evolutionary relationships among organisms, yet such analyses are often hindered by conflicting phylogenetic signals among loci. Identifying the signals that are the most influential to species-tree estimation can help to inform the choice of data for phylogenomic analysis. We investigated this in an analysis of 30 phylogenomic data sets. For each data set, we examined the association between several branch-length characteristics of gene trees and the distance between these gene trees and the corresponding species trees. We found that the distance of each gene tree to the species tree inferred from the full data set was positively associated with variation in root-to-tip distances and negatively associated with mean branch support. However, no such associations were found for gene-tree length, a measure of the overall substitution rate at each locus. We further explored the usefulness of the best-performing branch-based characteristics for selecting loci for phylogenomic analyses. We found that loci that yield gene trees with high variation in root-to-tip distances have a disproportionately distant signal of tree topology compared with the complete data sets. These results suggest that rate variation across lineages should be taken into consideration when exploring and even selecting loci for phylogenomic analysis.

U2 - 10.1093/sysbio/syab051

DO - 10.1093/sysbio/syab051

M3 - Journal article

C2 - 34255084

VL - 71

SP - 490

EP - 500

JO - Systematic Biology

JF - Systematic Biology

SN - 1063-5157

IS - 2

ER -
