An approach for estimating haplotype diversity from sequences with unequal lengths

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

An approach for estimating haplotype diversity from sequences with unequal lengths. / Fan, Ping; Fjeldså, Jon; Liu, Xuan; Dong, Yafei; Chang, Yongbin; Qu, Yanhua; Song, Gang; Lei, Fumin.

In: Methods in Ecology and Evolution, Vol. 12, No. 9, 2021, p. 1658-1667.

Harvard

Fan, P, Fjeldså, J, Liu, X, Dong, Y, Chang, Y, Qu, Y, Song, G & Lei, F 2021, 'An approach for estimating haplotype diversity from sequences with unequal lengths', Methods in Ecology and Evolution, vol. 12, no. 9, pp. 1658-1667. https://doi.org/10.1111/2041-210X.13643

APA

Fan, P., Fjeldså, J., Liu, X., Dong, Y., Chang, Y., Qu, Y., Song, G., & Lei, F. (2021). An approach for estimating haplotype diversity from sequences with unequal lengths. Methods in Ecology and Evolution, 12(9), 1658-1667. https://doi.org/10.1111/2041-210X.13643

Vancouver

Fan P, Fjeldså J, Liu X, Dong Y, Chang Y, Qu Y et al. An approach for estimating haplotype diversity from sequences with unequal lengths. Methods in Ecology and Evolution. 2021;12(9):1658-1667. https://doi.org/10.1111/2041-210X.13643

Author

Fan, Ping ; Fjeldså, Jon ; Liu, Xuan ; Dong, Yafei ; Chang, Yongbin ; Qu, Yanhua ; Song, Gang ; Lei, Fumin. / An approach for estimating haplotype diversity from sequences with unequal lengths. In: Methods in Ecology and Evolution. 2021 ; Vol. 12, No. 9. pp. 1658-1667.

Bibtex

@article{929d809b0e5f4494be96887141f8ed1f,
title = "An approach for estimating haplotype diversity from sequences with unequal lengths",
abstract = "Genetic diversity is an essential component of biodiversity. Developing robust quantification methods is critically important in depicting the genetic diversity underlying the geographical distributions of species, especially for the sequence data with unequal lengths. Traditional calculation of genetic diversity depends on sequences of equal length. However, many homologous sequences downloaded from online repositories vary in length, posing a significant challenge to quantify the genetic diversity, especially haplotype diversity. We developed a new approach independent of sequence length by applying the same parameters used in calculating nucleotide diversity to estimate haplotype diversity. We compared this novel approach with the calculations by the program DNAsp, and we used simulation data from terrestrial vertebrates (birds, mammals and amphibians) and Homo sapiens to validate the method's performance. We further applied this approach to explore the global latitudinal gradients of haplotype diversity in amphibians, mammals and birds, and compared the results by traditional methods. The haplotype diversity calculated by our novel approach is consistent with the results from DNAsp. The simulations showed that our approach is robust and has a good estimating performance for sequence data with unequal lengths. For the datasets of terrestrial vertebrates and H. sapiens, our approach is capable of estimating haplotype diversity with unequal intraspecific sequence lengths. In contrast to patterns based on traditional methods, we observed different latitudinal patterns of haplotype diversity between the northern and southern hemispheres for terrestrial vertebrates, which is consistent with the updated pattern of nucleotide diversity for mammals. The present work contributes to the development of more precise quantification methods, which may be broadly applied to assessing biogeographical patterns of genetic diversity.",
keywords = "genetic diversity, haplotype diversity, haplotype–nucleotide diversity relationship, nucleotide diversity, unequal length sequences",
author = "Ping Fan and Jon Fjelds{\aa} and Xuan Liu and Yafei Dong and Yongbin Chang and Yanhua Qu and Gang Song and Fumin Lei",
note = "Funding Information: We thank Xiaolu Jiao and Xin Yu for assistance with data analysis, Weiwei Zhai, Liang Ma and Hechuan Yang for discussion and Huijie Qiao for his generous help during our revision. This study was funded by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA19050202 to F.L.), the Second Tibetan Plateau Scientific Expedition and Research (STEP) Program (2019QZKK0304 to F.L. and G.S.), National Science Foundation of China (32070434 \& 31572291 to G.S.; 31630069 to F.L.), the National Science and Technology Basic Resources Survey Program of China (2019FY100204 to F.L. and P.F.) and China Scholarship Council, Grant/Award Number: [2017]7011 to P.F. Publisher Copyright: {\textcopyright} 2021 British Ecological Society",
year = "2021",
doi = "10.1111/2041-210X.13643",
language = "English",
volume = "12",
pages = "1658--1667",
journal = "Methods in Ecology and Evolution",
issn = "2041-210X",
publisher = "Wiley-Blackwell",
number = "9",

}

RIS

TY - JOUR

T1 - An approach for estimating haplotype diversity from sequences with unequal lengths

AU - Fan, Ping

AU - Fjeldså, Jon

AU - Liu, Xuan

AU - Dong, Yafei

AU - Chang, Yongbin

AU - Qu, Yanhua

AU - Song, Gang

AU - Lei, Fumin

N1 - Funding Information: We thank Xiaolu Jiao and Xin Yu for assistance with data analysis, Weiwei Zhai, Liang Ma and Hechuan Yang for discussion and Huijie Qiao for his generous help during our revision. This study was funded by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA19050202 to F.L.), the Second Tibetan Plateau Scientific Expedition and Research (STEP) Program (2019QZKK0304 to F.L. and G.S.), National Science Foundation of China (32070434 & 31572291 to G.S.; 31630069 to F.L.), the National Science and Technology Basic Resources Survey Program of China (2019FY100204 to F.L. and P.F.) and China Scholarship Council, Grant/Award Number: [2017]7011 to P.F. Publisher Copyright: © 2021 British Ecological Society

PY - 2021

Y1 - 2021

N2 - Genetic diversity is an essential component of biodiversity. Developing robust quantification methods is critically important in depicting the genetic diversity underlying the geographical distributions of species, especially for the sequence data with unequal lengths. Traditional calculation of genetic diversity depends on sequences of equal length. However, many homologous sequences downloaded from online repositories vary in length, posing a significant challenge to quantify the genetic diversity, especially haplotype diversity. We developed a new approach independent of sequence length by applying the same parameters used in calculating nucleotide diversity to estimate haplotype diversity. We compared this novel approach with the calculations by the program DNAsp, and we used simulation data from terrestrial vertebrates (birds, mammals and amphibians) and Homo sapiens to validate the method's performance. We further applied this approach to explore the global latitudinal gradients of haplotype diversity in amphibians, mammals and birds, and compared the results by traditional methods. The haplotype diversity calculated by our novel approach is consistent with the results from DNAsp. The simulations showed that our approach is robust and has a good estimating performance for sequence data with unequal lengths. For the datasets of terrestrial vertebrates and H. sapiens, our approach is capable of estimating haplotype diversity with unequal intraspecific sequence lengths. In contrast to patterns based on traditional methods, we observed different latitudinal patterns of haplotype diversity between the northern and southern hemispheres for terrestrial vertebrates, which is consistent with the updated pattern of nucleotide diversity for mammals. The present work contributes to the development of more precise quantification methods, which may be broadly applied to assessing biogeographical patterns of genetic diversity.

AB - Genetic diversity is an essential component of biodiversity. Developing robust quantification methods is critically important in depicting the genetic diversity underlying the geographical distributions of species, especially for the sequence data with unequal lengths. Traditional calculation of genetic diversity depends on sequences of equal length. However, many homologous sequences downloaded from online repositories vary in length, posing a significant challenge to quantify the genetic diversity, especially haplotype diversity. We developed a new approach independent of sequence length by applying the same parameters used in calculating nucleotide diversity to estimate haplotype diversity. We compared this novel approach with the calculations by the program DNAsp, and we used simulation data from terrestrial vertebrates (birds, mammals and amphibians) and Homo sapiens to validate the method's performance. We further applied this approach to explore the global latitudinal gradients of haplotype diversity in amphibians, mammals and birds, and compared the results by traditional methods. The haplotype diversity calculated by our novel approach is consistent with the results from DNAsp. The simulations showed that our approach is robust and has a good estimating performance for sequence data with unequal lengths. For the datasets of terrestrial vertebrates and H. sapiens, our approach is capable of estimating haplotype diversity with unequal intraspecific sequence lengths. In contrast to patterns based on traditional methods, we observed different latitudinal patterns of haplotype diversity between the northern and southern hemispheres for terrestrial vertebrates, which is consistent with the updated pattern of nucleotide diversity for mammals. The present work contributes to the development of more precise quantification methods, which may be broadly applied to assessing biogeographical patterns of genetic diversity.

KW - genetic diversity

KW - haplotype diversity

KW - haplotype–nucleotide diversity relationship

KW - nucleotide diversity

KW - unequal length sequences

U2 - 10.1111/2041-210X.13643

DO - 10.1111/2041-210X.13643

M3 - Journal article

AN - SCOPUS:85108822137

VL - 12

SP - 1658

EP - 1667

JO - Methods in Ecology and Evolution

JF - Methods in Ecology and Evolution

SN - 2041-210X

IS - 9

ER -
