An approach for estimating haplotype diversity from sequences with unequal lengths

Research output: Contribution to journal > Journal article > Research > peer-review

  • Ping Fan
  • Jon Fjeldså
  • Xuan Liu
  • Yafei Dong
  • Yongbin Chang
  • Yanhua Qu
  • Gang Song
  • Fumin Lei

Genetic diversity is an essential component of biodiversity. Robust quantification methods are critical for depicting the genetic diversity underlying the geographical distributions of species, especially for sequence data with unequal lengths. Traditional calculations of genetic diversity depend on sequences of equal length. However, many homologous sequences downloaded from online repositories vary in length, which poses a significant challenge for quantifying genetic diversity, especially haplotype diversity. We developed a new approach that is independent of sequence length by applying the same parameters used in calculating nucleotide diversity to estimate haplotype diversity. We compared this novel approach with the calculations of the program DnaSP, and we used simulation data from terrestrial vertebrates (birds, mammals and amphibians) and Homo sapiens to validate the method's performance. We further applied this approach to explore the global latitudinal gradients of haplotype diversity in amphibians, mammals and birds, and compared the results with those obtained by traditional methods. The haplotype diversity calculated by our novel approach is consistent with the results from DnaSP. The simulations showed that our approach is robust and performs well on sequence data with unequal lengths. For the datasets of terrestrial vertebrates and H. sapiens, our approach is capable of estimating haplotype diversity from intraspecific sequences of unequal length. In contrast to patterns based on traditional methods, we observed different latitudinal patterns of haplotype diversity between the northern and southern hemispheres for terrestrial vertebrates, which is consistent with the updated pattern of nucleotide diversity for mammals. The present work contributes to the development of more precise quantification methods, which may be broadly applied to assessing biogeographical patterns of genetic diversity.
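To illustrate the general idea of a pairwise, length-tolerant calculation, the following is a minimal Python sketch. It is not the authors' published algorithm: it assumes the sequences are already aligned and padded to a common length, that gaps or missing data are marked with "-", "N" or "?", and that two sequences count as different haplotypes only if they differ at a site where both have data. The function names are hypothetical. The classic estimator, n/(n-1) * (1 - sum of squared haplotype frequencies), is included for comparison because it equals the fraction of differing sequence pairs when all sequences are complete.

```python
from itertools import combinations

# Assumed symbols for alignment gaps or missing data (an illustration choice).
MISSING = {"-", "N", "?"}


def pair_differs(a: str, b: str) -> bool:
    """True if two aligned sequences differ at any site where both have data."""
    return any(
        x != y
        for x, y in zip(a.upper(), b.upper())
        if x not in MISSING and y not in MISSING
    )


def haplotype_diversity_pairwise(seqs: list[str]) -> float:
    """Fraction of sequence pairs that differ over their shared sites."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pair_differs(a, b) for a, b in pairs) / len(pairs)


def haplotype_diversity_classic(seqs: list[str]) -> float:
    """Classic estimator: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts: dict[str, int] = {}
    for s in seqs:
        counts[s] = counts.get(s, 0) + 1
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


if __name__ == "__main__":
    complete = ["ACGT", "ACGT", "ACGA", "TCGA"]
    partial = ["ACGT", "ACG-", "ACGA", "--GA"]  # same locus, unequal coverage
    print(haplotype_diversity_pairwise(complete))
    print(haplotype_diversity_classic(complete))
    print(haplotype_diversity_pairwise(partial))
```

With complete sequences the two functions agree. With partial sequences the classic estimator would treat every distinct string as its own haplotype, whereas the pairwise version compares only the sites that each pair shares.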

Original language: English
Journal: Methods in Ecology and Evolution
Volume: 12
Issue number: 9
Pages (from-to): 1658-1667
Number of pages: 10
ISSN: 2041-210X
Publication status: Published - 2021

    Research areas

  • genetic diversity, haplotype diversity, haplotype–nucleotide diversity relationship, nucleotide diversity, unequal length sequences

ID: 273697299