REPdenovo: Inferring De Novo repeat motifs from short sequence reads

Research output: Contribution to journal > Journal article > Research > peer-review

Standard

REPdenovo: Inferring De Novo repeat motifs from short sequence reads. / Chu, Chong; Nielsen, Rasmus; Wu, Yufeng.

In: PLoS ONE, Vol. 11, No. 3, e0150719, 2016.

Harvard

Chu, C, Nielsen, R & Wu, Y 2016, 'REPdenovo: Inferring De Novo repeat motifs from short sequence reads', PLoS ONE, vol. 11, no. 3, e0150719. https://doi.org/10.1371/journal.pone.0150719

APA

Chu, C., Nielsen, R., & Wu, Y. (2016). REPdenovo: Inferring De Novo repeat motifs from short sequence reads. PLoS ONE, 11(3), Article e0150719. https://doi.org/10.1371/journal.pone.0150719

Vancouver

Chu C, Nielsen R, Wu Y. REPdenovo: Inferring De Novo repeat motifs from short sequence reads. PLoS ONE. 2016;11(3):e0150719. https://doi.org/10.1371/journal.pone.0150719

Author

Chu, Chong; Nielsen, Rasmus; Wu, Yufeng. / REPdenovo: Inferring De Novo repeat motifs from short sequence reads. In: PLoS ONE. 2016; Vol. 11, No. 3.

Bibtex

@article{80c6df9dcba14d92bd3f555f202dbadd,
title = "REPdenovo: Inferring De Novo repeat motifs from short sequence reads",
abstract = "Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeats sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host derived repeat sequences. REPdenovo is a new powerful computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo.",
author = "Chong Chu and Rasmus Nielsen and Yufeng Wu",
year = "2016",
doi = "10.1371/journal.pone.0150719",
language = "English",
volume = "11",
journal = "PLoS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "3",
pages = "e0150719",
}

RIS

TY - JOUR

T1 - REPdenovo: Inferring De Novo repeat motifs from short sequence reads

AU - Chu, Chong

AU - Nielsen, Rasmus

AU - Wu, Yufeng

PY - 2016

Y1 - 2016

AB - Repeat elements are important components of eukaryotic genomes. One limitation in our understanding of repeat elements is that most analyses rely on reference genomes that are incomplete and often contain missing data in highly repetitive regions that are difficult to assemble. To overcome this problem we develop a new method, REPdenovo, which assembles repeat sequences directly from raw shotgun sequencing data. REPdenovo can construct various types of repeats that are highly repetitive and have low sequence divergence within copies. We show that REPdenovo is substantially better than existing methods both in terms of the number and the completeness of the repeat sequences that it recovers. The key advantage of REPdenovo is that it can reconstruct long repeats from sequence reads. We apply the method to human data and discover a number of potentially new repeats sequences that have been missed by previous repeat annotations. Many of these sequences are incorporated into various parasite genomes, possibly because the filtering process for host DNA involved in the sequencing of the parasite genomes failed to exclude the host derived repeat sequences. REPdenovo is a new powerful computational tool for annotating genomes and for addressing questions regarding the evolution of repeat families. The software tool, REPdenovo, is available for download at https://github.com/Reedwarbler/REPdenovo.

U2 - 10.1371/journal.pone.0150719

DO - 10.1371/journal.pone.0150719

M3 - Journal article

C2 - 26977803

AN - SCOPUS:84961627463

VL - 11

JO - PLoS ONE

JF - PLoS ONE

SN - 1932-6203

IS - 3

M1 - e0150719

ER -

ID: 222639822