Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations. / Lauterbur, M. Elise; Cavassim, Maria Izabel A.; Gladstein, Ariella L.; Gower, Graham; Pope, Nathaniel S.; Tsambos, Georgia; Adrion, Jeffrey; Belsare, Saurabh; Biddanda, Arjun; Caudill, Victoria; Cury, Jean; Echevarria, Ignacio; Haller, Benjamin C.; Hasan, Ahmed R.; Huang, Xin; Iasi, Leonardo Nicola Martin; Noskova, Ekaterina; Obsteter, Jana; Pavinato, Vitor Antonio Correa; Pearson, Alice; Peede, David; Perez, Manolo F.; Rodrigues, Murillo F.; Smith, Chris C. R.; Spence, Jeffrey P.; Teterina, Anastasia; Tittes, Silas; Unneberg, Per; Vazquez, Juan Manuel; Waples, Ryan K.; Wohns, Anthony Wilder; Wong, Yan; Baumdicker, Franz; Cartwright, Reed A.; Gorjanc, Gregor; Gutenkunst, Ryan N.; Kelleher, Jerome; Kern, Andrew D.; Ragsdale, Aaron P.; Ralph, Peter L.; Schrider, Daniel R.; Gronau, Ilan.

In: eLife, Vol. 12, RP84874, 2023.

Harvard

Lauterbur, ME, Cavassim, MIA, Gladstein, AL, Gower, G, Pope, NS, Tsambos, G, Adrion, J, Belsare, S, Biddanda, A, Caudill, V, Cury, J, Echevarria, I, Haller, BC, Hasan, AR, Huang, X, Iasi, LNM, Noskova, E, Obsteter, J, Pavinato, VAC, Pearson, A, Peede, D, Perez, MF, Rodrigues, MF, Smith, CCR, Spence, JP, Teterina, A, Tittes, S, Unneberg, P, Vazquez, JM, Waples, RK, Wohns, AW, Wong, Y, Baumdicker, F, Cartwright, RA, Gorjanc, G, Gutenkunst, RN, Kelleher, J, Kern, AD, Ragsdale, AP, Ralph, PL, Schrider, DR & Gronau, I 2023, 'Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations', eLife, vol. 12, RP84874. https://doi.org/10.7554/eLife.84874

APA

Lauterbur, M. E., Cavassim, M. I. A., Gladstein, A. L., Gower, G., Pope, N. S., Tsambos, G., Adrion, J., Belsare, S., Biddanda, A., Caudill, V., Cury, J., Echevarria, I., Haller, B. C., Hasan, A. R., Huang, X., Iasi, L. N. M., Noskova, E., Obsteter, J., Pavinato, V. A. C., ... Gronau, I. (2023). Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations. eLife, 12, [RP84874]. https://doi.org/10.7554/eLife.84874

Vancouver

Lauterbur ME, Cavassim MIA, Gladstein AL, Gower G, Pope NS, Tsambos G et al. Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations. eLife. 2023;12. RP84874. https://doi.org/10.7554/eLife.84874

Author

Lauterbur, M. Elise ; Cavassim, Maria Izabel A. ; Gladstein, Ariella L. ; Gower, Graham ; Pope, Nathaniel S. ; Tsambos, Georgia ; Adrion, Jeffrey ; Belsare, Saurabh ; Biddanda, Arjun ; Caudill, Victoria ; Cury, Jean ; Echevarria, Ignacio ; Haller, Benjamin C. ; Hasan, Ahmed R. ; Huang, Xin ; Iasi, Leonardo Nicola Martin ; Noskova, Ekaterina ; Obsteter, Jana ; Pavinato, Vitor Antonio Correa ; Pearson, Alice ; Peede, David ; Perez, Manolo F. ; Rodrigues, Murillo F. ; Smith, Chris C. R. ; Spence, Jeffrey P. ; Teterina, Anastasia ; Tittes, Silas ; Unneberg, Per ; Vazquez, Juan Manuel ; Waples, Ryan K. ; Wohns, Anthony Wilder ; Wong, Yan ; Baumdicker, Franz ; Cartwright, Reed A. ; Gorjanc, Gregor ; Gutenkunst, Ryan N. ; Kelleher, Jerome ; Kern, Andrew D. ; Ragsdale, Aaron P. ; Ralph, Peter L. ; Schrider, Daniel R. ; Gronau, Ilan. / Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations. In: eLife. 2023 ; Vol. 12.

Bibtex

@article{249d1f95d63044468f22ec91f6ccd56e,
title = "Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations",
abstract = "Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.",
keywords = "genetics, genomics, open source, population genetics, simulations",
author = "Lauterbur, {M. Elise} and Cavassim, {Maria Izabel A.} and Gladstein, {Ariella L.} and Graham Gower and Pope, {Nathaniel S.} and Georgia Tsambos and Jeffrey Adrion and Saurabh Belsare and Arjun Biddanda and Victoria Caudill and Jean Cury and Ignacio Echevarria and Haller, {Benjamin C.} and Hasan, {Ahmed R.} and Xin Huang and Iasi, {Leonardo Nicola Martin} and Ekaterina Noskova and Jana Obsteter and Pavinato, {Vitor Antonio Correa} and Alice Pearson and David Peede and Perez, {Manolo F.} and Rodrigues, {Murillo F.} and Smith, {Chris C. R.} and Spence, {Jeffrey P.} and Anastasia Teterina and Silas Tittes and Per Unneberg and Vazquez, {Juan Manuel} and Waples, {Ryan K.} and Wohns, {Anthony Wilder} and Yan Wong and Franz Baumdicker and Cartwright, {Reed A.} and Gregor Gorjanc and Gutenkunst, {Ryan N.} and Jerome Kelleher and Kern, {Andrew D.} and Ragsdale, {Aaron P.} and Ralph, {Peter L.} and Schrider, {Daniel R.} and Ilan Gronau",
note = "Publisher Copyright: {\textcopyright} 2023, Lauterbur et al.",
year = "2023",
doi = "10.7554/eLife.84874",
language = "English",
volume = "12",
journal = "eLife",
issn = "2050-084X",
publisher = "eLife Sciences Publications Ltd.",

}

RIS

TY - JOUR

T1 - Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations

AU - Lauterbur, M. Elise

AU - Cavassim, Maria Izabel A.

AU - Gladstein, Ariella L.

AU - Gower, Graham

AU - Pope, Nathaniel S.

AU - Tsambos, Georgia

AU - Adrion, Jeffrey

AU - Belsare, Saurabh

AU - Biddanda, Arjun

AU - Caudill, Victoria

AU - Cury, Jean

AU - Echevarria, Ignacio

AU - Haller, Benjamin C.

AU - Hasan, Ahmed R.

AU - Huang, Xin

AU - Iasi, Leonardo Nicola Martin

AU - Noskova, Ekaterina

AU - Obsteter, Jana

AU - Pavinato, Vitor Antonio Correa

AU - Pearson, Alice

AU - Peede, David

AU - Perez, Manolo F.

AU - Rodrigues, Murillo F.

AU - Smith, Chris C. R.

AU - Spence, Jeffrey P.

AU - Teterina, Anastasia

AU - Tittes, Silas

AU - Unneberg, Per

AU - Vazquez, Juan Manuel

AU - Waples, Ryan K.

AU - Wohns, Anthony Wilder

AU - Wong, Yan

AU - Baumdicker, Franz

AU - Cartwright, Reed A.

AU - Gorjanc, Gregor

AU - Gutenkunst, Ryan N.

AU - Kelleher, Jerome

AU - Kern, Andrew D.

AU - Ragsdale, Aaron P.

AU - Ralph, Peter L.

AU - Schrider, Daniel R.

AU - Gronau, Ilan

N1 - Publisher Copyright: © 2023, Lauterbur et al.

PY - 2023

Y1 - 2023

AB - Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.

KW - genetics

KW - genomics

KW - open source

KW - population genetics

KW - simulations

U2 - 10.7554/eLife.84874

DO - 10.7554/eLife.84874

M3 - Journal article

C2 - 37342968

AN - SCOPUS:85163100933

VL - 12

JO - eLife

JF - eLife

SN - 2050-084X

M1 - RP84874

ER -
