Accurate and Effective Detection of Recurrent Copy Number Variants in Large SNP Genotype Datasets
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Accurate and Effective Detection of Recurrent Copy Number Variants in Large SNP Genotype Datasets. / Montalbano, Simone; Sánchez, Xabier Calle; Vaez, Morteza; Helenius, Dorte; Werge, Thomas; Ingason, Andrés.
In: Current Protocols, Vol. 2, No. 12, e621, 2022.
RIS
TY - JOUR
T1 - Accurate and Effective Detection of Recurrent Copy Number Variants in Large SNP Genotype Datasets
AU - Montalbano, Simone
AU - Sánchez, Xabier Calle
AU - Vaez, Morteza
AU - Helenius, Dorte
AU - Werge, Thomas
AU - Ingason, Andrés
N1 - Publisher Copyright: © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.
PY - 2022
Y1 - 2022
N2 - Structural variations, including recurrent Copy Number Variants (CNVs) at specific genomic loci, have been found to be associated with increased risk of several diseases and syndromes. CNV carrier status can be determined in large collections of samples using SNP arrays and, more recently, sequencing data. Although there is some consensus among researchers about the essential steps required in such analysis (i.e., CNV calling, filtering of putative carriers, and visual validation using intensity data plots of the genomic region), standard methodologies and processes to control the quality and consistency of the results are lacking. Here, we present a comprehensive and user-friendly protocol that we have refined from our extensive research experience in the field. We cover every aspect of the analysis, from input data curation to final results. For each step, we highlight which parameters affect the analysis the most and how different settings may lead to different results. We provide a pipeline to run the complete analysis with effective (but customizable) pre-sets. We present software that we developed to better handle and filter putative CNV carriers and perform visual inspection to validate selected candidates. Finally, we describe methods to evaluate the critical sections and actions to counterbalance potential problems. The current implementation is focused on Illumina SNP array data. All the presented software is freely available and provided in a ready-to-use docker container.
KW - bioinformatics pipeline
KW - CNVs
KW - SNPs
KW - structural variation
U2 - 10.1002/cpz1.621
DO - 10.1002/cpz1.621
M3 - Journal article
C2 - 36469582
AN - SCOPUS:85143552042
VL - 2
JO - Current Protocols
JF - Current Protocols
SN - 2691-1299
IS - 12
M1 - e621
ER -
ID: 340552503