9 May 2022

New Research Recovers High-Quality Host Genomes from Chicken Gut Samples

DATA RECOVERY

A newly published study describes a method for extracting reliable insights into host population genetics from intestinal and faecal samples using two-step imputation.


Metagenomic datasets, primarily used to analyse the genomic architecture of microbial communities, have become even more valuable: researchers have found a way to also extract precise genomic information about the host animal, which enables a hologenomic approach in which host and microbial genomic data are analysed from the same dataset. This optimisation of the datasets could therefore be used to improve animal production. The results of the study have just been published in Advanced Genetics.

Using a so-called two-step imputation strategy, PhD student Sofia Marcos from the University of the Basque Country, who is also part of the Alberdi Group, Associate Professor Antton Alberdi from the Center for Evolutionary Hologenomics, and colleagues have, in a sense, created something valuable from nothing: usable host DNA in metagenomic datasets. Until now, host DNA has typically been discarded from metagenomic datasets of host-associated microbial communities, because the amount of host DNA sequence generated is often insufficient for accurate host-genetic analysis.

Two-step dataset optimisation

“Genotype imputation refers to the analysis through which missing genetic information of organisms is predicted using a panel of well-known genome sequences of related individuals. This is possible because DNA sequences that are close together on a chromosome tend to be inherited together,” explains Associate Professor Antton Alberdi.

“If gene A and B appear systematically linked in the reference genomes, but it has only been possible to detect gene A in a given sample due to insufficient genome sequencing, it is possible to predict that gene B will also be present, therefore reconstructing a more complete map of the genome of that individual. The two-step imputation approach enables this principle to be applied to very small amounts of genetic data, to reconstruct millions of missing bits of genetic information with a high level of accuracy.”
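The principle described in the quote can be illustrated with a toy sketch in Python. This is not the study's two-step pipeline: the reference panel, the two-site haplotypes and the function names (build_linkage_table, impute_site) are hypothetical and chosen only for illustration. Real imputation software relies on probabilistic models across many sites rather than a simple co-occurrence count.

```python
from collections import Counter, defaultdict

def build_linkage_table(reference_haplotypes, known_site, missing_site):
    """Count how often each allele at missing_site co-occurs with each allele at known_site."""
    table = defaultdict(Counter)
    for haplotype in reference_haplotypes:
        table[haplotype[known_site]][haplotype[missing_site]] += 1
    return table

def impute_site(sample, reference_haplotypes, known_site, missing_site):
    """Predict the missing allele as the one most often linked to the observed allele.

    Returns (allele, support), where support is the fraction of reference
    haplotypes carrying the observed allele that also carry the predicted one.
    Returns (None, 0.0) if the observed allele never appears in the panel.
    """
    table = build_linkage_table(reference_haplotypes, known_site, missing_site)
    counts = table.get(sample[known_site])
    if not counts:
        return None, 0.0
    allele, n = counts.most_common(1)[0]
    return allele, n / sum(counts.values())

# Hypothetical reference panel: each string is one haplotype covering two sites.
reference_panel = ["AG", "AG", "AG", "AT", "CT", "CT"]

# Hypothetical low-coverage sample: site 0 was sequenced, site 1 is missing ("?").
sample = "A?"

predicted_allele, support = impute_site(sample, reference_panel, known_site=0, missing_site=1)
print(predicted_allele, support)
```

In this toy panel, allele A at the first site is usually followed by G at the second, so G is the predicted allele for the sample, with the support value showing how consistent the panel is on that link.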

The strategy achieved high imputation accuracy (>0.90) in its initial evaluation of 12 samples of both low- and high-depth sequencing data. The impact of reference panel choice on population genetics statistics was then assessed, and all four panels yielded comparable results.

This opens up broad and exciting possibilities, from reanalysing already published data to reducing the resources spent on population genetic studies, thanks to the possibility of obtaining reliable information on host and microbial communities from the same data source. In practice, these reconstructed genotypes will be used in the H2020 project HoloFood to detect interactions with microbial metagenomic features, thereby implementing a hologenomic approach to improve animal production.

Contact: Associate Professor Antton Alberdi.