Combining multi-population datasets for joint genome-wide association and meta-analyses: The case of bovine milk fat composition traits

G. Gebreyesus*, A.J. Buitenhuis, N.A. Poulsen, M.H.P.W. Visker, Q. Zhang, H.J.F. van Valenberg, D. Sun, H. Bovenhuis

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review

Abstract

In genome-wide association studies (GWAS), sample size is the most important factor affecting statistical power that is under the control of the investigator, which poses a major challenge for understanding the genetics underlying difficult-to-measure traits. Combining data sets available from different populations for joint or meta-analysis is a promising way to increase the sample sizes available for GWAS. Simulation studies indicate statistical advantages from combining raw data or GWAS summaries in enhancing quantitative trait loci (QTL) detection power. However, the complexity of the genetics underlying most quantitative traits, which itself is not fully understood, is difficult to capture fully in simulated data sets. In this study, population-specific and combined-population GWAS, as well as a meta-analysis of the population-specific GWAS summaries, were carried out to assess the advantages and challenges of different data-combining strategies in enhancing the detection power of GWAS, using milk fatty acid (FA) traits as examples. Milk FA samples quantified by gas chromatography (GC) and high-density (HD) genotypes were available from 1,566 Dutch, 614 Danish, and 700 Chinese Holstein Friesian cows. Using the joint GWAS, 28 additional genomic regions with significant associations to at least 1 FA were detected compared with the population-specific analyses. Some of these additional regions were also detected using the implemented meta-analysis. Furthermore, using the frequently reported variants of the diacylglycerol acyltransferase 1 (DGAT1) and stearoyl-CoA desaturase (SCD1) genes, we show that significant associations were established with more FA traits in the joint GWAS than in the remaining scenarios. However, there were few regions detected in the population-specific analyses that were not detected using the joint GWAS or the meta-analyses. Our results show that combining multi-population data sets can be a powerful tool to enhance detection power in GWAS for seldom-recorded traits. Detection of a higher number of regions using the meta-analysis, compared with any of the population-specific analyses, also emphasizes the utility of these methods in the absence of raw multi-population data sets to undertake joint GWAS.
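
The abstract does not say which meta-analysis method was implemented. As an illustration only, the sketch below combines per-population GWAS summaries for a single SNP with a sample-size-weighted Z-score approach (Stouffer's method), which is one common choice and is assumed here rather than taken from the paper. The cohort sizes come from the abstract; the function name `weighted_z_meta` and the example Z-scores are hypothetical.

```python
import math
from statistics import NormalDist

# Cohort sizes reported in the abstract.
COHORT_SIZES = {"Dutch": 1566, "Danish": 614, "Chinese": 700}


def weighted_z_meta(z_scores, sizes):
    """Combine per-population Z-scores using sqrt(N) weights.

    This is an assumed, generic method (Stouffer's weighted Z), not
    necessarily the one used in the study.
    """
    weights = {pop: math.sqrt(sizes[pop]) for pop in z_scores}
    numerator = sum(weights[pop] * z_scores[pop] for pop in z_scores)
    denominator = math.sqrt(sum(w ** 2 for w in weights.values()))
    z_meta = numerator / denominator
    # Two-sided p-value under the standard normal null distribution.
    p_meta = 2.0 * (1.0 - NormalDist().cdf(abs(z_meta)))
    return z_meta, p_meta


# Hypothetical Z-scores for one SNP: made-up illustration values,
# not results from the study.
example_z = {"Dutch": 3.0, "Danish": 2.0, "Chinese": 2.5}
z, p = weighted_z_meta(example_z, COHORT_SIZES)
print(f"meta Z = {z:.3f}, two-sided p = {p:.2e}")
```

With consistent effect directions across populations, the combined Z-score exceeds each individual one, which is the intuition behind the gain in detection power that the abstract reports for the combined analyses. Unlike a joint GWAS on raw data, this kind of combination needs only the per-population summary statistics.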

Original language: English
Pages (from-to): 11124-11141
Journal: Journal of Dairy Science
Volume: 102
Issue number: 12
Early online date: 25 Sep 2019
DOI: 10.3168/jds.2019-16676
Publication status: Published - Dec 2019

Keywords

  • mega-analysis
  • meta-analysis
  • multi-population GWAS

Cite this

Gebreyesus, G. ; Buitenhuis, A.J. ; Poulsen, N.A. ; Visker, M.H.P.W. ; Zhang, Q. ; van Valenberg, H.J.F. ; Sun, D. ; Bovenhuis, H. / Combining multi-population datasets for joint genome-wide association and meta-analyses: The case of bovine milk fat composition traits. In: Journal of Dairy Science. 2019 ; Vol. 102, No. 12. pp. 11124-11141.
@article{813d8188b955459ea56a899342d12628,
title = "Combining multi-population datasets for joint genome-wide association and meta-analyses: The case of bovine milk fat composition traits",
keywords = "mega-analysis, meta-analysis, multi-population GWAS",
author = "G. Gebreyesus and A.J. Buitenhuis and N.A. Poulsen and M.H.P.W. Visker and Q. Zhang and {van Valenberg}, H.J.F. and D. Sun and H. Bovenhuis",
year = "2019",
month = "12",
doi = "10.3168/jds.2019-16676",
language = "English",
volume = "102",
pages = "11124--11141",
journal = "Journal of Dairy Science",
issn = "0022-0302",
publisher = "American Dairy Science Association",
number = "12",
}
