CNVs are associated with genomic architecture in a songbird

Vinicius H. da Silva*, Veronika N. Laine, Mirte Bosse, Kees van Oers, Bert Dibbits, Marcel E. Visser, Richard P.M.A. Crooijmans, Martien A.M. Groenen

*Corresponding author for this work

Research output: Contribution to journalArticleAcademicpeer-review

3 Citations (Scopus)

Abstract

Background: Understanding variation in genome structure is essential to understand phenotypic differences within populations and the evolutionary history of species. A promising form of this structural variation is copy number variation (CNV). CNVs can be generated by different recombination mechanisms, such as non-allelic homologous recombination, that rely on specific characteristics of the genome architecture. These structural variants can therefore be more abundant at particular genes ultimately leading to variation in phenotypes under selection. Detailed characterization of CNVs therefore can reveal evolutionary footprints of selection and provide insight in their contribution to phenotypic variation in wild populations. Results: Here we use genotypic data from a long-term population of great tits (Parus major), a widely studied passerine bird in ecology and evolution, to detect CNVs and identify genomic features prevailing within these regions. We used allele intensities and frequencies from high-density SNP array data from 2,175 birds. We detected 41,029 CNVs concatenated into 8,008 distinct CNV regions (CNVRs). We successfully validated 93.75% of the CNVs tested by qPCR, which were sampled at different frequencies and sizes. A mother-daughter family structure allowed for the evaluation of the inheritance of a number of these CNVs. Thereby, only CNVs with 40 probes or more display segregation in accordance with Mendelian inheritance, suggesting a high rate of false negative calls for smaller CNVs. As CNVRs are a coarse-grained map of CNV loci, we also inferred the frequency of coincident CNV start and end breakpoints. We observed frequency-dependent enrichment of these breakpoints at homologous regions, CpG sites and AT-rich intervals. A gene ontology enrichment analyses showed that CNVs are enriched in genes underpinning neural, cardiac and ion transport pathways. Conclusion: Great tit CNVs are present in almost half of the genes and prominent at repetitive-homologous and regulatory regions. Although overlapping genes under selection, the high number of false negatives make neutrality or association tests on CNVs detected here difficult. Therefore, CNVs should be further addressed in the light of their false negative rate and architecture to improve the comprehension of their association with phenotypes and evolutionary history.

Original languageEnglish
Article number195
JournalBMC Genomics
Volume19
Issue numberSupplement 2
DOIs
Publication statusPublished - 13 Mar 2018

Fingerprint

Songbirds
Birds
History
Overlapping Genes
Genome
Population
Genes
Phenotype
Gene Ontology
Nucleic Acid Repetitive Sequences
Nucleic Acid Regulatory Sequences
Homologous Recombination
Ion Transport
Ecology
Gene Frequency
Genetic Recombination
Single Nucleotide Polymorphism

Keywords

  • Duplication
  • Genetic variation
  • Inheritance
  • Parus major
  • Recombination

Cite this

@article{1e24b62966c941069d977af39b21015f,
title = "CNVs are associated with genomic architecture in a songbird",
abstract = "Background: Understanding variation in genome structure is essential to understand phenotypic differences within populations and the evolutionary history of species. A promising form of this structural variation is copy number variation (CNV). CNVs can be generated by different recombination mechanisms, such as non-allelic homologous recombination, that rely on specific characteristics of the genome architecture. These structural variants can therefore be more abundant at particular genes ultimately leading to variation in phenotypes under selection. Detailed characterization of CNVs therefore can reveal evolutionary footprints of selection and provide insight in their contribution to phenotypic variation in wild populations. Results: Here we use genotypic data from a long-term population of great tits (Parus major), a widely studied passerine bird in ecology and evolution, to detect CNVs and identify genomic features prevailing within these regions. We used allele intensities and frequencies from high-density SNP array data from 2,175 birds. We detected 41,029 CNVs concatenated into 8,008 distinct CNV regions (CNVRs). We successfully validated 93.75{\%} of the CNVs tested by qPCR, which were sampled at different frequencies and sizes. A mother-daughter family structure allowed for the evaluation of the inheritance of a number of these CNVs. Thereby, only CNVs with 40 probes or more display segregation in accordance with Mendelian inheritance, suggesting a high rate of false negative calls for smaller CNVs. As CNVRs are a coarse-grained map of CNV loci, we also inferred the frequency of coincident CNV start and end breakpoints. We observed frequency-dependent enrichment of these breakpoints at homologous regions, CpG sites and AT-rich intervals. A gene ontology enrichment analyses showed that CNVs are enriched in genes underpinning neural, cardiac and ion transport pathways. Conclusion: Great tit CNVs are present in almost half of the genes and prominent at repetitive-homologous and regulatory regions. Although overlapping genes under selection, the high number of false negatives make neutrality or association tests on CNVs detected here difficult. Therefore, CNVs should be further addressed in the light of their false negative rate and architecture to improve the comprehension of their association with phenotypes and evolutionary history.",
keywords = "Duplication, Genetic variation, Inheritance, Parus major, Recombination",
author = "{da Silva}, {Vinicius H.} and Laine, {Veronika N.} and Mirte Bosse and {van Oers}, Kees and Bert Dibbits and Visser, {Marcel E.} and Crooijmans, {Richard P.M.A.} and Groenen, {Martien A.M.}",
year = "2018",
month = "3",
day = "13",
doi = "10.1186/s12864-018-4577-1",
language = "English",
volume = "19",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "Springer Verlag",
number = "Supplement 2",

}

CNVs are associated with genomic architecture in a songbird. / da Silva, Vinicius H.; Laine, Veronika N.; Bosse, Mirte; van Oers, Kees; Dibbits, Bert; Visser, Marcel E.; Crooijmans, Richard P.M.A.; Groenen, Martien A.M.

In: BMC Genomics, Vol. 19, No. Supplement 2, 195, 13.03.2018.

Research output: Contribution to journalArticleAcademicpeer-review

TY - JOUR

T1 - CNVs are associated with genomic architecture in a songbird

AU - da Silva, Vinicius H.

AU - Laine, Veronika N.

AU - Bosse, Mirte

AU - van Oers, Kees

AU - Dibbits, Bert

AU - Visser, Marcel E.

AU - Crooijmans, Richard P.M.A.

AU - Groenen, Martien A.M.

PY - 2018/3/13

Y1 - 2018/3/13

N2 - Background: Understanding variation in genome structure is essential to understand phenotypic differences within populations and the evolutionary history of species. A promising form of this structural variation is copy number variation (CNV). CNVs can be generated by different recombination mechanisms, such as non-allelic homologous recombination, that rely on specific characteristics of the genome architecture. These structural variants can therefore be more abundant at particular genes ultimately leading to variation in phenotypes under selection. Detailed characterization of CNVs therefore can reveal evolutionary footprints of selection and provide insight in their contribution to phenotypic variation in wild populations. Results: Here we use genotypic data from a long-term population of great tits (Parus major), a widely studied passerine bird in ecology and evolution, to detect CNVs and identify genomic features prevailing within these regions. We used allele intensities and frequencies from high-density SNP array data from 2,175 birds. We detected 41,029 CNVs concatenated into 8,008 distinct CNV regions (CNVRs). We successfully validated 93.75% of the CNVs tested by qPCR, which were sampled at different frequencies and sizes. A mother-daughter family structure allowed for the evaluation of the inheritance of a number of these CNVs. Thereby, only CNVs with 40 probes or more display segregation in accordance with Mendelian inheritance, suggesting a high rate of false negative calls for smaller CNVs. As CNVRs are a coarse-grained map of CNV loci, we also inferred the frequency of coincident CNV start and end breakpoints. We observed frequency-dependent enrichment of these breakpoints at homologous regions, CpG sites and AT-rich intervals. A gene ontology enrichment analyses showed that CNVs are enriched in genes underpinning neural, cardiac and ion transport pathways. Conclusion: Great tit CNVs are present in almost half of the genes and prominent at repetitive-homologous and regulatory regions. Although overlapping genes under selection, the high number of false negatives make neutrality or association tests on CNVs detected here difficult. Therefore, CNVs should be further addressed in the light of their false negative rate and architecture to improve the comprehension of their association with phenotypes and evolutionary history.

AB - Background: Understanding variation in genome structure is essential to understand phenotypic differences within populations and the evolutionary history of species. A promising form of this structural variation is copy number variation (CNV). CNVs can be generated by different recombination mechanisms, such as non-allelic homologous recombination, that rely on specific characteristics of the genome architecture. These structural variants can therefore be more abundant at particular genes ultimately leading to variation in phenotypes under selection. Detailed characterization of CNVs therefore can reveal evolutionary footprints of selection and provide insight in their contribution to phenotypic variation in wild populations. Results: Here we use genotypic data from a long-term population of great tits (Parus major), a widely studied passerine bird in ecology and evolution, to detect CNVs and identify genomic features prevailing within these regions. We used allele intensities and frequencies from high-density SNP array data from 2,175 birds. We detected 41,029 CNVs concatenated into 8,008 distinct CNV regions (CNVRs). We successfully validated 93.75% of the CNVs tested by qPCR, which were sampled at different frequencies and sizes. A mother-daughter family structure allowed for the evaluation of the inheritance of a number of these CNVs. Thereby, only CNVs with 40 probes or more display segregation in accordance with Mendelian inheritance, suggesting a high rate of false negative calls for smaller CNVs. As CNVRs are a coarse-grained map of CNV loci, we also inferred the frequency of coincident CNV start and end breakpoints. We observed frequency-dependent enrichment of these breakpoints at homologous regions, CpG sites and AT-rich intervals. A gene ontology enrichment analyses showed that CNVs are enriched in genes underpinning neural, cardiac and ion transport pathways. Conclusion: Great tit CNVs are present in almost half of the genes and prominent at repetitive-homologous and regulatory regions. Although overlapping genes under selection, the high number of false negatives make neutrality or association tests on CNVs detected here difficult. Therefore, CNVs should be further addressed in the light of their false negative rate and architecture to improve the comprehension of their association with phenotypes and evolutionary history.

KW - Duplication

KW - Genetic variation

KW - Inheritance

KW - Parus major

KW - Recombination

U2 - 10.1186/s12864-018-4577-1

DO - 10.1186/s12864-018-4577-1

M3 - Article

VL - 19

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

IS - Supplement 2

M1 - 195

ER -