Background: Characterizing variation in genome structure is essential for understanding phenotypic differences within populations and the evolutionary history of species. An important form of this structural variation is copy number variation (CNV). CNVs can be generated by different recombination mechanisms, such as non-allelic homologous recombination, that rely on specific characteristics of the genome architecture. As a result, these structural variants can be more abundant at particular genes, ultimately leading to variation in phenotypes under selection. Detailed characterization of CNVs can therefore reveal evolutionary footprints of selection and provide insight into their contribution to phenotypic variation in wild populations.
Results: We used genotypic data from a long-term study population of great tits (Parus major), a passerine bird widely studied in ecology and evolution, to detect CNVs and identify the genomic features prevailing within these regions. Using allele intensities and frequencies from high-density SNP array data for 2,175 birds, we detected 41,029 CNVs, which were concatenated into 8,008 distinct CNV regions (CNVRs). qPCR successfully validated 93.75% of the CNVs tested, which were sampled across different frequencies and sizes. A mother-daughter family structure allowed us to evaluate the inheritance of a number of these CNVs. Only CNVs spanning 40 or more probes segregated in accordance with Mendelian inheritance, suggesting a high rate of false negative calls for smaller CNVs. Because CNVRs provide only a coarse-grained map of CNV loci, we also inferred the frequency of coincident CNV start and end breakpoints. We observed frequency-dependent enrichment of these breakpoints at homologous regions, CpG sites and AT-rich intervals. Gene ontology enrichment analysis showed that CNVs are enriched in genes underpinning neural, cardiac and ion transport pathways.
Conclusion: Great tit CNVs are present in almost half of the genes and are prominent at repetitive-homologous and regulatory regions. Although CNVs overlap genes under selection, the high number of false negatives makes neutrality or association tests on the CNVs detected here difficult. CNVs should therefore be studied further in light of their false negative rate and architecture to improve our understanding of their association with phenotypes and evolutionary history.
- Genetic variation
- Parus major