A generic pipeline for CADD Score generation: chickenCADD and turkeyCADD

Research output: Working paper / Preprint

Abstract

Combined Annotation Dependent Depletion (CADD) is a machine learning approach used to predict the deleteriousness of genetic variants across a genome. By integrating diverse genomic features, CADD assigns a PHRED-like rank score to each potential variant. Unlike other methods, CADD does not rely on limited datasets of known pathogenic or benign variants but uses larger and less biased training sets. The rapid increase in high-quality genomes and functional annotations across species highlights the need for an automated, non-species-specific pipeline to generate CADD scores. Here, we introduce such a pipeline, facilitating the generation of CADD scores for various species using only a high-quality genome with gene annotation and a multi-species alignment. Additionally, we present updated chickenCADD scores and newly generated turkeyCADD scores, both generated with the pipeline.
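As an illustration of what a PHRED-like rank score means, the sketch below converts raw model scores into scaled scores using the usual PHRED convention, -10 * log10(rank / total), where rank 1 is the variant with the highest raw score. The function name `phred_like_scores` and the tie handling are assumptions made for this example; this is not the pipeline's actual code.

```python
import math


def phred_like_scores(raw_scores):
    """Convert raw scores to PHRED-like rank scores.

    Assumption: a higher raw score means "more likely deleterious",
    so the highest raw score receives rank 1. Ties are broken by
    input order, which a real pipeline may handle differently.
    """
    total = len(raw_scores)
    # Indices of variants, sorted from highest to lowest raw score.
    order = sorted(range(total), key=lambda i: raw_scores[i], reverse=True)
    scaled = [0.0] * total
    for rank, idx in enumerate(order, start=1):
        # Top 10% of variants score >= 10, top 1% score >= 20, and so on.
        scaled[idx] = -10.0 * math.log10(rank / total)
    return scaled


if __name__ == "__main__":
    raw = [0.3, 2.1, -0.8, 1.4, 0.0]
    for r, s in zip(raw, phred_like_scores(raw)):
        print(f"raw={r:+.1f}  phred_like={s:.2f}")
```

Under this scaling, a score of 10 places a variant in the top 10% of all ranked variants and a score of 20 in the top 1%, which makes scores comparable across model versions and species.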
Original language: English
Publisher: BioRxiv
Publication status: Published - 3 Nov 2024
