DNA sequence and shape are predictive for meiotic crossovers throughout the plant kingdom

Research output: Contribution to journal › Article › Academic › peer-review

1 Citation (Scopus)

Abstract

A better understanding of genomic features influencing the location of meiotic crossovers (COs) in plant species is both of fundamental importance and of practical relevance for plant breeding. Using CO positions with sufficiently high resolution from four plant species [Arabidopsis thaliana, Solanum lycopersicum (tomato), Zea mays (maize) and Oryza sativa (rice)], we have trained machine-learning models to predict the susceptibility to CO formation. Our results show that CO occurrence within various plant genomes can be predicted by DNA sequence and shape features. Several features related to genome content and to genomic accessibility were consistently either positively or negatively related to COs in all four species. Other features were predictive only in specific species. Gene annotation-related features were especially predictive for maize, whereas in tomato and Arabidopsis propeller twist and helical twist (DNA shape features) and AT/TA dinucleotides were found to be the most important. In rice, high roll (another DNA shape feature) and low CA dinucleotide frequency in particular were found to be associated with CO occurrence. The accuracy of our models was sufficient for Arabidopsis and rice (area under receiver operating characteristic curve, AUROC > 0.5), and was high for tomato and maize (AUROC ≫ 0.5), demonstrating that DNA sequence and shape are predictive for meiotic COs throughout the plant kingdom.
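As a rough illustration of the kind of workflow the abstract describes, the sketch below computes one family of sequence features (dinucleotide frequencies) for a genomic window and scores a predictor with AUROC. It is not the authors' pipeline: the function names, the toy windows, their labels, and the use of AT + TA frequency as a stand-alone score are all assumptions made for this example, and the abstract does not state the direction of that feature's association with COs.

```python
from itertools import product


def dinucleotide_frequencies(seq):
    """Return the relative frequency of each of the 16 dinucleotides in seq."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product("ACGT", repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing N or other symbols
            counts[pair] += 1
            total += 1
    return {k: (v / total if total else 0.0) for k, v in counts.items()}


def auroc(labels, scores):
    """AUROC as the probability that a random positive example scores
    higher than a random negative one (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy windows with made-up labels (1 = CO, 0 = no CO);
# these are NOT data from the study.
windows = [
    ("ATATATATGC", 1),
    ("TATATACGAT", 1),
    ("GCGCGGCCGC", 0),
    ("CCGGATGCGC", 0),
]

labels = [label for _, label in windows]
# Use a single hand-picked feature as the score, purely for illustration.
scores = []
for seq, _ in windows:
    freqs = dinucleotide_frequencies(seq)
    scores.append(freqs["AT"] + freqs["TA"])

print("AUROC on toy data:", auroc(labels, scores))
```

An AUROC of 0.5 corresponds to random ranking of CO and non-CO regions, which is why the abstract reports model accuracy relative to that baseline. A real analysis would combine many features (including DNA shape and annotation features) in a trained model and evaluate on held-out data.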

Language: English
Pages: 686-699
Number of pages: 14
Journal: Plant Journal
Volume: 95
Issue number: 4
DOI: 10.1111/tpj.13979
Publication status: Published - 1 Aug 2018

Fingerprint

Lycopersicon esculentum
Zea mays
Arabidopsis
nucleotide sequences
tomatoes
rice
corn
Plant Genome
Molecular Sequence Annotation
genomics
genome
artificial intelligence
DNA
Solanum lycopersicum
plant breeding
ROC Curve
Oryza sativa
Arabidopsis thaliana
Genome
Oryza

Keywords

  • Arabidopsis thaliana
  • crossover
  • DNA shape
  • genome accessibility
  • machine learning
  • maize
  • meiotic recombination
  • prediction
  • rice
  • tomato

Cite this

@article{92f2fce8a8bd4e6586910b2a5597952f,
title = "DNA sequence and shape are predictive for meiotic crossovers throughout the plant kingdom",
abstract = "A better understanding of genomic features influencing the location of meiotic crossovers (COs) in plant species is both of fundamental importance and of practical relevance for plant breeding. Using CO positions with sufficiently high resolution from four plant species [Arabidopsis thaliana, Solanum lycopersicum (tomato), Zea mays (maize) and Oryza sativa (rice)] we have trained machine-learning models to predict the susceptibility to CO formation. Our results show that CO occurrence within various plant genomes can be predicted by DNA sequence and shape features. Several features related to genome content and to genomic accessibility were consistently either positively or negatively related to COs in all four species. Other features were found as predictive only in specific species. Gene annotation-related features were especially predictive for maize, whereas in tomato and Arabidopsis propeller twist and helical twist (DNA shape features) and AT/TA dinucleotides were found to be the most important. In rice, high roll (another DNA shape feature) and low CA dinucleotide frequency in particular were found to be associated with CO occurrence. The accuracy of our models was sufficient for Arabidopsis and rice (area under receiver operating characteristic curve, AUROC > 0.5), and was high for tomato and maize (AUROC ≫ 0.5), demonstrating that DNA sequence and shape are predictive for meiotic COs throughout the plant kingdom.",
keywords = "Arabidopsis thaliana, crossover, DNA shape, genome accessibility, machine learning, maize, meiotic recombination, prediction, rice, tomato",
author = "Demirci, Sevgin and Peters, {Sander A.} and {de Ridder}, Dick and {van Dijk}, {Aalt D.J.}",
year = "2018",
month = "8",
day = "1",
doi = "10.1111/tpj.13979",
language = "English",
volume = "95",
pages = "686--699",
journal = "The Plant Journal",
issn = "0960-7412",
publisher = "Wiley",
number = "4",

}

DNA sequence and shape are predictive for meiotic crossovers throughout the plant kingdom. / Demirci, Sevgin; Peters, Sander A.; de Ridder, Dick; van Dijk, Aalt D.J.

In: Plant Journal, Vol. 95, No. 4, 01.08.2018, p. 686-699.

TY - JOUR

T1 - DNA sequence and shape are predictive for meiotic crossovers throughout the plant kingdom

AU - Demirci, Sevgin

AU - Peters, Sander A.

AU - de Ridder, Dick

AU - van Dijk, Aalt D.J.

PY - 2018/8/1

Y1 - 2018/8/1

N2 - A better understanding of genomic features influencing the location of meiotic crossovers (COs) in plant species is both of fundamental importance and of practical relevance for plant breeding. Using CO positions with sufficiently high resolution from four plant species [Arabidopsis thaliana, Solanum lycopersicum (tomato), Zea mays (maize) and Oryza sativa (rice)] we have trained machine-learning models to predict the susceptibility to CO formation. Our results show that CO occurrence within various plant genomes can be predicted by DNA sequence and shape features. Several features related to genome content and to genomic accessibility were consistently either positively or negatively related to COs in all four species. Other features were found as predictive only in specific species. Gene annotation-related features were especially predictive for maize, whereas in tomato and Arabidopsis propeller twist and helical twist (DNA shape features) and AT/TA dinucleotides were found to be the most important. In rice, high roll (another DNA shape feature) and low CA dinucleotide frequency in particular were found to be associated with CO occurrence. The accuracy of our models was sufficient for Arabidopsis and rice (area under receiver operating characteristic curve, AUROC > 0.5), and was high for tomato and maize (AUROC ≫ 0.5), demonstrating that DNA sequence and shape are predictive for meiotic COs throughout the plant kingdom.

KW - Arabidopsis thaliana

KW - crossover

KW - DNA shape

KW - genome accessibility

KW - machine learning

KW - maize

KW - meiotic recombination

KW - prediction

KW - rice

KW - tomato

U2 - 10.1111/tpj.13979

DO - 10.1111/tpj.13979

M3 - Article

VL - 95

SP - 686

EP - 699

JO - The Plant Journal

T2 - The Plant Journal

JF - The Plant Journal

SN - 0960-7412

IS - 4

ER -