Genome-wide computational function prediction of Arabidopsis thaliana proteins by integration of multiple data sources

Y.I.A. Kourmpetis, A.D.J. van Dijk, R.C.H.J. van Ham, C.J.F. ter Braak

Research output: Contribution to journal › Article › Academic › peer-review

30 Citations (Scopus)

Abstract

Although Arabidopsis thaliana is the best-studied plant species, the biological role of one-third of its proteins is still unknown. We developed a probabilistic protein function prediction method that integrates information from sequences, protein-protein interactions and gene expression. The method was applied to proteins from Arabidopsis thaliana. Evaluation of prediction performance showed that our method outperforms single-source prediction approaches and two existing integration approaches. An innovative feature of our method is that it enables transfer of functional information between proteins that are not directly associated with each other. We provide novel function predictions for 5,807 proteins. Recent experimental studies confirmed several of the predictions. We highlight these in detail for proteins predicted to be involved in flowering and floral organ development.
Original language: English
Pages (from-to): 271-281
Journal: Plant Physiology
Volume: 155
Publication status: Published - 2011

Keywords

  • generalized linear-models
  • transcription factor
  • flowering time
  • cell-death
  • thaliana
  • gene
  • algorithm
  • networks
  • biology
  • family
