Improved inference of intermolecular contacts through protein–protein interaction prediction using coevolutionary analysis

M. Correa Marrero, G.H. Immink, D. de Ridder, A.D.J. van Dijk*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

1 Citation (Scopus)

Abstract

Motivation: Predicting residue–residue contacts between interacting proteins is an important problem in bioinformatics. The growing wealth of sequence data can be used to infer these contacts through correlated mutation analysis on multiple sequence alignments of interacting homologs of the proteins of interest. This requires correct identification of pairs of interacting proteins for many species, in order to avoid introducing noise (i.e. non-interacting sequences) in the analysis that will decrease predictive performance.
Results: We have designed Ouroboros, a novel algorithm to reduce such noise in intermolecular contact prediction. Our method iterates between weighting proteins according to how likely they are to interact based on the correlated mutations signal, and predicting correlated mutations based on the weighted sequence alignment. We show that this approach accurately discriminates between protein interaction versus non-interaction and simultaneously improves the prediction of intermolecular contact residues compared to a naive application of correlated mutation analysis. This requires no training labels concerning interactions or contacts. Furthermore, the method relaxes the assumption of one-to-one interaction of previous approaches, allowing for the study of many-to-many interactions.
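The alternation described in the Results can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the Ouroboros implementation: it swaps in a simple pointwise mutual information score for the correlated-mutation model used in the paper, the function names (`weighted_freqs`, `iterate_weights`, `make_toy`) and parameters (`top_k`, `pseudo`) are hypothetical, and the data are synthetic integer-encoded alignments in which row `n` of `A` is paired with row `n` of `B`.

```python
import numpy as np

def weighted_freqs(A, B, w, q, pseudo=1.0):
    """Weighted single-column and pairwise residue frequencies (with pseudocount)."""
    oa = np.eye(q)[A]                      # one-hot, shape (n, La, q)
    ob = np.eye(q)[B]                      # one-hot, shape (n, Lb, q)
    W = w.sum()
    fa = (np.einsum("n,nia->ia", w, oa) + pseudo / q) / (W + pseudo)
    fb = (np.einsum("n,njb->jb", w, ob) + pseudo / q) / (W + pseudo)
    fab = (np.einsum("n,nia,njb->ijab", w, oa, ob) + pseudo / q**2) / (W + pseudo)
    return fa, fb, fab

def iterate_weights(A, B, q, n_iter=10, top_k=2):
    """Alternate between (1) estimating couplings from the weighted alignment
    and (2) re-weighting each sequence pair by how well it fits those couplings."""
    n = A.shape[0]
    w = np.ones(n)                         # start by trusting every pair equally
    for _ in range(n_iter):
        fa, fb, fab = weighted_freqs(A, B, w, q)
        # Pointwise log-ratio log f_ij(a,b) / (f_i(a) f_j(b)), shape (La, Lb, q, q)
        pmi = (np.log(fab)
               - np.log(fa[:, None, :, None])
               - np.log(fb[None, :, None, :]))
        mi = (fab * pmi).sum(axis=(2, 3))  # mutual information per column pair
        # Keep only the top_k most strongly coupled inter-protein column pairs
        flat = np.argsort(mi, axis=None)[-top_k:]
        ii, jj = np.unravel_index(flat, mi.shape)
        # Score each sequence pair by the log-ratios of its own residues
        score = pmi[ii[None, :], jj[None, :], A[:, ii], B[:, jj]].sum(axis=1)
        w = 1.0 / (1.0 + np.exp(-score))   # squash to a soft interaction weight
    return w, mi

def make_toy(seed=0, n_true=120, n_noise=80, L=6, q=4):
    """Synthetic data: the first n_true pairs share two coupled column pairs,
    the remaining n_noise pairs are unrelated (non-interacting) sequences."""
    rng = np.random.default_rng(seed)
    n = n_true + n_noise
    A = rng.integers(0, q, size=(n, L))
    B = rng.integers(0, q, size=(n, L))
    B[:n_true, 0] = A[:n_true, 0]              # column A0 coupled with B0
    B[:n_true, 2] = (A[:n_true, 1] + 1) % q    # column A1 coupled with B2
    return A, B

A, B = make_toy()
w, mi = iterate_weights(A, B, q=4)
print("mean weight, interacting pairs:    ", w[:120].mean())
print("mean weight, non-interacting pairs:", w[120:].mean())
```

In this sketch no interaction or contact labels are used: the weights come only from the coupling signal, and the coupling estimates come only from the weighted alignment, which mirrors the alternation stated in the abstract.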
Original language: English
Pages (from-to): 2036-2042
Number of pages: 7
Journal: Bioinformatics
Volume: 35
Issue number: 12
Early online date: 6 Nov 2018
DOIs: 10.1093/bioinformatics/bty924
Publication status: Published - Jun 2019

Fingerprint

Protein-protein Interaction
Contact
Proteins
Protein
Mutation
Prediction
Sequence Alignment
Interaction
Many to many
Multiple Sequence Alignment
Bioinformatics
Computational Biology
Iterate
Weighting
Sequence Analysis
Labels
Likely
Decrease

Cite this

@article{540f40d57b0a4d979fb6d41bae7b405c,
title = "Improved inference of intermolecular contacts through protein–protein interaction prediction using coevolutionary analysis",
abstract = "Motivation: Predicting residue–residue contacts between interacting proteins is an important problem in bioinformatics. The growing wealth of sequence data can be used to infer these contacts through correlated mutation analysis on multiple sequence alignments of interacting homologs of the proteins of interest. This requires correct identification of pairs of interacting proteins for many species, in order to avoid introducing noise (i.e. non-interacting sequences) in the analysis that will decrease predictive performance. Results: We have designed Ouroboros, a novel algorithm to reduce such noise in intermolecular contact prediction. Our method iterates between weighting proteins according to how likely they are to interact based on the correlated mutations signal, and predicting correlated mutations based on the weighted sequence alignment. We show that this approach accurately discriminates between protein interaction versus non-interaction and simultaneously improves the prediction of intermolecular contact residues compared to a naive application of correlated mutation analysis. This requires no training labels concerning interactions or contacts. Furthermore, the method relaxes the assumption of one-to-one interaction of previous approaches, allowing for the study of many-to-many interactions.",
author = "{Correa Marrero}, M. and Immink, G.H. and {de Ridder}, D. and {van Dijk}, A.D.J.",
year = "2019",
month = "6",
doi = "10.1093/bioinformatics/bty924",
language = "English",
volume = "35",
pages = "2036--2042",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "12",

}

TY - JOUR

T1 - Improved inference of intermolecular contacts through protein–protein interaction prediction using coevolutionary analysis

AU - Correa Marrero, M.

AU - Immink, G.H.

AU - de Ridder, D.

AU - van Dijk, A.D.J.

PY - 2019/6

Y1 - 2019/6

N2 - Motivation: Predicting residue–residue contacts between interacting proteins is an important problem in bioinformatics. The growing wealth of sequence data can be used to infer these contacts through correlated mutation analysis on multiple sequence alignments of interacting homologs of the proteins of interest. This requires correct identification of pairs of interacting proteins for many species, in order to avoid introducing noise (i.e. non-interacting sequences) in the analysis that will decrease predictive performance. Results: We have designed Ouroboros, a novel algorithm to reduce such noise in intermolecular contact prediction. Our method iterates between weighting proteins according to how likely they are to interact based on the correlated mutations signal, and predicting correlated mutations based on the weighted sequence alignment. We show that this approach accurately discriminates between protein interaction versus non-interaction and simultaneously improves the prediction of intermolecular contact residues compared to a naive application of correlated mutation analysis. This requires no training labels concerning interactions or contacts. Furthermore, the method relaxes the assumption of one-to-one interaction of previous approaches, allowing for the study of many-to-many interactions.

AB - Motivation: Predicting residue–residue contacts between interacting proteins is an important problem in bioinformatics. The growing wealth of sequence data can be used to infer these contacts through correlated mutation analysis on multiple sequence alignments of interacting homologs of the proteins of interest. This requires correct identification of pairs of interacting proteins for many species, in order to avoid introducing noise (i.e. non-interacting sequences) in the analysis that will decrease predictive performance. Results: We have designed Ouroboros, a novel algorithm to reduce such noise in intermolecular contact prediction. Our method iterates between weighting proteins according to how likely they are to interact based on the correlated mutations signal, and predicting correlated mutations based on the weighted sequence alignment. We show that this approach accurately discriminates between protein interaction versus non-interaction and simultaneously improves the prediction of intermolecular contact residues compared to a naive application of correlated mutation analysis. This requires no training labels concerning interactions or contacts. Furthermore, the method relaxes the assumption of one-to-one interaction of previous approaches, allowing for the study of many-to-many interactions.

U2 - 10.1093/bioinformatics/bty924

DO - 10.1093/bioinformatics/bty924

M3 - Article

VL - 35

SP - 2036

EP - 2042

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 12

ER -