TY - JOUR
T1 - PortPred: Exploiting deep learning embeddings of amino acid sequences for the identification of transporter proteins and their substrates
AU - Anteghini, Marco
AU - Martins dos Santos, Vitor
AU - Saccenti, Edoardo
N1 - European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 812968.
PY - 2023/11
Y1 - 2023/11
N2 - The physiology of every living cell is regulated at some level by transporter proteins, which constitute a relevant portion of membrane-bound proteins and are involved in the movement of ions, small molecules, and macromolecules across bio-membranes. The importance of transporter proteins is unquestionable. The prediction and study of previously unknown transporters can lead to the discovery of new biological pathways, drugs and treatments. Here we present PortPred, a tool to accurately identify transporter proteins and their substrates starting from the protein amino acid sequence. PortPred successfully combines pre-trained deep learning-based protein embeddings and machine learning classification approaches and outperforms other state-of-the-art methods. In addition, we present a comparison of the most promising protein sequence embeddings (UniRep, SeqVec, ProteinBERT, ESM-1b) and their performances for this specific task.
AB - The physiology of every living cell is regulated at some level by transporter proteins, which constitute a relevant portion of membrane-bound proteins and are involved in the movement of ions, small molecules, and macromolecules across bio-membranes. The importance of transporter proteins is unquestionable. The prediction and study of previously unknown transporters can lead to the discovery of new biological pathways, drugs and treatments. Here we present PortPred, a tool to accurately identify transporter proteins and their substrates starting from the protein amino acid sequence. PortPred successfully combines pre-trained deep learning-based protein embeddings and machine learning classification approaches and outperforms other state-of-the-art methods. In addition, we present a comparison of the most promising protein sequence embeddings (UniRep, SeqVec, ProteinBERT, ESM-1b) and their performances for this specific task.
KW - membrane proteins
KW - pre-trained embeddings
KW - protein sequence embeddings
KW - substrates prediction
KW - transporter proteins
U2 - 10.1002/jcb.30490
DO - 10.1002/jcb.30490
M3 - Article
AN - SCOPUS:85174629756
SN - 0730-2312
VL - 124
SP - 1803
EP - 1824
JO - Journal of Cellular Biochemistry
JF - Journal of Cellular Biochemistry
IS - 11
ER -