pLMMoRF: A web server that accurately predicts membrane-interacting molecular recognition features by employing a protein language model

dc.contributor.author: Basu, Sushmita
dc.date.accessioned: 2026-01-09T11:34:26Z
dc.date.available: 2026-01-09T11:34:26Z
dc.date.issued: 2025-09
dc.description.abstract: Interactions between proteins and lipids are crucial for numerous cellular processes. Some of the lipid-interacting segments in protein sequences are intrinsically disordered regions (IDRs), which may gain secondary structure upon binding. We collected experimentally annotated lipid-interacting IDRs, named membrane molecular recognition features (MemMoRFs). We used this dataset to develop and test an accurate and relatively fast sequence-based MemMoRF predictor, pLMMoRF, thereby supporting the tedious and costly experimental identification of MemMoRFs. Our predictor utilizes a protein language model (pLM), whose outputs we processed to generate inputs to a deep convolutional neural network. We considered various pLMs (ESM-2, ProstT5, ProtT5 and Ankh) and applied feature selection to reduce their outputs, creating a more compact neural network model. pLMMoRF leverages the Ankh-based model, selected for its higher accuracy compared to our other models. Tests on low-similarity test datasets demonstrate that pLMMoRF is more accurate than the sole current predictor of MemMoRFs, CoMemMoRFPred. Moreover, pLMMoRF has a relatively small computational footprint because of the compact network size and the use of dedicated GPU nodes. This allowed us to make MemMoRF predictions for the human proteome. We analyzed these predictions and made them publicly available, facilitating an improved understanding of the functions of membrane-coupled proteins. Our work underscores the importance of selecting key embedding features to enhance predictive performance and reduce the computational footprint of sequence-based predictors of protein functions. The web server for the pLMMoRF predictor and the predictions for human proteins (en_US)
dc.identifier.uri: https://www.sciencedirect.com/science/article/pii/S002228362500302X
dc.identifier.uri: https://dspace.bits-pilani.ac.in/handle/123456789/20516
dc.language.iso: en (en_US)
dc.publisher: Elsevier (en_US)
dc.subject: Biology (en_US)
dc.subject: Membrane interacting molecular recognition feature (en_US)
dc.subject: Intrinsically disordered protein regions (en_US)
dc.subject: Machine learning (ML) (en_US)
dc.subject: Protein language model (en_US)
dc.title: pLMMoRF: A web server that accurately predicts membrane-interacting molecular recognition features by employing a protein language model (en_US)
dc.type: Article (en_US)
