Department of Biological Sciences
Permanent URI for this collection: http://localhost:4000/handle/123456789/1922
5 results
Search Results
Item
A structural perspective of RNA recognition by intrinsically disordered proteins
(Springer, 2016-05) Basu, Sushmita
Protein-RNA recognition is essential for gene expression and its regulation, which is indispensable for the survival of the living organism at one hand, on the other hand, misregulation of this recognition may lead to their extinction. Polymorphic conformation of both the interacting partners is a characteristic feature of such molecular recognition that promotes the assembly. Many RNA binding proteins (RBP) or regions in them are found to be intrinsically disordered, and this property helps them to play a central role in the regulatory processes. Sequence composition and the length of the flexible linkers between RNA binding domains in RBPs are crucial in making significant contacts with its partner RNA. Polymorphic conformations of RBPs can provide thermodynamic advantage to its binding partner while acting as a chaperone. Prolonged extensions of the disordered regions in RBPs also contribute to the stability of the large cellular machines including ribosome and viral assemblies. The involvement of these disordered regions in most of the significant cellular processes makes RBPs highly associated with various human diseases that arise due to their misregulation.

Item
Computational prediction of disordered binding regions
(Elsevier, 2023) Basu, Sushmita
One of the key features of intrinsically disordered regions (IDRs) is their ability to interact with a broad range of partner molecules. Multiple types of interacting IDRs were identified including molecular recognition fragments (MoRFs), short linear sequence motifs (SLiMs), and protein-, nucleic acids- and lipid-binding regions. Prediction of binding IDRs in protein sequences is gaining momentum in recent years. We survey 38 predictors of binding IDRs that target interactions with a diverse set of partners, such as peptides, proteins, RNA, DNA and lipids.
We offer a historical perspective and highlight key events that fueled efforts to develop these methods. These tools rely on a diverse range of predictive architectures that include scoring functions, regular expressions, traditional and deep machine learning and meta-models. Recent efforts focus on the development of deep neural network-based architectures and extending coverage to RNA, DNA and lipid-binding IDRs. We analyze availability of these methods and show that providing implementations and webservers results in much higher rates of citations/use. We also make several recommendations to take advantage of modern deep network architectures, develop tools that bundle predictions of multiple and different types of binding IDRs, and work on algorithms that model structures of the resulting complexes.

Item
CoMemMoRFPred: sequence-based prediction of MemMoRFs by combining predictors of intrinsic disorder, MoRFs and disordered lipid-binding regions
(Elsevier, 2023-11) Basu, Sushmita
Molecular recognition features (MoRFs) are a commonly occurring type of intrinsically disordered regions (IDRs) that undergo disorder-to-order transition upon binding to partner molecules. We focus on recently characterized and functionally important membrane-binding MoRFs (MemMoRFs). Motivated by the lack of computational tools that predict MemMoRFs, we use a dataset of experimentally annotated MemMoRFs to conceptualize, design, evaluate and release an accurate sequence-based predictor. We rely on state-of-the-art tools that predict residues that possess key characteristics of MemMoRFs, such as intrinsic disorder, disorder-to-order transition and lipid-binding. We identify and combine results from three tools that include flDPnn for the disorder prediction, DisoLipPred for the prediction of disordered lipid-binding regions, and MoRFCHiBiLight for the prediction of disorder-to-order transitioning protein binding regions.
Our empirical analysis demonstrates that combining results produced by these three methods generates accurate predictions of MemMoRFs. We also show that use of a smoothing operator produces predictions that closely mimic the number and sizes of the native MemMoRF regions.

Item
Taxonomy-specific assessment of intrinsic disorder predictions at residue and region levels in higher eukaryotes, protists, archaea, bacteria and viruses
(Elsevier, 2024-12) Basu, Sushmita
Intrinsic disorder predictors were evaluated in several studies including the two large CAID experiments. However, these studies are biased towards eukaryotic proteins and focus primarily on the residue-level predictions. We provide first-of-its-kind assessment that comprehensively covers the taxonomy and evaluates predictions at the residue and disordered region levels. We curate a benchmark dataset that uniformly covers eukaryotic, archaeal, bacterial, and viral proteins. We find that predictive performance differs substantially across taxonomy, where viruses are predicted most accurately, followed by protists and higher eukaryotes, while bacterial and archaeal proteins suffer lower levels of accuracy. These trends are consistent across predictors. We also find that current tools, except for flDPnn, struggle with reproducing native distributions of the numbers and sizes of the disordered regions. Moreover, analysis of two variants of disorder predictions derived from the AlphaFold2 predicted structures reveals that they produce accurate residue-level propensities for archaea, bacteria and protists. However, they underperform for higher eukaryotes and generally struggle to accurately identify disordered regions. Our results motivate development of new predictors that target bacteria and archaea and which produce accurate results at both residue and region levels.
We also stress the need to include the region-level assessments in future assessments.

Item
flDPnn3: Fast and accurate prediction of intrinsic disorder in protein sequences
(Elsevier, 2026-01) Basu, Sushmita
flDPnn3 provides fast and highly accurate predictions of intrinsic disorder. Compared to its earlier versions, it uses a more sophisticated sequence-derived profile as input, covering a modern protein language model and additional predicted disorder functions, while maintaining a similarly small computational footprint. flDPnn3 and over 70 other disorder predictors were independently evaluated on the Disorder-NOX dataset by assessors in CAID3 (3rd Critical Assessment of protein Intrinsic Disorder prediction). A side-by-side comparison in CAID3, including low-sequence-similarity subsets of the CAID3 test data, reveals that our method matches the predictive quality of the best disorder predictors. The runtime analysis shows that flDPnn3 produces results between 3 and 8 times faster than similarly accurate disorder predictors and can be used to produce predictions at the whole-proteome scale. Additionally, flDPnn3 achieves 100% coverage by predicting all proteins, while some other accurate tools fail to predict some proteins. The CAID3 results also demonstrate that flDPnn3 is significantly more accurate than its previous versions, flDPnn and flDPnn2, which were among the top-ranked methods in CAID1 and CAID2, respectively. The flDPnn3’s web server supports batch predictions, provides interactive visualization of results, offers a tutorial page,