Abstract:
Computational prediction of DNA-binding residues (DBRs) and RNA-binding residues (RBRs) in protein sequences is an active area of research, with about 90 predictors, 20 of which were published over the last two years. The new predictors rely on sophisticated deep neural networks and protein language models, produce accurate predictions, and are conveniently available as code and/or web servers. However, we identified a shortage of tools that predict these interactions in intrinsically disordered regions and of tools capable of predicting residues that interact with specific RNA and DNA types. Moreover, cross-predictions between RBRs and DBRs should be quantified and minimized to ensure that future tools accurately differentiate between these two distinct types of nucleic acids.