BITS Faculty Publications
Permanent URI for this community: http://localhost:4000/handle/123456789/1867
Item A clustering and graph deep learning-based framework for COVID-19 drug repurposing (Elsevier, 2024-09) Agarwal, Vinti; Deepa, P.R.
Drug repurposing (or repositioning) is the process of finding new therapeutic uses for drugs already approved by drug regulatory authorities (e.g., the Food and Drug Administration (FDA) and Therapeutic Goods Administration (TGA)) for other diseases. This involves analysing the interactions between different biological entities, such as drug targets (genes/proteins and biological pathways) and drug properties, to discover novel drug–target or drug–disease relations. Machine learning and deep learning models have successfully analysed complex heterogeneous data with applications in the biomedical domain, and have also been used for drug repurposing. This study presents a novel unsupervised machine learning framework that utilizes a graph-based autoencoder for multi-feature type clustering on heterogeneous drug data. The dataset consists of 438 drugs, of which 224 are under clinical trials for COVID-19 (category A). The rest are systematically filtered to ensure the safety and efficacy of the treatment (category B). The framework solely relies on reported drug data, including its pharmacological properties, chemical/physical properties, interaction with the host, and efficacy in different publicly available COVID-19 assays. Our machine-learning framework revealed three clusters of interest and provided recommendations featuring the top 15 drugs for COVID-19 drug repurposing, which were shortlisted based on the predicted clusters that were dominated by category A drugs.
Our framework can be extended to support other datasets and drug repurposing studies with the availability of our open-source code.
Item A collaborative filtering framework for friends recommendation in social networks based on interaction intensity and adaptive user similarity (Springer, 2012-09) Agarwal, Vinti
The tremendous growth in attention to, and the number of users of, social networking sites (SNSs) has led to information overload, which adds to the difficulty of making accurate recommendations of new friends to SNS users. This article incorporates collaborative filtering (CF), the most successful and widely used filtering technique, in social networks to help users explore new friends with similar interests while staying connected with old ones. Here, we first design an implicit rating model for estimating a user’s affinity toward his friends, which uncovers the strength of the relationship by utilizing both attribute similarity and user interaction intensity. We then propose a CF-based framework that offers a list of friends to the user by leveraging the preferences of like-minded users, given a small set of people that the user has already labeled as friends. Despite the immense success of CF, accuracy and sparsity are still major challenges, especially in the social networking domain, with its staggering growth and enormous number of users.
To address these inherent challenges, we first explored the idea of adaptive similarity computation between users by employing evolutionary algorithms to learn individual preferences toward a particular set of attributes, which results in a considerable improvement in recommendation accuracy compared with giving all attributes equal importance.
Item Combined Hamartoma of the Retina and Retinal Pigment Epithelium: An Optical Coherence Tomography–Based Reappraisal (Elsevier, 2017-09) Agarwal, Vinti
To analyze the optical coherence tomography (OCT) characteristics of combined hamartoma of the retina and retinal pigment epithelium (CHRRPE) involving the macula.
Item CoviRx: A User-Friendly Interface for Systematic Down-Selection of Repurposed Drug Candidates for COVID-19 (MDPI, 2022) Agarwal, Vinti
Although various vaccines are now commercially available, they have not been able to stop the spread of COVID-19 infection completely. An excellent strategy to get safe, effective, and affordable COVID-19 treatments quickly is to repurpose drugs that are already approved for other diseases. The process of developing an accurate and standardized drug repurposing dataset requires considerable resources and expertise due to numerous commercially available drugs that could be potentially used to address the SARS-CoV-2 infection. To address this bottleneck, we created the CoviRx.org platform. CoviRx is a user-friendly interface that allows analysis and filtering of large quantities of data, which is onerous to curate manually for COVID-19 drug repurposing. Through CoviRx, the curated data have been made open source to help combat the ongoing pandemic and encourage users to submit their findings on the drugs they have evaluated, in a uniform format that can be validated and checked for integrity by authenticated volunteers.
This article discusses the various features of CoviRx, its design principles, and how its functionality is independent of the data it displays. Thus, in the future, this platform can be extended to include any other disease beyond COVID-19.
Item CTI-Twitter: Gathering Cyber Threat Intelligence from Twitter using Integrated Supervised and Unsupervised Learning (IEEE, 2020) Agarwal, Vinti
Cyber threat intelligence (CTI) can be gathered from multiple sources, and Twitter is one such open source platform where a large volume and variety of threat data is shared every day. The automated and timely mining of relevant threat knowledge from this data can be crucial for enrichment of existing threat intelligence platforms to proactively defend against cyber attacks. We propose CTI-Twitter: a novel framework combining supervised and unsupervised learning models to collect, process, analyze and generate threat-specific knowledge from tweets coming from multiple users. CTI-Twitter has multi-fold contributions: i) first collecting tweets through the Twitter API, ii) extracting relevant threat tweets from irrelevant ones and classifying relevant ones into multiple classes of threats, iii) then grouping tweets belonging to each class using topic modeling, and iv) finally performing a data enrichment and verification process. We evaluate our proposed model on real-time tweets collected for about four months (in year 2020) using the Twitter API. The encouraging results obtained indicate the effectiveness of CTI-Twitter in terms of timeliness and discovery of trending attack patterns and vulnerabilities.
Item Eye Share: The P.V. Ramana Reddy Judgment - Power to Arrest Under Special Laws vis-a-vis Code of Criminal Procedure (SSRN, 2019-08) Agarwal, Vinti
The grant of powers of arrest under fiscal statutes has often come under the microscope and the GST law is no different.
While enforcement of the new tax regime was initially put on the back burner, as the GST law progresses, tackling tax evasion has become one of the tax authorities’ top priorities. Of late, countless cases have been filed by persons seeking relief under the apprehension of arrest, and there have been multiple contradictory High Court judgments on the extent of the GST officials’ power to arrest. In a case before the High Court of Telangana, the court refused to take action to protect the petitioners against arrest; however, the High Court of Karnataka and the High Court of Bombay granted anticipatory bail to the aggrieved in similar matters. Many of these cases have also reached the Apex court. The Supreme Court dismissed the appeal against the aforementioned judgment of the High Court of Telangana, confirming the High Court’s order. However, in its order on the appeal filed against the judgment of the High Court of Bombay, a division bench of the Supreme Court acknowledged the need for clarification on the issue, and referred the matter to a three-judge bench. In anticipation of this three-judge bench Supreme Court judgment, the authors critically analyse the P.V. Ramana Reddy and Others v. Union of India (2019) judgment which was passed by the High Court of Telangana and affirmed by the Supreme Court in its order dated 27.5.2019.
Item Friends Recommendations in Dynamic Social Networks (Springer, 2014-01) Agarwal, Vinti
Item Identifying Anomalous HTTP Traffic with Association Rule Mining (IEEE, 2019) Agarwal, Vinti
Web applications are compromised by exploiting different vulnerabilities. The protection systems designed to detect such attacks screen the HTTP requests to decide whether a particular request is benign or malicious. Generating effective screening rules governs the detection performance and false positive rate.
In this paper, we propose to generate classification rules to identify malicious HTTP requests using co-occurrence between certain character combinations. Our idea is motivated by the fact that a successful attack will contain certain characters in combination. For example, in an SQL injection attack, the = sign may appear along with “'”. We propose to learn such character combinations using association rules with a carefully chosen feature (character) set. We experiment with a publicly available HTTP dataset and show that malicious HTTP requests can be identified with rules generated from such associations.
Item Inflammatory carcinoma of breast in a post menopausal woman - a case report (Obstetrics & Gynaecological Department, 2021) Agarwal, Vinti
Inflammatory breast carcinoma (IBC) is also known as carcinoma mastitis (CM) and represents the most virulent form of breast cancer. It is an uncommon and aggressive form of breast cancer with inflammatory skin changes. Usually presents in women between the 4th and 5. The first description of IBC / CM in the scientific literature was published in 1814 by Sir Charles Bell. In 1938 the terms “True IBC” and “Primary IBC” were coined to distinguish “IBC” and “secondary IBC”. Secondary IBC was defined by secondary changes in the breast or recurrence of breast cancer 3. The incidence of IBC varies in different regions of the world. More common in North Africa, 5 of all breast cancer in Tunisia 4, 4-5% in Morocco. In Egypt it has a rate of 11% 6.
Item Learning to Detect: A Semi Supervised Multi-relational Graph Convolutional Network for Uncovering Key Actors on Hackforums (IEEE, 2021) Agarwal, Vinti
Cybercriminals who interact extensively on underground forums often exchange illegal commodities and indulge in discussions on unwarranted topics.
To facilitate the disruption of these highly proficient criminals, we propose a deep learning based multi-relational graph convolutional network approach to analyse the underground forum and identify key actors. We first modeled the hackforum into a homogeneous graph of users, where the multiple edges between users are captured based on their involvement in private conversations, group discussions and other miscellaneous activities. In addition, we also encode the textual content shared among users in the form of distributed feature representations generated from BERT. To obtain ground truth labels for training data, we propose a hypothesis to calculate the scores for each user based on the quality and quantity of their involvement in the underground forum. The proposed framework jointly embeds the user and multi-relational information to learn the node embeddings in the graph. We demonstrate the effectiveness of the proposed model on a neo-Nazi underground forum, Iron March. We conducted an ablation study on the model parameters to generate the best results and achieved a classification accuracy of 82%, which validates the proposed hypothesis for score computation and class labelling. To establish the robustness of our model, we compare its performance against state-of-the-art models. Though we used an underground forum as a showcase, the proposed model can be implemented to identify influential users on other social media platforms.
Item PACE: Platform for Android Malware Classification and Performance Evaluation (IEEE, 2019) Agarwal, Vinti
Android malware has become the topmost threat for the ubiquitous and useful Android ecosystem. Multiple solutions leveraging big data and machine learning capabilities to detect Android malware are being constantly developed. Too often, many of these solutions are either limited to the research output or remain isolated and unable to reach end users or malware researchers.
In this paper, we propose PACE, a unified solution to offer open and easy implementation access to several machine learning-based Android malware detection techniques, making most of the research in this domain reproducible. The benefits of PACE are offered through three interfaces: a REST API, a web interface, and an ADB interface. Multiple interfaces enable users with different expertise, such as IT administrators, security practitioners, and malware researchers, to use its services. A community-accepted dataset is used for testing of all the techniques to provide a better comparison of performance. A prototype of the proposed platform is introduced and our vision is that it will help malware analysts to tackle challenges and reduce the amount of manual work.
Item PACER: Platform for Android Malware Classification, Performance Evaluation and Threat Reporting (MDPI, 2020-01) Agarwal, Vinti
Android malware has become the topmost threat for the ubiquitous and useful Android ecosystem. Multiple solutions leveraging big data and machine-learning capabilities to detect Android malware are being constantly developed. Too often, these solutions are either limited to research output or remain isolated and incapable of reaching end users or malware researchers. An earlier work named PACE (Platform for Android Malware Classification and Performance Evaluation) was introduced as a unified solution to offer open and easy implementation access to several machine-learning-based Android malware detection techniques, making most of the research in this domain reproducible. The benefits of PACE are offered through three interfaces: Representational State Transfer (REST) Application Programming Interface (API), Web Interface, and Android Debug Bridge (ADB) interface. These multiple interfaces enable users with different expertise, such as IT administrators, security practitioners, and malware researchers, to use its services.
In this paper, we propose PACER (Platform for Android Malware Classification, Performance Evaluation, and Threat Reporting), which extends PACE by adding threat intelligence and reporting functionality for the end-user device through the ADB interface. A prototype of the proposed platform is introduced, and our vision is that it will help malware analysts and end users to tackle challenges and reduce the amount of manual work.
Item POS-804 Donor vascular endothelial growth factor gene polymorphism association with acute allograft rejection in live related renal transplant recipient patients (Elsevier, 2022-02) Agarwal, Vinti
The risk of renal allograft rejection associated with the donor’s vascular endothelial growth factor (VEGF) gene polymorphism remains unelucidated, although studies have shown an association of the recipient’s VEGF polymorphism with end-stage renal disease and early acute rejection. VEGF has pleiotropic functions, regulating vasculogenesis and endothelial cell survival signaling. Endothelial cells regulate tonicity, the permeability of blood vessels, and the egression of allo-stimulated inflammatory cells into intragraft compartments, and thus regulate the events of rejection. In the current study, we aimed to investigate the distribution of VEGF -634C>G, -1154 G>A, -1190G>A, -1455T>C, -1499 C>T, -2578 C>A, -2549 18bp Insertion/Deletion, +405 C>G and +936 C>T SNPs among donors and recipients and to evaluate the VEGF mRNA and protein expression in intragraft tissue and in plasma.
Item Predicting Friends and Foes in Signed Networks Using Inductive Inference and Social Balance Theory (IEEE, 2012) Agarwal, Vinti
Besides the notion of friendship, trust or support in social networking sites (SNSs), quite often social interactions also reflect users' antagonistic attitude towards each other. Thus, the hidden knowledge contained in social network data can be considered as an important resource to discover the formation of such positive and negative links.
In this work, an inductive learning framework is presented to suggest 'friends' and 'foes' links to individuals which envisage the social balance among users in the corresponding friends and foes networks (FFN). First, we learn a model by applying C4.5, the most widely adopted decision tree based classification algorithm, to exploit the feature patterns present in the users' FFN, and use it to predict the friend/foe relationship of unknown links. Second, a quantitative measure of social balance, the balance index, is used to support our decision on the recommendation of new friends and foes links (FFL) to avoid possible imbalance in the extended FFN with newly suggested links. The proposed scheme ensures that the recommendation of new FFLs either maintains or enhances the balancing factor of the existing FFN of an individual. Experimental results show the effectiveness of our proposed schemes.
Item Predicting the dynamics of social circles in ego networks using pattern analysis and GA K-means clustering (Wiley, 2015-04) Agarwal, Vinti
The tremendous amount of content generated on online social networks has led to a radical paradigm shift in the interest of people to group friends dynamically and share content selectively. At large, social networking sites (e.g. Google+, Facebook, Twitter, etc.) offer users various controls over categorizing their family members, friends, colleagues, etc. into one or more ‘circles’ that they want to share content with. However, it is typically impossible to design social circles in large networks and update their number and size whenever networks grow. Aiming at predicting the dynamics of formation and evolution of social circles, we performed several experiments on ground-truth data, and found that studying patterns of network and profile features at the individual level, rather than studying a circle as a whole, can greatly enhance the understanding of social circle development in online social networks.
In this review, we first present a comprehensive study of the structural behavior of circles, and then observe that within every circle there exist some key elements, termed ‘Node of Creations (NoCs)’, which play an important role in the development, survival, and evolvability of circle structures. We, therefore, propose a Genetic Algorithm–based framework to determine these key elements (NoCs) and differentiate Ego networks into non-overlapping, hierarchically nested as well as overlapping circles by leveraging knowledge from the identified patterns in order to assist K-means clustering. We have performed our experiments using Facebook and Twitter datasets and the experimental results clearly demonstrate the effectiveness of our scheme. WIREs Data Mining Knowl Discov 2015, 5:113–141. doi: 10.1002/widm.1150
Item Recommending diverse friends in signed social networks based on adaptive soft consensus paradigm using variable length genetic algorithm (Springer, 2017-10) Agarwal, Vinti
Despite the strategic role played by individuals who act as intermediaries between distinct groups of people, the problem of recommending diverse friends in signed social networks (SSNs) still remains largely unexplored. Our model integrates homophily and diversity to develop an adaptive consensus based framework, which involves fuzzy group decision making analysis by leveraging the signed social links and underlying users’ preferences, to offer lists of connections which are diverse as well as relevant. Our contributions are three-fold. First, we modeled the fuzzy binary adjacency relations between users, hereafter referred to as decision makers (DMs), exploiting users’ preferences conferred on a set of items, and then higher order fuzzy m-ary adjacency relations are constructed to represent the grade of agreement between a set of m DMs.
Further, in order to evaluate the relevance of each decision maker involved in the decision making process, we introduce a novel diversity measure based on the knowledge of socio-psychological theories and the information contained in social and interest links. Next, by employing a variable-length genetic algorithm, an idea of adaptive consensus is explored to evolve groups of experts which are highly consensual as well as influential in the social network. Finally, on the basis of opinions gleaned from the members of these groups, the signs of unknown links are predicted, thereby generating a top-N recommendation list of diverse friends. An extensive experimental study conducted on the Epinions dataset illustrates that our proposed scheme outperforms the traditional graph-based methods.
Item Safeguards against Arrest and the GST Law (Heinonline, 2018) Agarwal, Vinti
Item Systematic Down-Selection of Repurposed Drug Candidates for COVID-19 (IJMS, 2022) Agarwal, Vinti
SARS-CoV-2 is the cause of the COVID-19 pandemic which has claimed more than 6.5 million lives worldwide, devastating the economy and overwhelming healthcare systems globally. The development of new drug molecules and vaccines has played a critical role in managing the pandemic; however, new variants of concern still pose a significant threat as the current vaccines cannot prevent all infections. This situation calls for the collaboration of biomedical scientists and healthcare workers across the world. Repurposing approved drugs is an effective way of fast-tracking new treatments for recently emerged diseases. To this end, we have assembled and curated a database consisting of 7817 compounds from the Compounds Australia Open Drug collection. We developed a set of eight filters based on indicators of efficacy and safety that were applied sequentially to down-select drugs that showed promise for drug repurposing efforts against SARS-CoV-2.
Considerable effort was made to evaluate approximately 14,000 assay data points for SARS-CoV-2 FDA/TGA-approved drugs and provide an average activity score for 3539 compounds. The filtering process identified 12 FDA-approved molecules with established safety profiles that have plausible mechanisms for treating COVID-19 disease. The methodology developed in our study provides a template for prioritising drug candidates that can be repurposed for the safe, efficacious, and cost-effective treatment of COVID-19, long COVID, or any other future disease. We present our database in an easy-to-use interactive interface (CoviRx) that was also developed to enable the scientific community to access the data of over 7000 potential drugs and to implement alternative prioritisation and down-selection strategies.
Item Trust-Enhanced Recommendation of Friends in Web Based Social Networks Using Genetic Algorithms to Learn User Preferences (Springer, 2011) Agarwal, Vinti
Web-based social networks (WBSNs) are a promising new paradigm for large scale distributed data management and collective intelligence. But the exponential growth of social networks poses a new challenge and presents opportunities for recommender systems, such as the complicated nature of human-to-human interaction which comes into play while recommending people. Web based recommender systems (RSs) are the most notable application of web personalization to deal with the problem of information overload. In this paper, we present a Friend RS for WBSNs. Our contribution is threefold. First, we have identified appropriate attributes in a user profile and suggest suitable similarity computation formulae. Second, a real-valued genetic algorithm is used to learn user preferences based on comparison of individual features to increase recommendation effectiveness. Finally, in order to alleviate the sparsity problem of collaborative filtering, we have employed trust propagation techniques.
Experimental results clearly demonstrate the effectiveness of our proposed schemes.
Item Unsupervised machine learning framework for discriminating major variants of concern during COVID-19 (ARXIV, 2022-10) Agarwal, Vinti
Due to high mutation rates, COVID-19 evolved rapidly, and several variants such as Alpha, Gamma, Delta, Beta, and Omicron emerged with altered viral properties like the severity of the disease caused, transmission rates, etc. These variants burdened the medical systems worldwide and created a massive impact on the world economy as each had to be studied and dealt with in its specific ways. Unsupervised machine learning methods have the ability to compress, characterize, and visualize unlabelled data. In this paper, we present a framework that utilizes unsupervised machine learning methods to discriminate and visualize the associations between major COVID-19 variants based on their genome sequences. These methods comprise a combination of selected dimensionality reduction and clustering techniques. The framework processes the RNA sequences by performing a k-mer analysis on the data and then compares the results from different dimensionality reduction methods, including Principal Component Analysis (PCA), t-Distributed Stochastic Neighbour Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). Our framework also employs agglomerative hierarchical clustering to visualize the mutational differences among major variants of concern and country-wise mutational differences for a particular variant (Delta and Omicron) using dendrograms. We also provide country-wise mutational differences for selected variants via dendrograms. We conclude that the proposed framework can effectively distinguish between the major variants and hence can be used for the identification of emerging variants in the future.
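As an illustration of the k-mer analysis and agglomerative hierarchical clustering steps named in the last abstract above, the following is a minimal sketch in plain Python and is not the authors' implementation. The toy sequences, the choice of k = 3, the Euclidean distance, the average-linkage rule, and the helper names `kmer_profile` and `agglomerate` are all assumptions made for this example; the dimensionality reduction step (PCA, t-SNE, UMAP) is omitted.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=3):
    """Return normalized k-mer frequencies of an RNA sequence (alphabet ACGU)."""
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(profiles):
    """Naive average-linkage agglomerative clustering.

    Returns the merges in the order they happen as (left, right, distance).
    """
    clusters = {name: [name] for name in profiles}
    merges = []
    while len(clusters) > 1:
        best = None
        names = sorted(clusters)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                # Average pairwise distance between members of the two clusters.
                d = sum(euclidean(profiles[x], profiles[y])
                        for x in clusters[a] for y in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
    return merges


# Hypothetical toy sequences, not real viral genomes.
sequences = {
    "v1": "AUGGCUAGCUAGGAUCC",
    "v2": "AUGGCUAGCUAGGAUCG",
    "v3": "UUUUAAAACCCCGGGGUU",
}
profiles = {name: kmer_profile(seq) for name, seq in sequences.items()}
merges = agglomerate(profiles)
for left, right, dist in merges:
    print(f"merge {left} and {right} at distance {dist:.3f}")
```

The merge order is the information a dendrogram draws: the two near-identical toy sequences join first and the dissimilar one joins last. For real genome data, a library implementation of hierarchical clustering would be a more practical choice than this quadratic pure-Python loop.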