Unsupervised machine learning framework for discriminating major variants of concern during COVID-19

Agarwal, Vinti

dc.contributor.author	Agarwal, Vinti
dc.date.accessioned	2023-01-10T09:17:05Z
dc.date.available	2023-01-10T09:17:05Z
dc.date.issued	2022-10
dc.identifier.uri	https://arxiv.org/abs/2208.01439
dc.identifier.uri	http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8434
dc.description.abstract	Due to its high mutation rate, SARS-CoV-2, the virus that causes COVID-19, evolved rapidly, and several variants such as Alpha, Beta, Gamma, Delta, and Omicron emerged with altered viral properties such as disease severity and transmission rate. These variants burdened medical systems worldwide and had a massive impact on the world economy, as each had to be studied and handled in its own way. Unsupervised machine learning methods can compress, characterize, and visualize unlabelled data. In this paper, we present a framework that uses unsupervised machine learning methods to discriminate and visualize the associations between major COVID-19 variants based on their genome sequences. These methods combine selected dimensionality reduction and clustering techniques. The framework processes the RNA sequences by performing a k-mer analysis on the data and then compares the results of different dimensionality reduction methods, including Principal Component Analysis (PCA), t-Distributed Stochastic Neighbour Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). Our framework also employs agglomerative hierarchical clustering to visualize, using dendrograms, the mutational differences among major variants of concern and the country-wise mutational differences within a particular variant (Delta and Omicron). We conclude that the proposed framework can effectively distinguish between the major variants and hence can be used to identify emerging variants in the future.	en_US
dc.language.iso	en	en_US
dc.publisher	ARXIV	en_US
dc.subject	Computer Science	en_US
dc.subject	Machine Learning	en_US
dc.subject	SARS-CoV-2	en_US
dc.subject	Mutation	en_US
dc.subject	COVID-19	en_US
dc.subject	Unsupervised machine learning	en_US
dc.subject	UMAP	en_US
dc.title	Unsupervised machine learning framework for discriminating major variants of concern during COVID-19	en_US
dc.type	Article	en_US
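
The abstract describes a pipeline of k-mer analysis followed by dimensionality reduction and agglomerative hierarchical clustering. The sketch below illustrates only the k-mer and clustering steps, and it is not the authors' implementation. It makes several assumptions that do not come from the record: k = 3, normalized k-mer frequencies as features, Euclidean distance, average linkage, and four invented toy sequences. The helper names (`kmer_vector`, `euclidean`, `agglomerate`) are hypothetical. The dimensionality reduction step (PCA, t-SNE, UMAP) is left out, since in practice it would come from a dedicated library.

```python
from collections import Counter
from itertools import product


def kmer_vector(seq, k=3):
    """Return normalized k-mer frequencies over the A/C/G/U alphabet."""
    # Assumption: sequences may arrive with DNA letters, so map T to U.
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing other characters (e.g. N) are ignored when normalizing.
    total = sum(counts[km] for km in kmers) or 1
    return [counts[km] / total for km in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerate(vectors, n_clusters):
    """Naive average-linkage agglomerative clustering; returns sets of indices."""
    clusters = [{i} for i in range(len(vectors))]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(euclidean(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        a, b = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[a] |= clusters[b]
        del clusters[b]
    return clusters


# Toy sequences, invented for illustration only (not real viral genomes).
toy = {
    "a1": "ACGUACGUACGUACGU",
    "a2": "ACGUACGUACGUACGA",
    "b1": "GGGGCCCCGGGGCCCC",
    "b2": "GGGGCCCCGGGGCCCA",
}
names = list(toy)
features = [kmer_vector(toy[n]) for n in names]
groups = agglomerate(features, n_clusters=2)
labelled = sorted(sorted(names[i] for i in g) for g in groups)
print(labelled)
```

The order in which clusters are merged is what a dendrogram draws. This naive version recomputes every linkage on each pass and so only suits small inputs; for real genome collections a library implementation of hierarchical clustering would be the usual choice.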

This item appears in the following Collection(s)

Department of Computer Science and Information Systems [1099]
