Browsing by Author "Narang, Pratik"

AENeT: an attention-enabled neural architecture for fake news detection using contextual features
(Springer, 2021) Narang, Pratik; Sharma, Yashvardhan
In the current era of social media, the popularity of smartphones and social media platforms has increased exponentially. Through these electronic media, fake news has been rising rapidly with the advent of new sources of information, which are highly unreliable. Checking whether a particular news article is genuine or fake is not easy for any end user. Search engines like Google are also not capable of judging the fakeness of a news article because they are restricted to limited query keywords. In this paper, our end goal is to design an efficient deep learning model to detect the degree of fakeness in a news statement. We propose a simple network architecture that uses contextual embeddings as word embeddings and applies attention mechanisms to the relevant metadata available. The efficacy and efficiency of our models are demonstrated on several real-world datasets. Our model achieved 46.36% accuracy on the LIAR dataset, which outperforms the current state of the art by 1.49%.
AffectSRNet: facial emotion-aware super-resolution network
(2025-02) Narang, Pratik
Facial expression recognition (FER) systems in low-resolution settings face significant challenges in accurately identifying expressions due to the loss of fine-grained facial details. This limitation is especially problematic for applications like surveillance and mobile communications, where low image resolution is common and can compromise recognition accuracy. Traditional single-image face super-resolution (FSR) techniques, however, often fail to preserve the emotional intent of expressions, introducing distortions that obscure the original affective content. Given the inherently ill-posed nature of single-image super-resolution, a targeted approach is required to balance image quality enhancement with emotion retention. In this paper, we propose AffectSRNet, a novel emotion-aware super-resolution framework that reconstructs high-quality facial images from low-resolution inputs while maintaining the intensity and fidelity of facial expressions. Our method effectively bridges the gap between image resolution and expression accuracy by employing an expression-preserving loss function, specifically tailored for FER applications. Additionally, we introduce a new metric to assess emotion preservation in super-resolved images, providing a more nuanced evaluation of FER system performance in low-resolution scenarios. Experimental results on standard datasets, including CelebA, FFHQ, and Helen, demonstrate that AffectSRNet outperforms existing FSR approaches in both visual quality and emotion fidelity, highlighting its potential for integration into practical FER applications. This work not only improves image clarity but also ensures that emotion-driven applications retain their core functionality in suboptimal resolution environments, paving the way for broader adoption in FER systems.
AGD-Net: Attention-Guided Dense Inception U-Net for Single-Image Dehazing
(Springer, 2023-12) Chamola, Vinay; Narang, Pratik
Image hazing poses a significant challenge in various computer vision applications, degrading the visual quality and reducing the perceptual clarity of captured scenes. The proposed AGD-Net utilizes a U-Net style architecture with an Attention-Guided Dense Inception encoder-decoder framework. Unlike existing methods that heavily rely on synthetic datasets which are based on CARLA simulation, our model is trained and evaluated exclusively on realistic data, enabling its effectiveness and reliability in practical scenarios. The key innovation of AGD-Net lies in its attention-guided mechanism, which empowers the network to focus on crucial information within hazy images and effectively suppress artifacts during the dehazing process. The dense inception modules further advance the representation capabilities of the model, facilitating the extraction of intricate features from the input images. To assess the performance of AGD-Net, a detailed experimental analysis is conducted on four benchmark haze datasets. The results show that AGD-Net significantly outperforms the state-of-the-art methods in terms of PSNR and SSIM. Moreover, a visual comparison of the dehazing results further validates the superior performance gains achieved by AGD-Net over other methods. By leveraging realistic data exclusively, AGD-Net overcomes the limitations associated with synthetic datasets which are based on CARLA simulation, ensuring its adaptability and effectiveness in real-world circumstances. The proposed AGD-Net offers a robust and reliable solution for single-image dehazing, presenting a significant advancement over existing methods.
AI-Enabled Object Detection in UAVs: Challenges, Design Choices, and Research Directions
(IEEE, 2021-08) Narang, Pratik; Chamola, Vinay
Unmanned aerial vehicles (UAVs) are emerging as a powerful tool for various industrial and smart city applications. UAVs coupled with various sensors can perform many cognitive tasks such as object detection, surveillance, traffic management, and urban planning. Deep learning has emerged as a popular technique to speed up the processing of high-dimensional data like images and videos, which has led to several applications in surveillance and autonomous driving. However, the area of aerial object detection has been understudied. This work proposes a deep learning approach for detection of objects in aerial scenes captured by UAVs. Our work first categorizes the current methods for aerial object detection using deep learning techniques and discusses how the task is different from general object detection scenarios. We delineate the specific challenges involved and experimentally demonstrate the key design decisions that significantly affect the accuracy and robustness of models. We further propose an optimized architecture that combines these design choices with the recent ResNeSt backbone to achieve superior performance in aerial object detection. Lastly, we propose several research directions to inspire further advancement in aerial object detection.
AI-enabled remote monitoring of vital signs for COVID-19: methods, prospects and challenges
(Springer, 2021-03) Narang, Pratik; Chamola, Vinay
The COVID-19 pandemic has overwhelmed the existing healthcare infrastructure in many parts of the world. Healthcare professionals are not only over-burdened but also at a high risk of nosocomial transmission from COVID-19 patients. Screening and monitoring the health of a large number of susceptible or infected individuals is a challenging task. Although professional medical attention and hospitalization are necessary for high-risk COVID-19 patients, home isolation is an effective strategy for low and medium risk patients as well as for those who are at risk of infection and have been quarantined. However, this necessitates effective techniques for remotely monitoring the patients’ symptoms. Recent advances in Machine Learning (ML) and Deep Learning (DL) have strengthened the power of imaging techniques and can be used to remotely perform several tasks that previously required the physical presence of a medical professional. In this work, we study the prospects of vital signs monitoring for COVID-19 infected as well as quarantined individuals by using DL and image/signal-processing techniques, many of which can be deployed using simple cameras and sensors available on a smartphone or a personal computer, without the need of specialized equipment. We demonstrate the potential of ML-enabled workflows for several vital signs such as heart and respiratory rates, cough, blood pressure, and oxygen saturation. We also discuss the challenges involved in implementing ML-enabled techniques.
Anomaly detection in diurnal CPS monitoring data using a local density approach
(IEEE, 2016) Narang, Pratik
Devices that monitor and measure various system parameters or physical phenomena form an integral part of cyber-physical systems. Such devices usually operate continuously and gather important data that is often critical for the operation of the underlying system. Thus, it becomes important to understand and detect abnormal or malicious device behavior, false injection of data by an adversary, or other security threats that may lead to incorrect measurement data. This paper addresses the problem of detection of anomalies in diurnal traffic volume data in an intelligent transportation system. The proposed approach leverages the statistical properties of the data to perform anomaly detection by calculating the 'local density' of the data points. Anomalous behavior in the traffic volumes reported by road segments is calculated based on sparse local density of the data points. Our approach for detecting anomalies does not require any information about the outside factors which might have influenced the data. The proposed approach has been evaluated on attacks simulated on transportation data collected by the New York State Department of Transportation. The proposed approach also extends to other cyber-physical systems where the monitored data exhibits diurnal patterns.
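As a rough illustration of the local-density idea described in this abstract, the sketch below scores each reading by the inverse of its mean distance to its k nearest neighbours and flags readings whose density falls far below the median. The function names, the value of k, the threshold ratio, and the sample volumes are all assumptions made for illustration; the paper's actual density definition may differ.

```python
from statistics import median


def local_density(values, k=3):
    """Return one score per value: 1 / mean distance to its k nearest neighbours."""
    densities = []
    for i, v in enumerate(values):
        # Distances from this reading to every other reading, smallest first.
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        mean_dist = sum(dists[:k]) / k
        # A small epsilon avoids division by zero when readings are identical.
        densities.append(1.0 / (mean_dist + 1e-9))
    return densities


def flag_anomalies(values, k=3, ratio=0.2):
    """Flag indices whose density is below ratio * median density (hypothetical rule)."""
    densities = local_density(values, k)
    cutoff = ratio * median(densities)
    return [i for i, d in enumerate(densities) if d < cutoff]


# Hypothetical traffic volumes for the same hour on different days;
# the last value stands in for an injected reading.
volumes = [120, 118, 125, 122, 119, 121, 400]
print(flag_anomalies(volumes))  # prints [6]
```

Comparing readings taken at the same hour across days is one simple way to respect a diurnal pattern, since no external factors need to be modelled.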
Attention-enabled Deep Neural Network for Enhancing UAV-Captured Pavement Imagery in Poor Visibility
(IEEE, 2023) Singh, Ajit Pratap; Srinivas, Rallapalli; Narang, Pratik
Integrating Unmanned Aerial Vehicle (UAV) technology with Artificial Intelligence (AI) and Computer Vision has revolutionized asset management, particularly pavement health monitoring. However, current AI-based methods often struggle in low-visibility scenarios, limiting their effectiveness. To address this, we present a novel end-to-end deep learning pipeline that detects image degradation using an efficient attention mechanism and performs subsequent enhancement. This algorithm can be seamlessly integrated into drones or used for post-processing of pavement imagery. Its efficiency allows for scalability, making it a valuable tool for downstream road health monitoring tasks, such as cost estimation for road repairs. Our approach achieves a mean accuracy of 93.34% with a mean inference time of 0.154 seconds, demonstrating its efficacy.
Background Invariant Faster Motion Modeling for Drone Action Recognition
(MDPI, 2021-07) Narang, Pratik
Visual data collected from drones has opened a new direction for surveillance applications and has recently attracted considerable attention among computer vision researchers. Given the availability and increasing use of drones in both the public and private sectors, this is a critical emerging technology for solving multiple surveillance problems in remote areas. One of the fundamental challenges in recognizing human actions in crowd-monitoring videos is the precise modeling of an individual’s motion features. Most state-of-the-art methods rely heavily on optical flow for motion modeling and representation, and motion modeling through optical flow is a time-consuming process. This article highlights this issue and provides a novel architecture that eliminates the dependency on optical flow. The proposed architecture uses two sub-modules, FMFM (faster motion feature modeling) and AAR (accurate action recognition), to accurately classify aerial surveillance actions. Another critical issue in aerial surveillance is the shortage of datasets. Of the few datasets proposed recently, most have multiple humans performing different actions in the same scene, as in a crowd-monitoring video, and are hence not suitable for directly training action recognition models. Given this, we propose a novel dataset captured from top-view aerial surveillance that has a good variety in terms of actors, time of day, and environment. The proposed architecture can be applied across different terrain because it removes the background before applying the action recognition model. The proposed architecture is validated through experiments at varying levels of investigation and achieves a validation accuracy of 0.90 in aerial action recognition.
Balancing the scales: enhancing fairness in facial emotion recognition with latent alignment
(Springer, 2024-12) Narang, Pratik
Automatically recognizing emotional intent using facial expression has been a thoroughly investigated topic in the realm of computer vision. Facial Expression Recognition (FER), being a supervised learning task, relies heavily on substantially large data exemplifying various socio-cultural demographic attributes. Over the past decade, several real-world in-the-wild FER datasets that have been proposed were collected through crowd-sourcing or web-scraping. However, most of these practically used datasets employ a manual annotation methodology for labelling emotional intent, which inherently propagates individual demographic biases. Moreover, these datasets also lack an equitable representation of various socio-cultural demographic groups, thereby inducing a class imbalance. Bias analysis and its mitigation have been investigated across multiple domains and problem settings; however, in the FER domain, this is a relatively lesser explored area. This work leverages representation learning based on latent spaces to mitigate bias in facial expression recognition systems, thereby enhancing a deep learning model’s fairness and overall accuracy.
Classification and study of music genres with multimodal Spectro-Lyrical Embeddings for Music (SLEM)
(Springer, 2024-04) Narang, Pratik
The essence of music is inherently multi-modal, with audio and lyrics going hand in hand. However, very little research has studied the intricacies of the multi-modal nature of music and its relation to genres. Our work uses this multi-modality to present spectro-lyrical embeddings for music representation (SLEM), leveraging the power of open-sourced, lightweight, and state-of-the-art deep learning vision and language models to encode songs. This work summarises extensive experimentation with over 20 deep learning-based music embeddings of a self-curated and hand-labeled multi-lingual dataset of 226 recent songs spread over 5 genres. Our aim is to study the effects of varying the weight of lyrics and spectrograms in the embeddings on multi-class genre classification. The purpose of this study is to show that a simple linear combination of both modalities is better than either modality alone. Our methods achieve an accuracy between 81.08% and 98.60% for different genres by using the K-nearest neighbors algorithm on the multimodal embeddings. We successfully study the intricacies of genres in this representational space, including their misclassification, visual clustering with EM-GMM, and the domain-specific meaning of the multi-modal weight for each genre with respect to 'instrumentalness' and 'energy' metadata. SLEM presents one of the first works on an end-to-end method that uses spectro-lyrical embeddings without hand-engineered features.
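The weighted combination of modalities followed by K-nearest-neighbors classification can be sketched as below. This is a toy illustration only: it assumes both embeddings have the same dimension and are blended as a weighted sum, and the names `combine`, `knn_predict`, the weight `w`, and the two-dimensional vectors are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def combine(spectro, lyrical, w):
    """Blend two same-length embeddings: weight w on lyrics, (1 - w) on the spectrogram."""
    return [(1.0 - w) * s + w * l for s, l in zip(spectro, lyrical)]


def knn_predict(train, query, k=3):
    """Majority vote among the k training embeddings closest to the query.

    train is a list of (embedding, genre) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (spectrogram embedding, lyrics embedding, genre) triples.
songs = [
    ([0.9, 0.1], [0.8, 0.2], "rock"),
    ([0.8, 0.2], [0.9, 0.1], "rock"),
    ([0.1, 0.9], [0.2, 0.8], "folk"),
    ([0.2, 0.8], [0.1, 0.9], "folk"),
]

w = 0.5  # equal weight on both modalities
train = [(combine(s, l, w), genre) for s, l, genre in songs]
query = combine([0.85, 0.15], [0.7, 0.3], w)
print(knn_predict(train, query, k=3))  # prints rock
```

Sweeping `w` from 0 to 1 and re-running the classifier is one way to observe how the balance between lyrics and spectrograms affects each genre.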
Deep3DSCan: Deep residual network and morphological descriptor based framework for lung cancer classification and 3D segmentation
(IET, 2020-04) Raman, Sundaresan; Chamola, Vinay; Narang, Pratik
With the increasing incidence rate of lung cancer patients, early diagnosis could help in reducing the mortality rate. However, accurate recognition of cancerous lesions is immensely challenging owing to factors such as low contrast variation, heterogeneity and visual similarity between benign and malignant nodules. Deep learning techniques have been very effective in performing natural image segmentation with robustness to previously unseen situations, reasonable scale invariance and the ability to detect even minute differences. However, they usually fail to learn domain-specific features due to the limited amount of available data and the domain-agnostic nature of these techniques. This work presents an ensemble framework, Deep3DSCan, for lung cancer segmentation and classification. The deep 3D segmentation network generates the 3D volume of interest from computed tomography scans of patients. The deep features and handcrafted descriptors are extracted using a fine-tuned residual network and morphological techniques, respectively. Finally, the fused features are used for cancer classification. The experiments were conducted on the publicly available LUNA16 dataset. For the segmentation, the authors achieved an accuracy of 0.927, a significant improvement over the template matching technique, which had achieved an accuracy of 0.927. For the detection, previous state-of-the-art is 0.866, while ours is 0.883.
DeepFakE: improving fake news detection using tensor decomposition-based deep neural network
(Springer, 2020-05) Narang, Pratik
Social media platforms have made sharing information, including news, simpler than traditional channels. The ease of accessing and sharing data, together with the revolution in mobile technology, has led to the proliferation of fake news. Fake news has the potential to manipulate public opinion and hence may harm society. Thus, it is necessary to examine the credibility and authenticity of the news articles being shared on social media. The problem of fake news has gained massive attention from research communities and needs a solution that is both efficient and effective. Existing detection methods are based on either news content or social context, using user-based features individually. In this paper, the content of the news article and the existence of echo chambers (communities of social media users sharing the same opinions) in the social network are taken into account for fake news detection. A tensor representing social context (correlation between user profiles on social media and news articles) is formed by combining the news, user and community information. The news content is fused with the tensor, and coupled matrix-tensor factorization is employed to get a representation of both news content and social context. The proposed method has been tested on a real-world dataset: BuzzFeed. The factors obtained after decomposition have been used as features for news classification. An ensemble machine learning classifier (XGBoost) and a deep neural network model (DeepFakE) are employed for the task of classification. Our proposed model (DeepFakE) outperforms the existing fake news detection methods by applying deep learning to combined news content and social context-based features as an echo chamber.
DerainGAN: Single image deraining using wasserstein GAN
(Springer, 2021-09) Narang, Pratik
Rainy weather greatly affects the visibility of salient objects and scenes in the captured images and videos. The object/scene visibility varies with the type of raindrops, i.e. adherent rain droplets, streaks, rain, mist, etc. Moreover, they pose multifaceted challenges to detect and remove the raindrops to reconstruct the rain-free image for higher-level tasks like object detection, road segmentation etc. Recently, both Convolutional Neural Networks (CNN) and Generative Adversarial Network (GAN) based models have been designed to remove rain droplets from a single image by dealing with it as an image to image mapping problem. However, most of them fail to capture the complexities of the task, create blurry output, or are not time efficient. GANs are a prime candidate for solving this problem as they are extremely effective in learning image maps without harsh overfitting. In this paper, we design a simple yet effective ‘DerainGAN’ framework to achieve improved deraining performance over the existing state-of-the-art methods. The learning is based on a Wasserstein GAN and perceptual loss incorporated into the architecture. We empirically analyze the effect of different parameter choices to train the model for better optimization. We also identify the strengths and limitations of various components for single image deraining by performing multiple ablation studies on our model. The robustness of the proposed method is evaluated over two synthetic and one real-world rainy image datasets using Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) values. The proposed DerainGAN significantly outperforms almost all state-of-the-art models on the Rain100L and Rain700 datasets, both in semantic and visual appearance, achieving SSIM of 0.8201 and PSNR of 24.15 on Rain700 and SSIM of 0.8701 and PSNR of 28.30 on Rain100L. This accounts for an average improvement of 10 percent in PSNR and 20 percent in SSIM over benchmarked methods. Moreover, the DerainGAN is one of the fastest methods in terms of time taken to process the image, giving it over 0.1 to 150 seconds of advantage in some cases.
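For readers unfamiliar with the PSNR metric quoted in this abstract, the following minimal sketch computes it from the standard definition, 10 * log10(MAX^2 / MSE). The flat pixel lists, the `psnr` function name, and the sample values are illustrative assumptions and do not reproduce the paper's evaluation code.

```python
import math


def psnr(reference, restored, max_value=1.0):
    """Peak Signal to Noise Ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical two-pixel images with intensities in [0, 1].
clean = [0.5, 0.5]
derained = [0.6, 0.4]
print(round(psnr(clean, derained), 2))  # about 20 dB, since the MSE is 0.01
```

Higher PSNR means the restored image is closer to the reference; for 8-bit images, `max_value` would be 255 instead of 1.0.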
Domain-Aware Unsupervised Hyperspectral Reconstruction for Aerial Image Dehazing
(arXiv, 2020) Narang, Pratik
Haze removal in aerial images is a challenging problem due to considerable variation in spatial details and varying contrast. Changes in particulate matter density often lead to degradation in visibility. Therefore, several approaches utilize multi-spectral data as auxiliary information for haze removal. In this paper, we propose SkyGAN for haze removal in aerial images. SkyGAN consists of 1) a domain-aware hazy-to-hyperspectral (H2H) module, and 2) a conditional GAN (cGAN) based multi-cue image-to-image translation module (I2I) for dehazing. The proposed H2H module reconstructs several visual bands from RGB images in an unsupervised manner, which overcomes the lack of hazy hyperspectral aerial image datasets. The module utilizes task supervision and domain adaptation in order to create a "hyperspectral catalyst" for image dehazing. The I2I module uses the hyperspectral catalyst along with a 12-channel multi-cue input and performs effective image dehazing by utilizing the entire visual spectrum. In addition, this work introduces a new dataset, called Hazy Aerial-Image (HAI) dataset, that contains more than 65,000 pairs of hazy and ground truth aerial images with realistic, non-homogeneous haze of varying density. The performance of SkyGAN is evaluated on the recent SateHaze1k dataset as well as the HAI dataset. We also present a comprehensive evaluation of HAI dataset with a representative set of state-of-the-art techniques in terms of PSNR and SSIM.
Downlink power control for latency aware grid energy savings in green cellular networks
(IEEE, 2016) Narang, Pratik; Chamola, Vinay
Mobile service providers can achieve cost savings by deploying Base Stations (BSs) which harvest renewable energy, as they reduce the energy drawn from the grid and its associated cost. The cost savings can be further enhanced by careful management of the system resources. Furthermore, mobile operators require that such resource management be carefully coupled with managing the quality of service (QoS) to ensure customer satisfaction. This process involves a trade-off between the energy drawn from the grid and the QoS performance. In contrast to prior research, which has addressed the problem of joint management of grid energy savings and the QoS performance using user-association reconfiguration or BS on/off schemes, we present a framework for doing so using BS downlink power control. Our proposed framework is evaluated through simulations using a real BS deployment from London, UK, and we show its superior performance over existing benchmarks. We demonstrate that our framework can lead to around 40% grid energy savings with better network latency performance as compared to the traditionally used scheme.
Drone-surveillance for search and rescue in natural disaster
(Elsevier, 2020-04) Narang, Pratik
Due to the increasing capability of drones and the need to monitor remote areas, drone surveillance is becoming popular. In the case of a natural disaster, a drone can scan a wide affected area quickly and speed up search and rescue (SAR) to save more human lives. However, the use of autonomous drones for search and rescue remains little explored and requires the attention of researchers to develop efficient algorithms for autonomous drone surveillance. To develop an automated application using recent advances in deep learning, the dataset is the key. For this, a substantial amount of human detection and action detection data is required to train the deep learning models. As no dataset of drone surveillance in SAR is available in the literature, this paper proposes an image dataset for human action detection for SAR. The proposed dataset contains 2000 unique images filtered from 75,000 images. It contains 30,000 human instances of different actions. In addition, various experiments are conducted with the proposed dataset, a publicly available dataset, and state-of-the-art detection methods. Our experiments show that existing models are not adequate for critical applications such as SAR, which motivates us to propose a model inspired by the pyramidal feature extraction of SSD for human detection and action recognition. The proposed model achieves 0.98 mAP when applied to the proposed dataset, which is a significant contribution. In addition, the proposed model achieves a 7% higher mAP value when applied to the standard Okutama dataset in comparison with the state-of-the-art detection models in the literature.
DroneSegNet: Robust Aerial Semantic Segmentation for UAV-Based IoT Applications
(IEEE, 2022-04) Narang, Pratik; Chamola, Vinay
Unmanned Aerial Vehicles (UAVs) are the promising “Flying IoT” devices of the future, which can be equipped with various sensors and cognitive capabilities to perform numerous tasks related to remote sensing, search and rescue operations, object tracking, segmentation of roads and buildings, surveillance, etc. However, these AI-driven tasks require heavy computation and may lead to suboptimal performance with embedded processors on a power-constrained battery-operated drone. This work proposes a novel deep learning approach for performing robust semantic segmentation of aerial scenes captured by UAVs. In our setup, the power-constrained drone is used only for data collection, while the computationally intensive tasks are offloaded to a GPU cloud server. Our architecture performs robust semantic segmentation by learning the segmentation maps from the joint use of aerial scenes and their respective “elevation maps” in a semi-supervised approach. We propose a three-tier deep learning architecture, wherein the first module aims at preliminary feature extraction from aerial scenes using a backbone feature extractor. The second module captures the spatial dependency between the aerial scenes and their respective elevation maps to obtain better semantic information, which is achieved by a bi-directional LSTM. The third module is aimed at enhancing the performance of semantic segmentation through a semi-supervised approach with an encoder to generate segmentation maps and a decoder to reconstruct feature maps. This semi-supervised feature learning ensures robust extraction along with scalability. The proposed architecture was validated on real-world aerial datasets and achieves state-of-the-art results for aerial image segmentation.
EchoFakeD: improving fake news detection in social media with an efficient deep neural network
(Springer, 2021-01) Narang, Pratik
The increasing popularity of social media platforms has simplified the sharing of news articles, which has led to an explosion in fake news. With fake news emerging at a very rapid rate, a serious concern has arisen in our society because of the enormous dissemination of fake content. The quality of the news content is questionable, and there is a need for an automated detection tool. Existing studies primarily focus on utilizing information extracted from the news content. We suggest that user-based engagements and the context-related group of people (echo chamber) sharing the same opinions can play a vital role in fake news detection. Hence, in this paper, we focus on both the content of the news article and the existence of echo chambers in the social network for fake news detection. Standard factorization methods for fake news detection have limited effectiveness because they are unsupervised and are primarily employed with traditional machine learning models. Designing an effective deep learning model with a tensor factorization approach is therefore the priority. In our approach, the news content is fused with the tensor following a coupled matrix-tensor factorization method to get a latent representation of both news content and social context. We have designed our model with a different number of filters across each dense layer along with dropout. To classify on news content and social context-based information individually as well as in combination, a deep neural network (our proposed model) was employed with optimal hyper-parameters. The performance of our proposed approach has been validated on real-world fake news datasets: BuzzFeed and PolitiFact. Classification results demonstrate that our proposed model (EchoFakeD) outperforms existing and appropriate baselines for fake news detection and achieves a validation accuracy of 92.30%. These results show significant improvements over the existing state-of-the-art models in the area of fake news detection and affirm the potential use of the technique for classifying fake news.
EraisNET: An Optical Flow based 3D ConvNET for Erasing Obstructions
(IEEE, 2022) Narang, Pratik; Rajput, Amitesh Singh
Images captured from behind a fence, through a window, or during rain generally suffer from occlusions. Though prior works have individually addressed the problems of de-raining, reflection removal, and occlusion removal, a common approach that removes all such obstructions has received little attention in the literature. In this paper, we address the image occlusion problem by proposing a deep learning-based approach that uses motion differences between two images and extracts important moving features from videos to separate the background from the obstruction. To accomplish this task, a novel 3D-convolution architecture is introduced, which is trained with synthetically blended videos. We use learned layer-based CNN methods combined with dense optical flow and generative networks for better output images. Moreover, a dataset for obstruction removal with sequences for reflection and fence removal is proposed. The proposed approach is tested on a wide variety of images and is found to be a good candidate against state-of-the-art schemes.
FakeBERT: Fake news detection in social media with a BERT-based deep learning approach
(Springer, 2021-01) Narang, Pratik
In the modern era of computing, the news ecosystem has transformed from traditional print media to social media outlets. Social media platforms allow us to consume news much faster, but their less restrictive editing results in the spread of fake news at an incredible pace and scale. In recent research, many useful methods for fake news detection employ sequential neural networks to encode news content and social context-level information, where the text sequence is analyzed in a unidirectional way. Therefore, a bidirectional training approach is a priority for modelling the relevant information of fake news, as it can improve classification performance through its ability to capture semantic and long-distance dependencies in sentences. In this paper, we propose a BERT-based (Bidirectional Encoder Representations from Transformers) deep learning approach (FakeBERT) by combining different parallel blocks of the single-layer deep Convolutional Neural Network (CNN) having different kernel sizes and filters with the BERT. Such a combination is useful to handle ambiguity, which is the greatest challenge to natural language understanding. Classification results demonstrate that our proposed model (FakeBERT) outperforms the existing models with an accuracy of 98.90%.