BITS Faculty Publications

Permanent URI for this community: http://localhost:4000/handle/123456789/1867

Search Results

Now showing 1 - 10 of 28
  • Item
    Permuted spectral and permuted spectral-spatial cnn models for polsar-multispectral data based land cover classification
    (Taylor & Francis, 2020-12) Phartiyal, Gopal Singh
    It is a challenge to develop methods which can process the polarimetric synthetic aperture radar (PolSAR) and multispectral (MS) data modalities together without losing information from either for remote sensing applications. This paper presents a study which attempts to introduce novel deep learning-based remote sensing data processing frameworks that utilize convolutional neural networks (CNNs) in both spatial and spectral domains to perform land cover (LC) classification with PolSAR-MS data. Also, since remotely sensed earth observation data usually have greater spectral depth than ordinary camera image data, exploiting the spectral information in remote sensing (RS) data is crucial as well. In fact, convolutions in the sub-spectral space are an intuitive alternative to the process of feature selection. Recently, researchers have had success in exploiting the spectral information of RS data, especially hyperspectral data, with CNNs. In this paper, exploitation of the spectral information in the PolSAR-MS data via a permuted localized spectral convolution along with localized spatial convolution is proposed. Further, the study in this paper also establishes the significance of performing permuted localized spectral convolutions over non-localized or localized spectral convolutions. Two models are proposed, namely a permuted local spectral convolutional network (Perm-LS-CNN) and a permuted local spectral-spatial convolutional network (Perm-LSS-CNN). These models are trained on ground truth class data points measured directly on the terrain. The evaluation of the generalization performance is done using ground truth knowledge on selected well-known regions in the study areas. Comparison with other popular machine learning classifiers shows that the Perm-LSS-CNN model provides better classification results in terms of both accuracy and generalization.
  • Item
    Land cover mapping of mixed classes using 2D CNN with multi-frequency SAR data
    (Elsevier, 2024-07) Phartiyal, Gopal Singh
    Synthetic aperture radar (SAR) data obtained at multiple frequencies and polarizations offers valuable complementary information for classifying mixed classes that exhibit similar backscattering response. Although deep learning-based convolutional neural networks (CNNs) effectively extract features from multi-frequency SAR data, the arbitrary ordering of SAR features may hinder optimal convolution of the best feature sub-space for a specific class and underutilize available multi-frequency data. To address this, a novel CNN transforming SAR feature-space from 1-D to 2-D and employing varied dilation-rate convolutions is introduced. This transformation maximizes unique and localized feature combinations, efficiently utilizing the available feature sub-spaces and extracting discriminative features for accurate classifications, addressing the challenge of arbitrary band neighborhoods. Utilizing dual-polarization SAR data from ALOS-2 PALSAR-2 and Sentinel-1 sensors, the proposed CNN achieves an average f-score of 0.97 and a kappa coefficient of 0.97, an improvement of 11%, 7% and 3% in overall accuracy (OA) compared to the 1-D, 2-D and 3-D CNN classifiers without feature transformation. The classifier's generalization ability is evaluated using ground truth knowledge of various heterogeneous classes, and the proposed CNN classifier outperforms others in terms of accuracy metrics and generalization ability.
  • Item
    An attention-based deep network for plant disease classification
    (2024) Bera, Asish
    Plant disease classification using machine learning in a real agricultural field environment is a difficult task. Often, an automated plant disease diagnosis method might fail to capture and interpret discriminatory information due to small variations among leaf sub-categories. Yet, modern Convolutional Neural Networks (CNNs) have achieved decent success in discriminating various plant diseases using leaf images. A few existing methods have applied additional pre-processing modules or sub-networks to tackle this challenge. Sometimes, the feature maps ignore partial information for holistic description by part-mining. A deep CNN that emphasizes integration of partial descriptiveness of leaf regions is proposed in this work. An efficacious attention mechanism is integrated with the high-level feature map of a base CNN for enhancing feature representation. The proposed method focuses on important diseased areas in leaves, and employs an attention weighting scheme for utilizing useful neighborhood information. The proposed Attention-based network for Plant Disease Classification (APDC) method has achieved state-of-the-art performance on four public plant datasets containing visual/thermal images. The best top-1 accuracies attained by the proposed APDC are: PlantPathology 97.74%, PaddyCrop 99.62%, PaddyDoctor 99.65%, and PlantVillage 99.97%. These results justify the suitability of the proposed method.
  • Item
    FakeExpose: Uncovering the falsity of news by targeting the multimodality via transfer learning
    (Taru Publications, 2023-08) Chauhan, Gajendra Singh; Sharma, Yashvardhan
    Using social media for news has its own pros and cons. There are several reasons why people look for and read news through internet media. On the one hand, it is easier to access, and on the other, social media’s dynamic content and misinformation pose serious problems for both government and public institutions. Several studies have been conducted in the past to classify online reviews and their textual content. The current paper suggests a multimodal strategy for the fake news detection (FND) task that covers both text and image. The suggested model (FakeExpose) is created to automatically learn a variety of discriminative features, instead of relying on manually created features. Several pre-trained word and image embedding models, such as DistilRoBERTa and Vision Transformers (ViTs), are used and fine-tuned for the best feature extraction and the various word dependencies. Data augmentation is used to address the issue of pre-trained textual feature extractors processing a maximum of 512 tokens at a time. The accuracy of the presented model on PolitiFact and GossipCop is 91.35 percent and 98.59 percent, respectively, based on current standards. To our knowledge, this is the first attempt to use the FakeNewsNet repository to reach the maximum multimodal accuracy. The results show that combining text and image data improves accuracy when compared to utilizing only text or images (unimodal). Moreover, the outcomes imply that adding more data has improved the model’s accuracy rather than degraded it.
  • Item
    Privacy and Security Concerns in Generative AI: A Comprehensive Survey
    (IEEE, 2024-03) Chamola, Vinay
    Generative Artificial Intelligence (GAI) has sparked a transformative wave across various domains, including machine learning, healthcare, business, and entertainment, owing to its remarkable ability to generate lifelike data. This comprehensive survey offers a meticulous examination of the privacy and security challenges inherent to GAI. It provides five pivotal perspectives essential for a comprehensive understanding of these intricacies. The paper encompasses discussions on GAI architectures, diverse generative model types, practical applications, and recent advancements within the field. In addition, it highlights current security strategies and proposes sustainable solutions, emphasizing user, developer, institutional, and policymaker involvement.
  • Item
    A novel end-to-end deep convolutional neural network based skin lesion classification framework
    (Elsevier, 2024-07) Chamola, Vinay
    Skin diseases are reported to contribute 1.79% of the global burden of disease. The accurate diagnosis of specific skin diseases is known to be a challenging task due, in part, to variations in skin tone, texture, body hair, etc. Classification of skin lesions using machine learning is a demanding task, due to the varying shapes, sizes, colors, and vague boundaries of some lesions. The use of deep learning for the classification of skin lesion images has been shown to help diagnose the disease at its early stages. Recent studies have demonstrated that these models perform well in skin detection tasks, with high accuracy and efficiency.
  • Item
    Evolutionary computation-based self-supervised learning for image processing: a big data-driven approach to feature extraction and fusion for multispectral object detection
    (Springer, 2024-09) Chamola, Vinay
    Image object recognition and detection technologies are widely used in many scenarios. In recent years, big data has become increasingly abundant, and big data-driven artificial intelligence models have attracted more and more attention. Evolutionary computation has also provided a powerful driving force for the optimization and improvement of deep learning models. In this paper, we propose an image object detection method based on self-supervised and data-driven learning. Unlike other methods, our approach stands out due to its innovative use of multispectral data fusion and evolutionary computation for model optimization. Specifically, our method uniquely combines visible light images and infrared images to detect and identify image targets. Firstly, we utilize a self-supervised learning method and the AutoEncoder model to perform high-dimensional feature extraction on the two types of images. Secondly, we fuse the extracted features from the visible light and infrared images to detect and identify objects. Thirdly, we introduce a model parameter optimization method using evolutionary learning algorithms to enhance model performance. Validation on public datasets shows that our method achieves comparable or superior performance to existing methods.
  • Item
    Integrating deep learning for visual question answering in Agricultural Disease Diagnostics: Case Study of Wheat Rust
    (Springer Nature, 2024) Chamola, Vinay; Narang, Pratik; Rallapall, Srinivas
    This paper presents a novel approach to agricultural disease diagnostics through the integration of Deep Learning (DL) techniques with Visual Question Answering (VQA) systems, specifically targeting the detection of wheat rust. Wheat rust is a pervasive and destructive disease that significantly impacts wheat production worldwide. Traditional diagnostic methods often require expert knowledge and time-consuming processes, making rapid and accurate detection challenging. We created a new dataset, WheatRustDL2024 (7998 images of healthy and infected leaves), specifically designed for VQA in the context of wheat rust detection and utilized it to retrieve the initial weights on the federated learning server. This dataset comprises high-resolution images of wheat plants, annotated with detailed questions and answers pertaining to the presence, type, and severity of rust infections. Our dataset also contains images collected from various sources and covers a wide range of conditions (different lighting, obstructions in the image, etc.) in which a wheat image may be taken, thereby enabling a generalized, universally applicable model. The trained model was federated using Flower. Following extensive analysis, the chosen central model was ResNet. Our fine-tuned ResNet achieved an accuracy of 97.69% on the existing data. We also implemented the BLIP (Bootstrapping Language-Image Pre-training) methods that enable the model to understand complex visual and textual inputs, thereby improving the accuracy and relevance of the generated answers. The dual attention mechanism, combined with BLIP techniques, allows the model to simultaneously focus on relevant image regions and pertinent parts of the questions. We also created a custom dataset (WheatRustVQA) from our augmented dataset, containing 1800 augmented images and their associated question-answer pairs. The model produces answers with an average BLEU score of 0.6235 on our testing partition of the dataset. This federated model is lightweight and can be seamlessly integrated into mobile phones, drones, etc. without any hardware requirement. Our results indicate that integrating deep learning with VQA for agricultural disease diagnostics not only accelerates the detection process but also reduces dependency on human experts, making it a valuable tool for farmers and agricultural professionals. This approach holds promise for broader applications in plant pathology and precision agriculture and can consequently address food security issues.
  • Item
    Transformer-based time series prediction of the maximum power point for solar photovoltaic cells
    (Wiley, 2022-06) Bansal, Hari Om; Gautam, Aditya R.
    This paper proposes an improved deep learning-based maximum power point tracking (MPPT) method for solar photovoltaic cells considering various time series-based environmental inputs. Generally, artificial neural network-based MPPT algorithms use basic neural network architectures and inputs which do not represent the ambient conditions in a comprehensive manner. In this article, the ambient conditions of a location are represented through a comprehensive set of environmental features. Furthermore, the inclusion of time-based features in the input data is considered to model cyclic temporal patterns within the atmospheric conditions, leading to robust modeling of the MPPT algorithm. A transformer-based deep learning architecture is trained as a time series prediction model using multidimensional time series input features. The model is trained on a dataset containing typical meteorological-year data points of ambient weather conditions from 50 locations. The attention mechanism in the transformer modules allows the model to learn temporal patterns in the data efficiently. The proposed model achieves a 0.47% mean average percentage error of prediction on non-zero operating voltage points in a test dataset consisting of data collected over a period of 200 consecutive hours, resulting in an average power efficiency of 99.54% and a peak power efficiency of 99.98%. The proposed model is validated through real-time simulations. The proposed model performs power point tracking in a robust, dynamic, and nonlatent manner, over a wide range of atmospheric conditions.
  • Item
    Verification of Hardware Resource Utilization through High Level Synthesis for FPGA Implementation
    (IEEE, 2023) Asati, Abhijit; Shenoy, Meetha V.
    Recently, there has been a sharp rise in demand for hardware implementations because of the improved accuracy of Convolutional Neural Networks (CNNs) on a wide range of classification and recognition applications. To achieve the needed performance, they involve heavy processor operations and memory bandwidth. For optimized hardware deployment, which necessitates thorough optimization of system architectures and algorithms to get particularly efficient designs, a target system’s hardware resources and an estimation of its performance at a greater degree of abstraction are crucial. Since the programmable hardware fabric may be customized for each unique network, Field Programmable Gate Arrays (FPGAs) can accomplish this efficiency in this situation. This paper shows the high-level synthesis (HLS) of each of the different layers of an optimized CNN using the MATLAB HDL Coder. Along with its HDL resource utilization report, we also investigated the computational processes and hardware resource estimation of the previously developed optimized CNN. The hardware resources required by all the convolutional and fully connected layers of the optimized CNN match exactly with the previously calculated resources. So, the hardware resource utilization is verified through HLS. The architecture takes fixed-point math into account. All layers are synthesized in Vivado 2022.2 with the Zynq UltraScale+ MPSoC ZCU104 Evaluation Kit as the target.
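
The last abstract above notes that the architecture takes fixed-point math into account. As a rough illustration of what that can mean for a single CNN multiply-accumulate, here is a minimal Python sketch. It is not taken from the paper: the 16-bit word with 8 fractional bits, the saturation behavior, and the function names `to_fixed`, `to_float` and `fixed_dot` are all assumptions chosen for illustration, and the paper's own flow uses MATLAB HDL Coder rather than Python.

```python
# Illustrative fixed-point arithmetic sketch (assumed format: 16-bit signed
# word with 8 fractional bits; not the format used in the paper).

FRAC_BITS = 8    # assumed number of fractional bits
TOTAL_BITS = 16  # assumed word length


def saturate(q: int, total_bits: int = TOTAL_BITS) -> int:
    """Clamp an integer to the signed range of a total_bits-wide word."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, q))


def to_fixed(x: float, frac_bits: int = FRAC_BITS) -> int:
    """Quantize a float to a saturated signed fixed-point integer."""
    return saturate(round(x * (1 << frac_bits)))


def to_float(q: int, frac_bits: int = FRAC_BITS) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_dot(weights: list[int], inputs: list[int],
              frac_bits: int = FRAC_BITS) -> int:
    """Multiply-accumulate in a wide accumulator, then rescale once.

    Each product carries 2 * frac_bits fractional bits, so the sum is
    shifted right by frac_bits before being saturated to the word length.
    """
    acc = sum(w * x for w, x in zip(weights, inputs))
    return saturate(acc >> frac_bits)


if __name__ == "__main__":
    w = [to_fixed(v) for v in (0.5, -0.25)]
    x = [to_fixed(v) for v in (2.0, 4.0)]
    print(to_float(fixed_dot(w, x)))
```

Accumulating in a wide integer and rescaling once at the end mirrors a common hardware pattern, where the accumulator register is wider than the operands so that intermediate sums do not overflow.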