Department of Computer Science and Information Systems
Permanent URI for this collection: http://localhost:4000/handle/123456789/1928
7 results
Search Results
Item: Twitter Data Modelling and Provenance Support for Key-Value Pair Databases (Springer, 2021-02) Goyal, Navneet
In Big Data environments, the reliability of data plays an important role in determining the trustworthiness of the outcomes of an analysis. Big data provenance ensures the reliability of data by providing details about the origin and historical paths of data. In recent years, big data applications have increasingly adopted Apache Cassandra for its high availability and linear scalability. In this paper, we present a data provenance framework for Key-Value Pair Databases using the concept of Zero-Information Loss Database (ZILD). A large volume of real-time social media data is fetched from Twitter through live streaming with the help of Twitter Streaming APIs, and then modelled in Apache Cassandra based on a Query-Driven approach. This framework provides efficient provenance capturing support for select, aggregate, update, and historical queries. We evaluate the performance of the proposed framework in terms of provenance capturing and querying capabilities using appropriate query sets.

Item: Big Data and Artificial Intelligence (Springer, 2023) Goyal, Navneet
This book constitutes the proceedings of the 11th International Conference on Big Data and Artificial Intelligence, BDA 2023, held in Delhi, India, during December 7–9, 2023. The 17 full papers presented in this volume were carefully reviewed and selected from 67 submissions.
The papers are organized in the following topical sections: Keynote Lectures, Artificial Intelligence in Healthcare, Large Language Models, Data Analytics for Low Resource Domains, Artificial Intelligence for Innovative Applications, and Potpourri.

Item: A Survey and Experimental Review on Data Distribution Strategies for Parallel Spatial Clustering Algorithms (Springer, 2024-06) Challa, Jagat Sesh; Balasubramaniam, Sundar; Goyal, Navneet; Goyal, Poonam
The advent of Big Data has led to rapid growth in the usage of parallel clustering algorithms that work over distributed computing frameworks such as MPI, MapReduce, and Spark. An important step for any parallel clustering algorithm is the distribution of data amongst the cluster nodes. This step governs the methodology and performance of the entire algorithm. Researchers typically use a random or a spatial/geometric distribution strategy, such as kd-tree-based partitioning or grid-based partitioning, as per the requirements of the algorithm. However, these strategies are generic and are not tailor-made for any specific parallel clustering algorithm. In this paper, we give a comprehensive literature survey of MPI-based parallel clustering algorithms with special reference to the specific data distribution strategies they employ. We also propose three new data distribution strategies, namely Parameterized Dimensional Split for parallel density-based clustering algorithms like DBSCAN and OPTICS; Cell-Based Dimensional Split for dGridSLINK, which is a grid-based hierarchical clustering algorithm that exhibits efficiency for disjoint spatial distribution; and Projection-Based Split, which is a generic distribution strategy. All of these preserve spatial locality, achieve disjoint partitioning, and ensure good data load balancing.
The experimental analysis shows the benefits of using the proposed data distribution strategies for the algorithms they are designed for, based on which we give appropriate recommendations for their usage.

Item: A way forward towards a technology-driven development of industry 4.0 using big data analytics in 5G-enabled IIoT (Wiley, 2021-10) Dua, Amit; Gupta, Shashank
The evolution of the Internet of Things (IoT) has led to the development of the Industrial Internet of Things (IIoT). IIoT is one of the widely applied areas to facilitate people in the manufacturing world. The adoption of IIoT automates sensing, capturing, communicating, and processing in real time. To understand how rapidly IoT and IIoT are growing, this article examines the emergence of 5G-enabled IIoT, current research trends in IIoT, key milestones achieved in IIoT, and IoT applications specific to 5G-enabled IIoT. The paper presents the state of the art in the layered networking framework of IIoT and compares the relationships between cloud computing and edge computing technologies. We also explore the types of security attacks and their preventive measures in IIoT-driven 5G technology, and we highlight the revolution of the IIoT-driven 5G framework, which satisfies the demands of IIoT applications.

Item: A Taxonomy of e-Healthcare Techniques and Solutions: Challenges and Future Directions (CRC Press, 2022) Dua, Amit
Technology has intruded all spheres of our lives, whether it be communication, travel, work, or leisure. Industries have been quick to respond to our growing needs and have explored technological interventions to aid their aid. Healthcare, on the other hand, has been slow in adapting to the evolving technology. With the rapid increase in the world population and people's life expectancy and the uncertainty of global pandemics like COVID-19, there has been a massive shortage of healthcare workers across the world.
It is of utmost importance for technology to come to the aid of the healthcare domain. The purpose of e-healthcare is to improve the quality of patient care, ease access to healthcare, and prepare for the high demand in the healthcare sector that we are witnessing amidst the COVID-19 outbreak in 2020. The research work done in the e-healthcare domain mostly focuses on one or another specific aspect of e-healthcare and fails to provide an overall picture. This survey paper is aimed at providing a broader view of the techniques used in the e-healthcare domain. The survey broadly classifies the e-healthcare techniques into four categories based on an analysis of the existing e-healthcare proposals: machine learning techniques, cloud computing techniques, privacy techniques, and data analytics techniques. It was observed that big data analytics and 5G technology can play a prominent role in shaping the future of e-healthcare. Big data analytics can be used for drawing useful insights from healthcare data, while 5G technology can be used for scaling purposes by meeting ultra-low latency, high density, and high bandwidth requirements. In addition, suggestions for improvement and future research directions in the e-healthcare domain are explored to aid readers' understanding and to motivate future work.

Item: Role of emerging technologies in future IoT-driven Healthcare 4.0 technologies: a survey, current challenges and future directions (Springer, 2021-05) Dua, Amit; Gupta, Shashank
Since its inception, Healthcare 4.0 has empowered the integration of advanced technologies to create and improve the quality of healthcare services. The delivery of healthcare services has come a long way from physical appointments with doctors to remote health monitoring, disease prediction, and surgery-assistive systems.
This advancement has only been possible because of the integration of cutting-edge technologies like Tele-healthcare, software-defined networking and many more, with healthcare systems. In this survey, we have targeted some of the pioneering research works that could contribute significantly to the future development of Healthcare 4.0 systems. We have identified the significant research gaps and presented the current state of the art of healthcare systems, introducing the Healthcare IoT Application and Service Stacks. We have also discussed the latest paradigm of Wireless Body Area Networks, emphasizing its significance and how it can contribute to the development of next-generation healthcare applications using emerging technologies like Machine Learning, Blockchain, Cloud Computing, Internet of Things, Edge/Fog Computing, Tele-healthcare, Big Data Analytics, Software-Defined Networking and many more. We have performed a comparative study of different architectural implementations considering their advantages, shortcomings, and quality-of-service requirements. We emphasize the importance of the different emerging technologies in detail, discussing the opportunities available and their potential to create better healthcare solutions that can provide superior service quality. Finally, we highlight the fundamental need for establishing security and privacy in future healthcare systems. Overall, this survey provides a strong outlook into the development of the future of Healthcare 4.0.

Item: A High Performance Computing Framework for Data Mining (IEEE, 2016) Goyal, Navneet; Goyal, Poonam
Mining large data sets is no longer the prerogative of computer scientists; specialists in a wide variety of domains are performing analytics as a day-to-day activity. Often such analyses are specific to the domain and analysts are required to devise new algorithms or techniques.
For such scenarios, providing a high-level programming environment that delivers high performance on clusters is a challenge. We propose a framework that supports high-level programming using domain abstractions in data mining while delivering scalable performance on commodity clusters, i.e., clusters of multi-core workstations. This framework includes a domain-specific programming language, DWARF, to enable data mining specialists to rapidly prototype algorithms. DWARF is supported by a compiler that automatically parallelizes code by identifying domain-specific patterns and translating them to parallel code that exploits data parallelism and task parallelism. The compiler generates code for a hybrid virtual machine supporting a distributed memory model at the top level and a shared memory model nested within. The code generated by the compiler can be scheduled on commodity clusters. We compare the proposed framework with other frameworks commonly used for data mining on distributed platforms.
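
As a rough illustration of the two-level model described in the abstract above (distributed memory at the top level, shared memory nested within), the following is a minimal Python sketch and not DWARF or its compiler output. All names (`split`, `node_sum_of_squares`, `cluster_sum_of_squares`) are hypothetical, the "nodes" are simulated as list partitions processed one after another in a single process, and only data parallelism is shown.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def chunk_sum_of_squares(chunk):
    """Work done by one thread on its slice of a node's partition."""
    return sum(x * x for x in chunk)


def node_sum_of_squares(partition, threads=2):
    """Inner level (assumed stand-in for shared memory): threads share one partition."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        partials = pool.map(chunk_sum_of_squares, split(partition, threads))
        return sum(partials)


def cluster_sum_of_squares(data, nodes=3, threads=2):
    """Outer level (assumed stand-in for distributed memory): one partition per node.

    The simulated nodes run sequentially here; a real cluster would run them
    concurrently and combine the per-node results with a reduction.
    """
    return sum(node_sum_of_squares(part, threads) for part in split(data, nodes))


if __name__ == "__main__":
    print(cluster_sum_of_squares(list(range(10))))
```

The sketch keeps the partitioning step separate from the per-node computation, so the same `split` helper serves both levels; on a real cluster the outer level would typically be replaced by message passing between processes.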