BITS Faculty Publications

Permanent URI for this community: http://localhost:4000/handle/123456789/1867

  • Item
    Comparative Study of Convolutional Neural Network Object Detection Algorithms for Image Processing
    (IEEE, 2023) Singh, Navin
    This paper presents a comparative study of three Convolutional Neural Network (CNN) object detection algorithms to find the best detector in terms of combined speed and accuracy on a personal computer. The MATLAB® development environment is used to evaluate the three detectors: Faster Region-Based Convolutional Neural Network (Faster R-CNN), Single Shot Detector (SSD) and You Only Look Once (YOLO). The algorithms are trained and their performance metrics are tested on a small sample dataset. The results show that the SSD detector performs best when both accuracy and processing speed are considered. Faster R-CNN detected objects in an average of 4.838 seconds and achieved a mean average precision of 0.76 with an average loss of 0.429. SSD detected objects in an average of 0.377 seconds and achieved a mean average precision of 0.92 with an average loss of 1.754. YOLO v3 detected objects in an average of 1.004 seconds and achieved a mean average precision of 0.81 with an average loss of 2.739.
  • Item
    Autonomous Classification and Spatial Location of Objects from Stereoscopic Image Sequences for the Visually Impaired
    (i, 2022) Singh, Navin
    One of the main problems faced by visually impaired individuals is the inability to identify objects, or difficulty in doing so. A visually impaired person usually wears glasses that help to enlarge or focus on nearby objects, but still relies heavily on physical touch to identify an object. Walking on the road or navigating to a specific location is challenging because vision is lost or reduced, which increases the risk of an accident. This paper proposes a simple, portable machine vision system that assists the visually impaired by providing auditory feedback about nearby objects in real time. The proposed system has three main hardware components: a single-board computer, a wireless camera, and an earpiece module. The YOLACT object detection library is used to detect objects in the captured image. The detected objects are converted to an audio signal using the Festival Speech Synthesis System. Experimental results show that the system is efficient and capable of providing audio feedback about detected objects to the visually impaired person in real time.
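
The speed and accuracy figures quoted in the first abstract (Singh, 2023) can be checked with a short script. The sketch below is only an illustration and is not the paper's evaluation method, which was carried out in MATLAB. The dictionary layout, the function name `pareto_front`, and the use of Pareto dominance as the comparison rule are all assumptions made here; only the numbers come from the abstract.

```python
# Average detection time (seconds) and mean average precision (mAP),
# copied from the first abstract. The data layout is an assumption.
reported = {
    "Faster R-CNN": {"seconds": 4.838, "map": 0.76},
    "SSD": {"seconds": 0.377, "map": 0.92},
    "YOLO v3": {"seconds": 1.004, "map": 0.81},
}


def pareto_front(results):
    """Return the detectors that no other detector beats on both criteria.

    A detector is dominated if another one is at least as fast and at
    least as accurate, and strictly better on one of the two.
    This rule is an illustrative choice, not the paper's metric.
    """
    front = []
    for name, r in results.items():
        dominated = any(
            o["seconds"] <= r["seconds"]
            and o["map"] >= r["map"]
            and (o["seconds"] < r["seconds"] or o["map"] > r["map"])
            for other, o in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Rank by detection time alone (fastest first).
by_speed = sorted(reported, key=lambda n: reported[n]["seconds"])

print(pareto_front(reported))  # ['SSD']
print(by_speed)                # ['SSD', 'YOLO v3', 'Faster R-CNN']
```

On these figures SSD is both the fastest and the most accurate of the three, so it dominates the other two outright and no weighting between speed and accuracy is needed to reach the abstract's conclusion. The reported average loss values are not used in this comparison.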