BITS Faculty Publications

Permanent URI for this community: http://localhost:4000/handle/123456789/1867

Lightweight convolutional neural network architecture implementation using TensorFlow lite
(Springer, 2023-06) Asati, Abhijit
As the precision of convolutional neural networks (CNNs) has improved across a wide variety of classification and recognition tasks, the demand for their deployment has increased dramatically, with a growing focus on lightweight, fast, and low-power implementations. In this paper, we implement a CNN model on an embedded platform, the ‘Raspberry Pi 4-Model B edge computing system (RP4-BECS)’. The CNN model was first trained and verified in MATLAB and then implemented in the Machine Learning (ML) framework to generate a TensorFlow Lite (TF-Lite) flat buffer format. This implementation yields smaller models with good prediction accuracy and shorter inference time than those reported in the available literature. We ran three trials for each digit from 0 to 9 to evaluate the average prediction accuracy and average inference time. An average prediction accuracy of 99.32% and an average inference time of 22.53 ms are achieved for the Sign Language Digits Database (SLDD). Further, an average prediction accuracy of 99.09% and an average inference time of 13.28 ms are achieved for the Modified National Institute of Standards and Technology Database (MNIST). The TF-Lite model sizes are reduced to 1.53 MB for SLDD and 148 KB for MNIST. The obtained accuracy, inference time, and model sizes are better than previously published results.
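The evaluation protocol described in this abstract (several trials per digit, then an average prediction accuracy and an average inference time) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: `evaluate`, `dummy_predict`, and `samples` are hypothetical names, and `dummy_predict` is a placeholder for a real model call such as a TF-Lite interpreter invocation.

```python
import time
from statistics import mean


def evaluate(predict, samples, trials=3):
    """Return (accuracy in percent, mean latency in ms).

    `predict` is a hypothetical callable mapping an input to a digit label.
    `samples` is a list of (input, true_label) pairs; each pair is run
    `trials` times, mirroring the "three trials per digit" description.
    """
    correct = 0
    latencies_ms = []
    for x, label in samples:
        for _ in range(trials):
            start = time.perf_counter()
            guess = predict(x)
            # Wall-clock time of a single prediction, in milliseconds.
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
            correct += int(guess == label)
    total = len(samples) * trials
    return 100.0 * correct / total, mean(latencies_ms)


def dummy_predict(x):
    # Placeholder model (assumption): stands in for a real inference call.
    return x % 10


# One toy sample per digit 0..9; a real run would use dataset images.
samples = [(d, d) for d in range(10)]
accuracy, latency_ms = evaluate(dummy_predict, samples)
```

Timing each call with `time.perf_counter` and averaging over all trials keeps the two reported metrics separate: accuracy counts correct predictions, while latency reflects only the time spent inside the prediction call.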
Area-optimal FPGA implementation of the YOLO v2 algorithm using High-Level Synthesis
(IEEE, 2020) Asati, Abhijit; Shekhar, Chandra
Field-programmable gate arrays (FPGAs) have been used as pre-silicon validation platforms in VLSI designs. In this paper, we propose an FPGA-based you-only-look-once (YOLO) v2 object detector implementation that runs faster, achieves higher accuracy, and requires fewer resources than the alternatives. The detector is built on a deep convolutional neural network (CNN). We apply high-level synthesis (HLS) to model and optimize the implementation using multiple directives, such as pipelining, loop unrolling, and inlining. The proposed YOLO v2 design is implemented on a Xilinx Zynq xc7z020clg484-1 device, and we run simulations in the xSim simulator to test its functionality. The proposed implementation not only runs faster but also uses an order of magnitude fewer resources than implementations available in the literature.
