Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/xmlui/handle/123456789/8116
Title: Optical Character Recognition for Sanskrit Using Convolution Neural Networks
Authors: Goyal, Navneet
Keywords: Computer Science
Devanagari Script
Sanskrit
Hindi
Deep Learning
OCR
Optical character recognition
Issue Date: 2018
Publisher: IEEE
Abstract: Ancient Sanskrit manuscripts are a rich source of knowledge about science, mathematics, Hindu mythology, and Indian civilization and culture. It is therefore critical to make these manuscripts easy to access, both to share this knowledge with the world and to facilitate further research on this ancient literature. In this paper, we propose a Convolutional Neural Network (CNN) based Optical Character Recognition (OCR) system that accurately digitizes ancient Sanskrit manuscripts (Devanagari script), including those that are not in good condition. We use an image segmentation algorithm that calculates pixel intensities to identify letters in the image. The OCR treats typical compound characters (half-letter combinations) as separate classes in order to improve segmentation accuracy. The novelty of the OCR is its robustness to image quality, image contrast, font style, and font size, which makes it an ideal choice for digitizing soiled and poorly maintained Sanskrit manuscripts.
URI: https://ieeexplore.ieee.org/document/8395237/authors#authors
http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8116
Appears in Collections: Department of Computer Science and Information Systems

Files in This Item:
There are no files associated with this item.