Advancements in Optical Character Recognition (OCR) for India Scripts: A Review
Jagin M. Patel, Bharat C. Patel, Manish M. Kayasth
Section: Review Paper, Product Type: Journal-Paper
Vol.7, Issue.1, pp.23-29, Feb-2019
Online published on Feb 28, 2019
Copyright © Jagin M. Patel, Bharat C. Patel, Manish M. Kayasth. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: Jagin M. Patel, Bharat C. Patel, Manish M. Kayasth, “Advancements in Optical Character Recognition (OCR) for India Scripts: A Review,” International Journal of Scientific Research in Computer Science and Engineering, Vol.7, Issue.1, pp.23-29, 2019.
MLA Style Citation: Jagin M. Patel, Bharat C. Patel, Manish M. Kayasth. "Advancements in Optical Character Recognition (OCR) for India Scripts: A Review." International Journal of Scientific Research in Computer Science and Engineering 7.1 (2019): 23-29.
APA Style Citation: Jagin M. Patel, Bharat C. Patel, Manish M. Kayasth (2019). Advancements in Optical Character Recognition (OCR) for India Scripts: A Review. International Journal of Scientific Research in Computer Science and Engineering, 7(1), 23-29.
BibTex Style Citation:
@article{Patel_2019,
author = {Jagin M. Patel and Bharat C. Patel and Manish M. Kayasth},
title = {Advancements in Optical Character Recognition (OCR) for India Scripts: A Review},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {2 2019},
volume = {7},
number = {1},
month = {2},
year = {2019},
issn = {2347-2693},
pages = {23-29},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=3138},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=3138
TI - Advancements in Optical Character Recognition (OCR) for India Scripts: A Review
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Jagin M. Patel
AU - Bharat C. Patel
AU - Manish M. Kayasth
PY - 2019
DA - 2019/02/28
PB - IJCSE, Indore, INDIA
SP - 23
EP - 29
IS - 1
VL - 7
SN - 2347-2693
ER -
Abstract :
Optical character recognition (OCR) has been studied extensively, and many publications on it have appeared over the past few decades. India is a linguistically diverse country with numerous scripts, including Devanagari, Bengali, Tamil, Gujarati, and many more. Many commercial OCR systems are now on the market, but the majority of them support languages such as English, Chinese, and Japanese. Although India has many major scripts, few studies address character recognition for Indian languages. This review discusses recent research and developments in OCR techniques for Indian scripts, focusing on both printed and handwritten text recognition.
Key-Words / Index Term :
OCR survey, Indian script OCR