A study of deep learning techniques for handwritten digit recognition and classification
Abstract
As computers play an increasingly vital role in daily life across many domains, people need natural and effective ways to interact with them, and a reliable method for recognizing handwritten digits is therefore essential. Handwritten Digit Recognition (HDR) addresses this need. In recent years, Deep Learning (DL) has become a powerful tool for solving a wide range of problems with high accuracy. This paper first surveys the HDR methods developed by various researchers, including machine learning approaches based on supervised, unsupervised, and reinforcement learning. Next, it reviews applications of deep learning methods to different languages in real-world scenarios. DL techniques are designed to handle complex data formats, and many nature-inspired Convolutional Neural Network (CNN) models are discussed in this part. Lastly, the paper discusses the classification techniques used for handwritten digits, which could provide useful references for researchers who want to experiment further in this field.
DOI: https://doi.org/10.32629/jai.v7i5.1585
Copyright (c) 2024 J. Deepika, Abirami Ravi, K. Chitra, T Senthil
License URL: https://creativecommons.org/licenses/by-nc/4.0/