A New Design Based-SVM of the CNN Classifier Architecture with Dropout for Offline Arabic Handwritten Recognition
Procedia Computer Science, Volume 80, 2016, Pages 1712–1723
ICCS 2016. The International Conference on Computational Science

Mohamed Elleuch1, Rania Maalej2 and Monji Kherallah3
1 National School of Computer Science (ENSI), University of Manouba, TUNISIA.
2 National School of Engineers (ENIS), University of Sfax, TUNISIA.
3 Faculty of Sciences, University of Sfax, TUNISIA.
mohamed.elleuch.2015@ieee.org, rania.mlj@gmail.com, monji.kherallah@gmail.com

Abstract
In this paper we explore a new model that integrates two classifiers, a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM), for offline Arabic handwriting recognition (OAHR), and to which the dropout technique is applied. The proposed system replaces the trainable classifier of the CNN with an SVM classifier. The convolutional network serves to extract feature information, and the SVM functions as the recognizer. We found that this model both automatically extracts features from the raw images and performs classification. Additionally, we protect our model against over-fitting by applying dropout. In this work, recognition of handwritten Arabic characters was evaluated; the training and test sets were taken from the HACDB and IFN/ENIT databases. Simulation results showed that the new SVM-based design of the CNN classifier architecture with dropout significantly outperforms both the CNN-based SVM model without dropout and the standard CNN classifier. The performance of our model is compared with character recognition accuracies reported for state-of-the-art Arabic Optical Character Recognition systems, with favorable results.
Keywords: CNN, dropout, Arabic handwritten recognition, over-fitting, based-SVM, features, HACDB

Selection and peer-review under responsibility of the Scientific Programme Committee of ICCS 2016. © The Authors. Published by Elsevier B.V. doi:10.1016/j.procs.2016.05.512

1 Introduction and Related Works
During the last two decades, offline and online data classification based on signal processing and pattern recognition has attracted considerable attention. As a result, it has been applied extensively to a variety of research domains, such as visual recognition tasks [1, 2], Automatic Speech Recognition (ASR) [3] and EEG signal classification [4]. Lately, handwriting recognition has become a popular research area owing to technological advances such as handwriting capture devices and powerful mobile computers. Because it is a challenging topic within handwriting recognition, Arabic handwritten script recognition has been studied in depth for a couple of decades by researchers using a range of algorithms, such as Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), Hidden Markov Model (HMM), Deep Neural Networks (DNN), Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). The outcomes were varied and satisfactory. These machine learning (ML) systems have demonstrated their reliability and performance in a wide range of applications, and have been successful in optical character recognition (OCR) for Latin and Asian languages [5, 6]. The major drawback of these architectures is their large number of parameters, which makes them prone to over-fitting. For offline Arabic handwriting, our research has focused mainly on the recognition aspects. Because of differences in shapes, concavities, curvatures, and strokes, handwritten characters and overlapping characters vary greatly.
For this reason, we gave special attention to the recognition of intricate Arabic handwritten text. Following the work in [7], an architecture based on a CNN and an SVM classifier was investigated for the handwritten Arabic domain [8]. In this study, dropout is additionally applied to prevent our architecture from over-fitting and to improve its performance. This technique consists of temporarily removing units from the network; the units to remove are selected at random, and only during the training stage [9]. This architecture combines the advantages of the two approaches described below. Developed by LeCun et al. [10], the CNN, which is a hierarchical neural network, possesses huge representational capacity