Automatic recognition of handwritten text from images is considered one of the most difficult tasks in pattern recognition. The challenge is to identify the shape or pattern of each handwritten word, because every word takes a different shape depending on the writer's style, which varies from person to person. Urdu is a cursive language that is spoken and written in several regions and is the national language of Pakistan.
All of the above-mentioned algorithms were analyzed and compared to determine which one provides the best and most reliable accuracy. The results obtained from each algorithm are discussed in this paper, which shows the differences in their outputs and accuracy. The deep learning approaches proved more accurate than the conventional techniques.
The Urdu language has a quite complex structure of numbers; therefore, it becomes very difficult to recognize handwritten words. Because every person has a different writing style, detecting a word accurately is a considerable challenge in itself. Urdu script is written from right to left and is cursive.
The background study reveals that very little work has been carried out on Urdu from the perspective of images compared with other languages. Although millions of people speak Urdu, no commendable system has been built for recognizing handwritten Urdu characters from images. For Roman digit recognition, a model built with a support vector machine by Gorgevik et al. provides 97% accuracy.
As discussed above, handwritten Urdu character recognition from images is challenging because of variations in handwriting, and different studies have observed that the Artificial Neural Network (ANN) is widely used for character recognition. An ANN model consists of a set of linked nodes; the connected neurons pass signals and information from one node to another. A difficulty with ANNs is deciding what information to provide to the model, because it takes an input and then trains itself to produce a labeled output. It is also not possible to develop a single system that serves multiple languages, because every language has its own writing style and characters.
Kaur Harpreet et al. presented a methodology based on open mining algorithms and a classification survey.
Urdu handwritten characters were collected from 100 people. Every box for writing a single character had the same dimensions. All participants were asked to write in their own style so that the model could be trained on varied samples of handwritten characters. The images were separated, OpenCV filters were used to remove noise from each image, and some other techniques were applied to obtain high-quality grayscale images for training and testing the model. The dataset contains 4668 images of 50×50 pixels, divided into two parts: 3734 images for training the model and 934 images for testing it.
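The split sizes quoted above can be reproduced with a short sketch. It assumes a shuffled 80/20 split; the helper name `split_indices` and the fixed seed are illustrative assumptions, since the text does not say how the samples were assigned to each part.

```python
import random

def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and test lists.

    Assumption: a random 80/20 split; the paper does not state
    whether the data was shuffled before splitting.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Round the test size to the nearest whole sample.
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(4668)
print(len(train_idx), len(test_idx))  # 3734 934
```

Rounding 20% of 4668 gives 934 test images and leaves 3734 for training, which matches the figures reported for the dataset.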
A Multi-Layer Perceptron (MLP) is a feed-forward deep artificial neural network that is trained on supervised learning problems to minimize error. An MLP is trained with a supervised learning technique known as backpropagation. A multi-layer perceptron contains three layers:
Input layer: used by the network to receive the data.
Hidden layers: used by the network as the computation engine that converts input into output.
Output layer: presents the obtained results.
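The three layers can be illustrated with a minimal forward pass. This is a sketch under stated assumptions, not the network used in the study: the layer sizes, the sigmoid and softmax activations, the random weights and all function names are hypothetical, and backpropagation training is omitted.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_forward(pixels, params):
    # Input layer: the raw pixel values enter the network unchanged.
    # Hidden layer: weighted sums followed by a non-linearity.
    hidden = dense(pixels, params["w_hidden"], params["b_hidden"], sigmoid)
    # Output layer: one score per class, normalised to probabilities.
    scores = dense(hidden, params["w_out"], params["b_out"], lambda z: z)
    return softmax(scores)

# Toy sizes (assumed): 4 input pixels, 3 hidden units, 2 classes.
rng = random.Random(0)
n_in, n_hidden, n_out = 4, 3, 2
params = {
    "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
    "b_hidden": [0.0] * n_hidden,
    "w_out": [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)],
    "b_out": [0.0] * n_out,
}
probs = mlp_forward([0.0, 0.5, 1.0, 0.25], params)
print(probs)
```

For the 50×50 images described in this paper the input layer would receive 2500 pixel values; the hidden-layer width and the number of output classes are not stated in the text.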
Support vector machines (SVMs) are used to analyze data for classification and regression. The SVM is a supervised learning algorithm. An SVM training algorithm builds a model that assigns new examples to one of two categories, which makes the SVM a non-probabilistic binary linear classifier.
In an SVM, the training examples are represented as points in space, mapped so that the examples of the separate categories are divided by a gap that is as wide as possible. New examples are mapped into the same space and assigned to a category according to the side of the gap on which they fall. An SVM can also perform non-linear classification using the kernel trick, which implicitly maps the inputs into high-dimensional feature spaces.
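The kernel trick can be made concrete with a small numerical check. The sketch below uses a degree-2 polynomial kernel, chosen here only for illustration (the text does not say which kernel was used), and shows that evaluating the kernel on the original 2-D inputs gives the same value as a dot product in an explicit 3-D feature space.

```python
import math

def poly_kernel(x, y):
    """Degree-2 homogeneous polynomial kernel: k(x, y) = (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def feature_map(x):
    """Explicit map to a 3-D space whose dot product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x, y = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly_kernel(x, y)                       # computed in 2-D
explicit_value = dot(feature_map(x), feature_map(y))   # computed in 3-D
print(math.isclose(kernel_value, explicit_value))      # True
```

The SVM only ever needs such dot products, so it can work in the higher-dimensional space without constructing the mapped vectors.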
A support vector machine requires labeled data. For unlabeled data, the support vector clustering technique is used instead; it is an unsupervised learning method that groups the data into clusters. It was developed to extend SVMs to unlabeled data and is widely used in industrial applications.
The K-Nearest Neighbors (K-NN) algorithm is used for regression and classification. It is a non-parametric method whose input consists of the k closest training examples in the feature space. The output of K-NN depends on whether it is used for classification or regression.
When K-NN is used for regression, the output is the property value for the object, computed as the average of the values of its k nearest neighbors. When K-NN is used for classification, the output is a class membership.
K-NN is a lazy, or instance-based, learning approach: all computation is deferred until classification is performed. Furthermore, a useful technique for both classification and regression is to assign weights to the neighbors, so that nearer neighbors contribute more to the average than distant ones. The neighbors are taken from a set of objects for which the class (for K-NN classification) or the object property value (for K-NN regression) is known. The K-NN algorithm is sensitive to the local structure of the data.
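The behaviour described above can be sketched in a few lines. The example assumes Euclidean distance and inverse-distance weights; the function names, the toy points and the labels are hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict

def k_nearest(train, query, k):
    """Return the k (distance, target) pairs closest to the query point."""
    scored = [(math.dist(features, query), target) for features, target in train]
    return sorted(scored, key=lambda pair: pair[0])[:k]

def knn_classify(train, query, k=3, weighted=False):
    """Vote among the k nearest neighbours, optionally weighted by 1/distance."""
    votes = defaultdict(float)
    for distance, label in k_nearest(train, query, k):
        votes[label] += 1.0 / (distance + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)

def knn_regress(train, query, k=3):
    """Average the target values of the k nearest neighbours."""
    neighbours = k_nearest(train, query, k)
    return sum(value for _, value in neighbours) / len(neighbours)

# Toy data: one "a" point very close to the query, two "b" points further away.
labelled = [((0.0, 0.0), "a"), ((3.0, 0.0), "b"), ((0.0, 3.0), "b")]
query = (0.1, 0.0)
plain_vote = knn_classify(labelled, query, k=3)                    # "b" wins 2 to 1
weighted_vote = knn_classify(labelled, query, k=3, weighted=True)  # "a" dominates

numeric = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 50.0)]
estimate = knn_regress(numeric, (1.0,), k=3)                       # mean of 1, 2, 3
print(plain_vote, weighted_vote, estimate)
```

No model is built in advance: every call scans the stored training set, which is what makes K-NN a lazy learner.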
A Convolutional Neural Network (CNN) is a class of deep neural networks most commonly used to analyze visual images. Applications of CNNs are found in image classification, image analysis, image recognition, image prediction, recommendation systems, text prediction, video recognition, natural language processing and many more.
CNNs were inspired by biological processes involving the connectivity of neurons. CNNs require less pre-processing than other image classification algorithms, because the network learns the filters that traditional algorithms relied on hand-engineering. A major advantage of CNNs is therefore their independence from the human effort that other algorithms require for feature design. CNNs are also known as space invariant artificial neural networks (SIANN).
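The space invariant behaviour comes from sliding one shared filter over the whole image. The sketch below implements the sliding-window operation in plain Python and applies a hand-written vertical-edge filter; in a trained CNN the filter values would be learned, and the toy images here are assumptions made for illustration.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the 'convolution' used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [[-1, 1],
                 [-1, 1]]

# The same edge at two horizontal positions.
image = [[0, 0, 1, 1, 1]] * 4
shifted = [[0, 0, 0, 1, 1]] * 4

response = conv2d(image, vertical_edge)
shifted_response = conv2d(shifted, vertical_edge)
print(response[0])          # [0, 2, 0, 0]
print(shifted_response[0])  # [0, 0, 2, 0]
```

Moving the edge one pixel to the right moves the filter response one pixel to the right, so the same filter detects the stroke wherever it appears in the character image.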
A Recurrent Neural Network (RNN) is a class of artificial neural networks used for tasks such as handwriting recognition, speech recognition and unsegmented recognition. The connections between its nodes form a directed graph along a temporal sequence. An RNN uses its internal memory to process input sequences. RNNs fall into two classes, finite impulse and infinite impulse: a finite impulse recurrent network is a directed acyclic graph, whereas an infinite impulse recurrent network is a directed cyclic graph. A finite impulse network can be unrolled and replaced with a strictly feedforward neural network, but an infinite impulse network cannot be unrolled.
In addition, both finite impulse and infinite impulse networks can have additional stored states, and the storage can be under direct control of the neural network. The storage can also be replaced by another network or graph if it incorporates time delays or has feedback loops. Such controlled states are referred to as gated states or gated memory, and they are part of Long Short-Term Memory (LSTM) networks and gated recurrent units. Such a network is also called a feedback neural network.
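The role of the internal memory can be shown with a minimal recurrent step. This sketch uses a single scalar hidden state with a tanh activation and arbitrary weights; these choices and the function names are assumptions, and the gates of LSTM or GRU cells are deliberately left out.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent update: the new state depends on the input and the old state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Process a sequence step by step, carrying the hidden state forward."""
    h = 0.0
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

# A single impulse at the first step, followed by zeros.
states = run_rnn([1.0, 0.0, 0.0])
silent = run_rnn([0.0, 0.0, 0.0])
print(states)
```

The state stays above zero after the input has returned to zero, which is the feedback loop acting as memory; with an all-zero input the state never leaves zero.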
The Random Forest algorithm is used for regression and classification. It constructs a multitude of decision trees at training time. For classification it outputs the mode of the classes predicted by the individual trees, and for regression it outputs their mean prediction. Tin Kam Ho created the first algorithm for random decision forests, which applied the stochastic discrimination approach to classification. Leo Breiman and Adele Cutler developed an extension of the algorithm and registered "Random Forests" as a trademark in 2006. The extension builds collections of decision trees with controlled variance.
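How the individual trees are combined can be sketched without building the trees themselves. The example assumes the usual bagging setup, in which each tree is trained on a bootstrap sample of the data; the text does not describe the sampling, and the function names and class labels below are hypothetical.

```python
import random
from collections import Counter
from statistics import mean

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement; each tree would train on one such sample."""
    return [rng.choice(rows) for _ in rows]

def forest_classify(tree_predictions):
    """Classification: return the most common class among the trees' votes."""
    return Counter(tree_predictions).most_common(1)[0][0]

def forest_regress(tree_predictions):
    """Regression: return the mean of the trees' predictions."""
    return mean(tree_predictions)

rng = random.Random(0)
sample = bootstrap_sample(list(range(10)), rng)

# Pretend three trees have already predicted for one input.
label = forest_classify(["alif", "bay", "alif"])
value = forest_regress([2.0, 3.0, 4.0])
print(sample, label, value)
```

Averaging many trees trained on different samples is what reduces the variance of a single decision tree.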
We created the dataset ourselves. Collection started with handwritten Urdu characters written on white paper. The sheets were laid out so that each row contains the same character written by different people, along with one column that serves as the label. Every handwritten character was labeled because supervised machine learning models were trained and tested in this study. To build a large corpus of handwritten characters, the same task was given to the students of our class and then distributed to students throughout the university. The aim was to reach as many students as possible, so that the machine learning models could be trained on a wide variety of handwriting. The
The accuracy of the different machine learning and deep learning algorithms used to recognize Urdu handwritten text from images is shown in
S.no | Algorithms | Accuracy |
---|---|---|
1 | Support Vector Machine (SVM) | 97 % |
2 | K-Nearest Neighbor (K-NN) | 38 % |
3 | Random Forest | 97 % |
4 | Convolutional Neural Network (CNN) | 99 % |
5 | Recurrent Neural Network (RNN) | 80 % |
6 | Multi-Layer Perceptron (MLP) | 98 % |
Table 1 shows that Support Vector Machine and Random Forest both reached 97% accuracy on the dataset of 4668 samples, of which 3734 were used for training and the remaining 934 for testing. Furthermore, the digit recognition results of both techniques on handwritten characters from the image dataset are shown in
K-Nearest Neighbor proved to be the worst algorithm for recognizing handwritten Urdu text from images, attaining an accuracy of only 38%.
The Convolutional Neural Network attained the best accuracy for Urdu handwritten text recognition from images, reaching 99% in Table 1; its recognition result can be seen in
Among all the algorithms in this research, the Multi-Layer Perceptron and the Convolutional Neural Network are the best techniques for Urdu handwritten text recognition from images. These two algorithms gave the best and most reliable results, which suggests that deep learning algorithms recognize text from images more effectively than conventional machine learning algorithms.
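The accuracy values compared in this section are presumably the fraction of test images whose predicted label matches the true label; the text does not define the metric, so the sketch below is an assumption, and its labels are made up for illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical labels for four test images; three are predicted correctly.
y_true = ["alif", "bay", "jeem", "alif"]
y_pred = ["alif", "bay", "alif", "alif"]
score = accuracy(y_true, y_pred)
print(f"{score:.0%}")  # 75%
```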
In this paper, we studied handwritten optical character recognition from images. For this purpose, we implemented six algorithms: Random Forest, Support Vector Machine (SVM), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Multi-Layer Perceptron (MLP) and K-Nearest Neighbor (K-NN). The dataset was divided into 80% for training and 20% for testing.
The results showed that CNN and MLP attained accuracies of 99% and 98% respectively on the same image dataset, which is higher than the 97%, 97%, 80% and 38% obtained by Random Forest, Support Vector Machine, Recurrent Neural Network and K-Nearest Neighbor respectively. These figures show that our study makes a significant contribution to automatic optical character recognition of Urdu phonetics.
The proposed model works well for a given input image; however, because of the variety of writing styles, this work has some limitations in handwritten character recognition. In future we plan to extend the existing techniques to more sophisticated Urdu language phonetics.