Fault Diagnosis of Helical Gear Box Using Vibration Signals through J-48 Graft Algorithm and Wavelet Features

Nikhil Pawar1, V. Sugumaran1, Ameet Singh1 and M. Amarnath2
1School of Mechanical and Building Science (SMBS), VIT University, Chennai Campus, Chennai – 600127, Tamil Nadu, India; nikhil.pawar2012@vit.ac.in, v_sugu@yahoo.com, ameet.singh2015@vit.ac.in
2Department of Mechanical Engineering, Indian Institute of Information Technology Design and Manufacturing, Airport Rd, Jabalpur Campus, Khamaria, Jabalpur – 482005, Madhya Pradesh, India; amarnath.cmy@gmail.com

Objectives: In this paper, a machine learning approach based on vibration signals is used for fault diagnosis of the helical gear box, a component that holds a vital position in industry. The approach has three steps, namely feature extraction, feature selection and feature classification. First, feature extraction was carried out using Matrix Laboratory (MATLAB) software. Feature selection was done using the J48 classifier. The nodes with the highest classification accuracy were further tested using the J48 graft classifier, and the results obtained were very promising.
Methods/Analysis: Vibration signals were obtained from the experimental setup of the helical gear box. The recorded signals were then used for feature extraction in MATLAB through different wavelet features. In total, 448 signals were extracted, with each class consisting of 64 signals. The families of wavelets taken into account for fault diagnosis were Haar, Discrete Meyer, Daubechies, Biorthogonal, Reverse Biorthogonal, Coiflet and Symlets (SYM). In wavelet selection, signals were dissected into various frequencies and each was analyzed with appropriate resolution. The J48 classifier was used to carry out the feature selection process, and a decision tree was obtained for the Sym 8 wavelet. The best combination of nodes was visualized and further feature classification was done on these nodes.
Findings: Feature classification was carried out by the J48 graft algorithm. Using the grafting technique, the classifier achieved the highest accuracy for pruned data with 10-fold cross validation. It gave maximum accuracy for pruned data (40%) and the results were satisfactory.
Novelty/Improvements: The J48 graft algorithm uses grafting to infer from previous decision trees. This helps in reducing prediction errors.


Introduction
In recent years, condition monitoring of helical gear boxes has gained popularity. A significant number of failures occur due to localized defects. Fatigue cracking occurs due to constant cyclic contact stressing, when a major portion of the surface is displaced during operation, leading to localized defects at an initial stage. A helical gear box can operate under a variety of speed and load conditions, making it difficult to measure and delineate local defects. Nowadays, fault detection and diagnosis are carried out using physical parameters such as vibration, acoustic emission and wear debris. Fault classification techniques [1] have been utilized in a vast array of pattern recognition applications, including vibration monitoring, and an earlier model study covered numerous physical defects under various conditions of velocity and load [2]. The vibration signals [3] collected hold crucial data regarding the working devices. Vibration analysis [4] was initially used to detect faults and measure their severity.
Many important fields such as biology, ultrasound and thermal imaging [5] rely on wavelet analysis [6,7]. One study showcases mother wavelet selection techniques with weightage on quantitative approaches [8]. Various mother wavelets, namely Haar, Daubechies, Coiflet and SYM wavelets, have been studied for fault detection [9]. That research proposed the Coiflet wavelet as the optimal mother wavelet, owing to its lower sum of coefficients for all values of fault resistance. In the present study, however, SYM wavelets were clearly the better option due to their higher accuracy. The accuracy of wavelet-derived features for detecting gear box faults has been assessed using an Artificial Neural Network (ANN) and a Proximal Support Vector Machine (PSVM) [10]. In that work, the J48 algorithm was used for classification of the statistical features of Morlet wavelet coefficients, and the predominant features were fed as input for training. The relative accuracy in classifying the faults in the bevel gear box was compared using ANN and PSVM. ANN and PSVM had high average classification accuracies of 97.5% and 97% respectively. However, PSVM had an edge over ANN due to the shorter time required for training. A similar concept is used in this paper, with the analysis done on a helical gearbox.
Data classification using the Naïve Bayes and J48 classification algorithms has also been reported [11]. The Naïve Bayes algorithm is based on probability and the J48 algorithm is based on a decision tree [12]. A comparative evaluation of the Naïve Bayes [13] and J48 classifiers was done in that paper. The best fit tree is selected from the J48 classifier to carry out further analysis. The effectiveness of the J48 graft classifier is evident from the work presented. In another work, a comprehensive analysis of various classifiers (both pruned and unpruned) was implemented on a spam base dataset using Weka [14,15] software, and the results were compared based on evaluation criteria. The Naïve Bayes algorithm [16] was previously used for carrying out fault diagnosis. The J48 graft classifier was more complex due to its large tree size and high number of leaves. However, the complexity did not affect its performance, as it had a higher classification accuracy. The only drawback presented was the longer time taken by the classifier compared with the J48 and simple cart classifiers. This paper likewise uses vibration signals to conduct the fault diagnosis [17]. The performance of the J48 graft algorithm has been examined for the classification of bank direct marketing data [18]. Although its accuracy was not as high as that of the SVM algorithm, the J48 graft classifier still presented a decent output. Among the many tree classifiers, namely least absolute deviation tree, Naïve Bayes tree, random forest tree, best fit tree, simple cart tree and reduced error pruning tree, the J48 graft tree boasted the highest classification accuracy [19,20] for the fault diagnosis process. The application of the J48 graft classifier can be extended to the fields of medicine and biology. The J48 graft algorithm can be used for the diagnosis of diabetes [21]. The J48 algorithm has been used to generate credit scores, showing that J48 graft has the capacity to facilitate migration from existing data systems toward new concise analytic systems and big data [22]. Comparisons between the J48 graft algorithm [23-25] and other tree classifiers have been studied.
In the present study, feature extraction [26] was done using discrete wavelet features. Feature selection [27] was carried out by the J48 algorithm; the decision tree enables the visualization of the contribution of features for fault diagnosis [28]. Finally, feature classification was done with the J48 graft classifier, and the results obtained were highly accurate among the various tree algorithms used for carrying out fault diagnosis using vibration signals.

Materials and Methods
Figure 1 shows the experimental setup. The setup consists of a 5 HP two-stage helical gearbox. The gear box is driven by a 5.5 HP, 3-phase induction motor with a rated speed of 1440 rpm. The inverter drive controls the speed of the motor, which is currently operated at 80 rpm. The speed of the first stage of the gearbox is 80 rpm. With a step-up ratio of 1:15, the speed of the pinion shaft in the second stage of the gear box is 1200 rpm. Figure 1(a) summarizes the specifications of the test rig. The pinion is connected to a D.C. motor (used as a generator) to generate 2 kW of power, which is dissipated in a resistor bank. Therefore, the actual load on the gearbox is only 2.6 HP, which is 52% of its rated power of 5 HP. In an industrial environment, load utilization varies from 50% to 100%. With a traditional dynamometer, additional torsional vibrations can occur due to torque fluctuations. This is avoided here by using the D.C. motor and resistor bank. Tyre couplings are fitted between the electrical machines and the gear box so that backlash in the system is restricted to the gears. The motor, gear box and generator are mounted on I-beams, which are anchored to a massive foundation. Vibration signals are measured using a Brüel & Kjær accelerometer installed close to the test bearing. Signals are sampled at a sampling frequency of 8.2 kHz. The signals were collected in 8 different classes, each class containing 54 distinct samples.
Seeded fault trials are necessary to study fault detection procedures. Faults can be simulated in the helical gear box by surface grinding, by adding iron particles to the gearbox lubricant, by Electric Discharge Machining (EDM), or simply by overloading the gear box, which tests it under accelerated conditions. The simplest approach is to chip off one edge of a gear tooth, which simulates a partial tooth break [29]. Feature extraction was then carried out on the vibration signals using the Discrete Wavelet Transform (DWT). Out of the numerous wavelets, the symlet wavelet was selected due to its high classification accuracy. Feature selection was done by the J48 classifier, which was used to evaluate the 8 symlet wavelets. Finally, feature classification was carried out on the specific combination of nodes to determine the optimum number of objects for the best classification accuracy [30]. The methodology of the study is given in Figure 2.
Wavelet decomposition was performed on the vibration signals using the DWT. Decomposition yields trends and details. To obtain the next-level trend and detail, the trend from the previous level is decomposed again, so additional levels of detail are obtained by decomposing the preceding trend levels. The length of the signal is 8192 (2^13), so 13 decomposition levels are possible. At each level, the detail coefficients were used to compute the energy content, and the features were defined as the energy content at each level. The feature vector is defined as V = (v1, v2, ..., v13).
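The level-wise energy features described in this section can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the study: it uses the orthonormal Haar wavelet instead of sym8 so that it needs no external library, it takes the energy at each level to be the sum of squared detail coefficients, and the input is a synthetic signal. The function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (trend, detail)."""
    s = 1.0 / math.sqrt(2.0)
    trend = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return trend, detail


def wavelet_energy_features(signal):
    """Decompose the trend repeatedly; return one energy value per level.

    Energy is assumed here to be the sum of squared detail coefficients.
    """
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    features = []
    trend = list(signal)
    while len(trend) > 1:
        trend, detail = haar_step(trend)
        features.append(sum(d * d for d in detail))
    return features


# Synthetic stand-in for one recorded vibration signal of length 8192 (2^13).
N = 8192
signal = [
    math.sin(2 * math.pi * 50 * k / N) + 0.5 * math.sin(2 * math.pi * 900 * k / N)
    for k in range(N)
]

features = wavelet_energy_features(signal)  # plays the role of V = (v1, ..., v13)
print(len(features))  # 13
```

A signal of length 2^13 can be halved 13 times, which is why the feature vector has 13 entries. With a library such as PyWavelets, the same loop could be run with the sym8 filter instead of Haar.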

Wavelet Selection
Fifty-four distinct wavelets were considered using the mother wavelet selection technique. Among these, sym 2, sym 3, sym 4, sym 5, sym 6, sym 7 and sym 8 were studied, and a comparative analysis was carried out using the J48 algorithm. Sym 8 was selected based on the highest accuracy obtained, as shown in Figure 8.

Feature Selection
A decision tree is used to understand the relationship between a dependent attribute (variable) and the values of the independent (input) attributes (variables). This concept aids modeling and knowledge extraction from the bulk data available [31]. The J48 algorithm was used to carry out the feature selection process. It creates a binary tree, which is useful in the classification problem. Once the tree is built, it is applied to each ordered list (tuple) in the database and results in a classification for that list. Sym 8 wavelet data was used, classification was carried out using the J48 tree algorithm, and the resulting decision tree was visualized.
Feature selection was done for all 13 features. Further tests took place using the J48 algorithm to identify the sequence of nodes and the combination of nodes with the best accuracy. The nodes v3+v10+v1+v6+v5+v4+v2 had an accuracy of 89.2857%, the maximum among the 13 combinations. Figure 10 compares the accuracies obtained by the different combinations of nodes and helps identify the combination with the highest accuracy.

Feature Classification
A grafted decision tree is generated by the J48 graft algorithm from a J48 tree. Grafting is a post-process technique that adds nodes to inferred decision trees with the purpose of reducing prediction errors. This process only allows branching that avoids the introduction of classification errors into data that have previously been correctly classified. As a result, rather than introducing errors, the grafting technique eliminates them.
For evaluation purposes the following terms have been used: True Positive (TP) for correctly identified, True Negative (TN) for correctly rejected, False Positive (FP) for incorrectly identified, Precision, Recall, F-Measure and Accuracy. Recall is also referred to as the True Positive Rate or Sensitivity, precision is the Positive Predictive Value (PPV), and the True Negative Rate is the Specificity. F-measure is a measure of a test's accuracy; it reaches its best value at 1 and its worst at 0.
The v3+v10+v1+v6+v5+v4+v2 nodes were tested using the J48 graft classifier, varying the minimum number of objects to pinpoint the value that gives the highest accuracy. The graph depicts a uniform trend in accuracy as the number of objects increases. The highest accuracy of 90.17% is obtained initially, with the number of objects equal to 1. This is followed by a steady decline over the next ten objects, where accuracy values fall as low as 86.60%. There is a minor increase in accuracy after the 12th object, with the highest accuracy during this period being 87.50%. Finally, there is a steep decline after the 20th object, where the accuracy decreases non-uniformly and rapidly.
The summary of stratified cross validation obtained from the confusion matrix is given below:
Total number of instances: 448
Correctly classified instances: 404 (90.17%)
Incorrectly classified instances: 44 (9.82%)
The detailed class-wise accuracy of the J48 graft algorithm is presented in Figure 3. The TP rate stands for true positive rate and its value should be close to 1, and the FP rate stands for false positive rate and its value should be close to 0 for better classification accuracy, which is confirmed in the paper [34].
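To make the stratified cross validation mentioned in this section concrete, the following is a minimal sketch of a stratified 10-fold split for 448 instances in 7 classes of 64. It is an illustration under assumptions, not the procedure Weka applies internally: the assignment is a deterministic round-robin without shuffling, and the function name and label strings are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign instance indices to k folds so every class is spread evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # rotate the starting fold so leftovers do not pile up in fold 0
    for label in sorted(by_class):
        for pos, idx in enumerate(by_class[label]):
            folds[(pos + offset) % k].append(idx)
        offset += len(by_class[label])
    return folds


# Seven conditions with 64 samples each, as reported in the paper (448 total).
conditions = ["healthy", "10%", "20%", "40%", "60%", "80%", "100%"]
labels = [c for c in conditions for _ in range(64)]

folds = stratified_folds(labels, k=10)

# Each fold serves once as the test set; the other nine form the training set.
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    assert len(train_idx) + len(test_idx) == len(labels)

print([len(f) for f in folds])
```

Because 64 is not divisible by 10, each fold receives 6 or 7 samples of every class, so the class proportions in every test fold stay close to those of the full data set.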

Results and Discussion
Vibration signals from a helical gearbox were recorded. Fifty-four discrete wavelets were obtained and used to carry out feature extraction. The signals were divided into 8 distinct groups (symlets), each symlet containing 54 different wavelet samples. These were tested in the Weka software using the J48 classifier algorithm. The group with the highest accuracy was visualized using the decision tree. The nodes of the tree were evaluated separately to determine their individual accuracies. Upon deriving the nodes with the maximum accuracy, the J48 graft algorithm helped to identify the optimum number of objects for the given nodes.

Feature Classification
The vibration signals were recorded for normal and abnormal conditions of the helical gear box. In total, 448 samples were collected, of which 64 samples were from the healthy condition. For faulty gears with 10%, 20%, 40%, 60%, 80% and 100% fault, 64 samples from each condition were collected. The wavelet features were treated as attributes and served as inputs to the algorithm. The conditions of the classified data (10% fault, 20% fault, 40% fault, 60% fault, 80% fault, 100% fault and healthy) were the outputs. Although the nodes closer to the root are more significant, all nodes in the tree are given equal importance for feature subset selection in order to maintain simplicity of the code [32,33].
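Why nodes closer to the root are more significant can be seen from the splitting criterion. J48 is Weka's implementation of C4.5, which ranks candidate splits by gain ratio, so the attribute placed at the root is the one that separates the classes best. The sketch below computes the gain ratio of a single threshold split. The function names and the toy data are hypothetical, and the threshold search, pruning and other heuristics of a real implementation are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    gain = entropy(labels) - remainder
    split_info = entropy(["left"] * len(left) + ["right"] * len(right))
    return gain / split_info


# Toy data (hypothetical): one informative feature and one uninformative one.
labels = ["healthy", "healthy", "healthy", "faulty", "faulty", "faulty"]
informative = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
uninformative = [0.1, 0.9, 0.2, 0.8, 0.3, 1.0]

print(gain_ratio(informative, labels, 0.5))
print(gain_ratio(uninformative, labels, 0.5))
```

The informative feature separates the two classes perfectly and so scores far higher than the uninformative one. A tree built on such data would place it at the root.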
The interpretation of the confusion matrix (Figure 12) is as follows. The diagonal elements of the confusion matrix show the number of correctly classified instances.
1. In the first row, the first element shows the number of data points that belong to the 'Good' class and were classified by the classifier as 'Good'.
2. In the first row, the fourth element shows the number of data points belonging to the 'Good' class but misclassified as '40% fault'.
3. In the first row, the seventh element shows the number of 'Good' data points misclassified as '100% fault'.
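The reading of the confusion matrix described above, together with the TP rate, FP rate, precision, recall and F-measure used for evaluation, can be sketched as follows. The 3-class matrix is a made-up example, not the matrix of Figure 12, and the function and class names are hypothetical. Rows are taken as actual classes and columns as predicted classes.

```python
def per_class_metrics(matrix, class_names):
    """Compute TP rate (recall), FP rate, precision and F-measure per class."""
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for c, name in enumerate(class_names):
        tp = matrix[c][c]                          # diagonal: correctly classified
        fn = sum(matrix[c]) - tp                   # rest of the row: missed
        fp = sum(row[c] for row in matrix) - tp    # rest of the column: false alarms
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        fp_rate = fp / (fp + tn) if fp + tn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = {
            "tp_rate": recall,
            "fp_rate": fp_rate,
            "precision": precision,
            "f_measure": f_measure,
        }
    return metrics


def accuracy(matrix):
    """Fraction of instances on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


# Illustrative matrix only: rows = actual class, columns = predicted class.
class_names = ["Good", "Fault A", "Fault B"]
matrix = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]

metrics = per_class_metrics(matrix, class_names)
print(accuracy(matrix))
print(metrics["Good"])
```

In this toy matrix, 8 of the 10 'Good' instances lie on the diagonal, so the TP rate for 'Good' is 0.8. Two instances from other classes were predicted as 'Good', giving an FP rate of 2/20 = 0.1.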

Conclusion
Gears are important machine elements which undergo constant wear. This paper presents an algorithm-based interpretation of vibration signals for fault diagnosis of the helical gear box. DWT was used with different wavelets. The Sym 8 wavelet was selected among these wavelets due to its high accuracy. The J48 decision tree classifier was used to carry out the feature selection process. The decision tree was studied and the sequences of nodes were visualized. Feature classification was carried out by the J48 graft algorithm. Using the grafting technique, the classifier achieved the highest accuracy for pruned data with 10-fold cross validation. It gave maximum accuracy for pruned data (40%) and the results were satisfactory.