Comparative Analysis of DTW based Outlier Segregation Algorithms for Wrist Pulse Analysis

Background/Objectives: Quantification of wrist pulse signals helps to take advantage of the ancient approach of pulse diagnosis. The objective of this paper is to effectively segregate outliers present within wrist pulses. Methods/Statistical Analysis: This work presents a modification of the Dynamic Time Warping (DTW) algorithm. The existing DTW algorithm searches for an optimal path using squared Euclidean distance to measure the local distance between segments. Here, we discuss and integrate different local distance measures, namely Correlation Distance, Manhattan Distance, Kendall's τ Distance and Canberra Distance, with DTW. All the discussed local distance measures were compared with the existing Euclidean based DTW algorithm on the basis of the Similarity Index parameter. Findings: Results showed that the Manhattan Distance and Canberra Distance based DTW algorithms were efficient in optimal path selection and in segregating segments which lose their shape characteristics. In Euclidean based DTW, outlier segregation was difficult as all values lay between 0 and 1. The Correlation Distance and Kendall's τ Distance algorithms were inappropriate for detecting outliers, as their results did not match visual observations. It was noticed that the combination of Manhattan Distance and Canberra Distance based DTW algorithms gave better outlier detection.
Amandeep Bisht1*, Nidhi Garg1, Hardeep S. Ryait2 and Amod Kumar3
1UIET, Panjab University, Chandigarh – 160014, Punjab, India; amandeepbisht@gmail.com, nidhi_garg@pu.ac.in
2BBSBEC Fatehgarh Sahib, Chandigarh – 140407, Punjab, India; hardeepsryait@gmail.com
3Central Scientific Instruments Organisation (CSIR-CSIO), Chandigarh – 160030, Punjab, India; Csioamod@yahoocom


Introduction
In Traditional Indian Medicine, the organ under study is narrowed down by sensing the pulsation under three fingers (index, middle and ring) placed on the radial artery (vata, pitta, kapha) to analyze body variation 1,2. To add consistency and objectivity to this concept, a computerized wrist pulse signal is required which can substitute for the practitioner's judgment and provide reliable results. The pulse basically captures cardiac muscle activity, which is influenced by the characteristics of the blood and vessels; this makes it effective for studying both cardiac and non-cardiac diseases. A segment of wrist pulse comprises three main waves: Percussion, Tidal and Dicrotic 3. Variation in the amplitude and interval of these waves indicates an imbalance in the doshas (vata, pitta, kapha), which are combinations of the panchtattva 4.
The generated pulse sequence is a semi-periodic physiological signal. A frequent problem in biomedical signal processing is noise that enters the signal during acquisition 5. It is necessary to eliminate disturbances in the wrist pulse series so as to maintain the quality and time-frequency information of the signal. Even after noise removal, however, some irregular pulses remain which distort key characteristics of the pulses. Before feature extraction, a practical approach to pulse segmentation was followed, as done by Xia et al. 6, followed by ensemble averaging of segments so that indistinctness in segments could be clearly identified 7. For removal of outliers, the Euclidean distance based Dynamic Time Warping algorithm 8 is used for dissimilar pulse identification and removal, as performed by Bhasker et al. 9. To improve this existing DTW, other similarity measures, namely Correlation Distance, Manhattan Distance, Kendall's τ Distance and Canberra Distance 10,11, were used as the local distance measure to obtain a better outlier removal algorithm using the similarity index. These measures are applied in different applications such as image processing, speech processing, character recognition, pattern recognition and feature selection 12,13. In this paper, we present a comparative analysis of the different outlier algorithms to find the best one for wrist pulse analysis, and validate the results by studying the ensemble average of segments.
This paper is divided into five sections. Section II describes database collection and pre-processing. Section III discusses the outlier removal algorithms, Section IV presents the results with parameters in tabular form, and Section V concludes the paper.

Preprocessing
A pressure sensor is used to collect wrist pulses from the subject, and the tool used for acquisition is LabVIEW® 14,22. A notch filter has been used for removing 50 Hz powerline interference 15, linear detrending for baseline removal 16, and the Daubechies (db) wavelet transform 17 for high frequency noise removal, followed by a band pass filter. All pre-processing has been done in MATLAB. Figures 1a and 1b show the original and denoised pulse signal for Set A, which comprises 11 segments. Ideally, the data acquisition setup would be such that no variation in segments occurs even with movement of the subject, but in practice this is unavoidable: even slight movement by the subject causes variations. We can observe in Figure 1 that the segments are not all similar to each other, which is because of body movements. By using appropriate algorithms, segregation of these segments is achievable.
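The linear detrending step used for baseline removal can be illustrated with a minimal sketch. The paper's pre-processing was done in MATLAB; the Python function below, its name `linear_detrend` and the toy input values are assumptions made purely for illustration. It fits a least-squares straight line to the samples and subtracts it.

```python
def linear_detrend(signal):
    """Subtract the least-squares straight line from a list of samples.

    Hypothetical helper for illustration; not the authors' MATLAB code.
    """
    n = len(signal)
    if n < 2:
        return [0.0] * n
    t_mean = (n - 1) / 2.0
    s_mean = sum(signal) / n
    # Slope of the best-fit line through the (index, sample) pairs.
    num = sum((t - t_mean) * (s - s_mean) for t, s in enumerate(signal))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    intercept = s_mean - slope * t_mean
    return [s - (slope * t + intercept) for t, s in enumerate(signal)]


# Toy pulse-like samples riding on a drifting baseline (made-up values).
drifting = [0.5 * t + (1.0 if t % 4 == 0 else 0.0) for t in range(12)]
flattened = linear_detrend(drifting)
```

After detrending, the residual samples sum to zero, so the slow baseline drift no longer dominates the amplitude of each pulse.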

Outlier Removal Algorithm
Outliers are irregular segments present in the pulse series which mainly occur due to motion artifacts (such as hand movements) 9, as can be visually identified from Figure 1b. Heart rate, which reflects beat-to-beat variation, can be determined from the pulse series. To study different features such as percussion wave amplitude, tidal wave amplitude, dicrotic notch amplitude, dicrotic wave amplitude, their onsets, timing intervals and beat frequency 18,19, these parameters must be computed for each pulse segment or after averaging all the segments. It therefore becomes necessary to eliminate segments which lose their character, as features cannot be extracted from these irrelevant pulses. In the following, we discuss the different outlier segregation algorithms which were implemented to find the best method.

Dynamic Time Warping (DTW)
One advantage of DTW is that it can be applied to segments of uneven length, so no zero padding is needed: DTW calculates the optimal distance for segments of uneven length. It uses Euclidean distance to match samples in one segment to those in the other by nonlinear alignment 20. Its main advantage is that it reduces the distortion effect by allowing stretching (elastic) transformations of the time series, so that shape similarity can be detected across different phases. A warping path is computed in a non-linear fashion by adding the local distance to the global distance (the minimum distance of the adjacent elements) as shown:
$$D(i, j) = d(i, j) + \min\{D(i-1, j),\; D(i, j-1),\; D(i-1, j-1)\}$$
where $d(i, j)$ is the local distance between sample $i$ of one segment and sample $j$ of the other, and $D(i, j)$ is the accumulated (global) distance. If $w_k = (x, y)$ and $w_{k-1} = (x', y')$ are consecutive elements of the warping path, then $x - x' \le 1$ and $y - y' \le 1$. This restricts the allowable steps in the warping path to adjacent cells (including diagonally adjacent cells).
Here the path cost is taken as the similarity parameter. The warping path cost for identical segments is zero, and its value increases with dissimilarity. Different outlier removal algorithms have been discussed and implemented on Set A to find a better solution for outlier segregation.
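The accumulation of local and global distance described in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `dtw_cost`, the pluggable `local_dist` argument and the boundary handling (the path starts at the first samples and ends at the last samples of both segments) are assumptions. The default local distance is the squared difference, matching the squared Euclidean local distance mentioned in the text.

```python
def dtw_cost(p, q, local_dist=lambda a, b: (a - b) ** 2):
    """Return the DTW warping path cost between sequences p and q.

    local_dist is the sample-level local distance; the default is the
    squared difference. Names and defaults are illustrative assumptions.
    """
    n, m = len(p), len(q)
    inf = float("inf")
    # D[i][j] holds the accumulated cost of aligning p[:i] with q[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_dist(p[i - 1], q[j - 1])
            # Local distance plus the cheapest of the three adjacent cells.
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# A segment and a stretched copy of it align with zero cost.
same_shape = dtw_cost([1, 2, 3], [1, 2, 2, 3])
# Swapping in an absolute-difference local distance.
abs_cost = dtw_cost([0, 0, 0], [2, 2, 2], local_dist=lambda a, b: abs(a - b))
```

Passing the local distance as an argument makes it straightforward to compare sample-level measures inside the same warping procedure.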

Euclidean Distance
Euclidean distance is the most extensively used similarity distance measure of all the presented approaches. The distance is calculated as the square root of the sum of squares of the differences between corresponding samples of two segments of the same length. Segments are zero padded to equal length 11,21. The lower the distance, the more similar the pulse segments, while a rise in its value indicates increasing dissimilarity between segments. For segments p and q of length N, with k = 1, 2, 3, ..., N:
$$d_e(p, q) = \sqrt{\sum_{k=1}^{N} (p_k - q_k)^2}$$
where $d_e(p, q)$ is the Euclidean distance. In DTW, the squared Euclidean distance is used as the local distance.
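As a small illustration of the description above, the sketch below zero pads the shorter segment and then takes the square root of the summed squared differences. The helper names `zero_pad` and `euclidean_distance` are hypothetical, and padding at the end of the shorter segment is an assumption, since the text does not say where the zeros are placed.

```python
import math


def zero_pad(p, q):
    """Pad the shorter sequence with trailing zeros (assumed placement)."""
    n = max(len(p), len(q))
    return list(p) + [0.0] * (n - len(p)), list(q) + [0.0] * (n - len(q))


def euclidean_distance(p, q):
    """Square root of the sum of squared sample differences."""
    p, q = zero_pad(p, q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Equal-length case and a case where the first segment is padded.
d_equal = euclidean_distance([0, 0], [3, 4])
d_padded = euclidean_distance([3], [0, 4])
```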

Manhattan Distance
Manhattan distance, also known as City Block distance, is similar to Euclidean distance except that movement between points is not diagonal but only horizontal and vertical, as in a grid based system 11,21. It is given by:
$$d_m(p, q) = \sum_{k=1}^{N} |p_k - q_k|$$
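The Manhattan (City Block) distance is the sum of absolute differences between corresponding samples. A minimal sketch, with the function name `manhattan_distance` and the equal-length requirement chosen here for illustration:

```python
def manhattan_distance(p, q):
    """Sum of absolute differences between corresponding samples."""
    if len(p) != len(q):
        raise ValueError("segments must have the same length")
    return sum(abs(a - b) for a, b in zip(p, q))


# |1-2| + |2-0| + |3-3| = 3
d_city_block = manhattan_distance([1, 2, 3], [2, 0, 3])
```

Because differences are not squared, a single large deviation contributes linearly rather than quadratically to the total.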

Correlation Distance
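It is defined as one minus the correlation coefficient of the segments 11,21. If p and q are two segments with means $\bar{p}$ and $\bar{q}$ respectively, then the correlation distance is:
$$d_{corr}(p, q) = 1 - \frac{\sum_{k=1}^{N} (p_k - \bar{p})(q_k - \bar{q})}{\sqrt{\sum_{k=1}^{N} (p_k - \bar{p})^2}\,\sqrt{\sum_{k=1}^{N} (q_k - \bar{q})^2}}$$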
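A minimal sketch of the correlation distance, taken here as one minus the Pearson correlation coefficient of two equal-length segments. The function name and the decision to reject constant segments (whose correlation is undefined) are assumptions for illustration.

```python
import math


def correlation_distance(p, q):
    """One minus the Pearson correlation coefficient of p and q."""
    if len(p) != len(q):
        raise ValueError("segments must have the same length")
    n = len(p)
    mean_p = sum(p) / n
    mean_q = sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    var_p = sum((a - mean_p) ** 2 for a in p)
    var_q = sum((b - mean_q) ** 2 for b in q)
    if var_p == 0 or var_q == 0:
        # Correlation is undefined for a constant segment.
        raise ValueError("constant segment")
    return 1.0 - cov / math.sqrt(var_p * var_q)


# A scaled copy has distance 0; a reversed copy has distance 2.
d_scaled = correlation_distance([1, 2, 3], [2, 4, 6])
d_reversed = correlation_distance([1, 2, 3], [3, 2, 1])
```

The measure ranges from 0 to 2 and is unaffected by a constant offset or positive scaling of either segment, so it responds to shape agreement rather than amplitude.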

Kendall's τ Distance
The Kendall tau rank distance, also known as bubble sort distance, is a metric that counts the number of pairwise disagreements between the rankings of two data sequences 11. The larger the distance, the more dissimilar the sequences.
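Counting pairwise disagreements can be sketched as below. The function name is hypothetical, and treating tied pairs as agreements is an assumption made here, since the text does not specify tie handling. A pair of positions (i, j) disagrees when the two sequences order those positions in opposite directions.

```python
from itertools import combinations


def kendall_tau_distance(p, q):
    """Count index pairs that p and q order in opposite directions."""
    if len(p) != len(q):
        raise ValueError("sequences must have the same length")
    discordant = 0
    for i, j in combinations(range(len(p)), 2):
        # A negative product means the pair is ordered oppositely.
        if (p[i] - p[j]) * (q[i] - q[j]) < 0:
            discordant += 1
    return discordant


# One adjacent swap gives one disagreement; a full reversal gives all three.
d_one_swap = kendall_tau_distance([1, 2, 3, 4], [1, 3, 2, 4])
d_reversal = kendall_tau_distance([1, 2, 3], [3, 2, 1])
```

Since only the ordering of samples matters, changes in amplitude that preserve the ordering leave this count unchanged.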

Canberra Distance
Canberra distance is defined as a weighted version of Manhattan distance and is used as a metric for comparing ranked lists. It is very sensitive to values close to zero, which is why it is often used for data points scattered around the origin 10,11,21.
Segmentation is done so that similar pulses and outliers are noticeable, as shown in Figure 2.
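As a minimal sketch of the Canberra distance described in this section, each absolute sample difference is divided by the sum of the absolute sample values before summing. The function name and the convention of skipping terms where both samples are zero are assumptions for illustration.

```python
def canberra_distance(p, q):
    """Sum over k of |p_k - q_k| / (|p_k| + |q_k|)."""
    if len(p) != len(q):
        raise ValueError("segments must have the same length")
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom > 0:
            # Terms where both samples are zero are skipped (assumed).
            total += abs(a - b) / denom
    return total


# The same absolute difference weighs far more near zero than far from it.
near_zero = canberra_distance([0.001], [0.002])
far_from_zero = canberra_distance([100.001], [100.002])
```

The two calls use the same absolute difference of 0.001, yet the first term is about one third while the second is of the order of a few millionths, which illustrates the sensitivity near zero mentioned in the text.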
After thresholding, segments S3, S5, S6 and S11 were removed from Set A. All segments of Set A, including the outliers, are superimposed in Figure 2a so that the outliers can easily be seen against the ensemble average, while Figure 2b shows the ensemble average of the pulses of Set A without outliers.
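The thresholding and ensemble averaging steps can be sketched as follows. The exact thresholding rule is not given here, so the sketch assumes each segment has a dissimilarity score and that segments scoring above a chosen threshold are treated as outliers. The function names, segment labels, scores and threshold are all made-up values for illustration, and segments are assumed to have been brought to equal length beforehand.

```python
def keep_below_threshold(scores, threshold):
    """Return the names of segments whose score does not exceed threshold."""
    return [name for name, score in scores.items() if score <= threshold]


def ensemble_average(segments):
    """Sample-wise mean of equal-length segments."""
    if not segments:
        raise ValueError("no segments to average")
    length = len(segments[0])
    if any(len(s) != length for s in segments):
        raise ValueError("segments must have the same length")
    return [sum(column) / len(segments) for column in zip(*segments)]


# Toy data: two similar segments and one distorted one (made-up values).
toy_segments = {"seg_a": [1, 2, 3], "seg_b": [1, 2, 3], "seg_c": [10, 10, 10]}
toy_scores = {"seg_a": 0.1, "seg_b": 0.2, "seg_c": 5.0}

kept = keep_below_threshold(toy_scores, threshold=1.0)
clean_average = ensemble_average([toy_segments[name] for name in kept])
raw_average = ensemble_average(list(toy_segments.values()))
```

Comparing `raw_average` with `clean_average` shows how a single distorted segment shifts the ensemble average, which is the effect the superimposed plots are meant to reveal.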

Results
To check similarity between segments, the Similarity Index parameter is calculated, which describes the resemblance and regularity of pulse segments. Suppose we have n segments in the pulse series; a distance matrix D of dimensions n × n is formed for each of the above algorithms, representing the similarity distance between segments.
Similarity Index is defined as:

Conclusion
DTW is a reliable algorithm for similarity measurement as it allows elastic transformation of the time series and can be used for segments of dissimilar length. While searching for the optimal path, DTW considers the surrounding points of the time series, resulting in a non-linear alignment, which is an advantage over other metric based algorithms. From the collective analysis of results, Euclidean, Manhattan and Canberra based DTW identify S3 as the most dissimilar segment and S1 as the most similar one. It is difficult to distinguish all prominent outliers with the existing DTW, as its similarity index lies between 0 and 1, making the analysis unreliable for our database. A better Similarity Index has been found for the A2 and A5 algorithms, and this was validated by ensemble averaging, where S3, S5, S6 and S11 were the dissimilar segments. Compared with visual analysis, segments S3, S5 and S6 were easily differentiable, but S11 was difficult to identify with the naked eye. The Correlation distance based algorithm showed S1 and S10 as outliers and S3 as the most similar segment. From the tabular data, it is observed that the Correlation distance and Kendall's tau distance based DTW algorithms appear to be less effective for detecting shape irregularities, whereas the Manhattan and Canberra based DTW algorithms proved to be better, followed by the existing Euclidean based DTW. Overall, it can be concluded that the combination of Manhattan and Canberra distance measures with the DTW algorithm is robust in detecting small contractions or expansions, whereas the Correlation distance and Kendall's tau distance algorithms appear to be inefficient on our wrist pulse database. For future work, the efficiency of outlier removal could be improved by using DDTW with the above local distance measures to improve shape resemblance and capture spatial variations.