Optimal Feature for Text Similarity based Hybrid Clustering Technique with Aid of MGWO

Surya Narayana Goddumarri and Vasumathi Devara

doi:10.17485/ijst/2017/v10i6/104768


Indian Journal of Science and Technology

Year: 2017, Volume: 10, Issue: 6, Pages: 1-10

Original Article

Surya Narayana Goddumarri* and Vasumathi Devara

JNTUH, Hyderabad - 500085, Telangana, India; [email protected], [email protected]

*Author for correspondence:
Surya Narayana Goddumarri
JNTUH, Hyderabad - 500085, Telangana, India
E-mail: [email protected]

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Objectives: Clustering high-dimensional data is often inaccurate and falls short of expectations when the dimensionality of the dataset is high, and the problem is currently attracting considerable research attention. Methods/Analysis: The high-dimensional input data is first passed to a similarity measure for text processing, which evaluates the similarity between categorical data as a basis for feature selection. An optimal feature selection method is then applied; feature selection is an important topic in data mining, particularly for high-dimensional datasets. In the proposed technique, Modified Grey Wolf Optimization (MGWO) is used for optimal feature selection. The selected features are then grouped using a clustering technique that hybridizes two clustering algorithms. Findings: The performance of the proposed technique is evaluated in terms of clustering accuracy, the Jaccard coefficient, and Dice’s coefficient, and is compared with that of existing clustering algorithms. Novelty/Improvements: The primary intention of this research is to achieve promising results with a text-similarity-based clustering technique. The k-means and fuzzy c-means clustering algorithms are hybridized to group the optimal features.

Keywords: Fuzzy C Means Clustering, Grey Wolf Optimization, Jaccard Coefficient and Dice’s Coefficient, K Means, Similarity Measure for Text Processing
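
The abstract names the Jaccard coefficient and Dice’s coefficient as evaluation measures but does not say how they are computed. The following is a minimal sketch under one common assumption, the pair-counting formulation, in which every pair of items is checked for whether it shares a cluster in the reference labelling and in the predicted labelling. The function names and the toy labels are hypothetical and are not taken from the article.

```python
from itertools import combinations


def pair_counts(true_labels, pred_labels):
    """Count item pairs by co-membership in the two labellings.

    Returns (tp, fp, fn):
      tp: pairs grouped together in both labellings
      fp: pairs grouped together only in the predicted labelling
      fn: pairs grouped together only in the reference labelling
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
    return tp, fp, fn


def jaccard_coefficient(true_labels, pred_labels):
    """Pair-counting Jaccard coefficient: tp / (tp + fp + fn)."""
    tp, fp, fn = pair_counts(true_labels, pred_labels)
    denom = tp + fp + fn
    # Assumption: if no pair is co-clustered in either labelling,
    # the two labellings agree trivially and the score is 1.0.
    return tp / denom if denom else 1.0


def dice_coefficient(true_labels, pred_labels):
    """Pair-counting Dice coefficient: 2*tp / (2*tp + fp + fn)."""
    tp, fp, fn = pair_counts(true_labels, pred_labels)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


if __name__ == "__main__":
    reference = [0, 0, 1, 1]   # hypothetical ground-truth clusters
    predicted = [0, 0, 1, 2]   # hypothetical clustering output
    print(jaccard_coefficient(reference, predicted))
    print(dice_coefficient(reference, predicted))
```

Both scores lie between 0 and 1, and the Dice value is never lower than the Jaccard value for the same pair counts, because Dice weights the agreeing pairs twice. The pairwise loop is quadratic in the number of items, so this form is suited to small evaluation sets only.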