Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: 40, Pages: 1-5
Ramanpreet Kaur* and Amandeep Kaur
Department of CSE, Chandigarh University, Gharuan, Mohali - 140413, Punjab, India; [email protected]
*Author for correspondence
This paper presents the results of a study of several general document clustering and classification methods. Objectives: To improve document clustering and to build a system that reduces the time needed to retrieve text documents from clusters. Method: We propose a method that combines clustering and classification, using k-means with feed-forward neural networks, implemented in MATLAB. K-means clusters the text documents, and a neural network classifies them. Findings: Earlier techniques include semi-supervised models for labelled text, namely Partially Labeled Dirichlet Allocation and the Partially Labeled Dirichlet Process, as well as genetic algorithms, Gaussian distributions, hybrid genetic algorithms, fast global k-means, and k-means clustering. Each of these techniques has merits and demerits, but they share one drawback: they are very time-consuming. The main aim of this work is therefore to develop a model based on both supervised and unsupervised techniques to determine the similarity between documents. Improvements: To address the time cost, we use neural networks for classification and k-means for clustering, giving a model that combines supervised and unsupervised techniques to determine the similarity between documents.
Keywords: Artificial Neural Network, Cosine Similarity and Data Mining, K-means Algorithm, Similarity Measure Function, Text Document Clustering
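The two-stage idea in the abstract (k-means for clustering, a feed-forward neural network for classification, with cosine similarity as the similarity measure) can be illustrated with a minimal sketch. The abstract does not specify the document features, the network architecture, or how the two stages are connected, so everything below is an assumption made for illustration: term-frequency vectors as features, a deterministic farthest-first initialisation, a single tanh hidden layer, and the use of the k-means cluster labels as training targets for the network. The function names (`tf_vectors`, `kmeans_cosine`, `train_mlp`) and the toy documents are hypothetical, and the sketch is written in Python rather than the MATLAB used by the authors.

```python
import math
import random
from collections import Counter

def tf_vectors(docs):
    """Turn raw documents into term-frequency vectors over a shared vocabulary."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({w for toks in tokenised for w in toks})
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        vectors.append([float(counts[w]) for w in vocab])
    return vectors

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def kmeans_cosine(vectors, k, max_iter=50):
    """k-means that assigns each vector to its most cosine-similar centroid."""
    # Assumption: farthest-first initialisation, chosen to keep the sketch deterministic.
    centroids = [vectors[0][:]]
    while len(centroids) < k:
        nxt = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(nxt[:])
    labels = None
    for _ in range(max_iter):
        new_labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def train_mlp(X, y, n_classes, hidden=4, epochs=200, lr=0.1, seed=0):
    """Train a one-hidden-layer feed-forward network with plain SGD.

    Returns a function mapping a feature vector to class probabilities.
    """
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n_classes)]
    b2 = [0.0] * n_classes

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        z = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]
        m = max(z)  # subtract the max for a numerically stable softmax
        e = [math.exp(v - m) for v in z]
        s = sum(e)
        return h, [v / s for v in e]

    for _ in range(epochs):
        for x, target in zip(X, y):
            h, p = forward(x)
            # Gradient of softmax cross-entropy w.r.t. the logits: p - one_hot.
            dz = [p[c] - (1.0 if c == target else 0.0) for c in range(n_classes)]
            # Back-propagate to the hidden layer before W2 is updated.
            dh = [sum(dz[c] * W2[c][j] for c in range(n_classes)) * (1 - h[j] ** 2)
                  for j in range(hidden)]
            for c in range(n_classes):
                for j in range(hidden):
                    W2[c][j] -= lr * dz[c] * h[j]
                b2[c] -= lr * dz[c]
            for j in range(hidden):
                for i in range(n_in):
                    W1[j][i] -= lr * dh[j] * x[i]
                b1[j] -= lr * dh[j]

    return lambda x: forward(x)[1]

# Toy corpus (hypothetical): two sports documents and two finance documents.
docs = [
    "cricket match score team",
    "team wins cricket match",
    "stock market price shares",
    "shares price fall market",
]
vectors = tf_vectors(docs)
labels, centroids = kmeans_cosine(vectors, k=2)
print(labels)  # -> [0, 0, 1, 1]

# Assumption: the cluster labels serve as training targets for the classifier.
predict = train_mlp(vectors, labels, n_classes=2)
print(predict(vectors[3]))  # class probabilities for the last document
```

In this arrangement the expensive step, comparing a document against cluster contents, is done once during clustering; afterwards a new document could be routed to a cluster with a single forward pass through the trained network. Whether this matches the authors' pipeline cannot be confirmed from the abstract alone.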