Indian Journal of Science and Technology
Year: 2015, Volume: 8, Issue: 12, Pages:
V. Chitraa1* and Antony Selvadoss Thanamani 2
1 CMS College of Science & Commerce (Autonomous), Coimbatore, Tamilnadu, India; [email protected]
2 Computer Science, NGM College (Autonomous), Pollachi, Coimbatore, Tamilnadu, India; [email protected]
Objectives: The primary objective of this research is to design a new and efficient clustering technique that groups user navigation patterns, so that a classification system can assign a new user to one of the previously formed user groups.
Methodology: Three real-time web log data sets were collected from an e-commerce web server, an academic institution web server and a research journal web server. All three were IIS web servers. The navigation patterns derived in the preprocessing step are clustered into groups using the traditional Fuzzy C-Means (FCM) technique. The clusters are then validated and re-clustered using the Bolzano-Weierstrass theorem.
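The traditional Fuzzy C-Means step mentioned above can be sketched as follows. This is a minimal illustration of standard FCM only, not the authors' BWFCM: the validation and re-clustering stage based on the Bolzano-Weierstrass theorem is not described in this abstract and is not reproduced here. The function name `fuzzy_c_means`, its parameters and the toy session vectors are assumptions made for the example.

```python
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means on a list of equal-length numeric vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which points[i] belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Centre update: mean of all points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update from the Euclidean distance to every centre.
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            new_row = [
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                          for k in range(c))
                for j in range(c)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than the tolerance.
        if shift < tol:
            break
    return centers, u


# Hypothetical session vectors (for example, visit counts for two pages).
sessions = [[0, 0], [0, 1], [1, 0], [9, 9], [9, 10], [10, 9]]
centers, memberships = fuzzy_c_means(sessions, c=2)
# Hard label per session: the cluster with the highest membership.
labels = [row.index(max(row)) for row in memberships]
```

Each session keeps a graded membership in every cluster. Taking the highest membership per row, as in the last line, is one simple way to obtain hard labels for evaluation.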
Findings: The web log data is preprocessed, and ICA is applied to the user session matrix to select relevant and important features. To measure the clustering accuracy of the proposed and existing methods, parameters such as the Rand Index and F-measure are calculated and compared. The proposed BWFCM shows a higher Rand Index and a lower error rate than FCM. To understand the impact of feature selection, the existing and proposed feature selection methods were applied to the data sets. The parameters taken for comparison were the Rand Index, Sum of Squared Errors and F-measure. The method was applied to all three data sets after the data cleaning and session construction steps. Clustering with the proposed algorithm was carried out twice on each of the three data sets: once without feature selection and once after selecting features. The clustering results were poor on the full data set with irrelevant features, and performance improved once the relevant features were selected.
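Two of the comparison parameters named above, the Rand Index and the Sum of Squared Errors, can be computed as in the following sketch. It uses the textbook definitions and is not taken from the paper; the function names and the toy labels are assumptions for illustration, and the F-measure is omitted.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two labelings agree.

    A pair counts as an agreement when both labelings put the two items
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def sum_of_squared_errors(points, labels, centers):
    """Sum of squared Euclidean distances from each point to its centre."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(point, centers[label]))
        for point, label in zip(points, labels)
    )


# Toy example: the same grouping under different cluster names.
reference = [0, 0, 1, 1]
predicted = [1, 1, 0, 0]
score = rand_index(reference, predicted)
```

The Rand Index ignores how clusters are named, so `score` here is 1.0 even though the label values are swapped. A lower Sum of Squared Errors indicates more compact clusters.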
Conclusion: The results demonstrate the significance of the optimized clustering: it yields higher intra-cluster similarity and higher inter-cluster dissimilarity than the existing methods.
Keywords: Bolzano-Weierstrass Theorem, Clustering, Feature Selection, Navigation Patterns, Web Usage Mining