Indian Journal of Science and Technology
Year: 2021, Volume: 14, Issue: 20, Pages: 1635-1641
Original Article
V K Raj Anishaa✉ 1 , P Sathvika 2 , Sandeep Rawat 3
1 Assistant System Engineer, Tata Consultancy Services, Madhapur, Telangana, 500081, India
2 Device, Digital and Alexa Support Associate, Amazon- Hyd20, Nanakaramguda, Telangana, 500032, India
3 Professor, Computer Science and Engineering, Anurag University, Venkatapur, Telangana, 500088, India
Received Date: 19 February 2021, Accepted Date: 17 May 2021, Published Date: 04 June 2021
Background/Objectives: Millions of people visit question-and-answer sites such as Quora, Reddit, and Stack Overflow every day, and the demand for intelligent techniques that help them find better answers is growing. Methods: In the proposed system, the Quora dataset was filtered using SQLite, which took one-quarter of the time needed to pre-process the same dataset with existing approaches such as Python functions. We trained four machine learning models, namely Random Forest, Logistic Regression, Linear SVM (Support Vector Machine), and XGBoost, to identify the most suitable one. Findings: The log loss values of the four models (0.887, 0.521, 0.654, and 0.357, respectively) were analyzed and compared. XGBoost achieved the lowest log loss and is therefore the best-performing model of the four. Conclusion/Future Scope: XGBoost outperformed the other machine learning techniques considered in the study, and pre-processing with SQLite improved the response time. In the future, we would like to apply a similar approach to other sites such as Reddit and Stack Overflow; one could also ensemble the best models of each type (linear, tree-based, and neural network).
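The comparison in the Findings uses log loss, where lower is better. The sketch below is illustrative only and is not the authors' code: it defines binary log loss in plain Python and picks the model with the lowest reported value. The function name `log_loss`, the clipping constant `eps`, and the toy labels and probabilities are assumptions introduced here; only the four reported scores come from the abstract.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy.

    y_true: iterable of 0/1 labels (1 = the question pair is a duplicate).
    y_prob: iterable of predicted probabilities for class 1.
    """
    total = 0.0
    count = 0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so that log(0) is never evaluated.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count


# Log loss values reported in the abstract, in the order the models are listed.
reported = {
    "Random Forest": 0.887,
    "Logistic Regression": 0.521,
    "Linear SVM": 0.654,
    "XGBoost": 0.357,
}

# The preferred model is the one with the smallest log loss.
best_model = min(reported, key=reported.get)

# Hypothetical toy predictions, only to show how the metric is computed.
toy_labels = [1, 0, 1, 0]
toy_probs = [0.9, 0.1, 0.8, 0.3]
toy_loss = log_loss(toy_labels, toy_probs)

if __name__ == "__main__":
    print("best model by reported log loss:", best_model)
    print("toy log loss:", round(toy_loss, 4))
```

Because log loss penalizes confident wrong probabilities heavily, it rewards well-calibrated predictions and not just correct labels, which is one reason it is a common choice for ranking probabilistic classifiers.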
Keywords
Machine Learning, Question Pair Similarity, XGBoost, Linear SVM, Logistic Regression, Random Forest
© 2021 Anishaa et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)