A Comparative Study on Various Data Mining Algorithms with Special Reference to Crop Yield Prediction


  • Faculty of Computer Science and Applications, Charotar University of Science and Technology (CHARUSAT), Changa - 388421, Gujarat, India


Objectives: To compare different data mining algorithms on crop yield prediction, using the same parameters and a 10-fold cross-validation test. Methods/Analysis: Data mining algorithms such as K-nearest Neighbor, K-means, Neural Network, Support Vector Machine, Case-based Reasoning and Decision Tree have been applied to various applications in the agriculture domain. A comparative study is carried out using the J48, Naïve Bayes and Simple Cart algorithms to determine which classification algorithm is best suited to crop prediction. Findings: The study shows that the J48 classification algorithm, with an accuracy of 89.33%, outperforms the other two classification algorithms, Simple Cart and Naïve Bayes, for crop prediction. Novelty/Improvement: This study demonstrates, for the first time, the application of the data mining classification techniques discussed above to yield prediction in the agriculture domain.
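The 10-fold cross-validation comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' setup: the data set, feature names (`rain`, `soil`) and labels are invented toy values, the Naïve Bayes class is a simple categorical variant written for this example, and a majority-class baseline stands in for the second classifier because J48 and Simple Cart are not reimplemented here. The sketch only shows how each model is trained on nine folds and scored on the held-out fold, so that all models are compared on identical splits.

```python
import math
import random
from collections import Counter, defaultdict


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with add-one smoothing."""

    def fit(self, X, y):
        self.n = len(y)
        self.class_counts = Counter(y)
        self.counts = defaultdict(Counter)  # (feature index, class) -> value counts
        self.values = defaultdict(set)      # feature index -> values seen in training
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts[(j, label)][v] += 1
                self.values[j].add(v)
        return self

    def predict_one(self, row):
        best, best_score = None, float("-inf")
        for c, cc in self.class_counts.items():
            score = math.log(cc / self.n)  # log prior
            for j, v in enumerate(row):
                num = self.counts[(j, c)][v] + 1
                den = cc + len(self.values[j])
                score += math.log(num / den)  # smoothed log likelihood
            if score > best_score:
                best, best_score = c, score
        return best


class MajorityClass:
    """Baseline that always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict_one(self, row):
        return self.label


def cross_val_accuracy(make_model, X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, pool the results."""
    correct = 0
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(model.predict_one(X[i]) == y[i] for i in fold)
    return correct / len(y)


# Hypothetical toy data: the yield class depends only on rainfall.
X = [(["low", "high"][i % 2], ["sandy", "loam", "clay"][i % 3]) for i in range(60)]
y = ["high" if rain == "high" else "low" for rain, _ in X]

for name, factory in [("Naive Bayes", CategoricalNaiveBayes),
                      ("Majority baseline", MajorityClass)]:
    print(f"{name}: {cross_val_accuracy(factory, X, y):.2%}")
```

Passing the same `seed` to every call keeps the fold assignment fixed across models, which is what makes the accuracy figures comparable.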


Classification Algorithm, Crop Prediction, Data Mining, Decision Tree, J48

This work is licensed under a Creative Commons Attribution 3.0 License.