P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology


Year: 2023, Volume: 16, Issue: 36, Pages: 2988-3001

Original Article

Modified Associative Classification Model for Microarray Gene Expression Data using Maximal Frequent Itemsets and Probability Distribution

Received Date: 04 March 2023, Accepted Date: 14 August 2023, Published Date: 27 September 2023


Objective: To study how to generate fewer class association rules while still predicting the class accurately. Methods: A modified associative classification model (MACM) is proposed for diagnosing cancer from microarray gene expression data using maximal frequent itemsets and probability distribution. The proposed system performs supervised discretization, generates maximal frequent itemsets from 80% of the data, and runs prediction on the remaining 20%. Frequent itemsets are generated with minimum support values of 20%, 40% and 80% and a minimum confidence of 80%. Binary-class and multi-class datasets are used to evaluate the constructed model, which is compared with classical associative classification algorithms. Model performance is evaluated by type of frequent itemset, number of class association rules generated, accuracy, and training time. The experiment uses two colorectal cancer datasets, one lung cancer dataset and one multi-label cancer dataset. Findings: Maximal frequent itemsets produce fewer class association rules, generate them more quickly, and consume less memory. The proposed method achieves 100% classification accuracy on the colon cancer datasets GSE15781 and GSE25070 and 99.17% on the colon cancer dataset GSE87211. A classification accuracy of 94% is obtained on the lung cancer dataset GSE43580 when maximal frequent itemsets are used. Novelty: The proposed modified associative classification model achieves very high performance in classifying gene expression data. The associative classification model can support cancer diagnosis, pathway analysis and cancer treatment.

Keywords: Microarray; Discretization; Maximal frequent itemsets; Association rules; Probability distribution
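
To illustrate why maximal frequent itemsets lead to fewer candidate rules, as stated in the abstract, the following is a minimal brute-force sketch and not the authors' MACM implementation. The toy transactions, the item names (such as `g1_high`) and the helper functions `frequent_itemsets` and `maximal_itemsets` are hypothetical and chosen only for illustration; real microarray data would first need discretization and a far more efficient miner.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is one sample described by
# discretized gene-expression items (names are illustrative only).
transactions = [
    frozenset({"g1_high", "g2_low", "g3_high"}),
    frozenset({"g1_high", "g2_low"}),
    frozenset({"g1_high", "g2_low", "g3_high"}),
    frozenset({"g2_low", "g3_high"}),
    frozenset({"g1_high", "g3_high"}),
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset whose relative
    support is at least min_support (brute-force enumeration)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        found_at_size_k = False
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                frequent[candidate] = support
                found_at_size_k = True
        if not found_at_size_k:
            # Support is anti-monotone: if no itemset of size k is
            # frequent, no larger itemset can be frequent either.
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


if __name__ == "__main__":
    for min_support in (0.4, 0.6):
        all_frequent = frequent_itemsets(transactions, min_support)
        maximal = maximal_itemsets(all_frequent)
        print(f"min_support={min_support}: "
              f"{len(all_frequent)} frequent, {len(maximal)} maximal")
```

On this toy data the set of maximal itemsets is smaller than the full set of frequent itemsets at both thresholds, which is the property the abstract relies on to reduce the number of class association rules and the memory needed to store them.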


