P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2023, Volume: 16, Issue: 3, Pages: 204-213

Original Article

A Scalable High Utilization Itemset Mining Technique for Large Datasets Using A Bit-Based Model

Received Date: 06 October 2022, Accepted Date: 01 December 2022, Published Date: 23 January 2023

Abstract

Objectives: Utility-list-based algorithms have gained considerable traction because they are efficient and easy to modify. Despite several enhancements, they remain inefficient. This research addresses the issue by improving the utility-list construction process, a crucial step that has received little attention in previous studies. It also aims to reduce memory complexity and to achieve better performance than existing approaches.

Methods: To speed up construction, a new set of bitwise operations termed Bit Combine Construction (BCC) is proposed. BCC is supported by a dedicated data format called EBP (Efficiency Bit Partition). Built on this framework, the EBP-Miner algorithm applies several techniques to narrow the search space.

Findings: Experimental results show that EBP-Miner outperforms several state-of-the-art baseline techniques, including FEACP and CLH-Miner. The experiments were conducted with utilization values ranging from 20% to 100% of the nodes. The proposed system achieves an average runtime of 390 s and a utilization value of 90.25%, both of which outperform the existing methods. It also shows 20% lower memory complexity than the existing algorithms.

Novelty: High utility itemset mining (HUIM) is an important challenge in data mining. The goal is to discover groups of items in a database that are particularly significant or profitable, in order to uncover information that can aid decision-making. The novelty of this study lies in developing a better way to build HUIM algorithms on a bitwise data structure, and in proposing a more time- and effort-efficient strategy for building utility-lists.
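The abstract does not spell out the internals of BCC or EBP, so the following is only a minimal sketch of the general idea behind bit-based HUIM, not the authors' algorithm. It assumes a toy database in which each transaction already stores a per-item utility, represents each item's transaction set as an integer bitmap, and uses a bitwise AND to find the transactions shared by an itemset before summing utilities. All names (`TRANSACTIONS`, `build_bitmaps`, `itemset_utility`, `high_utility_itemsets`) and all data values are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical toy database: each transaction maps item -> utility
# (for example, purchased quantity multiplied by unit profit).
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "b": 2, "d": 4},
]


def build_bitmaps(transactions: List[Dict[str, int]]) -> Dict[str, int]:
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitmaps: Dict[str, int] = {}
    for tid, tx in enumerate(transactions):
        for item in tx:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def itemset_utility(
    itemset: Iterable[str],
    transactions: List[Dict[str, int]],
    bitmaps: Dict[str, int],
) -> int:
    """Sum the utility of `itemset` over every transaction that contains all its items."""
    items = list(itemset)
    # Start with all transactions selected, then AND in each item's bitmap.
    common = (1 << len(transactions)) - 1
    for item in items:
        common &= bitmaps.get(item, 0)

    total = 0
    tid = 0
    while common:
        if common & 1:  # transaction `tid` contains every item of the itemset
            total += sum(transactions[tid][item] for item in items)
        common >>= 1
        tid += 1
    return total


def high_utility_itemsets(
    candidates: Iterable[Iterable[str]],
    transactions: List[Dict[str, int]],
    min_utility: int,
) -> Dict[FrozenSet[str], int]:
    """Keep only the candidate itemsets whose utility reaches `min_utility`."""
    bitmaps = build_bitmaps(transactions)
    result: Dict[FrozenSet[str], int] = {}
    for candidate in candidates:
        key = frozenset(candidate)
        utility = itemset_utility(key, transactions, bitmaps)
        if utility >= min_utility:
            result[key] = utility
    return result


if __name__ == "__main__":
    candidates = [["a", "c"], ["a", "b"], ["b", "d"]]
    print(high_utility_itemsets(candidates, TRANSACTIONS, min_utility=18))
```

In this sketch the bitwise AND replaces an explicit comparison of transaction-ID lists, which is the kind of saving a bit-based representation is generally meant to provide. A real miner would also need pruning bounds and candidate generation, which are omitted here.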

Keywords: High Utility Itemsets; Data Mining; Optimization Model; Bitwise Operations; Pattern Mining

Copyright

© 2023 Nellutla & Srinivasan. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)
