Multi-dimensional CNN Based Feature Extraction with Feature Fusion and SVM for Human Activity Recognition in Surveillance Videos

Hetal Shah; Mehfuza S Holia

doi:10.17485/IJST/v17i21.3203

Indian Journal of Science and Technology

Year: 2024, Volume: 17, Issue: 21, Pages: 2177-2198

Original Article

Hetal Shah¹*, Mehfuza S Holia²

¹Department of Computer Engineering, Gujarat Technological University, Ahmedabad, 382424, Gujarat, India
²Assistant Professor, Department of Electronics, BVM Engineering College, Vallabh Vidyanagar, Gujarat, India

*Corresponding Author
Email: [email protected]

Received Date: 21 December 2023, Accepted Date: 24 April 2024, Published Date: 21 May 2024

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Background/Objectives: Accurately recognizing human activities in video sequences is challenging because of low resolution, cluttered backgrounds, partial occlusion, and varying viewpoints. Automated, machine learning (ML) based human activity recognition (HAR) from surveillance videos therefore calls for the fusion of several feature extraction techniques.

Methods: This paper applies a support vector machine (SVM) with feature fusion to automatic activity recognition in surveillance videos. A Histogram of Oriented Gradients (HOG) is used to segment each input frame, separating humans from other objects and background noise. Multiple features are then extracted using the Gabor Wavelet Transform (GWT), autocorrelogram, Gray-Level Co-Occurrence Matrix (GLCM), HSV histogram, and a multi-dimensional CNN. The proposed approach is implemented in MATLAB and compared with existing approaches such as Space-Time Interest Points (STIP) and Histogram of Optical Flow (HOF).

Findings: The proposed approach outperforms the existing approaches, with lower time consumption and higher accuracy: 99.886% on the UCF101 dataset and 99.538% on the UTKinect dataset.

Novelty: Feature-level fusion yields the most discriminative feature information, from which the classification algorithm recognizes the various human actions.
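As a rough illustration of feature-level fusion, the sketch below computes two of the descriptors named in the abstract (an HSV histogram and a few GLCM statistics) on a toy frame and concatenates them into a single vector. It is not the authors' MATLAB implementation: the abstract does not state the fusion rule, so L2-normalising each descriptor before concatenation is an assumption, and the function names, bin counts, quantisation levels, and the horizontal-neighbour GLCM offset are all hypothetical choices made for this example. In the described pipeline, such a fused vector would then be passed to the SVM classifier.

```python
import colorsys
import math


def hsv_histogram(pixels, bins=(8, 2, 2)):
    """Normalised joint H/S/V histogram for RGB pixels with channels in 0..255."""
    hb, sb, vb = bins  # bin counts are an assumption, not taken from the paper
    hist = [0.0] * (hb * sb * vb)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # min() keeps a value of exactly 1.0 inside the last bin.
        hi = min(int(h * hb), hb - 1)
        si = min(int(s * sb), sb - 1)
        vi = min(int(v * vb), vb - 1)
        hist[(hi * sb + si) * vb + vi] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]


def glcm_features(gray, levels=4):
    """Contrast, energy and homogeneity from a symmetric horizontal-neighbour GLCM."""
    glcm = [[0.0] * levels for _ in range(levels)]
    for row in gray:
        # Quantise 0..255 intensities down to `levels` gray levels.
        q = [min(p * levels // 256, levels - 1) for p in row]
        for a, b in zip(q, q[1:]):
            glcm[a][b] += 1.0
            glcm[b][a] += 1.0  # count each pair in both directions
    total = sum(map(sum, glcm)) or 1.0
    contrast = energy = homogeneity = 0.0
    for i in range(levels):
        for j in range(levels):
            p = glcm[i][j] / total
            contrast += p * (i - j) ** 2
            energy += p * p
            homogeneity += p / (1.0 + abs(i - j))
    return [contrast, energy, homogeneity]


def l2_normalise(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(*feature_vectors):
    """Feature-level fusion (assumed rule): normalise each vector, then concatenate."""
    fused = []
    for vec in feature_vectors:
        fused.extend(l2_normalise(vec))
    return fused


# Toy 2x2 frame: a red row and a blue row, plus a made-up grayscale version.
frame_rgb = [[(255, 0, 0), (255, 0, 0)], [(0, 0, 255), (0, 0, 255)]]
frame_gray = [[0, 255], [0, 255]]

pixels = [p for row in frame_rgb for p in row]
color_features = hsv_histogram(pixels)
texture_features = glcm_features(frame_gray)
fused = fuse(color_features, texture_features)
print(len(color_features), len(texture_features), len(fused))
```

Normalising each descriptor separately keeps a long histogram from dominating a short texture vector purely through its length or scale; other fusion rules (for example weighting or dimensionality reduction before concatenation) are equally plausible readings of the abstract.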

Keywords: Human activity recognition, Machine Learning, Surveillance Videos, Human detection algorithm, Feature extraction, SVM classifier

Copyright

© 2024 Shah & Holia. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)