P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2021, Volume: 14, Issue: 42, Pages: 3144-3156

Original Article

System for Fusion of Face and Speech Modalities Using DTCWT+QFT and MFCC+RASTA Techniques

Received Date: 16 July 2021, Accepted Date: 19 November 2021, Published Date: 10 December 2021


Objectives: To propose a multimodal biometric system that fuses the face and speech modalities, using the DTCWT and QFT techniques for face recognition and the MFCC and RASTA techniques for speech recognition. The experimental results are compared with existing works to analyse the system's performance against its counterparts.

Methods: The proposed model uses the DTCWT and QFT techniques to extract features from face images and then fuses the two feature sets. The MFCC and RASTA techniques are used to extract features from speech data, and these features are fused in the same way. Several databases are discussed and used to evaluate both the face and the speech recognition components.

Findings: The fusion of DTCWT and QFT for face recognition is implemented, and the results are tabulated for six face data sets using the following performance parameters: False Acceptance Ratio (FAR), False Rejection Ratio (FRR), Total Success Rate (TSR), Partial Error Rate (PER) and Equal Error Rate (EER). The average performance is compared with four existing fusion techniques, and the proposed system performs better. The fusion of MFCC and RASTA for speech recognition is implemented, and its performance is measured by accuracy, precision, recall and F1-score. These results are compared with five different schemes, and the comparison shows that the proposed fusion of face and speech traits performs better for human recognition.

Novelty: A fusion of two algorithms for face recognition is implemented and its results are analysed. A fusion of two algorithms for speech recognition is then implemented and analysed in the same way. A novel approach is presented that combines the face and speech recognition systems into a single system, improving security through multimodal biometrics.
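The performance parameters named in the Findings can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each comparison yields a similarity score, that a score at or above a threshold counts as an acceptance, and that the EER is approximated by the threshold at which FAR and FRR are closest. The function names and the example scores are hypothetical.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) for one threshold.

    Assumption: a score >= threshold is treated as an acceptance.
    FAR = fraction of impostor scores that are accepted.
    FRR = fraction of genuine scores that are rejected.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed score as a threshold.

    Returns (eer, threshold), where eer is the mean of FAR and FRR at the
    threshold that minimises |FAR - FRR|.
    """
    def gap(t):
        far, frr = far_frr(genuine, impostor, t)
        return abs(far - frr)

    best = min(sorted(set(genuine) | set(impostor)), key=gap)
    far, frr = far_frr(genuine, impostor, best)
    return (far + frr) / 2, best


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1-score from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical similarity scores, not data from the paper.
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.1, 0.2, 0.3, 0.6]
    print(far_frr(genuine_scores, impostor_scores, 0.5))
    print(equal_error_rate(genuine_scores, impostor_scores))
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

With only a handful of scores, the scan over observed thresholds gives a coarse estimate. A larger score set, or interpolation between neighbouring thresholds, would give a finer EER.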

Keywords: DTCWT; QFT; RASTA; MFCC; Feature Extraction; Fusion




© 2021 Shanthakumar et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)

