P-ISSN: 0974-6846, E-ISSN: 0974-5645

Indian Journal of Science and Technology



Year: 2023, Volume: 16, Issue: 26, Pages: 1967-1974

Original Article

Automated Resume Parsing and Job Domain Prediction using Machine Learning

Received Date: 14 April 2023, Accepted Date: 17 June 2023, Published Date: 04 July 2023


Objectives: This study aims to develop an efficient approach for parsing resumes and predicting job domains using natural language processing (NLP) techniques and named entity recognition to enhance the resume screening process for recruiters.

Methods: The proposed approach involves preprocessing steps, such as cleaning, tokenization, stop-word removal, stemming, and lemmatization, implemented with the PyMuPDF and doc2text Python modules. Regular expressions and the spaCy library are utilized for entity recognition and name extraction. The model achieved a prediction accuracy of 92.08% and an F1-score of 0.92 on a dataset of 1000 resumes. An ablation experiment assessed the contributions of different factors.

Findings: The approach demonstrated a high prediction accuracy of 92.08% and F1-score of 0.92 for job domain prediction, effectively identifying relevant job domains from resumes. Evaluations on individual job domains showed excellent precision and recall scores, validating its applicability. Preprocessing techniques significantly improved accuracy, while the integration of regular expressions and spaCy enhanced the model’s performance. This approach automates resume screening, reducing recruiters’ workload, saving time and effort, and improving candidate selection and the hiring process.

Novelty: This study introduces a novel approach combining NLP techniques, regular expressions, and entity recognition for resume parsing and job domain prediction. This integration enhances accuracy and efficiency, offering a unique solution for resume screening.
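As a rough illustration of the kind of preprocessing and regular-expression extraction the abstract describes, the following minimal sketch cleans and tokenizes resume text, removes stop words, and pulls out e-mail addresses and phone numbers. It is not the authors' implementation: the function names, the tiny stop-word list, the regular expressions, and the sample text are all assumptions made for this example. Stemming, lemmatization, PDF text extraction, and spaCy-based name recognition are omitted so that the sketch runs with the Python standard library alone.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real pipeline would use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "and", "at", "for", "in", "of", "on", "the", "to", "with"}

# Illustrative patterns only; real resumes need more permissive and locale-aware rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{8,}\d")


def preprocess(text):
    """Lowercase the text, keep runs of letters as tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def extract_contacts(text):
    """Return e-mail addresses and phone numbers found by the regex patterns."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


# Made-up sample text, not taken from the study's dataset.
sample = (
    "Jane Doe, jane.doe@example.com, +91 98765 43210. "
    "Experienced in Python and the design of NLP systems."
)

contacts = extract_contacts(sample)
tokens = preprocess(sample)
print(contacts)
print(tokens)
```

In a fuller pipeline, the cleaned tokens would typically be turned into features for a job-domain classifier, and a named entity recognizer would handle fields such as names that regular expressions capture poorly.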

Keywords: Resume parsing; Job domain prediction; Entity recognition; Machine learning; Natural Language Processing




© 2023 Sinha et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)

