P-ISSN: 0974-6846, E-ISSN: 0974-5645

Indian Journal of Science and Technology



Year: 2024, Volume: 17, Issue: 5, Pages: 478-486

Original Article

A Textual Data Analysis of the Union Budget of India

Received Date: 15 May 2023, Accepted Date: 23 December 2023, Published Date: 31 January 2024


Objectives: To present a textual data analysis of the Union Budgets of India for financial years 2019-20 to 2023-24, examining the policy narratives, key announcements, and thematic emphasis in the Budget speeches to explore the government's priorities. Methods: The analysis centers on the budgets presented by Nirmala Sitharaman, the Finance Minister of India. The study focuses on the budget speeches she delivered and uses a combination of quantitative and qualitative techniques to identify the main themes and priorities of the budgets. Natural Language Processing (NLP), Topic Modeling, and Sentiment Analysis methods are used to categorize the discourse into distinct subjects. Findings: The results suggest that the budgets emphasize encouraging commercial growth, improving living standards, and providing relief to the many sectors affected by the COVID-19 pandemic. “India”, “Government”, “Infrastructure”, and “Sector” are among the words used repeatedly in each budget presented. Furthermore, high Term Frequency-Inverse Document Frequency (TF-IDF) values suggest that “India” (0.323), “Government” (0.315), “Tax” (0.268), and “Crores” (0.380) are among the most important and relevant words in the Budget presentations. The correlation matrix suggests that topic 1 is strongly negatively correlated with topic 2 (coefficient of -0.832). The paper concludes by discussing the implications of the budget for the Indian economy and the challenges that must be addressed to achieve the budget's aims. Novelty: Overall, the paper offers a comprehensive understanding of the Indian Union Budget and its possible influence on the country's economic and social development.

Keywords: Textual Data Analysis, Union Budget, Nirmala Sitharaman, Bag of Words, Sentiment Analysis, TF-IDF
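
The two quantitative measures reported in the abstract, TF-IDF scores for individual words and a correlation coefficient between topics, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `speeches` corpus, the per-document topic weights, the whitespace tokenizer, and the `ln(N / df) + 1` form of the inverse document frequency. The abstract does not state which formula or tool produced the published figures, so the numbers printed here are not comparable to them.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = term count / number of tokens in the document
    idf = ln(N / df) + 1, where df is the number of documents containing the term
    (one common variant; an assumption, not the paper's stated formula).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append(
            {t: (c / total) * (math.log(n_docs / df[t]) + 1.0) for t, c in counts.items()}
        )
    return scores


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical stand-ins for budget speeches (not real speech text).
speeches = [
    "india government tax tax infrastructure",
    "india government sector health",
    "india tax crores crores crores",
]
scores = tf_idf(speeches)

# Hypothetical weights of two topics across four documents.
topic_1 = [0.9, 0.7, 0.2, 0.1]
topic_2 = [0.1, 0.3, 0.8, 0.9]
r = pearson(topic_1, topic_2)

print(sorted(scores[0].items(), key=lambda kv: kv[1], reverse=True))
print(round(r, 3))
```

In this toy corpus a word that appears in every document, such as "india", gets the minimum inverse document frequency, so its score reduces to its plain term frequency. The two topic vectors are built so that one rises exactly as the other falls, which gives a strongly negative coefficient of the kind the abstract reports for topics 1 and 2.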




© 2024 Makwana. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)

