P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology


Year: 2023, Volume: 16, Issue: 34, Pages: 2703-2708

Original Article

Generation of Medical Reports from Chest X-Ray Images using Multi Modal Learning Approach

Received Date:06 June 2023, Accepted Date:04 August 2023, Published Date:12 September 2023


Objectives: This research uses a multimodal learning approach to perform suggestive diagnosis and generate reports from chest X-rays and their associated data. The work falls under Vision Language Generation (VLG), which in this case produces a report given a chest X-ray.

Methods: We use a Transformer model with a CNN and an RNN as part of a multimodal architecture, together with greedy beam search, to generate precise report impressions. To evaluate the adaptability of the model, we also collect the reports and chest X-rays from Indiana University’s Open-I CXR dataset(1). The results are evaluated so that the model’s ability to produce accurate and grammatically correct report impressions can be improved.

Findings: We achieved better BLEU-1 and BLEU-2 scores than the prior research selected for comparison. Our proposed model achieves the following BLEU scores: BLEU-1 = 0.592, BLEU-2 = 0.422, BLEU-3 = 0.298, BLEU-4 = 0.205.

Novelty: We propose a Transformer model to generate report impressions. The model uses a CNN as the encoder and an RNN as the decoder, with an attention mechanism on top. Additionally, greedy beam search is used to obtain grammatically correct sentences.
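The decoding step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy next-token table, the vocabulary, and the function names `toy_next_token_probs` and `beam_search` are hypothetical and stand in for a trained decoder. The sketch shows plain beam search over log-probabilities, with a beam width of 1 reducing to greedy decoding, which may be what "greedy beam search" refers to.

```python
import math
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]


def toy_next_token_probs(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical stand-in for a trained decoder.

    A real system would run the RNN decoder with attention over CNN image
    features; here a fixed lookup table returns next-token probabilities.
    """
    table = {
        ("<start>",): {"the": 0.6, "no": 0.4},
        ("<start>", "the"): {"lungs": 0.55, "heart": 0.45},
        ("<start>", "no"): {"acute": 1.0},
        ("<start>", "no", "acute"): {"disease": 1.0},
    }
    # Any prefix not in the table simply ends the sentence.
    return table.get(prefix, {"<end>": 1.0})


def beam_search(
    next_probs: Callable[[Prefix], Dict[str, float]],
    beam_width: int = 3,
    max_len: int = 10,
) -> List[str]:
    """Return the highest-scoring token sequence found by beam search.

    Scores are summed log-probabilities (no length normalization).
    With beam_width=1 this is ordinary greedy decoding.
    """
    beams: List[Tuple[float, Prefix]] = [(0.0, ("<start>",))]
    finished: List[Tuple[float, Prefix]] = []

    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Tuple[float, Prefix]] = []
        for score, seq in beams:
            for token, p in next_probs(seq).items():
                if p > 0.0:
                    candidates.append((score + math.log(p), seq + (token,)))

        # Keep only the best `beam_width` expansions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == "<end>":
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break

    # Hypotheses cut off at max_len still compete with finished ones.
    finished.extend(beams)
    _, best_seq = max(finished, key=lambda c: c[0])
    return [t for t in best_seq if t not in ("<start>", "<end>")]


greedy = beam_search(toy_next_token_probs, beam_width=1)
wider = beam_search(toy_next_token_probs, beam_width=2)
print(greedy)  # ['the', 'lungs']
print(wider)   # ['no', 'acute', 'disease']
```

In this toy table the greedy path commits to the locally most likely first token and ends with a total probability of 0.33, while a beam of width 2 keeps the second candidate alive and finds a sequence with probability 0.4. A wider beam trades extra computation for a better chance of finding a higher-probability sentence.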

Keywords: Chest X-Ray; Transformers; Open-I CXR; CNN; RNN


  1. Demner-Fushman D, Antani S, Simpson M, Thoma GR. Design and Development of a Multimodal Biomedical Information Retrieval System. Journal of Computing Science and Engineering. 2012. Available from: https://lhncbc.nlm.nih.gov/LHC-publications/PDF/pub2012019.pdf
  2. Li J, Li S, Hu Y, Tao H. A Self-guided Framework for Radiology Report Generation. Lecture Notes in Computer Science. 2022;p. 588–598. Available from: https://arxiv.org/abs/2206.09378
  3. Yan A, He Z, Lu X, Du J, Chang E, Gentili A, et al. Weakly Supervised Contrastive Learning for Chest X-Ray Report Generation. Findings of the Association for Computational Linguistics: EMNLP 2021. 2021. Available from: https://arxiv.org/abs/2109.12242
  4. Sirshar M, Paracha MFK, Akram MU, Alghamdi NS, Zaidi SZY, Fatima T. Attention based automated radiology report generation using CNN and LSTM. PLOS ONE. 2022;17(1):e0262209. Available from: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0262209
  5. Jing B, Xie P, Xing E. On the Automatic Generation of Medical Imaging Reports. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 2018. Available from: https://doi.org/10.18653/v1/P18-1240
  6. Li Y, Wang H, Luo Y. A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports. 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2020. Available from: https://arxiv.org/abs/2009.01523
  7. Liu G, Hsu TMH, Mcdermott M, Boag W, Weng WH, Szolovits P, et al. Clinically Accurate Chest X-Ray Report Generation. 2019. Available from: https://doi.org/10.48550/arXiv.1904.02633
  8. Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T, et al. CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning. Computer Vision and Pattern Recognition. 2017. Available from: https://doi.org/10.48550/arXiv.1711.05225
  9. Vaswani A, Shazeer N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, et al. Attention Is All You Need. 2017. Available from: https://doi.org/10.48550/arXiv.1706.03762
  10. Papineni K, Roukos S, Ward T, Zhu WJ. BLEU: a Method for Automatic Evaluation of Machine Translation. 2002. Available from: https://aclanthology.org/P02-1040.pdf


2023 Navuduri & Kulkarni. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)

