Deep Generative Models: A Review

Rayeesa Mehmood; Rumaan Bashir; Kaiser J Giri

doi:10.17485/IJST/v16i7.2296

Indian Journal of Science and Technology

Year: 2023, Volume: 16, Issue: 7, Pages: 460-467

Original Article

Rayeesa Mehmood¹, Rumaan Bashir¹*, Kaiser J Giri¹

¹Department of Computer Science, Islamic University of Science & Technology, Kashmir, Jammu and Kashmir, India

*Corresponding Author
Email: [email protected]

Received Date: 29 November 2022, Accepted Date: 17 January 2023, Published Date: 17 February 2023

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Objectives: To provide insight into deep generative models (DGMs) and to review the most prominent and efficient of them, namely Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs). Methods: We provide a comprehensive overview of VAEs and GANs along with their advantages and disadvantages. We also survey the recently introduced attention-based GANs and the still more recent Transformer-based GANs. Findings: GANs have been researched intensively because of their significant advantages over VAEs, and they are powerful generative models that have been widely employed in a variety of fields. Despite their immense popularity and success, however, training GANs remains difficult and prone to several failure modes. These include mode collapse, where the generator produces the same set of outputs for various inputs, resulting in a loss of diversity; non-convergence, caused by oscillating and diverging behavior of the generator and discriminator during training; and vanishing or exploding gradients, where learning either ceases or proceeds very slowly. Recently, attention-based GANs and Transformer-based GANs have also been proposed for high-fidelity image generation. Novelty: Unlike previous survey articles, which often cover all DGMs and dive into their complicated aspects, this work focuses on the two most prominent DGMs, VAEs and GANs, and provides a theoretical understanding of them. Because the GAN is now the DGM most extensively studied by the academic community, its literature warrants further exploration. Although numerous survey articles on GANs are available, none has analyzed the most recent attention-based and Transformer-based GANs, which this study reviews.
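The vanishing-gradient failure named in the Findings can be made concrete with a small numerical sketch. It is an illustration under stated assumptions, not a method taken from this review: the discriminator output is assumed to be D = sigmoid(a) for a logit a, the generator loss is assumed to be the minimax term log(1 - D(G(z))) from the original GAN formulation, and the non-saturating loss -log D(G(z)) is included only for comparison. The function names are hypothetical.

```python
import math


def sigmoid(a: float) -> float:
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def saturating_grad(a: float) -> float:
    """Gradient of the minimax generator loss log(1 - sigmoid(a)) w.r.t. the logit a.

    Analytically this equals -sigmoid(a), which tends to 0 as a becomes
    very negative, i.e. when the discriminator confidently rejects a sample.
    """
    return -sigmoid(a)


def non_saturating_grad(a: float) -> float:
    """Gradient of the non-saturating loss -log(sigmoid(a)) w.r.t. the logit a.

    Analytically this equals -(1 - sigmoid(a)), which tends to -1 in the
    same regime, so the generator still receives a usable signal.
    """
    return -(1.0 - sigmoid(a))


# Illustrative logits (assumed values): 0 means the discriminator is unsure,
# large negative values mean it is confident the sample is fake.
for a in (0.0, -2.0, -5.0, -10.0):
    print(f"logit={a:6.1f}  saturating={saturating_grad(a):+.6f}  "
          f"non-saturating={non_saturating_grad(a):+.6f}")
```

Under these assumptions, the saturating gradient shrinks toward zero exactly when the generator is performing worst, which is one way learning can slow down or cease.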

Keywords: Variational Autoencoder; Generative Adversarial Networks; Autoencoder; Transformer; Self-Attention

Copyright

© 2023 Mehmood et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published by Indian Society for Education and Environment (iSee).