Indian Journal of Science and Technology
Year: 2020, Volume: 13, Issue: 39, Pages: 4189-4201
Muhammad Yaseen Khan1*, Muhammad Adil Rao2, Shaukat Wasi1, Twaha Ahmed Minai2, Syed Muhammad Khaliq-ur-Rahman Raazi1
1 Center for Language Computing, FoC, Mohammad Ali Jinnah University, Karachi, Pakistan
2 Department of Computer Science, DHA Suffa University, Karachi, Pakistan
Email: [email protected]
Received Date: 25 August 2020, Accepted Date: 24 September 2020, Published Date: 09 November 2020
Background: Prosody (rhyming words) is an inherent element of poetry across thousands of the world's languages. Since the medieval era, Indic poetry (principally Hindi/Urdu poetry) has made a striking impression through its subjects, styles, and other creative aspects. Beyond the message of heartfelt poetry, the Qafiya (i.e., the rhyming words) is the core element, without which a text may not be considered Hindi/Urdu poetry but merely a piece of writing; alongside it, the Radif (i.e., a phrasal suffix to the qafiya) is considered close to intrinsic in Ghazals. In this regard, the contributions of this paper are, first, the development of an optimal technique for prosodic (qafiya) suggestion/retrieval in Hindi/Urdu poetry and, second, qafiya suggestions based on the subsequent attached radif.
Methods: The work uses a tri-script poetry corpus of 13.46 M tokens. Instead of phonetic value matching, the proposed methodology employs four different edit distances (i.e., Levenshtein, Damerau–Levenshtein, Jaro–Winkler, and Hamming distance) as the comparison measures for prosodic suggestions.
Findings: The proposed work shows better results than the 'Qaafiya Dictionary' powered by rekhta.org. Moreover, with respect to inter-metric similarity and running time, Jaro–Winkler appears to be the most suitable algorithm for rhyme suggestion, whereas the Levenshtein distance is the slowest technique.
Novelty/Applications: This work benefits researchers of Indic natural language processing for lexical look-ups and the analysis of creative literature, especially poetry.
Keywords: Natural language processing, information retrieval, poetry, prosody, Hindi, Urdu
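To make the Methods above concrete, the following is a minimal Python sketch of how two of the named measures (Levenshtein and Hamming distance) could rank candidate rhymes for a query word. It is an illustration under stated assumptions, not the authors' implementation: the function `suggest`, the romanized toy vocabulary, and the choice to compare whole words and break ties alphabetically are all hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def hamming(a: str, b: str) -> int:
    """Number of differing positions; defined only for equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(ca != cb for ca, cb in zip(a, b))


def suggest(query: str, vocabulary: list[str], k: int = 3) -> list[str]:
    """Hypothetical helper: the k vocabulary words closest to `query`.

    Assumption: whole-word Levenshtein distance, ties broken alphabetically.
    """
    candidates = [w for w in vocabulary if w != query]
    return sorted(candidates, key=lambda w: (levenshtein(query, w), w))[:k]


# Toy romanized vocabulary, invented for illustration only.
words = ["mil", "dil", "khil", "manzil", "dard"]
print(suggest("dil", words, k=2))
```

Swapping `levenshtein` for another measure in the sort key is enough to compare how the rankings change from one metric to the next.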
© 2020 Khan et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee).