P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2023, Volume: 16, Issue: 35, Pages: 2894-2901

Original Article

Morphology-Assisted Sindhi Text Analysis for Natural Language Processing Applications

Received Date: 10 July 2023, Accepted Date: 02 August 2023, Published Date: 25 September 2023

Abstract

Objectives: Morphology, the branch of linguistics concerned with word formation and internal word structure, is essential to understanding how Sindhi words are built. This study examines Sindhi morphology with particular attention to its structure, function, nature, word categories, and writing system. Natural Language Processing (NLP) relies on morphological analysis to identify words and their grammatical features, enabling applications such as spell checkers and machine translation. A comparative analysis is carried out to understand how Sindhi morphology developed. Because existing research on morphological analysis lacks a proper classification and covers both modern and conventional methodologies, the variations in Sindhi morphology make computerization difficult.

Methods: Morphological analysis, which is central to NLP applications such as spell checkers and machine translation, studies word formation and structure in terms of morphemes, the smallest grammatical units of a language. Morphemes are the building blocks of words and are divided into free and bound morphemes.

Findings: Sindhi's rich and complex morphology accommodates the borrowing and lending of words, but ambiguity is high because of similar word patterns and vowel deletions. Morphological analysis influences both semantic and syntactic analysis. Computerization is challenging because of the positions of prefixes, suffixes, and stems. Primary and secondary words can be further subdivided into compound and complex words. The writing system uses initial, medial, and final letter forms.

Novelty: This research aims to develop an automatic Sindhi morphological analyzer for future NLP applications that is compatible with existing information technology applications. It will aid the understanding of Sindhi word structure and help software developers build Sindhi natural language and speech processing applications.
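As an illustration of the kind of analysis described above, the following minimal Python sketch splits a word into a prefix, a stem (a free morpheme), and a suffix by simple affix stripping. It is not the analyzer proposed in the article: the affix lists are hypothetical English toy data chosen only for readability, and the names `PREFIXES`, `SUFFIXES`, `Analysis`, `analyze`, and `min_stem` are assumptions introduced here.

```python
from typing import NamedTuple

# Hypothetical toy affix lists (English, for illustration only; not Sindhi data).
PREFIXES = ("un", "re")
SUFFIXES = ("ness", "ing", "ed", "s")


class Analysis(NamedTuple):
    prefix: str
    stem: str
    suffix: str


def analyze(word: str, min_stem: int = 3) -> Analysis:
    """Strip at most one prefix and one suffix, keeping a stem of at least min_stem letters."""
    prefix = suffix = ""
    stem = word
    # Try each known prefix; accept the first that leaves a long enough stem.
    for p in PREFIXES:
        if stem.startswith(p) and len(stem) - len(p) >= min_stem:
            prefix, stem = p, stem[len(p):]
            break
    # Try each known suffix in the same way on what remains.
    for s in SUFFIXES:
        if stem.endswith(s) and len(stem) - len(s) >= min_stem:
            suffix, stem = s, stem[:-len(s)]
            break
    return Analysis(prefix, stem, suffix)


print(analyze("unkindness"))  # Analysis(prefix='un', stem='kind', suffix='ness')
print(analyze("replayed"))    # Analysis(prefix='re', stem='play', suffix='ed')
print(analyze("book"))        # Analysis(prefix='', stem='book', suffix='')
```

The `min_stem` guard is a crude way to avoid over-stripping short words such as "red". A greedy approach like this cannot resolve the ambiguity the abstract mentions, where similar surface patterns admit more than one segmentation; a real analyzer would need a lexicon and language-specific rules.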

Keywords: Sindhi Morphology; Morphological Analysis; NLP; Communication and Information Sharing; Machine Learning


Copyright

© 2023 Sodhar et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)
