Indian Journal of Science and Technology
Year: 2020, Volume: 13, Issue: 13, Pages: 1390-1400
Asadullah Kehar1*, Rafaqat Hussain Arain1, Riaz Ahmed Shaikh1
1Department of Computer Science, Shah Abdul Latif University, Khairpur, Pakistan
*Author for correspondence
Department of Computer Science, Shah Abdul Latif University, Khairpur, Pakistan
Email: [email protected]
Received Date: 03 April 2020, Accepted Date: 23 April 2020, Published Date: 16 May 2020
Background: CAPTCHA is a mechanism for distinguishing humans from bots. It has become a standard means of protecting resources on the World Wide Web from misuse. Several types of CAPTCHA have been implemented, but text-based schemes are the most widely used because of their ease of use and robustness. The user is asked to type the text shown in an image that is intentionally distorted to defeat bots. Recognizing the text is easy for humans but very hard for computers. Method/Findings: In this work, a text-based CAPTCHA scheme with background clutter and partially connected characters is decoded. The main steps consist of preprocessing, segmentation and recognition. Several digital image processing techniques were applied during the preprocessing and segmentation steps, and a convolutional neural network (CNN) was used for recognition. Because a CNN requires a large amount of training data, the data was generated synthetically. A complex text-based CAPTCHA scheme with a varying number of letters (3, 4 and 5) is decoded with an overall precision of 77.5%, 64.2% and 51.9%, respectively.
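The preprocessing and segmentation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which image processing techniques were used, so the fixed threshold, the vertical-projection split and every function name below (binarize, column_profile, segment_columns, crop) are assumptions chosen for illustration. The recognition step (the CNN) is omitted entirely.

```python
# Illustrative sketch only; all names and parameters here are hypothetical.

def binarize(gray, threshold=128):
    """Preprocessing: map dark pixels (ink) to 1 and light background to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def column_profile(binary):
    """Count the ink pixels in every column (vertical projection)."""
    return [sum(col) for col in zip(*binary)]

def segment_columns(binary, min_ink=1):
    """Segmentation: return (start, end) column spans, end exclusive,
    covering runs of columns that hold at least min_ink ink pixels."""
    spans, start = [], None
    for x, ink in enumerate(column_profile(binary)):
        if ink >= min_ink and start is None:
            start = x                      # a run of ink columns begins
        elif ink < min_ink and start is not None:
            spans.append((start, x))       # the run ended at column x
            start = None
    if start is not None:                  # run touches the right edge
        spans.append((start, len(binary[0])))
    return spans

def crop(binary, span):
    """Cut one candidate character out of the binary image."""
    start, end = span
    return [row[start:end] for row in binary]

# Toy 5x12 grayscale image: 255 is background, 0 is ink.
toy = [[255] * 12 for _ in range(5)]
for x in (1, 2, 5, 6, 7, 10):
    for y in range(1, 4):
        toy[y][x] = 0

binary = binarize(toy)
spans = segment_columns(binary)
glyphs = [crop(binary, s) for s in spans]
print(spans)  # [(1, 3), (5, 8), (10, 11)]
```

A plain projection split like this only separates characters that have an empty column between them. Background clutter and partially connected characters, as in the scheme studied here, would merge or fragment the spans, which is why additional processing is needed before each cropped glyph can be passed to a classifier.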
Keywords: CAPTCHAs; HIPs; image processing; machine learning; CNN
Copyright: © 2020 Kehar, Arain, Shaikh. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Published By Indian Society for Education and Environment (iSee)