Dissertations

CNN MODEL FOR RECOGNITION OF TEXT-BASED CAPTCHAS AND ANALYSIS OF LEARNING BASED ALGORITHMS’ VULNERABILITIES TO VISUAL DISTORTION

Noorbakhsh Amiri Golilarz, Southern Illinois University CarbondaleFollow

Date of Award

5-1-2023

Degree Name

Doctor of Philosophy

Department

Electrical and Computer Engineering

First Advisor

Sayeh, Mohammad

Abstract

Due to the rapid progress and advancements in deep learning and neural networks, manyapproaches and state-of-the-art researches have been conducted in these fields which cause developing various learning-based attacks leading to vulnerability of websites and portals. This kind of attacks decrease the security of the websites which results in releasing the sensitive and important personal information. These days, preserving the security of the websites is one of the most challenging tasks. CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) is kind of test which are developed by designers and are available in various websites to distinguish and differentiate humans from robots in order to protect the websites from possible attacks. In this dissertation, we proposed a CNN based approach to attack and break text-based CAPTCHAs. The proposed method has been compared with several state-of-the-art approaches in terms of recognition accuracy (RA). Based on the results, the developed method can break and recognize CAPTCHAs at high accuracy. Additionally, we wanted to check how to make these CAPTCHAs hard to be broken, so we employed five types of distortions in these CAPTCHAs. The recognition accuracy in presence of these noises has been calculated. The results indicate that adversarial noise can make CAPTCHAs much difficult to be broken. The results have been compared with some state-of-the-art approaches. This analysis can be helpful for CAPTCHA developers to consider these noises in their developed CAPTCHAs. This dissertation also presents a hybrid model based on CNN-SVM to solve text-based CAPTCHAs. The developed method contains four main steps, namely: segmentation, feature extraction, feature selection, and recognition. For segmentation, we suggested using histogram and k-means clustering. For feature extraction, we developed a new CNN structure. The extracted features are passed through the mRMR algorithm to select the most efficient features. These selected features are fed into SVM for further classification and recognition. The results have been compared with several state-of-the-art methods to show the superiority of the developed approach. In general, this dissertation presented deep learning-based methods to solve text-based CAPTCHAs. The efficiency and effectiveness of the developed methods have been compared with various state-of-the-art methods. The developed techniques can break CAPTCHAs at high accuracy and also in a short time. We utilized Peak Signal to Noise Ratio (PSNR), ROC, accuracy, sensitivity, specificity, and precision to evaluate and measure the performance analysis of different methods. The results indicate the superiority of the developed methods.

Download

Available for download on Tuesday, July 11, 2028

COinS

Access

This dissertation is only available for download to the SIUC community. Current SIUC affiliates may also access this paper off campus by searching Dissertations & Theses @ Southern Illinois University Carbondale from ProQuest. Others should contact the interlibrary loan department of your local library or contact ProQuest's Dissertation Express service.

OpenSIUC

Dissertations

CNN MODEL FOR RECOGNITION OF TEXT-BASED CAPTCHAS AND ANALYSIS OF LEARNING BASED ALGORITHMS’ VULNERABILITIES TO VISUAL DISTORTION

Date of Award

Degree Name

Department

First Advisor

Abstract

Access

Links

Browse

Author Corner

OpenSIUC

Dissertations

CNN MODEL FOR RECOGNITION OF TEXT-BASED CAPTCHAS AND ANALYSIS OF LEARNING BASED ALGORITHMS’ VULNERABILITIES TO VISUAL DISTORTION

Author

Date of Award

Degree Name

Department

First Advisor

Abstract

Share

Access

Links

Browse

Author Corner