Date of Award
5-1-2025
Degree Name
Doctor of Philosophy
Department
Electrical and Computer Engineering
First Advisor
Lu, Chao
Abstract
In recent years, the development of the back-propagation algorithm has led to the emergence of numerous object detectors and has significantly raised their visibility. State-of-the-art architectures now enable the learning and prediction of instance locations with remarkable accuracy. Most of these detectors rely on segmentation technologies to directly connect raw image pixels and object coordinates. However, this approach tends to be computationally expensive. This study introduces "Selection", a novel two-stage online training algorithm based on self-supervised learning and Convolutional Neural Networks (CNNs), which demonstrates substantial improvements in the efficiency of object detection compared to the conventional end-to-end regression structure using ground truth box coordinates. Selection establishes a bridge between Instance and Semantic Segmentation for the first time, overcoming mask training limitations imposed by contour polygons in instance box ground truth annotations. Without relying on an additional mask task branch, Selection leverages edge detection technology to extract object contour features directly from image representations. This approach effectively avoids distractions from unrelated tasks, such as confidence rate prediction, ensuring robustness during training while conserving computational resources. In addition, Selection employs a clustering concept to associate the edge feature locations of objects within proximity of one another in the neural network outputs. Furthermore, a highly parallel clustering algorithm, Pyramid, has been developed and embedded within the Selection framework. It accumulates contour features into a converged point pattern and generates valid instance centers based on corresponding density rates, without requiring additional online training. Pyramid outperforms six existing clustering algorithms in both Average Precision (AP) and time efficiency.
Unlike traditional methods, this approach not only offers greater flexibility but also makes the learning process more interpretable from a human perspective. The Selection framework and the Pyramid clustering algorithm have been validated on the COCO dataset, achieving a 72.8% mean Average Precision on a random batch of 5,000 images while utilizing 44.3 million parameters. Compared to Faster R-CNN, Selection demonstrates a 1.3% improvement in AP and a 20% reduction in parameter count. These results strongly demonstrate the superior regression performance of Selection.
Access
This dissertation is available for download only to the SIUC community. Current SIUC affiliates may also access this paper off campus by searching Dissertations & Theses @ Southern Illinois University Carbondale from ProQuest. Others should contact the interlibrary loan department of their local library or contact ProQuest's Dissertation Express service.