Date of Award

8-1-2025

Degree Name

Doctor of Philosophy

Department

Electrical and Computer Engineering

First Advisor

Anagnostopoulos, Iraklis

Abstract

With the widespread adoption of Deep Neural Networks (DNNs) in modern embedded applications, computationally intensive and power-hungry workloads have increased substantially, necessitating more sophisticated approaches to deploying these models efficiently on resource-constrained devices. The inherent limitations of edge computing platforms, including restricted memory capacity, limited processing capabilities, and stringent power constraints, pose substantial challenges for the deployment of complex DNN workloads. While recent advances in model compression and hardware acceleration have partially addressed these challenges, there remains a critical need for comprehensive solutions that systematically optimize both the pre-deployment design and runtime management of DNNs on heterogeneous edge systems. This dissertation addresses that need with an approach to DNN deployment at the edge that covers both offline model design and online resource management.
Specifically, this dissertation focuses on four key areas: (i) dimensionality reduction through latent imagination for efficient AI-powered computer vision, where our methodology achieves an improvement of over 45% in prediction accuracy; (ii) a composite reinforcement learning controller for joint DNN pruning and quantization, which achieves a 39% average energy reduction with only a 1.7% average accuracy loss; (iii) efficient multi-DNN management via DNN partitioning for heterogeneous embedded systems, where our approach yields a 4.6× average throughput improvement for multi-DNN workloads; and (iv) multi-objective optimization methodologies for balancing competing performance metrics in multi-DNN deployment scenarios. The fourth area comprises three complementary frameworks that address throughput-fairness balancing via reinforcement learning, throughput-power efficiency co-optimization through heterogeneity-aware techniques, and priority-aware resource allocation, collectively yielding significant improvements in system performance and responsiveness. In summary, this dissertation provides a set of methodologies that enable the efficient deployment of complex DNN workloads on edge devices, systematically addressing the challenges of pre-deployment optimization and runtime management. The demonstrated results highlight the effectiveness of the proposed solutions and their potential for practical application across various edge computing environments.

Access

This dissertation is only available for download to the SIUC community. Current SIUC affiliates may also access this paper off campus by searching Dissertations & Theses @ Southern Illinois University Carbondale from ProQuest. Others should contact the interlibrary loan department of your local library or contact ProQuest's Dissertation Express service.