Date of Award
Doctor of Philosophy
Electrical and Computer Engineering
The Internet of Things (IoT) computing paradigm has connected smart objects “things” and has brought new services at the proximity of the user. Edge Computing, a natural evolution of the traditional IoT, has been proposed to deal with the ever-increasing (i) number of IoT devices and (ii) the amount of data traffic that is produced by the IoT endpoints. EC promises to significantly reduce the unwanted latency that is imposed by the multi-hop communication delays and suggests that instead of uploading all the data to the remote cloud for further processing, it is beneficial to perform computation at the “edge” of the network, close to where the data is produced. However, bringing computation at the edge level has created numerous challenges as edge devices struggle to keep up with the growing application requirements (e.g. Neural Networks, or video-based analytics). In this thesis, we adopt the EC paradigm and we aim at addressing the open challenges. Our goal is to bridge the performance gap that is caused by the increased requirements of the IoT applications with respect to the IoT platform capabilities and provide latency- and energy-efficient computation at the edge level. Our first step is to study the performance of IoT applications that are based on Deep Neural Networks (DNNs). The exploding need to deploy DNN-based applications on resource-constrained edge devices has created several challenges, mainly due to the complex nature of DNNs. DNNs are becoming deeper and wider in order to fulfill users expectations for high accuracy, while they also become power hungry. For instance, executing a DNN on an edge device can drain the battery within minutes. Our solution to make DNNs more energy and inference friendly is to propose hardware-aware method that re-designs a given DNN architecture. Instead of proxy metrics, we measure the DNN performance on real edge devices and we capture their energy and inference time. Our method manages to find alternative DNN architectures that consume up to 78.82% less energy and are up to35.71% faster than the reference networks. In order to achieve end-to-end optimal performance, we also need to manage theedge device resources that will execute a DNN-based application. Due to their unique characteristics, we distinguish the edge devices into two categories: (i) a neuromorphic platform that is designed to execute Spiking Neural Networks (SNNs), and (ii) a general-purpose edge device that is suitable to host a DNN. For the first category, we train a traditional DNN and then we convert it to a spiking representation. We target the SpiNNaker neuromorphic platform and we develop a novel technique that efficiently configures the platform-dependent parameters, in order to achieve the highest possible SNN accuracy.Experimental results show that our technique is 2.5× faster than an exhaustive approach and can reach up to 0.8% higher accuracy compared to a CPU-based simulation method. Regarding the general-purpose edge devices, we show that a DNN-unaware platform can result in sub-optimal DNN performance in terms of power and inference time. Our approachconfigures the frequency of the device components (GPU, CPU, Memory) and manages to achieve average of 33.4% and up to 66.3% inference time improvements and an average of 42.8% and up to 61.5% power savings compared to the predefined configuration of an edge device. The last part of this thesis is the offloading optimization between the edge devicesand the gateway. The offloaded tasks create contention effects on gateway, which can lead to application slowdown. Our proposed solution configures (i) the number of application stages that are executed on each edge device, and (ii) the achieved utility in terms of Quality of Service (QoS) on each edge device. Our technique manages to (i) maximize theoverall QoS, and (ii) simultaneously satisfy network constraints (bandwidth) and user expectations (execution time). In case of multi-gateway deployments, we tackled the problem of unequal workload distribution. In particular, we propose a workload-aware management scheme that performs intra- and inter-gateway optimizations. The intra-gateway mechanism provides a balanced execution environment for the applications, and it achieves up to 95% performance deviation improvement, compared to un-optimized systems. The presented inter-gateway method manages to balance the workload among multiple gateways and is able to achieve a global performance threshold.
This dissertation is only available for download to the SIUC community. Current SIUC affiliates may also access this paper off campus by searching Dissertations & Theses @ Southern Illinois University Carbondale from ProQuest. Others should contact the interlibrary loan department of your local library or contact ProQuest's Dissertation Express service.