Edge intelligence pushes intelligent data processing using deep neural networks (DNNs) to the edge of the network, closer to data sources. It enables applications across many fields and has garnered significant attention from both industry and academia. However, the limited resources of edge platforms, such as edge servers and Internet of Things devices, hinder the delivery of fast and accurate responses to deep learning prediction queries. As a result, only a subset of deep learning tasks, and only smaller DNN models suited to edge deployment, are currently feasible.
To overcome this limitation, this project explores a new adaptive approach to building deep learning systems. The systems will adjust, in real time, which DNNs are executed for prediction tasks based on the varying resource demands arising from three critical dimensions: variable task complexity, fluctuating inference workloads, and resource contention in multi-tenant edge environments. The goal is to optimize both system efficiency and accuracy. Realizing the envisioned adaptiveness will facilitate the effective deployment of deep learning techniques across diverse applications and environments.
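One way to make the three adaptation dimensions concrete is runtime selection among DNN variants of different sizes. The sketch below is purely illustrative and is not the project's actual method: the model zoo, the latency/accuracy numbers, and the heuristics combining task complexity, queue depth, and contention are all assumptions introduced here for exposition.

```python
# Hypothetical sketch: pick a DNN variant per query based on task complexity,
# inference workload (queue depth), and contention (available compute share).
from dataclasses import dataclass

@dataclass
class ModelVariant:
    name: str
    accuracy: float    # expected accuracy (fraction); illustrative numbers
    latency_ms: float  # expected uncontended inference latency on the device

# Illustrative model zoo, smallest to largest (all figures are made up).
ZOO = [
    ModelVariant("tiny",   0.80,  5.0),
    ModelVariant("small",  0.87, 12.0),
    ModelVariant("medium", 0.92, 30.0),
    ModelVariant("large",  0.96, 80.0),
]

def select_variant(task_complexity: float,
                   queue_depth: int,
                   available_compute: float,
                   deadline_ms: float = 50.0) -> ModelVariant:
    """Choose the cheapest sufficiently accurate variant that meets the deadline.

    task_complexity in [0, 1] scales the accuracy a query is assumed to need;
    queue_depth and available_compute (the fraction of device compute not
    taken by co-tenants) inflate effective latency. Heuristics are illustrative.
    """
    feasible = []
    for m in ZOO:
        # Contention and queuing stretch the per-query latency.
        effective = m.latency_ms / max(available_compute, 0.1) * (1 + 0.2 * queue_depth)
        if effective <= deadline_ms:
            feasible.append(m)
    if not feasible:
        return ZOO[0]  # degrade gracefully to the smallest model
    # Require accuracy proportional to task complexity; if no feasible model
    # is accurate enough, take the most accurate feasible one.
    needed = 0.75 + 0.2 * task_complexity
    adequate = ([m for m in feasible if m.accuracy >= needed]
                or [max(feasible, key=lambda m: m.accuracy)])
    return min(adequate, key=lambda m: m.latency_ms)
```

Under this sketch, a simple query on an idle device maps to the smallest adequate model, while heavy contention or deep queues force the selector toward smaller variants even for complex inputs, trading accuracy for responsiveness.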