Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms
Taufique, Zain; Vyas, Aman; Miele, Antonio; Liljeberg, Pasi; Kanduri, Anil
This article/publication has not been deposited in UTUPub. However, the publication record may contain a link to a copy of the article/publication deposited elsewhere.
https://urn.fi/URN:NBN:fi-fe2025082791616
Abstract
There is growing demand to run DNN applications on edge platforms for low-latency inference. Executing multi-DNN workloads with diverse compute and latency requirements on resource-constrained heterogeneous edge platforms poses a significant scheduling challenge. In this work, we present Tango, a framework for orchestrating multi-DNN inference on heterogeneous edge platforms. Our approach uses a Proximal Policy-based Reinforcement Learning agent to jointly optimize cluster selection, accuracy configuration, and frequency scaling, minimizing inference latency within a tolerable accuracy loss. We implemented Tango as a portable middleware and deployed it on real hardware, the Jetson TX edge platform. Our evaluation against relevant multi-DNN scheduling strategies demonstrates 61% lower latency and 48.4% lower energy consumption at a maximum accuracy loss of 1.59%.
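To make the joint decision described above concrete, the following is a minimal sketch of how a combined action space over cluster, accuracy configuration, and frequency could be enumerated, together with a reward that favors low latency while penalizing accuracy loss beyond a tolerance. The abstract does not specify the action encoding, the reward, or any concrete values, so every name and number here (`CLUSTERS`, `ACCURACY_LEVELS`, `FREQS_MHZ`, `max_loss_pct`, `penalty`) is a hypothetical assumption for illustration, not Tango's actual design.

```python
from itertools import product
from typing import List, NamedTuple


class Action(NamedTuple):
    """One joint scheduling decision for a single DNN inference request."""
    cluster: str          # which compute cluster runs the model
    accuracy_level: int   # index of an accuracy configuration (0 = full accuracy)
    freq_mhz: int         # operating frequency chosen for that cluster


# Hypothetical option sets; a real platform exposes its own clusters and frequencies.
CLUSTERS = ("cpu_big", "cpu_little", "gpu")
ACCURACY_LEVELS = (0, 1, 2)
FREQS_MHZ = (500, 1000, 1500)


def joint_actions() -> List[Action]:
    """Enumerate every (cluster, accuracy, frequency) combination."""
    return [Action(c, a, f) for c, a, f in product(CLUSTERS, ACCURACY_LEVELS, FREQS_MHZ)]


def reward(latency_ms: float, accuracy_loss_pct: float,
           max_loss_pct: float = 2.0, penalty: float = 100.0) -> float:
    """Assumed reward shape: negative latency, with an extra penalty that
    grows linearly once accuracy loss exceeds the tolerated maximum."""
    r = -latency_ms
    if accuracy_loss_pct > max_loss_pct:
        r -= penalty * (accuracy_loss_pct - max_loss_pct)
    return r
```

A policy-gradient agent would pick one index into `joint_actions()` per request and receive `reward(...)` after the measured inference; treating the three knobs as one action is what lets the agent trade them off against each other rather than tuning each in isolation.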