Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms
| dc.contributor.author | Taufique, Zain | |
| dc.contributor.author | Vyas, Aman | |
| dc.contributor.author | Miele, Antonio | |
| dc.contributor.author | Liljeberg, Pasi | |
| dc.contributor.author | Kanduri, Anil | |
| dc.contributor.organization | fi=terveysteknologia|en=Health Technology| | |
| dc.contributor.organization-code | 1.2.246.10.2458963.20.28696315432 | |
| dc.converis.publication-id | 477606436 | |
| dc.converis.url | https://research.utu.fi/converis/portal/Publication/477606436 | |
| dc.date.accessioned | 2025-08-28T01:19:36Z | |
| dc.date.available | 2025-08-28T01:19:36Z | |
| dc.description.abstract | There is an increasing demand to run DNN applications on edge platforms for low-latency inference. Executing multi-DNN workloads with diverse compute and latency requirements on resource-constrained heterogeneous edge platforms poses a significant scheduling challenge. In this work, we present the Tango framework for orchestrating multi-DNN inference on heterogeneous edge platforms. Our approach uses a Proximal Policy-based Reinforcement Learning agent to jointly optimize cluster selection, accuracy configuration, and frequency scaling to minimize inference latency with a tolerable accuracy loss. We implemented the proposed Tango framework as a portable middleware and deployed it on real Jetson TX edge hardware. Our evaluation against relevant multi-DNN scheduling strategies demonstrates 61% lower latency and 48.4% lower energy consumption at a maximum accuracy loss of 1.59%. | |
| dc.embargo.lift | 2027-01-02 | |
| dc.format.pagerange | 300-307 | |
| dc.identifier.eisbn | 979-8-3503-8040-8 | |
| dc.identifier.isbn | 979-8-3503-8041-5 | |
| dc.identifier.issn | 1063-6404 | |
| dc.identifier.jour-issn | 1063-6404 | |
| dc.identifier.olddbid | 207390 | |
| dc.identifier.oldhandle | 10024/190417 | |
| dc.identifier.uri | https://www.utupub.fi/handle/11111/51251 | |
| dc.identifier.url | https://ieeexplore.ieee.org/document/10817997 | |
| dc.identifier.urn | URN:NBN:fi-fe2025082791616 | |
| dc.language.iso | en | |
| dc.okm.affiliatedauthor | Taufique, Zain | |
| dc.okm.affiliatedauthor | Vyas, Aman Manishbhai | |
| dc.okm.affiliatedauthor | Liljeberg, Pasi | |
| dc.okm.affiliatedauthor | Kanduru, Srinivasa | |
| dc.okm.discipline | 213 Electronic, automation and communications engineering, electronics | en_GB |
| dc.okm.discipline | 213 Sähkö-, automaatio- ja tietoliikennetekniikka, elektroniikka | fi_FI |
| dc.okm.internationalcopublication | international co-publication | |
| dc.okm.internationality | International publication | |
| dc.okm.type | A4 Conference Article | |
| dc.publisher.country | United States | en_GB |
| dc.publisher.country | Yhdysvallat (USA) | fi_FI |
| dc.publisher.country-code | US | |
| dc.relation.conference | IEEE International Conference on Computer Design | |
| dc.relation.doi | 10.1109/ICCD63220.2024.00053 | |
| dc.relation.ispartofjournal | Proceedings : IEEE International Conference on Computer Design | |
| dc.relation.volume | 42 | |
| dc.source.identifier | https://www.utupub.fi/handle/10024/190417 | |
| dc.title | Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms | |
| dc.title.book | 2024 IEEE 42nd International Conference on Computer Design (ICCD) | |
| dc.year.issued | 2024 |