Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms

dc.contributor.authorTaufique, Zain
dc.contributor.authorVyas, Aman
dc.contributor.authorMiele, Antonio
dc.contributor.authorLiljeberg, Pasi
dc.contributor.authorKanduri, Anil
dc.contributor.organizationfi=terveysteknologia|en=Health Technology|
dc.contributor.organization-code1.2.246.10.2458963.20.28696315432
dc.converis.publication-id477606436
dc.converis.urlhttps://research.utu.fi/converis/portal/Publication/477606436
dc.date.accessioned2025-08-28T01:19:36Z
dc.date.available2025-08-28T01:19:36Z
dc.description.abstractThere is an increasing demand to run DNN applications on edge platforms for low-latency inference. Executing multi-DNN workloads with diverse compute and latency requirements on resource-constrained heterogeneous edge platforms poses a significant scheduling challenge. In this work, we present the Tango framework for orchestrating multi-DNN inference on heterogeneous edge platforms. Our approach uses a Proximal Policy-based Reinforcement Learning agent to jointly optimize cluster selection, accuracy configuration, and frequency scaling to minimize inference latency with a tolerable accuracy loss. We implemented the Tango framework as a portable middleware and deployed it on real Jetson TX edge platform hardware. Our evaluation against relevant multi-DNN scheduling strategies demonstrates 61% lower latency and 48.4% lower energy consumption at a maximum accuracy loss of 1.59%.
dc.embargo.lift2027-01-02
dc.format.pagerange300
dc.format.pagerange307
dc.identifier.eisbn979-8-3503-8040-8
dc.identifier.isbn979-8-3503-8041-5
dc.identifier.issn1063-6404
dc.identifier.jour-issn1063-6404
dc.identifier.olddbid207390
dc.identifier.oldhandle10024/190417
dc.identifier.urihttps://www.utupub.fi/handle/11111/51251
dc.identifier.urlhttps://ieeexplore.ieee.org/document/10817997
dc.identifier.urnURN:NBN:fi-fe2025082791616
dc.language.isoen
dc.okm.affiliatedauthorTaufique, Zain
dc.okm.affiliatedauthorVyas, Aman Manishbhai
dc.okm.affiliatedauthorLiljeberg, Pasi
dc.okm.affiliatedauthorKanduru, Srinivasa
dc.okm.discipline213 Electronic, automation and communications engineering, electronicsen_GB
dc.okm.discipline213 Sähkö-, automaatio- ja tietoliikennetekniikka, elektroniikkafi_FI
dc.okm.internationalcopublicationinternational co-publication
dc.okm.internationalityInternational publication
dc.okm.typeA4 Conference Article
dc.publisher.countryUnited Statesen_GB
dc.publisher.countryYhdysvallat (USA)fi_FI
dc.publisher.country-codeUS
dc.relation.conferenceIEEE International Conference on Computer Design
dc.relation.doi10.1109/ICCD63220.2024.00053
dc.relation.ispartofjournalProceedings : IEEE International Conference on Computer Design
dc.relation.volume42
dc.source.identifierhttps://www.utupub.fi/handle/10024/190417
dc.titleTango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms
dc.title.book2024 IEEE 42nd International Conference on Computer Design (ICCD)
dc.year.issued2024
