Switches for HIRE: Resource Scheduling for Data Center In-Network Computing

Venue

ACM ASPLOS 2021

Authors

Marcel Blöcher, Lin Wang, Patrick Eugster, Max Schmidt

Abstract

The recent trend towards more programmable switching hardware in data centers opens up new possibilities for distributed applications to leverage in-network computing (INC). Literature so far has largely focused on individual application scenarios of INC, leaving aside the problem of coordinating usage of potentially scarce and heterogeneous switch resources among multiple INC scenarios, applications, and users. The traditional model of resource pools of isolated compute containers does not fit an INC-enabled data center.
This paper describes HIRE, a Holistic INC-aware Resource managEr that allows server-local and INC resources to be coordinated in a unified manner. HIRE introduces a novel flexible resource (meta-)model to address heterogeneity, resource interchangeability, and non-linear resource requirements, and integrates dependencies between resources and locations in a unified cost model, cast as a min-cost max-flow problem. In the absence of prior work, we compare HIRE against variants of state-of-the-art schedulers retrofitted to handle INC requests. Experiments with a workload trace of a 4000-machine cluster show that HIRE makes better use of INC resources by serving 8-30% more INC requests, while at the same time reducing network detours by 20% and tail placement latency by 50%.

Bibtex

@inproceedings{2021-asplos-hire,
 Author = {Marcel Blöcher and Lin Wang and Patrick Eugster and Max Schmidt},
 Title = {Switches for {HIRE}: Resource Scheduling for Data Center In-Network Computing},
 Booktitle = {Architectural Support for Programming Languages and Operating Systems ({ASPLOS})},
 Publisher = {{ACM}},
 ISBN = {9781450383172},
 DOI = {10.1145/3445814.3446760},
 Pages = {268--285},
 Numpages = {18},
 Year = {2021}
}