A Highly Scalable, Hybrid, Cross-Platform Timing Analysis Framework Providing Accurate Differential Throughput Estimation via Instruction-Level Tracing
Differential throughput estimation, i.e., predicting the performance impact of software changes, is critical when developing applications that rely on accurate timing bounds, such as automotive, avionic, or industrial control systems. However, developers often lack access to the target hardware to perform on-device measurements, and hence rely on instruction throughput estimation tools to evaluate performance impacts.
State-of-the-art techniques broadly fall into two categories: dynamic and static. Dynamic approaches emulate program execution using cycle-accurate microarchitectural simulators, resulting in high precision at the cost of long turnaround times and convoluted setups. Static approaches reduce overhead by predicting cycle counts outside of a concrete runtime environment. However, they are limited by the lack of dynamic runtime information and mostly focus on predictions over single basic blocks, which requires developers to manually construct critical instruction sequences.
We present ABCD (the actual name has been redacted for anonymous review), a hybrid timing analysis framework that combines the advantages of dynamic and static approaches. Instead of relying on heavyweight cycle-accurate emulation, ABCD collects instruction traces along with dynamic runtime information from QEMU and streams them to a static throughput estimator. This allows developers to accurately estimate the performance impact of software changes for complete programs within minutes, reducing turnaround times by orders of magnitude compared to existing approaches with similar accuracy. Our evaluation shows that ABCD scales to real-world applications such as FFmpeg and Clang with millions of instructions, achieving a geometric mean error below 3% compared to ground-truth timings from hardware performance counters on x86 and ARM machines.
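The hybrid idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the per-instruction cost table, the `toy_block_estimator` function, and the trace format (a stream of executed basic blocks given as tuples of mnemonics) are all hypothetical stand-ins. A real static estimator would model the processor pipeline, and a real trace would come from an emulator such as QEMU.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

# One executed basic block, represented as a tuple of instruction mnemonics.
Block = Tuple[str, ...]

# Hypothetical per-instruction cycle costs (assumption, not measured data).
TOY_COSTS = {"add": 1.0, "mul": 3.0, "load": 4.0, "store": 4.0, "branch": 1.0}


def toy_block_estimator(block: Block) -> float:
    """Stand-in for a static throughput estimator: sum fixed per-instruction costs."""
    return sum(TOY_COSTS[insn] for insn in block)


def estimate_cycles(trace: Iterable[Block], estimator: Callable[[Block], float]) -> float:
    """Consume a stream of executed blocks and return the estimated total cycles."""
    counts: Counter = Counter()
    for block in trace:  # streamed: only per-block execution counts are kept
        counts[block] += 1
    # Each distinct block is estimated once and weighted by how often it ran.
    return sum(estimator(block) * n for block, n in counts.items())


def differential(before: float, after: float) -> float:
    """Relative change in estimated cycles; negative means the new version is faster."""
    return (after - before) / before


# Two versions of a loop body; the change replaces a multiply with an add.
loop_v1 = ("load", "mul", "store", "branch")
loop_v2 = ("load", "add", "store", "branch")
cycles_v1 = estimate_cycles((loop_v1 for _ in range(1000)), toy_block_estimator)
cycles_v2 = estimate_cycles((loop_v2 for _ in range(1000)), toy_block_estimator)
change = differential(cycles_v1, cycles_v2)
print(f"estimated change: {change:+.1%}")
```

The dynamic side contributes which blocks ran and how often; the static side contributes a cost per block. The differential estimate compares two such totals rather than relying on either absolute number.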
Wed 6 Dec. Displayed time zone: Pacific Time (US & Canada)
14:00 - 15:30 | Performance | Research Papers / Industry Papers / Journal First | Golden Gate C1 | Chair(s): Aitor Arrieta (Mondragon University)
14:00 | 15m | Talk | A Highly Scalable, Hybrid, Cross-Platform Timing Analysis Framework Providing Accurate Differential Throughput Estimation via Instruction-Level Tracing | Research Papers | Min-Yih Hsu (University of California, Irvine); Felicitas Hetzelt (University of California, Irvine); David Gens (University of California, Irvine); Michael Maitland (SiFive); Michael Franz (University of California, Irvine)
14:15 | 15m | Talk | Towards effective assessment of steady state performance in Java software: Are we there yet? | Journal First | Luca Traini (University of L'Aquila); Vittorio Cortellessa (Università dell'Aquila, Italy); Daniele Di Pompeo (University of L'Aquila); Michele Tucci (University of L'Aquila)
14:30 | 15m | Talk | Adapting Performance Analytic Techniques in a Real-World Database-Centric System: An Industrial Experience Report | Industry Papers | Lizhi Liao (University of Waterloo); Heng Li (Polytechnique Montréal); Weiyi Shang (University of Waterloo); Catalin Sporea (ERA Environmental Management Solutions); Andrei Toma (ERA Environmental Management Solutions); Sarah Sajedi (ERA Environmental Management Solutions)
14:45 | 15m | Talk | IoPV: On Inconsistent Option Performance Variations | Research Papers | Jinfu Chen (Jiangsu University); Zishuo Ding (University of Waterloo); Yiming Tang (Rochester Institute of Technology); Mohammed Sayagh (ETS Montreal, University of Quebec); Heng Li (Polytechnique Montréal); Bram Adams (Queen's University, Kingston, Ontario); Weiyi Shang (University of Waterloo)
15:00 | 15m | Talk | Predicting Software Performance with Divide-and-Learn | Research Papers
15:15 | 15m | Talk | [Remote] Discovering Parallelisms in Python Programs | Research Papers | Siwei Wei (State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, and University of Chinese Academy of Sciences Beijing, China); Guyang Song (AntGroup); Senlin Zhu (AntGroup); Ruoyi Ruan (AntGroup); Shihao Zhu (State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, China); Yan Cai (State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, and University of Chinese Academy of Sciences Beijing, China)