Towards Greener Yet Powerful Code Generation via Quantization: An Empirical Study
ML-powered code generation aims to help developers write code more productively by intelligently generating code blocks from natural language prompts. Recently, large pretrained deep learning models have pushed the boundary of code generation and achieved impressive performance. However, their huge number of parameters poses a significant challenge to adoption in a typical software development environment, where a developer might use a standard laptop or mid-size server to develop code. Such large models cost significant resources in terms of memory, latency, dollars, and carbon footprint. Model compression is a promising approach to address these challenges. Among many compression techniques, we identify quantization as the most applicable to the code generation task, as it does not require significant retraining cost. Because quantization represents model parameters with lower-bit integers (e.g., int8), both model size and runtime latency benefit. We empirically evaluate quantized models on code generation tasks across three dimensions: (i) resource usage and carbon footprint, (ii) accuracy, and (iii) robustness. Through systematic experiments we find a code-aware quantization recipe that can run even a 6-billion-parameter model on a regular laptop without significant accuracy or robustness degradation. We find that the recipe is readily applicable to the code summarization task as well.
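To illustrate the core idea the abstract describes, here is a minimal sketch of symmetric per-tensor int8 weight quantization in NumPy. The helper names and the specific scheme are illustrative assumptions, not the paper's exact recipe: the point is simply that storing weights as int8 plus a float scale is 4x smaller than float32, at the cost of a small, bounded rounding error.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: map float32 weights to int8 plus a scale."""
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover a float32 approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)  # a toy weight matrix
q, scale = quantize_int8(w)

print(w.nbytes // q.nbytes)  # 4x memory reduction: float32 -> int8
# Rounding error is bounded by scale / 2 per weight:
print(float(np.max(np.abs(dequantize_int8(q, scale) - w))) <= scale)
```

Real deployments (e.g., dynamic or static quantization in inference frameworks) additionally quantize activations and use fused int8 kernels, which is where the latency benefit comes from.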
Tue 5 Dec. Displayed time zone: Pacific Time (US & Canada).
14:00 - 15:30 | Empirical Studies I (Ideas, Visions and Reflections / Research Papers / Industry Papers / Journal First) at Golden Gate A. Chair(s): Cristian Cadar, Imperial College London
14:00 (15m Talk) | [Remote] Assess and Summarize: Improve Outage Understanding with Large Language Models (Industry Papers). Pengxiang Jin (Nankai University), Shenglin Zhang (Nankai University), Minghua Ma (Microsoft Research), Haozhe Li (Peking University), Yu Kang (Microsoft Research), Liqun Li (Microsoft Research), Yudong Liu (Microsoft Research), Bo Qiao (Microsoft Research), Chaoyun Zhang (Microsoft), Pu Zhao (Microsoft Research), Shilin He (Microsoft Research), Federica Sarro (University College London), Yingnong Dang (Microsoft Azure), Saravan Rajmohan (Microsoft 365), Qingwei Lin (Microsoft), Dongmei Zhang (Microsoft Research)
14:15 (15m Talk) | Open Source License Inconsistencies on GitHub (Journal First). Thomas Wolter (Friedrich-Alexander University Erlangen-Nuernberg), Ann Barcomb (Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary), Dirk Riehle (U of Erlangen), Nikolay Harutyunyan (Friedrich-Alexander University Erlangen-Nuremberg, Germany)
14:30 (15m Talk) | On the Relationship Between Code Verifiability and Understandability (Research Papers). Kobi Feldman (College of William & Mary), Martin Kellogg (New Jersey Institute of Technology), Oscar Chaparro (William & Mary)
14:45 (15m Talk) | Lessons from the Long Tail: Analysing Unsafe Dependency Updates across Software Ecosystems (Ideas, Visions and Reflections). Supatsara Wattanakriengkrai (Nara Institute of Science and Technology), Raula Gaikovina Kula (Nara Institute of Science and Technology), Christoph Treude (University of Melbourne), Kenichi Matsumoto (Nara Institute of Science and Technology)
15:00 (15m Talk) | Towards Greener Yet Powerful Code Generation via Quantization: An Empirical Study (Research Papers). Xiaokai Wei (AWS AI Labs), Sujan Kumar Gonugondla (AWS AI Labs), Shiqi Wang (AWS AI Labs), Wasi Ahmad (AWS AI Labs), Baishakhi Ray (Columbia University), Haifeng Qian (AWS AI Labs), Xiaopeng LI (AWS AI Labs), Varun Kumar (AWS AI Labs), Zijian Wang (AWS AI Labs), Yuchen Tian (AWS), Qing Sun (AWS AI Labs), Ben Athiwaratkun (AWS AI Labs), Mingyue Shang (AWS AI Labs), Murali Krishna Ramanathan (AWS AI Labs), Parminder Bhatia (AWS AI Labs), Bing Xiang (AWS AI Labs)
15:15 (15m Talk) | Understanding Hackers' Work: An Empirical Study of Offensive Security Practitioners (Industry Papers)