Tue 5 Dec 2023 17:30 - 17:45 at Golden Gate C2 - Machine Learning II Chair(s): Iftekhar Ahmed

Software engineering techniques increasingly rely on deep learning approaches to support many software engineering tasks, from bug triaging to code generation. To assess the efficacy of such techniques, researchers typically perform controlled experiments. Conducting these experiments, however, is particularly challenging given the complexity of the space of variables involved, from specialized and intricate architectures and algorithms to a large number of training hyper-parameters and choices of evolving datasets, all compounded by how rapidly machine learning technology is advancing and by the inherent sources of randomness in the training process. In this work we conduct a mapping study, examining 194 experiments with techniques that rely on deep neural networks (DNNs), appearing in 55 papers published in premier software engineering venues, to provide a characterization of the state of the practice, pinpointing the experiments' common trends and pitfalls. Our study reveals that most of the experiments, including those that have received ACM artifact badges, have fundamental limitations that raise doubts about the reliability of their findings. More specifically, we find: 1) weak analyses to determine that there is a true relationship between independent and dependent variables (87% of the experiments); 2) limited control over the space of DNN-relevant variables, which can render a relationship between dependent variables and treatments that may not be causal but rather correlational (100% of the experiments); and 3) lack of specificity about which DNN variables and values are used in the experiments to define the treatments being applied (86% of the experiments), which makes it unclear whether the techniques designed are the ones being assessed, or how the sources of extraneous variation are controlled. We provide some practical recommendations to address these limitations.

Tue 5 Dec

Displayed time zone: Pacific Time (US & Canada)

16:00 - 18:00
Machine Learning II - Research Papers / Ideas, Visions and Reflections at Golden Gate C2
Chair(s): Iftekhar Ahmed University of California at Irvine
16:00
15m
Talk
[Remote] Compatibility Issues in Deep Learning Systems: Problems and Opportunities
Research Papers
Jun Wang Nanjing University of Aeronautics and Astronautics, Nanjing, China, Guanping Xiao Nanjing University of Aeronautics and Astronautics, China, Shuai Zhang Nanjing University of Aeronautics and Astronautics, China, Huashan Lei Nanjing University of Aeronautics and Astronautics, China, Yepang Liu Southern University of Science and Technology, Yulei Sui University of New South Wales, Australia
16:15
15m
Talk
[Remote] An Extensive Study on Adversarial Attack against Pre-trained Models of Code
Research Papers
Xiaohu Du Huazhong University of Science and Technology, Ming Wen Huazhong University of Science and Technology, Zichao Wei Huazhong University of Science and Technology, Shangwen Wang National University of Defense Technology, Hai Jin Huazhong University of Science and Technology
16:30
15m
Talk
Can Machine Learning Pipelines Be Better Configured?
Research Papers
Yibo Wang Northeastern University, Ying Wang Northeastern University, Tingwei Zhang Northeastern University, Yue Yu National University of Defense Technology, Shing-Chi Cheung Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hai Yu Software College, Northeastern University, Zhiliang Zhu Software College, Northeastern University
16:45
15m
Talk
Towards Feature-Based Analysis of the Machine Learning Development Lifecycle
Ideas, Visions and Reflections
Boyue Caroline Hu University of Toronto, Marsha Chechik University of Toronto
17:00
15m
Talk
Fix Fairness, Don’t Ruin Accuracy: Performance Aware Fairness Repair using AutoML
Research Papers
Giang Nguyen Dept. of Computer Science, Iowa State University, Sumon Biswas Carnegie Mellon University, Hridesh Rajan Dept. of Computer Science, Iowa State University
17:15
15m
Talk
BiasAsker: Measuring the Bias in Conversational AI System
Research Papers
Yuxuan Wan The Chinese University of Hong Kong, Wenxuan Wang Chinese University of Hong Kong, Pinjia He The Chinese University of Hong Kong, Shenzhen, Jiazhen Gu Chinese University of Hong Kong, Haonan Bai The Chinese University of Hong Kong, Michael Lyu The Chinese University of Hong Kong
17:30
15m
Talk
Pitfalls in Experiments with DNN4SE: An Analysis of the State of the Practice
Research Papers
Sira Vegas Universidad Politecnica de Madrid, Sebastian Elbaum University of Virginia
17:45
15m
Talk
DecompoVision: Reliability Analysis of Machine Vision Components Through Decomposition and Reuse
Research Papers
Boyue Caroline Hu University of Toronto, Lina Marsso University of Toronto, Nikita Dvornik Waabi, Huakun Shen University of Toronto, Marsha Chechik University of Toronto