Thu 7 Dec 2023 14:30 - 14:45 at Golden Gate A - Empirical Studies II Chair(s): Peter Rigby

Researchers and practitioners have been studying correlations between software metrics and defects for decades. The typical approach is to postulate a hypothesis that a certain metric correlates with the number of defects. A statistical test then utilizes historical data to accept or reject the hypothesis. Although this methodology has been widely adopted, our own experience is that such correlations are often limited in their practical relevance, particularly for large industrial projects: interpreting and arguing about them is challenging and cumbersome; the difference between correlation and causation might not be clear; and the practical impact of a correlation is often questioned because of misconceptions about how a statistical conclusion relates to singular events.

Instead of discussing correlations, we found that an analysis of binary testedness results in more fruitful discussions. Binary testedness, as proposed by prior work, utilizes a metric to divide the source code into two parts and verifies whether more (or fewer) defects appear in each part than expected. In our work, we leverage the binary testedness approach and analyze several software metrics for a large industrial project to illustrate the concept. We furthermore introduce dynamic thresholds as a novel and more practical approach for source code classification compared to the static binary classification of previous works. Our results show that some of the studied metrics have a significant correlation with bug distribution, but effect sizes differ by several orders of magnitude across metrics. Overall, our approach moves away from “metric X correlates with defects” to a more fruitful “source code with attribute X has more (or fewer) bugs than expected”, reframing the discussion from questioning statistics and methods towards an evidence-based root cause analysis.
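The two-part comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the expected defect count is derived or which statistical test is used, so the sketch assumes that expected bugs are proportional to each part's share of code size and uses a chi-square goodness-of-fit statistic with one degree of freedom. The `FileStats` record, its field names, the `binary_split` helper, and the sample data are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical per-file record; the field names are illustrative."""
    metric: float  # value of the metric under study (e.g. a coverage ratio)
    size: int      # size of the file (e.g. lines of code)
    bugs: int      # defects attributed to the file


def binary_split(files, threshold):
    """Split files at `threshold` and compare observed vs. expected bugs.

    Assumption: a part's expected bug count is the total bug count scaled
    by that part's share of the total code size.
    """
    parts = {
        "low": [f for f in files if f.metric < threshold],
        "high": [f for f in files if f.metric >= threshold],
    }
    total_size = sum(f.size for f in files)
    total_bugs = sum(f.bugs for f in files)

    result = {}
    chi2 = 0.0
    for name, part in parts.items():
        observed = sum(f.bugs for f in part)
        expected = total_bugs * sum(f.size for f in part) / total_size
        result[name] = (observed, expected)
        if expected > 0:
            chi2 += (observed - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return result, chi2, p_value


# Made-up sample data: two files above and two below a metric value of 0.5.
files = [
    FileStats(metric=0.9, size=1000, bugs=2),
    FileStats(metric=0.8, size=1000, bugs=3),
    FileStats(metric=0.2, size=1000, bugs=10),
    FileStats(metric=0.1, size=1000, bugs=5),
]
result, chi2, p_value = binary_split(files, threshold=0.5)
for name, (observed, expected) in result.items():
    print(f"{name}: observed={observed}, expected={expected:.1f}")
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

In this made-up data, both parts have the same size, so each is expected to hold half of the 20 bugs, yet the low-metric part holds 15. The statement "source code with attribute X has more bugs than expected" then refers to this observed-versus-expected gap rather than to a correlation coefficient.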

Thu 7 Dec

Displayed time zone: Pacific Time (US & Canada)

14:00 - 15:30
Empirical Studies II
Research Papers / Ideas, Visions and Reflections / Industry Papers / Journal First at Golden Gate A
Chair(s): Peter Rigby Concordia University; Meta
14:00
15m
Talk
Reflecting on the use of the Policy-Process-Product Theory in Empirical Software Engineering
Ideas, Visions and Reflections
Kelechi G. Kalu Purdue University, Taylor R. Schorlemmer Purdue University, Sophie Chen University of Michigan, Kyle A. Robinson Purdue University, Erik Kocinare Purdue University, James C. Davis Purdue University
14:15
15m
Talk
Understanding the Role of External Pull Requests in the NPM Ecosystem
Journal First
Vittunyuta Maeprasart Nara Institute of Science and Technology, Supatsara Wattanakriengkrai Nara Institute of Science and Technology, Raula Gaikovina Kula Nara Institute of Science and Technology, Christoph Treude University of Melbourne, Kenichi Matsumoto Nara Institute of Science and Technology
14:30
15m
Talk
A Multidimensional Analysis of Bug Density in SAP HANA
Industry Papers
Julian Reck SAP, Thomas Bach SAP, Jan Stoess University of Applied Sciences Karlsruhe
14:45
15m
Talk
[Remote] Understanding the topics and challenges of GPU programming by classifying and analyzing Stack Overflow posts
Research Papers
Wenhua Yang Nanjing University of Aeronautics and Astronautics, Chong Zhang Nanjing University, Minxue Pan Nanjing University
15:00
15m
Talk
[Remote] Ownership in the Hands of Accountability at Brightsquid: A Case Study and a Developer Survey
Industry Papers
Umme Ayman Koana York University, Francis Chew Brightsquid, Chris Carlson Brightsquid, Maleknaz Nayebi York University
15:15
15m
Talk
[Remote] Software Architecture in Practice: Challenges and Opportunities
Research Papers
Zhiyuan Wan Zhejiang University, Yun Zhang Zhejiang University City College, Xin Xia Huawei Technologies, Yi Jiang Huawei, David Lo School of Computing and Information Systems, Singapore Management University