Knowledge-based Version Incompatibility Detection for Deep Learning
Version incompatibility issues are rampant when reusing or reproducing deep learning models and applications. Existing techniques are limited to library dependency specifications declared in PyPI or in open-source projects on GitHub. They therefore cannot account for latent errors caused by undocumented version constraints or by dependencies on non-Python libraries such as CUDA and cuDNN. Meanwhile, Stack Overflow (SO) offers abundant and up-to-date discussions of version issues spanning the deep learning stack, e.g., libraries, runtime, OS, and GPU.
We propose to extract this rich knowledge from SO and build a knowledge graph to support the detection of version incompatibility issues. Specifically, we develop a novel indirect supervision method that uses a pre-trained Question-Answering (QA) model to extract logic facts from online discussions. The extracted logic facts are then consolidated into a probabilistic knowledge graph to resolve duplicates and conflicts in online discussions. Our evaluation results show that (1) our system extracts version-related facts with 81% precision and 88% recall, and (2) our system identifies 65% of the version issues in a benchmark of 10 popular DL projects, while two state-of-the-art approaches detect only 18% and 6%, respectively.
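To make the consolidation step concrete, the following is a minimal sketch of how duplicate and conflicting facts might be merged into weighted edges and then queried. It is an illustration under stated assumptions, not the method evaluated in this paper: the `Fact` record, the confidence-weighted vote in `consolidate`, the `check` helper, the 0.5 threshold, and the sample facts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One extracted claim about a pair of versioned components (hypothetical schema)."""
    a: str
    va: str
    b: str
    vb: str
    compatible: bool
    confidence: float  # assumed extractor score in [0, 1]


def consolidate(facts):
    """Merge duplicate and conflicting facts into one probability per version pair.

    Assumption: a simple confidence-weighted vote stands in for the
    probabilistic consolidation described in the text.
    """
    support = defaultdict(lambda: [0.0, 0.0])  # key -> [compatible, incompatible]
    for f in facts:
        # Sort the two endpoints so (A, B) and (B, A) share one edge.
        key = tuple(sorted([(f.a, f.va), (f.b, f.vb)]))
        support[key][0 if f.compatible else 1] += f.confidence
    return {k: pos / (pos + neg) for k, (pos, neg) in support.items() if pos + neg > 0}


def check(graph, stack, threshold=0.5):
    """Return the pairs in `stack` whose estimated compatibility is below `threshold`."""
    items = sorted(stack.items())
    issues = []
    for i, x in enumerate(items):
        for y in items[i + 1:]:
            p = graph.get((x, y))
            if p is not None and p < threshold:
                issues.append((x, y, p))
    return issues


# Made-up facts for illustration only; they are not data from the paper.
facts = [
    Fact("tensorflow", "2.4", "cuda", "10.0", False, 0.9),
    Fact("cuda", "10.0", "tensorflow", "2.4", False, 0.8),  # duplicate, reversed order
    Fact("tensorflow", "2.4", "cuda", "10.0", True, 0.3),   # conflicting claim
    Fact("tensorflow", "2.4", "cuda", "11.0", True, 0.9),
]
graph = consolidate(facts)
bad_stack = check(graph, {"tensorflow": "2.4", "cuda": "10.0"})
good_stack = check(graph, {"tensorflow": "2.4", "cuda": "11.0"})
```

In this sketch the three claims about the first pair collapse into a single edge with a low compatibility estimate, so `bad_stack` flags that pair while `good_stack` stays empty. Pairs with no extracted facts are treated as unknown and are not reported.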