Process Data Quality

Technical Report

Data Retrieval, Processing and Linking for Software Process Data Analysis

Adrian Bachmann and Abraham Bernstein

Many projects in the mining software repositories community rely on software process data gathered from bug tracking databases and the commit logs of version control systems. These data are then used to predict defects, to gain insight into a project’s life cycle, and for other tasks. In this technical report we introduce the software systems that hold such data and present our approach for retrieving, processing, and linking them. Specifically, we first introduce the bug-fixing process and the software products that support it. We then give step-by-step guidance on how we retrieve, parse, convert, and link the data sources. Finally, we introduce an improved approach for linking the change log file with the bug tracking database, which achieves a higher linking rate than other approaches.



  Adrian Bachmann, Abraham Bernstein, Data Retrieval, Processing and Linking for Software Process Data Analysis, University of Zurich, Department of Informatics, December 2009.

