Jonas Tappolet, Abraham Bernstein, Applied Temporal RDF: Efficient Temporal Querying of RDF Data with SPARQL, Proceedings of the 6th European Semantic Web Conference (ESWC), June 2009, Springer. (inproceedings)
Many applications operate on time-sensitive data. Some of
these data are only valid for certain intervals (e.g., job-assignments, versions
of software code), others describe temporal events that happened
at certain points in time (e.g., a person's birthday). Until recently, the
only way to incorporate time into Semantic Web models was as a data
type property. Temporal RDF, however, considers time as an additional
dimension of the data, preserving the semantics of time.
In this paper we present a syntax and storage format based on named
graphs to express temporal RDF. Given the restriction to preexisting
RDF-syntax, our approach can perform any temporal query using standard
SPARQL syntax only. For convenience, we introduce a shorthand
format called τ-SPARQL for temporal queries and show how τ-SPARQL
queries can be translated to standard SPARQL. Additionally, we show
that, depending on the underlying data's nature, the temporal RDF approach
vastly reduces the number of triples by eliminating redundancies,
resulting in increased performance for processing and querying. Last
but not least, we introduce a new indexing approach that can
significantly reduce the time needed to execute time point queries (e.g.,
what happened on January 1st).
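To illustrate the idea of grouping triples by validity interval and answering a time point query, here is a minimal sketch. It is not the storage format or index from the paper: the dictionary layout, the `ex:` identifiers, and the function name `triples_at` are all assumptions made for illustration, and the lookup is a plain linear scan rather than an index.

```python
from datetime import date

# Hypothetical data: each "named graph" is keyed by its validity interval
# (start, end), both inclusive, and holds the triples valid in that interval.
named_graphs = {
    (date(2008, 1, 1), date(2008, 12, 31)): {
        ("ex:alice", "ex:assignedTo", "ex:projectA"),
    },
    (date(2009, 1, 1), date(2009, 6, 30)): {
        ("ex:alice", "ex:assignedTo", "ex:projectB"),
        ("ex:bob", "ex:assignedTo", "ex:projectA"),
    },
}


def triples_at(graphs, point):
    """Return all triples whose graph interval contains the time point."""
    result = set()
    for (start, end), triples in graphs.items():
        if start <= point <= end:
            result |= triples
    return result


# Time point query: which triples are valid on January 1st, 2009?
valid_on_new_year = triples_at(named_graphs, date(2009, 1, 1))
```

Because a triple is stored once per interval rather than once per time step, a fact that holds for a long period needs only a single entry, which is the kind of redundancy reduction the abstract refers to.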
Jayalath Ekanayake, Jonas Tappolet, Harald C. Gall, Abraham Bernstein, Tracking Concept Drift of Software Projects Using Defect Prediction Quality, Proceedings of the 6th IEEE Working Conference on Mining Software Repositories, May 2009, IEEE Computer Society. (inproceedings)
Defect prediction is an important task in the mining of
software repositories, but the quality of predictions varies
strongly within and across software projects. In this paper
we investigate why the prediction quality fluctuates so
much as a result of the changing nature of the bug (or defect)
fixing process. To this end, we adopt the notion of a concept
drift, which denotes that the defect prediction model has
become unsuitable as the set of influencing features has changed,
usually due to a change in the underlying bug generation
process (i.e., the concept). We explore four open source
projects (Eclipse, OpenOffice, Netbeans and Mozilla) and
construct file-level and project-level features for each of
them from their respective CVS and Bugzilla repositories.
We then use this data to build defect prediction models and
visualize the prediction quality along the time axis. These
visualizations allow us to identify concept drifts and, as a
consequence, phases of stability and instability expressed
in the level of defect prediction quality. Further, we identify
those project features that influence the defect
prediction quality using both a tree-induction algorithm and
a linear regression model. Our experiments reveal that
software systems are subject to considerable concept drifts
in their evolution history. Specifically, we observe that the
change in the number of authors editing a file and the number
of defects fixed by them contribute to a project's concept
drift and therefore influence the defect prediction quality.
Our findings suggest that project managers using defect
prediction models for decision making should be aware of
the current phase of stability or instability due to a potential
concept drift.
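The idea of following prediction quality along the time axis can be sketched roughly as follows. This is an illustration under stated assumptions, not the method of the paper: the quality measure (fraction of correct predictions per period), the threshold of 0.6, the sample records, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (period, predicted_defective, actually_defective).
records = [
    ("2008-Q1", True, True), ("2008-Q1", False, False),
    ("2008-Q2", True, False), ("2008-Q2", False, True),
    ("2008-Q3", True, True), ("2008-Q3", True, False),
]


def quality_per_period(rows):
    """Map each period to the fraction of correct predictions in it."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, predicted, actual in rows:
        totals[period] += 1
        hits[period] += int(predicted == actual)
    return {p: hits[p] / totals[p] for p in sorted(totals)}


def unstable_periods(quality, threshold=0.6):
    """Return the periods whose quality falls below the assumed threshold."""
    return [p for p, q in quality.items() if q < threshold]


quality = quality_per_period(records)
unstable = unstable_periods(quality)
```

A run of periods below the threshold would be a candidate phase of instability; a manager relying on the model could treat its predictions with more caution during such a phase.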