|Title||History-Based Recommendations to Guide Software Evolution|
|Project(s)||evolveIT: Evidence-Based Recommendations to Guide the Evolution of Component-Based Product Families, The Certus Centre (SFI)|
|Publication Type||Talks, invited|
|Year of Publication||2017|
|Location of Talk||Nara Institute of Science and Technology, Nara, Japan|
|Keywords||change impact analysis, evolutionary coupling, regression testing, software recommendation systems, targeted association rule mining|
Any software system that is actually used needs to evolve constantly. Changes are needed to fulfill shifting user requirements, keep a competitive advantage, adapt to changes in other software, and fix the ever-present bugs. Knowledge about dependencies between system artifacts such as modules, methods, and variables is essential for these software maintenance and evolution tasks. Unfortunately, existing approaches that uncover such dependencies by means of static or dynamic program analysis are typically language-specific. Their application is thus largely restricted to homogeneous systems, which is a major drawback given the increasing heterogeneity of modern software systems.
In this talk, we will look at the alternative of using association rule mining, an unsupervised machine learning technique that can be used to infer knowledge about the relationships between items in a data set. We use association rule mining to analyze the change history of a software system and uncover so-called evolutionary coupling between its artifacts. One advantage of this approach is that it is language-agnostic, i.e., uncovering dependencies across artifacts written in different programming languages essentially comes for free.
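To make the idea concrete, the following is a minimal sketch of mining pairwise co-change rules from a change history. It is an illustration only, not the mining algorithm presented in the talk: the function name `mine_cochange_rules`, the thresholds `min_support` and `min_confidence`, and the toy commit history are all assumptions introduced here. Each commit is treated as a transaction, and a rule `a -> b` is kept when the two artifacts changed together often enough.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(transactions, min_support=2, min_confidence=0.5):
    """Mine pairwise rules a -> b from a list of change sets (commits).

    support(a -> b)    = number of commits that changed both a and b
    confidence(a -> b) = support(a -> b) / number of commits that changed a
    """
    item_count = Counter()
    pair_count = Counter()
    for changed in transactions:
        files = set(changed)
        item_count.update(files)
        # Count every ordered pair so that a -> b and b -> a are both scored.
        pair_count.update(permutations(sorted(files), 2))

    rules = []
    for (a, b), support in pair_count.items():
        confidence = support / item_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))

    # Strongest rules first; ties broken by name for a stable order.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


# Hypothetical change history mixing Python, TypeScript, SQL, and Markdown.
history = [
    {"api.py", "client.ts"},
    {"api.py", "client.ts", "README.md"},
    {"api.py", "schema.sql"},
    {"client.ts", "api.py"},
]

rules = mine_cochange_rules(history)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent} (support={support}, confidence={confidence:.2f})")
```

Because the transactions contain only file names, the sketch never inspects file contents, which is why a coupling between a Python file and a TypeScript file is found without any language-specific analysis.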
We will explore how association rule mining can be used to derive evidence-based recommendations to guide software maintenance and evolution tasks. Examples include software change impact analysis, recommending related changes during development, and conducting targeted regression testing. We survey the state of the art, analyze why and where the applicability of existing techniques falls short, and discuss several avenues for improvement, including novel mining algorithms, methods for aggregating the evidence captured by individual rules, and guidelines for selecting appropriate values for the parameters of the mining algorithms. Finally, we discuss an approach aimed at reducing undesired developer interruptions using an automated classifier that predicts whether a change recommendation will be relevant to a developer.