|Title||Leveraging Machine Learning to Guide Software Evolution|
|Project(s)||evolveIT: Evidence-Based Recommendations to Guide the Evolution of Component-Based Product Families, The Certus Centre (SFI)|
|Publication Type||Talk, keynote|
|Year of Publication||2017|
|Location of Talk||8th IEEE International Workshop on Empirical Software Engineering in Practice (IWESEP), Tokyo, Japan|
|Place Published||Tokyo, Japan|
|Type of Talk||keynote|
|Keywords||change impact analysis, evolutionary coupling, regression testing, software recommendation systems, targeted association rule mining|
Knowledge about dependencies between system artifacts such as modules, methods, and variables is essential for a variety of software maintenance and evolution tasks. Unfortunately, existing approaches that uncover such dependencies by means of static or dynamic program analysis are typically language-specific. Their application is thus largely restricted to homogeneous systems, which is a major drawback given the increasing heterogeneity of modern software systems.
In this talk, we will look at the alternative of using unsupervised machine learning techniques such as association rule mining, which infers knowledge about the relationships between items in a data set. Association rule mining has been successfully used to analyze the change history of a software system and uncover so-called evolutionary coupling between its artifacts. One advantage of this approach is that it is language-agnostic: uncovering dependencies across artifacts written in different programming languages essentially comes for free.
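To make the idea concrete, the following is a minimal sketch of mining pairwise co-change rules from a change history. It is not the algorithm presented in the talk. It assumes each commit is available as a plain list of changed file names, and the function name `mine_pairwise_rules`, the thresholds, and the sample history are all hypothetical. Note that the code never inspects file contents, which is why mixing C, header, and Python files costs nothing extra.

```python
from collections import Counter
from itertools import combinations


def mine_pairwise_rules(transactions, min_support=2, min_confidence=0.5):
    """Return rules (lhs, rhs, support, confidence) between single files.

    support    = number of commits that changed both lhs and rhs
    confidence = support / number of commits that changed lhs
    """
    item_counts = Counter()
    pair_counts = Counter()
    for tx in transactions:
        items = sorted(set(tx))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), support in pair_counts.items():
        if support < min_support:
            continue
        # Each frequent pair yields up to two directed rules.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))

    # Strongest rules first; ties broken by name for a stable order.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


# Hypothetical change history: one list of changed files per commit.
history = [
    ["parser.c", "parser.h"],
    ["parser.c", "parser.h", "bindings.py"],
    ["parser.c", "bindings.py"],
    ["README.md"],
]

rules = mine_pairwise_rules(history)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support}  confidence={confidence:.2f}")
```

In this toy history, every commit that touched `bindings.py` also touched `parser.c`, so the rule from the Python file to the C file is reported with full confidence even though no static analysis could relate the two.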
We will explore how association rule mining can be used to derive evidence-based recommendations to guide software maintenance and evolution tasks. Examples include software change impact analysis, recommending related changes during development, and conducting targeted regression testing. We survey the state of the art, analyze why and where the applicability of existing techniques falls short, and discuss several avenues for improvement, including novel mining algorithms, methods for aggregating the evidence captured by individual rules, and guidelines for selecting appropriate values for the parameters of the mining algorithms.
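As an illustration of how such recommendations might be produced, here is a small sketch of query-driven (targeted) rule mining for change impact analysis. It is an assumption-laden example rather than any of the techniques surveyed in the talk: the function name `recommend_impacted`, the threshold, and the sample history are hypothetical, and taking the maximum confidence per file is just one simple way to aggregate the evidence from several rules.

```python
from collections import Counter


def recommend_impacted(history, query, min_confidence=0.3):
    """Rank files likely to be affected when the files in `query` change.

    Only rules whose antecedent is a subset of the query are generated,
    so commits unrelated to the query are ignored.
    """
    query = frozenset(query)
    transactions = [frozenset(tx) for tx in history]

    # Candidate antecedents: every non-empty overlap of a commit with the query.
    antecedents = {tx & query for tx in transactions} - {frozenset()}

    scores = {}
    for antecedent in antecedents:
        # Commits that changed every file in the antecedent.
        matching = [tx for tx in transactions if antecedent <= tx]
        counts = Counter(item for tx in matching for item in tx - query)
        for item, n in counts.items():
            confidence = n / len(matching)
            if confidence >= min_confidence:
                # Aggregation (assumed): keep the strongest rule per file.
                scores[item] = max(scores.get(item, 0.0), confidence)

    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical change history: one list of changed files per commit.
history = [
    ["Order.java", "order.sql", "test_order.py"],
    ["Order.java", "order.sql"],
    ["Order.java", "test_order.py"],
    ["Invoice.java", "invoice.sql"],
    ["order.sql", "test_order.py"],
]

impacted = recommend_impacted(history, ["Order.java", "order.sql"])
for name, score in impacted:
    print(f"{name}: {score:.2f}")
```

For this history the only recommendation is `test_order.py`, which could serve as a candidate for targeted regression testing. The `min_confidence` value is exactly the kind of parameter whose choice the talk flags as needing guidance.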