|Authors||T. Rolfsnes, S. Di Alesio, R. Behjati, L. Moonen and D. Binkley|
|Title||Generalizing the Analysis of Evolutionary Coupling for Software Change Impact Analysis|
|Affiliation||Software Engineering, The Certus Centre (SFI), Software Engineering|
|Publication Type||Proceedings, refereed|
|Year of Publication||2016|
|Conference Name||23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER)|
|Keywords||evolutionary coupling, machine learning, software repository mining, targeted association rule mining|
Software change impact analysis aims to find artifacts potentially affected by a change. Typical approaches apply language-specific static or dynamic dependence analysis, and are thus restricted to homogeneous systems. This restriction is a major drawback given today’s increasingly heterogeneous software. Evolutionary coupling has been proposed as a language-agnostic alternative that mines relations between source-code entities from the system’s change history. Unfortunately, existing techniques based on evolutionary coupling fall short. For example, using Singular Value Decomposition (SVD) quickly becomes computationally expensive. An efficient alternative applies targeted association rule mining, but the most widely known approach (ROSE) has restricted applicability: experiments on two large industrial systems and four large open-source systems show that ROSE can only identify dependencies about 25% of the time. To overcome this limitation, we introduce TARMAQ, a new algorithm for mining evolutionary coupling. Empirically evaluated on the same six systems, TARMAQ performs consistently better than ROSE and SVD, is applicable 100% of the time, and runs orders of magnitude faster than SVD. We conclude that the proposed algorithm is a significant step towards robust change impact analysis for heterogeneous systems.
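To make the idea of mining co-change relations from a change history concrete, the following is a minimal, generic sketch of targeted association rule mining. It is neither ROSE nor TARMAQ: the function name `mine_cochange_rules`, the toy `history`, and the strict matching rule (a commit counts only if it contains every file in the query) are all assumptions made for illustration.

```python
from collections import Counter


def mine_cochange_rules(history, query):
    """Rank files that historically changed together with the query.

    history: list of sets of file names, one set per past commit.
    query:   set of file names in the change under analysis.
    Returns a list of (file, support, confidence) tuples, best first.
    """
    query = set(query)
    # Assumption: only commits that touched every queried file count.
    matching = [commit for commit in history if query <= commit]
    if not matching:
        return []  # the query never occurred as a whole, so no rules
    # Support: number of matching commits that also changed the file.
    counts = Counter(f for commit in matching for f in commit - query)
    # Confidence: support divided by the number of matching commits.
    rules = [(f, n, n / len(matching)) for f, n in counts.items()]
    # Order by support, then confidence, then name for a stable result.
    rules.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return rules


# Hypothetical change history spanning several languages.
history = [
    {"app.py", "schema.sql"},
    {"app.py", "schema.sql", "README.md"},
    {"app.py", "ui.js"},
    {"build.gradle"},
]
rules = mine_cochange_rules(history, {"app.py"})
```

Because the sketch only looks at which files appear together in commits, it works across languages. Under its strict matching assumption, a query that never occurred as a whole in the history yields no rules at all, which gives a rough sense of why the applicability of a mining technique is worth measuring.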