Authors: C. Ieva, A. Gotlieb, S. Kaci and N. Lazaar
Title: Discovering Program Topoi via Hierarchical Agglomerative Clustering
Affiliation: Software Engineering
Project(s): The Certus Centre (SFI)
Status: Accepted
Publication Type: Journal Article
Year of Publication: 2018
Journal: IEEE Transactions on Reliability
Publisher: IEEE Reliability Society
Abstract

In long-lived software systems, specification documents can be outdated or even missing. In this context, developing new software releases or checking whether some user requirements are still valid becomes challenging.
This challenge can be addressed by extracting high-level observable capabilities of a system by mining its source code and the available source-level documentation.
This paper presents FEAT (Feature Extraction and Traceability), an approach that automatically extracts topoi, which are summaries of the main capabilities of a program, given in the form of collections of code functions along with an index.
FEAT acts in two steps: 
(1) Clustering. By mining the available source code, possibly augmented with code-level comments, hierarchical agglomerative clustering groups similar code functions. In addition, this process gathers an index for each function;
(2) Entry-Point Selection. Functions within a cluster are then ranked and presented to validation engineers as topoi candidates.
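
The two steps above can be illustrated with a minimal Python sketch. It is not the FEAT implementation: the token-set representation of each function, the Jaccard distance, the average-linkage criterion, the stopping threshold, and the caller-count ranking heuristic are all assumptions made for illustration, as are the names `agglomerate` and `rank_entry_points`.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two token sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def agglomerate(tokens_by_function, threshold):
    """Average-linkage agglomerative clustering over function names.

    tokens_by_function maps a function name to the set of words mined from
    its code and comments (an assumed stand-in for the per-function index).
    Merging stops once the closest pair of clusters is farther apart than
    threshold (an assumed stopping rule).
    """
    clusters = [[name] for name in sorted(tokens_by_function)]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        dists = [
            jaccard_distance(tokens_by_function[x], tokens_by_function[y])
            for x in c1
            for y in c2
        ]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


def rank_entry_points(cluster, callers):
    """Rank cluster members as entry-point candidates.

    Hypothetical heuristic: functions with fewer callers inside the same
    cluster come first; ties are broken alphabetically.
    """
    members = set(cluster)
    return sorted(cluster, key=lambda f: (len(callers.get(f, set()) & members), f))


# Toy input: five functions described by made-up token sets.
functions = {
    "open_file": {"open", "file", "path"},
    "read_file": {"read", "file", "path"},
    "close_file": {"close", "file"},
    "send_packet": {"send", "packet", "socket"},
    "recv_packet": {"recv", "packet", "socket"},
}
# callers[f] is the set of functions that call f (made-up call graph).
callers = {"read_file": {"open_file"}, "close_file": {"open_file"}}

for cluster in agglomerate(functions, threshold=0.8):
    print(rank_entry_points(cluster, callers))
```

On this toy input the file-related and packet-related functions end up in separate clusters, and each printed list puts the assumed entry-point candidates first.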

We implemented FEAT on top of a general-purpose test management and optimization platform and performed an experimental study over 15 open-source software projects amounting to more than 1 MLOC, demonstrating that automatically discovering topoi is feasible and meaningful on realistic projects.

Citation Key: 25875
