Towards More Reliable Automated Program Repair by Integrating Formal Analysis Techniques
Keywords: Automated Program Repair, Formal Methods, Static Analysis, Software Analysis
Description: Faults (aka bugs) in software systems can affect large groups of people and lead to massive financial damages. Correcting such bugs accounts for a significant portion of overall software development costs. Automated program repair (APR) techniques aim to reduce these costs by automatically generating program patches (edits in code) to remove bugs from software systems.
In this project, you will investigate APR approaches that use search-based, and constraint-based approaches to generate candidate patches. To determine whether a generated patch corrects the bug, APR needs a specification of correct program behavior. In the absence of having a complete specification of correct behavior for the program that is repaired, existing APR approaches often resolve to using test cases as oracles for determining if a patched program exhibits the correct behavior. However, the test suites in practice are not perfect ways to guarantee the correctness of patches and can lead to a large number of incorrect patches. The result is a long-standing open challenge for APR, known as patch overfitting. Patch overfitting is a condition where the patched program may pass the tests in the given test suite, while it is failing for tests outside the test suite.
Several solutions have been developed to alleviate the overfitting problem, such as symbolic specification inference [1,2], machine learning-based prioritization of patches [3], fuzzing-based test-suite augmentation [4], and concolic path exploration [5]. These solutions rely mainly on test cases and do not guarantee the general correctness of the patches. They have in common that they do not solve the overfitting problem, but aim to limit its impact.
Goal
We aim to develop new approaches to address the overfitting problem of APR by using knowledge contained in reliable static bug detection tools (i.e., bug detection tools that are built upon some solid theoretical background) to enhance existing specifications. The approach aims mainly to synthesize precise specifications for recurring classes of bugs from the static analysis patterns and rules that are used to detect those bugs. We argue that many of these bug detection patterns can be reformulated as specifications of correct behavior. Exploiting this knowledge in APR can further alleviate, or even resolve, the overfitting problem, improve the quality of repairs, and decrease the time spent searching for patches.
Our initial investigation [6] shows that static bug detection can be a promising source for synthesizing accurate specifications that APR can use, where the feasibility of integration has been successfully demonstrated using two types of bugs: arithmetic bugs, such as integer overflow, and logical bugs, such as termination bugs.
To complete the line of research initiated in [6], we aim to consider additional case studies to analyze and demonstrate the feasibility and limitations of the proposed approach on a wider range of bugs. We aim also to develop a (semi-)automated mining technique that can derive bug detection patterns for various classes of bugs from a selection of reliable static analysis tools. Our initial focus will be on mining patterns from static analysis tools with configurable and customizable bug detection rules.
Learning outcome
- Gain an appreciation for the state of the art in APR and formal bug detection techniques.
- Collecting, analyzing, and formalizing known bug detection patterns for recurring classes of bugs from reliable static analysis tools.
- Experience with working in an exciting and active research environment.
- Excellent opportunities to publish your research results in the form of a scientific publication.
Qualifications
We are seeking qualified applicants to work in the area of automated program repair (APR). Applicants must satisfy the following criteria.
Necessary criteria:
- A bachelor's or master's degree in computer science or related areas.
- Good programming skills.
- Background in software engineering.
- Good project management skills.
Additional desired criteria:
- Background in formal bug detection techniques including static analysis techniques.
- Experience in contemporary APR tools.
Supervisors
- Omar Al-Bataineh
- Leon Moonen
References
- H. D. T. Nguyen, D. Qi, A. Roychoudhury, and S. Chandra. SemFix: Program repair via semantic analysis. In: Int’l Conf. Softw. Eng. 2013, pp. 772–781.
- S. Mechtaev, J. Yi, and A. Roychoudhury. Angelix: Scalable Multiline Program Patch Synthesis via Symbolic Analysis. In: Int’l Conf. Softw. Eng. 2016, pp. 691–701.
- J. Bader, A. Scott, M. Pradel, and S. Chandra. Getafix: learning to fix bugs automatically. In: Proc. ACM Program. Lang. (2019), 159:1–159:27.
- X. Gao, S. Mechtaev, and A. Roychoudhury. Crash-avoiding program repair. In: Int’l Symp. Softw. Testing & Analysis. 2019, pp. 8–18
- R. Shariffdeen, Y. Noller, L. Grunske, and A. Roychoudhury. “Concolic Program Repair.” In: Conf. Prog. Lang. Design & Impl. ACM, 2021.
- O. Al-Bataineh, A. Grishina, L. Moonen. Towards More Reliable Automated Program Repair by Integrating Static Analysis Techniques. In 21st IEEE International Conference on Software Quality, Reliability and Security (QRS), 2021.