|Authors||S. Sen, C. Ieva, A. Sarkar, A. Grime and A. Sander|
|Title||Experience Report: Verifying Data Interaction Coverage to Improve Testing of Data-Intensive Systems|
|Affiliation||, Software Engineering|
|Publication Type||Proceedings, refereed|
|Year of Publication||2014|
|Conference Name||International Symposium on Software Reliability Engineering|
Testing data-intensive systems is paramount to increase our reliance on information processed in e-governance, scientific/medical research, and social networks. A common practice in the industrial testing process is to use test databases copied from live production streams to test the functionality of complex database applications that manage the well-formedness of data and its adherence to business rules in these systems. This practice is often based on the assumption that the test database adequately covers realistic scenarios to test, hopefully, all functionality in these applications. There is a need to systematically evaluate this assumption. We present a tool-supported method to model realistic scenarios and verify whether copied test databases actually cover them and consequently facilitate adequate testing. We conceptualize realistic scenarios as data interactions between fields cross-cutting a complex database schema and model them as test cases in a classification tree model. We present a human-in-the-loop tool, DEPICT, that uses the classification tree model as input to (a) facilitate interactive selection of a connected subgraph from the often many possible paths of interactions between tables specified in the model, (b) automatically generate SQL queries to create an inner join between the tables in the connected subgraph, and (c) extract records from the join and generate a visual report of satisfied and unsatisfied interactions, hence quantifying the test adequacy of the test database. We report our experience as a qualitative evaluation of the approach with a large industrial database from the Norwegian Customs and Excise information system TVINN, which features large and complex databases with millions of records.
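Steps (b) and (c) can be illustrated with a minimal sketch. This is not DEPICT's implementation: the table names (`declaration`, `item`), the columns, the value classes, and the helper functions `build_inner_join` and `interaction_coverage` are all hypothetical. The sketch also assumes that an interaction is simply one combination of values across the selected fields, and that the connected subgraph has already been chosen as a chain of join edges.

```python
import sqlite3
from itertools import product


def build_inner_join(path):
    """Build the FROM clause for a chain of join edges.

    Each edge is (left_table, left_col, right_table, right_col).
    Identifiers are trusted here; a real tool would validate them.
    """
    sql = f"FROM {path[0][0]}"
    for left_table, left_col, right_table, right_col in path:
        sql += (f" INNER JOIN {right_table}"
                f" ON {left_table}.{left_col} = {right_table}.{right_col}")
    return sql


def interaction_coverage(conn, path, classes):
    """Split the wanted interactions into satisfied and unsatisfied sets.

    `classes` maps each 'table.column' field to its value classes; the
    wanted interactions are all combinations of those classes.
    """
    fields = list(classes)
    query = f"SELECT DISTINCT {', '.join(fields)} {build_inner_join(path)}"
    observed = set(conn.execute(query))
    wanted = set(product(*(classes[f] for f in fields)))
    return wanted & observed, wanted - observed


# Hypothetical two-table test database held in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE declaration (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE item (id INTEGER PRIMARY KEY, declaration_id INTEGER,
                       category TEXT);
    INSERT INTO declaration VALUES (1, 'NO'), (2, 'SE');
    INSERT INTO item VALUES (1, 1, 'food'), (2, 2, 'food'), (3, 1, 'textile');
""")

path = [("declaration", "id", "item", "declaration_id")]
classes = {
    "declaration.country": ["NO", "SE"],
    "item.category": ["food", "textile"],
}
covered, uncovered = interaction_coverage(conn, path, classes)
print(f"covered {len(covered)} of {len(covered) + len(uncovered)} interactions")
print("unsatisfied:", sorted(uncovered))
```

In this toy database the combination of `SE` with `textile` never occurs in the join, so it is reported as unsatisfied. Under the assumptions above, the ratio of satisfied to wanted interactions is one simple way to quantify the adequacy of a copied test database.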