Authors: S. Sen, D. Marijan, C. Ieva, A. Grime and A. Sander
Title: Modelling and Verifying Combinatorial Interactions to Test Data Intensive Systems: Experience with Optimal Archiving at the Norwegian Customs and Excise Directorate
Affiliation: Software Engineering
Project(s): The Certus Centre (SFI)
Status: Published
Publication Type: Journal Article
Year of Publication: 2016
Journal: IEEE Transactions on Reliability
Issue: 99
Pagination: 1-14
Publisher: IEEE
Abstract

Testing data-intensive systems is paramount to increase our reliance on information processed in e-governance, scientific/medical research, and social networks. Data accrued in these systems often go through several manual and computational steps involving human inputs in interactive media and complex batch applications that aim to ensure high quality of data in terms of validity, correctness, and adherence to business rules. A common industrial practice in testing data-intensive systems is to extract test databases from live production streams and verify the data in them through a checklist of requirements, either by tedious manual observation or by executing complex SQL queries composed and understood by very few domain experts. We elevate the specification of such requirements on data by modelling data interactions between fields cross-cutting the test database’s schema. These interactions are modelled as test cases in a classification tree model. The model documents intuitive expert knowledge about what to expect in the test database and is given executable semantics using our human-in-the-loop tool DEPICT. DEPICT verifies whether interactions occurred in systematically extracted test databases. Non-occurrence of expected interactions or occurrence of unexpected interactions indicates faults in the data. We present experiences on how our model-driven approach has been successfully applied to verify test databases in the Norwegian Public Sector. In particular, we present case studies at (1) the Norwegian Customs and Excise Directorate for verifying the adherence to customs regulations and (2) the Cancer Registry of Norway to verify its data quality management process involving both human coders and complex legacy batches.

Citation Key: 23816
