Data integration involves combining data residing in different sources and providing users with a unified view of them.[1] This process becomes significant in a variety of situations, both commercial (such as when two similar companies need to merge their databases) and scientific (for example, combining research results from different bioinformatics repositories). Data integration appears with increasing frequency as the volume of data (that is, big data[2]) and the need to share existing data explode.[3] It has become the focus of extensive theoretical work, and numerous problems remain unsolved.

