Supporting Data Interlinking in Semantic Libraries with Microtask Crowdsourcing

December 03, 2014

Presenter: Cristina Sarasua (Institute for Web Science and Technologies (WeST), University of Koblenz-Landau, Germany)

Semantic Web technologies enable the integration of distributed data sets curated by different organisations for different purposes. Descriptions of particular resources (e.g. events, persons or images) are connected through links that explicitly state the relationship between them. By connecting data from similar or disparate domains, libraries can offer more extensive and detailed information to their visitors, while librarians gain better documentation for their cataloguing activities. Despite the advances in data interlinking technology, human intervention is still a core aspect of the process. Humans, in particular librarians, are crucial both as knowledge providers and as reviewers of the automatically computed links. One problem that arises in this scenario is that libraries might have limited human resources dedicated to authority control, so running the time-consuming interlinking process over external data sets becomes troublesome. Microtask crowdsourcing provides an economical and scalable way to involve humans systematically in data processing. The goal of this talk is to introduce the process of crowdsourced data interlinking in semantic libraries, a paid crowd-powered approach that can support librarians in the interlinking task. Several use cases illustrate how our software, which implements the crowdsourced data interlinking process, could help reduce the amount of information that librarians need to process when enriching their data with other sources, or help obtain a different perspective from potential users. In addition, the challenges that become relevant when adopting this approach are listed.
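As a rough illustration of how crowd answers could reduce the number of links a librarian has to inspect, the sketch below aggregates worker votes on automatically computed candidate links. It is not the software presented in the talk: the CandidateLink type, the aggregate_votes function, the "yes"/"no" answer format, the agreement threshold and the example.org URIs are all assumptions made for this example, and simple majority voting stands in for whatever aggregation the actual system uses.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateLink:
    """A link proposed by an automatic interlinking tool (hypothetical type)."""
    source: str    # resource in the library's own data set
    target: str    # resource in the external data set
    relation: str  # relationship stated by the link


def aggregate_votes(votes, min_agreement=0.5):
    """Split candidate links by crowd consensus (assumed majority voting).

    votes maps each CandidateLink to a list of worker answers, "yes" or "no".
    Returns (accepted, needs_review): links the crowd clearly confirmed, and
    links without a clear majority that a librarian should still check.
    Links the crowd clearly rejected appear in neither list.
    """
    accepted, needs_review = [], []
    for link, answers in votes.items():
        if not answers:
            needs_review.append(link)
            continue
        top_answer, top_count = Counter(answers).most_common(1)[0]
        agreement = top_count / len(answers)
        if agreement <= min_agreement:
            needs_review.append(link)   # tie or weak majority
        elif top_answer == "yes":
            accepted.append(link)       # clear confirmation
    return accepted, needs_review


# Hypothetical candidate links and worker answers, for illustration only.
link_a = CandidateLink("http://example.org/lib/person/1",
                       "http://example.org/ext/person/9", "owl:sameAs")
link_b = CandidateLink("http://example.org/lib/person/2",
                       "http://example.org/ext/person/7", "owl:sameAs")
link_c = CandidateLink("http://example.org/lib/event/3",
                       "http://example.org/ext/event/5", "owl:sameAs")

votes = {
    link_a: ["yes", "yes", "no"],  # clear majority for the link
    link_b: ["no", "no", "yes"],   # clear majority against the link
    link_c: ["yes", "no"],         # tie, left for the librarian
}

accepted, needs_review = aggregate_votes(votes)
```

Under these assumptions, only links without a clear crowd majority are passed on to the librarian, which is one way the amount of information to be processed manually might shrink.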



