Project collaborators
Funded by
German Research Foundation (DFG)
Duration of the Project
  • 2011 - 2013 (first phase)
  • 2014 - 2016 (second phase)

In the first project phase, we developed heuristic algorithms that extract references to research datasets from social science publications and integrated the resulting links into the Primo Search Portal of Mannheim University Library and the Datenbestandskatalog of GESIS. Figure 1 shows the basic idea behind the project. A user can search in either Primo or the Datenbestandskatalog and gets both relevant publications and relevant research data. Publications and research data are also linked to each other, so the user can see which research data is mentioned in which publication and vice versa.

Figure 1: Searching for a publication can lead to a dataset and vice versa
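
To make the idea of bidirectional links concrete, here is a minimal Python sketch. It is an illustration only and not the project's actual algorithm: the list of known dataset names, the regular expression, and the function names `extract_mentions` and `build_links` are all assumptions made for this example. It looks for a known dataset name, optionally followed by a four-digit year, and builds a lookup in both directions.

```python
import re
from collections import defaultdict

# Hypothetical list of known dataset names (an assumption for this sketch);
# a real system would rely on a curated dataset registry.
KNOWN_DATASETS = ["ALLBUS", "SOEP"]

# A known name, optionally followed by a four-digit year, e.g. "ALLBUS 2010".
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, KNOWN_DATASETS)) + r")\b"
    r"(?:\s+((?:19|20)\d{2}))?"
)


def extract_mentions(text):
    """Return the sorted, de-duplicated (name, year or None) pairs found in text."""
    found = {(m.group(1), m.group(2)) for m in PATTERN.finditer(text)}
    return sorted(found, key=lambda pair: (pair[0], pair[1] or ""))


def build_links(publications):
    """Build both link directions from a {publication id: full text} mapping."""
    pub_to_data = {}
    data_to_pubs = defaultdict(list)
    for pub_id, text in publications.items():
        mentions = extract_mentions(text)
        pub_to_data[pub_id] = mentions
        for mention in mentions:
            data_to_pubs[mention].append(pub_id)
    return pub_to_data, dict(data_to_pubs)
```

A mention without a year, such as a bare "ALLBUS", is returned with `None` as the year, so the sketch cannot tell which edition of the dataset is meant. Simple pattern matching alone leaves that question open.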

In the current phase of the project, the goal is to extend the results of the first phase in terms of quality, quantity and supported languages. Furthermore, the technical infrastructure will be developed further to provide a solid and vendor-independent foundation for the algorithms. Last but not least, the nature of publication-dataset links will be thoroughly examined and formalized to allow a more fine-grained description of these links.

See also our current goals, add our blog to your RSS reader, follow us on Twitter and check out our code on GitHub.

Figure 2 illustrates one reason why linking research data can be difficult: the granularity of the data being referenced.

Figure 2: What does 'ALLBUS' refer to here?