Services for the European Open Science Cloud

Ensuring long-term access to EOSC resources: PIDs as a service

The European Open Science Cloud (EOSC) is on its way to becoming the reference point for researchers and innovators to discover, access, use and reuse a broad spectrum of research-related services. This will mean broader access to resources supporting scientific discovery and collaboration across disciplinary and geographical boundaries. To ensure that EOSC resources are easily accessible and that their usage can be tracked continuously, appropriate mechanisms must be put in place. Persistent Identifiers (PIDs) are a good solution to this problem. There are currently several PID providers, of which DataCite, EUDAT, CrossRef and ORCID are among the most popular. Their persistent identifier systems address different use cases.

ORCID provides PIDs that identify researchers unambiguously. DataCite and CrossRef PIDs are aimed at publications and datasets, and a minimum set of metadata must be supplied before a Digital Object Identifier (DOI) is assigned. DataCite also uses metadata to connect a resource to other resources, including datasets, software, publications, people and funding. Ideally, these connected resources also carry persistent identifiers, such as DOIs from DataCite or other DOI registration agencies, handles, or ORCID IDs. The connections can be navigated in advanced ways using the PID Graph, a service developed in the EC-funded FREYA project.

EUDAT provides a flexible PID service infrastructure based on the Handle System (the EUDAT B2HANDLE service). EUDAT PIDs offer persistent references for all kinds of scientific artefacts, at all stages of the scientific process and across the data life cycle. Communities use EUDAT PIDs within their workflows from the moment digital objects (DOs) are first stored within a Research Infrastructure (RI). As RIs work in a distributed environment where data is located and maintained by different institutions, PIDs enable persistent access to DOs independent of the location where the data is stored.

Handles provided by EUDAT are mostly used for large, early-stage data outputs, whereas DataCite DOIs, which themselves use handles for persistent identification and resolution, are assigned to more permanent datasets. Several organisations already use both the DataCite and EUDAT systems, but the workflows that make the best use of the two together have not yet been explicitly identified. This is why DataCite and EUDAT are working together to provide PID services that support the whole research data life cycle within the future EOSC landscape.

Through this collaboration, it should be possible to deliver PIDs for datasets and digital objects more effectively and to identify a workflow that enables European organisations to use handles for raw data and move seamlessly to DOIs for more permanent datasets. In the context of the EOSC-hub and FREYA projects, the following activities are currently under development:

  1. The creation of a PID Service Provider Catalogue. The catalogue will merge information about PID services and their providers into a single resource, helping researchers, data managers and other EOSC users to discover and use PID services for their data management. A first version of the catalogue will be available by the end of 2020.
  2. The onboarding of the PID services in the EOSC Catalogue.
  3. Joint training and communication activities. To support data managers and researchers, these will cover best practices for using PIDs, PID decision trees, and how to connect PIDs for data with other resources such as publications or funding. See also the FREYA PID Forum Knowledge Hub.

The results of this collaboration feed into the work of the PID Task Force of the EOSC FAIR and Architecture Working Groups, to which the two projects are actively contributing. As an early result, the Second draft Persistent Identifier (PID) policy for the European Open Science Cloud (EOSC) has recently been released. The final achievements of the collaboration will be presented at the joint EOSC-hub/FREYA/SSHOC event taking place in Amsterdam in November 2020.

05/06/2020