With GraphQL and Jupyter notebooks we can now start to explore the PID Graph in a variety of ways. For this topic I picked the graph of FREYA-funded work, and the Jupyter notebook can be found here.
The resulting graph looks like this (FREYA grant: yellow, publications: blue, researchers: green, a particular output highlighted in white):
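As a rough illustration of how such a colored graph could be assembled, here is a minimal sketch in plain Python. It is not the code from the notebook: the `sample` data, the identifiers, the `COLORS` mapping, and the `build_graph` helper are all hypothetical and only mirror the color scheme described above.

```python
# Hypothetical, hand-written sample data. This is NOT real FREYA data and
# NOT the response format of the DataCite GraphQL API.
sample = {
    "grant": {"id": "grant:example"},
    "publications": [
        {"id": "doi:10.1234/a", "creators": ["orcid:0000-0000-0000-0001"]},
        {
            "id": "doi:10.1234/b",
            "creators": [
                "orcid:0000-0000-0000-0001",
                "orcid:0000-0000-0000-0002",
            ],
        },
    ],
}

# Color scheme taken from the description of the graph above.
COLORS = {
    "grant": "yellow",
    "publication": "blue",
    "researcher": "green",
    "highlight": "white",
}


def build_graph(data, highlight=None):
    """Return (nodes, edges): nodes maps each PID to a color,
    edges is a set of (source, target) tuples."""
    nodes = {}
    edges = set()

    grant_id = data["grant"]["id"]
    nodes[grant_id] = COLORS["grant"]

    for pub in data["publications"]:
        pub_id = pub["id"]
        # One output can be highlighted, as in the figure.
        if pub_id == highlight:
            nodes[pub_id] = COLORS["highlight"]
        else:
            nodes[pub_id] = COLORS["publication"]
        edges.add((grant_id, pub_id))

        for creator in pub["creators"]:
            # A researcher may appear on several outputs; add the node once.
            nodes.setdefault(creator, COLORS["researcher"])
            edges.add((pub_id, creator))

    return nodes, edges


nodes, edges = build_graph(sample, highlight="doi:10.1234/b")
```

The resulting `nodes` and `edges` could then be handed to any graph-drawing library; keeping the structure as plain dictionaries and sets makes it easy to inspect in a notebook cell.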
Some initial observations:
- This graph is automatically generated by a Jupyter notebook. The notebook is available to everyone via a GitHub repository, and it takes minimal effort to generate, for example, a similar graph for a different grant-funded project.
- I am using a DataCite-specific query to find all grant-funded outputs. Going forward we want to use persistent identifiers for grants, and work is ongoing on a standard approach for globally unique grant identifiers. We have also started to collaborate with OpenAIRE and their API for projects.
- Not every significant output of the FREYA project has a persistent identifier, and not all outputs include funding information in their metadata. These kinds of visualizations provide a good incentive to fill these gaps, and we will add the missing information where possible, e.g. to this dataset published last year.
- Zenodo generates two persistent identifiers and associated metadata for every output that is registered with them (one for the specific version and one for the “concept”, see their blog post about versioning). You can see some examples in the graph, and you can also see that in this case the second identifier does not provide additional information (as the connections to both PIDs are exactly the same). Depending on the use case and particular dataset, we will often not show all the connected PIDs. Because we use a public Jupyter notebook, the processing of the data is transparent and can easily be tweaked by other users.
- This graph shows only the immediate connections to grant funding and the references of the grant-funded outputs. It can be greatly expanded by adding more connections. With a focus on the FREYA grant, this could for example mean adding the organizations involved in the FREYA project and how they are connected to the researchers and outputs. More broadly, it could include other projects that are also part of the European Open Science Cloud (EOSC), and show the connections between these projects.
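
The Zenodo observation above (two PIDs whose connections are exactly the same) suggests one kind of tweak a user might make to the processing. The following is a minimal sketch under stated assumptions, not the approach taken in the notebook: the `drop_redundant` helper, the edge list, and the version/concept pairing are all hypothetical, and the pairing is assumed to be known in advance rather than looked up from metadata.

```python
from collections import defaultdict


def drop_redundant(edges, pairs):
    """Remove the second PID of each (kept, candidate) pair when both
    PIDs have exactly the same neighbors (ignoring each other)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    redundant = set()
    for kept, candidate in pairs:
        # Only drop the candidate if it adds no connection of its own.
        if neighbors[kept] - {candidate} == neighbors[candidate] - {kept}:
            redundant.add(candidate)

    return {
        (a, b)
        for a, b in edges
        if a not in redundant and b not in redundant
    }


# Hypothetical edges: a version DOI and a concept DOI with identical links.
edges = {
    ("grant:example", "doi:10.1234/version"),
    ("grant:example", "doi:10.1234/concept"),
    ("doi:10.1234/version", "orcid:0000-0000-0000-0001"),
    ("doi:10.1234/concept", "orcid:0000-0000-0000-0001"),
    ("grant:example", "doi:10.1234/other"),
}

# Hypothetical pairing, assumed to be supplied by the user.
pairs = [("doi:10.1234/version", "doi:10.1234/concept")]

reduced = drop_redundant(edges, pairs)
```

Restricting the check to explicitly listed pairs is a deliberate choice in this sketch: collapsing every pair of nodes with identical neighbors would also merge unrelated outputs that merely share a grant and an author.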