GBIF.org has 1.3B individual records drawn from tens of thousands of datasets. Each dataset has a DOI, and we currently track citations at that level, which works well. It would be useful to offer a consistent record-level, handle-based identifier so that people can link to records reliably, e.g. to cite individual records in a publication or to provide annotation services.
I can foresee several ways to achieve this, listed here with some initial pro/con commentary:
- Templated DOIs. Pros: cheap, easy, scalable for DataCite. Cons: can only track citations at dataset level.
- DataCite DOI for each specimen. Pros: easy for everyone. Cons: lots of unnecessary DOIs created, scalability challenge and bottleneck in DataCite.
- GBIF becomes a DOI authority (or other handle authority). Pros: raises the visibility of GBIF, infrastructure already built with mature metadata standards. Cons: misalignment with DataCite, not portable to other domains, additional governance and costs for GBIF (currently a DataCite member, but not a DOI Foundation member).
- Mint DOIs on demand. Pros: only the records that are needed get DOIs, so it is less wasteful. Cons: burden for users, no consistent resolution across all records, and anyone looking to link programmatically has to mint first rather than just link.
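
To make the first option concrete, here is a minimal sketch of what a templated identifier could look like: a record-level identifier is derived from the dataset DOI plus a record key, so nothing is registered per record. The `@` separator, the function names, the DOI `10.1234/example-dataset` and the record key are all assumptions for illustration, not an existing GBIF or DataCite convention.

```python
def templated_record_id(dataset_doi: str, record_key: str, sep: str = "@") -> str:
    """Derive a record-level identifier from a dataset DOI (hypothetical scheme)."""
    if not dataset_doi.startswith("10."):
        raise ValueError("expected a bare DOI such as 10.1234/example-dataset")
    if sep in record_key:
        raise ValueError("record key must not contain the separator")
    return f"{dataset_doi}{sep}{record_key}"


def split_record_id(identifier: str, sep: str = "@") -> tuple[str, str]:
    """Recover the dataset DOI and record key from a templated identifier."""
    dataset_doi, found, record_key = identifier.rpartition(sep)
    if not found or not dataset_doi or not record_key:
        raise ValueError(f"not a templated record identifier: {identifier!r}")
    return dataset_doi, record_key


# Hypothetical dataset DOI and record key, for illustration only.
record_id = templated_record_id("10.1234/example-dataset", "12345")
dataset_doi, record_key = split_record_id(record_id)
```

Splitting the identifier always leads back to the one registered DOI, the dataset's, which is why citation tracking would stay at dataset level under this option.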
I am interested in wider discussion of this topic, both for GBIF (biodiversity observation / specimen records) and with wider communities who may already have sophisticated systems in place for handling large volumes of data and are looking to connect with DataCite.