Creating resolvable PIDs without registering them in a registry, through CIDs and IPFS

Dear all,

For a while now I’ve been pondering the possibility of creating persistent, unique and resolvable identifiers, without having to register these identifiers anywhere in a specific registry. This whole idea came up when trying to find a way to register PIDs for European healthcare databases in the IMI EHDEN project in which I’m involved, and I realized how hard it is currently to do that in a future-proof way (hopefully EOSC can address this going forward).

But… is this even possible? Yes, it is theoretically possible, if we agree on a standardized and reproducible way of encoding data and using its hash as the identifier. :nerd_face: But more than that, the technology already exists today: the IPLD standard with CIDs as identifiers, backed by the growing IPFS (InterPlanetary File System), an ambitious project to decentralize the internet. If you have no idea what I’m talking about (that happens, often to myself :upside_down_face:), perhaps the clearest explanation is this small article with code examples: https://docs.ipld.io/tutorial.html#addressing.
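To make the "hash as identifier" idea concrete, here is a minimal Python sketch using only the standard library. It assumes a simple canonical JSON encoding (sorted keys, no whitespace) as the "standardized and reproducible" serialization, and wraps the SHA-256 digest as a CIDv1 with the raw codec. The record fields are made up for illustration, and the result only matches what IPFS would produce for a small file stored as a single raw block; larger files are chunked, so their root CID is computed differently.

```python
import base64
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize a record deterministically: sorted keys, no extra whitespace."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def cid_v1_raw(data: bytes) -> str:
    """Build a CIDv1 string (raw codec, sha2-256, base32) for a single block."""
    digest = hashlib.sha256(data).digest()
    multihash = bytes([0x12, 0x20]) + digest      # 0x12 = sha2-256, 0x20 = 32-byte digest
    cid_bytes = bytes([0x01, 0x55]) + multihash   # 0x01 = CID version 1, 0x55 = raw codec
    b32 = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + b32                              # "b" = multibase prefix for lowercase base32


# Hypothetical records: same content, different key order.
record_a = {"name": "Example Hospital Database", "country": "NL"}
record_b = {"country": "NL", "name": "Example Hospital Database"}

pid_a = cid_v1_raw(canonical_bytes(record_a))
pid_b = cid_v1_raw(canonical_bytes(record_b))

print(pid_a)
print(pid_a == pid_b)  # identical content yields the identical identifier
```

Anyone who has the same record and follows the same encoding rules derives the same identifier independently, which is why no registry is needed to mint it.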

I just don’t know a good place to bring this topic up and experiment with it. The open science / FAIR digital objects community where PIDs live and the blockchain world of CIDs have almost zero intersection; I just happen to be interested in both. So I’m trying this forum, which is about PIDs: has anyone thought about this, or is anyone interested in experimenting with it?

Greetings,

Kees

Hi Kees,

Interesting topic. Can you elaborate on the resolvability of CIDs? I see no mention of URIs and protocols, so I’m a bit confused. If there’s no implied protocol and host, I don’t understand how a CID is more than a hash. (For example, https://proto.school/anatomy-of-a-cid/01 mentions that QmcRD4wkPPi6dig81r5sLj9Zm1gDCL4zgpEj9CfuRrGbzF is a CID, while actually linking to https://ipfs.io/ipfs/QmcRD4wkPPi6dig81r5sLj9Zm1gDCL4zgpEj9CfuRrGbzF.)

https://docs.ipld.io/tutorial.html#addressing mentions that “We can now ask any random device on the internet do you have this CID?” I feel like you can already do that with any HTTP HEAD/GET request and ask random devices if they have a given path, then inspect the response code. I understand the benefit of knowing, with CIDs, whether the server lies, but I feel like I’m missing the bigger picture.
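To illustrate the "whether the server lies" point, here is a minimal Python sketch of the check a client could run on whatever bytes a peer or gateway returns. It is an illustration under assumptions, not how any IPFS client is actually implemented: it only handles base32 CIDv1 strings with a single-byte codec (such as raw) and a sha2-256 multihash, and it assumes the content fits in one block.

```python
import base64
import hashlib


def digest_from_cid(cid: str) -> bytes:
    """Extract the sha2-256 digest from a base32 CIDv1 string."""
    if not cid.startswith("b"):
        raise ValueError("only base32 CIDv1 strings (prefix 'b') are handled here")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that CID strings omit
    raw = base64.b32decode(body)
    # Layout assumed here: version, codec (one byte), hash code, hash length, digest.
    if len(raw) != 36 or raw[0] != 0x01 or raw[2] != 0x12 or raw[3] != 0x20:
        raise ValueError("expected CIDv1 with a 32-byte sha2-256 multihash")
    return raw[4:]


def matches_cid(cid: str, data: bytes) -> bool:
    """Return True if the received bytes hash to the digest embedded in the CID."""
    return hashlib.sha256(data).digest() == digest_from_cid(cid)


# Build a CID for a small payload so the example is self-contained.
payload = b"hello open science"
header = bytes([0x01, 0x55, 0x12, 0x20])  # CIDv1, raw codec, sha2-256, 32 bytes
cid = "b" + base64.b32encode(header + hashlib.sha256(payload).digest()).decode("ascii").lower().rstrip("=")

print(matches_cid(cid, payload))          # bytes from an honest peer
print(matches_cid(cid, payload + b"!"))   # bytes that were altered in transit
```

With a plain HTTP path, a 200 response says nothing about whether the body is the resource you meant; with a CID, the identifier itself carries the information needed to verify the answer, regardless of who served it.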

Also, how do you handle metadata without a registry? You cannot base the CID on data that is subject to change (hosting organizations, addresses, etc.). So how do you exchange metadata about a given resource, and how do you link different versions (hence different CIDs) of a given resource?

I also wanted to point out that CIDs’ dependency on cryptographic hashes reminded me of the SWHIDs used by Software Heritage, which use a URI scheme registered with IANA, in case you don’t already know that project.

Hi Luc,

Thanks for your reply! Your questions help a lot in brainstorming this. I guess the other part of the puzzle is IPFS itself. The ipfs.io/ipfs gateway is just a way to bridge today’s centralized internet stack to IPFS. Think about it as a permanent peer-to-peer network where, as long as you are connected to some node that has the data (which could be a researcher at the next university who also downloaded the same dataset), you can always retrieve it.
The original video from Juan Benet probably says it best (https://www.youtube.com/watch?v=HUVmypx9HGI) - the newer ones are very flashy, but you can also look at a hands-on overview like this: https://www.freecodecamp.org/news/ipfs-101-understand-by-doing-it-9f5622c4d4ed/.

Of course, the big question is, will IPFS actually get enough traction, because its user community is growing but still small compared to the current web. On the other hand, you really only need a handful of data producers and users, the network auto-scales… and as a bonus it really would work nicely on Mars too.

Your other point about metadata handling is a good one. I think it’s not difficult to add metadata because IPLD supports links between blocks. You can see that in how qri.io uses IPFS (e.g. https://ipfs.io/ipfs/QmTKhugTGYXe9ozosHSdeSDCfjr3fi5DDCzmeqM9czfExE). Like data standards, metadata standards will evolve over time, but if you used something like JSON-LD, you would have a decentralized linked web of data. If someone has already defined an entity that you want to reuse, you can just copy the JSON-LD, and as a bonus you would be storing an additional copy of that block.
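As a rough sketch of what a metadata block linking to a data block could look like, here is a small Python example. It is not how qri.io does it: the field names and values are hypothetical, a hex SHA-256 of canonical JSON stands in for a real CID, and the link is simply written in the `{"/": ...}` shape that DAG-JSON uses for links.

```python
import hashlib
import json


def block_id(obj: dict) -> str:
    """Hex sha2-256 of a deterministic JSON serialization (stand-in for a real CID)."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Hypothetical data block.
dataset = {"rows": [[1, "a"], [2, "b"]]}
dataset_id = block_id(dataset)

# Hypothetical JSON-LD style metadata block that points at the data block.
metadata = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dataset",
    "distribution": {"/": dataset_id},  # link to the data, DAG-JSON style
}
metadata_id = block_id(metadata)

# Editing the metadata yields a new identifier, while the link to the data is unchanged.
edited = dict(metadata, name="Example dataset (renamed)")
print(block_id(edited) != metadata_id)
print(edited["distribution"]["/"] == dataset_id)
```

The data block never needs to know about its metadata; any number of metadata blocks, following different standards, can point at the same data identifier.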

However, you wouldn’t be able to version the metadata, because this whole idea hinges on the immutability of the content. As soon as you introduce versioning, you will have to have some sort of registry. This could still be a decentralized one, such as IPNS (https://docs.ipfs.io/concepts/ipns/, see https://dweb-primer.ipfs.io/publishing-changes) or the ENS (https://ens.domains/), but that’s not the idea I have in the title.

Thanks for the link to SWHIDs, I see a number of parallels there with IPFS and IPLD indeed!

Greetings,

Kees

Hi @keesvanbochove, exciting to read that you looked into resolvable PIDs with IPFS and Co.! :slight_smile:

I wondered about that issue, too, and found the following solutions, depending on the level of service you need:

  1. w3id.org – is managed via GitHub pull requests, so there is a delay between requesting an identifier and its availability. But: it works out of the box.
  2. Trying to sneak into public infrastructure that mints handles (handle.net): here it took me roughly 3 months to get answers via email, and I needed to try several public handle services, but ultimately the most responsive (and still free!) was https://epic.grnet.gr/ – pidconsortium.net is still in the making (so will take around 4 months).

In your post you also referred to “community uptake / traction” – although I love the idea of IPFS, I believe handles (and thus http) are the way to go right now (and even handles are already quite advanced, I often feel). But again, thanks for sharing, I would say this is a good place indeed. :slight_smile:

See you around!
