About Identities

After yesterday’s discussion of this dreaded topic, it occurred to me that there might actually be a relatively straightforward solution.

Massi showed in radicle-dev/radicle-link#39 that, by embedding the signatures in the metadata file, the commit identities of the history of this file become irrelevant — any git history which yields a valid sequence of updates will do, regardless of who created the commits and when (i.e. what the SHA1 is).

If this is true, we could do the same thing did-git does: simply place the identity documents of maintainers right next to the project metadata in the same tree and branch. The files would have any human readable name, and be referenced from the project metadata by that name. Example:

project.json
kim.json
massi.json
finto.json

project.json would simply contain:

{
    ...
    "maintainers": ["kim", "massi", "finto"]
}
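
As a rough sketch of how a reader could follow those names, assuming the tree of the metadata branch is available as a mapping from file name to contents (the `resolve_maintainers` helper and the `<name>.json` convention are hypothetical, not anything radicle-link defines):

```python
import json


def resolve_maintainers(tree: dict) -> dict:
    """Map each maintainer name in project.json to its identity document.

    `tree` stands in for the flat tree of the metadata branch:
    file name (str) -> file contents (bytes).
    """
    project = json.loads(tree["project.json"])
    resolved = {}
    for name in project["maintainers"]:
        filename = f"{name}.json"  # assumed naming convention
        if filename not in tree:
            raise KeyError(f"maintainer {name!r} has no identity document")
        resolved[name] = json.loads(tree[filename])
    return resolved
```

Note that this only resolves names to files; it says nothing yet about which keys in those files are eligible to sign.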

Inside those identity documents, we place any public keys that person might identify with, possibly even adopting the DID format to provide hints for how to verify ownership. Applying TUF rules for validating (device) key additions or revocations becomes a DID extension the radicle-link protocol mandates, but applications may support other methods such as the radicle registry. For example, kim.json could look like the following:

{
    "rad-version": 1,
    "revision": 1,
    "profile": { "nickname": "kim" },
    "@context": "https://radicle.xyz/did/v1",
    "@id": "did:rad:<projectid>",
    "publicKey": [
        {
            "id": "did:rad:<je ne sais pas>",
            "type": "ED25519SignatureVerification",
            "publicKeyBase58": "deadbeef...",
        },
        {
            // something something with PGP key in PEM
        },
        {
            // whatever describes a registry account / user
        }
    ]
}

One complication remains: what to do with contributors (i.e. non-maintainers)? What we need is to be able to go from a commit to a radicle identity (which may not actually exist, because the patch came in through different means). Several possibilities come to mind:

  • Mandate that people publish their radicle device keys as subkeys of their GPG key to a keyserver.

    Code to do this without storing the private key in the GPG keyring already exists in librad.

  • When merging via our own collaboration tooling, add another parent to the commit which points to the latest version of the contributor’s identity doc in their tree.

    That can look confusing in plain git, and also risks pulling in other unwanted data which happens to sit next to that document. Also, it implies that fast-forward merges are verboten, which may upset some folks.

  • When merging via our own tooling, simply place a copy of the doc in the metadata branch, perhaps under a contributors/ directory.

    While an application can trivially build an inverted index linking public key identifiers to the identity document, we still have the problem that a git gpg-sig won’t contain any hints as to which key was used to create the signature. That is, one has to try every known key until one verifies. This doesn’t seem to be a huge problem in practice, as GPG essentially works this way – but perhaps I’m missing something which allows GPG to do better than O(n).
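
The inverted index mentioned in the last option could look roughly like this. It is only a sketch: the `publicKey` and `id` field names follow the kim.json example above, and `build_key_index` is a made-up name.

```python
def build_key_index(identity_docs: dict) -> dict:
    """Invert identity documents into a lookup: public key id -> document name.

    `identity_docs` maps a document name (e.g. "kim") to its parsed JSON.
    """
    index = {}
    for doc_name, doc in identity_docs.items():
        for key in doc.get("publicKey", []):
            key_id = key.get("id")
            if key_id is None:
                continue  # entries without an id cannot be indexed
            if key_id in index and index[key_id] != doc_name:
                raise ValueError(f"key {key_id!r} is claimed by two documents")
            index[key_id] = doc_name
    return index
```

The index answers "whose key is this?" in constant time, but it does not help with finding out which key produced a given gpg-sig in the first place.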

Let me know your thoughts.

Oh actually trivial: PGP keys usually have one or more email addresses associated with them, and the Committer header should use one of them.

I had a few thoughts about this yesterday.

  • Does updating your maintainer file (eg. kim.json) require a quorum? Perhaps we can avoid that in some way, and only require the maintainer’s signature.
  • Are the maintainer files safe? I think so, because the TUF validation will examine every commit, which includes the modifications of the maintainer files.
  • Maintainers will now have both a maintainer file and a contributor file in their own tree, which may diverge. We should think about what happens in that case. I guess we can use the file with the highest revision.

Can you explain this? I don’t understand what we’re trying to do here.

This is simply about dealing with the situation that a code commit could be signed with a key which doesn’t have any mapping to any of the keys which exist in the Radicle world. For example, as a project migrates to Radicle, contributions which happened before that might be signed with any key (or none at all).

We just had an RTC about this, and @mmassi will update with some thoughts.

Without preempting too much, named references are too naive: the project will still have to list exactly the eligible keys (which is also what did-git mandates). This is so because we have decoupled everything from git signatures, so there is nothing which proves which kim.json you meant. But also because of computational complexity: to verify the project meta, you’d have to read the identity document (potentially following a hyperlink to another project), verify that, and on top of that decide which of the keys listed there was used to create the signature of the project meta.

Regardless, a user’s identity document can evolve independently – you can add or revoke keys listed there via the TUF rules, where you prove ownership of a quorum of the keys (nb. this is our own “DID method”, and may apply only to specific types of keys in that doc – if you want to list other identities you have on the internet, those might require other methods of proving ownership).

However, if you lose a maintainer key, the project has to be updated. I still have to think about this a bit more, because we’d like to avoid having to update it whenever someone adds or removes a device.

I would say that it doesn’t need a quorum from the maintainers, but I think it does need a quorum from the devices: if someone gained access to one of your devices, they could otherwise lock out the rest of your devices by systematically removing your other device keys from each project, possibly even adding their own and completely transitioning ownership over (though not ownership of the project, thanks to the maintainers quorum).

I like the idea of having a “user description file” containing the set of public keys that a user will use.

However I don’t think that in the context of Radicle that file should be stored inside a project, mostly because each user will likely be connected to tens of projects but the full identity information should not be replicated in all of them: this would quickly generate many outdated copies.

Moreover, as @kim just wrote, storing an array like ["kim", "massi", "finto"] in the project.json file does not work because there’s nothing we can certify about those names (signing the file would not sign the keys).

This is directly related to the concept of “unique user ID”.

Radicle is decentralized, therefore identities are always related to keys.

As an added complexity, Radicle has both off chain and on chain data, and it would be perfectly natural to have users that exist either on or off chain, but not necessarily on both; therefore each aspect of their identity is optional.

Identity management on chain is a solved problem so here we’ll just talk about off chain identity.

With this concept of user identity we have two goals:

  • One is letting each user have a unique user ID that identifies them in the system and that cannot be claimed by other users.

  • The other, which builds upon the first, is the concept of “proving the identity of a user” in the sense of “proving that a given piece of data has been authored by a given user”.

This “identity proof” is best implemented as a signature and is easy to do using public/private key pairs.

Which brings us to the conclusion that the actual goal that we have is to associate public keys to a user ID.

At this point it seems natural to describe a user with a file that contains all the public keys they use in Radicle, and that is also signed using all of them.

This gives any receiver of this file confidence that the user possesses all of the private keys associated with those public keys.

The contents of the file would then be the following:

  • The user ID (which is itself a public key)

  • A revision (monotonically increasing number), and a way to reach previous revisions.

  • A set of public keys that the user can use to sign artifacts inside Radicle, like code commits, issue comments, or anything that in the collaboration requires the concept of “authorship”.

  • The set of the user devices, identified by their device ids which are, again, public keys

  • A set of user metadata (name, nickname, email) that is just informative and cannot be proved or claimed in any way.

  • Possibly other metadata that can be claimed referring to external authorities (like blockchain IDs, or maybe email addresses if we find a way to interface with external identity providers).

  • All of the signatures that prove all those claims.
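
To make the list concrete, here is one possible in-memory shape for such a file. All field names are invented for illustration, and keys and signatures are left as opaque values.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class UserIdentity:
    user_id: str                 # itself a public key in this proposal
    revision: int                # monotonically increasing
    previous: Optional[str]      # how to reach the previous revision, e.g. its hash
    signing_keys: frozenset      # keys for signing commits, issue comments, ...
    device_keys: frozenset       # device ids, again public keys
    profile: dict = field(default_factory=dict)          # informative only
    external_claims: dict = field(default_factory=dict)  # e.g. blockchain IDs
    signatures: dict = field(default_factory=dict)       # key -> signature

    def unsigned_keys(self) -> frozenset:
        """Keys listed in the document that have not (yet) signed it."""
        listed = self.signing_keys | self.device_keys | {self.user_id}
        return listed - set(self.signatures)
```

A document would count as complete only when `unsigned_keys()` is empty, matching the idea that the file is signed using all of the keys it lists.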

This file should evolve according to TUF rules, which means that the user identity could be “taken over” only if an attacker gains control of a majority of the user keys (be they device keys, signing keys, or anything else).
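
The majority rule itself is small enough to sketch. This assumes a strict majority as the threshold (TUF proper lets you configure it) and that the cryptographic check of each signature has already happened elsewhere; `has_quorum` is a hypothetical name.

```python
def has_quorum(trusted_keys: set, verified_signers: set) -> bool:
    """True if a strict majority of the previously trusted keys signed the update.

    `verified_signers` must only contain keys whose signatures over the new
    revision were already verified; that step is out of scope here.
    """
    if not trusted_keys:
        return False
    valid = trusted_keys & verified_signers  # ignore signatures from unknown keys
    return len(valid) * 2 > len(trusted_keys)
```

With three keys, two signatures suffice; with two keys, both are needed, so a single stolen key is never enough on its own unless it is the only key.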

One way of representing the evolution of this file would be to store it in a well known branch of a git repository that represents the user.

This repository would be the underlying “database” of every user device, and this branch would be replicated on all the devices (this repo could contain other relevant Radicle metadata, like the list of the projects that the user owns or tracks, but this is beyond the analysis of the user identity which is what we are focusing on).

This evolution of the user identity gives the possibility of evolving the set of keys that represent a user over time.

I still see as an open point the idea of letting users change their main ID.

Technically it should be possible that a user loses control of the private key related to that ID.

When this happens they should just remove that ID from the list of trusted keys inside the file by using TUF quorum voting: in the new revision that key would not be trusted anymore.

However, for identifying the user, it would be best not to change the ID itself (changing it would raise a lot of complications).

The idea is that nobody would be able to create another user identity starting with the same public key.

The problem is that this is true only if the key has been lost.

If the private key were stolen another user could do the following:

  • create another user metadata file
  • use as user ID the public key of the stolen key
  • sign it with that key and an entirely different set of other keys

The problem is that the resulting file would be valid, and would also claim that user ID.
The root of the issue is that the TUF evolution is designed to phase out compromised keys, and in TUF the “root” of the revisions is not accessed using a key but using another kind of identity (a URI).
Here we are trying to still use as “root” a key that is not trusted anymore, and this is probably impossible.

At this point I am stuck debating this problem: this scenario of the stolen key (of the user ID) is something that is hard to handle, so for now I’ll just leave it open.
I fear that this is hard to solve without a blockchain.

Anyway, once we decide what sequence of bytes is a user ID, signatures in Radicle should contain:

  • the user ID
  • a reference to which of their keys (in the user file) has been used to sign
  • the signature itself
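
As a sketch of what that could look like in code, with the actual signature scheme left to the caller (`RadSignature`, `check_authorship` and the `verify` callback are all made-up names, and resolving the user ID to the user file is assumed to have happened already):

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class RadSignature:
    user_id: str      # whatever byte sequence ends up being the user ID
    key_ref: str      # which key in the user file produced the signature
    signature: bytes  # the signature itself


def check_authorship(
    sig: RadSignature,
    payload: bytes,
    user_keys: Mapping[str, bytes],
    verify: Callable[[bytes, bytes, bytes], bool],
) -> bool:
    """Check `sig` against the key it references in the signer's user file.

    `user_keys` maps key references to public keys, as resolved from
    `sig.user_id`. `verify(public_key, payload, signature)` is supplied by
    the caller and does the real cryptography.
    """
    public_key = user_keys.get(sig.key_ref)
    if public_key is None:
        return False  # the referenced key is not (or no longer) in the user file
    return verify(public_key, payload, sig.signature)
```

Carrying the key reference in the signature avoids having to try every known key until one verifies.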

This kind of signature should be the standard way to claim authorship of artifacts when collaborating (which, IMHO, is the actual problem we are trying to solve).

Still digesting this stuff on my side, but I think using DID would be a great way to go. The whole concept pertains to our use case and decentralization in general. It has a [growing/adapting] specification that we can follow (and possibly contribute back to).

I think it allows us to focus in on how we would use this file within radicle, for user information discovery, key verification, etc.

Would we (@kim & @mmassi) be happy to commit to using the DID spec, at least while I think about the other details more?

It might be interesting to think about advertising repos that one maintains via the services field[0]

[0] Decentralized Identifiers (DIDs) v1.0

Thanks @mmassi, many good points!

Hm, is it? As long as the security of the system rests on a single, eternally
valid keypair, I have difficulty seeing how anything is solved.
Specifically key theft is not addressed at all.

There are a few different considerations, which led me to the conclusion that
both inlining as well as indirection should be supported:

  • Pseudonymity

    It is a desirable property of the system that the user can choose in which
    context they want to reveal which information about themselves. Using an
    identity for only one, a subset, or all projects the user is participating
    in should be the user’s choice.

  • Interoperability and Adoption

    We have multiple avenues of interoperability with other systems, including
    plain old git storage hosting. It is desirable to be able to try out the
    system without forcing a concerted migration effort. Inlining removes the
    requirement to be able to resolve a URI-based indirection.

  • Consistency and Availability

    There are good reasons to optimise for availability of all information within
    the context of one project, and also to optimise for cross-project consistency
    at the expense of availability. I think the tradeoff should be made at a
    higher level.

  • Uniformity

    At its core, radicle-link is simply a fancy way to distribute and discover
    version control repositories on a network. That is, the only way data can
    exist is in the context of a repository. The concept of a “project” provides a
    way to address and correlate repositories. I am not at this point convinced
    that introducing another primitive reduces the overall complexity or
    indirection.

I agree, however, that there is no reason an identity statement needs to bind
itself to any repository (or project) in particular, we only require the other
direction: a project binds one or more identities. Since we can index into a
repository via branches, the identity statement may in fact exist under multiple
project namespaces.

My first idea was also to use a public key. I discarded this because of the
following reasoning: what is this keypair used for?

  1. Nothing but establishing the user ID.

    This would mean we can throw away the secret key after generation. We could
    just as well use a random sequence of bytes, or a UUID in this case.

  2. Signing updates to the user profile (including other keys).

    This is basically how GPG works, if used properly. The key now has to be kept
    in secure offline storage, which is terrible because people don’t do that for
    convenience reasons.

  3. Signing code.

    That’s the worst option, because a) it encourages key transfer, b) the key is
    used all the time, increasing vulnerability, c) revocation changes the
    identity, and d) key rotation requires a web-of-trust, or otherwise external
    PKI we’d be tightly coupled to, or have to replicate within our system.

In summary, I think the only property we need for a “user ID” is that it is some
kind of stable identifier, paired with some rules for how to resolve it. What it
resolves to should indeed be some kind of document which contains claims of
properties of the described subject. Inlining proofs of those properties
should not generally be necessary, but possible for certain application-level
needs, e.g. two-way attestation of the on-chain identity or a GPG key also
present on external PKI. Note that this is essentially convenience: we could
also require that an online challenge against those external systems has to be
performed in order to obtain ownership proofs.

What we do need, however, is ownership proofs of those claims themselves (and by
extension the user ID, if we think of it as a content address of the document).
This is where the device keys come into play.

We can go into more detail as to why they exist, but for now I would only like
to point out the following: TLS ensures that a remote machine is operated by a
certain legal entity, and that the data it serves is not altered while in
transit between the remote machine and ours. It does not prove anything about
the data itself: if we put a piece of data on the server, and then request it
back, we should end up with the same sequence of bytes. For this, we either have
to trust the remote machine and its operator, or put our own integrity
protection in place. Which is exactly what we need in a peer-to-peer system,
because we’re talking to intermediaries most of the time. Thus, we repurpose the
origin certificate of TLS to also prove authorship, couple it with an integrity
proof, and require all intermediaries to present this along with the data. This
is why we require a radicle peer to provide a signature over refs/heads.

Now, bear with me, we’re conflating the concept of a “server” and that of a
“user” because, simplified, git commits don’t commute (i.e. order matters). We
thus need to defer conflict resolution to either a central timestamping service
(which we don’t want), or to humans on the read end (which is actually a normal
thing to do when working with version control). Hence we keep all data, and
decide what to do with it at the very end.

Alright. So if a user and a server is basically the same thing, and a user also
wants to claim certain properties of themselves, it seems plausible to use those
device keys to sign them. We get for “free” that the security of these claims
increases with the number of devices a user owns (in the sense that it becomes
more difficult to take over the identity, which is an effective countermeasure
against key theft), a revocation mechanism, and in some sense a web-of-trust via
third-party attestation in the context of projects.

Note that this does not protect any other keys you might be using against
compromise. I’d argue that this is perfectly fine – if you rely on other
cryptosystems, you also need to use their validation methods. We don’t have to
support them within radicle-link. What we could do, however, is to teach git
to use radicle device keys for code signing, which is not only easy (cf.
gpg.program in git-config(1)), but incidentally also how one would properly
use GPG (I mean, using device-local keys only), yet without the terrible UX. The
drawback would be that the project would again have to approve of all devices
of a user, which we just got rid of.

Are you still with me? Cool :slight_smile: So, I think, yes @fintohaps, DID is the right
approach, if not in letter, then in spirit. We would simply extend it with a
“method” which applies to only our keys.

The question remains, how do we construct the “user ID”?

The obvious choice would be a hash over the initial revision of the identity
document, although this creates the nuisance that it can not refer to itself.
I’m not sure how much of a problem this is in practice, though.
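
For illustration only, such a hash could be computed like this. SHA-256 and the JSON canonicalisation are my assumptions; in git terms this would rather be a commit or blob hash.

```python
import hashlib
import json


def user_id_from_initial_revision(doc: dict) -> str:
    """Derive a stable ID from the initial revision of an identity document.

    The document is serialised canonically (sorted keys, no extra whitespace)
    so that the same content always yields the same ID. The result cannot be
    stored inside `doc` itself, since that would change the hash.
    """
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Later revisions keep the ID of the initial one, which is what makes the identifier stable across key additions and revocations.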

Now, we need to be able to refer to this ID, and resolve it on the network, so
let’s work backwards from this. We could:

  1. Use this ID as the repository name, and invent a naming convention to
    distinguish it from projects, e.g.: <subject>.rid.git. subject in this
    case would simply be the commit hash of the initial revision.
  2. Bind it to a specific project ID, e.g. <project>.git/<subject>.rid. Here,
    using a content (blob) hash seems wiser, as copying the file to a different
    project would otherwise alter its identity. To preserve content-
    addressability, we would need to encode this hash in the branch name for git,
    e.g. refs/heads/.rad/<subject>.rid
  3. Don’t mention the repository at all, and just say <subject>.rid. This would
    also suggest a content hash, or something else entirely (see below).
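
Side by side, the three spellings would come out as follows (a throwaway sketch; the function names are invented and `subject` / `project` are just opaque strings here):

```python
def repo_name_form(subject: str) -> str:
    """Option 1: the ID doubles as the repository name."""
    return f"{subject}.rid.git"


def project_bound_form(project: str, subject: str) -> tuple:
    """Option 2: the ID lives under a project; returns (reference, git branch)."""
    return f"{project}.git/{subject}.rid", f"refs/heads/.rad/{subject}.rid"


def bare_form(subject: str) -> str:
    """Option 3: no repository is mentioned at all."""
    return f"{subject}.rid"
```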

The first option is straightforward to resolve on the network, but creates two
problems: it precludes that this repository can be used for anything else than
storing the identity document (or else, the semantics of projects vs. identities
become unclear), and we need to maintain and inspect remote-tracking branches
for every device which signs the document.

The second option is also straightforward, but since the identity document does
not say anything about the project, the subject may appear under more than one
project. In addition, every device must be tracked.

The third option abstracts away the repository backend, but then we need a
custom wire format to transmit the history, and in addition provide a mapping to
materialise that into a repository format for the owner to edit. Or invent a
custom editable, transmittable, and history-preserving format altogether.

Of these, I’d still favor the second option, because it does not change anything
about the discovery and replication protocol we already employ. Also,
project-relative and absolute references are both possible (well, if we allow
$self in a project metadata document). Since the identifier is stable, it is
trivial to maintain a persistent index locally in order to determine the most
recent seen revision across all locally-tracked projects.
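
The local index could be as simple as the following sketch. Persistence and validation of the revisions are left out, and `update_index` is a hypothetical helper.

```python
def update_index(index: dict, subject: str, project: str, revision: int) -> bool:
    """Record a sighting of `subject` at `revision` under `project`.

    `index` maps subject -> (highest revision seen, project it was seen in).
    Returns True if this sighting is newer than anything seen so far.
    """
    current = index.get(subject)
    if current is not None and current[0] >= revision:
        return False  # we already know an equal or newer revision
    index[subject] = (revision, project)
    return True
```

Because the key is the stable identifier, it does not matter under which project namespace a newer revision first shows up.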

While I’m still going through the rest of this reply making sure I understand the individual arguments, I’m a little bit fuzzy on what you mean by the above. What primitive are you referring to?

I think @mmassi brought up that distributing (and discovering) the identity doc could be decoupled from repositories (projects). So, either QUERY(userid) -> [network address], or QUERY(userid) -> [identity doc] (a list because different peers may respond with different revisions, and we have to pick one). I tried working through this towards the end of the post, and found that it would circle us back to solving the same problems as with projects, but with a different API.

Maybe there are good arguments for this API, but currently it seems to me that it would be implemented in terms of the same primitives we have for projects, so I count roughly the same number of indirections :slight_smile:

Just making a note on this. It may be awkward to teach git to sign with device keys, but I think we could also consider using radicle-link to sign commits via libgit2/git2’s commit_signed function.

This makes sense, thanks :slight_smile:

Are you saying that a project must allow all of a user’s devices as maintainers of the project? If so why is that? I can’t follow the argument here :slight_smile:

So many good points!

There are just a few things I either disagree with, or don’t fully understand.

About user IDs and device IDs

I agree that a device ID can represent a user.

But, while a device represents a user, the device is not the user, and the device identity is distinct from the user identity.

More specifically:

  • In radicle, if we receive data from a device we assume that the data has been authored by the device owner (a radicle user): it has been signed with the device private key, and therefore we assume that the device owner put that data in the device git repo.

  • However we should not use the device key as the user ID: devices come and go but the user ID is supposed to be long lived and independent from them.

About using git signatures

I am not sure we should mandate git commit signatures in radicle.

In a past conversation with @kim it turned out that a device that “owns” a branch should be able to serve a signature of the branch tip at any point in time, and a device that “mirrors” a branch should be able to relay those signatures, but we should neither insert nor require those signatures at the git commit level.

About actual user IDs

About what bit sequence to use as a user ID, thanks to @kim I am convinced that this is exactly the same problem as identifying a project.

Actually the parallel is remarkably similar.

With projects we have different parts that work together:

  • a project ID (the hash of the initial commit that introduced the project metadata into the repo)

  • one or more git repositories that contain or track the project

  • the project metadata file (project.json)

  • the git branch that actually contains the project source tree

In radicle given a project ID we can find git repositories that contain it, and this is the “resolution” phase.

Given a repo we find the project.json inside it, and given the project.json we see which is the master branch and, which is even more relevant here, who owns the project.

Anybody can track the project and publish copies of it, but only owners can change the metadata (a new project.json revision without proper signatures would be invalid).

A user.json metadata file, in this sense, is similar to a project.json file: it uniquely describes an “entity” (in this case a user and not a project), and provides information on who owns this entity, in the form of attestations (this is like DID in spirit).

The basic attestation we can support is a public key, where ownership of the corresponding private key is proven with a signature.

Following this reasoning, the most general form of a radicle user ID should be the composition of three sections:

  • a radicle repository ID that can be resolved to a git repo using the P2P protocol (the git repo itself could be stored anywhere, including outside the p2p network)

  • a branch inside that repo

  • the user metadata file inside that branch

The second and third sections could be optional, or decided by convention, but logically they are always there and in general could be arbitrary. By convention they could be respectively rad/id and user.json but they could be any valid branch and filename. Inlining user info inside a project would then be a special case that still follows the general rules.

I did not fully understand @kim’s .2 and .3 options for ID encoding but I feel that this is a variation on those and I think at least the spirit should be similar.

About using user IDs

So far I see three ways in which user IDs can appear in radicle:

  1. inside project.json files to identify maintainers

  2. inside collaboration artifacts to identify the artifact author (like the author of an issue comment)

  3. inside collaboration artifacts to refer to a user (like mentioning another user inside an issue comment)

More uses could pop up but for now I am focusing on these three cases.

The first two have in common the fact that they both need some kind of certification.

In this case the user ID should be a unique reference to a user.json metadata file, and the certification would be composed of a reference to a key inside it and a signature done with that key.

This is probably the only way we have to claim authorship of something inside radicle.

In the third case, however, the need is totally different.

In this case no certification is required which means that a unique reference to a key is not needed. What would be needed is some kind of ergonomic, human readable (and writable!) nickname or username that can be used inside a piece of markdown text.

Unfortunately, in our decentralized context having unique and ergonomic nicknames is likely to be impossible. The best we could do is allow the use of email addresses, which are practically unique enough even if not very ergonomic to write. We could allow nicknames with the understanding that mentions will not necessarily be uniquely resolvable.

I am bringing this up here because at some point the application layer will have to deal with this issue.

Advancing the analysis…

How to use a user ID

A user ID achieves its goals in the following ways:

  • It uniquely identifies a user by means of a “resolution” process that results in a user metadata file that describes the user.

  • It makes it possible to prove authorship (or “approval”) of artifacts by a user by including public keys into the user description (the actual proof is a signature done using the corresponding private key).

  • It is unique in such a way that it must not be possible to “repurpose” it so that it resolves to a different (malicious) user metadata file containing a different set of trusted keys.

Composition of a user ID

Since a user ID must resolve into a user metadata file it is logically composed of the following parts:

  • repo: a “git repository ID” that the radicle P2P protocol can convert into a git repository URL (this is in practice a radicle project ID, and the git repo could be hosted anywhere, either on radicle devices or on traditional centralized git servers).

  • root: Optionally (if the git repo is not a “user repo” that contains only the user metadata) the hash of the commit introducing the user metadata file; if omitted it is equal to repo.

  • Optionally (if it is not the canonical one decided by convention) the branch containing the history of the user metadata file.

  • Optionally (if it is not the canonical one decided by convention) the name of the user metadata file.

The reason why repo and root are distinct is because the user ID needs to provide a hash that is used to verify that the initial revision of the metadata file is the intended one.

It is thanks to this hash that the initial revision of the file is, for practical purposes, fixed and immutable: no other file would have exactly that hash, therefore a given ID can refer only to a revision history that starts with exactly that file.

In this scheme this hash is the root part of the user ID.

The reason why repo and root are both needed is because we want to allow embedding user metadata files into arbitrary repositories.

repo is the pointer to the repository in the radicle P2P network, and root ensures that the referred user metadata initial revision starts with specific contents, and if the user has been added to the repository after the repository creation root is necessarily different than repo.

As mentioned above, if a git repository is actually a “user definition repo” (containing only user metadata) repo and root will be the same.
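
In code, the composition and its defaults could be sketched like this. The `UserId` type is hypothetical; the default branch and file name are the rad/id and user.json conventions floated above, not something decided.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_BRANCH = "rad/id"    # convention suggested above, not settled
DEFAULT_FILE = "user.json"   # likewise


@dataclass(frozen=True)
class UserId:
    repo: str                     # resolvable via the P2P protocol
    root: Optional[str] = None    # hash of the commit introducing the file
    branch: Optional[str] = None
    file: Optional[str] = None

    def resolved(self) -> "UserId":
        """Fill in every omitted part with its conventional value."""
        return UserId(
            repo=self.repo,
            root=self.root if self.root is not None else self.repo,
            branch=self.branch if self.branch is not None else DEFAULT_BRANCH,
            file=self.file if self.file is not None else DEFAULT_FILE,
        )
```

A “user definition repo” is then just `UserId(repo=...)`, while a user inlined into a project carries an explicit root that differs from repo.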

Contents of a user metadata file

Mostly what we have written previously in the discussion, where the most important parts are public keys and their associated signatures.

It is however important that the file also contains purely informative but ergonomic user identifiers, like a nickname, an email address, and a full name.

These are not guaranteed to be unique in the P2P network and not even among the contributors of a given project but they should be used at the application level for ergonomic reasons.

Similarities with DID (Decentralized Identifiers) and git-did

This system is in spirit very similar to git-did.

The similarities are:

  • The user ID can be resolved into a user description file.
  • The user description file contains keys that can be used to sign artifacts (commits).
  • It is possible to store user definition files (and therefore keys) and project code in the same repository, simplifying key retrieval.

The reasons why we are not formally switching to git-did are:

  • The spec is still preliminary.
  • Most importantly, the git-did resolution step ultimately relies on URLs which in turn rely on DNS and HTTPS, and DNS is in practice a centralized system that we don’t want to depend on to use radicle.

Key revocation and possibility of identity theft

Key revocation happens by publishing new revisions of the user metadata file that do not contain the revoked key.

The validity of new revisions is attested by the TUF rules (a new revision must be signed by a quorum of the keys signing the previous revision).

This means that an “identity theft” is possible if someone takes control of a quorum of the user keys. For now we don’t have an answer to this problem (more analysis is needed).

There is another open aspect of key revocation: it should happen at a given moment in time, but it should not invalidate signatures that happened before the revocation.

More generally, the verification of a signature that happened at a given time should be done using the keys provided by the user metadata file revision that was valid at the same time.

This is exactly the same problem we’d have when verifying the history of a project against the set of maintainers: each commit needs to be signed by someone who was a maintainer at the point in time of the commit.

If the metadata file is in the same repo as the artifacts that must be verified, the revocation could reference a commit to specify the point in time; otherwise we would need to use UTC timestamps in both the key revocation event and the artifact signature and compare them.
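A rough sketch of the timestamp variant, to show what “use the revision that was valid at the same time” would mean in code. The `TimedRevision` type and its fields are hypothetical, and keys are plain strings here; in the same-repo case the `u64` timestamp would be replaced by a position in the commit history.

```rust
/// A revision of the user metadata file plus the moment it took effect.
struct TimedRevision {
    /// Seconds since the Unix epoch (UTC) at which this revision was published.
    valid_from: u64,
    /// Keys listed in this revision (hypothetical string encoding).
    keys: Vec<String>,
}

/// Returns the revision in force at `signed_at`.
/// Assumes `history` is sorted by ascending `valid_from`.
fn revision_at(history: &[TimedRevision], signed_at: u64) -> Option<&TimedRevision> {
    history.iter().rev().find(|r| r.valid_from <= signed_at)
}

/// A signature made at `signed_at` counts only if `key` was listed in the
/// revision that was in force at that moment.
fn key_was_valid(history: &[TimedRevision], key: &str, signed_at: u64) -> bool {
    revision_at(history, signed_at).map_or(false, |r| r.keys.iter().any(|k| k == key))
}
```

With this, a key revoked at time 200 still validates a signature made at time 150, but not one made at time 250. Note that the sketch trusts the timestamps themselves, which is its own open problem.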

A further consideration: the kind of ID described above works the same way for projects and users, and this is not by chance.
The “certification” requirements we have are the same in both cases.

In both cases we need to uniquely identify a file and its history, and verify that the history of the file conforms to the TUF rules (every new revision is signed by a quorum of the previous signing keys) so that we are sure that the current “owners” are still the original ones, or are “following their intention”.

In both cases the hash of the initial revision must be part of the ID so that the ID is firmly tied to it: there is no way to make it point to something else.

And in both cases the resolution has four steps:

  • identify a git repository
  • identify a branch on it
  • identify a file in this branch
  • check that the initial revision of that file actually matches the “root” hash embedded in the ID

In any ID the root, branch and file parts could be omitted and fixed by convention but logically they are always there.
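As a sketch of how the parts and their defaults could be handled, assuming the ID is written as `repo:root/branch/filename`. The default branch and file names below are made up, as is the rule that a missing root falls back to the repo (which matches the “user definition repo” case where both are the same). Branch names containing `/` would need an escaping rule that this sketch does not attempt.

```rust
/// Parsed form of an ID written as `repo:root/branch/filename`.
#[derive(Debug, PartialEq)]
struct MetadataId {
    repo: String,
    root: String,
    branch: String,
    filename: String,
}

// Hypothetical conventions, used when a part is omitted.
const DEFAULT_BRANCH: &str = "master";
const DEFAULT_FILE: &str = "user.json";

/// Parses an ID, filling omitted parts by convention.
fn parse_id(s: &str) -> Option<MetadataId> {
    let (repo, rest) = match s.split_once(':') {
        Some((repo, rest)) => (repo, rest),
        None => (s, ""),
    };
    if repo.is_empty() {
        return None;
    }
    let mut parts = rest.splitn(3, '/');
    // Assumption: an omitted root means repo and root are the same.
    let root = parts.next().filter(|p| !p.is_empty()).unwrap_or(repo);
    let branch = parts.next().filter(|p| !p.is_empty()).unwrap_or(DEFAULT_BRANCH);
    let filename = parts.next().filter(|p| !p.is_empty()).unwrap_or(DEFAULT_FILE);
    Some(MetadataId {
        repo: repo.to_string(),
        root: root.to_string(),
        branch: branch.to_string(),
        filename: filename.to_string(),
    })
}

/// Last resolution step: the hash of the file's initial revision (computed
/// by the caller from the git history) must equal the root embedded in the ID.
fn root_matches(id: &MetadataId, initial_revision_hash: &str) -> bool {
    id.root == initial_revision_hash
}
```

Steps one to three (finding the repository, branch and file) are left to whatever git plumbing is in use; only the last step is specific to this scheme.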

The difference between the user and project cases is only in the meaning of the associated keys: for a user, they are the user’s keys, and for a project they are the maintainers’ keys.
But in both cases the meaning is that the keys are controlled by the “owner” (or owners) of the certified entity.

Another difference is that in the user case the keys should be directly embedded in the user.json file, while for a project it will be more natural to refer to the keys by referring to an external user.json file, but again, in both cases we have a set of keys and signatures which evolve by TUF rules.

So I would propose that in radicle we use this scheme every time we need to “certify” that a set of entities (users) “control” or “own” another entity (in the current case a project).

This ID “schema” (repo:root/branch/filename) could point to any file for which we would like to certify the evolution.
And the code implementing the schema should be the same in any case.
Even right now we could implement the “user” and “project” cases sharing a lot of code. With the right traits, probably all of it could be shared except for the concrete file contents (which are obviously different).
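Roughly what I have in mind, as a sketch only: the trait name `Certified`, the two revision types and the strict-majority quorum are all placeholders, and signatures are assumed to be verified before the sets are built.

```rust
use std::collections::BTreeSet;

/// Anything whose history evolves by the TUF-style rule.
trait Certified {
    /// Keys whose signatures over this revision have been verified.
    fn signed_by(&self) -> &BTreeSet<String>;
}

/// User case: the file contents are profile data plus embedded keys.
struct UserRevision {
    nickname: String,
    signed_by: BTreeSet<String>,
}

/// Project case: the file contents refer to maintainers by name.
struct ProjectRevision {
    maintainers: Vec<String>,
    signed_by: BTreeSet<String>,
}

impl Certified for UserRevision {
    fn signed_by(&self) -> &BTreeSet<String> {
        &self.signed_by
    }
}

impl Certified for ProjectRevision {
    fn signed_by(&self) -> &BTreeSet<String> {
        &self.signed_by
    }
}

/// Shared verification: every revision must be signed by a strict majority
/// (an assumed quorum) of the keys that signed the revision before it.
fn verify_history<T: Certified>(history: &[T]) -> bool {
    history.windows(2).all(|pair| {
        let prev = pair[0].signed_by();
        let next = pair[1].signed_by();
        prev.intersection(next).count() > prev.len() / 2
    })
}
```

`verify_history` never looks at `nickname` or `maintainers`, which is the point: only the file contents differ between the two cases.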

Thoughts?

This is shaping up very nicely!

Yeah, what they’re trying to do is to bind DIDs to a specific, canonical repository. This means that the method is vulnerable to attacks which simply rewrite the history.

Hm, I’m not so much concerned about this – the security rests on the assumption that these keys are kept on different hardware devices, and never leave them. I think this is pretty good, and quite “teachable” – imagine a little mobile app which does little more than signing a blob it receives with a secret key stored in the secure enclave of the device, which you “pair” with your upstream installation.

More of a concern is whether it is possible to compromise the root of trust of the TUF update – i.e. create a root as per above which is equal to an existing one, but signed by a different key (or set thereof). PoW-meister @cloudhead pointed out, only half-jokingly, that if you alter the other user attributes stored in the file just a little bit, it might be possible to find a hash collision which would render the attacker’s key valid. In other words, as long as it is cheap to generate a root, we’re vulnerable to sybil attacks.

I believe it has been shown in the literature that these attacks cannot be prevented, only impeded. I’m also reminded of the S/Kademlia paper, which proposes exactly a crypto puzzle to make id generation expensive. Perhaps something to explore – or else, require to purchase a name on the registry.
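For illustration, a generic hashcash-style puzzle over the root could look like the sketch below. This is not the construction from the S/Kademlia paper, and `DefaultHasher` is used only so the example has no dependencies: it is not a cryptographic hash, and a real implementation would have to swap in one. All function names are made up.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in hash over (root, nonce). NOT cryptographic; placeholder only.
fn puzzle_hash(root: &[u8], nonce: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(root);
    hasher.write_u64(nonce);
    hasher.finish()
}

/// Cheap check: the hash must start with at least `difficulty` zero bits.
fn verify(root: &[u8], nonce: u64, difficulty: u32) -> bool {
    puzzle_hash(root, nonce).leading_zeros() >= difficulty
}

/// Expensive search: try nonces until one satisfies the puzzle.
/// Expected work doubles with every extra bit of difficulty.
fn solve(root: &[u8], difficulty: u32) -> u64 {
    (0u64..)
        .find(|&nonce| verify(root, nonce, difficulty))
        .expect("nonce space exhausted")
}
```

Verification stays cheap for everyone else, while minting many roots gets expensive for an attacker; as said above, this only impedes the attack.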

Hm, if we unify resolution of any type of metadata (i.e. prefix with a repo/project identifier), it should be possible to construct absolute references, cross-project. Or maybe I’m not understanding correctly what you’re saying.