Git Tags

kim · May 13, 2020, 5:16pm

Cross-post of Handle git tags · Issue #145 · radicle-dev/radicle-link · GitHub, where it should re-surface once the time has come

We have so far (deliberately) avoided talking about how we should handle git tags.

Git has two types of tags: “annotated” and “lightweight”. Annotated tags are a type of object in the odb with the same shape as commit objects. Lightweight tags are merely pointers from refs/tags/<tag> to objects. (Aside: note that both types may point to any object type, not just commits)

Lightweight tags are meant to be used locally (ie. not published), while annotated tags may denote releases or similar, and are supposed to be published alongside the code. So the former can be used mutably, but the latter shall not (although it’s possible, like anything in git).

The question now is, how do we handle them in a fully distributed setting?

Not allowing lightweight tags to be replicated seems to be the philosophically right thing to do, and is doable, albeit a bit annoying as we will have to inspect what a tag ref points to in order to be able to tell whether it’s a lightweight or annotated tag.
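The inspection mentioned here could look roughly like the following. This is only a sketch, assuming a `git` binary on the PATH; the helper names `classify_tag_target` and `tag_kind` are made up for illustration and are not part of radicle-link. It relies on `git cat-file -t <ref>`, which prints the type of the object a ref points at without peeling it: `tag` means an annotated tag, anything else means the ref points straight at the object.

```python
import subprocess


def classify_tag_target(object_type: str) -> str:
    """Map the type of the object a tag ref points at to a tag kind."""
    # An annotated tag is a ref pointing at a `tag` object in the odb;
    # a lightweight tag points directly at a commit, tree or blob.
    if object_type == "tag":
        return "annotated"
    if object_type in ("commit", "tree", "blob"):
        return "lightweight"
    raise ValueError(f"unexpected object type: {object_type!r}")


def tag_kind(repo_path: str, tag_name: str) -> str:
    """Ask git what refs/tags/<tag_name> points at (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "cat-file", "-t", f"refs/tags/{tag_name}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return classify_tag_target(result.stdout.strip())
```

A replication filter could then skip every ref under refs/tags for which `tag_kind` returns `"lightweight"`.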

As the refs/tags category is global (ie. not namespaced by remote), annotated tags are trickier. There are four possibilities afaics:

1. Rewrite refs/tags/<tag> to refs/tags/<peer>/<tag>

   This will leave git tooling intact, but violate workflow assumptions. External tools can no longer rely on tag names being repo-global, and maintainers would need to “merge” tags into their own refs/tags hierarchy.

2. Keep them global, and use project ACL to decide who can tag

   This means that if two maintainers create conflicting tags (ie. having the same name), one of them will be rejected (non-deterministically, first-seen-wins). A major complication is that we’d need to consider the maintainers-set as valid at tag creation time.

3. Invent a tag-agreement scheme

   That is, a special type of “feed” (you know, like issues) which has some semantics attached to it, and would carry references to actual git tag objects (but not their refs) in the commit parents (hell yeah, that’s a hack). If consensus is reached, the tag refs would be created in the working copy.

4. Don’t allow tags at all

   Since tags require coordination, and radicle-link doesn’t provide coordination, you need to use some other system to reach consensus on your tags. This business is all in your working copy.
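To make the first-seen-wins behaviour of the second possibility (“Keep them global”) concrete, here is a small sketch. Everything in it is hypothetical: `accept_tag`, the plain dict standing in for refs/tags, and the maintainers set passed in as a parameter. In particular it does not solve the complication of knowing which maintainers-set was valid at tag creation time.

```python
from typing import Dict, Set


def accept_tag(
    tags: Dict[str, str],
    maintainers: Set[str],
    name: str,
    target: str,
    signer: str,
) -> bool:
    """Record a global tag unless it conflicts with one already seen.

    Returns True if the tag is now (or already was) recorded with this
    target, False if it was rejected.
    """
    # ACL check: only maintainers may create global tags.
    if signer not in maintainers:
        return False
    existing = tags.get(name)
    if existing is None:
        tags[name] = target  # first seen wins
        return True
    # A later tag with the same name is only fine if it is the same tag.
    return existing == target


# Two peers receive the same conflicting tags in a different order.
maintainers = {"alice", "bob"}
peer_a: Dict[str, str] = {}
peer_b: Dict[str, str] = {}
accept_tag(peer_a, maintainers, "v1.0", "aaa", "alice")
accept_tag(peer_a, maintainers, "v1.0", "bbb", "bob")
accept_tag(peer_b, maintainers, "v1.0", "bbb", "bob")
accept_tag(peer_b, maintainers, "v1.0", "aaa", "alice")
```

After these calls `peer_a` and `peer_b` disagree about what v1.0 points to, which is the non-determinism described above.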

cloudhead · May 25, 2020, 10:22am

Interesting question… Without having thought about it too much, I would explore 1. or 4.

With the assumption that there is a canonical upstream, eg. in the case of a single maintainer, doesn’t 1. work quite well? And even without that assumption, you can still agree out-of-band on tags.

I think in terms of tooling then, we could smooth the experience by only showing the annotated tags from maintainers, e.g. by stripping the <peer> prefix.
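A rough sketch of what that tooling could do, assuming tags are stored under refs/tags/<peer>/<tag> as in option 1 and that peer ids contain no slash. The function names `namespaced_ref` and `visible_tags` are invented for this example, not an existing API.

```python
from typing import Dict, Iterable, Set


def namespaced_ref(peer: str, tag: str) -> str:
    """Option 1: refs/tags/<tag> becomes refs/tags/<peer>/<tag>."""
    return f"refs/tags/{peer}/{tag}"


def visible_tags(refs: Iterable[str], maintainers: Set[str]) -> Dict[str, str]:
    """Map display name (peer prefix stripped) to the full ref.

    Only tags published by maintainers are shown.
    """
    prefix = "refs/tags/"
    shown: Dict[str, str] = {}
    for ref in refs:
        if not ref.startswith(prefix):
            continue
        # Split off the first path component as the peer; the rest is the
        # tag name, which may itself contain slashes.
        peer, sep, tag = ref[len(prefix):].partition("/")
        if not sep or peer not in maintainers:
            continue
        # With several maintainers the same display name can occur more
        # than once; this sketch keeps the first and resolves nothing.
        shown.setdefault(tag, ref)
    return shown
```

With a single maintainer the stripped names are unambiguous. With several, the tool would still need some rule for two maintainers publishing the same tag name, which this sketch leaves open.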

kim · May 26, 2020, 12:47pm

I guess it’s a question of what use-cases to serve (first).

By far the most common one is to tag releases – but this only works in a centralised model. It does not really make sense to think of a release tag as “version X.Y.Z according to cloudhead”. Think of CI/CD workflows: a global tag is expected to represent the canonical, permanently immutable state of a project at a given point in time, regardless of what that project’s governance model is. To me, this strongly suggests an on-chain model, as the properties of a distributed ledger are exactly what we want in order to replace a trusted upstream (the tag object itself can be created on-the-fly by the client).

More niche use-cases arise when people are actually embracing a distributed workflow (as hinted at in git-tag(1), section “On Automatic following”). Here, subsystem maintainers may want to exchange tags, which “the” project itself is not interested in. I can imagine a bunch of different ways to enable this, including naming conventions (option 1.), “lightweight consensus” (option 3.), “forking” (ie. creating a different project identity), or configuration (“follow tags from these users only, and let me resolve conflicts myself”, or: “pull tags from github.com”).

Perhaps we should just ask some people used to a non-GH workflow how they think about tags?

monsieuricon · May 26, 2020, 1:17pm

In the Linux world, tags are pretty important, as they are the canonical
source of “what is a released version.” More importantly, they are also
points of attestation where Linux is concerned, because every kernel
release tag is signed by Linus Torvalds or Greg Kroah-Hartman. The only
canonical way of getting an attested version of the mainline kernel is
to clone the repository and verify the tag. This applies to all clones
of linux.git, not just what is on git.kernel.org or on github.com – you
can clone it from anywhere and use the tag signature to verify that it’s
genuine.

With Radicle, this distinction fades, since every change to the
repository is already fully attestable. I think it is acceptable to keep
tags local-only in Radicle and have a designated “release manager” who
is responsible for tagging releases in exported git repositories (GH/GL,
etc).

kim · May 30, 2020, 6:42pm

That’s actually true, some uses of tagging might become less significant.

@fred (I believe) brought up this game-of-stakes scenario, where multiple parties need to agree on what constitutes a release (potentially pseudonymously). This could work by producing a Schnorr signature over a proposed, unsigned tag object, and distributing the Schnorr-signed tag with all copies of the repository. Verification would require some tooling, but nothing crazy. Or is there anything more complicated to it?
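For reference, this is roughly what the bytes of a “proposed, unsigned tag object” look like, i.e. what the parties would be signing. The sketch builds a tag object body in git's standard format and computes its object id the way git does for SHA-1 repositories. The signing step itself is deliberately left out: the Schnorr scheme, how keys are aggregated, and where the signature is stored are all open here. The function names and the fixed `+0000` timezone are assumptions made for the example.

```python
import hashlib


def unsigned_tag_object(
    target: str,
    name: str,
    tagger: str,
    timestamp: int,
    message: str,
    target_type: str = "commit",
) -> bytes:
    """Serialise an annotated tag without any signature attached."""
    body = (
        f"object {target}\n"
        f"type {target_type}\n"
        f"tag {name}\n"
        f"tagger {tagger} {timestamp} +0000\n"
        f"\n"
        f"{message}\n"
    )
    return body.encode("utf-8")


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 object id: hash of '<kind> <length>\\0' followed by the body."""
    header = f"{kind} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


proposal = unsigned_tag_object(
    target="a" * 40,  # placeholder commit id
    name="v1.0",
    tagger="Release <release@example.org>",  # placeholder identity
    timestamp=0,
    message="v1.0",
)
proposal_id = git_object_id("tag", proposal)
```

Every party that builds the proposal from the same inputs gets the same bytes and the same id, so they can all sign exactly the same thing.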

cloudhead · June 2, 2020, 10:21am

Good points. Maybe a good way to start then is per-user tags in radicle-link, which are just like branches, and then anchored (on-chain) tags if people want more global tags?

kim · June 6, 2020, 2:02pm

Ya I’m not sure. It’s teaching people a new and obscure way to work with tags (and providing the tooling for it), only to teach them a newer and more obscure way later…

Due to the subtly non-standard layout of our remote tracking branches (we keep heads/ so we can fit in rad/), we can actually replicate tags as well. I’d suggest leaving it to the user whose tags to follow (if any). The only difference from standard git usage would be that it can’t be selected ad-hoc via git fetch --tags, but needs to be written to the config as a fetch refspec.
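As an illustration of what “written to the config” could mean, the sketch below builds a refspec and the matching `git config` invocation for following one peer's tags. The destination layout refs/remotes/<peer>/tags/* is an assumption, chosen to sit next to a heads/ directory; the real radicle-link layout may differ, and the remote is assumed to be named after the peer. `git config --add remote.<name>.fetch <refspec>` is the stock git command for appending a fetch refspec.

```python
from typing import List


def tag_fetch_refspec(peer: str) -> str:
    """Refspec that tracks a peer's tags next to their heads/ (assumed layout)."""
    # No leading '+', so an existing tracking ref is not force-updated.
    return f"refs/tags/*:refs/remotes/{peer}/tags/*"


def follow_tags_command(peer: str) -> List[str]:
    """Command line that adds the refspec to the remote named after the peer."""
    return [
        "git",
        "config",
        "--add",
        f"remote.{peer}.fetch",
        tag_fetch_refspec(peer),
    ]
```

Following another user's tags is then one more `follow_tags_command(...)` per peer, and not following anyone's tags is simply the default.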