Software Releases on Radicle

Great stuff!

In that case, I think the interesting question is indeed the one you phrased earlier:

Any thoughts here? I feel I am lacking enough context to make any kind of meaningful suggestion yet.

I think maybe this is why I’m finding this conversation confusing - I think my understanding of what a release is, is very different to yours. To me a release is primarily a commitment from the maintainer to support something. The reason you would use a release vs just YOLOing off of master is that the maintainer will try not to break things for you as they change things and there is some idea that the whole thing works as expected. There is also generally an expectation that maintainers will notify in some manner if a release is found to be insecure, etc.

There are also interesting questions of supply chain security here, things like cargo crev.

Maybe I’m unusual here, but it seems to me that a lot of people are going to think about these things when you say “release”, and thinking about how these things fit into the Link model is what I’m interested in.

I actually don’t think our views are that different here. 🙂

The reason you would use a release vs just YOLOing off of master is that the maintainer will try not to break things for you as they change things and there is some idea that the whole thing works as expected. There is also generally an expectation that maintainers will notify in some manner if a release is found to be insecure, etc.

These are all things that I had in mind when I wrote “invitation to users to use the software”. (When writing my previous reply I actually started making a list of things like these but then decided not to include it in my reply and use the “double-distilled” version of what a release is, so we don’t get caught up in the weeds of individual list items - trying to reach a higher-level agreement first.)

The only thing I am only a little reluctant to include into the release definition however, is any “commitment” by project maintainers, considering that it is open source software we are talking about here and most licenses explicitly prescribe “AS IS” use/delivery of the software and explicitly exclude commitments by maintainers. In a commercial setting that commitment is absolutely part of the game and so is support and back-porting fixes to older releases and security, etc. etc.

Does that help?

From reviewing the conversation, it seems that the concept of a release needs to be broken down into its core components for this discussion to be more productive. Each component can be discussed and thought about separately, and then perhaps composed back into the more “holistic” view of a “release”.

One component is the idea of maintainers tagging a specific snapshot of a project and saying this is a “release” point. I believe this is mostly solved by using tags, but I think how these things are verified can vary from project to project. For example, one project might want all delegates to create a tag of the same name, and the release is verified by checking that a quorum of signed tags point to the same commit. Or another project might just use a mechanical maintainer’s tag as the canonical release – after some CI/CD.
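To make the quorum variant concrete, here is a minimal sketch in Rust. Everything in it is an assumption for illustration: the `SignedTag` type, the `quorum_commit` function, and the plain-string delegate and commit ids are hypothetical, and the tag signatures are assumed to have been verified beforehand.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A release tag as published by one delegate (hypothetical type).
/// Signature verification is assumed to have happened before this point.
struct SignedTag {
    delegate: String,
    name: String,
    commit: String,
}

/// Returns the commit that at least `threshold` distinct delegates tagged
/// as `name`, or `None` if no commit reaches the threshold.
///
/// Assumption: `threshold` is more than half of the delegates and each
/// delegate publishes one tag per name, so at most one commit can qualify.
fn quorum_commit(tags: &[SignedTag], name: &str, threshold: usize) -> Option<String> {
    // commit id -> set of delegates whose tag points at it
    let mut votes: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    for tag in tags.iter().filter(|t| t.name == name) {
        votes
            .entry(tag.commit.as_str())
            .or_default()
            .insert(tag.delegate.as_str());
    }
    votes
        .into_iter()
        .filter(|(_, delegates)| delegates.len() >= threshold)
        .max_by_key(|(_, delegates)| delegates.len())
        .map(|(commit, _)| commit.to_string())
}
```

With three delegates and a threshold of two, two tags on the same commit are enough, while a lone dissenting tag on a different commit is simply ignored.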

Another component is the release of a published artifact – this could be a binary or a library package. This distribution depends on where and how these artifacts get published. For example, Nix users might want to teach derivations how to fetch from Radicle-hosted mirrors.

The point being that I think this discussion is too vague at the moment and as @alexgood said:

So perhaps we can break these into separate discussions of what you want to solve?

The only thing I am only a little reluctant to include into the release definition however, is any “commitment” by project maintainers, considering that it is open source software we are talking about here and most licenses explicitly prescribe “AS IS” use/delivery of the software and explicitly exclude commitments by maintainers.

I agree with this 100%, in theory. Like, if we are to define it, it should be defined as an “AS IS” agreement.

But if/when a developer starts utilizing Radicle components like Communities and Drips to take in communal funds for their work, I think – in practical terms – there will be some implicit commitment to deliver value to that community of users.

If the maintainer of a code base stops maintaining or updating, their community may start withdrawing Drips, funds will dry up, and there will be a financial incentive to continue maintaining/improving the code base.


So perhaps we can break these into separate discussions of what you want to solve?

I really like this idea.

What would you say they are?

  1. Tagging
  2. Binary / hosting of binaries
  3. What else?

I’m not speaking of legal commitments. Releases are a social commitment. One of the things I think about when evaluating a dependency is what the maintainer’s attitude to their release schedule is: whether they care about backwards compatibility, how they respond to issues raised on older releases, etc. I typically check whether I already know who a maintainer is, go and have a look at the issue history, and so on.

Please do include this stuff! It’s probably relevant here that I have never used GitHub Releases, so I don’t have first-hand knowledge of what people care about when they are using it. I think Fintan is correct that this discussion is very vague at the moment and it would be great if we could crystallise it around actual use cases where we know what people are trying to achieve.


For the record: Jit-Hub releases are actually created off of git tags. That is, a git tag is a prerequisite to a Jit-Hub release, which in turn is nothing more than an unverified database entry providing some additional data.

The keyword here is “unverified”: the reason release objects are not stored in git itself is, presumably, that they can be modified and arbitrary files attached to them (well, maybe they are, but not in the same repo). By trusting a “Releases” entry, you trust Jit-Hub.

The only feature of “Releases” which cannot be modeled with git tags alone is the attachment of file artifacts. Oddly enough, it is possible to store large (binary) objects in git using Git LFS, and instruct Jit-Hub to include those in the generated tarballs. Naturally, Git LFS is not backed by decentralized storage, but there is no technical reason why it couldn’t be.


I follow the philosophy of Releases being a communication layer between maintainer and consumer. Tags are technically releases, but may not necessarily be a part of an official release.
Packages would contain artifacts, but would also be associated with a tagged release.

Looking at the GitHub screenshot, we can see that tags live under releases.

Maybe it would help if we had a framed example. My initial tweet was based on figuring out how to publish packages from the Design System I’m building.

A third thing to add should be Documentation

I originally wrote on discord:

when I think of a ā€œreleaseā€, I think of a combination of 3 things:

  • the code (the - signed? - commit sha)

On this, I think we still need to resolve @fintohaps’ questions around maintainer consensus on what actually constitutes a valid “release tag”. A straightforward way to do this would be that each maintainer creates one signed annotated tag pointing to the same commit as all the other maintainers. And then we are just left with tackling the “what’s the mechanism for saying a tag – or multiple tags – are the canonical release(s)?” question that @fintohaps posed earlier.

  • the binaries (produced from that code with some well-defined, auditable process)

As it is possible to include hyperlinks to the binaries in the annotated git tag message, I think we are good here (at least for the Minimum Viable Product (MVP) version).

I would suggest descoping both “how do we produce these binaries” and “where do we actually host these binaries” from this particular discussion. These can both be discussed elsewhere.

  • the list of issues included in the release

Again, since writing the original message, @fintohaps has confirmed it is possible to use annotated git tags to include such a list of issues and other release notes / docs around migration etc. in the git tag message. I think that also covers my need.

With that said, would others here agree that the discussion here can be focused around just the mechanisms that project maintainers can use to communicate that a specific git tag (they have (all?) signed?) constitutes the “canonical release”?

Please note I am not asking for us to come up with all the possible options (different maintainers will go about this in different ways), I am just looking to document some possible ways this could work in a p2p / decentralized setting.

Git treats tags as being globally unique, ie. refs/tags is a common namespace regardless of where the tag came from (unlike branches, which are namespaced as refs/remotes/[the origin]/*). This has to do with the traditional way of verifying them (which btw includes the tag’s name, which is part of the object headers).

link permits replicating tag refs in a namespaced manner, that is, they end up under refs/remotes/xyz/tags. That, however, means that clients need to be careful to not confuse git when creating project checkouts: if the tags stay namespaced, git will not consider them as being tags (eg. for git-tag -l, or when decorating git-log output). If they do not stay namespaced, the workflow described above is unlikely to work, because the maintainer tags will conflict.
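A rough sketch of that conflict, assuming the `refs/remotes/<peer>/tags/<name>` layout mentioned above. The function names are hypothetical and this is not link's actual API; it only illustrates why per-maintainer signed tags cannot all be flattened into `refs/tags`.

```rust
use std::collections::BTreeMap;

/// Splits `refs/remotes/<peer>/tags/<name>` into `(peer, name)`.
/// Returns `None` for any other ref, including plain `refs/tags/<name>`.
fn parse_namespaced_tag(refname: &str) -> Option<(&str, &str)> {
    let rest = refname.strip_prefix("refs/remotes/")?;
    let (peer, rest) = rest.split_once('/')?;
    let name = rest.strip_prefix("tags/")?;
    if peer.is_empty() || name.is_empty() {
        return None;
    }
    Some((peer, name))
}

/// Given `(refname, tag object id)` pairs, lists the tag names that cannot
/// be flattened into `refs/tags/<name>` because peers disagree on the object.
/// Two maintainers signing the same commit still produce two different tag
/// objects, so their tags end up in this list.
fn conflicting_tags(refs: &[(&str, &str)]) -> Vec<String> {
    // tag name -> distinct object ids seen for it
    let mut targets: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(refname, oid) in refs {
        if let Some((_, name)) = parse_namespaced_tag(refname) {
            let seen = targets.entry(name).or_default();
            if !seen.contains(&oid) {
                seen.push(oid);
            }
        }
    }
    targets
        .into_iter()
        .filter(|(_, oids)| oids.len() > 1)
        .map(|(name, _)| name.to_string())
        .collect()
}
```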

Also consider that the project may be published to some traditional mirror (eg. Jit-Hub), so at the end of the day there must be a single release tag (under refs/tags).

So that’s a multisig problem, which is fairly awkward to solve when you don’t have a timestamping service. Essentially, a proposal would need to be published, which then is signed by the eligible keys, and ultimately “finalized” into the published release tag (which in turn contains a proof of those signatures).

This kind of thing can be implemented using collaborative objects, from which a tag can be synthesized (although the tag itself will still be signed by only a single key).


Fantastic, thanks so much for this @kim !

One clarification if you wouldn’t mind, please:

Do I understand correctly that this “proposal” (regardless if that is what it is eventually called) would be the collaborative object?

How could different projects then select different policies around e.g. “how many signatures are needed / which signatures are optional / etc.” and when the proposal can be “finalized”? Would all this be captured in the collaborative object schema?

Those are good questions!

The current model of anchoring the cob history is refs/cobs/<type>/<id>, which would suggest that each “release” is its own object. That could make sense, but has one drawback: the ordering of the releases would depend on some convention (e.g. a version number) found inside the object (the id above does not yield an ordering).

Thus, I think it might be preferable to only have a single history of “releases” per project.

I wonder what kind of processes you’re seeking to capture.

Traditionally, maintainership of up to the largest-scale free software projects very much equates to the authority of declaring what a release is. That is, in a DVCS like git anyone can tag a particular tree as a release – and sometimes this is very useful, think of “distributions” maintaining their own patchsets, or “long term stable” releases maintained by different people. Yet the momentum comes from accepting a single person, or a small group of persons, as authoritative.

Unlike traditional git, link expresses this relationship explicitly: by carrying the identity document of the “upstream” project, it is implied that some random tree considers itself within the lineage of that project.

Surely some projects may prefer to mirror corporate structures: some release team is responsible for conducting all the preparatory work, and finally gets to sign a release tag (probably with a shared “release key”). I’m not personally interested in this model, as it’s as broken as the corporate structures themselves: the “release managers’” performance indicator is to ship non-broken releases, but it’s not their responsibility (nor even competence) to make it so the code actually works. That’s the same problem as with dedicated QA teams; a waste of time and resources.

That being said: link’s identities are modelled after TUF, with the express intention of enabling delegation (by the root keys to some other set of keys which have narrower privileges). That has some security implications, so the process would need to be modelled carefully.

As a middle ground between these two, I could imagine a workflow where releases require signoff by both the maintainer keys as well as keys held by other entities. For example, CI systems may indicate that the build passed from the proposed tip, or packagers may confirm that their pipelines work.

For this kind of thing, I would recommend that the policy is expressed in the release object itself. Ie. when I propose a new release, I also specify which additional keys or identity chains are expected to sign off. This has the advantage of not requiring additional revocation mechanisms – the statement is only valid for this specific release, which can easily be amended by the next one.

+1

I agree on both parts: I also think it is broken and that some teams will want to follow it regardless. Hopefully - because radicle is not so much solving corporate problems - this will be a limited use case.

I think this here is generic enough to cover a range of different workflows / policies, without us necessarily caring about the meaning of what each signature means in the context of the release. That can probably be captured elsewhere.

I think this is fine. I do see a tradeoff in that it is not so easy to say “this is what our releases look like” (considering that two consecutive objects may be radically different). However, I also see ways around that: for example, teams could document “this is what our releases look like” in some README, etc.

On this point, I do understand that IDs do not help with an ordering, but I don’t yet understand how the history of “releases” will be made possible… Would this fall on the “reader” to read all objects, find some version number within and make sense of the ordering that way?

I’m not sure what you mean by “look like” – the schema describes what a collaborative object looks like. The schema itself can change, but it wouldn’t for what I proposed: there would simply be a list-shaped element which describes what keys are expected to sign, and the signatures.

If “releases” is only one collaborative object, then the CRDT properties yield an ordering.

For example, when a new maintainer is added/removed, the list of signatures would change, right? Or if some team decides at some point that a new “release manager” needs to sign releases (instead/as well). Those aren’t necessarily schema changes (if I understand correctly), but the data changes - because the policy around “what a release looks like” changed.

Ah, ok, I understand now. I thought each release would be a new collaborative object (conforming to a “release” schema), but only one instance of the collaborative object that incorporates all information around the releases does make sense.

With that, I think I’m good with all clarifications for now (and thank you for those!) and I guess we’d need to start making this proposal more concrete in order to invite broader feedback, etc.? What do you think would be good next steps for this discussion?

Well, I think you might want to start designing the schema, which will help answer the remaining questions.

I like to do this in some kind of typed pseudo-code, so as to more concisely capture the desired semantics. Here’s the simplest thing I could come up with:

struct Releases {
    /// The project we're talking about
    urn: Urn,
    /// Releases are an ordered list
    releases: Vec<Release>,
}

/// A point in the git history which shall
/// be tagged as a release after it was 
/// approved by some number of collaborators.
struct Release {
    /// The commit hash to be released
    commit: Oid,
    /// Name of this release, eg. a version number
    name: String,
    /// Some arbitrary blurb, eg. release notes
    description: String,
    /// The signing obligations to render this 
    /// release valid
    valid_when: Set<Either<PublicKey, Urn>>,
    /// The actual signoffs, initially empty
    signed_off_by: Set<Sob>,
}

struct Sob {
    /// URN of the person signing, optional
    urn: Option<Urn>,
    /// The actual key used
    key: PublicKey,
    /// Signature over the `commit` hash of 
    /// the `Release`
    sig: Signature
}

I’m not sure if it is clear, so I’m just going to reiterate: a cob / CRDT is just a datastructure. Its properties give us an ordering (of edits), and we can guarantee that it conforms to the schema. The rest is up to an application written to interpret this data.

From the above, I think it’s easy to see that we can synthesize a git tag which includes the description as well as the set of sobs. Since git does not have a native way to express “multisigs”, this tag would be signed by whoever creates it, and some custom tooling would be necessary to allow verifying the sobs knowing only the git history. I would suggest just encoding the sobs as git trailers, and verifying the signatures element-wise.
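As a sketch of what encoding the sobs as trailers could look like: the `Release-Signoff` token, the `EncodedSob` type and both function names are made up for illustration, and keys and signatures are assumed to be already text-encoded (eg. hex or base64). Actual signature verification is left out.

```rust
/// One signoff with key and signature already encoded as text (assumption).
struct EncodedSob {
    key: String,
    sig: String,
}

/// Hypothetical trailer token; not an established git or Radicle convention.
const TRAILER: &str = "Release-Signoff";

/// Builds an annotated-tag message: the release description followed by a
/// trailer block with one `Release-Signoff: <key> <sig>` line per signoff.
fn tag_message(description: &str, sobs: &[EncodedSob]) -> String {
    let mut msg = String::from(description.trim_end());
    msg.push_str("\n\n");
    for sob in sobs {
        msg.push_str(&format!("{}: {} {}\n", TRAILER, sob.key, sob.sig));
    }
    msg
}

/// Extracts `(key, sig)` pairs from the last paragraph of a tag message,
/// which is where git expects trailers to live.
fn parse_signoffs(message: &str) -> Vec<(String, String)> {
    let prefix = format!("{}: ", TRAILER);
    let last_paragraph = message.trim_end().rsplit("\n\n").next().unwrap_or("");
    last_paragraph
        .lines()
        .filter_map(|line| line.strip_prefix(prefix.as_str()))
        .filter_map(|rest| rest.split_once(' '))
        .map(|(key, sig)| (key.to_string(), sig.to_string()))
        .collect()
}
```

A verifier would then check each extracted signature against the commit hash the tag points to, one element at a time.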

A few things of the above could be refined, or require further consideration, eg.

  • Whenever a URN is mentioned, should it also refer to its revision at the time of creation? This can serve as an optimization, but also protect against rewrite attacks.
  • The valid_when set can obviously be modified after creation. This is either a case for @alexgood’s ACL language, or the application needs to commit to a semantics (eg. first-writer-wins).
  • How to express that the release has been “finalized”? Does the tag refer to the cob, or vice versa, or both?
  • When there is more than one signature, there are various ways in which such a release object could be in some kind of partially-valid state. This is an opportunity to come up with an improved verification UI (which shouldn’t be too hard if git / GPG is the benchmark).

Thanks for putting the draft together, @kim!

I wonder if this should be an ordered list. 🤔 (e.g. what if 1.2.0 has been released, then 1.3.0 and then we need to ship 1.2.1 with a hotfix). Could this be an unordered list and a date be added to the Release struct perhaps? (ordering could also be an A-Z / Z-A ordering of the release name as well)
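To illustrate the hotfix case: a small sketch, assuming release names are plain `major.minor.patch` strings with an optional leading `v`. `parse_version` and `sort_by_version` are hypothetical helpers. Sorting on the parsed numbers rather than A-Z also avoids 1.10.0 landing before 1.9.0.

```rust
/// Parses `major.minor.patch` (optionally prefixed with `v`) into a tuple
/// that compares in version order. Anything else yields `None`.
fn parse_version(name: &str) -> Option<(u64, u64, u64)> {
    let mut parts = name
        .trim_start_matches('v')
        .split('.')
        .map(|part| part.parse::<u64>().ok());
    let version = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(version)
}

/// Sorts release names by version, independent of publication order.
/// Names that do not parse sort before all parsed versions.
fn sort_by_version(names: &mut [String]) {
    names.sort_by_key(|name| parse_version(name));
}
```

Published in the order 1.2.0, 1.3.0, 1.2.1, the names come out as 1.2.0, 1.2.1, 1.3.0.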

I am not sure I fully understand the rewrite attack here? Perhaps an example would help?

I think first-writer-wins makes sense here and the application should enforce it (i.e. ignore changes made to valid_when).
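Roughly what I have in mind, as a sketch only: the application replays the cob's edits in the order the CRDT yields and keeps the first valid_when it sees. The `Edit` and `ReleaseState` types are hypothetical, and keys are plain strings here for brevity.

```rust
use std::collections::BTreeSet;

/// Edits to a single release, in the order the CRDT yields them
/// (hypothetical representation).
enum Edit {
    SetValidWhen(BTreeSet<String>),
    SignOff(String),
}

/// The application's view of a release after replaying edits.
#[derive(Default)]
struct ReleaseState {
    valid_when: Option<BTreeSet<String>>,
    signed_off_by: BTreeSet<String>,
}

/// Applies one edit with first-writer-wins semantics for `valid_when`:
/// once the signing obligations are set, later attempts are ignored.
fn apply(state: &mut ReleaseState, edit: Edit) {
    match edit {
        Edit::SetValidWhen(keys) => {
            if state.valid_when.is_none() {
                state.valid_when = Some(keys);
            }
        }
        Edit::SignOff(key) => {
            state.signed_off_by.insert(key);
        }
    }
}
```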

It seems to me that the multi-sig part is how a release is “finalized” (the humans say so). Unless I am misunderstanding the question, it seems to me the cob referring to the “tag” makes sense. (well, s/tag/commitid/ because git tags can be moved to a different commit hash and I’m not sure we want that, right?)

Sounds like you’re referring to the application that displays / lists the releases, right? If so, I would also expect some kind of status for each Release object it displays that explains whether it satisfies the valid_when constraints.
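Something like the following is what I would picture for that status, as a sketch under assumptions: the `Status` enum and `status` function are hypothetical, keys are plain strings, and every entry in signed_off_by is assumed to carry an already verified signature.

```rust
use std::collections::BTreeSet;

/// Display status of a release relative to its `valid_when` obligations.
#[derive(Debug, PartialEq)]
enum Status {
    /// Every required key has signed off.
    Valid,
    /// Some, but not all, required keys have signed off.
    Partial { missing: Vec<String> },
    /// None of the required keys has signed off yet.
    Unsigned,
}

/// Compares the required keys with the keys that have signed off.
/// Signoffs by keys outside `valid_when` are ignored, and an empty
/// `valid_when` is treated as trivially valid.
fn status(valid_when: &BTreeSet<String>, signed_off_by: &BTreeSet<String>) -> Status {
    let missing: Vec<String> = valid_when.difference(signed_off_by).cloned().collect();
    if missing.is_empty() {
        Status::Valid
    } else if missing.len() == valid_when.len() {
        Status::Unsigned
    } else {
        Status::Partial { missing }
    }
}
```

Listing the missing keys in the partial case is what would let a UI say who still has to sign.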

Maybe Map<String, Release>.

If we refer to a signer by URN, there is no causal relationship to their “sigchain”. So, we could be presented with a signing key which is only in a forked history of that identity. Or, the key was revoked at some point, but we don’t know if that was before or after the release was signed with it. We can greatly simplify these validation obligations by referring to (Urn, Revision) instead.

For the project itself there is a connection made every time a release is finalized, which is why it would be useful to include the hash of the release object in the tag.

(Obviously, we cannot have the tag hash in the release object then without modifying it. So it would not be possible to know if a release was finalized by looking only at the release object. I guess that’s fine)