Great stuff!
In that case, I think the interesting question is indeed the one you phrased earlier:
Any thoughts here? I feel I am lacking enough context to make any kind of meaningful suggestion yet.
I think maybe this is why I'm finding this conversation confusing - I think my understanding of what a release is, is very different to yours. To me a release is primarily a commitment from the maintainer to supporting something. The reason you would use a release vs just YOLOing off of `master` is that the maintainer will try not to break things for you as they change things and there is some idea that the whole thing works as expected. There is also generally an expectation that maintainers will notify in some manner if a release is found to be insecure, etc.
There are also interesting questions of supply chain security here, e.g. things like cargo crev.
Maybe I'm unusual here, but it seems to me that a lot of people are going to think about these things when you say "release", and thinking about how these things fit into the Link model is what I'm interested in.
I actually don't think our views are that different here.
The reason you would use a release vs just YOLOing off of `master` is that the maintainer will try not to break things for you as they change things and there is some idea that the whole thing works as expected. There is also generally an expectation that maintainers will notify in some manner if a release is found to be insecure, etc.
These are all things that I had in mind when I wrote "invitation to users to use the software". (When writing my previous reply I actually started making a list of things like these, but then decided not to include it and to use the "double-distilled" version of what a release is, so we don't get caught up in the weeds of individual list items - trying to reach a higher-level agreement first.)
The only thing I am a little reluctant to include in the release definition, however, is any "commitment" by project maintainers, considering that it is open source software we are talking about here and most licenses explicitly prescribe "AS IS" use/delivery of the software and explicitly exclude commitments by maintainers. In a commercial setting that commitment is absolutely part of the game, and so are support, back-porting fixes to older releases, security, etc.
Does that help?
From reviewing the conversation, it seems that the concept of a release needs to be broken down into its core components for this to be more productive. Each component can be discussed and thought about separately, and then perhaps composed back again into the more "holistic" view of a "release".
One component is the idea of maintainers tagging a specific snapshot of a project and saying this is a "release" point. I believe this is mostly solved by using tags, but I think how these things are verified can vary from project to project. For example, one project might want all delegates to create a tag of the same name, and the release is verified by checking that a quorum of signed tags point to the same commit. Another project might just use a mechanical maintainer's tag as the canonical release, after some CI/CD.
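To make the quorum idea concrete, here is a minimal sketch in Rust. The `DelegateTag` type and the `quorum_commit` function are hypothetical names invented for illustration, not part of any Radicle API, and the sketch assumes each tag's signature has already been verified elsewhere. It only answers one question: do at least `quorum` distinct delegates point the same tag name at the same commit?

```rust
use std::collections::{HashMap, HashSet};

/// One delegate's tag for a given release name (hypothetical type).
/// Assumption: the tag's signature was already verified by the caller.
pub struct DelegateTag {
    pub delegate: String,
    pub commit: String,
}

/// Returns the commit tagged by at least `quorum` distinct delegates.
/// Returns `None` if no commit reaches the quorum, or if more than one
/// commit does (which can only happen when the quorum is not a majority).
pub fn quorum_commit(tags: &[DelegateTag], quorum: usize) -> Option<String> {
    // Group delegates by the commit they tagged; a set ensures that a
    // delegate tagging the same commit twice is only counted once.
    let mut votes: HashMap<&str, HashSet<&str>> = HashMap::new();
    for tag in tags {
        votes
            .entry(tag.commit.as_str())
            .or_default()
            .insert(tag.delegate.as_str());
    }

    let mut winners = votes
        .into_iter()
        .filter(|(_, delegates)| delegates.len() >= quorum)
        .map(|(commit, _)| commit);

    // Accept the result only if exactly one commit reached the quorum.
    match (winners.next(), winners.next()) {
        (Some(commit), None) => Some(commit.to_string()),
        _ => None,
    }
}
```

Treating an ambiguous result as "no release" is a design choice of this sketch; a project could equally require the quorum to be a strict majority so that ambiguity cannot arise.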
Another component is the release of a published artifact - this could be a binary or a library package. This distribution depends on where and how these artifacts get published. For example, Nix users might want to teach derivations how to fetch from Radicle hosted mirrors.
The point being that I think this discussion is too vague at the moment and as @alexgood said:
So perhaps we can break these into separate discussions of what you want to solve?
The only thing I am a little reluctant to include in the release definition, however, is any "commitment" by project maintainers, considering that it is open source software we are talking about here and most licenses explicitly prescribe "AS IS" use/delivery of the software and explicitly exclude commitments by maintainers.
I agree with this 100%, in theory. Like, if we are to define it, it should be defined as an "AS IS" agreement.
But if/when a developer starts utilizing Radicle components like Communities and Drips to take in communal funds for their work, I think - in practical terms - there will be some implicit commitment to deliver value to that community of users.
If the maintainer of a code base stops maintaining or updating, their community may start withdrawing Drips, funds will dry up, and there will be a financial incentive to continue maintaining/improving the code base.
So perhaps we can break these into separate discussions of what you want to solve?
I really like this idea.
What would you say they are?
I'm not speaking of legal commitments. Releases are a social commitment. One of the things I think about when evaluating a dependency is what the maintainers' attitude to their release schedule is: whether they care about backwards compatibility, how they respond to issues raised on older releases, etc. I typically check whether I already know who a maintainer is, go and have a look at the issue history, and so on.
Please do include this stuff! It's probably relevant here that I have never used Github Releases, so I don't have first-hand knowledge of what people care about when they are using it. I think Fintan is correct that this discussion is very vague at the moment, and it would be great if we could crystallise it around actual use cases where we know what people are trying to achieve.
For the record: Jit-Hub releases are actually created off of git tags. That is, a git tag is a prerequisite to a Jit-Hub release, which in turn is nothing more than an unverified database entry providing some additional data.
The keyword here is "unverified": the reason release objects are not stored in git itself is, presumably, that they can be modified and arbitrary files attached to them (well, maybe they are, but not in the same repo). By trusting a "Releases" entry, you trust Jit-Hub.
The only feature of "Releases" which cannot be modeled with git tags alone is the attachment of file artifacts. Oddly enough, it is possible to store large (binary) objects in git using Git LFS, and instruct Jit-Hub to include those in the generated tarballs. Naturally, Git LFS is not backed by decentralized storage, but there is no technical reason why it couldn't be.
I follow the philosophy of Releases being a communication layer between maintainer and consumer. Tags are technically releases, but may not necessarily be a part of an official release.
Packages would contain artifacts, but would also be associated with a tagged release.
Looking at the GitHub screenshot, we can see that tags live under releases.
Maybe it would help if we had a framed example. My initial tweet was based on figuring out how to publish packages from the Design System I'm building.
A third thing to add should be Documentation.
I originally wrote on discord:
when I think of a "release", I think of a combination of 3 things:
- the code (the - signed? - commit sha)
On this, I think we still need to resolve @fintohaps' questions around maintainer consensus on what actually constitutes a valid "release tag". A straightforward way to do this would be that each maintainer creates one signed annotated tag pointing to the same commit as all the other maintainers. And then we are just left with tackling the "what's the mechanism for saying a tag - or multiple tags - are the canonical release(s)?" question that @fintohaps posed earlier.
- the binaries (produced from that code with some well-defined, auditable process)
As it is possible to include hyperlinks to the binaries in the annotated git tag message, I think we are good here (at least for the Minimum Viable Product (MVP) version).
I would suggest descoping both "how do we produce these binaries" and "where do we actually host these binaries" from this particular discussion. These can both be discussed elsewhere.
- the list of issues included in the release
Again, since writing the original message, @fintohaps has confirmed it is possible to use annotated git tags to include such a list of issues and other release notes / docs around migration etc. in the git tag message. I think that also covers my need.
With that said, would others here agree that the discussion can be focused on just the mechanisms that project maintainers can use to communicate that a specific git tag (which they have (all?) signed?) constitutes the "canonical release"?
Please note I am not asking for us to come up with all the possible options (different maintainers will go about this in different ways); I am just looking to document some possible ways this could work in a p2p / decentralized setting.
Git treats tags as being globally unique, ie. `refs/tags` is a common namespace regardless of where the tag came from (unlike branches, which are namespaced as `refs/remotes/[the origin]/*`). This has to do with the traditional way of verifying them (which btw includes the tag's name, which is part of the object headers).
`link` permits replicating tag refs in a namespaced manner, that is, they end up under `refs/remotes/xyz/tags`. That, however, means that clients need to be careful not to confuse git when creating project checkouts: if the tags stay namespaced, git will not consider them as being tags (eg. for `git-tag -l`, or when decorating `git-log` output). If they do not stay namespaced, the workflow described above is unlikely to work, because the maintainer tags will conflict.
Also consider that the project may be published to some traditional mirror (eg. Jit-Hub), so at the end of the day there must be a single release tag (under `refs/tags`).
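The conflict can be illustrated with a small Rust sketch. It assumes the ref layout mentioned above (`refs/remotes/<peer>/tags/<name>`); the function names `parse_namespaced_tag` and `promote_tags` are hypothetical and not taken from `link`. The object ids here are the ids of the tag objects themselves, so two maintainers who each sign a tag of the same name produce two different ids even when both tags point to the same commit.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Splits `refs/remotes/<peer>/tags/<name>` into `(peer, name)`.
/// Returns `None` for refs that do not follow this (assumed) layout.
pub fn parse_namespaced_tag(refname: &str) -> Option<(&str, &str)> {
    let rest = refname.strip_prefix("refs/remotes/")?;
    let (peer, name) = rest.split_once("/tags/")?;
    if peer.is_empty() || peer.contains('/') || name.is_empty() {
        return None;
    }
    Some((peer, name))
}

/// Tries to map every namespaced tag onto the global `refs/tags/<name>`.
/// Input: `(refname, tag object id)` pairs.
/// Output per global ref: `Ok(oid)` if all peers agree on the tag object,
/// or `Err(oids)` listing the distinct, conflicting tag objects.
pub fn promote_tags(refs: &[(&str, &str)]) -> BTreeMap<String, Result<String, Vec<String>>> {
    let mut by_name: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    for &(refname, oid) in refs {
        if let Some((_peer, name)) = parse_namespaced_tag(refname) {
            by_name.entry(name).or_default().insert(oid);
        }
    }

    by_name
        .into_iter()
        .map(|(name, oids)| {
            let target = format!("refs/tags/{}", name);
            let outcome: Result<String, Vec<String>> = if oids.len() == 1 {
                Ok(oids.into_iter().next().unwrap().to_string())
            } else {
                Err(oids.into_iter().map(str::to_string).collect())
            };
            (target, outcome)
        })
        .collect()
}
```

Any `Err` entry is a tag name that cannot simply be copied into `refs/tags`, which is why a single finalized release tag is needed.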
So that's a multisig problem, which is fairly awkward to solve when you don't have a timestamping service. Essentially, a proposal would need to be published, which then is signed by the eligible keys, and ultimately "finalized" into the published release tag (which in turn contains a proof of those signatures).
This kind of thing can be implemented using collaborative objects, from which a tag can be synthesized (although the tag itself will still be signed by only a single key).
Fantastic, thanks so much for this @kim !
One clarification if you wouldn't mind, please:
Do I understand correctly that this "proposal" (regardless of whether that is what it is eventually called) would be the collaborative object?
How could different projects then select different policies around e.g. "how many signatures are needed / which signatures are optional / etc." and when the proposal can be "finalized"? Would all this be captured in the collaborative object schema?
Those are good questions!
The current model of anchoring the cob history is `refs/cobs/<type>/<id>`, which would suggest that each "release" is its own object. That could make sense, but has one drawback: the ordering of the releases would depend on some convention (e.g. a version number) found inside the object (the `id` above does not yield an ordering).
Thus, I think it might be preferable to only have a single history of "releases" per project.
I wonder what kind of processes you're seeking to capture.
Traditionally, maintainership of even the largest-scale free software projects very much equates to the authority to declare what a release is. That is, in a DVCS like git anyone can tag a particular tree as a release - and sometimes this is very useful, think of "distributions" maintaining their own patchsets, or "long term stable" releases maintained by different people. Yet the momentum comes from accepting a single person, or a small group of persons, as authoritative.
Unlike traditional git, `link` expresses this relationship explicitly: by carrying the identity document of the "upstream" project, it is implied that some random tree considers itself within the lineage of that project.
Surely some projects may prefer to mirror corporate structures: some release team is responsible for conducting all the preparatory work, and finally gets to sign a release tag (probably with a shared "release key"). I'm not personally interested in this model, as it's as broken as the corporate structures themselves: the "release managers'" performance indicator is to ship non-broken releases, but it's not their responsibility (nor even competence) to make it so the code actually works. That's the same problem as with dedicated QA teams; a waste of time and resources.
That being said: `link`'s identities are modelled after TUF, with the express intention of enabling delegation (by the root keys to some other set of keys which have narrower privileges). That has some security implications, so the process would need to be modelled carefully.
As a middle ground between these two, I could imagine a workflow where releases require signoff by both the maintainer keys as well as keys held by other entities. For example, CI systems may indicate that the build passed from the proposed tip, or packagers may confirm that their pipelines work.
For this kind of thing, I would recommend that the policy is expressed in the release object itself. Ie. when I propose a new release, I also specify which additional keys or identity chains are expected to sign off. This has the advantage of not requiring additional revocation mechanisms - the statement is only valid for this specific release, which can easily be amended by the next one.
+1
I agree on both parts: I also think it is broken and that some teams will want to follow it regardless. Hopefully - because radicle is not so much solving corporate problems - this will be a limited use case.
I think this is generic enough to cover a range of different workflows / policies, without us necessarily caring about what each signature means in the context of the release. That can probably be captured elsewhere.
I think this is fine. I do see a tradeoff in that it is not so easy to say "this is what our releases look like" (considering that two consecutive objects may be radically different). However, I also see ways around that: for example, teams could document "this is what our releases look like" in some README, etc.
On this point, I do understand that IDs do not help with an ordering, but I don't yet understand how the history of "releases" will be made possible... Would this fall on the "reader" to read all objects, find some version number within and make sense of the ordering that way?
I'm not sure what you mean by "look like" - the schema describes what a collaborative object looks like. The schema itself can change, but it wouldn't for what I proposed: there would simply be a list-shaped element which describes what keys are expected to sign, and the signatures.
If "releases" is only one collaborative object, then the CRDT properties yield an ordering.
For example, when a new maintainer is added/removed, the list of signatures would change, right? Or if some team decides at some point that a new "release manager" needs to sign releases (instead/as well). Those aren't necessarily schema changes (if I understand correctly), but the data changes - because the policy around "what a release looks like" changed.
Ah, ok, I understand now. I thought each release would be a new collaborative object (conforming to a "release" schema), but having only one instance of the collaborative object that incorporates all information around the releases does make sense.
With that, I think I'm good with all clarifications for now (and thank you for those!) and I guess we'd need to start making this proposal more concrete in order to invite broader feedback, etc.? What do you think would be good next steps for this discussion?
Well, I think you might want to start designing the schema, which will help answer the remaining questions.
I like to do this in some kind of typed pseudo-code, so as to more concisely capture the desired semantics. Here's the simplest thing I could come up with:
    struct Releases {
        /// The project we're talking about
        urn: Urn,
        /// Releases are an ordered list
        releases: Vec<Release>,
    }

    /// A point in the git history which shall
    /// be tagged as a release after it was
    /// approved by some number of collaborators.
    struct Release {
        /// The commit hash to be released
        commit: Oid,
        /// Name of this release, eg. a version number
        name: String,
        /// Some arbitrary blurb, eg. release notes
        description: String,
        /// The signing obligations to render this
        /// release valid
        valid_when: Set<Either<PublicKey, Urn>>,
        /// The actual signoffs, initially empty
        signed_off_by: Set<Sob>,
    }

    struct Sob {
        /// URN of the person signing, optional
        urn: Option<Urn>,
        /// The actual key used
        key: PublicKey,
        /// Signature over the `commit` hash of
        /// the `Release`
        sig: Signature,
    }
I'm not sure if it is clear, so I'm just going to reiterate: a cob / CRDT is just a data structure. Its properties give us an ordering (of edits), and we can guarantee that it conforms to the schema. The rest is up to an application written to interpret this data.
From the above, I think it's easy to see that we can synthesize a git tag which includes the `description` as well as the set of sobs. Since git does not have a native way to express "multisigs", this tag would be signed by whoever creates it, and some custom tooling would be necessary to allow verifying the sobs knowing only the git history. I would suggest to just encode the sobs as git trailers, and verify the signatures element-wise.
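As a rough sketch of the trailer idea, the following Rust builds a tag message from a release's name, description and sobs, and reads the sobs back out. Everything specific here is an assumption made for illustration: the `Release-Signoff` trailer token, the field order (`key`, `sig`, optional `urn`), and representing keys and signatures as already-encoded strings without whitespace. Signature verification itself is out of scope for the sketch.

```rust
/// A signoff, with key and signature as opaque, pre-encoded strings
/// (assumption: e.g. base64 or hex, containing no whitespace).
#[derive(Debug, Clone, PartialEq)]
pub struct Sob {
    pub urn: Option<String>,
    pub key: String,
    pub sig: String,
}

/// Hypothetical trailer token; not an established git or Radicle convention.
const TRAILER: &str = "Release-Signoff";

/// Builds the tag message: name, blank line, description, and then one
/// trailer line per signoff in a final paragraph.
pub fn tag_message(name: &str, description: &str, sobs: &[Sob]) -> String {
    let mut msg = format!("{}\n\n{}\n", name, description.trim_end());
    if !sobs.is_empty() {
        msg.push('\n');
    }
    for sob in sobs {
        msg.push_str(&format!("{}: {} {}", TRAILER, sob.key, sob.sig));
        if let Some(urn) = &sob.urn {
            msg.push(' ');
            msg.push_str(urn);
        }
        msg.push('\n');
    }
    msg
}

/// Extracts the signoffs again, so each can be verified element-wise.
/// Simplification: every line with the trailer prefix is accepted, whereas
/// git only treats the last paragraph of a message as trailers.
pub fn parse_signoffs(message: &str) -> Vec<Sob> {
    let prefix = format!("{}: ", TRAILER);
    message
        .lines()
        .filter_map(|line| {
            let mut fields = line.strip_prefix(prefix.as_str())?.split_whitespace();
            Some(Sob {
                key: fields.next()?.to_string(),
                sig: fields.next()?.to_string(),
                urn: fields.next().map(str::to_string),
            })
        })
        .collect()
}
```

A verifier would call `parse_signoffs` on the tag message and check each `sig` against the tagged commit hash with the corresponding `key`.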
A few things of the above could be refined, or require further consideration, eg.
The `valid_when` set can obviously be modified after creation. This is either a case for @alexgood's ACL language, or the application needs to commit to a semantics (eg. first-writer-wins).
Thanks for putting the draft together @kim!
I wonder if this should be an ordered list (e.g. what if 1.2.0 has been released, then 1.3.0, and then we need to ship 1.2.1 with a hotfix). Could this be an unordered list, with a `date` added to the `Release` struct perhaps? (Ordering could also be an A-Z / Z-A ordering of the release `name` as well.)
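The hotfix scenario can be sketched in Rust as follows. The sketch assumes release names are plain `major.minor.patch` versions with an optional leading `v`, which the thread does not prescribe; `parse_version` and `sorted_by_version` are hypothetical helper names. One thing it makes visible is that a purely alphabetical ordering of names would place `1.10.0` before `1.9.0`, so a numeric comparison is needed.

```rust
/// Parses "1.2.3" or "v1.2.3" into a tuple that compares numerically.
/// Returns `None` for names that are not plain three-part versions.
pub fn parse_version(name: &str) -> Option<(u64, u64, u64)> {
    let mut parts = name.strip_prefix('v').unwrap_or(name).split('.');
    let major = parts.next()?.parse::<u64>().ok()?;
    let minor = parts.next()?.parse::<u64>().ok()?;
    let patch = parts.next()?.parse::<u64>().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Returns the release names ordered by version, independent of the
/// order in which the releases were appended. Unparseable names are
/// dropped in this sketch.
pub fn sorted_by_version(names: &[&str]) -> Vec<String> {
    let mut parsed: Vec<((u64, u64, u64), &str)> = names
        .iter()
        .filter_map(|name| parse_version(name).map(|version| (version, *name)))
        .collect();
    parsed.sort();
    parsed.into_iter().map(|(_, name)| name.to_string()).collect()
}
```

With this, the append order `1.2.0`, `1.3.0`, `1.2.1` can be kept in the object as history, while an application displays the releases in version order.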
I am not sure I fully understand the rewrite attack here? Perhaps an example would help?
I think first-writer-wins makes sense here and the application should enforce it (i.e. ignore changes made to `valid_when`).
It seems to me that the multi-sig part is how a release is "finalized" (the humans say so). Unless I am misunderstanding the question, it seems to me the cob referring to the "tag" makes sense. (Well, s/tag/commitid/, because git tags can be moved to a different commit hash and I'm not sure we want that, right?)
Sounds like you're referring to the application that displays / lists the releases, right? If so, I would also expect some kind of status for each Release object it displays that explains whether it satisfies the `valid_when` constraints.
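A minimal sketch of such a status check in Rust, under simplifying assumptions: signers are identified by plain strings standing in for keys or URNs, every entry in `signed_off_by` is taken to carry an already-verified signature, and the names `Status` and `release_status` are made up for this example.

```rust
use std::collections::BTreeSet;

/// Display status of a release relative to its signing obligations.
#[derive(Debug, PartialEq)]
pub enum Status {
    /// Every signer named in `valid_when` has signed off.
    Valid,
    /// Some required signers have not signed off yet.
    Pending { missing: Vec<String> },
}

/// Compares the required signers with the signoffs received so far.
/// Signoffs from signers not listed in `valid_when` are ignored.
pub fn release_status(
    valid_when: &BTreeSet<String>,
    signed_off_by: &BTreeSet<String>,
) -> Status {
    let missing: Vec<String> = valid_when.difference(signed_off_by).cloned().collect();
    if missing.is_empty() {
        Status::Valid
    } else {
        Status::Pending { missing }
    }
}
```

Listing the missing signers, rather than returning a plain boolean, lets the application explain why a release is not yet finalized.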
Maybe `Map<String, Release>`.
If we refer to a signer by URN, there is no causal relationship to their "sigchain". So, we could be presented with a signing key which is only in a forked history of that identity. Or, the key was revoked at some point, but we don't know if that was before or after the release was signed with it. We can greatly simplify these validation obligations by referring to `(Urn, Revision)` instead.
For the project itself there is a connection made every time a release is finalized, which is why it would be useful to include the hash of the release object in the tag.
(Obviously, we cannot have the tag hash in the release object then without modifying it. So it would not be possible to know if a release was finalized by looking only at the release object. I guess that's fine.)