Registry runtime state compatibility

When the runtime is updated we need to ensure that it can handle state data that was written by a previous version of the runtime. Otherwise the old data cannot be accessed. To allow for this I propose the following scheme.

Let’s consider as an example a user with just a handle field that we store on chain. At first we define it as

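// SCALE codec traits and derive macros (parity-scale-codec crate, `derive` feature)
use parity_scale_codec::{Decode, Encode};

#[derive(Decode, Encode)]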
enum User {
    UserV1(UserV1)
}

#[derive(Decode, Encode)]
struct UserV1 {
    handle: String
}

Now if we want to add a karma field to the user we introduce a new version:

#[derive(Decode, Encode)]
enum User {
    UserV1(UserV1),
    // New variants are only ever appended, so existing indices stay the same.
    UserV2(UserV2),
}

#[derive(Decode, Encode)]
struct UserV1 {
    handle: String
}

#[derive(Decode, Encode)]
struct UserV2 {
    handle: String,
    karma: u32
}

A User value encoded with the first version of the code can safely be decoded with the second version of the code and produces the expected result. This is guaranteed by the fact that the SCALE codec writes a leading byte to indicate which variant is used, and that index stays stable as long as we only append new variants at the end of the enum.
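To make the argument above concrete, here is a minimal sketch of why the leading variant byte gives us backward compatibility. It does not use the real codec: `encode_v1`, `read_string` and `decode_user` are hypothetical hand-rolled helpers, and strings are written with a single length byte, which is a simplification of what SCALE actually does.

```rust
#[derive(Debug, PartialEq)]
struct UserV1 {
    handle: String,
}

#[derive(Debug, PartialEq)]
struct UserV2 {
    handle: String,
    karma: u32,
}

#[derive(Debug, PartialEq)]
enum User {
    UserV1(UserV1),
    UserV2(UserV2),
}

/// Bytes the first runtime version would have stored for a `UserV1`:
/// variant index 0, then the handle. Assumes the handle is shorter than
/// 256 bytes (simplified length prefix, not real SCALE).
fn encode_v1(user: &UserV1) -> Vec<u8> {
    let mut out = vec![0u8];
    out.push(user.handle.len() as u8);
    out.extend_from_slice(user.handle.as_bytes());
    out
}

/// Reads a length-prefixed string and returns it with the remaining input.
fn read_string(input: &[u8]) -> Option<(String, &[u8])> {
    let (&len, rest) = input.split_first()?;
    let len = len as usize;
    if rest.len() < len {
        return None;
    }
    let (bytes, rest) = rest.split_at(len);
    let s = String::from_utf8(bytes.to_vec()).ok()?;
    Some((s, rest))
}

/// Decoder of the second runtime version. Index 0 still means `UserV1`,
/// so data written by the first version decodes unchanged.
fn decode_user(input: &[u8]) -> Option<User> {
    let (&tag, rest) = input.split_first()?;
    match tag {
        0 => {
            let (handle, _) = read_string(rest)?;
            Some(User::UserV1(UserV1 { handle }))
        }
        1 => {
            let (handle, rest) = read_string(rest)?;
            let b = rest.get(..4)?;
            let karma = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
            Some(User::UserV2(UserV2 { handle, karma }))
        }
        _ => None,
    }
}
```

Feeding the output of `encode_v1` into `decode_user` yields a `User::UserV1` with the same handle, which is the property we rely on.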

If the user needs to change again we can add another UserV3 variant and struct. Existing variants and structs must never be changed.
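One way the rest of the runtime could avoid matching on every version is to normalize whatever was read from storage into the newest struct. The sketch below is only an illustration: the `into_latest` method is hypothetical, and the default karma of 0 for old users is an assumption, not something we have decided.

```rust
#[derive(Debug, PartialEq)]
struct UserV1 {
    handle: String,
}

#[derive(Debug, PartialEq)]
struct UserV2 {
    handle: String,
    karma: u32,
}

#[derive(Debug, PartialEq)]
enum User {
    UserV1(UserV1),
    UserV2(UserV2),
}

impl User {
    /// Upgrades any stored version to the latest struct.
    /// Assumption: users written before karma existed start with karma 0.
    fn into_latest(self) -> UserV2 {
        match self {
            User::UserV1(v1) => UserV2 {
                handle: v1.handle,
                karma: 0,
            },
            User::UserV2(v2) => v2,
        }
    }
}
```

Adding a UserV3 would then mean adding one match arm and changing the return type, while the stored V1 and V2 data stays untouched.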

I suggest we adopt this as a policy for adding new storage and add it to the developer docs. For existing storage we’ll need to replace it with a new storage declaration if we change it.

This is arguably more pleasant to work with than a struct with all-optional fields. It’s not forward-compatible though (i.e. an older runtime will not be able to parse data written by a newer version). Sometimes this is desirable, but often it is not: the example illustrates that a user without karma could be perfectly valid, perhaps with a default value.

You’re right that this approach is not forward compatible. Fortunately we don’t need that for the runtime. Once a runtime update is submitted to the chain the old runtime will not run anymore.
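For illustration, this is roughly what the missing forward compatibility looks like from the old runtime's point of view. The `DecodeError` type and `variant_name_v1_runtime` function are hypothetical and only model the variant-index dispatch, not the real codec.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    Empty,
    UnknownVariant(u8),
}

/// Variant dispatch as the first runtime version would perform it:
/// it only knows index 0 and has to reject everything else.
fn variant_name_v1_runtime(encoded: &[u8]) -> Result<&'static str, DecodeError> {
    match encoded.first() {
        None => Err(DecodeError::Empty),
        Some(&0) => Ok("UserV1"),
        Some(&other) => Err(DecodeError::UnknownVariant(other)),
    }
}
```

A value written as UserV2 starts with index 1, so the old code fails with `UnknownVariant(1)`. Since the old runtime never runs against new state, this case does not come up in practice.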

The “all fields optional” approach (taken by protobuf v3, for example) was also under consideration. However, in addition to being more cumbersome to handle, as you pointed out, it also requires the serialization mechanism to support it. With Substrate we are locked into the SCALE codec, which does not support this. This is another reason to prefer the enum approach.