With a new development cycle upon us, radicle-surf needs to revisit the task of implementing diff’ing.
We created a couple of issues to track this work initially:
- Diff Semantics · Issue #27 · radicle-dev/radicle-surf · GitHub
- Implement diff · Issue #22 · radicle-dev/radicle-surf · GitHub
The reason for getting more serious about this topic is that we would like to complete some code browsing features. To name just a few:
- Viewing commit changes
- Comparing two commits
- Viewing branch comparisons (more specifically, for viewing proposed changes to be merged)
Work in Progress
Thanks to the work of Andrei Tomashpolskiy we have a start for diff’ing. This initial implementation surfaced a lot of questions that we hadn’t considered when we first thought about generating diffs. We initially thought that generating a diff from the Directory structure would be quite powerful, since it would be implemented in so-called “pure” functions. It would not rely on any store or state. In reality though, it would be useful to bootstrap off the backing VCS’s implementation of diff’ing, since it already tracks a lot of useful information, such as lines changed, files moved, etc.
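To make the trade-off concrete, here is a minimal sketch of the “pure” approach. It is not the radicle-surf implementation: Snapshot is a hypothetical stand-in for the Directory structure (a map from path to file contents), and Change and diff_snapshots are names made up for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a directory snapshot: path -> file contents.
type Snapshot = BTreeMap<String, String>;

/// The only kinds of change a snapshot comparison can see.
#[derive(Debug, PartialEq)]
enum Change {
    Added(String),
    Deleted(String),
    Modified(String),
}

/// Pure function: the result depends only on the two snapshots passed in,
/// not on any repository, store, or other state.
fn diff_snapshots(old: &Snapshot, new: &Snapshot) -> Vec<Change> {
    let mut changes = Vec::new();

    // Paths present in the old snapshot are either gone, changed, or untouched.
    for (path, old_content) in old {
        match new.get(path) {
            None => changes.push(Change::Deleted(path.clone())),
            Some(new_content) if new_content != old_content => {
                changes.push(Change::Modified(path.clone()))
            }
            Some(_) => {}
        }
    }

    // Paths that only exist in the new snapshot were added.
    for path in new.keys() {
        if !old.contains_key(path) {
            changes.push(Change::Added(path.clone()));
        }
    }

    changes
}
```

Note what this sketch cannot see: a file moved from README.md to docs/README.md shows up as one deletion plus one addition, and a modification carries no line-level detail. That is the kind of information the backing VCS already tracks.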
Next Steps
Moving forward we would like to try to work from the concrete and notice patterns that we could generalise for future development. This process will be different from the one in Surfs Up - The Denotational Wave. If we were to approach this problem with another denotational design perspective, I would fear that we’d be on the road to re-inventing something like Pijul (but nowhere near as complete and nice).
Instead we will start from the model of libgit2 (more specifically the Rust wrapper git2), mapping out how it models diffs, and try to create a model that would make more sense to us and other implementations, again, such as Pijul. Over the next couple of weeks, starting today, we will dig into diff’ing with one use case at a time.
Commit Change-Set
The first use-case is looking at a single commit change set. In GitHub this would be the same as looking at a single commit, e.g. Add Deref for Label · radicle-dev/radicle-surf@07fce73 · GitHub. What this is doing is taking a single commit and comparing it to its parent. What we get is a series of files that were changed, whether they were added, deleted, moved, or modified, and within modifications what lines were added or removed. Sounds simple, right? Well, in libgit2 it turns out to be a web of data structures that we’re currently navigating through. Here’s a taste of what we have so far:
- To get a diff of two Commits we can get their Trees and call diff_tree_to_tree
- A Patch can be retrieved by using Patch::from_diff, passing the Diff we just retrieved along with the file you want to look at (which is given by an index, by the way…)
- Within the Patch we have a DiffDelta which tells us what files are being touched
- We also have a series of DiffHunks
- DiffHunks in turn have lines, which are captured in the structure DiffLine. It’s here where we can tell how the content has changed within a file.
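The nesting described in the list above can be sketched with plain Rust types. These are hypothetical stand-ins that only loosely mirror the shape of the git2 structures (Patch, Delta, Hunk, Line and Status here are simplified illustrations, not the real git2 definitions, which carry much more detail). The sketch assumes each line is tagged with a '+' or '-' origin character for additions and removals, as in unified diff output.

```rust
/// What happened to a file (hypothetical, simplified).
#[derive(Debug, PartialEq)]
enum Status {
    Added,
    Deleted,
    Moved,
    Modified,
}

/// Which file is being touched, and how.
struct Delta {
    status: Status,
    old_path: Option<String>,
    new_path: Option<String>,
}

/// A single line of a hunk; `origin` is '+', '-', or ' ' for context.
struct Line {
    origin: char,
    content: String,
}

/// A contiguous block of changed lines within one file.
struct Hunk {
    header: String,
    lines: Vec<Line>,
}

/// Everything known about the change to one file.
struct Patch {
    delta: Delta,
    hunks: Vec<Hunk>,
}

/// Walk patch -> hunks -> lines and count (added, removed) lines.
fn line_stats(patch: &Patch) -> (usize, usize) {
    let mut added = 0;
    let mut removed = 0;
    for hunk in &patch.hunks {
        for line in &hunk.lines {
            match line.origin {
                '+' => added += 1,
                '-' => removed += 1,
                _ => {} // context lines and anything else are ignored
            }
        }
    }
    (added, removed)
}

/// One summary line per file, e.g. for a commit change-set view.
fn summarise(patches: &[Patch]) -> Vec<String> {
    patches
        .iter()
        .map(|patch| {
            let (added, removed) = line_stats(patch);
            // Prefer the new path; fall back to the old one for deletions.
            let path = patch
                .delta
                .new_path
                .as_deref()
                .or(patch.delta.old_path.as_deref())
                .unwrap_or("<unknown>");
            format!("{:?} {} (+{} -{})", patch.delta.status, path, added, removed)
        })
        .collect()
}
```

Even in this reduced form, answering a simple question such as “how many lines changed in this file?” means descending three levels, which gives a feel for why the nesting deserves a second look for code browsing.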
Finding the Pattern
While these are just our initial findings with a quick dive into the libgit2 API (via git2), we will have to keep in the back of our mind what patterns we can see here. While these data structures may be useful for the git model, is this type of nested structure helpful for code browsing? Is there a more user friendly structure that we can come up with that allows for the plug’n’play of other VCS libraries? What information makes the most sense for radicle-upstream so they can display diffs in our beautiful application?
If you, dear reader, have any ideas please let us know! In the meantime, we will play around with these git primitives and keep you updated on our progress.
Thank you for reading,
Team CoCo (Code Collaboration)
Edit 1: “tells us” instead of “tells use”
Edit 2: Add links to libgit2 and git2-rs