With a new development cycle upon us,
radicle-surf needs to revisit the task of implementing diff’ing.
We created a couple of issues to track this work initially:
- Diff Semantics · Issue #27 · radicle-dev/radicle-surf · GitHub
- Implement diff · Issue #22 · radicle-dev/radicle-surf · GitHub
The reason for getting more serious about this topic is that we would like to complete some code browsing features. To name just a few:
- Viewing commit changes
- Comparing two commits
- View branch comparisons (more specifically for viewing proposed changes to be merged)
Thanks to the work of Andrei Tomashpolskiy we have a start for diff’ing. This initial implementation surfaced a lot of questions that we hadn’t considered when thinking about generating diffs. We initially thought that generating a diff from the Directory structure would be quite powerful, since it would be implemented in so-called “pure” functions that do not rely on any store or state. In reality, though, it would be useful to bootstrap off of the backing VCS’s implementation of diff’ing, since it already tracks a lot of useful information, such as lines changed, files moved, etc.
Moving forward, we would like to try to work from the concrete and notice patterns that we could generalise for future development. This process will differ from the one we followed in Surfs Up - The Denotational Wave. If we were to approach this problem from another denotational design perspective, I fear that we’d be on the road to re-inventing something like Pijul (but nowhere near as complete and nice).
Instead we will start from the model of libgit2 (more specifically the Rust wrapper git2), mapping out how it models diffs, and try to create a model that would make more sense to us and other implementations, again, such as Pijul. Over the next couple of weeks, starting today, we will dig into diff’ing with one use case at a time.
The first use-case is looking at a single commit change set. In GitHub this would be the same as looking at a single commit, e.g. Add Deref for Label · radicle-dev/radicle-surf@07fce73 · GitHub. What this is doing is taking a single commit and comparing it to its parent. What we get is a series of files that were changed, whether they were added, deleted, moved, or modified, and, within modifications, what lines were added or removed. Sounds simple, right? Well, in libgit2 it turns out to be a web of data structures that we’re currently navigating through. Here’s a taste of what we have so far:
- To get a diff of two Commits we can get their Trees and call diff_tree_to_tree
- A Patch can be retrieved by using the Diff we just retrieved along with the file you want to look at (which is given by an index, by the way…)
- Within the Patch we have a DiffDelta which tells us what files are being touched
- We also have a series of DiffHunks, which in turn have lines that are captured in the structure DiffLine. It’s here where we can tell how the content has changed within a file.
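To make the nesting above easier to picture, here is a minimal sketch in plain Rust. The types Patch, Delta, Hunk and Line, the function line_stats, and the sample file path are simplified stand-ins invented for illustration. They are not the git2 types and carry far fewer fields. The sketch only shows the shape: one patch per file, a delta naming the file, hunks inside the patch, and lines inside each hunk.

```rust
// Simplified stand-ins for the shapes described above. These are NOT the
// real git2 types; names and fields are assumptions made for illustration.

/// How a single line changed inside a hunk (stand-in for DiffLine).
#[derive(Debug, Clone, PartialEq)]
pub enum Line {
    Context(String),
    Added(String),
    Removed(String),
}

/// A contiguous block of changes within one file (stand-in for DiffHunk).
#[derive(Debug, Clone, PartialEq)]
pub struct Hunk {
    pub header: String,
    pub lines: Vec<Line>,
}

/// Which file is being touched (stand-in for DiffDelta).
#[derive(Debug, Clone, PartialEq)]
pub struct Delta {
    pub old_path: Option<String>,
    pub new_path: Option<String>,
}

/// All changes to one file (stand-in for Patch).
#[derive(Debug, Clone, PartialEq)]
pub struct Patch {
    pub delta: Delta,
    pub hunks: Vec<Hunk>,
}

/// Count (added, removed) lines by walking hunks, then lines within each hunk.
pub fn line_stats(patch: &Patch) -> (usize, usize) {
    let mut added = 0;
    let mut removed = 0;
    for hunk in &patch.hunks {
        for line in &hunk.lines {
            match line {
                Line::Added(_) => added += 1,
                Line::Removed(_) => removed += 1,
                Line::Context(_) => {}
            }
        }
    }
    (added, removed)
}

/// A tiny hand-written patch (hypothetical content) to exercise `line_stats`.
pub fn sample_patch() -> Patch {
    Patch {
        delta: Delta {
            old_path: Some("src/label.rs".to_string()),
            new_path: Some("src/label.rs".to_string()),
        },
        hunks: vec![Hunk {
            header: "@@ -1,2 +1,3 @@".to_string(),
            lines: vec![
                Line::Context("use std::ops::Deref;".to_string()),
                Line::Removed("struct Label;".to_string()),
                Line::Added("struct Label(String);".to_string()),
                Line::Added("impl Deref for Label {}".to_string()),
            ],
        }],
    }
}
```

Even in this reduced form, answering a simple question such as "how many lines changed in this file?" means descending through every level of the structure.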
While these are just our initial findings from a quick dive into the libgit2 API (via git2), we will have to keep in the back of our minds what patterns we can see here. While these data structures may be useful for the git model, is this type of nested structure helpful for code browsing? Is there a more user-friendly structure that we can come up with that allows for the plug’n’play of other VCS libraries? What information makes the most sense for radicle-upstream so they can display diffs in our beautiful application?
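As a strawman for these questions, and not a proposal for the radicle-surf API, one flatter shape might be a single value per changed file. The FileDiff type and the classify and summary functions below are hypothetical names made up for this sketch. It assumes a backend can supply the old and new path of each file plus the added and removed lines, and it deliberately ignores the case where a file is both moved and modified.

```rust
/// Hypothetical, VCS-agnostic description of what happened to one file.
#[derive(Debug, Clone, PartialEq)]
pub enum FileDiff {
    Added { path: String },
    Deleted { path: String },
    Moved { from: String, to: String },
    Modified {
        path: String,
        added: Vec<String>,
        removed: Vec<String>,
    },
}

/// Classify a change from its old and new paths.
/// Returns None when neither path is present, since there is nothing to report.
/// Assumption: a path change is reported as a move and any line changes are dropped.
pub fn classify(
    old: Option<&str>,
    new: Option<&str>,
    added: Vec<String>,
    removed: Vec<String>,
) -> Option<FileDiff> {
    match (old, new) {
        (None, None) => None,
        (None, Some(p)) => Some(FileDiff::Added { path: p.to_string() }),
        (Some(p), None) => Some(FileDiff::Deleted { path: p.to_string() }),
        (Some(a), Some(b)) if a != b => Some(FileDiff::Moved {
            from: a.to_string(),
            to: b.to_string(),
        }),
        (Some(p), Some(_)) => Some(FileDiff::Modified {
            path: p.to_string(),
            added,
            removed,
        }),
    }
}

/// One-line summary that a UI could render in a list of changed files.
pub fn summary(diff: &FileDiff) -> String {
    match diff {
        FileDiff::Added { path } => format!("A {}", path),
        FileDiff::Deleted { path } => format!("D {}", path),
        FileDiff::Moved { from, to } => format!("R {} -> {}", from, to),
        FileDiff::Modified { path, added, removed } => {
            format!("M {} (+{} -{})", path, added.len(), removed.len())
        }
    }
}
```

A consumer of a shape like this would match on one enum instead of walking patches, deltas, hunks and lines, at the cost of losing hunk boundaries and line numbers, which a real design would probably need to keep.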
If you, dear reader, have any ideas, please let us know! In the meantime, we will play around with these git primitives and keep you updated on our progress.
Thank you for reading,
Team CoCo (Code Collaboration)
Edit 1: “tells us” instead of “tells use”
Edit 2: Add links to libgit2 and git2-rs