In this post I wanted to give people the low-down on radicle-surf
and the journey for its first version.
In part it’s also a personal journey for myself and the team, since we are a partly-remote team and I wanted to give an
idea of how we work.
radicle-surf
As a part of the greater Radicle ecosystem, we will focus on the radicle-surf
component here. The aim of this component is to provide an API for browsing code in a Version Control System (VCS). Think of it as similar to going to GitHub and looking at a specific repository. Some of the common things we want to do are:
- Navigate directories
- View a file’s contents
- See what commit touched a file last
- View the commits in some history
Another wish for the project was to have it independent from the VCS it used. We would like to explore having support for Pijul repositories, for example. With a few of these details in mind we moved forward into thinking about the design of radicle-surf
.
Denotational Design
Denotational design is a technique for designing your API in such a way that it minimises abstraction leaks and lends itself to getting to the “true meaning” of what you are designing. Conal Elliott is the main (pretty much only) proponent of this technique. I first became interested in it after watching his talk and remarked on how elegant his abstractions turned out to be. On top of that, the process of thinking about what you want to represent and eliminating edge cases in the system is dear to my heart as a functional programmer who likes types and static guarantees.
The reason I mention this is because as I began to think about code browsing my friend Sandy Maguire had mentioned denotational design in his upcoming book, “Design and Interpretation of Haskell Programs”. This re-lit the first in my mind and I decided to explore the technique as a way to provide some upfront work on radicle-surf
before diving in and writing code.
First Attempt
It’s rare that we get things right the first time around. It was no different with working denotational design and our initial stab at the problem. We can find the initial design here.
The tactic for designing in this way is to lay out your domain as much as you can up front. So, to start off we list all the objects we think we will require in the system. In this case, we had listed objects such as: Repo
, CommitHistory
, Commit
, and Change
. We can note here that these might not have initially existed from the outset, but instead brought into existence after exploring the design further.
Once we had the objects listed, we listed the actions we would like to have working on them. These build up the functionality of our API. For example, in code browsing we want to list a directory, get a particular set of commits, view a file’s contents, etc. With the types of objects and functions in hand we begin to reason about the “meaning” of these objects and functions. The aim of this step is a bit more nuanced and we spend the most of our time figuring this out and refining it. The best guide for this step is to find the type that is used the least in the functions. For example, Repo
was only mentioned in two places:
addCommitHistory :: CommitHistory -> Repo -> Repo
getCommitHistories :: Repo -> [CommitHistory]
We have a function that adds a CommitHistory
to a Repo
, getting back the updated Repo
. The other is to retrieve all the CommitHistory
objects in a Repo
. When we want to assign a meaning to an object, we want to pick the simplest meaning as possible. This means that we do not want to bring any unnecessary meaning with us during the design. We will not want to talk about what time the Repo
was created at, so we will not be adding a time type to the meaning. The simplest meaning in our case here is type instance Repo = [CommitHistory]
. A Repo
is simply the collection of CommitHistory
objects. From there we can derive the meaning of our two functions above using the overloaded operator ÎĽ
(read “as meaning of”).
-- Prepend a 'CommitHistory' to a 'Repo'
addCommitHistory :: CommitHistory -> Repo -> Repo
ÎĽ addCommitHistory history = history : ÎĽ repo
-- Retrieve a list of the 'CommitHistory's
getCommitHistories :: Repo -> [CommitHistory]
ÎĽ getCommitHistories repo = ÎĽ repo
The issue was that when starting out I did not describe the domain correctly to myself. When fleshing out the design I got to the point where a Change
was the atomic unit of the system. A Commit
was made up of a set of Change
s and they would build a file. Maybe it was my buzz on just learning about patch theory, Darcs, and Pijul.
In the design we defined Change
as:
type Change
-- Constructors of Change - think GADT
AddLineToFile :: FileName -> Location -> FileContents -> Change
RemoveLineFromFile :: FileName -> Location -> Change
MoveFile :: FileName -> FileName -> Change
CreateFile :: FileName -> Change
DeleteFile :: FileName -> Change
Then to build up a file’s contents we added:
applyChange :: Change -> FileContents -> FileContents
ÎĽ applyChange change fileContents = case change of
AddLineToFile _ location fileContents' -> addLine location fileContents' fileContents
RemoveLineFromFile _ location -> removeLine location fileContents
-- These operations don't modify contents per se but rather just modify the file
MoveFile _ _ -> fileContents
CreateFile _ -> fileContents
-- Deleting will set FileContents to nothing
DeleteFile _ -> emptyFileContents
While this is elegant from the point of view of constructing a file via its history, it became unnecessarily complicated to implement when all we wanted to do is view a file’s state. We would have to implement a patch-based approach for Git Commits. This would be a distraction from what we really wanted to do: view files and directories.
Second Attempt
The good news is that I have a team at Monadic to lean on when the going gets tough! After expressing my frustration with the initial design, Kim and I setup a call where we could collaborate on a refinement of the design. Kim is based in Berlin and I am based in Dublin so we hopped on Google Hangouts and used tmate to start a remote terminal that we could both work in. This is a really cool tool for doing some pair programming remotely, or even in the same room.
Sitting down together we thought about the problem again. We initially tried to think about the initial design and what was the issue, but we switched how we were thinking about things a bit. We had a thought experiment as to what a code browser might look like if we were to build it for a REPL. This lead us to an initial jumping off point where the user would get a repository, pick a history to view, build the directory from that history, and finally browse that directory. This matched up pretty well when thinking about how a user would utilise GitHub. We first go to a URL, pick a branch (default to the master
branch), and we have our directories and files to view.
Then things got interesting. We knew we had this concept of building a directory. Using Pijul and Git as examples we thought about how they worked to render a directory from their units of change.
In Git we have the notion of Commit (see Commit Objects), and this will give us the ability to render a Tree. So if we had a list of commits (a Branch can be seen as a list of Commits), to get the directory state we grab the head of the list and render the Tree. On the flip-side we have Pijul which contains a list of Patches. If we are given a collection of Patches we can fold over them, applying each using Pijul’s logic, to get the redered directory state.
So it was important to note that we have two similar functions that go from some VCS artifact to a rendered directory. This led us to the idea of a Snapshot
. A Snapshot
was simply a function
from [a] -> Directory
where a
could be a Commit
or a Patch
. This is when we felt like we were really onto something. The boundary was clearly there separating any Repo
logic from the
Directory
logic.
We refined this further. Since we must have at least one a
to bring a Directory
into existence, a Snapshot
became NonEmpty a -> Directory
. We note that we could have also said that it would be [a] -> Maybe Directory
. Instead we wanted to work on more natural semantics, since we can also note that there is a transformation from [a] -> Maybe (NonEmpty a)
. For a further explanation on this method, we highly recommend reading Parse, don’t validate, by Alexis King.
One issue with this initial meaning for Snapshot
though was that head
of an empty list is not a total function and so we could not get the head
of list of Commit
s. We would like to avoid these kinds of faulty semantics as much as possible, and thankfully this is simply fixed by using a NonEmpty
list, but it clearly illustrates the push and pull of giving meanings to types and functions.
We explored further the meaning of Snapshot
and how it would play in the browsing semantics. We noted that in the initial user flow of the REPL, that there was no way of changing the history we wanted to work with. This smelled like we needed some way of viewing and updating state. In Haskell-land, this is achieved simply by using the State
monad, where we can call functions such as get
, set
, and modify
. At that moment we realised that our way of browsing code was simply combining a Reader
monad (a read only state) over a State
monad. The read-only state was the Snapshot
function, and the modifiable state was History
(i.e. NonEmpty a
). From there we could easily define a function that rendered a Directory
from its current state and modify state, such as finding a particular commit, or viewing a particular branch.
Once we have a Directory
we can do things like finding a particular file or directory, or getting the diff between to two Directory
objects. These are outlined in the second design.
Conclusion
The first minimal version of radicle-surf
is coming along nicely and we would say this is due to having put some good up-front work before writing any code.
On top of having a reference document for writing the implementation, it also gave other Monadlings the chance to give their input on the design and allowed us to have a common place to discuss the design. It allowed us to not get distracted by nit-picky semantics, but rather focus on what we truly wanted to design and use.
We would love to hear your feedback and if you have had any experience in writing denotational design before
Lots of love,
@fintohaps