Can We Please Move Past Git? (22 Feb 2021)

Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it.

— Pro Git §10.1
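For a concrete sense of what "content-addressable" means in that quote, here is a minimal sketch. It assumes Git's default SHA-1 object format and ignores compression and the on-disk layout entirely; `store` and the `objects` dict are hypothetical stand-ins for the real object database, not anything Git exposes.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git assigns to a file ("blob") with these bytes."""
    # Git hashes a small header ("blob <size>\0") followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def store(objects: dict, content: bytes) -> str:
    """Toy object store (hypothetical): the key is derived from the content."""
    object_id = git_blob_id(content)
    objects[object_id] = content
    return object_id


if __name__ == "__main__":
    objects = {}
    first = store(objects, b"hello\n")
    second = store(objects, b"hello\n")  # same bytes, same key, stored once
    print(first == second, len(objects))
```

The point of the sketch is that nothing is named by the user: identical content always lands under the same key, and everything user-facing (branches, tags, commits as people think of them) is layered on top of that.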

Most software development is not like the Linux kernel's development; as such, Git is not designed for most software development. Like Samuel Hayden tapping the forces of Hell itself to generate electricity, the foundations on which Git is built are overkill on the largest scale, and when the interface concealing all that complexity cracks, the nightmares which emerge can only be dealt with by the Doom Slayer that is Oh Shit Git. Of course, the far more common error handling method is to start over from scratch.

Git is bad. But are version control systems like operating systems in that they're all various kinds of bad, or is an actually good VCS possible? I don't know, but I can test some things and see what comes up.

Mercurial

Mercurial is a distributed VCS that's around the same age as Git, and I've seen it called the Betamax to Git's VHS, which my boomer friends tell me is an apt analogy, but I'm too young for that to carry meaning. So let me see what all the fuss is about.

Well, I have some bad news. From the download page, under "Requirements":

Mercurial uses Python (version 2.7). Most ready-to-run Mercurial distributions include Python or use the Python that comes with your operating system.

Emphasis theirs, but I'd have added it myself otherwise. Python 2 has been dead for a very long time now, and saying you require Python 2 makes me stop caring faster than referring to "GNU/Linux". If you've updated it to Python 3, cool, don't say it uses Python 2. Saying it uses Python 2 makes me think you don't have your shit together, and in fairness, that makes two of us, but I'm not asking people to use my version control system (so far, at least).

You can't be better than Git if you're that outdated. (Although you can totally be better than Git by developing a reputation for having a better UI than Git; word of mouth helps a lot.)

Subversion

I am a fan of subverting things, and I have to respect wordplay. So let's take a look at Subversion (sorry, "Apache® Subversion®").

There are no official binaries at all, and the most-plausible-looking blessed unofficial binary for Windows is TortoiseSVN. I'm looking through the manual, and I must say, the fact that branches and tags aren't actually part of the VCS, but instead conventions on top of it, isn't good. When I want to make a new branch, it's usually "I want to try an experiment, and I want to make it easy to give up on this experiment." Also, I'm not married to the idea of distributed VCSes, but I do tend to start a project well before I've set up server-side infrastructure for it, and Subversion is not designed for that sort of thing at all. So I think I'll pass.

You can't be better than Git if the server setup precedes the client setup when you're starting a new project. (Although you can totally be better than Git by having monotonically-ish increasing revision numbers.)

Fossil

Fossil is kinda nifty: it handles not just code but also issue tracking, documentation authoring, and a bunch of the other things that services like GitHub staple on after the fact. Where Git was designed for the Linux kernel, which has a fuckton of contributors and needs to scale absurdly widely, Fossil was designed for SQLite, which has a very small number of contributors and does not solicit patches. My projects tend to only have one contributor, so this should in principle work fine for me.

However, a few things about Fossil fail to spark joy. The fact that repository metadata is stored as an independent file separate from the working directory, for example, is a design decision that doesn't merge well with my existing setup. If I were to move my website into Fossil, I would need somewhere to put boringcactus.com.fossil outside of D:\Melody\Projects\boringcactus.com where the working directory currently resides. The documentation suggests ~/Fossils as a folder in which repository metadata can be stored, but that makes my directory structure more ugly. The rationale for doing it this way instead of having .fossil in the working directory like .git etc. is that multiple checkouts of the same repository are simpler when repository metadata is outside each of them. Presumably the SQLite developers do that sort of thing a lot, but I don't, and I don't know anyone who does, and I've only ever done it once (back in the days when the only way to use GitHub Pages was to make a separate gh-pages branch). Cluttering up my filesystem just so you can support a weird edge case that I don't need isn't a great pitch.

But sure, let's check this out. The docs have instructions for importing a Git repo to Fossil, so let's follow them:

PS D:\Melody\Projects\boringcactus.com> git fast-export --all | fossil import --git D:\Melody\Projects\misc\boringcactus.com.fossil
]ad fast-import line: [S IN THE

Well, then. You can't be better than Git if your instructions for importing from Git don't actually work. (Although you can totally be better than Git if you can keep track of issues etc. alongside the code.)

Darcs

Darcs is a distributed VCS that's a little different to Git etc. Git etc. have the commit as the fundamental unit on which all else is built, whereas Darcs has the patch as its fundamental unit. This means that a branch in Darcs refers to a set of patches, not a commit. As such, Darcs can be more flexible with its history than Git can: a Git commit depends on its temporal ancestor ("parent"), whereas a Darcs patch depends only on its logical ancestor (e.g. creating a file before adding text to it). This approach also improves the way that some types of merge are handled; I'm not sure how often the cases that Git's approach handles badly actually come up, but the fact that they could is definitely suboptimal.
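The "logical ancestor" idea can be sketched as a toy model. This is not Darcs's actual algorithm: here each patch declares its dependencies by hand, and `Patch`, `depends_on`, and `patches_needed` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    name: str
    # Names of the patches this one logically needs (declared by hand here).
    depends_on: frozenset = frozenset()


def patches_needed(target: str, patches: dict) -> set:
    """Return the target patch plus everything it transitively depends on."""
    needed = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        stack.extend(patches[name].depends_on)
    return needed


if __name__ == "__main__":
    # Recorded in this order, but only the README patches are related.
    history = [
        Patch("create README"),
        Patch("create main.py"),
        Patch("add text to README", frozenset({"create README"})),
    ]
    patches = {p.name: p for p in history}
    print(sorted(patches_needed("add text to README", patches)))
```

In a parent-pointer model, taking the third change would drag along "create main.py" simply because it was recorded earlier; in this toy model it only brings the patch it actually builds on.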

So that's pretty cool; let's take a look for ourselves. Oh. Well, then. The download page is only served over plain HTTP - there's just nothing listening on that server over HTTPS - and the downloaded binaries are also served over plain HTTP. That's not a good idea. I'll pass, thanks.

You can't be better than Git while serving binaries over plain HTTP. (Although you can totally be better than Git by having nonlinear history and doing interesting things with patches.)

Pijul

Pijul is (per the manual)

the first distributed version control system to be based on a sound mathematical theory of changes. It is inspired by Darcs, but aims at solving the soundness and performance issues of Darcs.

Inspired by Darcs but better, you say? You have my attention. Also of note: the developers are building their own GitHub clone, which they use to host pijul itself. That gives a really nice view of how a GitHub clone built on top of pijul would work, and it also offers free hosting.

The manual gives installation instructions for a couple Linuces and OS X, but not Windows, and not Alpine Linux, which is the only WSL distro I have installed. However, someone involved in the project showed up in my mentions to say that it works on Windows, so we'll just follow the generic instructions and see what happens:

PS D:\Melody\Projects> cargo install pijul --version "~1.0.0-alpha"
    Updating crates.io index
  Installing pijul v1.0.0-alpha.38
  Downloaded <a bunch of stuff>
   Compiling <a bunch of stuff>
error: linking with `link.exe` failed: exit code: 1181
  |
  = note: "C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.27.29110\\bin\\HostX64\\x64\\link.exe" <lots of bullshit>
  = note: LINK : fatal error LNK1181: cannot open input file 'zstd.lib'


error: aborting due to previous error

So it doesn't work for me on Windows. (There's a chance that instructions would help, but in the absence of those, I will simply give up.) Let's try it over on Linux:

UberPC-V3:~$ cargo install pijul --version "~1.0.0-alpha"
<lots of output>
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" <a mountain of arguments>
  = note: /usr/lib/gcc/x86_64-alpine-linux-musl/9.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lzstd
          /usr/lib/gcc/x86_64-alpine-linux-musl/9.3.0/../../../../x86_64-alpine-linux-musl/bin/ld: cannot find -lxxhash
          collect2: error: ld returned 1 exit status


error: aborting due to previous error
UberPC-V3:~$ sudo apk add zstd-dev xxhash-dev
UberPC-V3:~$ cargo install pijul --version "~1.0.0-alpha"
<lots of output again because cargo install forgets dependencies immediately smdh>
   Installed package `pijul v1.0.0-alpha.38` (executable `pijul`)

Oh hey, would you look at that, it actually worked, and all I had to do was wait six months for each compile to finish (and make an educated guess about what packages to install). So for the sake of giving back, let's add those instructions to the manual, so nobody else has to bang their head against the wall like I'd done the past few times I tried to get Pijul working for myself.

First, clone the repository for the manual:

UberPC-V3:~$ pijul clone https://nest.pijul.com/pijul/manual
Segmentation fault

Oh my god. That's extremely funny. Oh fuck that's hilarious - I sent that to a friend and her reaction reminded me that Pijul is written in Rust. This VCS so profoundly doesn't work on my machine that it manages to segfault in a language that's supposed to make segfaults impossible. Presumably the segfault came from C code FFId with unsafe preconditions that weren't met, but still, that's just amazing.

Update 2021-02-24: One of the Pijul authors reached out to me to help debug things. Apparently mmap on WSL is just broken, which explains the segfault. They also pointed me towards the state of the art in getting Pijul to work on Windows, which I confirmed worked locally and then set up automated Windows builds using GitHub Actions. So if we have a working Pijul install, let's see if we can add that CI setup to the manual:

PS D:\Melody\Projects\misc> pijul clone https://nest.pijul.com/pijul/manual pijul-manual
✓ Updating remote changelist
✓ Applying changes       47/47
✓ Downloading changes    47/47
✓ Outputting repository

Hey, that actually works! We can throw in some text to the installation page (and more text to the getting started page) and then use pijul record to commit our changes. That pulls up Notepad as the default text editor, which fails to spark joy, but that's a papercut that's entirely understandable for alpha software not primarily developed on this OS. Instead of having "issues" and "pull requests" as two disjoint things, the Pijul Nest lets you add changes to any discussion, which I very much like. Once we've recorded our change and made a discussion on the repository, we can pijul push boringcactus@nest.pijul.com:pijul/manual --to-channel :34 and it'll attach the change we just made to discussion #34. (It appears to be having trouble finding my SSH keys or persisting known SSH hosts, which means I have to re-accept the fingerprint and re-enter my Nest password every time, but that's not the end of the world.)

So yeah, Pijul definitely still isn't production-ready, but it shows some real promise. That said, you can't be better than Git if you aren't production-ready. (Although you can totally be better than Git by having your own officially-blessed GitHub clone sorted out already.) (And maybe, with time, you can eventually be better than Git.)

what next?

None of the existing VCSes that I looked at were unreservedly better than Git, but they all had aspects that would help beat Git.

A tool which is actually better than Git should start by being no worse than Git:

allow importing existing Git repositories
don't require Git users to relearn every single thing - we already had to learn Git, we've been through enough

Then, to pick and choose the best parts of other VCSes, it should

have a UI that's better, or at least perceived as better, than Git's - ideally minimalism and intuitiveness will get you there, but user testing is gonna be the main thing
avoid opaque hashes as the primary identifier for things - r62 carries more meaning than 7c7bb33 - but not at the expense of features that are actually important
go beyond just source code, and cover issues, documentation wikis, and similar items, so that (for at least the easy cases) the entire state of the project is contained within version control
approach history as not just a linear sequence of facts but a story
offer hosting to other developers who want to use your VCS, so they don't have to figure that out themselves to get started in a robust way
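One way the "r62 instead of 7c7bb33" item could work without giving up hashes is to keep both and let either one name a revision. The sketch below is only that, a sketch: `RevisionLog` is hypothetical, the numbers would only be meaningful within one copy of the repository, and real revisions would hash far more than a parent hash plus some bytes.

```python
import hashlib


class RevisionLog:
    """Toy log giving each revision both a hash and a local number."""

    def __init__(self):
        self._hashes = []  # index i holds the hash of revision r(i+1)

    def commit(self, content: bytes) -> str:
        """Record a revision and return its friendly name, e.g. 'r3'."""
        parent = self._hashes[-1] if self._hashes else ""
        digest = hashlib.sha1(parent.encode("ascii") + content).hexdigest()
        self._hashes.append(digest)
        return f"r{len(self._hashes)}"

    def resolve(self, name: str) -> str:
        """Accept either 'r<N>' or a hash prefix and return the full hash."""
        if name.startswith("r") and name[1:].isdigit():
            number = int(name[1:])
            if not 1 <= number <= len(self._hashes):
                raise KeyError(f"no such revision: {name}")
            return self._hashes[number - 1]
        matches = [h for h in self._hashes if h.startswith(name)]
        if len(matches) != 1:
            raise KeyError(f"unknown or ambiguous revision: {name}")
        return matches[0]
```

The hash stays the real identity, so nothing important is lost; the number is just the name a human gets to say out loud.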

And just for kicks, a couple of extra features that nobody has but everybody should:

the CLI takes a back seat to the GUI (or TUI, I guess) - seeing the state gets easier that way, discovering features gets easier that way, teaching to people who aren't CLI-literate gets easier that way
contributor names & emails aren't immutable - trans people exist, and git filter-branch makes it about as difficult to change my name as the state of Colorado did
if you build in issue/wiki/whatever tracking, also build in CI in some way
avoid internal jargon - either say things in plain $LANG or develop a consistent and intuitive metaphor and use it literally everywhere
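For the mutable-names item, one possible shape is to have history point at a stable, opaque contributor ID and look the name up at display time. Everything in this sketch is hypothetical (`Commit`, `Contributors`, the ID format, the placeholder names); no existing VCS is being described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    author_id: str  # opaque and stable; never a name or an email


class Contributors:
    """Maps stable contributor IDs to whatever identity is current."""

    def __init__(self):
        self._current = {}

    def set_identity(self, author_id: str, name: str, email: str) -> None:
        # Overwrites the old entry; no commit has to be rewritten.
        self._current[author_id] = (name, email)

    def display(self, commit: Commit) -> str:
        name, email = self._current[commit.author_id]
        return f"{name} <{email}>: {commit.message}"


if __name__ == "__main__":
    people = Contributors()
    people.set_identity("contributor-1", "Old Name", "old@example.com")
    commit = Commit("fix typo", "contributor-1")
    people.set_identity("contributor-1", "New Name", "new@example.com")
    print(people.display(commit))
```

Because the commit never stored the name in the first place, changing it is a single update rather than a rewrite of every revision that ever mentioned it.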

I probably don't have the skills, and I certainly don't have the free time, to build an Actually Good VCS myself. But if you want to, here's what you're aiming for. Good luck. If you can pull it off, you'll be a hero. And if you can't, you'll be in good company.