Skip to content

The Intolerable Hypocrisy of Cyberlibertarianism

I like the Internet. I am old enough to remember the pre-Internet era and despite the younger generations pining for those simpler days, I was there. Paper maps were absolutely horrible, just you and a compass in your car on the side of the road in the middle of the night trying to figure out where you are and where you are going. Once when driving from Michigan to Florida I got so lost in the middle of the night in Kentucky that I had to pull over to sleep and wait for the sun so I could figure out where I was. I awoke to an old man staring unblinkingly into my car, shirtless, breathing heavy enough to fog the windows. To say I floored that 1991 Honda Civic is an understatement.

You would leave your house and then just disappear. This is presented as kind of romantic now, as if we were just free spirits on the wind and could stop and really watch a sunset. In practice it was mostly an annoying game of attempting to guess where people were. You'd call their job, they had left. You'd call their house, they weren't home yet. Presumably they were in transit but you actually had no idea. As a child my response to people asking me where my parents were was often a shrug as I resumed attempting to eat my weight in shoplifted candy or make homemade napalm with gasoline and styrofoam. Sometimes I shudder as a parent remembering how young I was putting pennies on train tracks and hiding dangerously close so that we could get the cool squished penny afterwards.

Cassettes are the worst way to listen to music ever invented. Tapes squealed. Tapes slowed down for no reason, like they were depressed. Multiple times in my life I would set off on a long road trip, pop in a tape, and within fifteen minutes watch as it shot from the deck unspooled like the guts from the tauntaun in Star Wars. You'd then spend forty-five minutes at a Sunoco trying to wind it back in with a Bic pen knowing in your heart you were performing CPR on a corpse. Then you'd put it back in the player out of pure stubbornness, and it would chew itself again immediately, and you'd drive the next six hours in silence with your own thoughts, which were not as good as Pearl Jam.

So I am, mostly, grateful for the bounty the internet has provided. But there is something wrong, deeply wrong, with what we built. The wrongness was there at the start. It was baked into the foundation by people who told themselves a story about freedom, and that story was a lie, and we are all, every one of us, paying their tab.

To understand what happened we need to go back to the 90s.

A Declaration of the Independence of Cyberspace

One of the first and most classic examples of the ideology that powered and continues to power tech is the classic "A Declaration of the Independence of Cyberspace" by John Perry Barlow written in 1996. You can find the full text here. I remember thinking it was genius when I first read it. I was young enough that I also thought "Snow Crash" was a serious political document. Today the Declaration reads like one of those sovereign citizen TikToks where someone in traffic court is claiming diplomatic immunity under maritime law.

It helps to know who Barlow was. Barlow was a Grateful Dead lyricist. He was also a Wyoming cattle rancher. He was also, briefly, the campaign manager for Dick Cheney's first run for Congress. (You did not misread that.) He spent his later years as a fixture at Davos, the World Economic Forum, where the very wealthy gather each January to remind each other that they are interesting. It was at Davos, in February 1996, fueled by champagne and grievance over the Telecommunications Act, that Barlow banged out the Declaration on a laptop and emailed it to a few hundred friends. From there it became, somehow, one of the founding documents of the modern internet.

These increasingly hostile and colonial measures place us in the same position as those previous lovers of freedom and self-determination who had to reject the authorities of distant, uninformed powers. We must declare our virtual selves immune to your sovereignty, even as we continue to consent to your rule over our bodies. We will spread ourselves across the Planet so that no one can arrest our thoughts.

Many of the pillars of "modern Internet" are here. Identity isn't a fixed concept based on government ID but is a more fluid concept. We don't need centralized control or really any form of control because those things are unnecessary. It was this and the famous earlier "Cyberspace and the American Dream: A Magna
Carta for the Knowledge Age" that laid a familiar foundation for a lot of the culture we now have. [link]

The Magna Carta is also our introduction to the (now familiar) creed of "catch up or get left behind". The adoption of new technology must be done at the absolute fastest speed possible with no regulations or checks. You don't need to worry about the consequences of technology because these problems correct themselves. If you told me the following was written two weeks ago by OpenAI I would have believed you.

If this analysis is correct, copyright and patent protection of knowledge (or at least many forms of it) may no longer be unnecessary. In fact, the marketplace may already be creating vehicles to compensate creators of customized knowledge outside the cumbersome copyright/patent process

The cumbersome copyright/patent process. Cumbersome to whom, exactly? This is always the move. The thing your industry would prefer not to deal with is reframed as an obsolete burden. Your refusal to do it is rebranded as innovation. Your inability to imagine a world where you don't get exactly what you want becomes a manifesto.

Winner Saw It Coming

So there are dozens of these pieces and they all read the same. If you don't regulate these technologies humanity will only benefit. Education, healthcare, industry, etc. We don't need regulations because the transformation from the medium of paper to digital has transformed the human spirit. But one was extremely surprising to me. Langdon Winner wrote something almost prophetic back in 1997. You can read it here.

He coins the term cyberlibertarianism (or at least is the first mention of it I could find) and then goes on to describe an almost eerily accurate set of events.

In this perspective, the dynamism of digital technology is our true destiny. There is no time to pause, reflect or ask for more influence in shaping these developments. Enormous feats of quick adaptation are required of all of us just to respond to the
requirements the new technology casts upon us each day. In the writings of cyberlibertarians those able to rise to the challenge are the champions of the coming millennium. The rest are fated to languish in the dust.
Characteristic of this way of thinking is a tendency to conflate
the activities of freedom seeking individuals with the operations
of enormous, profit seeking business firms. In the Magna Carta
for the Knowledge Age, concepts of rights, freedoms, access, and
ownership justified as appropriate to individuals are marshaled
to support the machinations of enormous transnational firms.
We must recognize, the manifesto argues, that "Government does
not own cyberspace, the people do." One might read this as a
suggestion that cyberspace is a commons in which people have
shared rights and responsibilities. But that is definitely not where
the writers carry their reasoning.

What "ownership by the people" means, the Magna Carta
insists, is simply "private ownership." And it eventually becomes
clear that the private entities they have in mind are actually large,
transnational business firms, especially those in communications.
Thus, after praising the market competition as the pathway to a
better society, the authors announce that some forms of compe-
tition are distinctly unwelcome. In fact, the writers fear that the
government will regulate in a way that requires cable companies
and phone companies to compete. Needed instead, they argue,
is the reduction of barriers to collaboration of already large firms,
a step that will encourage the creation of a huge, commercial,
interactive multimedia network as the formerly separate kinds of
communication merge.

In all he lays out 4 pillars of this ideology.

Technological determinism. The new technology is going to transform everything, it cannot be stopped, and your only job is to keep up. Stewart Brand's actual quote, which Winner pulls out and lets sit there like a body on display, is "Technology is rapidly accelerating and you have to keep up." There's no room to ask whether we want any of this. The wave is coming. Surf or drown.

It does not occur to anyone in this discourse that 'drown' is a choice the wave is making, not a natural law. Waves do not have intentions. Destroying your livelihood and leaving you to rot isn't a requirement of the natural order as much as that would convenient.

Radical individualism. The point of all this technology is personal liberation. Anything that gets in the way of the individual maximizing themselves be it government, regulation, social obligation, your annoying neighbors, is an obstacle to be removed. Winner notes, with what I imagine was a very dry expression, that the writers of the "Magna Carta for the Knowledge Age" cited Ayn Rand approvingly. In 1994. As intellectual grounding. For a document about computers.

There is something deeply funny about a movement claiming to invent the future and grounding its case in a Russian émigré's airport novels about steel barons in love with their own reflections.

Free-market absolutism. Specifically the Milton Friedman, Chicago School, supply-side flavor. The market will sort it out. Regulation is theft. Wealth is virtue. George Gilder, who co-wrote the Magna Carta, had previously written a book called Wealth and Poverty that helped sell Reaganomics to the masses. He then wrote Microcosm, which argued that microprocessors plus deregulated capitalism would liberate humanity. He was very serious about this.

Don't worry, Gilder is still out there. He loves the blockchain and crypto now. He now writes about how Bitcoin will save the soul of capitalism, which it is somehow doing while also destroying the planet. Both can be true in his cosmology. The ideology is flexible like that.

A fantasy of communitarian outcomes. This is the part that should make you laugh out loud. After establishing that government is bad, regulation is theft, and the individual is sovereign, the cyberlibertarians then promise that the result of all this will be... rich, decentralized, harmonious community life. Negroponte: "It can flatten organizations, globalize society, decentralize control, and help harmonize people." Democracy will flourish. The gap between rich and poor will close. The lion will lie down with the lamb, and the lamb will have a Pentium II.

We also have the advantage of hindsight and know, without question, that all of these predicted outcomes were wrong. Not 'directionally wrong' or 'wrong in the details.' Wrong the way it would be wrong to predict that if you set your kitchen on fire, the result will be a renovation.

You have to hold these four ideas in your head at the same time to see the trick. The cyberlibertarians wanted you to believe that radical individualism plus deregulated capitalism plus inevitable technology would produce communitarian utopia. This is, on its face, insane. It is the economic equivalent of claiming that if everyone punches each other really hard, eventually we'll all be hugging.

But Winner's sharpest observation, the one I keep coming back to, isn't about any of the four pillars individually. It's about the move underneath them. He writes:

"Characteristic of this way of thinking is a tendency to conflate the activities of freedom seeking individuals with the operations of enormous, profit seeking business firms."

This is the entire game. This is how "don't tread on me" becomes "Meta should be allowed to do whatever it wants." This is how the rights of the lone hacker working in their garage become indistinguishable from the rights of a multinational with a market cap larger than most countries' GDP. The Magna Carta literally argues that the government should reduce barriers to collaboration between cable companies and phone companies in the name of individual freedom and social equality. Winner caught this in 1997.

That is why obstructing such collaboration – in the cause of forcing a competition
between the cable and phone industries – is socially elitist. To the extent it prevents collaboration between the cable industry and the phone companies, present federal policy actually thwarts the Administration's own goals of access and empowerment.

What makes the essay uncomfortable to read now is that Winner wasn't even predicting the future. He was just describing what was already happening and noting where it would obviously lead. He saw the media mergers and asked the question nobody in the industry wanted to answer: what happened to the predicted collapse of large centralized structures in the age of electronic media? Where, exactly, did the decentralization go? He saw that the cyberlibertarians were going to deliver the opposite of everything they promised, and that they were going to keep getting paid to promise it anyway.

He was writing before Google. Before Facebook. Before the iPhone. Before YouTube. Before Twitter, Bitcoin, Uber, AirBnB, OpenAI, and the entire app economy. Before any of the actual examples that would eventually prove him right existed. He just looked at the people doing the talking, listened to what they were saying, and wrote down where it ended. It is not a long essay. He didn't need a long essay. The future was right there on the page, in their own words. He just had to read it back to them.

The essay closes with a question that has, to my knowledge, never been seriously answered by the industry it was aimed at:

"Are the practices, relationships and institutions affected by people's involvement with networked computing ones we wish to foster? Or are they ones we must try to modify or even oppose?"

Twenty-eight years later, the industry still treats this question as somewhere between naive and seditious. It's the question Barlow's declaration was specifically designed to make unaskable. And it remains, to this day, the only question that actually matters.

Caveat emptor

When you look at these early formative writings, so much of what we see now becomes clear. The cyberlibertarian deal was always the same: you're on your own. The industry would build the infrastructure, take the profits, and shove every consequence, every harm, every cost, every responsibility, onto somebody else.

There is no greater example to me than the moderator. Anyone who has ever moderated a forum or a subreddit knows that adding the word "cyber" to a space doesn't suddenly turn people into better humans. People are still people. They flame each other, they post slurs, they doxx, they harass, they spam, they post CSAM, they radicalize each other, they grief, they coordinate, they lie. A space with humans in it requires governance.

They produce, with frightening regularity, the exact behavior any kindergarten teacher could have predicted. Then they act surprised.

But the cyberlibertarian model required pretending it was unforeseeable. The platforms couldn't acknowledge that they needed governance because acknowledging it would mean acknowledging responsibility, and acknowledging responsibility would mean acknowledging liability, and acknowledging liability would mean the entire economic model collapses. So instead the industry invented a beautiful fiction: governance happens, but it happens by magic, performed by volunteers, for free, who we will simultaneously rely on and mock.

Reddit is run by unpaid moderators. Wikipedia is run by unpaid editors. Stack Overflow was run by unpaid experts and is now a ghost town. On TikTok and Twitter it is the unknowable "algorithm" that is the cause of and solution to every problem backed by capricious moderators who delight in stopping free speech. Unless you don't like it, then it's negligence moderation in defense of your enemies.

Open source is run by unpaid maintainers having nervous breakdowns. The platforms collect the rent. The people doing the actual work of making the platforms livable get nothing, and when they ask for anything like recognition, tools, basic protection from harassment, they're told they're power-tripping nerds who should touch grass.

This is also the crypto story, just with the masks off. What if we made worse money on purpose, money that bypassed every protection consumers had won over the previous century, money that couldn't be reversed when stolen, money that funded ransomware attacks on hospitals and pump-and-dumps targeting people's retirement accounts? The cyberlibertarian answer was: that's freedom. The losses were real. People killed themselves. Hospitals had to turn away patients. The architects became billionaires and bought yachts and now sit on the boards of AI companies, where they are reinventing the same con with a new vocabulary.

Now Winner got one thing wrong, and it's worth pausing on, because it's the most interesting wrinkle in all of this. What actually happened was weirder and worse. The cyberlibertarians became the corporations. They didn't sell out. They didn't betray their principles for the first offer of money. They simply scaled until their principles became inconvenient, and then they stopped mentioning them.

Once the platforms got large enough to be unstoppable, once they captured enough of the regulatory apparatus to write their own rules, the libertarian rhetoric got quietly shelved like a college poster you took down before your in-laws came over. Meta no longer pretends it stands for free speech and seemingly takes delight in putting its thumb on the scale. TikTok users have invented an entire euphemistic shadow language to evade automated censorship like "unalive," "le dollar bean," "graped" that would have made 1996 Barlow weep into his bolo tie.

Copyright and patents matter when they're Apple's copyright and patents. Or Googles. Or OpenAIs. Go try to make a Facebook+ website and see how quickly Meta is capable of responding to content it finds objectionable.

Cyberlibertarianism was the ladder. Once they were on the roof, they kicked it away and started charging admission to look at the view.

So the Internet is Doomed?

Remember I like the Internet. I said it in the beginning and it is still true. I love the Fediverse, I love weird Discords about small tabletop RPGs I'm in. I spend hours in the Mister FPGA forums. There are corners that are good. But they're mostly good because they're not big enough to be worth breaking up.

It feels increasingly like I'm hanging out in the old neighborhood dive bar after most of the regulars have moved away. The lighting is the same. The bartender remembers your order. But you can hear yourself think now, and that's mostly because the room is half empty and the jukebox finally died. The new clientele is from out of town. They are taking pictures of the menu.

If we want to have a serious conversation about why we are in the situation we're in, it is no longer possible to pretend that the broken ideology that put us on this trajectory is still somehow compatible with the harsh realities that surround us. It is not clear to me if democracy can survive a deregulated Internet. A deregulated Internet filled with LLMs that can perfectly impersonate human beings powered by unregulated corporations with zero ethical guidelines seems like a somewhat obvious problem. Like an episode of Star Trek where you the viewer are like "well clearly the Zorkians can't keep the Killbots as pets." It doesn't take some giant intellect to see the pretty fucking obvious problem.

If we want to save the parts of the internet worth saving, we have to evolve. We have to find some sort of ethical code that says: just because I can do something and it makes money, that is not sufficient justification to unleash it on the world. Or, more simply: just because I want to do something and you cannot actively stop me, that does not make doing it a good idea. We have waited thirty years for the cyberlibertarian future to arrive and produce the promised harmonious community. It's time to face the facts. It's never coming. The bus left in 1996. The bus was never real.

People did not get better because they went online. Giving everyone access to a raw, unfiltered pipeline of every fact and lie ever produced did not turn them into better-educated people. It broke them. It allowed them to choose the reality they now inhabit, like ordering off a menu. If I want to believe the world is flat, TikTok will gladly serve me that content all day. Meta will recommend supportive groups. There will be hashtags. There will be Discords. There will be a guy named Trent who runs a podcast. I will never have to face the deeply uncomfortable possibility that I might be wrong about anything, ever, until the day I die, surrounded by people who agree with me about everything, including which of the other mourners are secretly lizards.

That is the internet we built. It was not an accident. It was the product of a specific ideology, written down by specific people, at a specific cocktail party in Davos, in 1996. Winner watched it happen and told us where it was going. We did not listen. There is still time, maybe, to start.


If I Could Make My Own GitHub

My friend and I have a game where we talk about what we'd do if we were rich. Not rich like 'paid off the mortgage' rich. Rich like a man who owns a submarine he's never been inside. Rich like a man whose third wife has a skincare line. Tech-titan rich — the kind of money that buys you a compound in Wyoming and the confidence to wear the same gray t-shirt to congressional testimony."

One of mine, for a long time, has been the dream of making a new forge. I was prompted to write this after reading the good post about Ghostty leaving GitHub but it's something I've written and talked about for a few years. Given how bad GitHub has become at its core job, it seemed like a fun opportunity to try and write up what my billionaire folly of a forge would look like. This folly would have less penile rockets filled with aging celebrities.

What are the problems with modern forges?

GitHub, GitLab and Gitea (those being the 3 I've used the most) are all modeled on effectively the same design. There are differences, but you can tell that GitHub sets the pattern for the industry and then those features are ported over to the other two with varying levels of success. Almost nothing in the forge has any relationship to git so the client has become completely disconnected from the practical usage.

Git is great at what it is designed to do, but what it is designed to do isn't the way most people are using it. Git is a perfect tool for kernel development. It is a decentralized distributed version control system that relies on the idea of patches being sent to maintainers over email. You trust those maintainers to maintain their sections and merge in the stuff that makes sense and not merge in the other stuff. It's a pretty high trust environment that places very few restrictions on how online a specific contributor is or what system they are using. If you have a laptop from 2010 that connects to the Internet once a week you can still be a meaningful contributor to a project with these workflows. .

However, in most jobs, git is effectively just the way I pull and push from a centralized repository stored in a forge. All the important stuff happens inside the forge, and very little of it happens on my client. Pull Requests are how I enforce the four-eyes principle, GitHub Actions are how I run my tests and linting on those Pull Requests to ensure they are functional and meet my organizational requirements, the user's identity in relation to that forge is how I verify who they are. I track issues with my code through Issues and cut releases for users to download through Releases. There's not a lot of git in this workflow, this is mostly placed on top of git.

So if we sort of don't care about what the client is really doing then there's no reason to be bound to git, since we're effectively free from that constraint. If we're gonna make a new forge I'd want it to do the following, ideally designed to work with the client to expose this stuff better.

  1. Stuff happens in the wrong order. You know the PR. Commit 1: 'Feature.' Commit 2: 'fix.' Commit 3: 'fix.' Commit 4: 'actually fix.' Commit 5: 'please.' Commit 6, made at 11:47 PM on a Thursday: 'asdfasdf'. This person has a family. This person has hobbies. This person is, at this moment, crying. You don't want the feedback loop after the commit you want it before. Let me do an enforced pre-commit hook to run the jobs remotely on the forge and provide the feedback to the user before they push.
  2. PR approval is too boolean. The PR is approved or it's not approved. Real code review, like real life, lives in the middle. 'Sure, fine, we'll deal with it later' is a legitimate human response and should be a legitimate button. Gerrit has a better model for this. If I weakly approve something as a maintainer, let me flag it for later.
  3. PRs are too inflexible. I don't need 4 eyes on every change, especially in a universe where LLMs exist. The global GDP lost annually to senior engineers staring at a four-line PR waiting for someone — anyone — to type 'LGTM' could fund a moon mission. A nice one. With legroom. Let me customize and more easily control this. If the person is a maintainer and the LLM says its low risk/no risk just let them go.
  4. Stacked PRs are just better. They're easier to review and understand. They have to be a first-class citizen not an add-on through a tool other than your VCS.
  5. A forge shouldn't do everything. Issue tracking yes. Kanban board, probably not. Wiki? I doubt it. Everything tools always turn into crap. You add features when its easy to add features and then pay the maintenance price for those features forever regardless of their rate of adoption because now someone, somewhere uses them and you are locked in.
  6. The standard unit of hosting is too large. Running Github Enterprise is a big task. Running GitLab is also a relatively big ask. These are complicated products with a lot of moving pieces. I want smaller individual units of hosting that I can link together to make an organization. It's fine if they're not globally federated and I need to make an account for each Organization, but an Organization should be flexible enough to let me say "these 12 Raspberry Pis are my org". I don't know how they communicate securely, I hire people for that problem.
  7. My local copy of the repo should be a representation of the entire repo, not just the code. I should be able to approve a PR from the same VCS I use to check in the code. I should be able to go through my issues by looking through local files.
  8. On the flip side, since I need to be online all the time to really work with a team, don't make me pay the storage price all the time. I want my VCS and my forge to work together. If I clone a repo, I want a pretty limited history for that repo when I clone. If I start to go back in time, spin up a worker to go fetch that stuff from the VCS when I need it. I don't need to hammer the forge with giant clone requests on the assumption that I might need to rebuild the forge at any moment with the entire history of the entire project.
  9. Actions need to be signed, SHA'd and usable offline. If I want, I should be able to get tarballs of all my actions, stick them in the repo and tell my system "don't go anywhere for checkout action, that's right here". If I say latest, have that work like Dependabot does now where it opens up a PR to put the latest tarballs in my repo. Actions are critical and they should be runnable on my local machine through the same VCS if I want to.

Well Y Does Some Of That

Absolutely. There are a lot of tools that do parts of this. I want someone to take them, put them all together and fit them up. I want JJ as the VCS, I want this as the forge and I want the expectation that I as a user could live happily with a raspberry pi as a forge for a long time. I want those forges designed around modern concepts like object storage and shallow clones and getting constantly hammered by LLM bots.

Now in a universe where GitHub was doing a good job, I wouldn't even bother writing this up. GitHub is the default and talking to people about overcoming the default is usually a waste of time. Heinz is the default ketchup, when I order a Coke I don't want a Pepsi and if I'm going to use a forge up until 2026 there would have to be an amazing reason for me to not choose the one that everyone uses. Up until recently other forges have been like sweet potato french fries, which is to say never the thing you actually want.

But we live in a world where the monolithic forge is breaking down and nobody has built the replacement. The people with the money are busy with the rockets. The people with the taste are busy with their day jobs. And the rest of us are opening PRs titled 'asdfasdf' at midnight, waiting for a robot to check them, wondering when the tool we spend our whole working lives inside stopped being built for us.

If I ever get the submarine money, I'll let you know.


Why Don't I Like Git More?

Why Don't I Like Git More?

I've been working with git now full-time for around a decade now. I use it every day, relying on the command-line version primarily. I've read a book, watched talks, practiced with it and in general use it effectively to get my job done. I even have a custom collection of hooks I install in new repos to help me stay on the happy path. I should like it, based on mere exposure effect alone. I don't.

I don't feel like I can always "control" what git is going to do, with commands sometimes resulting in unexpected behavior that is consistent with the way git works but doesn't track with how I think it should work. Instead, I need to keep a lot in my mind to get it to do what I want. "Alright, I want to move unstaged edits to a new branch. If the branch doesn't exist, I want to use checkout, but if it does exist I need to stash, checkout and then stash pop." "Now if the problem is that I made changes on the wrong branch, I want stash apply and not stash pop." "I need to bring in some cross-repo dependencies. Do I want submodules or subtree?"

I need to always deeply understand the difference between reset, revert, checkout, clone, pull, fetch, cherrypick when I'm working even though some of those words mean the same thing in English. You need to remember that push and pull aren't opposites despite the name. When it comes to merging, you need to think through the logic of when you want rebase vs merge vs merge --squash. What is the direction of the merge? Shit, I accidentally deleted a file awhile ago. I need to remember git rev-list -n 1 HEAD – filename. Maybe I deleted a file and immediately realized it but accidentally committed it. git reset --hard HEAD~1 will fix my mistake but I need to remember what specifically --hard does when you use it and make sure it's the right flag to pass.

Nobody is saying this is impossible and clearly git works for millions of people around the world, but can we be honest for a second and acknowledge that this is massive overkill for the workflow I use at almost every job which looks as follows:

  • Make a branch
  • Push branch to remote
  • Do work on branch and then make a Pull Request
  • Merge PR, typically with a squash and merge cause it is easier to read
  • Let CI/CD do its thing.

I've never emailed a patch or restored a repo from my local copy. I don't spend weeks working offline only to attempt to merge a giant branch. We don't let repos get larger than 1-2 GB because then they become difficult to work with when I just need to change like three files and make a PR. None of the typical workflow benefits from the complexity of git.

More specifically, it doesn't work offline. It relies on merge controls that aren't even a part of git with Pull Requests. Most of that distributed history gets thrown away when I do a squash. I don't gain anything with my local disk being cluttered up with out-of-date repos I need to update before I start working anyway.

Now someone saying "I don't like how git works" is sort of like complaining about PHP in terms of being a new and novel perspective. Let me lay out what I think would be the perfect VCS and explore if I can get anywhere close to there with anything on the market.

Gitlite

What do I think a VCS needs (and doesn't need) to replace git for 95% of usecases.

  • Dump the decentralized model. I work with tons of repos, everyone works with tons of repos, I need to hit the server all the time to do my work anyway. The complexity of decentralized doesn't pay off and I'd rather be able to do the next section and lose it. If GitHub is down today I can't deploy anyway so I might as well embrace the server requirement as a perk.
  • Move a lot of the work server-side and on-demand. I wanna search for something in a repo. Instead of copying everything from the repo, running the search locally and then just accepting that it might be out of date, run it on the server and tell me what files I want. Then let me ask for just those files on-demand instead of copying everything.
  • I want big repos and I don't want to copy the entire thing to my disk. Just give me the stuff I want when I request it and then leave the rest of it up there. Why am I constantly pulling down hundreds of files when I work with like 3 of them.
  • Pull Request as a first-class citizen. We have the concept of branches and we've all adopted the idea of checks that a branch must pass before it can be merged. Let's make that a part of the CLI flow. How great would be it to be able to, inside the same tool, ask the server to "dry-run" a PR check and see if my branch passes? Imagine taking the functionality of the gh CLI and not making it platform specific ala kubectl with different hosted Kubernetes providers.
  • Endorsing and simplifying the idea of cross-repo dependencies. submodules don't work the way anybody wants them to. subtree does but taking work and pushing it back to the upstream dependency is confusing and painful to explain to people. Instead I want something like: https://gitmodules.com/
    • My server keeps it in sync with the remote server if I'm pulling from a remote server but I can pin the version in my repo.
    • My changes in my repo go to the remote dependency if I have permission
    • If there are conflicts they are resolved through a PR.
  • Build in better visualization tools. Let me kick out to a browser or whatever to more graphically explore what I'm looking at here. A lot of people use the CLI + a GUI tool to do this with git and it seems like something we could roll into one step.
  • Easier centralized commit message and other etiquette enforcement. Yes I can distribute a bunch of git hooks but it would be nice if when you cloned the repo you got all the checks to make sure that you are doing things the right way before you wasted a bunch of time only to get caught by the CI linter or commit message format checker. I'd also love some prompts like "hey this branch is getting pretty big" or "every commit must be of a type fix/feat/docs/style/test/ci whatever".
  • Read replica concept. I'd love to be able to point my CI/CD systems at a read replica box and preserve my primary VCS box for actual users. Primary server fires a webhook that triggers a build with a tag, hits the read replica which knows to pull from the primary if it doesn't have that tag. Be even more amazing if we could do some sort of primary/secondary model where I can set both in the config and if primary (cloud provider) is down I can keep pushing stuff up to somewhere that is backed up.

So I tried out a few competitors to see "is there any system moving more towards this direction".

SVN in 2024

My first introduction to version control was SVN (Subversion), which was pitched to me as "don't try to make a branch until you've worked here a year". However getting SVN to work as a newbie was extremely easy because it doesn't do much. Add, delete, copy, move, mkdir, status, diff, update, commit, log, revert, update -r, co -r were pretty much all the commands you needed to get rolling. Subversion has a very simple mental model of how it works which also assists with getting you started. It's effectively "we copied stuff to a file server and back to your laptop when you ask us to".

I have to say though, svn is a much nicer experience than I remember. A lot of the rough edges seem to have been sanded down and I didn't hit any of the old issues I used to. Huge props to the Subversion team for delivering great work.

Subversion Basics

Effectively your Subversion client commits all your files as a single atomic transaction to the central server as the basic function. Whenever that happens, it creates a new version of the whole project, called a revision. This isn't a hash, it's just a number starting at zero, so there's no confusion as a new user what is the "newer" or "older" thing. These are global numbers, not tied to a file, so it's the state of the world. For each individual file there are 4 states it can be in:

  • Unchanged locally + current remote: leave it alone
  • Locally changed + current remote: to publish the change you need to commit it, an update will do nothing
  • Unchanged locally + out of date remotely: svn update will merge the latest copy into your working copy
  • Locally changed + out of date remotely: svn commit won't work, svn update will try to resolve the problem but if it can't then the user will need to figure out what to do.

It's nearly impossible to "break" SVN because pushing up doesn't mean you are pulling down. This means different files and directories can be set to different revisions, but only when you run svn update does the whole world true itself up to the latest revision.

Working with SVN looks as follows:

  • Ensure you are on the network
  • Run svn update to get your working copy up to latest
  • Make the changes you need, remembering not to use OS tooling to move or delete files and instead use svn copy and svn move so it knows about the changes.
  • Run svn diff to make sure you want to do what you are talking about doing
  • Run svn update again, resolve conflicts with svn resolve
  • Feeling good? Hit svn commit and you are done.

Why did SVN get dumped then? One word: branches.

SVN Branches

In SVN a branch is really just a directory that you stick into where you are working. Typically you do it as a remote copy and then start working with it, so it looks more like you are copying the URL path to a new URL path. But to users they just look like normal directories in the repository that you've made. Before SVN 1.4 merging a branch required a masters degree and a steady hand, but they added an svn merge which made it a bit easier.

Practically you are using svn merge against the main to keep your branch in sync and then when you are ready to go, you run svn merge --reintegrate to push the branch to master. Then you can delete the branch, but if you need to read the log the URL of the branch will always work to read the log of. This was particularly nice with ticket systems where the URL was just the ticket number. But you don't need to clutter things up forever with random directories.

In short a lot of the things that used to be wrong with svn branches aren't anymore.

What's wrong with it?

So SVN breaks down IME when it comes to automation. You need to make it all yourself. While you do you have nuanced access control over different parts of a repo, in practice this wasn't often valuable. What you don't have is the ability to block someone from merging in a branch without some sort of additional controls or check. It also can place a lot of burden on the SVN server since nobody seems to ever update them even when you add a lot more employees.

Also the UIs are dated and the entire tooling ecosystem has started to rot from users leaving. I don't know if I could realistically recommend someone jump from git to svn right now, but I do think it has a lot of good ideas that moves us closer to what I want. It would just need a tremendous amount of UI/UX investment in terms of web to get it to where I would like using it over git. But I think if someone was interested in that work, the fundamental "bones" of Subversion are good.

Sapling

One thing I've heard from every former Meta engineer I've worked with is how much they miss their VCS. Sapling is that team letting us play around with a lot of those pieces, adopted for a more GitHub-centric world. I've been using it for my own personal stuff for a few months and have really come to fall in love with it. It feels like Sapling is specifically designed to be easy to understand, which is a delightful change.

A lot of the stuff is the same. You clone with sl clone, you check the status with sl status and you commit with sl commit. The differences that immediately stick out are the concept of stacks and the concept of the smartlog. So stacks are "collections of commits" and the idea is that from the command line I can issue PRs for those changes with sl pr submit with each GitHub PR being one of the commits. This view (obviously) is cluttered and annoying, so there's another tool that helps you see the changes correctly which is ReviewStack.

None of this makes a lot of sense unless I show you what I'm talking about. I made a new repo and I'm adding files to it. First I check the status:

❯ sl st
? Dockerfile
? all_functions.py
? get-roles.sh
? gunicorn.sh
? main.py
? requirements.in
? requirements.txt

Then I add the files:

sl add .
adding Dockerfile
adding all_functions.py
adding get-roles.sh
adding gunicorn.sh
adding main.py
adding requirements.in
adding requirements.txt

If I want a nicer web UI running locally, I run sl web and get this:

So I added all those files in as one Initial Commit. Great, let's add some more.

❯ sl
@  5a23c603a  4 seconds ago  mathew.duggan
│  feat: adding the exceptions handler
│
o  2652cf416  17 seconds ago  mathew.duggan
│  feat: adding auth
│
o  2f5b8ee0c  9 minutes ago  mathew.duggan
   Initial Commit

Now if I want to navigate this stack, I can just use sl prev which moves me up and down the stack:

sl prev 1
0 files updated, 0 files merged, 1 files removed, 0 files unresolved
[2f5b8e] Initial Commit

And that is also represented in my sl output

❯ sl
o  5a23c603a  108 seconds ago  mathew.duggan
│  feat: adding the exceptions handler
│
o  2652cf416  2 minutes ago  mathew.duggan
│  feat: adding auth
│
@  2f5b8ee0c  11 minutes ago  mathew.duggan
   Initial Commit

This also shows up in my local web UI

Finally the flow ends with sl pr to create Pull Requests. They are GitHub Pull Requests but they don't look like normal GitHub pull requests and you don't want to review them the same way. The tool you want to use for this is ReviewStack.

I stole their GIF because it does a good job

Why I like it

Sapling lines up with what I expect a VCS to do. It's easier to see what is going on, it's designed to work with a large team and it surfaces the information I want in a way that makes more sense. The commands make more sense to me and I've never found myself unable to do something I needed to do.

More specifically I like throwing away the idea of branches. What I have is a collection of commits that fork off from the main line of development, but I don't have a distinct thing I want named that I'm asking you to add. I want to take the main line of work and add a stack of commits to it and then I want someone to look at that collection of commits and make sure it makes sense and then run automated checks against it. The "branch" concept doesn't do anything for me and ends up being something I delete anyway.

I also like that it's much easier to undo work. This is something where I feel like git makes it really difficult to handle and uncommit, unamend, unhide, and undo in Sapling just work better for me and always seem to result in the behavior that I expected. Losing the staging area and focusing more on easy to use commands is a more logical design.

Why you shouldn't switch

If I love Sapling so much, what's the problem? So to get Sapling to the place I actually want it to be, I need more of the Meta special sauce running. Sapling works pretty well on top of GitHub, but what I'd love is to get:

These seem to be the pieces to get all the goodness of this system as outlined below

  • On-demand historical file fetching (remotefilelog, 2013)
  • File system monitor for faster working copy status (watchman, 2014)
  • In-repo sparse profile to shrink working copy (2015)
  • Limit references to exchange (selective pull, 2016)
  • On-demand historical tree fetching (2017)
  • Incremental updates to working copy state (treestate, 2017)
  • New server infrastructure for push throughput and faster indexes (Mononoke, 2017)
  • Virtualized working copy for on-demand currently checked out file or tree fetching (EdenFS, 2018)
  • Faster commit graph algorithms (segmented changelog, 2020)
  • On-demand commit fetching (2021)

I'd love to try all of this together (and since there is source code for a lot of it, I am working on trying to get it started) but so far I don't think I've been able to see the full Sapling experience. All these pieces together would provide a really interesting argument for transitioning to Sapling but without them I'm really tacking a lot of custom workflow on top of GitHub. I think I could pitch migrating wholesale from GitHub to something else, but Meta would need to release more of these pieces in an easier to consume fashion.

Scalar

Alright so until Facebook decided to release the entire package end to end, Sapling exists as a great stack on top of GitHub but not something I could (realistically) see migrating a team to. Can I make git work more the way I want to? Or at least can I make it less of a pain to manage all the individual files?

Microsoft has a tool that does this, VFS for Git, but it's Windows only so that does nothing for me. However they also offer a cross-platform tool called Scalar that is designed to "enable working with large repos at scale". It was originally a Microsoft technology and was eventually moved to git proper, so maybe it'll do what I want.

What scalar does is effectively set all the most modern git options for working with a large repo. So this is the built-in file-system monitor, multi-pack index, commit graphs, scheduled background maintenance, partial cloning, and clone mode sparse-checkout.

So what are these things?

  • The file system monitor is FSMonitor, a daemon that tracks changes to files and directories from the OS and adds them to a queue. That means git status doesn't need to query every file in the repo to find changes.
  • Take the git pack directory with a pack file and break it into multiples.
  • Commit graphs which from the docs:
    • " The commit-graph file stores the commit graph structure along with some extra metadata to speed up graph walks. By listing commit OIDs in lexicographic order, we can identify an integer position for each commit and refer to the parents of a commit using those integer positions. We use binary search to find initial commits and then use the integer positions for fast lookups during the walk."
  • Finally clone mode sparse-checkout. This allows people to limit their working directory to specific files

The purpose of this tool is to create an easy-mode for dealing with large monorepos, with an eye towards monorepos that are actually a collection of microservices. Ok but does it do what I want?

Why I like it

Well it's already built into git which is great and it is incredibly easy to use and get started with. Also it does some of what I want. Taking a bunch of existing repos and creating one giant monorepo, the performance was surprisingly good. The sparse-checkout means I get to designate what I care about and what I don't and also solves the problem of "what if I have a giant directory of binaries that I don't want people to worry about" since it follows the same pattern matching as .gitignore

Now what it doesn't do is radically change what git is. You could grow a repo to much much larger with these defaults set, but it's still handling a lot of things locally and requiring me to do the work. However I will say it makes a lot of my complaints go away. Combined with the gh CLI tool for PRs and I can cobble together a reasonably good workflow that I really like.

So while this is definitely the pattern I'm going to be adopting from now on (monorepo full of microservices where I manage scale with scalar), I think it represents how far you can modify git as an existing platform. This is the best possible option today but it still doesn't get me to where I want to be. It is closer though.

You can try it out yourself: https://git-scm.com/docs/scalar

Conclusion

So where does this leave us? Honestly, I could write another 5000 words on this stuff. It feels like as a field we get maddeningly close to cracking this code and then give up because we hit upon a solution that is mostly good enough. As workflows have continued to evolve, we haven't come back to touch this third rail of application design.

Why? I think the people not satisfied with git are told that is a result of them not understanding it. It creates a feeling that if you aren't clicking with the tool, then the deficiency is with you and not with the tool. I also think programmers love decentralized designs because it encourages the (somewhat) false hope of portability. Yes I am entirely reliant on GitHub actions, Pull Requests, GitHub access control, SSO, secrets and releases but in a pinch I could move the actual repo itself to a different provider.

Hopefully someone decides to take another run at this problem. I don't feel like we're close to done and it seems like, from playing around with all these, that there is a lot of low-hanging optimization fruit that anyone could grab. I think the primary blocker would be you'd need to leave git behind and migrate to a totally different structure, which might be too much for us. I'll keep hoping it's not though.

Corrections/suggestions: https://c.im/@matdevdug