Dear Executive Suite,

You may think that you don't need to know what "distributed version control systems" (DVCS) means... but you would be wrong to do so.

Human collaboration is the most critical product of a business whether it be high strategic imperatives or intricacies of interpersonal negotiating and tactics. Surprisingly, the nuts and bolts logistics of how we collaborate, whether it be video chat, meetings, Word Doc, Email threads or Google Doc, play massive roles in shaping the quality and speed of how our teams collaborate.

These same "communication logistics" are also core to the way software engineers work and the good news is that software engineers have been working on improving these systems for decades. Many of the basic concepts of collaboration have already been adopted by business at large, but the biggest, most important shift we've experienced is just at the point where it will start to cross the chasm from tech to business at large. The seismic shift that Git an GitHub have had on the logistics of software engineering & Open Source development in the past 5 years presages a dramatic revolution in the way successful companies are going to operate in the next decade. By understanding the revolution that has recently occurred in DVCS you will be able to get a glimpse of how your business should be operating in 2025.

The essential insight is that by relinquishing a bit of control over process and focussing on the atomization of good ideas, we're starting to move away from 'the cathedral' and fulfill the promise of 'the bazaar'. This essay will seek to explain the principles of DVCS and give non-technical readers a sense of how this has revitalized open source development and what that means for the future of business.

The Evolution of Collaboration: In the Beginning


In the beginning, there was no version control. This is probably the system that you grew up on as well. Files were either on your local machine or on the server. If they were local, you needed to send them out as attachments when you wanted others to see them. If you sent it to two people and they each edited it and send it back you were left to reconcile the changes yourself, and send out the revised edition. Alternatively the document may have be on a server in which case everyone can edit the same file, but woe be to those who tried to edit it at the same time, for it was easy to wipe out someone's changes. Worse, there was no real way to go back. There was no long term undo. If Mary & John open the file at the same time, Mary writes 3 pages and John fixes a typo, but Mary saves before John, then John wipes out everything Mary's done.

The "solution" here was to create "read-only" documents and or "lock" documents while they were being edited. While this works, it took all parallelism out of your process, reducing efficiency and limiting your ability to scale.

The business analog to this is when you had a single author of your new RFP or 2015 Strategy Document. Sure, you could incorporate small feedback, or even perform massive re-writes of sections, but the cost of doing this was high. Eventually inertia wins and everyone just wants the damn thing to be finished, resulting in a subpar product and a stifling of great ideas, since a late-breaking great idea will require enormous logistic effort.

And then there was light.


For our purposes, CVS can be thought of as the first version control system. CVS is a pretty straightforward system where the server is made to save a snapshot of each file every time it is saved. It is easy to go back in time with this system as every version of every file is saved forever. It is also simple to compare what changed between version 1.9 and 1.14 or in a file between two dates.

For many years this served software developers well. We could all get the canonical version of files from one place and we could easily bring our files up to date. We could then work away locally and once finished we could commit things back to the central repository. When workflows happen in parallel it will always be possible to conflict with other's work, but these conflicts were now readily apparent and nice interfaces grew up around CVS to allow us to see the differences and perform a merge in an easy and sane way. If your company is using Google Docs or Sharepoint, this is about where you are now. It seems pretty great doesn't it?  Anyone who has collaborated on a Google Doc can attest to the almost infinite superiority of the collaboration in comparison to "the-bad-old-way".

But not everyone was happy.

In fact, one of the more esteemed members of our community, Linus Torvalds (creator of Linux) simply hated version control of the sorts we've discussed so far and refused to use them. The reliance on a monolithic centralized repository was 'stupid (and ugly)' to paraphrase his thoughts. Most of us mere mortals were still overjoyed not to be overwriting each other's changes accidentally and delighted that there was ONE central place to get our information. So what were we missing? Why did Linus Torvald hate source control?

Branching


Branching is what happens when a user wants to pursue an idea that is not directly reconcilable with the mainstream. As an example, let's pretend that you're tasked with doing a comprehensive, interconnected rewrite of all the companies procedures. It is going to be long and difficult process. It will take weeks, and will be in a half baked and not ready for prime-time state for much of that time. That means we can't just commit it to the central repository, because then the unbaked efforts would be all mixed in to the current procedures. But unfortunately that was the only way we had to collaborate with others...

This is a great time for a branch. A branch is basically an extra sandbox for working on a side project, which we hope will be eventually merged back into the trunk, or master of our system.

Branching was possible in CVS, but it wasn't simple (and indeed is not a feature of SharePoint or Google Docs). Because it was possible however, many of us didn't realize just what we were missing when Linus and friends came down from the mountain top claiming that they had made branching a million times better. It sounded ok, but not revolutionary. But we were wrong.

Easy branching frees your mind to think creatively, because there's far less pain to taking a thought and expounding on it in a new branch. This happens in two forms, local branches and distributed branches and this is where things get interesting.

Local branches


I have a great idea in the middle of the night to change our marketing and communications theme. Awake at 3 in the morning, I create a local branch and get the thoughts out, boldly cutting out projects, rewording the slogan and changing the branding.  The next day I tweak it some more. On the third day I come to my senses and realize that it's terrible and I'm able to delete the branch without anyone knowing about my wild idea. Reputation saved. Creativity fostered.

Distributed branches


Alternatively, on day three my wild idea might seem like it's still got legs, but I'd like to get feedback on it. I can easily push the local branch out to colleagues who I'd like to take a look. This branch has now become a distributed project, de-centralized from the main line of day to day operations. If a co worker likes what they see, but wants to take things in another direction, they can easily branch this branch. In fact there is NO center. That 'main central branch'  I told you about? That's simply a convention. No branch is inherently better than any other. Readers of 'The Cathedral and the Bazaar' will recognize that we are firmly in bazaar territory now.

Trust > Authority


Without a distributed system, the normal way of controlling the system is through something we call Access Control Lists. These are pretty straightforward tools which stores a list of permissions for every user eg "Bob can edit Sales Documents". While they're conceptually simple, it may be fair to say that they're the single underlying failure behind innumerable business inefficiencies and Open Source to-date. Access control is a crude instrument. A user can change something or they can't. And therein lies the trouble. What if an employee spots a typo on the HR docs? Should they have write access to the centralized system? Certainly not. Should they be forced to find the appropriate change request form and file it with their manager? Certainly not. If they can fix it, they should be able to go ahead and do that right then. An access control list simply can't be fine grained enough to allow spelling fixes, but not policy changes. Distributed systems fix this problem dead. I find an error and change it. I don't have access, so I simply request a 'pull' to the maintainer, who can easily see that this is a simple change and merge it in.

This is a small example, but it underlies everything. When we rely on trust (with transparency) we gain huge efficiencies over simple coarse authority based systems.

The best and brightest: Management of a Distributed System


This may sound like chaos. Amazingly it isn't. While it would seem that without a central arbiter many projects would fall into a chaos of conflicting ideas, in fact, the nature of the process seems to have led to fewer fractures in the software community, not more. I would argue that this is because distributed systems make the process more transparent and meritocratic. If you have a good idea you can create a branch to show it to the world, which can then collaborate on it. When it is difficult to reconcile and merge ideas, creating change within an organization requires dramatic action. Think about the creation of Czar's, skunk works & splinter groups. A process that requires end-runs like these is an admission of a process that is failing to allow creativity.

The fact that ad hoc 'skunk works' are a mode of failure is not a new concept, but the central insight of the DVCS alternative is that we need not create a robust 'structure of change' or a 'continual improvement' system, but rather that we ought make everything a skunk work. With distributed systems, everyone's work is reduced to splinters, but the operating principle is that the best splinters should be picked up and merged back into the trunk.

On the Linux project, the master branch, the source of authority for all, is simply Linus Torvald's master branch. He relies on a number of lieutenants to pick up the best splinters from their subdomain. The master branch become more of an all-star team of ideas then a singular overarching concept.

Please do not make the mistake of thinking that software is somehow a special case for this sort of collaboration. It may seem crazy to think that meaningful coordination could be achieved if your corporation were to throw open the gates to this sort of bazaar-like collaboration, but consider that software engineering is the very epitome of an instance in which all parts must work seamlessly together. A miscue between sales & marketing may result in sub-optimal performance, but even a single byte of miscue between software components can easily spell a total system failure and this is the ancestral environment from which the bazaar comes.

The CEO of the future will perform much the same roll that Linus does for Linux. He is the final quality checker and reviewer of branches as they work their way into the corporate master. He relies on the community create nuggets of insight and brilliance. He trusts his lieutenants to find these gems and elevate them to his attention.


Take away: What we've learned. 


  • Lowering the barrier to experiment expands creativity. 
  • Allow experiments to see the light of day and they will accrete adherents & spur still other ideas into existence. 
  • The key to reaping benefits from the splinters of ideas the ability to merge piecemeal concepts efficiently.
  • Authority based systems are poor at integrating insights from the Bazaar. 
  • Trust based relationships, openness, and transparency are good frameworks for merging concepts and insights.
  • Management of distributed systems transformed from an authoritarian role to the "collector & curator" of the organizations best ideas.


Checklist for success today:


Easily access and edit information:

Can an curious manager pull P&L statements on a Saturday afternoon in order to better understand their departments role in the overall business? Business secrets keep competitors and employees in the same dark. Erecting barriers to information takes effort and has no chance of producing value.

Public experimentation:

Can an employee easily distribute a proposal to the rest of the company? Can experiments effectively snowball or do they require the innovator to push the ball all the way up the hill? If your sales manager in Topeka has an idea in the middle of the night for an advertisement / promotion, can they add this quickly to the list of concepts in the marketing department?

Good merging:

What are the barriers between an employee with a good idea and a change in the process? Do managers regularly integrate the work and experimentations of subordinates and can subordinated collaborate together without supervision?


What to read next:  Counting Votes is Hard