RubyHunt.dev

Building Git

Written by James Coglan

Building Git is a deep dive into the internals of the Git version control system. By rebuilding it in a high-level programming language, we explore the computer science behind this widely used tool. In the process, we gain a deeper understanding of Git itself and cover a wide array of broadly applicable programming topics, including:


Unix concepts

  • Reading from and writing to files, making writes appear atomic, and preventing race conditions between processes
  • Launching child processes in the foreground and background, communicating with them concurrently
  • Displaying output in the terminal, including colour formatting, paged output, and interacting with the user’s text editor
  • Parsing various file formats, including Git’s Merkle-tree-based commit model, the index, configuration files and packed object files
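
To give a flavour of the "atomic writes" topic, here is a minimal sketch of one common Unix approach: write to a temporary file in the same directory, then rename it over the target. This is only an illustration under stated assumptions, not code from the book; the function name `write_atomically` is hypothetical, and the atomicity relies on the rename happening within a single POSIX filesystem.

```python
import os
import tempfile


def write_atomically(path, data):
    """Hypothetical helper: replace `path` with `data` (bytes) in one step.

    Readers see either the old file or the complete new one, never a
    half-written file, because the final rename is atomic on POSIX
    filesystems when source and target share a filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename
    # does not cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic swap into place
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A call such as `write_atomically("HEAD", b"ref: refs/heads/main\n")` would then leave either the previous contents or the full new contents on disk, even if the process is interrupted partway through.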

Data structures

  • How Git stores content on disk to make effective use of space, make the history efficient to search, and make it easy to detect differences between commits
  • Using diffs to efficiently update the contents of the workspace when checking out a new commit
  • Effectively using simple in-memory data structures to solve programming problems
  • Parsing and interpreting a query language for addressing commits
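
As a small taste of the on-disk storage topic, the sketch below computes the ID Git assigns to a file's contents (a "blob") and the path where the loose object would live. It assumes a repository using the default SHA-1 object format; the function names `blob_object_id` and `loose_object_path` are hypothetical and chosen for illustration only.

```python
import hashlib
import os


def blob_object_id(content: bytes) -> str:
    """Hypothetical helper: return the SHA-1 ID Git gives a blob.

    Git hashes a short header ("blob", a space, the content length in
    decimal, and a NUL byte) followed by the raw content.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(git_dir: str, object_id: str) -> str:
    """Hypothetical helper: where a loose object with this ID is stored.

    The first two hex digits name a directory under objects/ and the
    remaining digits name the file inside it.
    """
    return os.path.join(git_dir, "objects", object_id[:2], object_id[2:])


if __name__ == "__main__":
    oid = blob_object_id(b"hello\n")
    print(oid)  # ce013625030ba8dba906f756967f9e9ca394464a
    print(loose_object_path(".git", oid))
```

Because the ID depends only on the content, two files with identical bytes share one stored object, which is part of how the content store saves space and makes differences between commits cheap to detect.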

Concurrent editing

  • How Git uses branches to model concurrent edits
  • Algorithms for detecting differences between file versions and merging branches back together
  • Why merge conflicts happen, how they can be avoided, and how Git helps users prevent lost updates
  • How merging can be used as the basis for numerous operations to edit the commit history

Software engineering

  • Bootstrapping and growing a self-hosting system
  • Test-driven development
  • Refactoring to enable new feature development
  • Crash-only software design that allows programs to be interrupted and resumed

Networking

  • Using SSH to bootstrap a network protocol
  • How Git repositories communicate to minimise the data they need to transfer when fetching content
  • How the network protocol uses atomic operations to prevent users overwriting each other’s changes