Things I Learned at Google: Programming Over Time
This is part one of a multi-part series on lessons I learned while working as a software engineer at Google. If you’d like to start at the beginning, check out part zero. Part two is here.
From the preface of Software Engineering at Google:
One key insight we share in this book is that software engineering can be thought of as “programming integrated over time.” What practices can we introduce to our code to make it sustainable—able to react to necessary change—over its life cycle, from conception to introduction to maintenance to deprecation?
The development and maintenance of a typical Google team’s codebase can involve dozens of people (and as many unique team compositions) over ten or more years. As such, practices have evolved to sustain codebases through time and personnel turnover.
As a new grad, this perspective was completely new to me; most classes in school lasted twelve weeks and then were done forever. When I graduated, the longest I had ever worked on a single programming project was about a year, and I had never worked on a codebase with more than two other people. Most of the software practices I learned in my first year at Google were focused on maintaining an ecosystem where codebases could survive and grow across years.
The Google Paradigm
Google’s software engineering tools and practices differ substantially from the git-based models that currently dominate the tech industry, so much so that in the past I have struggled to explain the developer experience to friends at other companies. The Google development paradigm is rooted in the use of a monorepo and centralized version control, living at HEAD, and submitting small changes with an emphasis on automated testing. Combined with purpose-built tooling, this model has proven both highly effective and pleasant to work in.
Google is well known for its monorepo: all code is stored in a single repository and managed with a centralized version control system (a Perforce descendant named ‘Piper’). Almost all of the company’s code is instantly searchable and a single change can touch any combination of files in the repository. This alone provides a massive productivity advantage: refactoring a widely-used function signature, upgrading a popular dependency, and sharing code across dozens of teams are all routine tasks at Google that incur large overhead in a distributed version control environment.
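To make that concrete, here is a minimal sketch (in Python, and emphatically not Google tooling) of what a repo-wide rename looks like when every caller lives in one repository and can be updated in a single atomic change; the repo path and the `FetchUser`/`FetchUserProfile` names are hypothetical.

```python
# A hypothetical repo-wide rename: because all call sites live in one
# repository, a single change can update every caller at once.
import pathlib
import re

REPO_ROOT = pathlib.Path("/path/to/monorepo")   # hypothetical checkout location
OLD_CALL = re.compile(r"\bFetchUser\(")          # hypothetical function being renamed
NEW_CALL = "FetchUserProfile("

def rewrite_callers(root: pathlib.Path) -> list[pathlib.Path]:
    """Rewrites every Python call site under `root`; returns the files touched."""
    touched = []
    for path in root.rglob("*.py"):
        text = path.read_text()
        updated = OLD_CALL.sub(NEW_CALL, text)
        if updated != text:
            path.write_text(updated)
            touched.append(path)
    return touched

if __name__ == "__main__":
    print(f"Updated {len(rewrite_callers(REPO_ROOT))} files in one change.")
```

In a multi-repo world, the same rename becomes a coordinated rollout across repositories, each with its own review and release cycle.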
A less-noted but equally important aspect of engineering at Google is its longstanding cloud-based approach to development. This approach (where code browsing, change authoring, builds, and code review all happen remotely) provides efficiencies that are difficult to achieve with git-based workflows. Most of these are overhead reductions: remote change authoring means no cloning or local change management, and remote builds enable widespread build caching and reproducible builds. A fully remote developer environment also makes it easy to swap out hardware and improves security, since source code never lives on a portable machine.
Unfortunately, git’s popularity, combined with its incompatibility with a centralized, ‘thin-client’ development model, has greatly delayed or limited the availability of such efficiencies to the community at large. While git’s model is excellent for decentralized development (typical of open source projects), it doesn’t map well to a long-lived, structured organization with a truly ‘official’ repository.
Portions of the overall Google developer experience have only recently begun to emerge in products like GitHub Codespaces, merge queues, and Sourcegraph’s Code Search. In my opinion, the largest current gaps are in code review and monorepo simulation (the ability to change code atomically across many git repositories).
Living at HEAD
Also known as ‘trunk-based development’, this is the practice of having one ‘official’ branch of the repository and doing all development as close to the latest submitted change as possible. In fact, apart from release branches (which may have one or two extra changes ‘down-integrated’ into them), the monorepo and centralized version control system obviate the concept of branches entirely.
Having a purely linear history makes it easy to root-cause test breakages via bisection. With no feature branches, merge conflicts remain small and easily manageable. For an excellent overview of this approach and its other benefits, I highly recommend this writeup.
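Bisecting a linear history is just binary search over change numbers. The sketch below is a minimal illustration, assuming a hypothetical `test_passes(change)` helper that syncs to a given change and runs the failing test:

```python
# Minimal bisection over a linear history. `test_passes` is a hypothetical
# helper that syncs the client to a change number and runs the failing test.
def find_first_bad_change(good: int, bad: int, test_passes) -> int:
    """Returns the first change at which the test starts failing.

    Assumes the test passes at `good`, fails at `bad`, and that history is
    linear, so there is a single boundary between passing and failing changes.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if test_passes(mid):
            good = mid   # the breakage landed after `mid`
        else:
            bad = mid    # `mid` is already broken
    return bad

# e.g. narrowing a breakage somewhere in changes 1000..1042 takes ~6 test runs
```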
Small CLs
Code changes at Google are referred to as ‘CLs’. This is an abbreviation of the term ‘changelist’, which originates from Perforce, of which Google’s ‘Piper’ VCS is a descendant.
Google’s official internal guidance is that CLs should be kept as small as reasonably possible. Individual changes should be focused, ideally accomplishing a single goal. In practice, this means that about 250 lines of code (including tests!) is an acceptable size for a single CL. More than 400 lines of contentful (i.e. non-boilerplate) changes is typically seen as prima facie too large – in such situations it’s not uncommon for a reviewer to respond with a request to break the CL up.
The philosophy behind this policy is that small changes are easier to write correctly, easier to review thoroughly, and easier to roll back in the event of an issue. Google has a strong cultural expectation around review speed that is a necessary complement: reviewers are generally expected to respond to a code review as soon as feasible (in practice this typically means a 1-4 hour turnaround time).
Research by DORA corroborates my personal experience that small CLs (coupled with fast reviews) make development both faster and subjectively more pleasant. Google has an (in)famous internal page related to this practice, which has been published externally here and goes over this philosophy in more detail.
Automated Testing
Google is well-known for its emphasis on automated testing; generally, CLs will not be approved if they do not contain corresponding tests and awareness of tests is well-integrated into both code review tooling and culture. I had minimal exposure to automated testing in college, so adapting to this environment (and learning how to write tests at all) was one of the largest adjustments as a new grad.
With time, I came to understand testing as an investment in codebase longevity, where costs are paid upfront to reap dividends over time. The forms vary: unit tests for functions, component tests for UI elements, hermetic and end-to-end integration tests for systems, and specialized testing (load, performance, image-diff, etc.) wherever appropriate.
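To make the unit-test end of that spectrum concrete, here is a minimal sketch in Python’s unittest style; `normalize_email` is an invented example, not code from any Google project:

```python
# A hypothetical unit test of the kind expected to accompany every CL.
import unittest

def normalize_email(address: str) -> str:
    """Trims whitespace and lowercases an email address (illustrative only)."""
    return address.strip().lower()

class NormalizeEmailTest(unittest.TestCase):
    def test_strips_whitespace_and_lowercases(self):
        self.assertEqual(normalize_email("  Alice@Example.COM "), "alice@example.com")

    def test_already_normalized_address_is_unchanged(self):
        self.assertEqual(normalize_email("bob@example.com"), "bob@example.com")

if __name__ == "__main__":
    unittest.main()
```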
A commitment to testing carries a heavy burden. Thorough testing means that a majority of the lines in a change (and therefore in the codebase) will often be test code. Writing tests slows change authoring significantly, in my experience increasing development time by 50-100%. Tests also require maintenance: test infrastructure becomes another service to be monitored and repaired, and flaky tests must be investigated quickly to avoid blocking development.
Because the costs of test authoring and maintenance are so substantial, paying them is most worthwhile for large, long-lived projects, as the benefits of testing scale linearly across time and superlinearly across project size. As a codebase grows, the interactions between its components (as well as the potential for code reuse) grow polynomially. With automated testing gating code submission, the coordination overhead of code changes in this environment is mostly or completely eliminated. If code written today may sit undisturbed for a decade, then corresponding tests will help prevent it from becoming a ‘haunted graveyard’: code that still works but that nobody dares to touch.
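A rough way to see the superlinear claim (my framing, not the book’s): with n components, the number of potential pairwise interactions that tests must guard grows quadratically, so the surface a change can break grows much faster than the code itself.

```latex
% Potential pairwise interactions among n components (an upper bound):
\binom{n}{2} = \frac{n(n-1)}{2} = O(n^2)
```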
The costs of testing are easy to appreciate, and some companies take a very different strategy; Facebook is famous (at least historically) for its ‘test in prod’ ethos. The benefits are harder to see up front: for me, it took inheriting a decade-old Java codebase on the Google for Nonprofits team to fully appreciate the investment that its automated tests represented. While the applicability of this approach cannot be divorced from its context (in particular a company’s size and maturity), Google has used it to great success.
Software Multiverse
I sometimes tell the story of software engineering at Google as an alternate dimension where git was never invented and popular tooling instead evolved around centralized version control. This is part history and part mythology, but it gives an accurate picture of life on the Google ‘tech island’.
While many practices are not unique to the company, the technical foundation is fundamentally different, and this ripples out across the entire internal ecosystem. With over 80k engineers, Google is large enough to face all of the large-scale coordination and dependency challenges present in the open-source world. But because of its different foundation, it has some unique options for addressing them.
I have been drinking (or swimming in) the kool-aid long enough that I mostly think in terms of Google’s development paradigm. While some of its concepts (like a universal repo and build system) are inapplicable to the world at large, many others can exist comfortably in either reality. Google’s software engineering methods are much more portable than its technology – my guess is that additional tooling that externalizes the Google developer experience will follow the adoption of these methods rather than the other way around.