@title Recommendations on Revision Control

Project recommendations on how to organize revision control.

This document is purely advisory. Phabricator works with a variety of revision
control strategies, and diverging from the recommendations in this document
will not impact your ability to use it for code review and source management.

This is my (epriestley's) personal take on the issue and not necessarily
representative of the views of the Phabricator team as a whole.

There are a few ways to use SVN, a few ways to use Mercurial, and many many many
ways to use Git. Particularly with Git, every project does things differently,
and all these approaches are valid for small projects. When projects scale,
strategies which enforce **one idea is one commit** are better.

= One Idea is One Commit =

Choose a strategy where **one idea is one commit** in the authoritative
master/remote version of the repository. Specifically, this means that an entire
conceptual changeset ("add a foo widget") is represented in the remote as
exactly one commit (in some form), not a sequence of checkpoint commits.

  - In SVN, this means don't `commit` until after an idea has been completely
    written. All reasonable SVN workflows naturally enforce this.
  - In Git, this means squashing checkpoint commits as you go (with
    `git commit --amend`) or before pushing (with `git rebase -i` or
    `git merge --squash`), or having a strict policy where your master/trunk
    contains only merge commits and each is a merge between the old master and
    a branch which represents a single idea. Although this preserves the
    checkpoint commits along the branches, you can view master alone as a
    series of single-idea commits.
  - In Mercurial, you can use the "queues" extension in versions before 2.2 or
    `hg commit --amend` in Mercurial 2.2 and later, or wait to commit until a
    change is complete (like SVN), although the latter is not recommended.
    Without extensions, older versions of Mercurial do not support liberal
    mutability doctrines (so you can't ever combine checkpoint commits) and do
    not let you build a `default` branch out of only merge commits, so it is
    not possible to have an authoritative repository where one commit
    represents one idea in any real sense.
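
The merge-only policy for Git described above can be sketched with a small model. This is only an illustration, not how Git stores history: the `Commit` class, the `first_parent_history` helper, and the commit IDs and messages are all hypothetical. It assumes every idea branch lands on master through a merge commit whose first parent is the old master.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Commit:
    message: str
    # parents[0] is the first parent: the previous state of the branch that
    # received the commit. A merge commit has a second parent as well.
    parents: List[str] = field(default_factory=list)


def first_parent_history(tip: str, commits: Dict[str, Commit]) -> List[str]:
    """Return commit messages from tip back to the root, first parents only."""
    messages = []
    current = tip
    while current is not None:
        commit = commits[current]
        messages.append(commit.message)
        # Ignore second parents, which lead into the checkpoint commits.
        current = commit.parents[0] if commit.parents else None
    return messages


# Hypothetical history: two idea branches, each merged into master with a
# merge commit. The "wip" commits are checkpoints that live on the branches.
commits = {
    "r0": Commit("Initial commit"),
    "a1": Commit("wip: foo skeleton", ["r0"]),
    "a2": Commit("wip: fix syntax error", ["a1"]),
    "m1": Commit("Add a foo widget", ["r0", "a2"]),
    "b1": Commit("wip: bar draft", ["m1"]),
    "m2": Commit("Add a bar widget", ["m1", "b1"]),
}

master_view = first_parent_history("m2", commits)
print(master_view)
```

In this model the walk from the tip of master visits only the two merge commits and the root, so each entry is one idea and the checkpoints stay out of view. In Git itself, `git log --first-parent master` gives a comparable view, and `git merge --no-ff` forces a merge commit even when a fast-forward would be possible.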

A strategy where **one idea is one commit** has no real advantage over any other
strategy until your repository hits a velocity where it becomes critical. In
particular:

  - Essentially all operations against the master/remote repository are about
    ideas, not commits. When one idea is many commits, everything you do is more
    complicated because you need to figure out which commits represent an idea
    ("the foo widget is broken, what do I need to revert?") or what idea is
    ultimately represented by a commit ("commit af3291029 makes no sense, what
    goal is this change trying to accomplish?").
  - Release engineering is greatly simplified. Release engineers can pick or
    drop ideas easily when each idea corresponds to one commit. When an idea
    is several commits, it becomes easier to accidentally pick or drop half of
    an idea and end up in a state which is virtually guaranteed to be wrong.
  - Automated testing is greatly simplified. If each idea is one commit, you
    can run automated tests against every commit and test failures indicate a
    serious problem. If each idea is many commits, most of those commits
    represent a known broken state of the codebase (e.g., a checkpoint with a
    syntax error which was fixed in the next checkpoint, or with a
    half-implemented idea).
  - Understanding changes is greatly simplified. You can bisect to a break and
    identify the entire idea trivially, without fishing forward and backward in
    the log to identify the extents of the idea. And you can be confident in
    what you need to revert to remove the entire idea.
  - There is no clear value in having checkpoint commits (some of which are
    guaranteed to be known broken versions of the repository) persist into the
    remote. Consider a theoretical VCS which automatically creates a checkpoint
    commit for every keystroke. This VCS would obviously be unusable. But many
    checkpoint commits aren't much different, and conceptually represent some
    relatively arbitrary point in the sequence of keystrokes that went into
    writing a larger idea. Get rid of them or create an abstraction layer (merge
    commits) which allows you to ignore them when you are trying to understand
    the repository in terms of ideas (which is almost always).
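
The bisecting point in the list above can be illustrated with a minimal sketch. It is a plain binary search over a linear history, not Git's or Mercurial's actual bisect implementation. The `first_bad_commit` helper and the commit names are hypothetical, and the sketch assumes the oldest commit is good, the newest is bad, and every commit after the break is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(history: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the earliest bad commit in an oldest-first history.

    Assumes history[0] is good, history[-1] is bad, and every commit after
    the first bad one is bad too.
    """
    lo, hi = 0, len(history) - 1  # history[lo] is good, history[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(history[mid]):
            hi = mid
        else:
            lo = mid
    return history[hi]


# Hypothetical remote history (oldest first) where each commit is one idea.
history = [
    "add login form",
    "add foo widget",
    "cache widget data",
    "restyle footer",
    "tune cache TTL",
]

# Pretend the build has been broken ever since "cache widget data" landed.
broken_since = history.index("cache widget data")
culprit = first_bad_commit(history, lambda c: history.index(c) >= broken_since)
print(culprit)
```

Because each entry is a whole idea, the commit the search lands on is also exactly what needs to be reverted. With checkpoint commits in the history, the search could stop on a half-finished checkpoint, and you would still have to find the other commits that belong to the same idea.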

All of these become problems only at scale. Facebook pushes dozens of ideas
every day and thousands on a weekly basis, and could not do this (at least, not
without more people or more errors) without choosing a repository strategy where
**one idea is one commit**.

Continue by:

  - reading recommendations on structuring branches with
    @{article:Recommendations on Branching}; or
  - reading recommendations on structuring changes with
    @{article:Writing Reviewable Code}.