Port section 2.7 (recording changes) from mercurial to git

[hgbook-git] / tour.mdwn
diff --git a/tour.mdwn b/tour.mdwn

index 6a43df9cd847aae7a5b17afbc531b3160089e372..7ff65f635f2d5f665232672d5354f581ffa72f93 100644
--- a/tour.mdwn
+++ b/tour.mdwn
@@ -396,8 +396,8 @@ created:
  
         $ git log --since="2 weeks ago" --until="yesterday"
  
-Another useful option is --max-count which, unsurprisingly, limits the
-maximum number of commits to be displayed.
+Another useful option is -n or --max-count which, unsurprisingly,
+limits the maximum number of commits to be displayed.
  
  #### 2.4.3  More detailed information
  
@@ -439,7 +439,7 @@ looks suspicious, so let's take a closer look. Remember that we can name
  it as master~3, HEAD~3, or any prefix of its commit identifier, (such
  as 13ed136b):
  
-       $ git log -p --max-count=1 13ed136b
+       $ git log -p -n 1 13ed136b
         commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
         Author: Bryan O'Sullivan <bos@serpentine.com>
         Date:   Tue Sep 6 13:15:43 2005 -0700
@@ -486,50 +486,56 @@ same output:
  
  ### 2.5  All about command options
  
-Let’s take a brief break from exploring Mercurial commands to discuss
+Let’s take a brief break from exploring git commands to discuss
  a pattern in the way that they work; you may find this useful to keep
  in mind as we continue our tour.
  
-Mercurial has a consistent and straightforward approach to dealing
+Git has a consistent and straightforward approach to dealing
  with the options that you can pass to commands. It follows the
  conventions for options that are common to modern Linux and Unix
  systems.
  
-  * Every option has a long name. For example, as we’ve already seen,
-    the “hg log” command accepts a --rev option.
-  * Most options have short names, too. Instead of --rev, we can use
-    -r. (The reason that some options don’t have short names is that
-    the options in question are rarely used.)
-  * Long options start with two dashes (e.g. --rev), while short
-    options start with one (e.g. -r).
-  * Option naming and usage is consistent across commands. For
-    example, every command that lets you specify a changeset ID or
-    revision number accepts both -r and --rev arguments.
+  * Most options have long names. For example, as we’ve already seen,
+    the “git log” command accepts a --max-count=<number> option.
+  * Some options have short, single-character names. Often these are
+    aliases for long options, (such as "-n <number>" instead of
+    --max-count=<number>), but sometimes the option exists in
+    short-form with no long-form equivalent, (such as -p). [XXX: It
+    wouldn't hurt to fix this by adding --patch, etc. right?]
+  * Long options start with two dashes (e.g. --max-count), while short
+    options start with one (e.g. -n).
  
-In the examples throughout this book, I use short options instead of
-long. This just reflects my own preference, so don’t read anything
-significant into it.
+  * Option naming and usage is consistent across commands. For
+    example, every command that lets you specify a commit identifier
+    or range will accept the same expressions, (HEAD~3,
+    origin..master, 72d4f10e, etc), while any command that can be
+    limited by paths will accept the same expressions ("-- doc/
+    some-file.c"), etc.
  
-Most commands that print output of some kind will print more output
-when passed a -v (or --verbose) option, and less when passed -q (or
---quiet).
+Many commands that print output of some kind can be made quieter by
+passing the -q or --quiet option.
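
As a quick check of the short/long equivalence described above, here is a minimal sketch. It assumes a throwaway repository in a temporary directory; the identity passed with -c and the commit messages are placeholders, not anything from the hello repository.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Create three empty commits. The identity is supplied with -c so the
# sketch does not depend on any existing configuration.
for i in 1 2 3; do
    git -c user.name="Example" -c user.email="example@example.com" \
        commit -q --allow-empty -m "commit $i"
done

# The short and long spellings of the same option select the same commits.
short=$(git log -n 2 --pretty=oneline)
long=$(git log --max-count=2 --pretty=oneline)
printf '%s\n' "$short"
```

Both variables should hold the same two lines, since -n is simply the short name for --max-count.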
  
  ### 2.6  Making and reviewing changes
  
-Now that we have a grasp of viewing history in Mercurial, let’s take a
+Now that we have a grasp of viewing history in git, let’s take a
  look at making some changes and examining them.
  
  The first thing we’ll do is isolate our experiment in a repository of
-its own. We use the “hg clone” command, but we don’t need to clone a
+its own. We use the “git clone” command, but we don’t need to clone a
  copy of the remote repository. Since we already have a copy of it
  locally, we can just clone that instead. This is much faster than
  cloning over the network, and cloning a local repository uses less
  disk space in most cases, too.
  
-       $ cd ..   
-       $ hg clone hello my-hello   
-       2 files updated, 0 files merged, 0 files removed, 0 files unresolved   
-       $ cd my-hello
+       $ cd ..
+       $ git clone hello my-hello
+       Initialized empty Git repository in /home/cworth/src/hgbook-git/my-hello/.git/
+       0 blocks
+
+       [XXX We say "empty" here, (presumably from the git-init part),
+       but shouldn't the command also report the successful clone
+       which makes it non-empty? And what the heck does "0 blocks"
+       mean?]
  
  As an aside, it’s often good practice to keep a “pristine” copy of a
  remote repository around, which you can then make temporary clones of
@@ -539,6 +545,17 @@ until it’s complete and you’re ready to integrate it back. Because
  local clones are so cheap, there’s almost no overhead to cloning and
  destroying repositories whenever you want.
  
+Alternatively, you can achieve much the same effect by creating
+multiple branches in a single repository, (but we won't go into detail
+on how to do that in this chapter). Some people greatly appreciate
+having multiple branches in a single repository rather than having
+many repositories cluttering up their filesystem. Other people prefer
+the ability to have working-tree changes, and intermediate build
+files, etc. each isolated in a separate repository per branch. Both
+modes are very well-supported by git, so it's really a matter of which
+you find most appropriate at any time given your tastes and project
+workflows.
+
  In our my-hello repository, we have a file hello.c that contains the
  classic “hello, world” program. Let’s use the ancient and venerable
  sed command to edit this file so that it prints a second line of
@@ -547,155 +564,219 @@ scripted example this way. Since you’re not under the same constraint,
  you probably won’t want to use sed; simply use your preferred text
  editor to do the same thing.)
  
-       $ sed -i '/printf/a∖∖tprintf("hello again!∖∖n");' hello.c
+       $ sed -i '/printf/a\\tprintf("hello again!\\n");' hello.c       
  
-Mercurial’s “hg status” command will tell us what Mercurial knows
-about the files in the repository.
+The “git status” command will tell us what git knows about the files
+in the repository.
  
-       $ ls   
-       Makefile  hello.c   
-       $ hg status   
-       M hello.c
-
-The “hg status” command prints no output for some files, but a line
-starting with “M” for hello.c. Unless you tell it to, “hg status” will
-not print any output for files that have not been modified.
-
-The “M” indicates that Mercurial has noticed that we modified
-hello.c. We didn’t need to inform Mercurial that we were going to
-modify the file before we started, or that we had modified the file
-after we were done; it was able to figure this out itself.
+       $ ls 
+       hello.c  Makefile
+       $ git status
+       # On branch master
+       # Changed but not updated:
+       #   (use "git add <file>..." to update what will be committed)
+       #
+       #       modified:   hello.c
+       #
+       no changes added to commit (use "git add" and/or "git commit -a")
+
+We see that the “git status” command prints a line with "modified" for
+hello.c. The “git status” command will not print any output for files
+that have not been modified.
+
+Notice that we didn’t need to inform git that we were going to modify
+the file before we started, or that we had modified the file after we
+were done; it was able to figure this out itself.
  
  It’s a little bit helpful to know that we’ve modified hello.c, but we
  might prefer to know exactly what changes we’ve made to it. To do
-this, we use the “hg diff” command.
-
-       $ hg diff   
-       diff -r b57f9a090b62 hello.c   
-       --- a/hello.c Tue Sep 06 15:43:07 2005 -0700   
-       +++ b/hello.c Sun Jun 17 18:05:50 2007 +0000   
-       @@ -8,5 +8,6 @@ int main(int argc, char ⋆⋆argv)   
-       int main(int argc, char ⋆⋆argv)   
-       {   
-       printf("hello, world!∖");   
-       + printf("hello again!∖n");   
-       return 0;   
-       }
+this, we use the “git diff” command.
+
+       $ git diff
+       diff --git a/hello.c b/hello.c
+       index 9a3ff79..6d28887 100644
+       --- a/hello.c
+       +++ b/hello.c
+       @@ -8,5 +8,6 @@
+        int main(int argc, char **argv)
+        {
+               printf("hello, world!\");
+       +       printf("hello again!\n");
+               return 0;
+        }
  
-### 2.7  Recording changes in a new changeset
+### 2.7  Recording changes in a new commit
  
-We can modify files, build and test our changes, and use “hg status”
-and “hg diff” to review our changes, until we’re satisfied with what
+We can modify files, build and test our changes, and use “git status”
+and “git diff” to review our changes, until we’re satisfied with what
  we’ve done and arrive at a natural stopping point where we want to
-record our work in a new changeset.
+record our work in a new commit.
  
-The “hg commit” command lets us create a new changeset; we’ll usually
+The “git commit” command lets us create a new commit; we’ll usually
  refer to this as “making a commit” or “committing”.
  
  #### 2.7.1  Setting up a username
  
-When you try to run “hg commit” for the first time, it is not
-guaranteed to succeed. Mercurial records your name and address with
-each change that you commit, so that you and others will later be able
-to tell who made each change. Mercurial tries to automatically figure
-out a sensible username to commit the change with. It will attempt
-each of the following methods, in order:
-
-  1. If you specify a -u option to the “hg commit” command on the
-     command line, followed by a username, this is always given the
-     highest precedence.
-  2. If you have set the HGUSER environment variable, this is checked next. 
-  3. If you create a file in your home directory called .hgrc, with a
-     username entry, that will be used next. To see what the contents
-     of this file should look like, refer to section [2.7.1][11]
-     below.
-  4. If you have set the EMAIL environment variable, this will be used
-     next.
-  5. Mercurial will query your system to find out your local user name
-     and host name, and construct a username from these
-     components. Since this often results in a username that is not
-     very useful, it will print a warning if it has to do this.
-
-If all of these mechanisms fail, Mercurial will fail, printing an
-error message. In this case, it will not let you commit until you set
-up a username.
-
-You should think of the HGUSER environment variable and the -u option
-to the “hg commit” command as ways to override Mercurial’s default
-selection of username. For normal use, the simplest and most robust
-way to set a username for yourself is by creating a .hgrc file; see
-below for details.
-
-##### Creating a Mercurial configuration file
-
-To set a user name, use your favourite editor to create a file called
-.hgrc in your home directory. Mercurial will use this file to look up
-your personalised configuration settings. The initial contents of your
-.hgrc should look like this.
-
-       # This is a Mercurial configuration file.   
-       [ui]   
-       username = Firstname Lastname <email.address@domain.net>
-
-The “[ui]” line begins a section of the config file, so you can read
-the “username = ...” line as meaning “set the value of the username
-item in the ui section”. A section continues until a new section
-begins, or the end of the file. Mercurial ignores empty lines and
-treats any text from “#” to the end of a line as a comment.
+When you try to run “git commit” for the first time, it might not do
+exactly what you want. Git records your name and address with each
+change that you commit, (as both author and committer unless you tell
+it otherwise), so that you and others will later be able to tell who
+made each change. Git tries to automatically figure out a sensible
+name and address to attribute to both author and committer. It will
+attempt each of the following methods, in order, (stopping for each
+field as soon as a value is found):
+
+  1. If you specify a --author option to the “git commit” command on
+     the command line, followed by a "Real Name <email@example.com>"
+     string, then this name and address will be used for the author
+     fields. The committer fields will still be determined as
+     below. This option is very helpful when applying a commit
+     originally authored by someone other than yourself.
+  2. If any of the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL,
+     GIT_COMMITTER_NAME, or GIT_COMMITTER_EMAIL environment variables
+     are set, then those values will be used for the corresponding
+     fields.
+  3. If you have a file in the local repository called .git/config,
+     with name or email settings in the [user] section, then these
+     values will be used to set any remaining author and committer
+     fields.
+  4. If you have a file in your home directory called .gitconfig,
+     again with name or email settings in the [user] section, then
+     these values will be used to set any remaining author and
+     committer fields. For more details on the contents of this file,
+     see “Creating a git configuration file” below.
+  5. If you have set the EMAIL environment variable, this will be used
+     to set author and committer email addresses if still unset.
+  6. Git will query your system to find out your real name from the
+     available GECOS field, and your username, hostname, and domain to
+     construct an email address, (or at least an identifier resembling
+     an email address).
+
+If all of these mechanisms fail, "git commit" will fail, printing an
+error message instructing you how to use "git config" to tell git your
+name and email address.
+
+You should think of the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL,
+GIT_COMMITTER_NAME, and GIT_COMMITTER_EMAIL environment variables
+and the --author option to the “git commit” command as ways
+to override git’s default selection. For normal use, the simplest and
+most robust way to set your information is by creating a .gitconfig
+file, (either manually or with the "git config" command); see below
+for details.
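
To see the override behaviour in action, here is a small sketch. It assumes a scratch repository in a temporary directory, and the names and addresses are made up for illustration. It makes three commits and prints who git recorded as author and committer each time.

```shell
set -e
# Start from a clean slate so inherited variables cannot skew the result.
unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Config Name"
git config user.email "config@example.com"

# No overrides: author and committer both come from the configuration.
git commit -q --allow-empty -m "first"
from_config=$(git log -n 1 --format='%an / %cn')

# The environment variable overrides the configured author name only.
GIT_AUTHOR_NAME="Env Name" git commit -q --allow-empty -m "second"
from_env=$(git log -n 1 --format='%an / %cn')

# --author sets the author fields; the committer is determined as before.
git commit -q --allow-empty -m "third" \
    --author="Other Person <other@example.com>"
from_option=$(git log -n 1 --format='%an <%ae> / %cn')

printf '%s\n' "$from_config" "$from_env" "$from_option"
```

In each printed line the author is on the left and the committer on the right. Only the author side changes, which matches the note that --author leaves the committer fields alone.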
+
+##### Creating a git configuration file
+
+To set your name and email address, just use the following commands:
+
+       git config --global user.name "Your Name"
+       git config --global user.email "you@example.com"
+
+The --global option means that this command will set global
+information, (affecting all of your repositories on this machine), in
+the .gitconfig file in your home directory. Alternatively, you could
+omit the --global which would make the change take effect only in the
+local repository. This is convenient if you want to have different
+email addresses associated with different projects, for example.
+
+Of course, git's configuration file is a simple-to-edit plain-text
+file, so instead of using the above commands, you can also just edit
+the files directly. Use your favorite editor to create a file called
+.gitconfig in your home directory, (or if you ran the above commands
+then it will be there already). The initial contents of your
+.gitconfig should look like this.
+
+       # This is a git configuration file.   
+       [user]
+               name = Your Name
+               email = you@example.com
+
+Similarly, you can make a repository-specific configuration by editing
+.git/config in the local repository. It will already have some
+sections present, (created by the "git clone"); just add a [user]
+section as above.
+
+The “[user]” line begins a section of the config file, so you can read
+the “name = ...” line as meaning “set the value of the name item in
+the user section”. This is the same notion expressed with the
+"user.name" syntax on the git-config command line.  A section
+continues until a new section begins, or the end of the file. Git
+ignores empty lines and treats any text from “#” to the end of a line
+as a comment.
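
The correspondence between the "user.name" syntax and the [user] section can be checked without touching any real configuration. This sketch assumes a temporary scratch file and uses the --file option of "git config" to point the command at it:

```shell
set -e
cfg=$(mktemp)

# --file makes "git config" operate on the named file instead of
# ~/.gitconfig or .git/config.
git config --file "$cfg" user.name "Your Name"
git config --file "$cfg" user.email "you@example.com"

# The file now contains a [user] section with name and email entries.
cat "$cfg"

# Reading the values back uses the same section.key syntax.
name=$(git config --file "$cfg" --get user.name)
email=$(git config --file "$cfg" --get user.email)
```

Looking at the file after each command is an easy way to get comfortable with editing it by hand.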
  
  ##### Choosing a user name
  
-You can use any text you like as the value of the username config
-item, since this information is for reading by other people, but for
-interpreting by Mercurial. The convention that most people follow is
-to use their name and email address, as in the example above.
-
-Note: Mercurial’s built-in web server obfuscates email addresses, to
-make it more difficult for the email harvesting tools that spammers
-use. This reduces the likelihood that you’ll start receiving more junk
-email if you publish a Mercurial repository on the web.
+You can use any text you like as the value of the name and email
+configuration items, since this information is for reading by other
+people, not for interpreting by git. It is conventional to use a valid
+email address, but some, (notably Linus Torvalds, the original author
+of git), actually like the default user@hostname convention that git
+falls back on without any additional information. There's no
+requirement that the email address actually be valid, and perhaps it's
+useful to be reminded which machine was used to create particular
+commits.
  
  #### 2.7.2  Writing a commit message
  
-When we commit a change, Mercurial drops us into a text editor, to
+When we commit a change, git drops us into a text editor to
  enter a message that will describe the modifications we’ve made in
-this changeset. This is called the commit message. It will be a record
-for readers of what we did and why, and it will be printed by “hg log”
+this commit. This is called the commit message. It will be a record
+for readers of what we did and why, and it will be printed by “git log”
  after we’ve finished committing.
  
-       $ hg commit
-
-The editor that the “hg commit” command drops us into will contain an
-empty line, followed by a number of lines starting with “HG:”.
-
-       empty line   
-       HG: changed hello.c
-
-Mercurial ignores the lines that start with “HG:”; it uses them only
+       $ git commit -a
+
+Note: The -a on the command-line instructs git to commit all changes
+to tracked files. Without this, "git commit" will only commit changes
+that have been previously staged for committing with "git add
+file". The most common usage is to commit with "git commit -a" and
+only use "git add file; git commit" when there is a need to commit
+only some subset of changes that have been made.
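
The difference between the two styles is easiest to see with two modified files. The following is a sketch only, assuming a scratch repository with placeholder file names (a.txt and b.txt) rather than our my-hello clone:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Your Name"
git config user.email "you@example.com"

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "Add two files"

# Modify both tracked files, but stage only a.txt.
printf 'two\n' >> a.txt
printf 'two\n' >> b.txt
git add a.txt
git commit -q -m "Change a.txt only"
after_plain=$(git diff --name-only)   # b.txt is still uncommitted

# -a picks up every modified tracked file without an explicit "git add".
git commit -q -a -m "Change b.txt as well"
after_all=$(git diff --name-only)     # nothing left to commit
```

After the plain commit, "git diff" still reports b.txt. After the commit with -a, it reports nothing.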
+
+The editor that the “git commit” command drops us into will contain an
+empty line, followed by a number of lines starting with “#”.
+
+       empty line
+       # Please enter the commit message for your changes.
+       # (Comment lines starting with '#' will not be included)
+       # On branch master
+       # Changes to be committed:
+       #   (use "git reset HEAD <file>..." to unstage)
+       #
+       #       modified:   hello.c
+       #       
+
+Git ignores the lines that start with “#”; it uses them only
  to tell us which files it’s recording changes to. Modifying or
  deleting these lines has no effect.
  
  #### 2.7.3  Writing a good commit message
  
-Since “hg log” only prints the first line of a commit message by
-default, it’s best to write a commit message whose first line stands
-alone. Here’s a real example of a commit message that doesn’t follow
+A good commit message will generally have a single line that
+summarizes the commit, a blank line, and then one or more paragraphs
+with supporting detail. Since many tools only print the first line of
+a commit message by default, it’s important that the first line stands
+alone.
+
+One example of a first-line-only viewer is "git log
+--pretty=short". Other examples include graphical history viewers such
+as gitk and gitview, and web-based viewers such as gitweb and cgit.
+
+Here’s a real example of a commit message that doesn’t follow
  this guideline, and hence has a summary that is not readable.
  
-       changeset:   73:584af0e231be   
-       user:        Censored Person <censored.person@example.org>   
-       date:        Tue Sep 26 21:37:07 2006 -0700   
-       summary:     include buildmeister/commondefs.   Add an exports and install
+       $ git log --pretty=short
+       commit 3ef5535144da88a854f7930503845cd44506c2e2
+       Author: Censored Person <censored.person@example.org>
+       
+           include buildmeister/commondefs.   Add an exports and install
  
  As far as the remainder of the contents of the commit message are
-concerned, there are no hard-and-fast rules. Mercurial itself doesn’t
+concerned, there are no hard-and-fast rules. Git itself doesn’t
  interpret or care about the contents of the commit message, though
  your project may have policies that dictate a certain kind of
  formatting.
  
  My personal preference is for short, but informative, commit messages
  that tell me something that I can’t figure out with a quick glance at
-the output of “hg log --patch”.
+the output of “git log -p”.
  
  #### 2.7.4  Aborting a commit
  
@@ -704,41 +785,129 @@ editing a commit message, simply exit from your editor without saving
  the file that it’s editing. This will cause nothing to happen to
  either the repository or the working directory.
  
-If we run the “hg commit” command without any arguments, it records
-all of the changes we’ve made, as reported by “hg status” and “hg
-diff”.
-
  #### 2.7.5  Admiring our new handiwork
  
-Once we’ve finished the commit, we can use the “hg tip” command to
-display the changeset we just created. This command produces output
-that is identical to “hg log”, but it only displays the newest
-revision in the repository.
+Once we’ve finished the commit, we can use the “git show” command to
+display the commit we just created. As discussed previously, this
+command produces output that is identical to “git log -p”, but for
+only a single revision, (and the most recent revision by default):
  
-       $ hg tip -vp   
-       changeset:   5:fa1321bf0c80   
-       tag:         tip   
-       user:        Bryan O'Sullivan <bos@serpentine.com>   
-       date:        Sun Jun 17 18:05:50 2007 +0000   
-       files:       hello.c   
-       description:   
-       Added an extra line of output   
+       $ git show
+       commit 018cfb742be6176443ffddac454e593e802ddf3e
+       Author: Carl Worth <cworth@cworth.org>
+       Date:   Thu Sep 27 23:55:00 2007 -0700
         
+           Added an extra line of output.
+           
+           If I would have been clever I would have fixed that old typo
+           while I was at it...
         
-       diff -r b57f9a090b62 -r fa1321bf0c80 hello.c   
-       --- a/hello.c Tue Sep 06 15:43:07 2005 -0700   
-       +++ b/hello.c Sun Jun 17 18:05:50 2007 +0000   
-       @@ -8,5 +8,6 @@ int main(int argc, char ⋆⋆argv)   
-       int main(int argc, char ⋆⋆argv)   
-       {   
-       printf("hello, world!∖");   
-       + printf("hello again!∖n");   
-       return 0;   
-       }   
+       diff --git a/hello.c b/hello.c
+       index 9a3ff79..6d28887 100644
+       --- a/hello.c
+       +++ b/hello.c
+       @@ -8,5 +8,6 @@
+        int main(int argc, char **argv)
+        {
+               printf("hello, world!\");
+       +       printf("hello again!\n");
+               return 0;
+        }
+
+Note that you will not see the same commit identifier for your commit,
+even if the change you made is identical to mine. The commit
+identifier incorporates not only the contents of the files, but also
+the commit message, the author and committer names and emails, and the
+author and commit dates. (OK, so now you probably know enough to be
+able to guess the right command to produce a commit with exactly the
+commit identifier shown above. Can you do it?)
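
One way to convince yourself of this is to pin every one of those inputs and compare identifiers. This is a sketch under stated assumptions: make_commit is a hypothetical helper, the file, message, and identity are placeholders, and the timestamps are arbitrary instants in git's "<seconds> <timezone>" form.

```shell
set -e
make_commit() {
    # Create a fresh repository and commit one file, with the names,
    # emails, and dates pinned; $1 is the author/committer date.
    dir=$(mktemp -d)
    (
        cd "$dir"
        git init -q .
        printf 'hello\n' > hello.txt
        git add hello.txt
        GIT_AUTHOR_NAME="Your Name" GIT_AUTHOR_EMAIL="you@example.com" \
        GIT_COMMITTER_NAME="Your Name" GIT_COMMITTER_EMAIL="you@example.com" \
        GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
            git commit -q -m "Same message"
        git rev-parse HEAD
    )
}

first=$(make_commit "1190962500 -0700")
second=$(make_commit "1190962500 -0700")   # identical inputs
third=$(make_commit "1190962501 -0700")    # one second later
printf '%s\n' "$first" "$second" "$third"
```

The first two identifiers should match even though they come from separate repositories, while the third differs because a single input, the date, changed.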
+
+#### 2.7.6 Fixing up a broken commit (before anyone else sees it)
+
+So now that we've cloned a local repository, made a change to the
+code, set up our name and email address, and made a commit with a
+careful message, we're just about ready to share our change with the
+world. But wait, we forgot to try to compile it, didn't we?
+
+       $ make
+       cc    -c -o hello.o hello.c
+       hello.c:10:9: warning: missing terminating " character
+       hello.c:10:9: warning: missing terminating " character
+       hello.c: In function ‘main’:
+       hello.c:10: error: missing terminating " character
+       hello.c:11: error: expected ‘)’ before ‘;’ token
+       hello.c:13: warning: passing argument 1 of ‘printf’ makes pointer from integer without a cast
+       hello.c:13: error: expected ‘;’ before ‘}’ token
+       make: *** [hello.o] Error 1
+
+Oh look. The code's broken and doesn't compile. We don't want to share
+code in this state. For situations where you notice one tiny detail
+that got left out of the last commit, (a silly syntax error, a
+misspelling in a comment or commit message), git provides a very
+handy tool for just changing the last commit.
+
+So fix that typo, (a missing 'n' between the '\' and the '"'), with
+your editor or with something like this:
+
+       sed -i 's/\\"/\\n"/' hello.c
+
+And then you can just amend the previous commit rather than creating a
+new one with the --amend option to "git commit":
+
+       $ git commit -a --amend
+
+Note that we use -a to include the code change here. That also points
+out a situation where "git commit" is useful without the -a option:
+"git commit --amend" alone is a useful command for amending just the
+last commit message, without committing any new code changes, even if
+some files have been modified in the working tree.
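
Here is a sketch of that message-only case, assuming a scratch repository and a placeholder file (notes.txt) instead of hello.c:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Your Name"
git config user.email "you@example.com"

printf 'first draft\n' > notes.txt
git add notes.txt
git commit -q -m "Add notse"           # typo in the message
before=$(git rev-parse HEAD)

# Leave an uncommitted change in the working tree, then fix only the
# message. Without -a, the modified file is not included.
printf 'not ready yet\n' >> notes.txt
git commit -q --amend -m "Add notes"
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)     # still a single commit
pending=$(git diff --name-only)        # notes.txt is still modified
subject=$(git log -n 1 --format=%s)
```

The amended commit replaces the old one, so the history still contains one commit, but its identifier changes because the message is one of the inputs to it.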
+
+And here's the final result:
+
+       $ git show
+       commit 839b58d021c618bd0e1d336d4d5878a0082672e6
+       Author: Carl Worth <cworth@cworth.org>
+       Date:   Thu Sep 27 23:55:00 2007 -0700
         
+           Added an extra line of output and fixed the typo bug.
+       
+       diff --git a/hello.c b/hello.c
+       index 9a3ff79..ca750e0 100644
+       --- a/hello.c
+       +++ b/hello.c
+       @@ -7,6 +7,7 @@
+        
+        int main(int argc, char **argv)
+        {
+       -       printf("hello, world!\");
+       +       printf("hello, world!\n");
+       +       printf("hello again!\n");
+               return 0;
+        }
  
-We refer to the newest revision in the repository as the tip revision,
-or simply the tip.
+I can't help but point out that this really was a poor example for
+--amend. The end result is a single commit that does two independent
+things, (fixes one bug and adds one new feature). It's much better to
+create a code history where each commit makes an independent change,
+(and as small as possible). This is important for several reasons:
+
+  * Small changes are easier to review.
+
+  * Independent changes are easier to split up if only part of the
+    series gets accepted "upstream" for one reason or another.
+
+  * The smaller the changes are the more useful the history will be
+    when actually using the history, not just viewing it. This is
+    particularly important when doing "git bisect"---that's a powerful
+    tool for isolating the single commit that introduces a bug. And
+    it's much more powerful if the commit it isolates is as small as
+    possible.
+
+So it's a good thing this document is available under a license that
+allows for distribution of modified versions. Someone should clean up
+the --amend example to not teach bad habits like I did above. [Note:
+All this bad-habit stuff was introduced by me, and was not present in
+Bryan's original chapter. -Carl]
  
  ### 2.8  Sharing changes