## Chapter 2  
A tour of git: the basics

### 2.0 Copyright

This document is a modified version of a work originally titled
"Distributed revision control with Mercurial" and authored by Bryan
O’Sullivan. The original document was obtained from
<http://hgbook.red-bean.com/>.

Copyright © 2006, 2007 Bryan O’Sullivan.

This material may be distributed only subject to the terms and
conditions set forth in version 1.0 of the Open Publication
License. Please refer to Appendix D for the license text.

As this is a modified version, the name of Bryan O'Sullivan is used
only to properly credit him with the original text. The appearance of
his name here explicitly does not assert or imply his endorsement of
this modified document.

Portions Copyright © 2007 Carl Worth.

Changes made by Carl include the following:

  * 2007-09-27:
    * Convert from HTML to markdown source syntax
    * Eliminate all content except Chapter 2 and Appendix D
    * Eliminate line numbers from examples
    * Modified to describe git instead of mercurial

### 2.1  Installing git on your system

Prebuilt binary packages of git are available for many popular
operating systems. These make it easy to start using git on your
computer immediately.

#### 2.1.1  Linux

Because each Linux distribution has its own packaging tools, policies,
and rate of development, it’s difficult to give a comprehensive set of
instructions on how to install git binaries. The version of
git that you will end up with can vary depending on how active
the person is who maintains the package for your distribution.

To keep things simple, I will focus on installing git from the
command line under the most popular Linux distributions. Most of these
distributions provide graphical package managers that will let you
install git with a single click. The package name to look for is
often git, but is sometimes git-core (due to an unfortunate name
clash with another package called git, the GNU Interactive Tools).

  * Debian 

	apt-get install git-core

  * Fedora Core 

	yum install git

  * Gentoo 

	emerge git

  * OpenSUSE 

	zypper install git

  * Ubuntu 

	apt-get install git

#### 2.1.2  Mac OS X

A git-core package is available through
[macports](http://macports.org). Once macports is enabled, the command
to install git is:

	port install git-core

#### 2.1.3  Windows

Git has long been available as part of cygwin, and works reasonably
well in that environment. Some people find cygwin a particularly
inelegant approach to running git and would prefer a "native"
solution. To this end, the [msysgit
project](http://code.google.com/p/msysgit/) is rapidly putting
together a solution including various packages with full
installers. These include GitMe, a package to install the entire
development environment necessary to work on improving the msysgit
port of git, and WinGit, a package for installing just git itself
without the development environment, (still in Alpha as of September
2008).

### 2.2  Getting started

To begin, we’ll use the “git version” command to find out whether git
is actually installed properly. Versions 1.5 and newer of git are much
more friendly to new users than versions 1.4 and older. If you aren't
yet running version 1.5 or newer, it's highly recommended that you
upgrade.

	$ git version   
	git version 1.5.3.2

#### 2.2.1  Built-in help

Git provides a built-in help system. This is invaluable for those
times when you find yourself stuck trying to remember how to run a
command. If you are completely stuck, simply run “git help”; it will
print a brief list of commonly-used commands, along with a description
of what each does. If you ask for help on a specific command (such as
"git help init"), it prints more detailed information. [XXX: Does "git
help <foo>" work universally as a built-in or does it expect man to be
present and just call out to "man git-<foo>"?]

	[XXX: The original hgbook includes the complete output of "hg
	help init" at this point. I'm not including the corresponding
	"git help init" output as it would be excessively long. The
	description alone is quite reasonable, (other than a
	not-too-helpful aside about the obsolete git-init-db command),
	but it only comes after a full screen's worth of options
	details. Might it make sense to have a more summarized help
	output for "git help <foo>" than all of the documentation
	available for git-<foo>? And perhaps also provide a "git -v
	help" similar to "hg -v help" for more?]

### 2.3  Working with a repository

In git, everything happens inside a repository. The repository
for a project contains all of the files that “belong to” that project,
along with a historical record of the project’s files.

There’s nothing particularly magical about a repository; it is simply
a directory tree in your filesystem that git treats as
special. You can rename or delete a repository any time you like,
using either the command line or your file browser.

#### 2.3.1  Making a local copy of a repository

Copying a repository is just a little bit special. While you could use
a normal file copying command to make a copy of a repository, it’s
best to use a built-in command that git provides. This command
is called “git clone”, because it creates an identical copy of an
existing repository.

	$ git clone git://cworth.org/git/hello
	Initialized empty Git repository in /tmp/hello/.git/
	remote: Generating pack...
	remote: Done counting 15 objects.
	remote: Deltifying 15 objects...
	remote:  100% (15/15) done
	remote: Total 15 (delta 2), reused 15 (delta remote: 2)
	Indexing 15 objects...
	 100% (15/15) done
	Resolving 2 deltas...
	 100% (2/2) done

If for some reason you are prevented from talking on the git: port,
then there is also the capability to clone a repository (less
efficiently) over http:

	$ git clone http://cworth.org/git/hello
	Initialized empty Git repository in /tmp/hello/.git/
	Getting alternates list for http://cworth.org/git/hello
	Getting pack list for http://cworth.org/git/hello
	Getting index for pack 04ecb061314ecbd60fa0610ecf55a1cbf85ea294
	Getting pack 04ecb061314ecbd60fa0610ecf55a1cbf85ea294
	 which contains a1a0e8b392b17caf50325498df54802fe3c03710
	walk a1a0e8b392b17caf50325498df54802fe3c03710
	walk 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	walk 13ed136b983a9c439eddeea8a1c2076cffbb685f
	walk 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
	walk db7117a9dd9a6e57e8632ea5848e1101eee0fbde

If our clone succeeded, we should now have a local directory called
hello. This directory will contain some files.

	$ ls -l
	total 4
	drwxr-xr-x 3 cworth cworth 4096 2007-09-27 16:40 hello
	$ ls hello
	hello.c  Makefile

These files have the same contents and history in our repository as
they do in the repository we cloned.

Every git repository is complete, self-contained, and
independent. It contains its own private copy of a project’s files and
history. A cloned repository remembers the location of the repository
it was cloned from, but it does not communicate with that repository,
or any other, unless you tell it to.

What this means for now is that we’re free to experiment with our
repository, safe in the knowledge that it’s a private “sandbox” that
won’t affect anyone else.

#### 2.3.2  What’s in a repository?

When we take a more detailed look inside a repository, we can see that
it contains a directory named .git. This is where git keeps all
of its metadata for the repository.

	$ cd hello
	$ ls -a
	.  ..  .git  hello.c  Makefile

The contents of the .git directory and its subdirectories are private
to git. Every other file and directory in the repository is
yours to do with as you please.

To introduce a little terminology, the .git directory is the “real”
repository, and all of the files and directories that coexist with it
are said to live in the working directory. An easy way to remember the
distinction is that the repository contains the history of your
project, while the working directory contains a snapshot of your
project at a particular point in history.
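
To see this distinction from the command line, here is a small sketch. The function name repo_layout is made up for this example; the two "git rev-parse" options it calls report where git keeps its metadata and where the working directory begins.

```shell
# Print where git keeps its metadata and where the working directory begins.
# Run this from anywhere inside a repository.
repo_layout() {
    echo "repository:        $(git rev-parse --git-dir)"
    echo "working directory: $(git rev-parse --show-toplevel)"
}
```

Running repo_layout from the top of the hello clone should report the .git directory as the repository and the hello directory itself as the working directory.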

### 2.4  A tour through history

One of the first things we might want to do with a new, unfamiliar
repository is understand its history. The “git log” command gives us a
view of history.

	$ git log
	commit a1a0e8b392b17caf50325498df54802fe3c03710
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 15:43:07 2005 -0700
	
	    Trim comments.
	
	commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:58 2005 -0700
	
	    Get make to generate the final binary from a .o file.
	
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.
	
	commit 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
	Author: Bryan O'Sullivan <mpm@selenic.com>
	Date:   Fri Aug 26 01:21:28 2005 -0700
	
	    Create a makefile
	
	commit db7117a9dd9a6e57e8632ea5848e1101eee0fbde
	Author: Bryan O'Sullivan <mpm@selenic.com>
	Date:   Fri Aug 26 01:20:50 2005 -0700
	
	    Create a standard "hello, world" program

By default, this command prints a brief paragraph of output for each
change to the project that was recorded. In git terminology, we
call each of these recorded events a commit.

The fields in a record of output from “git log” are as follows. 

  * commit This field consists of a string of 40 hexadecimal characters.
    This is a unique identifier for referring to particular commits.
  * Author The identity of the person who authored the commit. This
    field consists of two sub-fields for the user's name and email
    address (or at least an email-like identifier). Note that git
    stores a separate "Committer" field for the person who committed
    the change (since an author will often email a change to a
    maintainer who commits it). The "git log" command doesn't display
    the Committer by default, but other git tools do.
  * Date The date and time at which the commit was authored (again
    stored separately from the date the change was committed). The
    date and time are displayed in the timezone of the person who
    created the commit.
  * commit message The text message that the creator of the commit
    entered to describe the commit (generally a one-line summary
    followed by more supporting text).

The default output printed by “git log” is purely a summary; it is
missing a lot of detail.

Figure [2.1][8] provides a graphical representation of the history of
the hello repository, to make it a little easier to see which
direction history is “flowing” in. We’ll be returning to this figure
several times in this chapter and the chapter that follows.

* * *

![PIC][9]   

Figure 2.1: 
Graphical history of the hello repository

* * *

#### 2.4.1  Commits, revisions, and talking to other people

As English is a notoriously sloppy language, and computer science has
a hallowed history of terminological confusion (why use one term when
four will do?), revision control has a variety of words and phrases
that mean the same thing. If you are talking about git history
with other people, you will find that what we have called a “commit”
is often called a "revision". In other systems, a similar notion
is referred to as a "changeset". You might even see abbreviations of
these terms such as "rev", "change", or even "cset".

While it may not matter much what word you use to refer to the concept
of “a commit”, it's important to know how to name “a specific
commit”. We have already seen one means of referring to a particular
commit, the 40-character hexadecimal string shown by "git log". These
commit identifiers are powerful because they are permanent, unique
identifiers that always identify the same commit in any copy of a
repository. If two users are examining a working directory associated
with the same commit identifier, then those two users have precisely
the same contents in all files, and exactly the same history leading
to that commit.

So there are places where it is often important to archive the
complete commit identifier, (perhaps in bug-tracking systems to
indicate a specific commit that fixes a bug, for example). But often,
in more casual settings, it's more convenient to use abbreviated
commit identifiers. Git accepts any unique prefix of a commit
identifier (and for reasonably-sized projects the first 8 or 10
characters are almost always unique).
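
As a rough sketch of moving between the two forms, the helpers below wrap "git rev-parse". The names full_id and short_id are invented for this example, and the default length of 8 is simply the figure mentioned above, not anything git requires.

```shell
# Expand an abbreviated commit identifier (or any other commit name)
# to the full identifier. Fails if the prefix is ambiguous or unknown.
full_id() {
    git rev-parse --verify "$1^{commit}"
}

# Abbreviate a commit name to a prefix of at least N characters
# (default 8). Git lengthens the prefix if needed to keep it unique.
short_id() {
    git rev-parse --short="${2:-8}" "$1"
}
```

For example, "short_id HEAD" prints a short form you might quote in an email, and passing that short form to full_id recovers the complete identifier for archiving.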

Besides the permanent commit identifiers, git also provides
transient names whose targets change over time. In fact, in day-to-day
use of git, you will probably use these names more than commit
identifiers. One example is branch names, such as the default
"master" branch in any git repository, or project-specific branch
names such as "stable", "experimental", or "crazy-insane-changes". Git
also provides a special name "HEAD" which always refers to the current
branch.

#### 2.4.2 Naming related commits

Git offers simple ways to name revisions that are related to
particular revisions in the history. One syntax is the ~ suffix, which
refers to the parent of a commit or, if followed by a number N, to the
ancestor N generations back (following the first parent at each
step). For example, since "HEAD" refers to the most recent commit
in the current branch, "HEAD~" refers to the previous commit, and
"HEAD~2" refers to two commits back in the history.

Another useful syntax is .., which can be used to specify a range of
commits. So "origin..master" specifies every commit that is reachable
from master but not from origin; that is, everything that has been
committed to master since it diverged from origin.
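
Here is a minimal sketch of both syntaxes in use. The function names nth_back and commits_between are hypothetical, and the --pretty formats are just one convenient way to keep the output short.

```shell
# Print the summary line of the commit N steps back from HEAD.
nth_back() {
    git log -n 1 --pretty=format:%s "HEAD~$1"
}

# List the commits reachable from the second name but not from the
# first, one per line.
commits_between() {
    git log --pretty=oneline "$1..$2"
}
```

In the hello repository, "nth_back 2" should print the summary of the commit two steps behind the most recent one, and "commits_between HEAD~2 HEAD" should list the two most recent commits.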

#### 2.4.3  Viewing specific revisions

You can use "git log" to explore the range syntax just introduced. For
example, to see a list of the most recent 3 revisions you can use
"HEAD~3..", (the destination of the range is implicitly HEAD in this
case):

	$ git log HEAD~3..
	commit a1a0e8b392b17caf50325498df54802fe3c03710
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 15:43:07 2005 -0700
	
	    Trim comments.
	
	commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:58 2005 -0700
	
	    Get make to generate the final binary from a .o file.
	
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.

#### 2.4.4 Other log filters

Besides filtering by commit identifiers, git allows you to easily
filter the log output according to which files (or directories) are
modified. List them after "--", which is necessary to distinguish
commit names from file names:

	$ git log -- Makefile
	commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:58 2005 -0700
	
	    Get make to generate the final binary from a .o file.
	
	commit 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
	Author: Bryan O'Sullivan <mpm@selenic.com>
	Date:   Fri Aug 26 01:21:28 2005 -0700
	
	    Create a makefile

And "git log" can also filter based on the dates at which commits were
created:

	$ git log --since="2 weeks ago" --until="yesterday"

Another useful option is -n or --max-count which, unsurprisingly,
limits the maximum number of commits to be displayed.
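
These filters combine freely. As a sketch, the hypothetical helper below (recent_changes is a name made up here, and the default of 5 is arbitrary) limits the log both by path and by count:

```shell
# Show at most COUNT (default 5) commits that touched PATH, one per line.
recent_changes() {
    git log --max-count="${2:-5}" --pretty=oneline -- "$1"
}
```

So "recent_changes Makefile 1" would show only the most recent commit that modified the Makefile.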

#### 2.4.5  More detailed information

While the default information printed by “git log” is useful if you
already know what you’re looking for, you may need to see more details
of the change, such as the "diffstat" information with --stat:

	$ git log --stat --max-count=3
	commit a1a0e8b392b17caf50325498df54802fe3c03710
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 15:43:07 2005 -0700
	
	    Trim comments.
	
	 hello.c |    8 ++------
	 1 files changed, 2 insertions(+), 6 deletions(-)
	
	commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:58 2005 -0700
	
	    Get make to generate the final binary from a .o file.
	
	 Makefile |    2 ++
	 1 files changed, 2 insertions(+), 0 deletions(-)
	
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.
	
	 hello.c |    2 +-
	 1 files changed, 1 insertions(+), 1 deletions(-)

Or perhaps you'd like to see the actual patch content of each change,
which you can get with -p. That commit with the word typo in its
message looks suspicious, so let's take a closer look. Remember that we
can name it as master~2, HEAD~2, or any prefix of its commit
identifier (such as 13ed136b):

	$ git log -p -n 1 13ed136b
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.
	
	diff --git a/hello.c b/hello.c
	index ed55ec0..80b260c 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -11,6 +11,6 @@
	 
	 int main(int argc, char **argv)
	 {
	-       printf("hello, world!\n");
	+       printf("hello, world!\");
	        return 0;
	 }

Of course, wanting to see all this information for a single commit is
such a common operation that it's given its own name in git, "git
show". So "git show 13ed136b" is a much easier way to get exactly the
same output:

	$ git show 13ed136b
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.
	
	diff --git a/hello.c b/hello.c
	index ed55ec0..80b260c 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -11,6 +11,6 @@
	 
	 int main(int argc, char **argv)
	 {
	-       printf("hello, world!\n");
	+       printf("hello, world!\");
	        return 0;
	 }

### 2.5  All about command options

Let’s take a brief break from exploring git commands to discuss
a pattern in the way that they work; you may find this useful to keep
in mind as we continue our tour.

Git has a consistent and straightforward approach to dealing
with the options that you can pass to commands. It follows the
conventions for options that are common to modern Linux and Unix
systems.

  * Most options have long names. For example, as we’ve already seen,
    the “git log” command accepts a --max-count=<number> option.
  * Some options have short, single-character names. Often these are
    aliases for long options (such as "-n <number>" instead of
    --max-count=<number>), but sometimes the option exists in
    short form with no long-form equivalent (such as -p). [XXX: It
    wouldn't hurt to fix this by adding --patch, etc. right?]
  * Long options start with two dashes (e.g. --max-count), while short
    options start with one (e.g. -n).

  * Option naming and usage is consistent across commands. For
    example, every command that lets you specify a commit identifier
    or range will accept the same expressions, (HEAD~3,
    origin..master, 72d4f10e, etc), while any command that can be
    limited by paths will accept the same expressions ("-- doc/
    some-file.c"), etc.

Many commands that print output of some kind can be made more quiet by
passing the -q or --quiet options.

### 2.6  Making and reviewing changes

Now that we have a grasp of viewing history in git, let’s take a
look at making some changes and examining them.

The first thing we’ll do is isolate our experiment in a repository of
its own. We use the “git clone” command, but we don’t need to clone a
copy of the remote repository. Since we already have a copy of it
locally, we can just clone that instead. This is much faster than
cloning over the network, and cloning a local repository uses less
disk space in most cases, too.

	$ cd ..
	$ git clone hello my-hello
	Initialized empty Git repository in /home/cworth/src/hgbook-git/my-hello/.git/
	0 blocks

	[XXX We say "empty" here, (presumably from the git-init part),
	but shouldn't the command also report the successful clone
	which makes it non-empty? And what the heck does "0 blocks"
	mean?]

As an aside, it’s often good practice to keep a “pristine” copy of a
remote repository around, which you can then make temporary clones of
to create sandboxes for each task you want to work on. This lets you
work on multiple tasks in parallel, each isolated from the others
until it’s complete and you’re ready to integrate it back. Because
local clones are so cheap, there’s almost no overhead to cloning and
destroying repositories whenever you want.

Alternatively, you can achieve much the same effect by creating
multiple branches in a single repository, (but we won't go into detail
on how to do that in this chapter). Some people greatly appreciate
having multiple branches in a single repository rather than having
many repositories cluttering up their filesystem. Other people prefer
the ability to have working-tree changes, and intermediate build
files, etc. each isolated in a separate repository per branch. Both
modes are very well-supported by git, so it's really a matter of which
you find most appropriate at any time given your tastes and project
workflows.

In our my-hello repository, we have a file hello.c that contains the
classic “hello, world” program. Let’s use the ancient and venerable
sed command to edit this file so that it prints a second line of
output. (I’m only using sed to do this because it’s easy to write a
scripted example this way. Since you’re not under the same constraint,
you probably won’t want to use sed; simply use your preferred text
editor to do the same thing.)

	$ sed -i '/printf/a\\tprintf("hello again!\\n");' hello.c       

The “git status” command will tell us what git knows about the files
in the repository.

	$ ls 
	hello.c  Makefile
	$ git status
	# On branch master
	# Changed but not updated:
	#   (use "git add <file>..." to update what will be committed)
	#
	#       modified:   hello.c
	#
	no changes added to commit (use "git add" and/or "git commit -a")

We see that the “git status” command prints a line with "modified" for
hello.c. The “git status” command will not print any output for files
that have not been modified.

Notice that we didn’t need to inform git that we were going to modify
the file before we started, or that we had modified the file after we
were done; it was able to figure this out itself.

It’s a little bit helpful to know that we’ve modified hello.c, but we
might prefer to know exactly what changes we’ve made to it. To do
this, we use the “git diff” command.

	$ git diff
	diff --git a/hello.c b/hello.c
	index 9a3ff79..6d28887 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -8,5 +8,6 @@
	 int main(int argc, char **argv)
	 {
	        printf("hello, world!\");
	+       printf("hello again!\n");
	        return 0;
	 }
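
If you want to ask the same questions from a script, the following sketch may help. The function names are invented for this example; --quiet and --stat are options of "git diff" that suppress the patch output and summarize it per file, respectively.

```shell
# Succeeds (exit status 0) when tracked files have modifications that
# have not been staged yet, that is, when "git diff" would print something.
has_unstaged_changes() {
    ! git diff --quiet
}

# Summarize the pending changes per file instead of printing full patches.
pending_summary() {
    git diff --stat
}
```

After the sed edit above, has_unstaged_changes should succeed and pending_summary should mention hello.c.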

### 2.7  Recording changes in a new commit

We can modify files, build and test our changes, and use “git status”
and “git diff” to review our changes, until we’re satisfied with what
we’ve done and arrive at a natural stopping point where we want to
record our work in a new commit.

The “git commit” command lets us create a new commit; we’ll usually
refer to this as “making a commit” or “committing”.

#### 2.7.1  Setting up a username

When you try to run “git commit” for the first time, it might not do
exactly what you want. Git records your name and address with each
change that you commit, (as both author and committer unless you tell
it otherwise), so that you and others will later be able to tell who
made each change. Git tries to automatically figure out a sensible
name and address to attribute to both author and committer. It will
attempt each of the following methods, in order (stopping for each
field as soon as a value is found):

  1. If you specify a --author option to the “git commit” command on
     the command line, followed by a "Real Name <email@example.com>"
     string, then this name and address will be used for the author
     fields. The committer fields will still be determined as
     below. This option is very helpful when applying a commit
     originally authored by someone other than yourself.
  2. If any of the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL,
     GIT_COMMITTER_NAME, or GIT_COMMITTER_EMAIL environment variables
     are set, then those values will be used for the corresponding
     fields.
  3. If you have a file in the local repository called .git/config,
     with name or email settings in the [user] section, then these
     values will be used to set any remaining author and committer
     fields.
  4. If you have a file in your home directory called .gitconfig,
     again with name or email settings in the [user] section, then
     these values will be used to set any remaining author and
     committer fields. For more details on the contents of this file,
     refer to “Creating a git configuration file” below.
  5. If you have set the EMAIL environment variable, this will be used
     to set author and committer email addresses if still unset.
  6. Git will query your system to find out your real name from the
     available GECOS field, and your username, hostname, and domain to
     construct an email address (or at least an identifier resembling
     an email address).

If all of these mechanisms fail, "git commit" will fail, printing an
error message instructing you how to use "git config" to tell git your
name and email address.
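
To check what git has settled on before you commit, you can ask it directly. This is only a sketch: who_am_i is a made-up name, while "git var" is the git command that prints the identity strings git would record.

```shell
# Print the author and committer identities git would record right now,
# each in the form "Name <email> timestamp timezone".
who_am_i() {
    echo "author:    $(git var GIT_AUTHOR_IDENT)"
    echo "committer: $(git var GIT_COMMITTER_IDENT)"
}
```

Exporting GIT_AUTHOR_NAME before calling who_am_i should change only the name on the author line, which is a quick way to watch the lookup order in action.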

You should think of the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL,
GIT_COMMITTER_NAME, and GIT_COMMITTER_EMAIL environment variables
and the --author option to the “git commit” command as ways
to override git’s default selection. For normal use, the simplest and
most robust way to set your information is by creating a .gitconfig
file (either manually or with the "git config" command); see below
for details.

##### Creating a git configuration file

To set your name and email address, just use the following commands:

	git config --global user.name "Your Name"
	git config --global user.email "you@example.com"

The --global option means that this command will set global
information (affecting all of your repositories on this machine) in
the .gitconfig file in your home directory. Alternatively, you could
omit --global, which would make the change take effect only in the
local repository. This is convenient if you want to have different
email addresses associated with different projects, for example.

Of course, git's configuration file is a simple-to-edit plain-text
file, so instead of using the above commands, you can also just edit
the files directly. Use your favorite editor to create a file called
.gitconfig in your home directory, (or if you ran the above commands
then it will be there already). The initial contents of your
.gitconfig should look like this.

	# This is a git configuration file.   
	[user]
		name = Your Name
		email = you@example.com

Similarly, you can make a repository-specific configuration by editing
.git/config in the local repository. It will already have some
sections present (created by "git clone"); just add a [user]
section as above.

The “[user]” line begins a section of the config file, so you can read
the “name = ...” line as meaning “set the value of the name item in
the user section”. This is the same notion expressed with the
"user.name" syntax on the git-config command line.  A section
continues until a new section begins, or the end of the file. Git
ignores empty lines and treats any text from “#” to the end of a line
as a comment.
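
The correspondence between the file syntax and the section.key syntax can be tried out without touching your real configuration. In this sketch, config_item is a hypothetical helper; it relies on the --file option of "git config" to read from an arbitrary file.

```shell
# Read one item from a git-style configuration file.
# $1 is the file to read, $2 the item in section.key form (e.g. user.name).
config_item() {
    git config --file "$1" --get "$2"
}
```

Given a file with the contents shown earlier, "config_item that-file user.name" should print the value of the name item in the user section.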

##### Choosing a user name

You can use any text you like as the value of the name and email
configuration items, since this information is for reading by other
people, not for interpreting by git. It is conventional to use a valid
email address, but some, (notably Linus Torvalds, the original author
of git), actually like the default user@hostname convention that git
falls back on without any additional information. There's no
requirement that the email address actually be valid, and perhaps it's
useful to be reminded which machine was used to create particular
commits.

#### 2.7.2  Writing a commit message

When we commit a change, git drops us into a text editor to
enter a message that will describe the modifications we’ve made in
this commit. This is called the commit message. It will be a record
for readers of what we did and why, and it will be printed by “git log”
after we’ve finished committing.

	$ git commit -a

Note: The -a on the command-line instructs git to commit all changes
to tracked files. Without this, "git commit" will only commit changes
that have been previously staged for committing with "git add
file". The most common usage is to commit with "git commit -a" and
only use "git add file; git commit" when there is a need to commit
only some subset of changes that have been made.
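
For the less common case of committing only a subset, a sketch might look like the following. The name commit_one_file is invented here, and the -m option (which supplies the message on the command line instead of opening an editor) has not been introduced in this chapter. Note that anything already staged with an earlier "git add" would be committed as well.

```shell
# Stage a single file and commit it, leaving modifications to other
# files uncommitted in the working directory.
# $1 is the file to commit, $2 the commit message.
commit_one_file() {
    git add -- "$1" && git commit -q -m "$2"
}
```

After this runs, "git status" should still list any other modified files as changed but not committed.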

The editor that the “git commit” command drops us into will contain an
empty line, followed by a number of lines starting with “#”.

	empty line
	# Please enter the commit message for your changes.
	# (Comment lines starting with '#' will not be included)
	# On branch master
	# Changes to be committed:
	#   (use "git reset HEAD <file>..." to unstage)
	#
	#       modified:   hello.c
	#	

git ignores the lines that start with “#”; it uses them only
to tell us which files it’s recording changes to. Modifying or
deleting these lines has no effect.

#### 2.7.3  Writing a good commit message

A good commit message will generally have a single line that
summarizes the commit, a blank line, and then one or more paragraphs
with supporting detail. Since many tools only print the first line of
a commit message by default, it’s important that the first line stands
alone.

One example of a first-line-only viewer is "git log
--pretty=short". Other examples include graphical history viewers such
as gitk and gitview, and web-based viewers such as gitweb and cgit.

Here’s a real example of a commit message that doesn’t follow
this guideline, and hence has a summary that is not readable.

	$ git log --pretty=short
	commit 3ef5535144da88a854f7930503845cd44506c2e2
	Author: Censored Person <censored.person@example.org>
	
	    include buildmeister/commondefs.   Add an exports and install

As far as the remainder of the contents of the commit message are
concerned, there are no hard-and-fast rules. git itself doesn’t
interpret or care about the contents of the commit message, though
your project may have policies that dictate a certain kind of
formatting.

My personal preference is for short, but informative, commit messages
that tell me something that I can’t figure out with a quick glance at
the output of “git log -p".
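
If you would like a mechanical check of the shape described in this section, here is one possible sketch. The helper name check_message_shape is made up, and it tests only the structure (a non-empty first line, then a blank line or nothing), not whether the summary is any good.

```shell
# Check that a commit message file starts with a non-empty summary line
# and that the second line, if there is one, is blank.
# Returns 0 if the shape looks right, non-zero otherwise.
check_message_shape() {
    [ -n "$(sed -n '1p' "$1")" ] && [ -z "$(sed -n '2p' "$1")" ]
}
```

A message whose summary spills onto the second line, like the example shown earlier in this section, would fail this check.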

#### 2.7.4  Aborting a commit

If you decide that you don’t want to commit while in the middle of
editing a commit message, simply exit from your editor without saving
the file that it’s editing. This will cause nothing to happen to
either the repository or the working directory.

#### 2.7.5  Admiring our new handiwork

Once we’ve finished the commit, we can use the “git show” command to
display the commit we just created. As discussed previously, this
command produces output that is identical to “git log -p”, but for
only a single revision, (and the most recent revision by default):

	$ git show
	commit 018cfb742be6176443ffddac454e593e802ddf3e
	Author: Carl Worth <cworth@cworth.org>
	Date:   Thu Sep 27 23:55:00 2007 -0700
	
	    Added an extra line of output.
	    
	    If I would have been clever I would have fixed that old typo
	    while I was at it...
	
	diff --git a/hello.c b/hello.c
	index 9a3ff79..6d28887 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -8,5 +8,6 @@
	 int main(int argc, char **argv)
	 {
	        printf("hello, world!\");
	+       printf("hello again!\n");
	        return 0;
	 }

Note that you will not see the same commit identifier for your commit,
even if the change you made is identical to mine. The commit
identifier incorporates not only the contents of the files, but also
the commit message, the author and committer names and emails, and the
author and commit dates. (OK, so now you probably know enough to be
able to guess the right command to produce a commit with exactly the
commit identifier shown above. Can you do it?)
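
You can look at the ingredients that go into the identifier yourself. As a sketch (commit_fields is a name invented for this example), "git cat-file -p" prints the raw fields stored in a commit:

```shell
# Print the raw contents of a commit (default HEAD): the tree it records,
# any parent commits, the author and committer lines with their dates,
# and finally the commit message.
commit_fields() {
    git cat-file -p "${1:-HEAD}"
}
```

Comparing this output for two commits that look alike in "git show" is a handy way to find out which field made their identifiers differ.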

#### 2.7.6 Fixing up a broken commit (before anyone else sees it)

So now that we've cloned a local repository, made a change to the
code, set up our name and email address, and made a commit with a
careful message, we're just about ready to share our change with the
world. But wait, we forgot to try to compile it, didn't we?

	$ make
	cc    -c -o hello.o hello.c
	hello.c:10:9: warning: missing terminating " character
	hello.c:10:9: warning: missing terminating " character
	hello.c: In function ‘main’:
	hello.c:10: error: missing terminating " character
	hello.c:11: error: expected ‘)’ before ‘;’ token
	hello.c:13: warning: passing argument 1 of ‘printf’ makes pointer from integer without a cast
	hello.c:13: error: expected ‘;’ before ‘}’ token
	make: *** [hello.o] Error 1

Oh look. The code's broken and doesn't compile. We don't want to share
code in this state. For situations where you notice one tiny detail
that got left out of the last commit, (a silly syntax error, a
misspelling in a comment or commit message), git provides a very
handy tool for just changing the last commit.

So fix that typo, (a missing 'n' between the '\' and the '"'), with
your editor or with something like this:

	$ sed -i 's/\\"/\\n"/' hello.c

And then you can just amend the previous commit rather than creating a
new one with the --amend option to "git commit":

	$ git commit -a --amend

Note that we use -a to include the code change here. That also points
out a situation where "git commit" is useful without the -a option:
"git commit --amend" on its own amends just the last commit message,
without committing any new code changes, even if some files have been
modified in the working tree.
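
A quick sketch of the message-only case, again in a scratch repository
with a made-up file and identity. It compares the commit's tree before
and after the amend to confirm that no code change slipped in.

```shell
set -e
repo=$(mktemp -d)              # throwaway directory for this sketch
cd "$repo"
git init -q
git config user.name "Example User"          # made-up identity
git config user.email "user@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notse"                 # oops, a typo in the message
tree_before=$(git rev-parse 'HEAD^{tree}')

# An uncommitted edit that is not ready to go in yet.
echo "two" >> notes.txt

# No -a here, so only the message is rewritten.
git commit -q --amend -m "Add notes"

tree_after=$(git rev-parse 'HEAD^{tree}')
subject=$(git log -1 --format=%s)
pending=$(git status --porcelain)            # still lists notes.txt as modified

echo "subject: $subject"
echo "pending:$pending"
```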

And here's the final result:

	$ git show
	commit 839b58d021c618bd0e1d336d4d5878a0082672e6
	Author: Carl Worth <cworth@cworth.org>
	Date:   Thu Sep 27 23:55:00 2007 -0700
	
	    Added an extra line of output and fixed the typo bug.
	
	diff --git a/hello.c b/hello.c
	index 9a3ff79..ca750e0 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -7,6 +7,7 @@
	 
	 int main(int argc, char **argv)
	 {
	-       printf("hello, world!\");
	+       printf("hello, world!\n");
	+       printf("hello again!\n");
	        return 0;
	 }

I can't help but point out that this really was a poor example for
--amend. The end result is a single commit that does two independent
things, (fixes one bug and adds one new feature). It's much better to
create a code history where each commit makes an independent change,
(and as small as possible). This is important for several reasons:

  * Small changes are easier to review

  * Independent changes are easier to split up if only part of the
    series gets accepted "upstream" for one reason or another.

  * The smaller the changes are the more useful the history will be
    when actually using the history, not just viewing it. This is
    particularly important when doing "git bisect"---that's a powerful
    tool for isolating the single commit that introduces a bug. And
    it's much more powerful if the commit it isolates is as small as
    possible.
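
To make the "git bisect" point concrete, here is a rough sketch on a
scratch repository of five small commits, where the third one sneaks in
a marker line that plays the part of the bug. The file name, the `BUG`
marker, and the grep-based test are invented for the illustration; in a
real project the test would be your build or test suite.

```shell
set -e
repo=$(mktemp -d)              # throwaway directory for this sketch
cd "$repo"
git init -q
git config user.name "Example User"          # made-up identity
git config user.email "user@example.com"

for i in 1 2 3 4 5; do
    echo "line $i" >> notes.txt
    # The third commit also introduces the "bug".
    if [ "$i" -eq 3 ]; then echo "BUG" >> notes.txt; fi
    git add notes.txt
    git commit -q -m "Add line $i"
done

culprit=$(git rev-parse HEAD~2)    # the commit we expect bisect to find
first=$(git rev-parse HEAD~4)      # known-good starting point

# Mark HEAD as bad and the first commit as good, then let git drive.
git bisect start HEAD "$first" >/dev/null
# The test command exits 0 (good) while the marker is absent, 1 (bad) otherwise.
git bisect run sh -c '! grep -q BUG notes.txt' >/dev/null

found=$(git rev-parse refs/bisect/bad)   # first bad commit
git bisect reset >/dev/null 2>&1

echo "found:   $found"
echo "culprit: $culprit"
```

Because each commit only adds a line or two, the commit that bisect
lands on tells you almost exactly where to look.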

So it's a good thing this document is available under a license that
allows for distribution of modified versions. Someone should clean up
the --amend example to not teach bad habits like I did above. [Note:
All this bad-habit stuff was introduced by me, and was not present in
Bryan's original chapter. -Carl]

### 2.8  Sharing changes

We mentioned earlier that repositories in Mercurial are
self-contained. This means that the changeset we just created exists
only in our my-hello repository. Let’s look at a few ways that we can
propagate this change into other repositories.

#### 2.8.1  Pulling changes from another repository

To get started, let’s clone our original hello repository, which does
not contain the change we just committed. We’ll call our temporary
repository hello-pull.

	$ cd ..   
	$ hg clone hello hello-pull   
	2 files updated, 0 files merged, 0 files removed, 0 files unresolved

We’ll use the “hg pull” command to bring changes from my-hello into
hello-pull. However, blindly pulling unknown changes into a repository
is a somewhat scary prospect. Mercurial provides the “hg incoming”
command to tell us what changes the “hg pull” command would pull into
the repository, without actually pulling the changes in.

	$ cd hello-pull   
	$ hg incoming ../my-hello   
	comparing with ../my-hello   
	searching for changes   
	changeset:   5:fa1321bf0c80   
	tag:         tip   
	user:        Bryan O'Sullivan <bos@serpentine.com>   
	date:        Sun Jun 17 18:05:50 2007 +0000   
	summary:     Added an extra line of output   
	

(Of course, someone could cause more changesets to appear in the
repository that we ran “hg incoming” in, before we get a chance to “hg
pull” the changes, so that we could end up pulling changes that we
didn’t expect.)
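
git has no direct twin of “hg incoming”, but a rough counterpart is to
fetch first and then list what arrived. The sketch below rebuilds the
three repositories from scratch with a made-up hello.txt and identity.
Unlike “hg incoming”, “git fetch” does download the commits, though it
leaves the current branch and the working directory alone.

```shell
set -e
work=$(mktemp -d)              # throwaway directory for this sketch
cd "$work"

# Build a tiny "hello" repository.
git init -q hello
(
    cd hello
    git config user.name "Example User"      # made-up identity
    git config user.email "user@example.com"
    echo 'hello, world!' > hello.txt
    git add hello.txt
    git commit -q -m "Initial commit"
)

# A clone with one extra commit.
git clone -q hello my-hello
(
    cd my-hello
    git config user.name "Example User"
    git config user.email "user@example.com"
    echo 'hello again!' >> hello.txt
    git commit -q -a -m "Added an extra line of output"
)

# A second clone of the original, which lacks the new commit.
git clone -q hello hello-pull
cd hello-pull

# Download the other repository's commits; FETCH_HEAD records its tip.
git fetch -q ../my-hello

# List commits reachable from FETCH_HEAD but not from our HEAD.
incoming=$(git log --format=%s HEAD..FETCH_HEAD)
echo "$incoming"
```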

Bringing changes into a repository is a simple matter of running the
“hg pull” command, and telling it which repository to pull from.

	$ hg tip   
	changeset:   4:b57f9a090b62   
	tag:         tip   
	user:        Bryan O'Sullivan <bos@serpentine.com>   
	date:        Tue Sep 06 15:43:07 2005 -0700   
	summary:     Trim comments.   
	
	$ hg pull ../my-hello   
	pulling from ../my-hello   
	searching for changes   
	adding changesets   
	adding manifests   
	adding file changes   
	added 1 changesets with 1 changes to 1 files   
	(run 'hg update' to get a working copy)   
	$ hg tip   
	changeset:   5:fa1321bf0c80   
	tag:         tip   
	user:        Bryan O'Sullivan <bos@serpentine.com>   
	date:        Sun Jun 17 18:05:50 2007 +0000   
	summary:     Added an extra line of output   
	

As you can see from the before-and-after output of “hg tip”, we have
successfully pulled changes into our repository. There remains one
step before we can see these changes in the working directory.

#### 2.8.2  Updating the working directory

We have so far glossed over the relationship between a repository and
its working directory. The “hg pull” command that we ran in
section [2.8.1][12] brought changes into the repository, but if we
check, there’s no sign of those changes in the working directory. This
is because “hg pull” does not (by default) touch the working
directory. Instead, we use the “hg update” command to do this.

	$ grep printf hello.c
	printf("hello, world!\");
	$ hg update tip
	1 files updated, 0 files merged, 0 files removed, 0 files unresolved
	$ grep printf hello.c
	printf("hello, world!\");
	printf("hello again!\n");

It might seem a bit strange that “hg pull” doesn’t update the working
directory automatically. There’s actually a good reason for this: you
can use “hg update” to update the working directory to the state it
was in at any revision in the history of the repository. If you had
the working directory updated to an old revision—to hunt down the
origin of a bug, say—and ran a “hg pull” which automatically updated
the working directory to a new revision, you might not be terribly
happy.

However, since pull-then-update is such a common thing to do,
Mercurial lets you combine the two by passing the -u option to “hg
pull”.

	$ hg pull -u

If you look back at the output of “hg pull” in section [2.8.1][12]
when we ran it without -u, you can see that it printed a helpful
reminder that we’d have to take an explicit step to update the working
directory:

	(run 'hg update' to get a working copy)
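
In git the same split shows up, roughly, as “git fetch” versus “git
pull”: fetch brings commits into the repository without touching the
working directory, and pull does a fetch followed by a merge. Here is a
sketch using scratch repositories with a made-up hello.txt and
identity; `--ff-only` is just a cautious choice for the example.

```shell
set -e
work=$(mktemp -d)              # throwaway directory for this sketch
cd "$work"

git init -q hello
(
    cd hello
    git config user.name "Example User"      # made-up identity
    git config user.email "user@example.com"
    echo 'hello, world!' > hello.txt
    git add hello.txt
    git commit -q -m "Initial commit"
)
git clone -q hello hello-pull

# A new commit appears in the original after we cloned it.
(
    cd hello
    echo 'hello again!' >> hello.txt
    git commit -q -a -m "Added an extra line of output"
)

cd hello-pull

# fetch: the commit arrives, but the working directory is untouched.
git fetch -q origin
lines_after_fetch=$(grep -c . hello.txt)

# pull: fetch plus merge, so the working directory catches up.
git pull -q --ff-only
lines_after_pull=$(grep -c . hello.txt)

echo "after fetch: $lines_after_fetch line(s)"
echo "after pull:  $lines_after_pull line(s)"
```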

To find out what revision the working directory is at, use the “hg
parents” command.

	$ hg parents   
	changeset:   5:fa1321bf0c80   
	tag:         tip   
	user:        Bryan O'Sullivan <bos@serpentine.com>   
	date:        Sun Jun 17 18:05:50 2007 +0000   
	summary:     Added an extra line of output   
	

If you look back at figure [2.1][8], you’ll see arrows connecting each
changeset. The node that the arrow leads from in each case is a
parent, and the node that the arrow leads to is its child. The working
directory has a parent in just the same way; this is the changeset
that the working directory currently contains.

To update the working directory to a particular revision, give a
revision number or changeset ID to the “hg update” command.

	$ hg update 2   
	2 files updated, 0 files merged, 0 files removed, 0 files unresolved   
	$ hg parents   
	changeset:   2:057d3c2d823c   
	user:        Bryan O'Sullivan <bos@serpentine.com>   
	date:        Tue Sep 06 13:15:43 2005 -0700   
	summary:     Introduce a typo into hello.c.   
	
	$ hg update   
	2 files updated, 0 files merged, 0 files removed, 0 files unresolved

If you omit an explicit revision, “hg update” will update to the tip
revision, as shown by the second call to “hg update” in the example
above.
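
The closest git habit is “git checkout” with a commit to move the
working directory back, and “git checkout” with the branch name to
return to the tip. This is only a sketch: the three-commit repository,
file name, and identity are invented, and the branch name is read from
git rather than assumed.

```shell
set -e
repo=$(mktemp -d)              # throwaway directory for this sketch
cd "$repo"
git init -q
git config user.name "Example User"          # made-up identity
git config user.email "user@example.com"

for n in 1 2 3; do
    echo "revision $n" > state.txt
    git add state.txt
    git commit -q -m "Revision $n"
done

branch=$(git symbolic-ref --short HEAD)      # whatever the branch is called

# Move the working directory back to the first commit (two before the tip).
git checkout -q HEAD~2
old_state=$(cat state.txt)
old_head=$(git log -1 --format=%s)           # the commit we are now based on

# Return to the tip of the branch.
git checkout -q "$branch"
new_state=$(cat state.txt)

echo "old: $old_state ($old_head)"
echo "new: $new_state"
```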

#### 2.8.3  Pushing changes to another repository

Mercurial lets us push changes to another repository, from the
repository we’re currently visiting. As with the example of “hg pull”
above, we’ll create a temporary repository to push our changes into.

	$ cd ..   
	$ hg clone hello hello-push   
	2 files updated, 0 files merged, 0 files removed, 0 files unresolved

The “hg outgoing” command tells us what changes would be pushed into
another repository.

	$ cd my-hello   
	$ hg outgoing ../hello-push   
	comparing with ../hello-push   
	searching for changes   
	changeset:   5:fa1321bf0c80   
	tag:         tip   
	user:        Bryan O'Sullivan <bos@serpentine.com>   
	date:        Sun Jun 17 18:05:50 2007 +0000   
	summary:     Added an extra line of output   
	

And the “hg push” command does the actual push. 

	$ hg push ../hello-push   
	pushing to ../hello-push   
	searching for changes   
	adding changesets   
	adding manifests   
	adding file changes   
	added 1 changesets with 1 changes to 1 files

As with “hg pull”, the “hg push” command does not update the working
directory in the repository that it’s pushing changes into. (Unlike
“hg pull”, “hg push” does not provide a -u option that updates the
other repository’s working directory.)

What happens if we try to pull or push changes and the receiving
repository already has those changes? Nothing too exciting.

	$ hg push ../hello-push   
	pushing to ../hello-push   
	searching for changes   
	no changes found
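
For comparison, here is a rough git version of the same exercise. By
default git refuses a push into the branch that another repository has
checked out, so this sketch pushes into a bare repository (one with no
working directory) instead. The repository names, hello.txt, and the
identity are made up, and the fetch-then-log preview is just one way to
approximate “hg outgoing”.

```shell
set -e
work=$(mktemp -d)              # throwaway directory for this sketch
cd "$work"

git init -q hello
(
    cd hello
    git config user.name "Example User"      # made-up identity
    git config user.email "user@example.com"
    echo 'hello, world!' > hello.txt
    git add hello.txt
    git commit -q -m "Initial commit"
)

# A bare repository to receive the push.
git clone -q --bare hello hello-push.git

git clone -q hello my-hello
cd my-hello
git config user.name "Example User"
git config user.email "user@example.com"
echo 'hello again!' >> hello.txt
git commit -q -a -m "Added an extra line of output"

# Preview: commits we have that the target lacks.
git fetch -q ../hello-push.git
outgoing=$(git log --format=%s FETCH_HEAD..HEAD)

# Push the current branch to the branch of the same name.
git push -q ../hello-push.git HEAD

# Pushing again is harmless; there is nothing new to send.
git push -q ../hello-push.git HEAD

local_tip=$(git rev-parse HEAD)
remote_tip=$(git --git-dir=../hello-push.git rev-parse HEAD)
echo "outgoing: $outgoing"
```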

#### 2.8.4  Sharing changes over a network

The commands we have covered in the previous few sections are not
limited to working with local repositories. Each works in exactly the
same fashion over a network connection; simply pass in a URL instead
of a local path.

	$ hg outgoing http://hg.serpentine.com/tutorial/hello   
	comparing with http://hg.serpentine.com/tutorial/hello   
	searching for changes   
	changeset:   5:fa1321bf0c80   
	tag:         tip   
	user:        Bryan O'Sullivan <bos@serpentine.com>   
	date:        Sun Jun 17 18:05:50 2007 +0000   
	summary:     Added an extra line of output   
	

In this example, we can see what changes we could push to the remote
repository, but the repository is understandably not set up to let
anonymous users push to it.

	$ hg push http://hg.serpentine.com/tutorial/hello   
	pushing to http://hg.serpentine.com/tutorial/hello   
	searching for changes   
	ssl required
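
The same holds for git: the commands accept a URL wherever they accept
a path. So that it runs without a server, the sketch below uses a
`file://` URL pointing at a scratch repository as a stand-in for a
network address. The repository, file name, and identity are made up.

```shell
set -e
work=$(mktemp -d)              # throwaway directory for this sketch
cd "$work"

git init -q hello
(
    cd hello
    git config user.name "Example User"      # made-up identity
    git config user.email "user@example.com"
    echo 'hello, world!' > hello.txt
    git add hello.txt
    git commit -q -m "Initial commit"
)

# A file:// URL stands in for a real network address here.
url="file://$work/hello"

# Clone by URL, then ask the "remote" which commit its HEAD points at.
git clone -q "$url" hello-url
remote_tip=$(git ls-remote "$url" HEAD | cut -f1)
local_tip=$(git -C hello-url rev-parse HEAD)

echo "remote: $remote_tip"
echo "local:  $local_tip"
```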

   [1]: http://hgbook.red-bean.com/hgbookch3.html
   [2]: http://hgbook.red-bean.com/hgbookch1.html
   [3]: http://hgbook.red-bean.com/hgbookch1.html#tailhgbookch1.html
   [4]: #tailhgbookch2.html
   [5]: http://hgbook.red-bean.com/hgbook.html#hgbookch2.html
   [6]: http://mercurial.berkwood.com/
   [7]: http://hgbook.red-bean.com/hgbookli4.html#Xweb:macpython
   [8]: #x6-340581
   [9]: hgbookch2_files/tour-history.png
   [10]: http://hgbook.red-bean.com/hgbookch12.html#x16-27100012.4
   [11]: #x6-420002.7.1
   [12]: #x6-490002.8.1
   [13]: http://hgbook.red-bean.com/hgbookch2.html

## Appendix D  
Open Publication License

Version 1.0, 8 June 1999 

### D.1  Requirements on both unmodified and modified versions

The Open Publication works may be reproduced and distributed in whole
or in part, in any medium physical or electronic, provided that the
terms of this license are adhered to, and that this license or an
incorporation of it by reference (with any options elected by the
author(s) and/or publisher) is displayed in the reproduction.

Proper form for an incorporation by reference is as follows: 

Copyright (c) year by author’s name or designee. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, vx.y or later (the latest version is
presently available at
<http://www.opencontent.org/openpub/>).

The reference must be immediately followed with any options elected by
the author(s) and/or publisher of the document (see section D.6).

Commercial redistribution of Open Publication-licensed material is
permitted.

Any publication in standard (paper) book form shall require the
citation of the original publisher and author. The publisher and
author’s names shall appear on all outer surfaces of the book. On all
outer surfaces of the book the original publisher’s name shall be as
large as the title of the work and cited as possessive with respect to
the title.

### D.2  Copyright

The copyright to each Open Publication is owned by its author(s) or
designee.

### D.3  Scope of license

The following license terms apply to all Open Publication works,
unless otherwise explicitly stated in the document.

Mere aggregation of Open Publication works or a portion of an Open
Publication work with other works or programs on the same media shall
not cause this license to apply to those other works. The aggregate
work shall contain a notice specifying the inclusion of the Open
Publication material and appropriate copyright notice.

Severability. If any part of this license is found to be unenforceable
in any jurisdiction, the remaining portions of the license remain in
force.

No warranty. Open Publication works are licensed and provided “as is”
without warranty of any kind, express or implied, including, but not
limited to, the implied warranties of merchantability and fitness for
a particular purpose or a warranty of non-infringement.

### D.4  Requirements on modified works

All modified versions of documents covered by this license, including
translations, anthologies, compilations and partial documents, must
meet the following requirements:

  1. The modified version must be labeled as such. 
  2. The person making the modifications must be identified and the
     modifications dated.
  3. Acknowledgement of the original author and publisher if
     applicable must be retained according to normal academic citation
     practices.
  4. The location of the original unmodified document must be identified. 
  5. The original author’s (or authors’) name(s) may not be used to
     assert or imply endorsement of the resulting document without the
     original author’s (or authors’) permission.

### D.5  Good-practice recommendations

In addition to the requirements of this license, it is requested from
and strongly recommended of redistributors that:

  1. If you are distributing Open Publication works on hardcopy or
     CD-ROM, you provide email notification to the authors of your
     intent to redistribute at least thirty days before your
     manuscript or media freeze, to give the authors time to provide
     updated documents. This notification should describe
     modifications, if any, made to the document.
  2. All substantive modifications (including deletions) be either
     clearly marked up in the document or else described in an
     attachment to the document.
  3. Finally, while it is not mandatory under this license, it is
     considered good form to offer a free copy of any hardcopy and
     CD-ROM expression of an Open Publication-licensed work to its
     author(s).

### D.6  License options

The author(s) and/or publisher of an Open Publication-licensed
document may elect certain options by appending language to the
reference to or copy of the license. These options are considered part
of the license instance and must be included with the license (or its
incorporation by reference) in derived works.

  1. To prohibit distribution of substantively modified versions
     without the explicit permission of the author(s). “Substantive
     modification” is defined as a change to the semantic content of
     the document, and excludes mere changes in format or
     typographical corrections.

     To accomplish this, add the phrase “Distribution of substantively
     modified versions of this document is prohibited without the
     explicit permission of the copyright holder.” to the license
     reference or copy.

  2. To prohibit any publication of this work or derivative works in
     whole or in part in standard (paper) book form for commercial
     purposes is prohibited unless prior permission is obtained from
     the copyright holder.

     To accomplish this, add the phrase “Distribution of the work or
     derivative of the work in any standard (paper) book form is
     prohibited unless prior permission is obtained from the copyright
     holder.” to the license reference or copy.