## Chapter 2
A tour of git: the basics

### 2.0 Copyright

This document is a modified version of a document originally titled
"Distributed revision control with Mercurial" and originally authored
by Bryan O’Sullivan. The original document was obtained from
<http://hgbook.red-bean.com/>.

Copyright © 2006, 2007 Bryan O’Sullivan.

This material may be distributed only subject to the terms and
conditions set forth in version 1.0 of the Open Publication
License. Please refer to Appendix D for the license text.

As this is a modified version, the name of Bryan O'Sullivan is used
only to properly credit him with the original text. The appearance of
his name here explicitly does not assert or imply his endorsement of
this modified document.

Portions Copyright © 2007 Carl Worth.

Changes made by Carl include the following:

  * 2007-09-27:
    * Convert from HTML to markdown source syntax
    * Eliminate all content except Chapter 2 and Appendix D
    * Eliminate line numbers from examples
    * Modified to describe git instead of mercurial

The source of this modified version can be obtained via git:

	git clone git://cworth.org/git/hgbook-git

or

	git clone http://cworth.org/git/hgbook-git

and can be [browsed online](http://git.cworth.org/git/hgbook-git).

### 2.1  Installing git on your system

Prebuilt binary packages of git are available for many popular
operating systems. These make it easy to start using git on your
computer immediately.

#### 2.1.1  Linux

Because each Linux distribution has its own packaging tools, policies,
and rate of development, it’s difficult to give a comprehensive set of
instructions on how to install git binaries. The version of
git that you will end up with can vary depending on how active
the person is who maintains the package for your distribution.

To keep things simple, I will focus on installing git from the
command line under the most popular Linux distributions. Most of these
distributions provide graphical package managers that will let you
install git with a single click. The package name to look for is
often git, but is sometimes git-core, (due to an unfortunate name
collision with GNU Interactive Tools, which is also called git).

  * Debian

	apt-get install git-core

  * Fedora Core

	yum install git

  * Gentoo

	emerge git

  * OpenSUSE

	yum install git

  * Ubuntu

	apt-get install git-core

#### 2.1.2  Mac OS X

A git-core package is available through
[macports](http://macports.org). Once macports is enabled, the command
to install git is:

	port install git-core

#### 2.1.3  Windows

Git has long been available as part of cygwin, and works reasonably
well in that environment. Some people find cygwin a particularly
inelegant approach to running git and would prefer a "native"
solution. To this end, the [msysgit
project](http://code.google.com/p/msysgit/) is rapidly putting
together a solution including various packages with full
installers. These include GitMe, a package to install the entire
development environment necessary to work on improving the msysgit
port of git, and WinGit, a package for installing just git itself
without the development environment, (still in Alpha as of September
2007).

### 2.2  Getting started

To begin, we’ll use the “git version” command to find out whether git
is actually installed properly. Versions 1.5 and newer of git are much
more friendly to new users than versions 1.4 and older. If you aren't
yet running version 1.5 or newer, it's highly recommended that you
upgrade.

	$ git version
	git version 1.5.3.2
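
If you want a script to make the same check, here is a minimal sketch. It assumes the output has the form "git version X.Y..." as shown above; the variable names are only for illustration.

```shell
# Pull the version number out of "git version X.Y.Z...".
version=$(git version | awk '{print $3}')

# Split off the major and minor components.
major=${version%%.*}
rest=${version#*.}
minor=${rest%%.*}

# Compare against 1.5, the release recommended in the text.
if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 5 ]; }; then
    echo "git $version is new enough"
else
    echo "git $version is older than 1.5; consider upgrading"
fi
```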

#### 2.2.1  Built-in help

Git provides a built-in help system. This is invaluable for those
times when you find yourself stuck trying to remember how to run a
command. If you are completely stuck, simply run “git help”; it will
print a brief list of commonly-used commands, along with a description
of what each does. If you ask for help on a specific command (such as
"git help init"), it prints more detailed information. [XXX: Does "git
help <foo>" work universally as a built-in or does it expect man to be
present and just call out to "man git-<foo>"?]

	[XXX: The original hgbook includes the complete output of "hg
	help init" at this point. I'm not including the corresponding
	"git help init" output as it would be excessively long. The
	description alone is quite reasonable, (other than a
	not-too-helpful aside about the obsolete git-init-db command),
	but it only comes after a full screen's worth of options
	details. Might it make sense to have a more summarized help
	output for "git help <foo>" than all of the documentation
	available for git-<foo>? And perhaps also provide a "git -v
	help" similar to "hg -v help" for more?]

### 2.3  Working with a repository

In git, everything happens inside a repository. The repository
for a project contains all of the files that “belong to” that project,
along with a historical record of the project’s files.

There’s nothing particularly magical about a repository; it is simply
a directory tree in your filesystem that git treats as
special. You can rename or delete a repository any time you like,
using either the command line or your file browser.

#### 2.3.1  Creating a local copy of a remote repository

As suggested, a repository can be copied through normal file-copying
commands. But git also provides a "git clone" tool for copying a
repository. This provides a means of copying a repository over the
network, and is also useful with a local repository since it is much
more efficient than creating a normal copy, (creating a local clone
is blazingly fast).

We've assembled a simple repository that will be used in the examples
throughout this chapter. Go ahead and clone this repository now so
that you will be able to follow along:

	$ git clone git://cworth.org/git/hello
	Initialized empty Git repository in /tmp/hello/.git/
	remote: Generating pack...
	remote: Done counting 15 objects.
	remote: Deltifying 15 objects...
	remote:  100% (15/15) done
	remote: Total 15 (delta 2), reused 15 (delta remote: 2)
	Indexing 15 objects...
	 100% (15/15) done
	Resolving 2 deltas...
	 100% (2/2) done

If for some reason you are prevented from talking on the git: port,
then there is also the capability to clone a repository (less
efficiently) over http:

	$ git clone http://cworth.org/git/hello
	Initialized empty Git repository in /tmp/hello/.git/
	Getting alternates list for http://cworth.org/git/hello
	Getting pack list for http://cworth.org/git/hello
	Getting index for pack 04ecb061314ecbd60fa0610ecf55a1cbf85ea294
	Getting pack 04ecb061314ecbd60fa0610ecf55a1cbf85ea294
	 which contains a1a0e8b392b17caf50325498df54802fe3c03710
	walk a1a0e8b392b17caf50325498df54802fe3c03710
	walk 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	walk 13ed136b983a9c439eddeea8a1c2076cffbb685f
	walk 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
	walk db7117a9dd9a6e57e8632ea5848e1101eee0fbde

If our clone succeeded, we should now have a local directory called
hello. This directory will contain some files.

	$ ls -l
	total 4
	drwxr-xr-x 3 cworth cworth 4096 2007-09-27 16:40 hello
	$ ls hello
	hello.c  Makefile

These files have the same contents and history in our repository as
they do in the repository we cloned.

Every git repository is complete, self-contained, and
independent. It contains its own private copy of a project’s files and
history. A cloned repository remembers the location of the repository
it was cloned from, but it does not communicate with that repository,
or any other, unless you tell it to.

What this means for now is that we’re free to experiment with our
repository, safe in the knowledge that it’s a private “sandbox” that
won’t affect anyone else.

#### 2.3.2  What’s in a repository?

When we take a more detailed look inside a repository, we can see that
it contains a directory named .git. This is where git keeps all
of its metadata for the repository.

	$ cd hello
	$ ls -a
	.  ..  .git  hello.c  Makefile

The contents of the .git directory and its subdirectories are private
to git. Every other file and directory in the repository is
yours to do with as you please.

To introduce a little terminology, the .git directory is the “real”
repository, and all of the files and directories that coexist with it
are said to live in the working directory. An easy way to remember the
distinction is that the repository contains the history of your
project, while the working directory contains a snapshot of your
project at a particular point in history.
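
To see the split between the .git directory and the working directory without touching the hello clone, you can try something like the following sketch. It builds a throwaway repository in a temporary directory; the file contents are arbitrary placeholders.

```shell
set -e
# Create a throwaway repository (the location is arbitrary).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Anything we create next to .git lives in the working directory.
echo 'int main(void) { return 0; }' > hello.c
ls -a

# Ask git where it keeps its metadata for this repository.
gitdir=$(git rev-parse --git-dir)
echo "metadata directory: $gitdir"
```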

### 2.4  A tour through history

One of the first things we might want to do with a new, unfamiliar
repository is understand its history. The “git log” command gives us a
view of history.

	$ git log
	commit a1a0e8b392b17caf50325498df54802fe3c03710
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 15:43:07 2005 -0700
	
	    Trim comments.
	
	commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:58 2005 -0700
	
	    Get make to generate the final binary from a .o file.
	
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.
	
	commit 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
	Author: Bryan O'Sullivan <mpm@selenic.com>
	Date:   Fri Aug 26 01:21:28 2005 -0700
	
	    Create a makefile
	
	commit db7117a9dd9a6e57e8632ea5848e1101eee0fbde
	Author: Bryan O'Sullivan <mpm@selenic.com>
	Date:   Fri Aug 26 01:20:50 2005 -0700
	
	    Create a standard "hello, world" program

By default, this command prints a brief paragraph of output for each
change to the project that was recorded. In git terminology, we
call each of these recorded events a commit.

The fields in a record of output from “git log” are as follows.

  * commit This field consists of a string of 40 hexadecimal characters.
    This is a unique identifier for referring to particular commits.
  * Author The identity of the person who authored the commit. This
    field consists of two sub-fields for the user's name and email
    address, (or at least an email-like identifier). Note that git
    stores a separate "Committer" field for the person who committed
    the change, (since often an author will email a change to a
    maintainer who commits it). The "git log" command doesn't display
    the Committer by default, but other git tools do.
  * Date The date and time at which the commit was authored, (again
    stored separately from the date the change was committed). The
    date and time are displayed in the timezone of the person who
    created the commit.
  * commit message The text message that the creator of the commit
    entered to describe the commit, (generally a one-line summary
    followed by more supporting text).

The default output printed by “git log” is purely a summary; it is
missing a lot of detail.

#### 2.4.1  Commits, revisions, and talking to other people

As English is a notoriously sloppy language, and computer science has
a hallowed history of terminological confusion (why use one term when
four will do?), revision control has a variety of words and phrases
that mean the same thing. If you are talking about git history
with other people, you will find that what we have called a “commit”
is often called a "revision". In other systems, a similar notion
is referred to as a "changeset". You might even see abbreviations of
these terms such as "rev", "change", or even "cset".

While it may not matter much what word you use to refer to the concept
of “a commit”, it's important to know how to name “a specific
commit”. We have already seen one means of referring to a particular
commit, the 40-character hexadecimal string shown by "git log". These
commit identifiers are powerful because they are permanent, unique
identifiers that always identify the same commit in any copy of a
repository. If two users are examining a working directory associated
with the same commit identifier, then those two users have precisely
the same contents in all files, and exactly the same history leading
to that commit.

So there are places where it is often important to archive the
complete commit identifier, (perhaps in bug-tracking systems to
indicate a specific commit that fixes a bug, for example). But often,
in more casual settings, it's more convenient to use abbreviated
commit identifiers. Git accepts any unique prefix of a commit
identifier, (and for reasonably-sized projects the first 8 or 10
characters are almost always unique).
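
Here is a small sketch of prefix expansion, run in a throwaway repository rather than the hello clone. The author identity and file name are placeholders, and the 10-character length is just one reasonable choice.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity so the commit succeeds on any machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

echo "first" > file.txt
git add file.txt
git commit -q -m "Add file.txt"

full=$(git rev-parse HEAD)                   # the complete identifier
short=$(printf '%s' "$full" | cut -c1-10)    # a 10-character prefix of it
resolved=$(git rev-parse "$short")           # git expands the unique prefix
echo "$short -> $resolved"
```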

And unlike the permanent commit identifiers, git also provides
transient means of identifying commits. In fact, in day-to-day use of
git, you will probably use these names more than commit
identifiers. One example is branch names, (such as the default
"master" branch in any git repository), or any project-specific branch
names such as "stable", "experimental", or "crazy-insane-changes". Git
also provides a special name "HEAD" which always refers to the most
recent commit on the current branch.

#### 2.4.2 Naming related commits

Git offers simple ways to name revisions that are related to
particular revisions in the history. One syntax is the ~ suffix which
refers to the parent of a commit, or if followed by a number N, to the
Nth-generation ancestor (the parent's parent, and so on). For example,
since "HEAD" refers to the most recent commit in the current branch,
"HEAD~" refers to the previous commit, and "HEAD~2" refers to two
commits back in the history.

Another useful syntax is .. which can be used to specify a range of
commits. So "origin..master" specifies everything that has been
committed to master since it diverged from origin.
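
The following sketch exercises both syntaxes on a three-commit history built in a temporary directory. The commit messages and identity are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# Build a linear history of three commits.
for n in 1 2 3; do
    echo "line $n" >> file.txt
    git add file.txt
    git commit -q -m "Commit $n"
done

# HEAD is the newest commit; each ~ steps back one generation.
head_subject=$(git log -n 1 --pretty=format:%s HEAD)
parent_subject=$(git log -n 1 --pretty=format:%s HEAD~)
grand_subject=$(git log -n 1 --pretty=format:%s HEAD~2)
echo "HEAD: $head_subject, HEAD~: $parent_subject, HEAD~2: $grand_subject"

# A range A..B names the commits reachable from B but not from A.
range_count=$(git rev-list HEAD~2..HEAD | wc -l | tr -d ' ')
echo "commits in HEAD~2..HEAD: $range_count"
```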

#### 2.4.3  Viewing specific revisions

You can use "git log" to explore the range syntax just introduced. For
example, to see a list of the most recent 3 revisions you can use
"HEAD~3..", (the destination of the range is implicitly HEAD in this
case):

	$ git log HEAD~3..
	commit a1a0e8b392b17caf50325498df54802fe3c03710
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 15:43:07 2005 -0700
	
	    Trim comments.
	
	commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:58 2005 -0700
	
	    Get make to generate the final binary from a .o file.
	
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.

#### 2.4.4 Other log filters

Besides filtering by commit identifiers, git allows you to easily
filter the log output according to which files (or directories) are
modified by listing them after "--" which is necessary to distinguish
commit names from file names:

	$ git log -- Makefile
	commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:58 2005 -0700
	
	    Get make to generate the final binary from a .o file.
	
	commit 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
	Author: Bryan O'Sullivan <mpm@selenic.com>
	Date:   Fri Aug 26 01:21:28 2005 -0700
	
	    Create a makefile

And "git log" can also filter based on the dates at which commits were
created:

	$ git log --since="2 weeks ago" --until="yesterday"

Another useful option is -n or --max-count which, unsurprisingly,
limits the maximum number of commits to be displayed.
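
As a rough illustration of the path and count filters, the sketch below builds a tiny history of its own (loosely modeled on the hello repository, but with invented contents and messages) and then queries it.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

echo 'int main(void) { return 0; }' > hello.c
git add hello.c
git commit -q -m "Create hello.c"

echo 'all: hello' > Makefile
git add Makefile
git commit -q -m "Create a makefile"

echo '/* a comment */' >> hello.c
git commit -q -a -m "Add a comment"

# Only the commits that touched Makefile.
makefile_log=$(git log --pretty=format:%s -- Makefile)
echo "$makefile_log"

# At most two commits, newest first.
latest_two=$(git log --pretty=format:%s -n 2)
echo "$latest_two"
```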

#### 2.4.5  More detailed information

While the default information printed by “git log” is useful if you
already know what you’re looking for, you may need to see more details
of the change, such as the "diffstat" information with --stat:

	$ git log --stat --max-count=3
	commit a1a0e8b392b17caf50325498df54802fe3c03710
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 15:43:07 2005 -0700
	
	    Trim comments.
	
	 hello.c |    8 ++------
	 1 files changed, 2 insertions(+), 6 deletions(-)
	
	commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:58 2005 -0700
	
	    Get make to generate the final binary from a .o file.
	
	 Makefile |    2 ++
	 1 files changed, 2 insertions(+), 0 deletions(-)
	
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.
	
	 hello.c |    2 +-
	 1 files changed, 1 insertions(+), 1 deletions(-)

Or perhaps you'd like to see the actual patch content of each change,
which you can get with -p. That commit with the word typo in its
message looks suspicious, so let's take a closer look. Remember that we
can name it as master~2, HEAD~2, or any prefix of its commit
identifier, (such as 13ed136b):

	$ git log -p -n 1 13ed136b
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.
	
	diff --git a/hello.c b/hello.c
	index ed55ec0..80b260c 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -11,6 +11,6 @@
	 
	 int main(int argc, char **argv)
	 {
	-       printf("hello, world!\n");
	+       printf("hello, world!\");
	        return 0;
	 }

Of course, wanting to see all this information for a single commit is
such a common operation that it's given its own name in git, "git
show". So "git show 13ed136b" is a much easier way to get exactly the
same output:

	$ git show 13ed136b
	commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
	Author: Bryan O'Sullivan <bos@serpentine.com>
	Date:   Tue Sep 6 13:15:43 2005 -0700
	
	    Introduce a typo into hello.c.
	
	diff --git a/hello.c b/hello.c
	index ed55ec0..80b260c 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -11,6 +11,6 @@
	 
	 int main(int argc, char **argv)
	 {
	-       printf("hello, world!\n");
	+       printf("hello, world!\");
	        return 0;
	 }

### 2.5  All about command options

Let’s take a brief break from exploring git commands to discuss
a pattern in the way that they work; you may find this useful to keep
in mind as we continue our tour.

Git has a consistent and straightforward approach to dealing
with the options that you can pass to commands. It follows the
conventions for options that are common to modern Linux and Unix
systems.

  * Most options have long names. For example, as we’ve already seen,
    the “git log” command accepts a --max-count=<number> option.
  * Some options have short, single-character names. Often these are
    aliases for long options, (such as "-n <number>" instead of
    --max-count=<number>), but sometimes the option exists in
    short-form with no long-form equivalent, (such as -p). [XXX: It
    wouldn't hurt to fix this by adding --patch, etc. right?]
  * Long options start with two dashes (e.g. --max-count), while short
    options start with one (e.g. -n).

  * Option naming and usage is consistent across commands. For
    example, every command that lets you specify a commit identifier
    or range will accept the same expressions, (HEAD~3,
    origin..master, 72d4f10e, etc), while any command that can be
    limited by paths will accept the same expressions ("-- doc/
    some-file.c"), etc.

Many commands that print output of some kind can be made more quiet by
passing the -q or --quiet options.
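
To check the consistency claim for yourself, a sketch like this one compares the long option on "git log" with the short option on a different command, "git rev-list". The repository and its commits are throwaway placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

for n in 1 2 3; do
    echo "line $n" >> file.txt
    git add file.txt
    git commit -q -m "Commit $n"
done

# Long form on one command...
with_long=$(git log --pretty=format:%H --max-count=2)
# ...and the short form on another; both should name the same two commits.
with_short=$(git rev-list -n 2 HEAD)

if [ "$with_long" = "$with_short" ]; then
    echo "-n and --max-count selected the same commits"
fi
```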

### 2.6  Making and reviewing changes

Now that we have a grasp of viewing history in git, let’s take a
look at making some changes and examining them.

The first thing we’ll do is isolate our experiment in a repository of
its own. We use the “git clone” command, but we don’t need to clone a
copy of the remote repository. Since we already have a copy of it
locally, we can just clone that instead. This is much faster than
cloning over the network, and cloning a local repository uses less
disk space in most cases, too.

	$ cd ..
	$ git clone hello my-hello
	Initialized empty Git repository in /tmp/my-hello/.git/
	0 blocks

	[XXX We say "empty" here, (presumably from the git-init part),
	but shouldn't the command also report the successful clone
	which makes it non-empty? And what the heck does "0 blocks"
	mean?]

As an aside, it’s often good practice to keep a “pristine” copy of a
remote repository around, which you can then make temporary clones of
to create sandboxes for each task you want to work on. This lets you
work on multiple tasks in parallel, each isolated from the others
until it’s complete and you’re ready to integrate it back. Because
local clones are so cheap, there’s almost no overhead to cloning and
destroying repositories whenever you want.
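
The pristine-copy-plus-sandbox pattern might look like the sketch below. The directory names pristine and sandbox are hypothetical, and the pristine repository is created locally here only so that the example does not need network access.

```shell
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# Stand-in for the pristine copy of a remote repository.
mkdir pristine
(
    cd pristine
    git init -q
    echo "hello" > file.txt
    git add file.txt
    git commit -q -m "Add file.txt"
)

# Make a cheap local clone to experiment in.
git clone -q pristine sandbox

# The clone remembers where it came from and has the same history.
origin=$(cd sandbox && git config remote.origin.url)
pristine_head=$(cd pristine && git rev-parse HEAD)
sandbox_head=$(cd sandbox && git rev-parse HEAD)
echo "sandbox was cloned from: $origin"

# When the task is finished, the sandbox can simply be deleted.
rm -rf sandbox
```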

Alternatively, you can achieve much the same effect by creating
multiple branches in a single repository, (but we won't go into detail
on how to do that in this chapter). Some people greatly appreciate
having multiple branches in a single repository rather than having
many repositories cluttering up their filesystem. Other people prefer
the ability to have working-tree changes, and intermediate build
files, etc. each isolated in a separate repository per branch. Both
modes are very well-supported by git, so it's really a matter of which
you find most appropriate at any time given your tastes and project
workflows.

In our my-hello repository, we have a file hello.c that contains the
classic “hello, world” program. Let’s use the ancient and venerable
sed command to edit this file so that it prints a second line of
output. (I’m only using sed to do this because it’s easy to write a
scripted example this way. Since you’re not under the same constraint,
you probably won’t want to use sed; simply use your preferred text
editor to do the same thing.)

	$ sed -i '/printf/a\\tprintf("hello again!\\n");' hello.c     

The “git status” command will tell us what git knows about the files
in the repository.

	$ ls 
	hello.c  Makefile
	$ git status
	# On branch master
	# Changed but not updated:
	#   (use "git add <file>..." to update what will be committed)
	#
	#       modified:   hello.c
	#
	no changes added to commit (use "git add" and/or "git commit -a")

We see that the “git status” command prints a line with "modified" for
hello.c. The “git status” command will not print any output for files
that have not been modified.

Notice that we didn’t need to inform git that we were going to modify
the file before we started, or that we had modified the file after we
were done; it was able to figure this out itself.

It’s a little bit helpful to know that we’ve modified hello.c, but we
might prefer to know exactly what changes we’ve made to it. To do
this, we use the “git diff” command.

	$ git diff
	diff --git a/hello.c b/hello.c
	index 9a3ff79..6d28887 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -8,5 +8,6 @@
	 int main(int argc, char **argv)
	 {
	        printf("hello, world!\");
	+       printf("hello again!\n");
	        return 0;
	 }
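
If you would rather reproduce this edit-then-inspect cycle in isolation, the following sketch does so in a throwaway repository. The file greeting.txt is a made-up stand-in for hello.c.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

printf 'hello, world!\n' > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Edit the tracked file without telling git anything in advance.
printf 'hello again!\n' >> greeting.txt

# git works out for itself which files differ from the last commit.
changed=$(git diff --name-only)
echo "modified files: $changed"

# Count added lines in the patch, skipping the "+++" file header.
added=$(git diff | grep -c '^+[^+]')
echo "lines added: $added"
```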

### 2.7  Recording changes in a new commit

We can modify files, build and test our changes, and use “git status”
and “git diff” to review our changes, until we’re satisfied with what
we’ve done and arrive at a natural stopping point where we want to
record our work in a new commit.

The “git commit” command lets us create a new commit; we’ll usually
refer to this as “making a commit” or “committing”.

#### 2.7.1  Setting up a username

When you try to run “git commit” for the first time, it might not do
exactly what you want. Git records your name and address with each
change that you commit, (as both author and committer unless you tell
it otherwise), so that you and others will later be able to tell who
made each change. Git tries to automatically figure out a sensible
name and address to attribute to both author and committer. It will
attempt each of the following methods, in order, (stopping for each
field as soon as a value is found):

  1. If you specify a --author option to the “git commit” command on
     the command line, followed by a "Real Name <email@example.com>"
     string, then this name and address will be used for the author
     fields. The committer fields will still be determined as
     below. This option is very helpful when applying a commit
     originally authored by someone other than yourself.
  2. If any of the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL,
     GIT_COMMITTER_NAME, or GIT_COMMITTER_EMAIL environment variables
     are set, then those values will be used for the corresponding
     fields.
  3. If you have a file in the local repository called .git/config,
     with name or email settings in the [user] section, then these
     values will be used to set any remaining author and committer
     fields.
  4. If you have a file in your home directory called .gitconfig,
     again with name or email settings in the [user] section, then
     these values will be used to set any remaining author and
     committer fields. For more details on the contents of this file,
     refer to “Creating a git configuration file” below.
  5. If you have set the EMAIL environment variable, this will be used
     to set author and committer email addresses if still unset.
  6. Git will query your system to find out your real name from the
     available GECOS field, and will use your username, hostname, and
     domain to construct an email address, (or at least an identifier
     resembling an email address).

If all of these mechanisms fail, "git commit" will fail, printing an
error message instructing you how to use "git config" to tell git your
name and email address.

You should think of the GIT_AUTHOR/COMMITTER_NAME/EMAIL environment
variables and the --author option to the “git commit” command as ways
to override git’s default selection. For normal use, the simplest and
most robust way to set your information is by creating a .gitconfig
file, (either manually or with the "git config" command); see below
for details.
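
The override behavior can be observed with a sketch like the one below. All of the names and addresses are invented, and the configuration is written to a throwaway repository so that your own settings are left alone.

```shell
set -e
# Start from a clean slate in case these are set in your environment.
unset GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

repo=$(mktemp -d)
cd "$repo"
git init -q

# Identity from the repository's configuration file (.git/config).
git config user.name "Config Name"
git config user.email "config@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "Identity from the configuration file"
from_config=$(git log -n 1 --pretty=format:%an)

# An environment variable overrides the configuration file.
echo two >> file.txt
GIT_AUTHOR_NAME="Env Name" git commit -q -a -m "Identity from the environment"
from_env=$(git log -n 1 --pretty=format:%an)

# --author overrides both, but only for the author fields.
echo three >> file.txt
GIT_AUTHOR_NAME="Env Name" git commit -q -a \
    --author="Option Name <option@example.com>" -m "Identity from --author"
from_option=$(git log -n 1 --pretty=format:%an)
committer=$(git log -n 1 --pretty=format:%cn)

echo "authors: $from_config, $from_env, $from_option"
echo "committer of the last commit: $committer"
```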

##### Creating a git configuration file

To set your name and email address, just use the following commands:

	git config --global user.name "Your Name"
	git config --global user.email "you@example.com"

The --global option means that this command will set global
information, (affecting all repositories you use under your user
account on this machine), in the .gitconfig file in your home
directory. Alternatively, you could omit the --global which would make
the change take effect only in the local
repository. This is convenient if you want to have different email
addresses associated with different projects, for example.

Of course, git's configuration file is a simple-to-edit plain-text
file, so instead of using the above commands, you can also just edit
the files directly. Use your favorite editor to create a file called
.gitconfig in your home directory, (or if you ran the above commands
then it will be there already). The initial contents of your
.gitconfig should look like this.

	# This is a git configuration file. 
	[user]
		name = Your Name
		email = you@example.com

Similarly, you can make a repository-specific configuration by editing
.git/config in the local repository. It will already have some
sections present, (created by the "git clone"); just add a [user]
section as above.

The “[user]” line begins a section of the config file, so you can read
the “name = ...” line as meaning “set the value of the name item in
the user section”. This is the same notion expressed with the
"user.name" syntax on the git-config command line.  A section
continues until a new section begins, or the end of the file. Git
ignores empty lines and treats any text from “#” to the end of a line
as a comment.
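
To watch the correspondence between "user.name" on the command line and the [user] section in the file, you can run a sketch like this in a scratch repository. It deliberately omits --global so that your real ~/.gitconfig is not modified.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the settings are written to .git/config here.
git config user.name "Your Name"
git config user.email "you@example.com"

# The file now contains a [user] section with name and email items.
cat .git/config

# Reading a value back uses the same section.item syntax.
name=$(git config user.name)
email=$(git config user.email)
echo "$name <$email>"
```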

##### Choosing a user name

You can use any text you like as the value of the name and email
configuration items, since this information is for reading by other
people, not for interpreting by git. It is conventional to use a valid
email address, but some, (notably Linus Torvalds, the original author
of git), actually like the default user@hostname convention that git
falls back on without any additional information. There's no
requirement that the email address actually be valid, and perhaps it's
useful to be reminded which machine was used to create particular
commits.

#### 2.7.2  Writing a commit message

When we commit a change, git drops us into a text editor to
enter a message that will describe the modifications we’ve made in
this commit. This is called the commit message. It will be a record
for readers of what we did and why, and it will be printed by “git log”
after we’ve finished committing.

	$ git commit -a

Note: The -a on the command-line instructs git to commit the new
content of *all* tracked files that have been modified. This is a
convenience over explicitly listing filenames to be committed on the
"git commit" command line. It is useful to use "git commit <files>"
when there is a need to commit only some subset of the files that have
been modified.

If new files need to be committed for the first time, just use "git
add <file>" before "git commit -a". If a file needs to be removed,
just remove it as normal before committing and "git commit -a" will
notice that---it does not need to be explicitly told about the
removal.
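
A short sketch of both cases, using made-up file names in a temporary repository:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

echo "old" > old.txt
git add old.txt
git commit -q -m "Add old.txt"

# A brand-new file has to be introduced with "git add" first...
echo "new" > new.txt
git add new.txt

# ...while a plain removal is noticed by "git commit -a" on its own.
rm old.txt
git commit -q -a -m "Add new.txt and remove old.txt"

# List the files git is tracking after the commit.
tracked=$(git ls-files)
echo "tracked files: $tracked"
```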

The editor that the “git commit” command drops us into will contain an
empty line, followed by a number of lines starting with “#”.

	empty line
	# Please enter the commit message for your changes.
	# (Comment lines starting with '#' will not be included)
	# On branch master
	# Changes to be committed:
	#   (use "git reset HEAD <file>..." to unstage)
	#
	#       modified:   hello.c
	#	

Git ignores the lines that start with “#”; it uses them only
to tell us which files it’s recording changes to. Modifying or
deleting these lines has no effect.

#### 2.7.3  Writing a good commit message

A good commit message will generally have a single line that
summarizes the commit, a blank line, and then one or more paragraphs
with supporting detail. Since many tools only print the first line of
a commit message by default, it’s important that the first line stands
alone.

One example of a first-line-only viewer is "git log
--pretty=short". Other examples include graphical history viewers such
as gitk and gitview, and web-based viewers such as gitweb and cgit.
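
As an illustration of why the first line has to stand alone, this sketch records a two-paragraph message and then asks "git log" for the summary and the body separately. The message text is invented; passing -m twice is just a convenient way to supply two paragraphs without opening an editor.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

echo "hello" > file.txt
git add file.txt

# Each -m becomes its own paragraph, separated by a blank line.
git commit -q -m "Add file.txt with a greeting" \
    -m "The longer explanation goes here, after a blank line."

summary=$(git log -n 1 --pretty=format:%s)   # first line only
body=$(git log -n 1 --pretty=format:%b)      # everything after it
echo "summary: $summary"
echo "body:    $body"
```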

Here’s a real example of a commit message that doesn’t follow
this guideline, and hence has a summary that is not readable.

	$ git log --pretty=short
	commit 3ef5535144da88a854f7930503845cd44506c2e2
	Author: Censored Person <censored.person@example.org>
	
	    include buildmeister/commondefs.   Add an exports and install

As far as the remainder of the contents of the commit message are
concerned, there are no hard-and-fast rules. Git itself doesn’t
interpret or care about the contents of the commit message, though
your project may have policies that dictate a certain kind of
formatting.

My personal preference is for short, but informative, commit messages
that tell me something that I can’t figure out with a quick glance at
the output of “git log -p”.

#### 2.7.4  Aborting a commit

If you decide that you don’t want to commit while in the middle of
editing a commit message, simply exit from your editor without saving
the file that it’s editing. This will cause nothing to happen to
either the repository or the working directory.

#### 2.7.5  Admiring our new handiwork

Once we’ve finished the commit, we can use the “git show” command to
display the commit we just created. As discussed previously, this
command produces output that is identical to “git log -p”, but for
only a single revision, (and the most recent revision by default):

	$ git show
	commit 018cfb742be6176443ffddac454e593e802ddf3e
	Author: Carl Worth <cworth@cworth.org>
	Date:   Thu Sep 27 23:55:00 2007 -0700
	
	    Added an extra line of output.
	    
	    If I would have been clever I would have fixed that old typo
	    while I was at it...
	
	diff --git a/hello.c b/hello.c
	index 9a3ff79..6d28887 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -8,5 +8,6 @@
	 int main(int argc, char **argv)
	 {
	        printf("hello, world!\");
	+       printf("hello again!\n");
	        return 0;
	 }

Note that you will not see the same commit identifier for your commit,
even if the change you made is identical to mine. The commit
identifier incorporates not only the contents of the files, but also
the commit message, the author and committer names and emails, and the
author and
commit dates. (OK, so now you probably know enough to be able to guess
the right command to produce a commit with exactly the commit
identifier shown above. Can you do it?)
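
One way to convince yourself of this is the sketch below. It builds three scratch repositories with identical content and message, pinning the names, emails, and dates through git's environment variables. The identity, file name, and timestamps are arbitrary values chosen for the experiment, and make_commit is just a helper defined here.

```shell
set -e
work=$(mktemp -d)

# make_commit DIR DATE: create a repository in DIR holding one commit whose
# content, message, and identities are fixed, and whose dates are DATE.
# Prints the resulting commit identifier.
make_commit() {
    git init -q "$work/$1"
    (
        cd "$work/$1"
        echo "hello" > greeting.txt
        git add greeting.txt
        GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.org" \
        GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.org" \
        GIT_AUTHOR_DATE="$2" GIT_COMMITTER_DATE="$2" \
            git commit -q -m "Add a greeting file"
        git rev-parse HEAD
    )
}

id_a=$(make_commit a "1190962500 -0700")
id_b=$(make_commit b "1190962500 -0700")   # exactly the same inputs as a
id_c=$(make_commit c "1190962560 -0700")   # same, but one minute later
echo "a: $id_a"
echo "b: $id_b"
echo "c: $id_c"
```

The first two identifiers come out the same because every input matches, while the third differs even though only the dates changed.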

#### 2.7.6 Fixing up a broken commit (before anyone else sees it)

So now that we've cloned a local repository, made a change to the
code, set up our name and email address, and made a commit with a
careful message, we're just about ready to share our change with the
world. But wait, we forgot to try to compile it, didn't we?

	$ make
	cc    -c -o hello.o hello.c
	hello.c:10:9: warning: missing terminating " character
	hello.c:10:9: warning: missing terminating " character
	hello.c: In function ‘main’:
	hello.c:10: error: missing terminating " character
	hello.c:11: error: expected ‘)’ before ‘;’ token
	hello.c:13: warning: passing argument 1 of ‘printf’ makes pointer from integer without a cast
	hello.c:13: error: expected ‘;’ before ‘}’ token
	make: *** [hello.o] Error 1

Oh look. The code's broken and doesn't compile. We don't want to share
code in this state. For situations where you notice one tiny detail
that got left out of the last commit, (a silly syntax error, a
misspelling in a comment or commit message), git provides a very
handy tool for just changing the last commit.

So fix that typo, (a missing 'n' between the '\' and the '"'), with
your editor or with something like this:

	$ sed -i 's/\\"/\\n"/' hello.c

And then you can just amend the previous commit rather than creating a
new one with the --amend option to "git commit":

	$ git commit -a --amend

Note that we use -a to include the code change here. This also points
out a situation where "git commit" is useful without the -a option:
"git commit --amend" amends just the last commit message, without
committing any new code changes, even if some files have been modified
in the working tree.
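
Here is a sketch of that message-only amend in a scratch repository. It assumes nothing has been staged since the last commit; the file name, identity, and messages are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.org"
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notse"            # oops, a typo in the message
echo "unfinished edit" >> notes.txt     # a change we are not ready to commit
# No -a and nothing staged: only the message of the last commit is rewritten
git commit -q --amend -m "Add notes"
subject=$(git log -1 --pretty=format:%s)
count=$(git rev-list --count HEAD)      # still a single commit in the history
committed=$(git show HEAD:notes.txt)    # committed content is unchanged
echo "$count commit(s), message: $subject, content: $committed"
```

The unfinished edit is still sitting in the working tree afterwards, ready for a later commit of its own.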

And here's the final result:

	$ git show
	commit 839b58d021c618bd0e1d336d4d5878a0082672e6
	Author: Carl Worth <cworth@cworth.org>
	Date:   Thu Sep 27 23:55:00 2007 -0700
	
	    Added an extra line of output and fixed the typo bug.
	
	diff --git a/hello.c b/hello.c
	index 9a3ff79..ca750e0 100644
	--- a/hello.c
	+++ b/hello.c
	@@ -7,6 +7,7 @@
	 
	 int main(int argc, char **argv)
	 {
	-       printf("hello, world!\");
	+       printf("hello, world!\n");
	+       printf("hello again!\n");
	        return 0;
	 }

I can't help but point out that this really was a poor example for
--amend. The end result is a single commit that does two independent
things, (fixes one bug and adds one new feature). It's much better to
create a code history where each commit makes an independent change,
(and as small as possible). This is important for several reasons:

  * Small changes are easier to review

  * Independent changes are easier to split up if only part of the
    series gets accepted "upstream" for one reason or another.

  * The smaller the changes are the more useful the history will be
    when actually using the history, not just viewing it. This is
    particularly important when doing "git bisect"---that's a powerful
    tool for isolating the single commit that introduces a bug. And
    it's much more powerful if the commit it isolates is as small as
    possible.

So it's a good thing this document is available under a license that
allows for distribution of modified versions. Someone should clean up
the --amend example to not teach bad habits like I did above. [Note:
All this bad-habit stuff was introduced by me, and was not present in
Bryan's original chapter. -Carl]
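
To give a feel for the "git bisect" workflow mentioned in the list above, here is a rough sketch. It builds a toy history of five small commits, one of which sneaks in a line reading BUG, and lets bisect find it. The file name, commit messages, and the grep-based test are all invented for the illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.org"
for i in 1 2 3 4 5; do
    echo "line $i" >> notes.txt
    if [ "$i" = 3 ]; then
        echo "BUG" >> notes.txt         # the third commit introduces the problem
    fi
    git add notes.txt
    git commit -q -m "Add line $i"
done
first=$(git rev-list HEAD | tail -n 1)  # oldest commit, known to be good
# Mark HEAD as bad and the first commit as good
git bisect start HEAD "$first" >/dev/null
# The test command must exit 0 for a good revision and 1 for a bad one
git bisect run sh -c '! grep -q BUG notes.txt' >/dev/null
culprit=$(git rev-parse refs/bisect/bad)
culprit_subject=$(git log -1 --pretty=format:%s "$culprit")
git bisect reset >/dev/null 2>&1        # go back to where we started
echo "first bad commit: $culprit_subject"
```

Because each commit here does one small thing, the commit that bisect lands on points almost directly at the offending line.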

### 2.8  Sharing changes

We mentioned earlier that repositories in git are
self-contained. This means that the commit we just created exists
only in our my-hello repository. Let’s look at a few ways that we can
propagate this change into other repositories.

#### 2.8.1  Pulling changes from another repository

To get started, let’s clone our original hello repository, which does
not contain the change we just committed. We’ll call our temporary
repository hello-pull.

	$ cd ..
	$ git clone hello hello-pull
	Initialized empty Git repository in /tmp/hello-pull/.git/
	0 blocks

We could use the “git pull” command to apply changes from my-hello to
our master branch in hello-pull. However, blindly pulling unknown
changes into a repository is a somewhat scary prospect. The "git pull"
command is conceptually the combination of two commands, "git fetch"
and "git merge"; we can run those separately to examine the changes
before applying them locally. First we do the fetch:

	$ cd hello-pull
	$ git fetch ../my-hello
	remote: Generating pack...
	Unpacking 3 objects...
	 100% (3/3) done
	remote: Done counting 5 objects.
	Result has 3 objects.
	Deltifying 3 objects...
	 100% remote: (3/3) done
	Total 3 (delta 1), reused 0 (delta 0)

The fetched commits (or commit in this case) are available as the name
FETCH_HEAD. [XXX: Shouldn't git-fetch print that name out to the user
if the user didn't provide a specific branch name to fetch into.] And
the difference between what we had before and what exists on
FETCH_HEAD can easily be examined with the ..FETCH_HEAD range
notation:

	$ git log ..FETCH_HEAD
	commit 839b58d021c618bd0e1d336d4d5878a0082672e6
	Author: Carl Worth <cworth@cworth.org>
	Date:   Thu Sep 27 23:55:00 2007 -0700
	
	    Added an extra line of output and fixed the typo bug.

Since these commits actually exist in the local repository now, we
don't need to fetch or pull them from the remote repository again---we
can now use "git merge" to apply the previously fetched commits. (A
mercurial user might notice here that git does not have the race
condition between "hg incoming" and "hg pull" that mercurial has since
the commits are fetched only once.)

	$ git merge FETCH_HEAD
	Updating a1a0e8b..839b58d
	Fast forward
	 hello.c |    3 ++-
	 1 files changed, 2 insertions(+), 1 deletions(-)

Notice that "git merge" reports that our branch pointer has been
updated from a1a0e8b to 839b58d. Also, this is a "fast forward"
meaning that the new commits are a linear sequence on top of the
commit we already had. In other words, there wasn't any divergence
between these two repositories so no actual "merge" commit was
created.
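
The whole fetch, review, merge sequence can be replayed in scratch repositories with the sketch below. The repository names mirror the ones used in this chapter, but the file, identity, and commit messages are made up, and I use --ff-only as an extra safety net so that the merge refuses to do anything other than a fast forward.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q hello
cd hello
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.org"
echo "hello, world" > hello.txt
git add hello.txt
git commit -q -m "Create hello.txt"
cd ..
git clone -q hello my-hello
git clone -q hello hello-pull
cd my-hello
git config user.name "Example User"
git config user.email "user@example.org"
echo "hello again" >> hello.txt
git commit -q -a -m "Add a second greeting"
new=$(git rev-parse HEAD)               # the commit we expect to receive
cd ../hello-pull
git fetch -q ../my-hello                # download only; nothing is applied yet
incoming=$(git rev-list --count HEAD..FETCH_HEAD)
git merge -q --ff-only FETCH_HEAD       # apply what was fetched
merged=$(git rev-parse HEAD)
echo "$incoming incoming commit(s); HEAD is now $merged"
```

Between the fetch and the merge you are free to inspect FETCH_HEAD with "git log" or "git diff" for as long as you like.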

This separation of fetch and merge is useful when you need to
carefully review some changes before applying them. But often you're
in a situation where you know you trust the remote repository and you
simply want to pull those changes as conveniently as possible, (no
extra commands, no typing a magic name like FETCH_HEAD). This is the
case when tracking upstream development of a project with git. And
in that case, the above steps are as simple as just executing "git
pull". So let's repeat all that the simpler way:

	$ cd ..
	$ git clone hello hello-tracking
	Initialized empty Git repository in /tmp/hello-tracking/.git/
	0 blocks
	$ cd hello-tracking
	$ git pull ../my-hello
	remote: Generating pack...
	remote: Done counting 5 objects.
	Result has 3 objects.
	Deltifying 3 objects...
	Unpacking 3 objects...
	remote:  100% (3/3) done
	Total 3 (delta 1), reused 0 (delta 0)
	 100% (3/3) done
	Updating a1a0e8b..839b58d
	Fast forward
	 hello.c |    3 ++-
	 1 files changed, 2 insertions(+), 1 deletions(-)

It should be plain to see that the "git pull" command really did the
combined sequence of "git fetch" and "git merge". Also, if you want to
pull from the same repository you cloned from originally, (which is
the common case for the upstream-tracking scenario), then "git pull"
with no explicit repository is sufficient, and it will default to
pulling from the same repository as the original clone.
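
Here is a sketch of that default in action. Unlike the walkthrough above, it assumes the new commit is made directly in the repository that was cloned, so that a bare "git pull" has something to bring in. The file, identity, and messages are placeholders.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q hello
cd hello
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.org"
echo "hello, world" > hello.txt
git add hello.txt
git commit -q -m "Create hello.txt"
cd ..
git clone -q hello hello-tracking       # the clone remembers where it came from
cd hello
echo "hello again" >> hello.txt
git commit -q -a -m "Add a second greeting"
upstream=$(git rev-parse HEAD)
cd ../hello-tracking
git pull -q --ff-only                   # no repository named on the command line
pulled=$(git rev-parse HEAD)
last_line=$(tail -n 1 hello.txt)        # the working file was updated as well
echo "HEAD is $pulled; last line of hello.txt: $last_line"
```

The --ff-only flag is optional here; it just makes the pull stop rather than create a merge commit if the histories have diverged.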

[XXX: The structure of the preceding section follows that of the
original hgbook. But an alternate structure that arranged to pull from
the originally cloned repository (as would be common) would allow for
more straightforward use of git's features. For example, instead of
the silly FETCH_HEAD stuff it would allow for "git fetch" and "git log
master..origin" to be a very nice replacement for "hg
incoming". Similarly, below, "git log origin..master" would make a
nice replacement for "hg outgoing" which is something I didn't offer
at all. One could also use git's remotes with the myriad repositories
as used here, but it would require doing things like "git remote add
<some-name> ../hello-pull" and that seems like a bit much to introduce
for a tutorial of this level. If nothing else, if the above section
seems a little intimidating, understand that it's because things are
not presented in the most natural "git way", (and I'm a little too
tired to fix it tonight).]
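
For the curious, the alternate approach hinted at in the note above might look something like the following sketch. Instead of spelling out "master" and "origin", it uses the @{u} shorthand, which more recent versions of git accept as a name for the branch being tracked; that substitution is mine, as are the file name, identity, and messages.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q hello
cd hello
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.org"
echo "hello, world" > hello.txt
git add hello.txt
git commit -q -m "Create hello.txt"
cd ..
git clone -q hello hello-tracking
cd hello
echo "hello again" >> hello.txt
git commit -q -a -m "Add a second greeting"     # a change the clone lacks
cd ../hello-tracking
git config user.name "Example User"
git config user.email "user@example.org"
echo "local notes" > notes.txt
git add notes.txt
git commit -q -m "Add local notes"              # a change the original lacks
git fetch -q                                    # update our view of the original
incoming=$(git rev-list --count 'HEAD..@{u}')   # roughly "hg incoming"
outgoing=$(git rev-list --count '@{u}..HEAD')   # roughly "hg outgoing"
echo "$incoming incoming, $outgoing outgoing"
```

Swapping "git rev-list --count" for "git log" shows the commits themselves rather than just counting them.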

Note: Mercurial users who are reading this might wonder if there's a
need for the equivalent of "hg update" after doing a "git pull". And
the answer is no. Unlike mercurial, "git pull" and "git merge" will
automatically update the working-directory files as necessary.

#### 2.8.2  Checking out previous revisions

It's often useful to examine the working-tree state of some specific
revision other than the tip of some branch. For example, maybe you
would like to build a particular tagged version, or maybe you'd like
to test the behavior of the code before a particular change was
introduced. To do this, use "git checkout" and pass it the name of any
revision, (with a branch name, a tag name, or any other commit
identifier). For example, to examine our project before the original
typo was introduced:

	$ git checkout 0a633bf5
	Note: moving to "0a633bf5" which isn't a local branch
	If you want to create a new branch from this checkout, you may do so
	(now or later) by using -b with the checkout command again. Example:
	  git checkout -b <new_branch_name>
	HEAD is now at 0a633bf... Create a makefile

The note that git gives us is to indicate that we are checking out a
non-branch revision. This is perfectly fine if we are just exploring
history, but if we actually wanted to use this revision as the basis
for new commits, we would first have to create a new branch name as it
describes.

If we were to use "git checkout" with a branch name, then that would
change the current branch, (meaning that any new commits would advance
that branch pointer).

For now, let's return to the tip of the master branch by just
checking it out again:

	$ git checkout master
	Previous HEAD position was 0a633bf... Create a makefile
	Switched to branch "master"
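
The round trip above can be tried safely in a scratch repository with a sketch like this one. The file name, identity, and messages are invented, and the branch name is looked up rather than assumed, since it may not be "master" on your installation.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.org"
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "First version"
first=$(git rev-parse HEAD)
echo "version 2" > notes.txt
git commit -q -a -m "Second version"
branch=$(git symbolic-ref --short HEAD) # name of the branch we are on
git checkout -q "$first"                # visit the older revision
old_content=$(cat notes.txt)
# symbolic-ref fails when HEAD points at a commit instead of a branch
if git symbolic-ref -q HEAD >/dev/null; then
    state="on a branch"
else
    state="detached"
fi
git checkout -q "$branch"               # return to the tip of the branch
new_content=$(cat notes.txt)
echo "old: $old_content ($state); new: $new_content"
```

The working-tree file changes along with each checkout, which is what makes this handy for building or testing an older revision.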

#### 2.8.3  Pushing changes to another repository

Git lets us push changes to another repository, from the repository
we’re currently visiting. As with previous examples, above, we’ll
first create a temporary repository to push our changes into. But
instead of using "git clone", this time we'll use "git init" to make a
repository from an empty directory. We do this to create a "bare"
repository which is simply a repository that has no working-directory
files associated with it. In general, you should only push to bare
repositories.

	$ cd ..
	$ mkdir hello-push
	$ cd hello-push
	$ git --bare init
	Initialized empty Git repository in /tmp/hello-push/

And then we'll go back to our my-hello repository to perform the
push. Since this is our very first push into this repository we need
to tell git which branches to push. The easiest way to do this is to
use --all to indicate all branches:

	$ cd ../my-hello
	$ git push ../hello-push --all
	updating 'refs/heads/master'
	  from 0000000000000000000000000000000000000000
	  to   839b58d021c618bd0e1d336d4d5878a0082672e6
	Generating pack...
	Done counting 18 objects.
	Deltifying 18 objects...
	 100% (18/18) done
	Writing 18 objects...
	 100% (18/18) done
	Total 18 (delta 3), reused 0 (delta 0)
	Unpacking 18 objects...
	 100% (18/18) done
	refs/heads/master: 0000000000000000000000000000000000000000 -> 839b58d021c618bd0e1d336d4d5878a0082672e6

For subsequent pushes we don't need to specify --all as "git push"
will push all branches that exist in both the local and remote
repositories.

What happens if we try to pull or push changes and the receiving
repository already has those changes? Nothing too exciting.

	$ git push ../hello-push
	Everything up-to-date
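
If you want to replay the push without touching your real repositories, a sketch along these lines should do. The repository names mirror the ones above, while the file, identity, and message are placeholders. I pass --all on both pushes so the sketch does not depend on how your version of git chooses default branches to push.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare hello-push           # a repository with no working files
git init -q my-hello
cd my-hello
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.org"
echo "hello, world" > hello.txt
git add hello.txt
git commit -q -m "Create hello.txt"
branch=$(git symbolic-ref --short HEAD)
local_head=$(git rev-parse HEAD)
git push -q ../hello-push --all         # first push: send every branch
remote_head=$(git --git-dir=../hello-push rev-parse "refs/heads/$branch")
# A second push has nothing new to send; keep whatever git reports
second=$(git push ../hello-push --all 2>&1)
echo "local:  $local_head"
echo "remote: $remote_head"
echo "second push said: $second"
```

After the first push the branch in hello-push points at the same commit as the local one, and the second push leaves it alone.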

#### 2.8.4  Sharing changes over a network

The commands we have covered in the previous few sections are not
limited to working with local repositories. Each works in exactly the
same fashion over a network connection; simply pass in a URL or an ssh
host:/path/name specification instead of a local path.

## Appendix D
Open Publication License

Version 1.0, 8 June 1999 

### D.1  Requirements on both unmodified and modified versions

The Open Publication works may be reproduced and distributed in whole
or in part, in any medium physical or electronic, provided that the
terms of this license are adhered to, and that this license or an
incorporation of it by reference (with any options elected by the
author(s) and/or publisher) is displayed in the reproduction.

Proper form for an incorporation by reference is as follows: 

Copyright (c) year by author’s name or designee. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, vx.y or later (the latest version is
presently available at
<http://www.opencontent.org/openpub/>).

The reference must be immediately followed with any options elected by
the author(s) and/or publisher of the document (see section D.6).

Commercial redistribution of Open Publication-licensed material is
permitted.

Any publication in standard (paper) book form shall require the
citation of the original publisher and author. The publisher and
author’s names shall appear on all outer surfaces of the book. On all
outer surfaces of the book the original publisher’s name shall be as
large as the title of the work and cited as possessive with respect to
the title.

### D.2  Copyright

The copyright to each Open Publication is owned by its author(s) or
designee.

### D.3  Scope of license

The following license terms apply to all Open Publication works,
unless otherwise explicitly stated in the document.

Mere aggregation of Open Publication works or a portion of an Open
Publication work with other works or programs on the same media shall
not cause this license to apply to those other works. The aggregate
work shall contain a notice specifying the inclusion of the Open
Publication material and appropriate copyright notice.

Severability. If any part of this license is found to be unenforceable
in any jurisdiction, the remaining portions of the license remain in
force.

No warranty. Open Publication works are licensed and provided “as is”
without warranty of any kind, express or implied, including, but not
limited to, the implied warranties of merchantability and fitness for
a particular purpose or a warranty of non-infringement.

### D.4  Requirements on modified works

All modified versions of documents covered by this license, including
translations, anthologies, compilations and partial documents, must
meet the following requirements:

  1. The modified version must be labeled as such. 
  2. The person making the modifications must be identified and the
     modifications dated.
  3. Acknowledgement of the original author and publisher if
     applicable must be retained according to normal academic citation
     practices.
  4. The location of the original unmodified document must be identified. 
  5. The original author’s (or authors’) name(s) may not be used to
     assert or imply endorsement of the resulting document without the
     original author’s (or authors’) permission.

### D.5  Good-practice recommendations

In addition to the requirements of this license, it is requested from
and strongly recommended of redistributors that:

  1. If you are distributing Open Publication works on hardcopy or
     CD-ROM, you provide email notification to the authors of your
     intent to redistribute at least thirty days before your
     manuscript or media freeze, to give the authors time to provide
     updated documents. This notification should describe
     modifications, if any, made to the document.
  2. All substantive modifications (including deletions) be either
     clearly marked up in the document or else described in an
     attachment to the document.
  3. Finally, while it is not mandatory under this license, it is
     considered good form to offer a free copy of any hardcopy and
     CD-ROM expression of an Open Publication-licensed work to its
     author(s).

### D.6  License options

The author(s) and/or publisher of an Open Publication-licensed
document may elect certain options by appending language to the
reference to or copy of the license. These options are considered part
of the license instance and must be included with the license (or its
incorporation by reference) in derived works.

  1. To prohibit distribution of substantively modified versions
     without the explicit permission of the author(s). “Substantive
     modification” is defined as a change to the semantic content of
     the document, and excludes mere changes in format or
     typographical corrections.

     To accomplish this, add the phrase “Distribution of substantively
     modified versions of this document is prohibited without the
     explicit permission of the copyright holder.” to the license
     reference or copy.

  2. To prohibit any publication of this work or derivative works in
     whole or in part in standard (paper) book form for commercial
     purposes is prohibited unless prior permission is obtained from
     the copyright holder.

     To accomplish this, add the phrase “Distribution of the work or
     derivative of the work in any standard (paper) book form is
     prohibited unless prior permission is obtained from the copyright
     holder.” to the license reference or copy.