## Chapter 2 A tour of git: the basics

### 2.0 Copyright

This document is a modified version of a document originally titled "Distributed revision control with Mercurial" and originally authored by Bryan O’Sullivan. The original document was obtained from .

Copyright © 2006, 2007 Bryan O’Sullivan.

This material may be distributed only subject to the terms and conditions set forth in version 1.0 of the Open Publication License. Please refer to Appendix D for the license text.

As this is a modified version, the name of Bryan O'Sullivan is used only to properly credit him with the original text. The appearance of his name here explicitly does not assert or imply his endorsement of this modified document.

Portions Copyright © 2007 Carl Worth.

Changes made by Carl include the following:

* 2007-09-27:
    * Convert from HTML to markdown source syntax
    * Eliminate all content except Chapter 2 and Appendix D
    * Eliminate line numbers from examples
    * Modified to describe git instead of mercurial

The source of this modified version can be obtained via git:

    git clone git://cworth.org/git/hgbook-git

or

    git clone http://cworth.org/git/hgbook-git

and can be [browsed online](http://git.cworth.org/git/hgbook-git)

### 2.1 Installing git on your system

Prebuilt binary packages of git are available for many popular operating systems. These make it easy to start using git on your computer immediately.

#### 2.1.1 Linux

Because each Linux distribution has its own packaging tools, policies, and rate of development, it’s difficult to give a comprehensive set of instructions on how to install git binaries. The version of git that you will end up with can vary depending on how active the person is who maintains the package for your distribution.

To keep things simple, I will focus on installing git from the command line under the most popular Linux distributions. Most of these distributions provide graphical package managers that will let you install git with a single click.
The package name to look for is often git, but is sometimes git-core, (due to an unfortunate name collision with GNU Interactive Tools, which some distributions already package as git).

* Debian

      apt-get install git-core

* Fedora Core

      yum install git

* Gentoo

      emerge git

* OpenSUSE

      yum install git

* Ubuntu

      apt-get install git-core

#### 2.1.2 Mac OS X

A git-core package is available through [macports](http://macports.org). Once macports is enabled, the command to install git is:

    port install git-core

#### 2.1.3 Windows

Git has long been available as part of cygwin, and works reasonably well in that environment. Some people find cygwin a particularly inelegant approach to running git and would prefer a "native" solution. To this end, the [msysgit project](http://code.google.com/p/msysgit/) is rapidly putting together a solution including various packages with full installers. These include GitMe, a package to install the entire development environment necessary to work on improving the msysgit port of git, and WinGit, a package for installing just git itself without the development environment, (still in Alpha as of September 2007).

### 2.2 Getting started

To begin, we’ll use the “git version” command to find out whether git is actually installed properly. Versions 1.5 and newer of git are much more friendly to new users than versions 1.4 and older. If you aren't yet running version 1.5 or newer, it's highly recommended that you upgrade.

    $ git version
    git version 1.5.3.2

#### 2.2.1 Built-in help

Git provides a built-in help system. This is invaluable for those times when you find yourself stuck trying to remember how to run a command. If you are completely stuck, simply run “git help”; it will print a brief list of commonly-used commands, along with a description of what each does. If you ask for help on a specific command (such as "git help init"), it prints more detailed information.

[XXX: Does "git help " work universally as a built-in or does it expect man to be present and just call out to "man git-"?]

[XXX: The original hgbook includes the complete output of "hg help init" at this point. I'm not including the corresponding "git help init" output as it would be excessively long. The description alone is quite reasonable, (other than a not-too-helpful aside about the obsolete git-init-db command), but it only comes after a full screen's worth of options details. Might it make sense to have a more summarized help output for "git help " than all of the documentation available for git-? And perhaps also provide a "git -v help" similar to "hg -v help" for more?]

### 2.3 Working with a repository

In git, everything happens inside a repository. The repository for a project contains all of the files that “belong to” that project, along with a historical record of the project’s files.

There’s nothing particularly magical about a repository; it is simply a directory tree in your filesystem that git treats as special. You can rename or delete a repository any time you like, using either the command line or your file browser.

#### 2.3.1 Creating a local copy of a remote repository

As suggested, a repository can be copied through normal file-copying commands. But git also provides a "git clone" tool for copying a repository. This provides a means of copying a repository over the network, and is also useful with a local repository since it is much more efficient than creating a normal copy, (creating local clones is blazingly fast).

We've assembled a simple repository that will be used in the examples throughout this chapter. Go ahead and clone this repository now so that you will be able to follow along:

    $ git clone git://cworth.org/git/hello
    Initialized empty Git repository in /tmp/hello/.git/
    remote: Generating pack...
    remote: Done counting 15 objects.
    remote: Deltifying 15 objects...
    remote:  100% (15/15) done
    remote: Total 15 (delta 2), reused 15 (delta
    remote: 2)
    Indexing 15 objects...
     100% (15/15) done
    Resolving 2 deltas...
     100% (2/2) done

If for some reason you are prevented from talking on the git: port, then there is also the capability to clone a repository (less efficiently) over http:

    $ git clone http://cworth.org/git/hello
    Initialized empty Git repository in /tmp/hello/.git/
    Getting alternates list for http://cworth.org/git/hello
    Getting pack list for http://cworth.org/git/hello
    Getting index for pack 04ecb061314ecbd60fa0610ecf55a1cbf85ea294
    Getting pack 04ecb061314ecbd60fa0610ecf55a1cbf85ea294
     which contains a1a0e8b392b17caf50325498df54802fe3c03710
    walk a1a0e8b392b17caf50325498df54802fe3c03710
    walk 72d4f10e4a27dbb09ace1503c20dbac1912ee451
    walk 13ed136b983a9c439eddeea8a1c2076cffbb685f
    walk 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
    walk db7117a9dd9a6e57e8632ea5848e1101eee0fbde

If our clone succeeded, we should now have a local directory called hello. This directory will contain some files.

    $ ls -l
    total 4
    drwxr-xr-x 3 cworth cworth 4096 2007-09-27 16:40 hello
    $ ls hello
    hello.c  Makefile

These files have the same contents and history in our repository as they do in the repository we cloned.

Every git repository is complete, self-contained, and independent. It contains its own private copy of a project’s files and history. A cloned repository remembers the location of the repository it was cloned from, but it does not communicate with that repository, or any other, unless you tell it to.

What this means for now is that we’re free to experiment with our repository, safe in the knowledge that it’s a private “sandbox” that won’t affect anyone else.

#### 2.3.2 What’s in a repository?

When we take a more detailed look inside a repository, we can see that it contains a directory named .git. This is where git keeps all of its metadata for the repository.

    $ cd hello
    $ ls -a
    .  ..  .git  hello.c  Makefile

The contents of the .git directory and its subdirectories are private to git. Every other file and directory in the repository is yours to do with as you please.

To introduce a little terminology, the .git directory is the “real” repository, and all of the files and directories that coexist with it are said to live in the working directory. An easy way to remember the distinction is that the repository contains the history of your project, while the working directory contains a snapshot of your project at a particular point in history.

### 2.4 A tour through history

One of the first things we might want to do with a new, unfamiliar repository is understand its history. The “git log” command gives us a view of history.

    $ git log
    commit a1a0e8b392b17caf50325498df54802fe3c03710
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 15:43:07 2005 -0700

        Trim comments.

    commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:58 2005 -0700

        Get make to generate the final binary from a .o file.

    commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:43 2005 -0700

        Introduce a typo into hello.c.

    commit 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
    Author: Bryan O'Sullivan
    Date:   Fri Aug 26 01:21:28 2005 -0700

        Create a makefile

    commit db7117a9dd9a6e57e8632ea5848e1101eee0fbde
    Author: Bryan O'Sullivan
    Date:   Fri Aug 26 01:20:50 2005 -0700

        Create a standard "hello, world" program

By default, this command prints a brief paragraph of output for each change to the project that was recorded. In git terminology, we call each of these recorded events a commit.

The fields in a record of output from “git log” are as follows.

* commit
  This field consists of a string of 40 hexadecimal characters. This is a unique identifier for referring to particular commits.
* Author
  The identity of the person who authored the commit. This field consists of two sub-fields for the user's name and email address, (or at least an email-like identifier).
  Note that git stores a separate "Committer" field for the person who committed the change, (since often an author will email a change to a maintainer who commits it). The "git log" command doesn't display the Committer, but other git tools do.
* Date
  The date and time on which the commit was authored, (again stored separately from the date the change was committed). The date and time are displayed in the timezone of the person who created the commit.
* commit message
  The text message that the creator of the commit entered to describe the commit, (generally a one-line summary followed by more supporting text).

The default output printed by “git log” is purely a summary; it is missing a lot of detail.

#### 2.4.1 Commits, revisions, and talking to other people

As English is a notoriously sloppy language, and computer science has a hallowed history of terminological confusion (why use one term when four will do?), revision control has a variety of words and phrases that mean the same thing. If you are talking about git history with other people, you will find that what we have called a “commit” is often called a "revision". In other systems, a similar notion is referred to as a "changeset". You might even see abbreviations of these terms such as "rev", "change", or even "cset".

While it may not matter much what word you use to refer to the concept of “a commit”, it's important to know how to name “a specific commit”. We have already seen one means of referring to a particular commit, the 40-character hexadecimal string shown by "git log". These commit identifiers are powerful because they are permanent, unique identifiers that always identify the same commit in any copy of a repository. If two users are examining a working directory associated with the same commit identifier, then those two users have precisely the same contents in all files, and exactly the same history leading to that commit.

So there are places where it is often important to archive the complete commit identifier, (perhaps in bug-tracking systems to indicate a specific commit that fixes a bug, for example). But often, in more casual settings, it's more convenient to use abbreviated commit identifiers. Git accepts any unique prefix of a commit identifier, (and for reasonably-sized projects the first 8 or 10 characters are almost always unique).

And unlike the permanent commit identifiers, git also provides transient means of identifying commits. In fact, in day-to-day use of git, you will probably use these names more than commit identifiers. One example is branch names, (such as the default "master" branch in any git repository), or any project-specific branch names such as "stable", "experimental", or "crazy-insane-changes". Git also provides a special name "HEAD" which always refers to the most recent commit on the current branch.

#### 2.4.2 Naming related commits

Git offers simple ways to name revisions that are related to particular revisions in the history. One syntax is the ~ suffix which refers to the parent of a commit, or if followed by a number N, to the ancestor N generations back. For example, since "HEAD" refers to the most recent commit in the current branch, "HEAD~" refers to the previous commit, and "HEAD~2" refers to two commits back in the history.

Another useful syntax is .. which can be used to specify a range of commits. So "origin..master" specifies everything that has been committed to master since it diverged from origin.

#### 2.4.3 Viewing specific revisions

You can use "git log" to explore the range syntax just introduced. For example, to see a list of the most recent 3 revisions you can use "HEAD~3..", (the destination of the range is implicitly HEAD in this case):

    $ git log HEAD~3..
    commit a1a0e8b392b17caf50325498df54802fe3c03710
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 15:43:07 2005 -0700

        Trim comments.

    commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:58 2005 -0700

        Get make to generate the final binary from a .o file.

    commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:43 2005 -0700

        Introduce a typo into hello.c.

#### 2.4.4 Other log filters

Besides filtering by commit identifiers, git allows you to easily filter the log output according to which files (or directories) are modified. List them after "--", which is necessary to distinguish commit names from file names:

    $ git log -- Makefile
    commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:58 2005 -0700

        Get make to generate the final binary from a .o file.

    commit 0a633bf58b45fcf1a8299d3c82cd1fd26d3f48f2
    Author: Bryan O'Sullivan
    Date:   Fri Aug 26 01:21:28 2005 -0700

        Create a makefile

And "git log" can also filter based on the dates at which commits were created:

    $ git log --since="2 weeks ago" --until="yesterday"

Another useful option is -n or --max-count which, unsurprisingly, limits the maximum number of commits to be displayed.

#### 2.4.5 More detailed information

While the default information printed by “git log” is useful if you already know what you’re looking for, you may need to see more details of the change, such as the "diffstat" information with --stat:

    $ git log --stat --max-count=3
    commit a1a0e8b392b17caf50325498df54802fe3c03710
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 15:43:07 2005 -0700

        Trim comments.

     hello.c |    8 ++------
     1 files changed, 2 insertions(+), 6 deletions(-)

    commit 72d4f10e4a27dbb09ace1503c20dbac1912ee451
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:58 2005 -0700

        Get make to generate the final binary from a .o file.

     Makefile |    2 ++
     1 files changed, 2 insertions(+), 0 deletions(-)

    commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:43 2005 -0700

        Introduce a typo into hello.c.

     hello.c |    2 +-
     1 files changed, 1 insertions(+), 1 deletions(-)

Or perhaps you'd like to see the actual patch content of each change, which you can get with -p. That commit with the word typo in its name looks suspicious, so let's take a closer look. Remember that we can name it as master~2, HEAD~2, or any prefix of its commit identifier, (such as 13ed136b):

    $ git log -p -n 1 13ed136b
    commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:43 2005 -0700

        Introduce a typo into hello.c.

    diff --git a/hello.c b/hello.c
    index ed55ec0..80b260c 100644
    --- a/hello.c
    +++ b/hello.c
    @@ -11,6 +11,6 @@

     int main(int argc, char **argv)
     {
    -    printf("hello, world!\n");
    +    printf("hello, world!\");
         return 0;
     }

Of course, wanting to see all this information for a single commit is such a common operation that it's given its own name in git, "git show". So "git show 13ed136b" is a much easier way to get exactly the same output:

    $ git show 13ed136b
    commit 13ed136b983a9c439eddeea8a1c2076cffbb685f
    Author: Bryan O'Sullivan
    Date:   Tue Sep 6 13:15:43 2005 -0700

        Introduce a typo into hello.c.

    diff --git a/hello.c b/hello.c
    index ed55ec0..80b260c 100644
    --- a/hello.c
    +++ b/hello.c
    @@ -11,6 +11,6 @@

     int main(int argc, char **argv)
     {
    -    printf("hello, world!\n");
    +    printf("hello, world!\");
         return 0;
     }

### 2.5 All about command options

Let’s take a brief break from exploring git commands to discuss a pattern in the way that they work; you may find this useful to keep in mind as we continue our tour.

Git has a consistent and straightforward approach to dealing with the options that you can pass to commands. It follows the conventions for options that are common to modern Linux and Unix systems.

* Most options have long names. For example, as we’ve already seen, the “git log" command accepts a --max-count= option.
* Some options have short, single-character names.
  Often these are aliases for long options, (such as "-n " instead of --max-count=), but sometimes the option exists in short-form with no long-form equivalent, (such as -p). [XXX: It wouldn't hurt to fix this by adding --patch, etc. right?]
* Long options start with two dashes (e.g. --max-count), while short options start with one (e.g. -n).
* Option naming and usage is consistent across commands. For example, every command that lets you specify a commit identifier or range will accept the same expressions, (HEAD~3, origin..master, 72d4f10e, etc), while any command that can be limited by paths will accept the same expressions ("-- doc/ some-file.c"), etc.

Many commands that print output of some kind can be made more quiet by passing the -q or --quiet options.

### 2.6 Making and reviewing changes

Now that we have a grasp of viewing history in git, let’s take a look at making some changes and examining them.

The first thing we’ll do is isolate our experiment in a repository of its own. We use the “git clone” command, but we don’t need to clone a copy of the remote repository. Since we already have a copy of it locally, we can just clone that instead. This is much faster than cloning over the network, and cloning a local repository uses less disk space in most cases, too.

    $ cd ..
    $ git clone hello my-hello
    Initialized empty Git repository in /tmp/my-hello/.git/
    0 blocks

[XXX We say "empty" here, (presumably from the git-init part), but shouldn't the command also report the successful clone which makes it non-empty? And what the heck does "0 blocks" mean?]

As an aside, it’s often good practice to keep a “pristine” copy of a remote repository around, which you can then make temporary clones of to create sandboxes for each task you want to work on. This lets you work on multiple tasks in parallel, each isolated from the others until it’s complete and you’re ready to integrate it back.
Because local clones are so cheap, there’s almost no overhead to cloning and destroying repositories whenever you want.

Alternatively, you can achieve much the same effect by creating multiple branches in a single repository, (but we won't go into detail on how to do that in this chapter). Some people greatly appreciate having multiple branches in a single repository rather than having many repositories cluttering up their filesystem. Other people prefer the ability to have working-tree changes, and intermediate build files, etc. each isolated in a separate repository per branch. Both modes are very well-supported by git, so it's really a matter of which you find most appropriate at any time given your tastes and project workflows.

In our my-hello repository, we have a file hello.c that contains the classic “hello, world” program. Let’s use the ancient and venerable sed command to edit this file so that it prints a second line of output. (I’m only using sed to do this because it’s easy to write a scripted example this way. Since you’re not under the same constraint, you probably won’t want to use sed; simply use your preferred text editor to do the same thing.)

    $ sed -i '/printf/a\\tprintf("hello again!\\n");' hello.c

The “git status” command will tell us what git knows about the files in the repository.

    $ ls
    hello.c  Makefile
    $ git status
    # On branch master
    # Changed but not updated:
    #   (use "git add ..." to update what will be committed)
    #
    #       modified:   hello.c
    #
    no changes added to commit (use "git add" and/or "git commit -a")

We see that the “git status” command prints a line with "modified" for hello.c. The “git status” command will not print any output for files that have not been modified.

Notice that we didn’t need to inform git that we were going to modify the file before we started, or that we had modified the file after we were done; it was able to figure this out itself.
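
If you would like to see this behaviour without touching the tutorial repository, here is a minimal sketch that builds a throwaway repository and lets git notice an edit it was never told about. The scratch directory, the file name notes.txt, and the identity values are all made up for the example, and "git diff --name-only" is used only because its output is easy to capture in a variable.

```shell
#!/bin/sh
# Sketch: git detects a modified tracked file on its own.
# The scratch directory, file name, and identity are hypothetical.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .

# Supply an identity through the environment so no config file is needed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

printf 'first line\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

# Nothing has changed since the commit, so this list is empty.
before=$(git diff --name-only)

# Edit the file with an ordinary tool; git is not informed in any way.
printf 'second line\n' >> notes.txt
after=$(git diff --name-only)

echo "modified before the edit: '$before'"
echo "modified after the edit:  '$after'"
```

The only step that involves git between the two checks is the check itself; the edit is a plain shell redirection.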

It’s a little bit helpful to know that we’ve modified hello.c, but we might prefer to know exactly what changes we’ve made to it. To do this, we use the “git diff” command.

    $ git diff
    diff --git a/hello.c b/hello.c
    index 9a3ff79..6d28887 100644
    --- a/hello.c
    +++ b/hello.c
    @@ -8,5 +8,6 @@
     int main(int argc, char **argv)
     {
         printf("hello, world!\");
    +    printf("hello again!\n");
         return 0;
     }

### 2.7 Recording changes in a new commit

We can modify files, build and test our changes, and use “git status” and “git diff” to review our changes, until we’re satisfied with what we’ve done and arrive at a natural stopping point where we want to record our work in a new commit.

The “git commit” command lets us create a new changeset; we’ll usually refer to this as “making a commit” or “committing”.

#### 2.7.1 Setting up a username

When you try to run “git commit” for the first time, it might not do exactly what you want. Git records your name and address with each change that you commit, (as both author and committer unless you tell it otherwise), so that you and others will later be able to tell who made each change. Git tries to automatically figure out a sensible name and address to attribute to both author and committer. It will attempt each of the following methods, in order, (stopping for each field as soon as a value is found):

1. If you specify a --author option to the “git commit” command on the command line, followed by a "Real Name " string, then this name and address will be used for the author fields. The committer fields will still be determined as below. This option is very helpful when applying a commit originally authored by someone other than yourself.
2. If any of the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_COMMITTER_NAME, or GIT_COMMITTER_EMAIL environment variables are set, then those values will be used for the corresponding fields.
3.
   If you have a file in the local repository called .git/config, with name or email settings in the [user] section, then these values will be used to set any remaining author and committer fields.
4. If you have a file in your home directory called .gitconfig, again with name or email settings in the [user] section, then these values will be used to set any remaining author and committer fields. For more details on the contents of this file, see "Creating a git configuration file" below.
5. If you have set the EMAIL environment variable, this will be used to set author and committer email addresses if still unset.
6. git will query your system to find out your real name from the available GECOS field, and your username, hostname, and domain to construct an email address, (or at least an identifier resembling an email address).

If all of these mechanisms fail, "git commit" will fail, printing an error message instructing you how to use "git config" to tell git your name and email address.

You should think of the GIT_AUTHOR/COMMITTER_NAME/EMAIL environment variables and the --author option to the “git commit” command as ways to override git’s default selection. For normal use, the simplest and most robust way to set your information is by creating a .gitconfig file, (either manually or with the "git config" command); see below for details.

##### Creating a git configuration file

To set your name and email address, just use the following commands:

    git config --global user.name "Your Name"
    git config --global user.email "you@example.com"

The --global option means that this command will set global information, (affecting all repositories on this machine), in the .gitconfig file in your home directory. Alternately, you could omit the --global which would make the change take effect only in the local repository. This is convenient if you want to have different email addresses associated with different projects, for example.
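
To see how the global and repository-local settings interact, here is a small sketch, offered as an experiment rather than a recommended setup. It points HOME at a throwaway directory so that your real .gitconfig is never touched; the project directory and the second email address are invented for the example.

```shell
#!/bin/sh
# Sketch: a repository-local setting shadows the global one.
# A throwaway HOME keeps the experiment away from your real ~/.gitconfig.
HOME=$(mktemp -d)
export HOME
unset XDG_CONFIG_HOME   # make sure --global resolves to $HOME/.gitconfig

git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# "project" is a hypothetical repository used only for this demonstration.
git init -q "$HOME/project"
cd "$HOME/project" || exit 1

# Without --global the value lands in this repository's .git/config.
git config user.email "you@project.example.com"

global_email=$(git config --global user.email)
effective_email=$(git config user.email)
effective_name=$(git config user.name)

echo "global email:    $global_email"
echo "effective email: $effective_email"
echo "effective name:  $effective_name"
```

Inside the project the local email address is the one git reports, while the name, which was only set globally, is still picked up from the global file.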

Of course, git's configuration file is a simple-to-edit plain-text file, so instead of using the above commands, you can also just edit the files directly. Use your favorite editor to create a file called .gitconfig in your home directory, (or if you ran the above commands then it will be there already). The initial contents of your .gitconfig should look like this.

    # This is a git configuration file.
    [user]
        name = Your Name
        email = you@example.com

Similarly, you can make a repository-specific configuration by editing .git/config in the local repository. It will already have some sections present, (created by the "git clone"), just add a [user] section as above.

The “[user]” line begins a section of the config file, so you can read the “name = ...” line as meaning “set the value of the name item in the user section”. This is the same notion expressed with the "user.name" syntax on the git-config command line. A section continues until a new section begins, or the end of the file. Git ignores empty lines and treats any text from “#” to the end of a line as a comment.

##### Choosing a user name

You can use any text you like as the value of the name and email configuration items, since this information is for reading by other people, not for interpreting by git. It is conventional to use a valid email address, but some, (notably Linus Torvalds, the original author of git), actually like the default user@hostname convention that git falls back on without any additional information. There's no requirement that the email address actually be valid, and perhaps it's useful to be reminded which machine was used to create particular commits.

#### 2.7.2 Writing a commit message

When we commit a change, git drops us into a text editor to enter a message that will describe the modifications we’ve made in this commit. This is called the commit message. It will be a record for readers of what we did and why, and it will be printed by “git log” after we’ve finished committing.

    $ git commit -a

Note: The -a on the command-line instructs git to commit the new content of *all* tracked files that have been modified. This is a convenience over explicitly listing filenames to be committed on the "git commit" command line. Listing filenames explicitly is useful when there is a need to commit only some subset of the files that have been modified.

If new files need to be committed for the first time, just use "git add" on them before "git commit -a". If a file needs to be removed, just remove it as normal before committing and "git commit -a" will notice that; it does not need to be explicitly told about the removal.

The editor that the “git commit” command drops us into will contain an empty line, followed by a number of lines starting with “#”.

    empty line
    # Please enter the commit message for your changes.
    # (Comment lines starting with '#' will not be included)
    # On branch master
    # Changes to be committed:
    #   (use "git reset HEAD ..." to unstage)
    #
    #       modified:   hello.c
    #

git ignores the lines that start with “#”; it uses them only to tell us which files it’s recording changes to. Modifying or deleting these lines has no effect.

#### 2.7.3 Writing a good commit message

A good commit message will generally have a single line that summarizes the commit, a blank line, and then one or more paragraphs with supporting detail. Since many tools only print the first line of a commit message by default, it’s important that the first line stands alone.

One example of a first-line-only viewer is "git log --pretty=short". Other examples include graphical history viewers such as gitk and gitview, and web-based viewers such as gitweb and cgit.

Here’s a real example of a commit message that doesn’t follow this guideline, and hence has a summary that is not readable.

    $ git log --pretty=short
    commit 3ef5535144da88a854f7930503845cd44506c2e2
    Author: Censored Person

        include buildmeister/commondefs.
        Add an exports and install

As far as the remainder of the contents of the commit message are concerned, there are no hard-and-fast rules. git itself doesn’t interpret or care about the contents of the commit message, though your project may have policies that dictate a certain kind of formatting.

My personal preference is for short, but informative, commit messages that tell me something that I can’t figure out with a quick glance at the output of “git log -p".

#### 2.7.4 Aborting a commit

If you decide that you don’t want to commit while in the middle of editing a commit message, simply exit from your editor without saving the file that it’s editing. This will cause nothing to happen to either the repository or the working directory.

#### 2.7.5 Admiring our new handiwork

Once we’ve finished the commit, we can use the “git show” command to display the commit we just created. As discussed previously, this command produces output that is identical to “git log -p”, but for only a single revision, (and the most recent revision by default):

    $ git show
    commit 018cfb742be6176443ffddac454e593e802ddf3e
    Author: Carl Worth
    Date:   Thu Sep 27 23:55:00 2007 -0700

        Added an extra line of output.

        If I would have been clever I would have fixed that old typo
        while I was at it...

    diff --git a/hello.c b/hello.c
    index 9a3ff79..6d28887 100644
    --- a/hello.c
    +++ b/hello.c
    @@ -8,5 +8,6 @@
     int main(int argc, char **argv)
     {
         printf("hello, world!\");
    +    printf("hello again!\n");
         return 0;
     }

Note that you will not see the same commit identifier for your commit, even if the change you made is identical to mine. The commit identifier incorporates not only the contents of the files, but also the commit message, the author and committer names and emails, and the author and commit dates. (OK, so now you probably know enough to be able to guess the right command to produce a commit with exactly the commit identifier shown above. Can you do it?)
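
One way to convince yourself of what feeds into a commit identifier is the following sketch. It creates the same one-file commit in several fresh repositories while pinning the names, emails, and dates through git's GIT_AUTHOR_* and GIT_COMMITTER_* environment variables. The make_commit helper, the file name, and all identity values are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: identical inputs give identical commit identifiers.
# make_commit is a hypothetical helper; $1 is the commit message.
make_commit() {
    repo=$(mktemp -d)
    (
        cd "$repo" || exit 1
        git init -q .
        printf 'hello\n' > file.txt
        git add file.txt
        # Pin every input besides the message: identities and both dates.
        export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
        export GIT_COMMITTER_NAME="Example Committer" GIT_COMMITTER_EMAIL="committer@example.com"
        export GIT_AUTHOR_DATE="2007-09-27 23:55:00 -0700"
        export GIT_COMMITTER_DATE="2007-09-27 23:55:00 -0700"
        git commit -q -m "$1"
        # Print the full 40-character identifier of the new commit.
        git rev-parse HEAD
    )
}

first=$(make_commit "Add file.txt")
second=$(make_commit "Add file.txt")
third=$(make_commit "Add file.txt, reworded")

echo "first:  $first"
echo "second: $second"
echo "third:  $third"
```

The first two identifiers agree even though the commits live in unrelated directories, and changing nothing but the message is enough to produce a different identifier for the third.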

#### 2.7.6 Fixing up a broken commit (before anyone else sees it)

So now that we've cloned a local repository, made a change to the code, set up our name and email address, and made a commit with a careful message, we're just about ready to share our change with the world. But wait, we forgot to try to compile it, didn't we?

    $ make
    cc    -c -o hello.o hello.c
    hello.c:10:9: warning: missing terminating " character
    hello.c:10:9: warning: missing terminating " character
    hello.c: In function ‘main’:
    hello.c:10: error: missing terminating " character
    hello.c:11: error: expected ‘)’ before ‘;’ token
    hello.c:13: warning: passing argument 1 of ‘printf’ makes pointer from integer without a cast
    hello.c:13: error: expected ‘;’ before ‘}’ token
    make: *** [hello.o] Error 1

Oh look. The code's broken and doesn't compile. We don't want to share code in this state. For situations where you notice one tiny detail that got left out of the last commit, (a silly syntax error, a misspelling in a comment or commit message), git provides a very handy tool for just changing the last commit.

So fix that typo, (a missing 'n' between the '\' and the '"'), with your editor or with something like this:

    sed -i 's/\\"/\\n"/' hello.c

And then you can just amend the previous commit rather than creating a new one with the --amend option to "git commit":

    $ git commit -a --amend

Note that we use -a to include the code change here. This also points out a situation where "git commit" is useful without the -a option: "git commit --amend" on its own amends just the last commit message, without committing any new code changes, even if some files have been modified in the working tree.

And here's the final result:

    $ git show
    commit 839b58d021c618bd0e1d336d4d5878a0082672e6
    Author: Carl Worth
    Date:   Thu Sep 27 23:55:00 2007 -0700

        Added an extra line of output and fixed the typo bug.
diff --git a/hello.c b/hello.c index 9a3ff79..ca750e0 100644 --- a/hello.c +++ b/hello.c @@ -7,6 +7,7 @@ int main(int argc, char **argv) { - printf("hello, world!\"); + printf("hello, world!\n"); + printf("hello again!\n"); return 0; } I can't help but point out that this really was a poor example for --amend. The end result is a single commit that does two independent things, (fixes one bug and adds one new feature). It's much better to create a code history where each commit makes an independent change, (and as small as possible). This is important for several reasons: * Small changes are easier to review * Independent changes are easier to split up if only part of the series gets accepted "upstream" for one reason or another. * The smaller the changes are the more useful the history will be when actually using the history, not just viewing it. This is particularly important when doing "git bisect"---that's a powerful tool for isolating the single commit that introduces a bug. And it's much more powerful if the commit it isolates is as small as possible. So it's a good thing this document is available under a license that allows for distribution of modified versions. Someone should clean up the --amend example to not teach bad habits like I did above. [Note: All this bad-habit stuff was introduced by me, and was not present in Bryan's original chapter. -Carl] ### 2.8 Sharing changes We mentioned earlier that repositories in git are self-contained. This means that the commit we just created exists only in our my-hello repository. Let’s look at a few ways that we can propagate this change into other repositories. #### 2.8.1 Pulling changes from another repository To get started, let’s clone our original hello repository, which does not contain the change we just committed. We’ll call our temporary repository hello-pull. $ cd .. 
$ git clone hello hello-pull Initialized empty Git repository in /tmp/hello-pull/.git/ 0 blocks We could use the “git pull” command to apply changes from my-hello to our master branch in hello-pull. However, blindly pulling unknown changes into a repository is a somewhat scary prospect. The "git pull" command is coneptually the combination of two commands, "git fetch" and "git merge"; we can run those separately to examine the changes before applying them locally. First we do the fetch: $ cd hello-pull $ git fetch ../my-hello remote: Generating pack... Unpacking 3 objects... 100% (3/3) done remote: Done counting 5 objects. Result has 3 objects. Deltifying 3 objects... 100% remote: (3/3) done Total 3 (delta 1), reused 0 (delta 0) The fetched commits (or commit in this case) are available as the name FETCH_HEAD. [XXX: Shouldn't git-fetch print that name out to the user if the user didn't provide a specific branch name to fetch into.] And the difference between what we had before and what exists on FETCH_HEAD can easily be examined with the ..FETCH_HEAD range notation: $ git log ..FETCH_HEAD commit 839b58d021c618bd0e1d336d4d5878a0082672e6 Author: Carl Worth Date: Thu Sep 27 23:55:00 2007 -0700 Added an extra line of output and fixed the typo bug. Since these commits actually exist in the local repository now, we don't need to fetch or pull them from the remote repository again---we can now use "git merge" to apply the previously fetched commits. (A mercurial user might notice here that git does not have the race condition between "hg incoming" and "hg pull" that mercurial has since the commits are fetched only once.) $ git merge FETCH_HEAD Updating a1a0e8b..839b58d Fast forward hello.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) Notice that "git merge" reports that our branch pointer has been updated from a1a0e8b to 839b58d. Also, this is a "fast forward" meaning that the new commits are a linear sequence on top of the commit we already hand. 
In other words, there wasn't any divergence between these two repositories, so no actual "merge" commit was created.

This separation of fetch and merge is useful when you need to carefully review some changes before applying them. But often you're in a situation where you know you trust the remote repository and you simply want to pull those changes as conveniently as possible (no extra commands, no typing a magic name like FETCH_HEAD). This is the case when tracking the upstream development of a project with git. In that case, the above steps are as simple as just executing "git pull". So let's repeat all that the simpler way:

    $ cd ..
    $ git clone hello hello-tracking
    Initialized empty Git repository in /tmp/hello-tracking/.git/
    0 blocks
    $ cd hello-tracking
    $ git pull ../my-hello
    remote: Generating pack...
    remote: Done counting 5 objects.
    Result has 3 objects.
    Deltifying 3 objects...
    Unpacking 3 objects...
    remote:  100% (3/3) done
    Total 3 (delta 1), reused 0 (delta 0)
     100% (3/3) done
    Updating a1a0e8b..839b58d
    Fast forward
     hello.c |    3 ++-
     1 files changed, 2 insertions(+), 1 deletions(-)

It should be plain to see that the "git pull" command really did the combined sequence of "git fetch" and "git merge". Also, if you want to pull from the same repository you cloned from originally (which is the common case for the upstream-tracking scenario), then "git pull" with no explicit repository is sufficient, and it will default to pulling from the same repository as the original clone.

[XXX: The structure of the preceding section follows that of the original hgbook. But an alternate structure that arranged to pull from the originally cloned repository (as would be common) would allow for more straightforward use of git's features. For example, instead of the silly FETCH_HEAD stuff it would allow for "git fetch" and "git log master..origin" to be a very nice replacement for "hg incoming".
Similarly, below, "git log origin..master" would make a nice replacement for "hg outgoing", which is something I didn't offer at all. One could also use git's remotes with the myriad repositories as used here, but it would require doing things like "git remote add ../hello-pull" and that seems like a bit much to introduce for a tutorial of this level. If nothing else, if the above section seems a little intimidating, understand that it's because things are not presented in the most natural "git way" (and I'm a little too tired to fix it tonight).]

Note: Mercurial users who are reading this might wonder if there's a need for the equivalent of "hg update" after doing a "git pull". The answer is no. Unlike mercurial, "git pull" and "git merge" will automatically update the working-directory files as necessary.

#### 2.8.2 Checking out previous revisions

It's often useful to examine the working-tree state of some specific revision other than the tip of some branch. For example, maybe you would like to build a particular tagged version, or maybe you'd like to test the behavior of the code before a particular change was introduced. To do this, use "git checkout" and pass it the name of any revision (a branch name, a tag name, or any other commit identifier). For example, to examine our project before the original typo was introduced:

    $ git checkout 0a633bf5
    Note: moving to "0a633bf5" which isn't a local branch
    If you want to create a new branch from this checkout, you may do so
    (now or later) by using -b with the checkout command again. Example:
      git checkout -b
    HEAD is now at 0a633bf... Create a makefile

The note that git gives us indicates that we are checking out a non-branch revision. This is perfectly fine if we are just exploring history, but if we actually wanted to use this revision as the basis for new commits, we would first have to create a new branch name as it describes.
If we were to use "git checkout" with a branch name, then that would change the current branch (meaning that any new commits would advance that branch pointer).

For now, let's return to the tip of the master branch by just checking it out again:

    $ git checkout master
    Previous HEAD position was 0a633bf... Create a makefile
    Switched to branch "master"

#### 2.8.3 Pushing changes to another repository

Git lets us push changes to another repository from the repository we’re currently visiting. As with the previous examples, we’ll first create a temporary repository to push our changes into. But instead of using "git clone", this time we'll use "git init" to make a repository from an empty directory. We do this to create a "bare" repository, which is simply a repository that has no working-directory files associated with it. In general, you should only push to bare repositories.

    $ cd ..
    $ mkdir hello-push
    $ cd hello-push
    $ git --bare init
    Initialized empty Git repository in /tmp/hello-push/

And then we'll go back to our my-hello repository to perform the push. Since this is our very first push into this repository, we need to tell git which branches to push. The easiest way to do this is to use --all to indicate all branches:

    $ cd ../my-hello
    $ git push ../hello-push --all
    updating 'refs/heads/master'
      from 0000000000000000000000000000000000000000
      to   839b58d021c618bd0e1d336d4d5878a0082672e6
    Generating pack...
    Done counting 18 objects.
    Deltifying 18 objects...
     100% (18/18) done
    Writing 18 objects...
     100% (18/18) done
    Total 18 (delta 3), reused 0 (delta 0)
    Unpacking 18 objects...
     100% (18/18) done
    refs/heads/master: 0000000000000000000000000000000000000000 -> 839b58d021c618bd0e1d336d4d5878a0082672e6

For subsequent pushes we don't need to specify --all, as "git push" will push all branches that exist in both the local and remote repositories.

What happens if we try to pull or push changes and the receiving repository already has those changes? Nothing too exciting.

    $ git push ../hello-push
    Everything up-to-date

#### 2.8.4 Sharing changes over a network

The commands we have covered in the previous few sections are not limited to working with local repositories. Each works in exactly the same fashion over a network connection; simply pass in a URL or an ssh host:/path/name specification instead of a local path.

## Appendix D Open Publication License

Version 1.0, 8 June 1999

### D.1 Requirements on both unmodified and modified versions

The Open Publication works may be reproduced and distributed in whole or in part, in any medium physical or electronic, provided that the terms of this license are adhered to, and that this license or an incorporation of it by reference (with any options elected by the author(s) and/or publisher) is displayed in the reproduction.

Proper form for an incorporation by reference is as follows:

Copyright (c) year by author’s name or designee. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, vx.y or later (the latest version is presently available at [http://www.opencontent.org/openpub/][http://www.opencontent.org/openpub/]).

The reference must be immediately followed with any options elected by the author(s) and/or publisher of the document (see section D.6).

Commercial redistribution of Open Publication-licensed material is permitted.

Any publication in standard (paper) book form shall require the citation of the original publisher and author. The publisher and author’s names shall appear on all outer surfaces of the book. On all outer surfaces of the book the original publisher’s name shall be as large as the title of the work and cited as possessive with respect to the title.

### D.2 Copyright

The copyright to each Open Publication is owned by its author(s) or designee.

### D.3 Scope of license

The following license terms apply to all Open Publication works, unless otherwise explicitly stated in the document.
Mere aggregation of Open Publication works or a portion of an Open Publication work with other works or programs on the same media shall not cause this license to apply to those other works. The aggregate work shall contain a notice specifying the inclusion of the Open Publication material and appropriate copyright notice.

Severability. If any part of this license is found to be unenforceable in any jurisdiction, the remaining portions of the license remain in force.

No warranty. Open Publication works are licensed and provided “as is” without warranty of any kind, express or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose or a warranty of non-infringement.

### D.4 Requirements on modified works

All modified versions of documents covered by this license, including translations, anthologies, compilations and partial documents, must meet the following requirements:

  1. The modified version must be labeled as such.
  2. The person making the modifications must be identified and the modifications dated.
  3. Acknowledgement of the original author and publisher if applicable must be retained according to normal academic citation practices.
  4. The location of the original unmodified document must be identified.
  5. The original author’s (or authors’) name(s) may not be used to assert or imply endorsement of the resulting document without the original author’s (or authors’) permission.

### D.5 Good-practice recommendations

In addition to the requirements of this license, it is requested from and strongly recommended of redistributors that:

  1. If you are distributing Open Publication works on hardcopy or CD-ROM, you provide email notification to the authors of your intent to redistribute at least thirty days before your manuscript or media freeze, to give the authors time to provide updated documents. This notification should describe modifications, if any, made to the document.
  2. All substantive modifications (including deletions) be either clearly marked up in the document or else described in an attachment to the document.
  3. Finally, while it is not mandatory under this license, it is considered good form to offer a free copy of any hardcopy and CD-ROM expression of an Open Publication-licensed work to its author(s).

### D.6 License options

The author(s) and/or publisher of an Open Publication-licensed document may elect certain options by appending language to the reference to or copy of the license. These options are considered part of the license instance and must be included with the license (or its incorporation by reference) in derived works.

  1. To prohibit distribution of substantively modified versions without the explicit permission of the author(s). “Substantive modification” is defined as a change to the semantic content of the document, and excludes mere changes in format or typographical corrections.

     To accomplish this, add the phrase “Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder.” to the license reference or copy.

  2. To prohibit any publication of this work or derivative works in whole or in part in standard (paper) book form for commercial purposes is prohibited unless prior permission is obtained from the copyright holder.

     To accomplish this, add the phrase “Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright holder.” to the license reference or copy.