Git
References
- Git Home
- Git Tutorial
- Git on Ubuntu
- Pro Git
- Git, from the bottom up
- Linux Greatest Invention
- Tech Talk: Linus Torvalds on git
- Git cheat sheet
Introduction
Git Features:
- Reliability
- Performance
- Distributed
Distributed
The distributed model originally comes from BitKeeper. Mercurial is another distributed SCM.
- No single repository. Everybody always has their own copy of the repository. The repository content is pulled from other people's repositories.
- No politics, no commit access control. All work is always done locally, so there is no need to define such politics.
Reliability
Every change, file, directory, etc. is cryptographically hashed (sha1sum).
- Easy corruption detection. Any tampering to a file or directory content (either malicious or because of hardware failure) is immediately detected.
- Easy distribution. Moreover because the repository is distributed all over the place, it is very easy to repair a given repository. You only need to drop all broken objects, and get all missing objects from a remote copy.
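The hashing described above can be observed directly. The following is a minimal sketch, assuming git is installed; the scratch directory and the sample strings are made up for illustration. It shows that an object id depends only on the content, and that git fsck verifies the object database.

```shell
#!/bin/sh
# Sketch: content-addressed storage and integrity checking in a scratch repository.
set -e
repo=$(mktemp -d)            # throwaway directory, illustration only
cd "$repo"
git init -q .

# The object id is derived from the content alone:
# identical content gives the identical id, any change gives a different id.
id1=$(printf 'hello\n' | git hash-object -w --stdin)   # -w also stores the blob
id2=$(printf 'hello\n' | git hash-object --stdin)
id3=$(printf 'hello!\n' | git hash-object --stdin)
echo "same content:      $id1 / $id2"
echo "different content: $id3"

# Read the stored blob back using its id.
content=$(git cat-file -p "$id1")
echo "stored content:    $content"

# Verify connectivity and validity of all objects in the repository.
git fsck >/dev/null 2>&1 && fsck_status=ok
echo "fsck: $fsck_status"
```

If an object file under .git/objects were damaged, git fsck would report it, and the broken objects could then be fetched again from another copy of the repository.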
Performance
Commits are very fast because they go to the local repository; no network access is needed.
Terminology and Concepts
- commit
- A commit is a snapshot of your working tree at some point in time. There are different ways to name a commit:
- branchname — a branch name is an alias for most recent commit on that branch
- tagname — similar to a branch alias, but that does not change in time
- HEAD — currently checked out commit
- c82a22c — the SHA-1 hash id of the commit (can be truncated as long as it remains unique)
- name^ — the parent of commit name
- name^^ — the grand-parent of commit name (and so on)
- name^2 — the 2nd parent of commit name (and so on)
- name~10 — the 10th ancestor of commit name (same as name^^^^^^^^^^)
- name:path — reference a specific file/directory in a given commit
- name^{tree} — reference the tree held by a commit
- name1..name2 — a commit range, i.e. all commits reachable from name2 back to, but not including, name1 (if either name is omitted, HEAD is used instead)
- name1...name2 — refers to all commits reachable from name1 or name2, but not from both. For git diff, refers to the changes between the common ancestor of name1 and name2, and name2.
- master.. — to review changes made to the current branch
- ..master — after a fetch, to review all changes that occurred since the last rebase or merge
- --since="2 weeks ago" — all commits since a certain date
- --until="1 week ago" — all commits up to a certain date
- --grep=pattern — all commits whose commit message matches the regular expression pattern.
- --committer=pattern — all commits whose committer matches the pattern
- --author=pattern — all commits whose author matches the pattern
- --no-merges — all commits in a range that have only one parent (i.e. ignore all merge commits)
- detached head
- When HEAD is no longer a reference to anything (like ref: refs/heads/branch), but instead contains the actual hash of a commit.
git checkout -b newbranch # To attach HEAD back on a new branch...
- hunk
- individual change within a file (basically a file diff output is made of a sequence of one or more hunks).
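The commit-naming forms and the detached-HEAD state described above can be tried out in a scratch repository. This is only a sketch, assuming git is installed; the file name, commit messages and the Demo User identity are invented for the example.

```shell
#!/bin/sh
# Sketch: naming commits and detaching / re-attaching HEAD.
set -e
repo=$(mktemp -d)                       # throwaway directory, illustration only
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master # make sure the branch is called 'master'
git config user.name "Demo User"        # hypothetical identity, local to this repo
git config user.email demo@example.com

# Build a linear history of three commits.
for n in 1 2 3; do
  echo "$n" > file.txt
  git add file.txt
  git commit -q -m "commit $n"
done

first=$(git rev-parse HEAD~2)                       # 2nd ancestor of HEAD
parent=$(git rev-parse HEAD^)                       # parent of HEAD
same_parent=$(git rev-parse master~1)               # same commit, named differently
in_range=$(git rev-list --count "$first"..master)   # commits after $first, up to master
old_content=$(git show HEAD~2:file.txt)             # name:path form

# Checking out a commit by hash detaches HEAD: it is no longer a symbolic ref.
git checkout -q "$first"
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi

# Attach HEAD back by creating a branch at the current commit.
git checkout -q -b newbranch
attached_to=$(git symbolic-ref --short HEAD)

echo "range size: $in_range, old content: $old_content"
echo "detached: $detached, now on: $attached_to"
```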
Install
Install the following essential packages:
- git-core — the main program
- git-gui — a gui front-end
Optionally install also:
- git-doc — documentation
- gitweb — Web interface
- ViewGit — Another web interface
- gitosis — Project management:
- tig — a text-mode repository browser interface to git and color pager.
tig # launch browser
git show | tig # Use as pager. Colorize output of git-show
- gitview — Git Repository browser
- gitg — a Git repository browser targeting Gtk+ / GNOME. The version delivered with Lucid/Maverick is very old. Compile from the sources to get the latest version. Alternatively, the repository ppa:pasgui/ppa contains a more recent version.
- qgit — A graphical interface to git repositories using QT
To install the ppa:pasgui/ppa repository mentioned for gitg, create a file /etc/apt/sources.list.d/pasgui-ppa-lucid.list (change lucid as necessary):
deb http://ppa.launchpad.net/pasgui/ppa/ubuntu lucid main
deb-src http://ppa.launchpad.net/pasgui/ppa/ubuntu lucid main
Then add the apt key:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F599ACE3
Configuration
References:
- Git Community Book - Customizing Git
- Git handy feedback on command-line
Global per-user configuration settings are stored in file ~/.gitconfig
- Add color to git output for all commands:
git config --global color.ui true
- Define author/email:
git config --global user.name "Your Name"
git config --global user.email you@example.com
- Add some frequently used aliases:
git config --global alias.st 'status'
git config --global alias.ci 'commit'
git config --global alias.co 'checkout'
git config --global alias.br 'branch'
git config --global alias.last 'log -1 HEAD'
git config --global alias.hist 'log --oneline --graph --decorate -45'
git config --global alias.ha 'log --oneline --graph --decorate -45 --all'
git config --global alias.dc 'diff --cached'
git config --global alias.wdiff 'diff --color-words'
git config --global alias.wshow 'show --color-words'
- Some handy scripts:
- git-wtf displays the state of your repository in a readable and easy-to-scan format
How-To
Here we shall describe how to perform some tasks in Git.
Cloning to/from a Server using SSH
Reference: [1]
Clone a local repository to the remote server griffin through SSH. Repositories are all stored in a directory repositories/ in the home directory of the remote user git:
git clone --bare myproject myproject.git # Create a bare clone of your repository, if not available yet
scp -r myproject.git/ git@griffin:repositories/ # Copy the repository to server - requires SSH access
rm -rf myproject.git # Delete local bare clone
Now any other user that has SSH access to git@griffin may get a copy of that repository with
git clone git@griffin:repositories/myproject.git # Clone repository and create working tree in myproject/
Now, the user that created the repository in the first place can
- either delete their own repository and clone the remote one like any other user,
- or, more safely, tell git to add the remote repository and set up a tracking branch for master:
git remote add -f origin git@griffin:repositories/myproject.git # Add remote repository and fetch automatically
git remote set-head -a origin # Set origin/HEAD automatically - see man git-remote, set-head
git branch --set-upstream-to=origin/master master # Set master to track remote branch master from origin
See git-clone below for more details.
Cloning from a Server using SSH (limited access)
Reference: [2]
Say you have SSH access to a server, but git-core is not installed and you can't install it yourself (for instance on a shared hosting server). You can still use git, but it requires some "hacking":
- First copy the executables from package git-core, directory /usr/bin, to some directory on the server where you have write access (say private/bin). Note that some files are actually symlinks:
private/bin/:
-rwxr-xr-x user webusers git*
lrwxrwxrwx user webusers git-receive-pack -> git*
-rwxr-xr-x user webusers git-shell*
lrwxrwxrwx user webusers git-upload-archive -> git*
-rwxr-xr-x user webusers git-upload-pack*
- Clone from the server using the -u command-line switch:
git clone -u </path/to/private/bin/git-upload-pack> user@server:private/git/myproject.git
- Edit the local myproject/.git/config file to add the lines marked with a +:
[remote "origin"]
  fetch = +refs/heads/*:refs/remotes/origin/*
  url = user@server:private/git/myproject.git
+ uploadpack = /path/to/private/bin/git-upload-pack
+ receivepack = /path/to/private/bin/git-receive-pack
Mirroring
Reference: [3]
Cloning from a Server using git: Protocol over a Proxy
The referenced links propose some scripts. Here is another variant. Add this script to your path (say ~/bin/proxygit):
#!/bin/bash
# proxygit - git through http proxy
#
# Usage: proxygit [options] COMMAND [ARGS]
# Setup Git HTTP proxy variable GIT_PROXY_COMMAND and call git with the given parameters.
# The proxy settings are read from env variable $http_proxy
#
# Note:
# - Requires package socat
# - $GIT_PROXY_COMMAND must not be defined
if [ -n "$GIT_PROXY_COMMAND" ]; then
    # Second pass: called by git as the proxy command, with host and port as $1 and $2
    PROXY=$(echo "$http_proxy" | sed -r 's!(http://)?([^/:]*):([0-9]*)/?!\2!')
    PROXYPORT=$(echo "$http_proxy" | sed -r 's!(http://)?([^/:]*):([0-9]*)/?!\3!')
    exec /usr/bin/socat - "PROXY:$PROXY:$1:$2,proxyport=$PROXYPORT"
else
    # First pass: called by the user; register this script as proxy command and run git
    export GIT_PROXY_COMMAND="$0"
    exec git "$@"
fi
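To see what the two sed expressions in the script extract, here is a small stand-alone sketch. The proxy address proxy.example.com:3128 is a made-up sample value, and sed -E is used in place of sed -r only because it is accepted by both GNU and BSD sed.

```shell
#!/bin/sh
# Sketch: split an http_proxy-style value into host and port, as proxygit does.
sample_proxy="http://proxy.example.com:3128/"    # hypothetical proxy setting

# Group 2 of the pattern is the host name, group 3 is the port number.
host=$(printf '%s\n' "$sample_proxy" | sed -E 's!(http://)?([^/:]*):([0-9]*)/?!\2!')
port=$(printf '%s\n' "$sample_proxy" | sed -E 's!(http://)?([^/:]*):([0-9]*)/?!\3!')

echo "proxy host: $host"
echo "proxy port: $port"
```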
Work with local and remote branches
Let's assume you already have a remote repository set up, as you would obtain by cloning a remote repository:
git clone user@server:repositories/project.git
By issuing this command, git will automatically:
- Create a local repository, clone of the remote one
- Call the remote repository origin
- Create a local branch master, and
- Configure it to track the branch master on origin. That remote branch is called locally origin/master.
The following commands can be used to create, track, delete local and remote branches.
# CREATE a local branch
git branch newbranch # Create a new local branch 'newbranch'
# ... use "git checkout newbranch" to check it out
# CREATE & CHECKOUT a local branch
git checkout -b newbranch # Create a new local branch 'newbranch' and check it out
# PUBLISH a local branch (for TRACKing)
# git push [remotename] [localbranch]:[remotebranch]
git push origin serverfix # Push a local branch 'serverfix' to remote (create it if necessary)
git push -u origin serverfix # ... same but also set upstream reference for TRACKING
git push -u origin serverfix:serverfix # ... same as above
git push -u origin serverfix:coolfix # ... same but call the branch 'coolfix' on the remote
# TRACK a remote branch
git branch --track sf origin/serverfix # Create a local branch 'sf' that tracks remote branch 'serverfix'
git branch --set-upstream-to=origin/serverfix sf # ... same, but when local branch 'sf' already exists
# TRACK & CHECKOUT a remote branch
git checkout --track origin/serverfix # Checkout a new local branch 'serverfix' to track remote branch 'serverfix'
# (remember that this branch is called locally 'origin/serverfix')
git checkout -b sf origin/serverfix # ... same as above, but the local branch is named 'sf'
# DELETE a branch
git branch -d oldbranch # Delete local branch 'oldbranch'
git push origin :serverfix # Delete remote branch 'serverfix' on 'origin'
# (basically this means push nothing to remote branch 'serverfix')
In summary:
- Use git branch to create, update, delete branches on the local repository.
- Use git checkout to check out (possibly new) local branches.
- Use git push to update the remote, possibly publishing or deleting branches.
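The publish / track / delete cycle summarized above can be rehearsed without a server. The sketch below assumes git is installed and uses a local bare repository (server.git, a name chosen for the example) in place of a real remote; the Demo User identity is also invented.

```shell
#!/bin/sh
# Sketch: publish, track and delete a remote branch against a local bare repository.
set -e
top=$(mktemp -d)                          # throwaway directory, illustration only
cd "$top"
git init -q --bare server.git             # stands in for the remote server
git clone -q server.git work 2>/dev/null  # cloning an empty repository prints a warning
cd work
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity, local to this repo
git config user.email demo@example.com

echo base > README
git add README
git commit -q -m "initial"
git push -q -u origin master

# PUBLISH a local branch and set its upstream in one step.
git checkout -q -b serverfix
echo fix >> README
git commit -q -a -m "fix"
git push -q -u origin serverfix
upstream=$(git rev-parse --abbrev-ref 'serverfix@{upstream}')

# TRACK the remote branch from a second local branch.
git branch --track sf origin/serverfix >/dev/null
sf_upstream=$(git rev-parse --abbrev-ref 'sf@{upstream}')

# DELETE the remote branch (push "nothing" to it).
git checkout -q master
git push -q origin :serverfix
remaining=$(git ls-remote --heads origin | grep -c serverfix || true)

echo "serverfix tracks $upstream, sf tracks $sf_upstream"
echo "serverfix branches left on remote: $remaining"
```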
Commands
Here we'll summarize how to use some of the Git commands
git-branch
git-branch lists, creates, or deletes branches
# CREATE a local branch
git branch newbranch # Create a new local branch 'newbranch'
# ... use "git checkout newbranch" to check it out
# TRACK a remote branch
git branch --track sf origin/serverfix # Create a local branch 'sf' that tracks remote branch 'serverfix'
git branch --set-upstream-to=origin/serverfix sf # ... same, but when local branch 'sf' already exists
# DELETE a branch
git branch -d oldbranch # Delete local branch 'oldbranch'
Note:
- You can also track a local branch (git branch -t local1 local2), but is it useful?
- You can delete a remote-tracking branch (git branch -d -r origin/serverfix), but that only makes sense if the remote branch has been deleted on the remote, or if git-fetch has been configured not to import that branch anymore.
git-checkout
git-checkout checks out a branch or paths to the working tree.
# CHECKOUT a local branch
git checkout mybranch # Checkout an existing branch
# CREATE & CHECKOUT a local branch
git checkout -b newbranch # Create a new local branch 'newbranch' and check it out
# TRACK & CHECKOUT a remote branch
git checkout --track origin/serverfix # Checkout a new local branch 'serverfix' to track remote branch 'serverfix'
# (remember that this branch is called locally 'origin/serverfix')
git checkout -b sf origin/serverfix # ... same as above, but the local branch is named 'sf'
# DISCARD local changes (.. checkout file as in local repository)
git checkout -- <file> # DISCARD changes in file <file> in the working tree
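As an illustration of the last command, the following sketch (assuming git is installed; notes.txt and the Demo User identity are made up) modifies a tracked file and then discards the modification.

```shell
#!/bin/sh
# Sketch: discard an uncommitted change with "git checkout -- <file>".
set -e
repo=$(mktemp -d)                       # throwaway directory, illustration only
cd "$repo"
git init -q .
git config user.name "Demo User"        # hypothetical identity, local to this repo
git config user.email demo@example.com

echo "committed" > notes.txt
git add notes.txt
git commit -q -m "add notes"

echo "scratch edit, to be thrown away" >> notes.txt
before=$(git status --porcelain)        # lists notes.txt as modified

git checkout -- notes.txt               # restore the file from the index
after=$(git status --porcelain)         # empty: the working tree is clean again

echo "before: '$before'"
echo "after:  '$after'"
```

Note that the discarded change is not recoverable, since it was never stored in the repository.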
git-clone
git-clone is mainly used to create a local copy of a remote repository, or to create a bare repository (i.e. one without a working tree) for remote storage:
- Clone a remote repository:
git clone git@griffin:repositories/myproject.git # Clone repository and create working tree in myproject/
- Create a bare repository for remote storage:
git clone --bare myproject myproject.git # Create a bare clone of your repository, if not available yet
scp -r myproject.git/ git@griffin:repositories/ # Copy the repository to server - requires SSH access
rm -rf myproject.git # Delete local bare clone
The command git clone /dir/repo/project.git
is roughly equivalent to running the following commands in a new directory project/:
git init # Create an empty repo
git remote add -f origin /dir/repo/project.git # Add a remote repo called 'origin' and fetch
git remote set-head origin -a # Set default remote branch for remote 'origin' automatically
git checkout --track origin/master # Create a tracking branch 'master', and update working tree
Another equivalent option for the last command is git checkout -b master origin/master
(since the start-point is a remote-tracking branch, git creates a tracking branch).
Some variants:
- To get a local copy of a remote repository, but without changing the working tree (i.e. keeping all local changes), use git branch --track instead of git checkout as the last command:
git init
git remote add -f origin -m master /dir/repo/project.git
git branch --track master origin/master # Branch 'master' set up to track remote branch 'master' from 'origin'
- To merge remote branch locally, but without creating a tracking branch, change the last command to :
# ...
git merge origin/master # Merge
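The step-by-step equivalent of git clone can be checked against a local repository. This is a sketch under assumptions: git is installed, and the directory names project, project.git and manual as well as the Demo User identity are invented for the example.

```shell
#!/bin/sh
# Sketch: reproduce "git clone" by hand with init + remote add + set-head + checkout.
set -e
top=$(mktemp -d)                        # throwaway directory, illustration only
cd "$top"

# Prepare a small source repository and a bare copy that plays the remote.
git init -q project
(
  cd project
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Demo User"      # hypothetical identity, local to this repo
  git config user.email demo@example.com
  echo hi > a.txt
  git add a.txt
  git commit -q -m "initial"
)
git clone -q --bare project project.git

# The manual equivalent of: git clone "$top/project.git" manual
mkdir manual
cd manual
git init -q .
git remote add -f origin "$top/project.git" >/dev/null 2>&1  # add remote and fetch
git remote set-head origin -a >/dev/null                     # create origin/HEAD
git checkout -q --track origin/master                        # tracking branch + working tree

head_ref=$(git symbolic-ref refs/remotes/origin/HEAD)
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')
echo "origin/HEAD -> $head_ref"
echo "master tracks $upstream"
```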
git-commit
git commit -m "commit message" # Gives immediately the commit message on the command-line
git commit -a # Add all changes and commit in one pass
git commit --amend # Amend the tip of the current branch (message, add some files) - also for merge commits
To define a shorter alias, run the following command (it adds the alias to your file ~/.gitconfig):
git config --global alias.ci 'commit'
Now you can use ci instead of commit:
git ci -m "commit message"
git-reflog
git reflog manages reflog information. This is very handy to repair mistakes, or to recover lost commits (e.g. after modifying the head of a branch that was the only reference to a given commit)
git reflog
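A typical recovery with the reflog could look like the sketch below. It assumes git is installed; the file names and the Demo User identity are made up. A commit is dropped with git reset --hard and then found again through the reflog entry HEAD@{1}.

```shell
#!/bin/sh
# Sketch: recover a commit that no branch references anymore, using the reflog.
set -e
repo=$(mktemp -d)                       # throwaway directory, illustration only
cd "$repo"
git init -q .
git config user.name "Demo User"        # hypothetical identity, local to this repo
git config user.email demo@example.com

echo a > a.txt
git add a.txt
git commit -q -m "first"
echo b > b.txt
git add b.txt
git commit -q -m "second"
lost=$(git rev-parse HEAD)

# "Mistake": drop the last commit. No branch points to it anymore.
git reset -q --hard HEAD^

# The reflog still records where HEAD was before the reset.
git reflog -n 3
recovered=$(git rev-parse 'HEAD@{1}')

# Move the branch back to the lost commit.
git reset -q --hard "$recovered"
echo "recovered commit: $recovered"
```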
git-push
git push updates remote refs along with associated objects. In plain English, git push pushes changes to the remote repository, possibly creating, updating, or deleting branches, objects, or references.
# PUBLISH a local branch (for TRACKing)
# git push [remotename] [localbranch]:[remotebranch]
git push origin serverfix # Push a local branch 'serverfix' to remote (create it if necessary)
git push -u origin serverfix # ... same but also set upstream reference for TRACKING
git push -u origin serverfix:serverfix # ... same as above
git push -u origin serverfix:coolfix # ... same but call the branch 'coolfix' on the remote
# DELETE a branch
git push origin :serverfix # Delete remote branch 'serverfix' on 'origin'
# (basically this means push nothing to remote branch 'serverfix')
git-remote
- git-remote set-head
- Sets / deletes the default branch for a remote. For remote origin, this creates the reference refs/remotes/origin/HEAD with content ref: refs/remotes/origin/master if default branch is master.
Use set-head to follow the changes in another branch than the default one:
git remote set-head origin exoticbranch
git diff remotes/origin # Show diffs with remote branch exoticbranch from origin
# ... shortcut for git diff remotes/origin/exoticbranch
git-reset
git reset resets the current HEAD to the new state (hence also moving branch tip if you are currently on a branch). There are 3 different reset modes:
- soft
Only change the HEAD reference to a different commit. Working tree files and index are left untouched.
- mixed (default)
Like soft, but also reset the index (working tree files are left untouched). This actually clears the index of any staged changes.
- hard
Like mixed, but also erases all changes in the working tree, so that it matches the contents of the new HEAD. This is a dangerous command, so better use some alternatives that avoid data loss, like:
Commit any changes first
git commit -a -m "snapshot WIP"
git reset --hard HEAD~3
Stash the changes
git stash
git reset --hard HEAD~3
# ...
git reset --hard HEAD@{1} # or ORIG_HEAD
git stash apply
Idem & don't change master too early
git stash
git checkout -b new-branch HEAD~3
...
git branch -D master
git branch -m new-branch master
Note:
- git reset copies the old head to ORIG_HEAD.
- Caution! — git reset moves the tip of the current branch (when HEAD is a ref to a branch). Don't do this on changes that have been published.
Some use cases (see man git-reset for details):
# Undo a commit and redo it
git commit ...
git reset --soft HEAD^
# ... edit ...
git commit -a -c ORIG_HEAD # or -C

# Undo commits permanently
git commit ...
git reset --hard HEAD~3

# Undo a commit, making it a topic branch
git branch topic/wip
git reset --hard HEAD~3 # or --mixed
git checkout topic/wip # or -m topic/wip

# Undo a merge or pull
git pull # conflicts
git reset --hard
git pull . topic/branch # no conflict
git reset --hard ORIG_HEAD

# Undo a merge or pull inside a dirty working tree
git pull
git reset --merge ORIG_HEAD

# Interrupted workflow
git checkout feature
#work work work
git commit -a -m "snapshot WIP"
git checkout master
#fix fix fix
git commit
git checkout feature
git reset HEAD^ # or --soft

# Unstage a single file (reset it in the index)
git add foo.c
git reset -- foo.c
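The difference between the three reset modes shows up in the output of git status. The following sketch assumes git is installed; f.txt, its contents and the Demo User identity are invented. The same reset to HEAD^ is run in each mode, and the short status is recorded afterwards.

```shell
#!/bin/sh
# Sketch: compare the effect of reset --soft, --mixed and --hard on index and working tree.
set -e
repo=$(mktemp -d)                       # throwaway directory, illustration only
cd "$repo"
git init -q .
git config user.name "Demo User"        # hypothetical identity, local to this repo
git config user.email demo@example.com

echo one > f.txt
git add f.txt
git commit -q -m "one"
echo "second version" > f.txt
git commit -q -a -m "two"
tip=$(git rev-parse HEAD)

# soft: only HEAD moves; the change of commit "two" stays staged in the index.
git reset -q --soft HEAD^
soft_state=$(git status --porcelain)
git reset -q --hard "$tip"              # back to the starting point

# mixed: HEAD and index move; the change remains in the working tree, unstaged.
git reset -q --mixed HEAD^
mixed_state=$(git status --porcelain)
git reset -q --hard "$tip"

# hard: HEAD, index and working tree all move; the change is gone from the tree.
git reset -q --hard HEAD^
hard_state=$(git status --porcelain)

echo "soft:  '$soft_state'"
echo "mixed: '$mixed_state'"
echo "hard:  '$hard_state'"
```

In the porcelain format, a letter in the first column means "staged" and a letter in the second column means "modified in the working tree but not staged".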
Tips
Frequently Used Commands
git commit -a # Add all changes and commit in one pass
git commit --amend # Amend the tip of the current branch (message, add some files) - also for merge commits
Working the Git Way
- Check project diff before commit -a:
git diff # First see what's in the working tree (or git status)
git commit -a # Commit all changes
- Give git commit a directory argument instead of using -a:
git commit fs/ # Commit all changes in directory fs
- Clean up an ugly sequence of commits ([6]). Better than hunk-based commit because (1) each stage can be tested individually, (2) intermediate commits may contain changes that are not in the final one.
- First make sure that the ugly sequence is on some temporary branch target (what we aim for), and that the end result is good and clean.
- Switch back to the starting point, and do:
git diff -R target > diff # diff to target
- Edit the diff file to select only those changes we want to include in a first commit. Then do a git apply diff:
vi diff
git apply diff # Must be in project root dir
- Test, finalize the last changes before committing, and diff against target if necessary:
# test test test
git diff -R target > diff # if necessary
- Commit, and repeat from step 2.
- When done, branch target can be removed.
- Use gitk to get a graphical visualisation of the current commit, or some subsets. For instance:
gitk # View current commit and all ancestors
gitk master.. # View changes to current branch (i.e. reachable from HEAD, excluding master)
- Use git stash to save the current state of the working tree (see [7]):
git stash # Save current work in working tree
... # (whatever, including git reset --hard...)
git stash apply # Bring back changes in working tree
- Forgot to add some files in the previous commit? Mistyped the commit message? Use git commit --amend:
git commit # Oops! forgot one file
git add somefile # ... Add the missing file
git commit --amend # ... and replace the previous commit
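The amend and stash tips can be verified in a scratch repository. This is a sketch only, assuming git is installed; the file names and the Demo User identity are made up. It shows that amending replaces the last commit instead of adding a new one, and that a stashed change comes back with git stash apply.

```shell
#!/bin/sh
# Sketch: fix a commit with --amend, then park and restore work with stash.
set -e
repo=$(mktemp -d)                       # throwaway directory, illustration only
cd "$repo"
git init -q .
git config user.name "Demo User"        # hypothetical identity, local to this repo
git config user.email demo@example.com

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Commit a.txt but forget b.txt ...
echo a > a.txt
echo b > b.txt
git add a.txt
git commit -q -m "add a and b"
count_before=$(git rev-list --count HEAD)

# ... then add the missing file and replace the previous commit.
git add b.txt
git commit -q --amend --no-edit
count_after=$(git rev-list --count HEAD)
git ls-tree --name-only HEAD

# Park an unfinished change, get a clean tree, then bring the change back.
echo "work in progress" >> base.txt
git stash -q
clean_state=$(git status --porcelain)
git stash apply -q
restored=$(tail -n 1 base.txt)

echo "commits before/after amend: $count_before/$count_after"
echo "restored line: $restored"
```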
Word-by-word diffs
Reference:
- Blog [8] and [9]
- Manpage gitattributes(5)
- Manpage git-diff(1)
Add the following alias to your ~/.gitconfig file:
[alias]
wdiff = diff --color-words
wshow = show --color-words
Now you can have word-by-word diffs / shows:
git wdiff fname
git wshow
Improving tokenization in diffs
Diffs can even be further improved thanks to the use of the .gitattributes file in your repository. For instance:
*.tex diff=tex
*.h diff=cpp
*.c diff=cpp
*.cpp diff=cpp
One can also define filters that would clean files before checkin/checkout.
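To check which diff driver a .gitattributes file assigns to a path, git check-attr can be used. The sketch below assumes git is installed and writes the attribute lines shown above into a scratch repository; the queried file names main.c, paper.tex and README are hypothetical and do not need to exist.

```shell
#!/bin/sh
# Sketch: query the diff attribute that .gitattributes assigns to some paths.
set -e
repo=$(mktemp -d)                       # throwaway directory, illustration only
cd "$repo"
git init -q .

cat > .gitattributes <<'EOF'
*.tex diff=tex
*.h diff=cpp
*.c diff=cpp
*.cpp diff=cpp
EOF

c_driver=$(git check-attr diff -- main.c)
tex_driver=$(git check-attr diff -- paper.tex)
other=$(git check-attr diff -- README)

echo "$c_driver"
echo "$tex_driver"
echo "$other"
```

Paths that match no pattern are reported as unspecified and are diffed with the default tokenization.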