Git

From miki
Revision as of 12:05, 26 July 2011 by Mip (talk | contribs) (→‎git-log)

References

Git cheat sheet

Subversion integration
See Git SVN page


Git on Windows

Introduction

Git Features:

  • Reliability
  • Performance
  • Distributed

Distributed

The distributed model comes originally from BitKeeper. Another distributed SCM is Mercurial.

  • No single repository. Everybody always has their own copy of the repository. The repository content is pulled from other people's repositories.
  • No politics, no commit access control. All work is always done locally, so there is no need to define such policies.

Reliability

Every change, file, directory, etc. is cryptographically hashed (sha1sum).

  • Easy corruption detection. Any tampering with the content of a file or directory (either malicious or because of hardware failure) is immediately detected.
  • Easy distribution. Moreover because the repository is distributed all over the place, it is very easy to repair a given repository. You only need to drop all broken objects, and get all missing objects from a remote copy.

Performance

Commits are very fast because they go to the local repository and need no network access.

Terminology and Concepts

commit
A commit is a snapshot of your working tree at some point in time. There are different ways to name a commit:
  • branchname — a branch name is an alias for the most recent commit on that branch
  • tagname — similar to a branch alias, but that does not change in time
  • HEAD — currently checked out commit
  • c82a22c — the SHA-1 hash id of the commit (can be truncated as long as it remains unique)
  • name^ — the parent of commit name
  • name^^ — the grand-parent of commit name (and so on)
  • name^2 — the 2nd parent of commit name (and so on)
  • name~10 — the 10th ancestor of commit name (same as name^^^^^^^^^^)
  • name:path — reference a specific file/directory in a given commit
  • name^{tree} — reference the tree held by a commit
  • name1..name2 — a commit range, i.e. all commits reachable from name2 back to, but not including, name1 (if either name is omitted, use HEAD instead)
  • name1...name2 — refers to all commits reachable from name1 or name2, but not from both. For git diff, it shows the changes on name2 since the common ancestor of name1 and name2.
  • master.. — to review changes made to the current branch
  • ..master — after a fetch, to review all changes that occurred since the last rebase or merge
  • --since="2 weeks ago" — all commits since a certain date
  • --until="1 week ago" — all commits up to a certain date
  • --grep=pattern — all commits whose commit message matches the regular expression pattern.
  • --committer=pattern — all commits whose committer matches the pattern
  • --author=pattern — all commits whose author matches the pattern
  • --no-merges — all commits in a range that have only one parent (i.e. ignore all merge commits)
detached head
When HEAD is no longer a reference to anything (like ref: refs/heads/branch), but instead contains the actual hash of a commit.
git checkout -b newbranch           # To attach HEAD back on a new branch...
hunk
individual change within a file (basically a file diff output is made of a sequence of one or more hunks).
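
The naming forms above can be tried out in a throwaway repository. The following is a minimal sketch, assuming a POSIX shell and a Git recent enough to support git rev-list --count; the file name, identity, commit messages and branch name are made up for the illustration.

```shell
#!/bin/sh
# Sketch: try the commit-naming forms in a scratch repository.
# All names (file.txt, "Example User", newbranch) are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits, each appending one line to file.txt
for n in 1 2 3; do
    echo "line $n" >> file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

parent=$(git rev-parse HEAD^)                  # parent of HEAD
grandparent=$(git rev-parse HEAD^^)            # grand-parent of HEAD
second=$(git rev-parse HEAD~2)                 # 2nd ancestor, same commit as HEAD^^
in_range=$(git rev-list --count HEAD~2..HEAD)  # commits after HEAD~2, up to HEAD
old_file=$(git show HEAD~1:file.txt)           # file.txt as of the previous commit

# Checking out a commit by hash detaches HEAD ...
git checkout -q "$parent"
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi
# ... and checkout -b attaches it again, on a new branch
git checkout -q -b newbranch
attached_to=$(git symbolic-ref HEAD)

echo "range size: $in_range, detached: $detached, now on: $attached_to"
```

Here git symbolic-ref is used as the detached-HEAD test because it only succeeds when HEAD is a reference to a branch.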

Install

Install the following essential packages:

  • git-core — the main program
  • git-gui — a gui front-end

Optionally install also:

  • git-doc — documentation
  • gitweb — Web interface
  • ViewGit — Another web interface
  • gitosis — Project management:
  • tig — a text-mode repository browser interface to git and color pager.
tig                                # launch browser
git show | tig                     # Use as pager. Colorize output of git-show
  • gitview — Git Repository browser
  • gitg — a Git repository browser targeting Gtk+ / GNOME
  • Version delivered with Lucid/Maverick is a very old one. Compile from the sources to get the latest version. Alternatively the repository ppa:pasgui/ppa contains a more recent version.
    To install the repository, create a file /etc/apt/sources.list.d/pasgui-ppa-lucid.list (change lucid as necessary):
    deb http://ppa.launchpad.net/pasgui/ppa/ubuntu lucid main 
    deb-src http://ppa.launchpad.net/pasgui/ppa/ubuntu lucid main
    

    Then add the apt key:

    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys F599ACE3
    
  • qgit — A graphical interface to git repositories using QT

Configuration


Global per-user configuration settings are stored in file ~/.gitconfig

  • Add color to git output for all commands:
    git config --global color.ui true
    
  • Define author/email
  • git config --global user.name "Your Name"
    git config --global user.email you@example.com
    
  • Add some frequently used aliases:
  • git config --global alias.st 'status'
    git config --global alias.ci 'commit'
    git config --global alias.co 'checkout'
    git config --global alias.br 'branch'
    git config --global alias.last 'log -1 HEAD'
    git config --global alias.h 'log --oneline --graph --decorate -45'
    git config --global alias.ha 'log --oneline --graph --decorate -45 --all'
    git config --global alias.l 'log --pretty=tformat:"%C(yellow)%h %Cblue%an %Cgreen%cr %Creset%s %Cred%d" --graph -45'
    git config --global alias.la 'log --pretty=tformat:"%C(yellow)%h %Cblue%an %Cgreen%cr %Creset%s %Cred%d" --graph -45 --all'
    git config --global alias.dc 'diff --cached'
    git config --global alias.wdiff 'diff --color-words'
    git config --global alias.wshow 'show --color-words'
    
  • Some handy scripts:
    • git-wtf displays the state of your repository in a readable and easy-to-scan format
  • To solve the issue of ugly fonts in the Tk-based tools gitk and git-gui (not anti-aliased, among other ugliness), install tk8.5 and force alternatives for wish (see [1]):
  • sudo apt-get install tk8.5
    sudo update-alternatives --config wish
    # select wish8.5
    

    An alternative however is to use gitg.

How-To

Here we shall describe how to perform some tasks in Git.

Cloning to/from a Server using SSH

Reference: [2]

Clone a local repository to the remote server griffin, through ssh. Repositories are all stored in a directory repositories/ in the home directory of the remote user git:

git clone --bare myproject myproject.git          # Create a bare clone of your repository, if not available yet
scp -r myproject.git/ git@griffin:repositories/   # Copy the repository to server - requires SSH access
rm -rf myproject.git                              # Delete local bare clone

Now any other user that has SSH access to git@griffin may get a copy of that repository with

git clone git@griffin:repositories/myproject.git  # Clone repository and create working tree in myproject/

Now, the user who created the repository in the first place can

  • either delete his own repository and clone the remote one as any other user,
  • or more safely, he can tell git to add the remote repository and set up tracking branch for master:
git remote add -f origin git@griffin:repositories/myproject.git   # Add remote repository and fetch automatically
git remote set-head -a origin                                     # Set origin/HEAD automatically - see man git-remote, set-head
git branch --set-upstream-to=origin/master master                 # Set master to track origin/master (Git 1.8.0+; formerly: git branch --set-upstream master origin)

See git-clone below for more details.

Cloning from a Server using SSH (limited access)

Reference: [3]

Say you have SSH access to a server, but git-core is not installed and you can't install it yourself (for instance on a shared hosting server). You can still use git, but it requires some "hacking":

  1. First copy the executables from package git-core, directory /usr/bin, to some directory on the server where you have write access (say private/bin). Note that some files are actually symlinks:
    private/bin/:
    -rwxr-xr-x user webusers git*
    lrwxrwxrwx user webusers git-receive-pack -> git*
    -rwxr-xr-x user webusers git-shell*
    lrwxrwxrwx user webusers git-upload-archive -> git*
    -rwxr-xr-x user webusers git-upload-pack*
  2. Clone from the server using the -u command-line switch:
    git clone -u </path/to/private/bin/git-upload-pack> user@server:private/git/myproject.git
    
  3. Edit the local myproject/.git/config file to add the lines marked with a +:
    [remote "origin"]
         fetch = +refs/heads/*:refs/remotes/origin/*
         url = user@server:private/git/myproject.git
    +    uploadpack = /path/to/private/bin/git-upload-pack
    +    receivepack = /path/to/private/bin/git-receive-pack
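
Instead of editing .git/config by hand, the same two keys can be set with git config. The sketch below runs in a scratch repository and never contacts the server; the remote URL and the /path/to/private/bin placeholder are the ones used above, to be replaced by real values.

```shell
#!/bin/sh
# Sketch: set remote.origin.uploadpack / receivepack from the command line.
# The URL and the /path/to/... placeholders are not real locations.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git remote add origin user@server:private/git/myproject.git   # no network access here

git config remote.origin.uploadpack  /path/to/private/bin/git-upload-pack
git config remote.origin.receivepack /path/to/private/bin/git-receive-pack

# Show the resulting [remote "origin"] settings
git config --get-regexp '^remote\.origin\.'
```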
    

Mirroring

Reference: [4]

Cloning from a Server using git: Protocol over a Proxy

Reference: [5], [6]

The referenced links propose some scripts. Here is another variant. Add this script to your path (say ~/bin/proxygit):

#!/bin/bash
# proxygit - git through http proxy
#
# Usage:  proxygit [options] COMMAND [ARGS]
#   Setup Git HTTP proxy variable GIT_PROXY_COMMAND and call git with the given parameters.
#   The proxy settings are read from env variable $http_proxy
#
# Note:
# - Requires package socat
# - $GIT_PROXY_COMMAND must not be defined

if [ -n "$GIT_PROXY_COMMAND" ]; then
	PROXY=$(echo $http_proxy|sed -r 's!(http://)?([^/:]*):([0-9]*)/?!\2!')
	PROXYPORT=$(echo $http_proxy|sed -r 's!(http://)?([^/:]*):([0-9]*)/?!\3!')
	exec /usr/bin/socat - "PROXY:$PROXY:$1:$2,proxyport=$PROXYPORT"
else
	export GIT_PROXY_COMMAND="$0"
	exec git "$@"
fi
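
To see what the two sed expressions in the script extract, they can be run on their own. This is a small sketch with a made-up proxy address and a made-up Git server name; it uses sed -E, which GNU sed accepts as an equivalent of -r.

```shell
#!/bin/sh
# Sketch: how proxygit derives the proxy host and port from $http_proxy.
# proxy.example.com:3128 and git.example.org are made-up example values.
http_proxy="http://proxy.example.com:3128/"

PROXY=$(echo "$http_proxy" | sed -E 's!(http://)?([^/:]*):([0-9]*)/?!\2!')
PROXYPORT=$(echo "$http_proxy" | sed -E 's!(http://)?([^/:]*):([0-9]*)/?!\3!')

echo "host=$PROXY port=$PROXYPORT"
# Git calls the proxy command with the target host and port as $1 and $2,
# so the socat invocation would look like this (printed, not executed):
echo socat - "PROXY:$PROXY:git.example.org:9418,proxyport=$PROXYPORT"
```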

Work with local and remote branches

Let's assume you already have a remote repository set up, as you would obtain by cloning a remote repository:

git clone user@server:repositories/project.git

By issuing this command, git will automatically:

  • Create a local repository, clone of the remote one
  • Call the remote repository origin
  • Create a local branch master, and
  • Configure it to track the branch master on origin. That remote branch is called locally origin/master.

The following commands can be used to create, track, delete local and remote branches.

# CREATE a local branch
git branch newbranch                          # Create a new local branch 'newbranch'
                                              # ... use "git checkout newbranch" to check it out

# CREATE & CHECKOUT a local branch
git checkout -b newbranch                     # Create a new local branch 'newbranch' and check it out

# PUBLISH a local branch (for TRACKing)
# git push [remotename] [localbranch]:[remotebranch]
git push origin serverfix                     # Push a local branch 'serverfix' to remote (create it if necessary)
git push -u origin serverfix                  # ... same but also set upstream reference for TRACKING
git push -u origin serverfix:serverfix        # ... same as above
git push -u origin serverfix:coolfix          # ... same but call the branch 'coolfix' on the remote

# TRACK a remote branch
git branch --track sf origin/serverfix        # Create a local branch 'sf' that tracks remote branch 'serverfix'
git branch --set-upstream-to=origin/serverfix sf  # ... same, but when local branch 'sf' already exists (Git 1.8.0+; formerly --set-upstream sf origin/serverfix)

# TRACK & CHECKOUT a remote branch
git checkout --track origin/serverfix         # Checkout a new local branch 'serverfix' to track remote branch 'serverfix'
                                              #   (remember that the remote branch is known locally as 'origin/serverfix')
git checkout -b sf origin/serverfix           # ... same as above, but the local branch is named 'sf'

# FETCH / UPDATE from remote
git fetch
git fetch --prune                             # After fetching, remove any remote tracking branches that no longer exist on the remote

# FETCH from & MERGE with remote
git pull
git pull --prune                              # After fetching, remove any remote tracking branches that no longer exist on the remote

# DELETE a branch
git branch -d sf                              # Delete local branch 'sf'
git branch -d -r origin/serverfix             # Delete remote tracking branch 'serverfix'
git push origin :serverfix                    # Delete branch 'serverfix' on 'origin'
                                              # (basically this means push nothing to remote 'serverfix')

In summary:

  • Use git branch to create, update, delete branches on the local repository.
  • Use git checkout to checkout (possibly new) local branches.
  • Use git push to update the remote, possibly publishing or deleting branches.
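
The branch commands above can be rehearsed without a server by letting a local bare repository play the role of origin. This is a sketch under that assumption; the directory names, identity and file contents are invented, and the default branch name is read from the repository rather than assumed to be master.

```shell
#!/bin/sh
# Sketch: publish, track and delete a branch against a local bare "remote".
# server.git, project and all file contents are illustrative only.
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare server.git                 # stands in for the remote server
git clone -q server.git project 2>/dev/null   # (warns that the repository is empty)
cd project
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > README
git add README
git commit -q -m "initial commit"
main=$(git symbolic-ref --short HEAD)         # 'master' or the configured default
git push -q -u origin "$main"

# CREATE & CHECKOUT a local branch, then PUBLISH it with tracking
git checkout -q -b serverfix
echo "fix" >> README
git commit -q -a -m "server fix"
git push -q -u origin serverfix
upstream=$(git rev-parse --abbrev-ref 'serverfix@{upstream}')
published=$(git ls-remote --heads origin serverfix)

# DELETE the branch on the remote, then locally
git push -q origin :serverfix
git checkout -q "$main"
git branch -q -D serverfix                    # -D because the fix was never merged
remaining=$(git ls-remote --heads origin serverfix)

echo "serverfix tracked: $upstream"
```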

Define a diff textconv filter

This applies a diff filter, but only for git diff and git log commands:

  • Edit the file .gitattributes in your repository (or the file named by core.attributesfile for a per-user setting), add (note: no quotes around the diff parameter)
*.adr       diff=bookmarks_adr
  • Edit file ~/.gitconfig, add
[diff "bookmarks_adr"]
    textconv = sed -r '/NAME=|URL=|SHORT NAME=/!d'
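
The effect of such a filter can be checked in a scratch repository. The sketch below makes two assumptions that are not in the text above: it sets the driver per repository with git config instead of editing ~/.gitconfig, and it uses a tiny made-up bookmark file. It changes one line the filter keeps and one line the filter drops, then compares the filtered and the raw diff.

```shell
#!/bin/sh
# Sketch: a textconv filter hides changes on lines it filters out.
# bookmarks.adr and its contents are made-up sample data.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Per-repository equivalents of the two settings described above
echo '*.adr diff=bookmarks_adr' > .gitattributes
git config diff.bookmarks_adr.textconv 'sed -E "/NAME=|URL=|SHORT NAME=/!d"'

printf '#URL\n\tNAME=Example\n\tURL=http://example.com\n\tVISITED=100\n' > bookmarks.adr
git add .gitattributes bookmarks.adr
git commit -q -m "add bookmarks"

# Change one line the filter keeps (NAME=) and one it drops (VISITED=)
printf '#URL\n\tNAME=Example site\n\tURL=http://example.com\n\tVISITED=200\n' > bookmarks.adr

filtered=$(git diff)               # textconv applied: VISITED= is not compared
raw=$(git diff --no-textconv)      # raw diff: both changes are visible

echo "$filtered"
```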

Commands

Here we'll summarize how to use some of the Git commands

git-add

git-add adds file contents to the index

git add -A                                    # Stage all new, modified AND deleted files (whole working tree since Git 2.0)

git-branch

git-branch lists, creates, or deletes branches

# CREATE a local branch
git branch newbranch                          # Create a new local branch 'newbranch'
                                              # ... use "git checkout newbranch" to check it out
 
# TRACK a remote branch
git branch --track sf origin/serverfix        # Create a local branch 'sf' that tracks remote branch 'serverfix'
git branch --set-upstream-to=origin/serverfix sf  # ... same, but when local branch 'sf' already exists (Git 1.8.0+; formerly --set-upstream sf origin/serverfix)
 
# DELETE a branch
git branch -d sf                              # Delete local branch 'sf'
git branch -d -r origin/serverfix             # Delete remote tracking branch 'serverfix' (see remark below)
git remote prune origin                       # Prune all stale remote tracking branches

# MOVE a branch
git branch -f branch commit                   # Move tip of an existing branch to a different commit

Note:

  • You can also track a local branch (git branch -t local1 local2), but is it useful?
  • Deleting a remote tracking branch (git branch -d -r) on the local repository only makes sense if the remote branch has been deleted on the remote, or if git-fetch has been configured not to import that branch anymore. So it is best simply to prune remote tracking branches automatically:
    git remote prune origin       # Remove all remote tracking branches that no longer exist on the remote (i.e. stale branches)
    git fetch --prune             # After fetching, remove all remote tracking branches that no longer exist on the remote
    

git-checkout

git-checkout checks out a branch or paths to the working tree.

# CHECKOUT a local branch
git checkout mybranch                         # Checkout an existing branch

# CREATE & CHECKOUT a local branch
git checkout -b newbranch                     # Create a new local branch 'newbranch' and check it out
 
# TRACK & CHECKOUT a remote branch
git checkout --track origin/serverfix         # Checkout a new local branch 'serverfix' to track remote branch 'serverfix'
                                              #   (remember that the remote branch is known locally as 'origin/serverfix')
git checkout -b sf origin/serverfix           # ... same as above, but the local branch is named 'sf'

# DISCARD local changes (.. checkout file as in local repository)
git checkout -- <file>                        # DISCARD changes in file <file> in the working tree

git-clone

git-clone is mainly used to create a local copy of a remote repository, or to create a bare repository (i.e. one without a working tree) for remote storage:

  • Clone a remote repository:
  • git clone git@griffin:repositories/myproject.git  # Clone repository and create working tree in myproject/
    
  • Create a bare repository for remote storage:
  • git clone --bare myproject myproject.git          # Create a bare clone of your repository, if not available yet
    scp -r myproject.git/ git@griffin:repositories/   # Copy the repository to server - requires SSH access
    rm -rf myproject.git                              # Delete local bare clone
    


The command git clone /dir/repo/project.git is roughly equivalent to running the following commands:

git init                                            # Create an empty repo
git remote add -f origin /dir/repo/project.git      # Add a remote repo called 'origin' and fetch
git remote set-head origin -a                       # Set default remote branch for remote 'origin' automatically
git checkout --track origin/master                  # Create a tracking branch 'master', and update working tree

Another equivalent option for the last command is git checkout -b master origin/master (since the start-point is a remote branch, git creates a tracking branch).

Some variants:

  • To get a local copy of a remote repository, but without changing the working tree (i.e. keeping all local changes), just change the last command to:
git init
git remote add -f origin -m master /dir/repo/project.git
git branch --track master origin/master             # Branch 'master' set up to track remote branch 'master' from 'origin'
  • To merge the remote branch locally, but without creating a tracking branch, change the last command to:
# ...
git merge origin/master                             # Merge
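
The step-by-step equivalent of git clone can be verified against a small local repository. In this sketch the directory names (source, project.git, copy) are invented, and the source branch is forced to be called master so that the last command matches the text above.

```shell
#!/bin/sh
# Sketch: reproduce "git clone" with init / remote add / set-head / checkout.
# source, project.git and copy are illustrative directory names.
set -e
work=$(mktemp -d)
cd "$work"

# Build a small repository to clone from (stands in for /dir/repo/project.git)
git init -q source
cd source
git symbolic-ref HEAD refs/heads/master       # make sure the branch is 'master'
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "initial commit"
src_head=$(git rev-parse HEAD)
cd ..
git clone -q --bare source project.git

# The manual equivalent of: git clone "$work/project.git" copy
mkdir copy
cd copy
git init -q                                              # Create an empty repo
git remote add -f origin "$work/project.git" >/dev/null 2>&1   # Add 'origin' and fetch
git remote set-head origin -a >/dev/null                 # Set origin/HEAD automatically
git checkout -q --track origin/master                    # Tracking branch + working tree

copy_head=$(git rev-parse HEAD)
upstream=$(git rev-parse --abbrev-ref 'master@{upstream}')
remote_head=$(git symbolic-ref refs/remotes/origin/HEAD)
echo "HEAD=$copy_head upstream=$upstream origin/HEAD=$remote_head"
```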

git-commit

git commit -m "commit message"    # Give the commit message directly on the command line
git commit -a                     # Stage all changes to tracked files and commit in one pass
git commit --amend                # Amend the tip of the current branch (message, add some files) - also for merge commits

Define the following alias (it is stored in your file ~/.gitconfig):

git config --global alias.ci 'commit'

Now you can use ci instead of commit:

git ci -m "commit message"

git-fetch

git-fetch downloads objects and refs from another repository.

git fetch -p                      # After fetching, remove any remote-tracking branches which no longer exist on the remote
                                  # (see also git remote prune origin)

git-filter-branch

git-filter-branch rewrites branches.

This command can be used for instance to rewrite / rebase a branch while changing the author/committer name/email (see [7] and [8]):

# Set the committer name to author name, for instance after a rebase
git filter-branch --commit-filter '
    export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"; 
    export GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"; 
    export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"; 
    git commit-tree "$@"' -- basecommit..HEAD
# Change the author/committer name and email for specific commits (e.g. if the wrong email was used):
git filter-branch --commit-filter '
        if [ "$GIT_COMMITTER_NAME" = "<Old Name>" ];
        then
                GIT_COMMITTER_NAME="<New Name>";
                GIT_AUTHOR_NAME="<New Name>";
                GIT_COMMITTER_EMAIL="<New Email>";
                GIT_AUTHOR_EMAIL="<New Email>";
                git commit-tree "$@";
        else
                git commit-tree "$@";
        fi' HEAD

Note that basecommit..HEAD can easily be replaced by another commit specification, like --since="1 year ago".

Another example to remove sensitive files from repository (from [9]):

# Delete the file(s)
git clone git@github.com:defunkt/github-gem.git
cd github-gem/
git filter-branch --index-filter 'git rm --cached --ignore-unmatch pattern1 pattern2 ...' HEAD 
# .... Add --tag-name-filter "cat" to keep tags - will overwrite existing tags!

# Push to origin
git push origin master --force

# Force cleanup
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --prune=now
git gc --aggressive --prune=now

git-grep

git-grep prints lines matching a pattern (see also Git Book)

# Find all occurrences of pattern in all commits made since last year
for i in $(git rev-list --all --since="1 year ago"); do git grep pattern $i; done

git-log

git-log shows commit logs.

git log                            # Standard history log
git log -5                         # Limit to 5 commits
git log -- file                    # List commits affecting file
git log -p -- file                 # History log, show patch/diff for file
git log -p -M --follow -- file     #  ... same, but also follow the file across renames
git log --stat -1                  # Show diff-stat for last commit

Some handy aliases:

[alias]
    h = log --oneline --graph --decorate -45
    ha = log --oneline --graph --decorate -45 --all
    l = log --pretty=tformat:\"%C(yellow)%h %Cblue%an %Cgreen%cr %Creset%s %Cred%d\" --graph -45
    la = log --pretty=tformat:\"%C(yellow)%h %Cblue%an %Cgreen%cr %Creset%s %Cred%d\" --graph -45 --all

git-pull

git-pull fetches from and merges with another repository or a local branch.

git pull -p                   # After fetching, remove any remote-tracking branches which no longer exist on the remote
                              # (see also git remote prune)

git-push

git push updates remote refs along with associated objects. In plain English, git push pushes changes to the remote repository, possibly creating, updating or deleting branches, objects or references.

# PUBLISH a local branch (for TRACKing)
# git push [remotename] [localbranch]:[remotebranch]
git push                                      # Push current branch to remote
git push origin                               # Push all matching branches to remote
git push origin serverfix                     # Push a local branch 'serverfix' to remote (create it if necessary)
git push -u origin serverfix                  # ... same but also set upstream reference for TRACKING
git push -u origin serverfix:serverfix        # ... same as above
git push -u origin serverfix:coolfix          # ... same but call the branch 'coolfix' on the remote

# DELETE a branch
git push origin :serverfix                    # Delete remote branch 'serverfix' on 'origin'
                                              # (basically this means push nothing to remote branch 'serverfix')

git-reflog

git reflog manages reflog information. This is very handy to repair mistakes, or to recover lost commits (e.g. after modifying the head of a branch that was the only reference to a given commit)

git reflog

git-remote

git-remote set-head
Sets / deletes the default branch for a remote. For remote origin, this creates the reference refs/remotes/origin/HEAD with content ref: refs/remotes/origin/master if default branch is master.
Use set-head to follow the changes in a branch other than the default one:
git remote set-head origin -a                       # Set default remote branch for remote 'origin' automatically
git diff origin                                     # -> Will show difference with origin/master (if 'master' is the default)
git remote set-head origin exoticbranch             # ... or set it to a different branch (here 'exoticbranch')
git diff origin                                     # -> Now will show diff with origin/exoticbranch

git-reset

git reset resets the current HEAD to the new state (hence also moving branch tip if you are currently on a branch). There are 3 different reset modes:

  • soft
    Only change the HEAD reference to a different commit. Working tree files and index are left untouched.
  • mixed (default)
    Like soft, but also reset the index (working tree files are left untouched). This actually clears the index of any staged changes.
  • hard
    Like mixed, but also erases all changes in the working tree, so that it matches the contents of the new HEAD. This is a dangerous command, so better use some alternatives that avoid data loss, like:

    Commit any changes first

    git commit -a -m "snapshot WIP"
    git reset --hard HEAD~3
    

    Stash the changes

    git stash
    git reset --hard HEAD~3
    # ...
    git reset --hard HEAD@{1} # or ORIG_HEAD
    git stash apply
    

    Same, but don't change master too early

    git stash
    git checkout -b new-branch HEAD~3
    ...
    git branch -D master
    git branch -m new-branch master
    

Note:

  • git reset copies the old head to ORIG_HEAD.
  • Caution! git reset moves the tip of the current branch (when HEAD is a ref to a branch). Don't do this on changes that have been published.
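
The difference between the three modes shows up in what is left in the index and in the working tree afterwards. The following sketch walks through them in a scratch repository with two commits; the file name, contents and identity are invented for the illustration.

```shell
#!/bin/sh
# Sketch: compare reset --soft, --mixed and --hard on a two-commit history.
# file.txt and its contents (v1, v2) are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "A"
echo "v2" > file.txt
git commit -q -a -m "B"
tip=$(git rev-parse HEAD)

# --soft: only HEAD moves; the v2 change stays staged in the index
git reset -q --soft HEAD^
soft_staged=$(git diff --cached --name-only)
git reset -q --soft "$tip"                  # back to B, nothing else changes

# --mixed (default): HEAD and index move; the v2 change is left unstaged
git reset -q HEAD^
mixed_staged=$(git diff --cached --name-only)
mixed_unstaged=$(git diff --name-only)
git reset -q --hard "$tip"                  # back to B with a clean tree

# --hard: HEAD, index and working tree all move; the v2 content is gone
git reset -q --hard HEAD^
hard_content=$(cat file.txt)
orig_head=$(git rev-parse ORIG_HEAD)        # the old head, here commit B

echo "soft: staged=[$soft_staged]"
echo "mixed: staged=[$mixed_staged] unstaged=[$mixed_unstaged]"
echo "hard: file.txt=[$hard_content]"
```

Because ORIG_HEAD still names commit B at the end, git reset --hard ORIG_HEAD would bring the discarded commit back.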


Some use cases (see man git-reset for details):

  • Undo a commit and redo
git commit ...
git reset --soft HEAD^
edit
git commit -a -c ORIG_HEAD # or -C
  • Undo commits permanently
git commit ...
git reset --hard HEAD~3
  • Undo a commit, making it a topic branch
git branch topic/wip
git reset --hard HEAD~3 # or --mixed
git checkout topic/wip  # or -m topic/wip
  • Undo a merge or pull
git pull                # conflicts
git reset --hard
git pull . topic/branch # no conflict
git reset --hard ORIG_HEAD
  • Undo a merge or pull inside a dirty work tree
git pull
git reset --merge ORIG_HEAD
  • Interrupted workflow
git checkout feature
#work work work
git commit -a -m "snapshot WIP"
git checkout master
#fix fix fix
git commit
git checkout feature
git reset HEAD^     # or --soft
  • Reset a single file in the index
git add foo.c
git reset -- foo.c

Tips

Frequently Used Commands

git commit -a                     # Stage all changes to tracked files and commit in one pass
git commit --amend                # Amend the tip of the current branch (message, add some files) - also for merge commits

Working the Git Way

  • Check project diff before commit -a:
  • git diff                          # First see what's in the working tree (or git status)
    git commit -a                     # Commit all changes
    
  • Give git commit a directory argument instead of using -a:
  • git commit fs/                     # Commit all changes in directory fs
    
  • Clean up an ugly sequence of commits ([10]).
  • Better than hunk-based commit because (1) each stage can be tested individually, (2) intermediate commits may contain changes that are not in the final one.
    1. First make sure that the ugly sequence is on some temporary branch target (what we aim for), and that the end result is good and clean.
    2. Switch back to the starting point, and do:
      git diff -R target > diff             # diff to target
      
    3. Edit the diff file to select only those changes we want to include in a first commit. Then apply it with git apply:
      vi diff
      git apply diff                        # Must be in project root dir
      
    4. Test, finalize the last changes before committing, and diff against target again if necessary:
      # test test test
      git diff -R target > diff             # if necessary
      
    5. Commit, and repeat from step 2.
    6. When done, branch target can be removed.
  • Use gitk to get a graphical visualisation of current commit, or some subsets. For instance
  • gitk                                 # View current commit and all ancestors
    gitk master..                        # View changes to current branch (i.e. reachable from HEAD, excluding master)
    
  • Use git stash to save the current state of the working tree (see [11]).
  • git stash                            # Save current work in working tree
    ...                                  # (whatever, including git reset --hard...)
    git stash apply                      # Bring back changes in working tree
    
  • Forgot to add some files in the previous commit? Mistyped the commit message? Use git commit --amend:
  • git commit                           # Oops! forgot one file
    git add somefile                     # ... Add the missing file
    git commit --amend                   # ... and replace the previous commit
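
The amend workflow can be checked in a scratch repository: the amended commit replaces the previous one instead of adding a new one. This is a minimal sketch with invented file names and messages; the message is passed with -m so that no editor is opened.

```shell
#!/bin/sh
# Sketch: fix a commit that missed a file and had a typo in its message.
# File names and commit messages are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > first.txt
git add first.txt
git commit -q -m "first commit"

echo "two" > second.txt
echo "three" > forgotten.txt
git add second.txt
git commit -q -m "secnod commit"             # Oops: typo, and forgotten.txt is missing
before=$(git rev-parse HEAD)

git add forgotten.txt                        # Add the missing file
git commit -q --amend -m "second commit"     # Replace the previous commit
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)           # still two commits in the history
files=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "commits: $count"
echo "$files"
```

The commit hash changes on amend, which is why amending should be limited to commits that have not been published yet.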
    

Word-by-word diffs


Add the following alias to your ~/.gitconfig file:

[alias]
    wdiff = diff --color-words
    wshow = show --color-words

Now you can have word-by-word diffs / shows:

   git wdiff fname
   git wshow

Improving tokenization in diffs

Diffs can even be further improved thanks to the use of the .gitattributes file in your repository. For instance:

*.tex   diff=tex
*.h     diff=cpp
*.c     diff=cpp
*.cpp   diff=cpp

One can also define filters that would clean files before checkin/checkout.