Growing with the Web

A gentle introduction to Git

Git is a source control (or version control) system designed and developed by Linus Torvalds in 2005 for the development of the Linux kernel. Like other source control systems such as TFS or Subversion, it manages your source code, enabling a team to work on the same project while minimising conflicts.

Over the last few years Git has seen incredible growth, making it one of the most widely adopted source control systems in the world. According to an Eclipse Foundation survey conducted in May 2013, Git usage rose from 6.8% in 2010 to 30.3% in 2013.

This article provides an overview of the basic concepts and commands in Git to get you up and running.

Cloning and merging

Git was designed for open source software and that shines through in its workflow. Say you wanted to submit a patch to John’s repository. The first thing you would do is clone the repository, which gives you your own identical local copy that you can work on and commit changes to. Once the work is complete, your commits are merged into his repository.
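The clone, commit and merge cycle can be sketched entirely on one machine. This is a minimal sketch rather than a definitive workflow: the directory names john and mine, the file README.txt and the names and email addresses are all placeholders invented for the example, and John's repository is just a local directory standing in for a real remote.

```shell
set -e
work=$(mktemp -d)   # scratch directory so nothing else is touched
cd "$work"

# Stand-in for John's repository, with a single commit in it.
git init -q john
cd john
git symbolic-ref HEAD refs/heads/master   # call the first branch master whatever the local default is
git config user.name "John"
git config user.email "john@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..

# Clone it to get your own identical local copy, then commit a patch there.
git clone -q john mine
cd mine
git config user.name "You"
git config user.email "you@example.com"
echo "a small fix" >> README.txt
git add README.txt
git commit -q -m "Fix the README"
cd ..

# John merges your commits by pulling from your clone.
cd john
git pull -q --ff-only ../mine master
git log --oneline
```

In practice the merge step often happens through a hosting service or a mailed patch rather than a direct pull, but the result is the same: John's history ends up containing your commit.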

Distributed version control

TFS revisions vs Git commits

What sets Git apart from centralised source control systems is that it’s distributed. I’ve found the best way to explain this is to compare it to a non-distributed system. In TFS or Subversion, for example, you have a single centralised repository on some server that everyone commits to. The first commit is commit #1, the second is commit #2, and so on. With Git, commit names are actually hashes that look like f52435ce2ffeb7d6b8f1573ca8a6bba9d0697520.
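To see a commit name for yourself, the sketch below makes one commit in a throwaway repository and prints its hash. The file name notes.txt and the identity are placeholders, and the 40-character length assumes Git's default SHA-1 object format.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "First commit"

# Print the full name (hash) of the commit that was just made.
commit_id=$(git rev-parse HEAD)
echo "$commit_id"

# An abbreviated form is usually enough to refer to a commit.
git rev-parse --short HEAD
```

The hash depends on the commit's content, author and timestamp, so the value printed will differ on every run.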

Since there is only a single repository in TFS, everyone commits to it and, if necessary, syncs from the server to resolve any conflicts before committing.

An example of a TFS tree

With Git, the idea of numbered commits doesn’t make as much sense as there could be hundreds of different versions of the same repository all being pulled from and merged into each other.

An example of a Git tree

This is what is happening in the above image:

  • Jessica clones Daniel’s repository, commits a change and pushes it back to Daniel’s repository.
  • Nathan clones Daniel’s repository, commits some changes and pushes them to Daniel’s repository.
  • Sam clones Nathan’s repository and later merges from Nathan’s repository to bring the commits made by Nathan to Sam’s repository.
  • John clones Nathan’s repository, makes a commit and pushes it back to Nathan’s repository.
  • Daniel makes the initial repository, accepts some merges and commits at the end.
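The first of these steps can be reproduced locally. In this sketch the directory names, the file app.txt and the identify helper are all invented for illustration. Daniel's shared repository is modelled as a bare repository, because by default Git refuses a push to the checked-out branch of a repository that has a working directory.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical helper for this sketch: set a per-repository identity.
identify() { git config user.name "$1"; git config user.email "$1@example.com"; }

# Daniel makes the initial repository and its first commit.
git init -q daniel-work
cd daniel-work
git symbolic-ref HEAD refs/heads/master   # call the first branch master whatever the local default is
identify daniel
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
cd ..

# Publish it as a bare repository that other people can push to.
git clone -q --bare daniel-work daniel.git

# Jessica clones Daniel's repository, commits a change and pushes it back.
git clone -q daniel.git jessica
cd jessica
identify jessica
echo "jessica's change" >> app.txt
git commit -q -a -m "Jessica's change"
git push -q origin master
cd ..

# Nathan clones afterwards and already has Jessica's commit.
git clone -q daniel.git nathan
git -C nathan log --oneline
```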

Multiple repositories

As time passes, the original repository may not end up being the primary repository. For example, say John wants to modify Daniel’s project to suit his needs, but Daniel doesn’t want to bring the changes into his repository. Provided Daniel’s license allows it, John could clone Daniel’s repository and continue working on it as a completely separate project.

A forked project

Staging

Before files are committed, they first need to be staged. Staging lets you choose exactly which changes go into the next commit; once files are staged, they can be committed to the local repository.
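A small sketch of staging, using a throwaway repository with two placeholder files, a.txt and b.txt. Only the staged file ends up in the commit.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > a.txt
echo "two" > b.txt

# Stage only a.txt; b.txt stays untracked and out of the next commit.
git add a.txt
git status --short
# A  a.txt
# ?? b.txt

# Commit what is staged; b.txt is still waiting in the working directory.
git commit -q -m "Add a.txt"
git status --short
# ?? b.txt
```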

Branching

A repository can contain many branches. The first is the master branch, which is typically where the first commit of the project is made. A branch is essentially an independent version of the project code within the repository. This allows a developer to work on a large feature on a branch, say find-command, and pause that work at any time by returning to the master branch.

Two branches being branched off of and merged into master

In an ideal world, all development should be done on a non-master branch, with a separate branch for each new feature. This gives the developer a very clean and manageable workspace. The concept of a branch is not Git-specific, but the idea that all development should be done on a branch is arguably more prevalent with Git because branches are so simple to manage.
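Here is a minimal sketch of working on a feature branch, reusing the find-command branch name from above. The file names and identity are placeholders, and the merge at the end is one possible way to bring the feature back into master.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master whatever the local default is
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Start the feature on its own branch.
git checkout -q -b find-command
echo "find" > find.txt
git add find.txt
git commit -q -m "Start the find command"

# Pause the feature by returning to master; find.txt is not present here.
git checkout -q master
ls

# When the feature is ready, merge it into master.
git merge -q find-command
ls
```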

Commands

Here are a few must-know commands to get you started.

Command                          Description
git init                         Creates an empty Git repository
git clone <repository>           Clones a repository
git pull <repository>            Pulls changes from a repository
git status                       Shows modified, added, removed and staged files since the last commit
git diff                         Shows changes between the staged files and the unstaged files
git diff --cached                Shows changes between the last commit and the staged files
git add <file1> [<file2> ...]    Stages one or more files for commit
git add -A                       Stages all changes in the repository
git commit -m <message>          Commits the staged files with a message
git push                         Pushes commits to a remote repository
git log                          Shows a list of commits made to the repository
git help                         Shows a list of common Git commands
git help <command>               Shows help on <command>
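The difference between git diff and git diff --cached is easiest to see with one file that has both a staged and an unstaged change. This is a sketch in a throwaway repository; notes.txt and the identity are placeholders.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "line 2" >> notes.txt
git add notes.txt             # "line 2" is now staged
echo "line 3" >> notes.txt    # "line 3" exists only in the working directory

# Staged vs unstaged: reports "+line 3".
git diff

# Last commit vs staged: reports "+line 2".
git diff --cached
```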

Do I really need to use a command line?

No, you don’t. There is a range of GUIs available on the official Git site. I actually recommend you start somewhere like GitHub and use a graphical client first to get yourself familiar with all the lingo. I personally started out with GitHub for Windows.

Once you’re a little more confident, have a shot at the command line if you want more control and a deeper understanding of the system.

Final words

This article is just the tip of the iceberg. Git is an incredibly powerful source control system, and the enormous rise in usage over the last few years can likely be attributed to the development community realising that.
