A gentle introduction to Git
Git is a source control (or version control) system designed and developed by Linus Torvalds in 2005 for the development of the Linux kernel. Like other source control systems such as TFS or Subversion, it manages your source code, enabling a team to work on the same project while minimising conflicts.
Over the last few years Git has seen incredible growth, making it one of the most widely adopted source control systems in the world. According to an Eclipse Foundation survey conducted in May 2013, Git usage rose from 6.8% in 2010 to 30.3% in 2013.
This article provides an overview of the basic concepts and commands in Git to get you up and running.
Cloning and merging
Git was designed for open source software, and that shines through in its workflow. Say you wanted to submit a patch to John’s repository. The first thing you would do is clone the repository, which gives you your own identical local copy that you can work on and commit your changes to. Once the work is complete, your commits are merged into his repository.
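The clone, commit and merge cycle described above can be sketched on a single machine. This is only an illustration: the directory names `john` and `mine`, the file `README.txt` and the identities are made up, and John's repository is simulated in a temporary directory rather than hosted on a server.

```shell
#!/bin/sh
set -eu

# Hypothetical setup: a throwaway directory standing in for two machines.
work=$(mktemp -d)
cd "$work"

# John's repository, with a single commit.
git init -q john
cd john
git config user.name "John"
git config user.email "john@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..

# Clone it: you now have your own complete local copy.
git clone -q john mine
cd mine
git config user.name "You"
git config user.email "you@example.com"
echo "a small fix" >> README.txt
git add README.txt
git commit -q -m "Fix README"
cd ..

# John brings your commit into his repository by pulling from your clone.
cd john
git pull -q --ff-only ../mine HEAD
git --no-pager log --oneline
```

In practice the merge usually happens through a hosted service or a patch sent by email, but the underlying steps are the same.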
Distributed version control
What sets Git apart from traditional source control systems is that it’s distributed. I’ve found the best way to explain this is to compare it with a non-distributed system. In TFS or Subversion, for example, you have a single centralised repository on some server that everyone commits to. The first commit is commit #1, the second is commit #2, and so on. With Git, commits are instead identified by hashes that look like f52435ce2ffeb7d6b8f1573ca8a6bba9d0697520.
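To see a commit hash for yourself, the following minimal sketch creates a throwaway repository and prints the identifier of its only commit. The file name and identity are placeholders, and the 40-character length assumes Git's default SHA-1 object format.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"

# The full commit identifier: 40 hexadecimal characters with SHA-1.
hash=$(git rev-parse HEAD)
echo "$hash"

# An abbreviated form, which is what most commands display.
git rev-parse --short HEAD
```

The exact hash differs on every run because it depends on the commit's content, author and timestamp.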
Since there is only a single repository in TFS, everyone commits to it and, if necessary, syncs from the server first to resolve any conflicts before committing.
With Git, the idea of numbered commits doesn’t make as much sense as there could be hundreds of different versions of the same repository all being pulled from and merged into each other.
This is what is happening in the above image:
- Jessica clones Daniel’s repository, commits a change and pushes it back to Daniel’s repository.
- Nathan clones Daniel’s repository, commits some changes and pushes them to Daniel’s repository.
- Sam clones Nathan’s repository and later merges from Nathan’s repository to bring the commits made by Nathan to Sam’s repository.
- John clones Nathan’s repository, makes a commit and pushes it back to Nathan’s repository.
- Daniel makes the initial repository, accepts some merges and commits at the end.
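Part of the flow in the list above can be reproduced locally. This is a rough sketch under a few assumptions: Daniel's repository is created as a bare repository (`daniel.git`) so that others can push to it, since Git by default refuses pushes to the checked-out branch of a non-bare repository; everyone shares one machine; and `commit_file` is a helper invented for this example.

```shell
#!/bin/sh
set -eu

base=$(mktemp -d)
cd "$base"

# Daniel's repository (bare, so that it can accept pushes).
git init -q --bare daniel.git

# Hypothetical helper: write a file and commit it in the current repository.
commit_file() {
    echo "$2" > "$1"
    git add "$1"
    git -c user.name="Example" -c user.email="example@example.com" \
        commit -q -m "$2"
}

# Jessica clones Daniel's repository, commits a change and pushes it back.
git clone -q daniel.git jessica
(cd jessica && commit_file jessica.txt "Jessica's change" && git push -q origin HEAD)

# Nathan does the same, starting from Jessica's work.
git clone -q daniel.git nathan
(cd nathan && commit_file nathan.txt "Nathan's change" && git push -q origin HEAD)

# Sam clones Nathan's repository and later merges Nathan's newer commit.
git clone -q nathan sam
(cd nathan && commit_file more.txt "Nathan's second change")
(cd sam && git pull -q --ff-only)
```

Note that Sam never talks to Daniel's repository at all, which is exactly what a centralised system would not allow.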
Multiple repositories
As time passes by, the original repository may not end up being the primary repository. For example, say John wants to modify Daniel’s project to suit his needs, but Daniel doesn’t want to bring the changes into his repository. Provided Daniel’s license allows, John could clone Daniel’s repository and continue working on it as a completely separate project.
Staging
Before files are committed they first need to be staged for commit. Once files are staged, they can then be committed to the local repository.
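A small sketch of staging in action, using two made-up files in a throwaway repository. Only the staged file ends up in the commit.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo "draft" > notes.txt
echo "scratch" > todo.txt

# Stage only notes.txt; todo.txt stays untracked.
git add notes.txt
git status --short

# Commit what is staged. todo.txt is not part of this commit.
git commit -q -m "Add notes"
git status --short
```

The first `git status --short` lists `notes.txt` as added (`A`) and `todo.txt` as untracked (`??`). After the commit only `todo.txt` remains in the list.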
Branching
A repository can contain many branches. The first is the master branch, which is typically where the first commit of the project is made. A branch is basically an independent version of the project code within the repository. This allows a developer to work on a large feature on a branch such as find-command, and that work can be paused at any time by returning to the master branch.
In an ideal world, all development should be done on a non-master branch, with a separate branch for each new feature. This gives the developer a very clean and manageable workspace. The concept of a branch is not Git-specific, but the idea that all development should be done on a branch is arguably more prevalent in Git because branches are so simple to manage.
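Here is a minimal sketch of working on a feature branch and pausing that work. It reuses the `find-command` branch name from above; the file names are invented, and the name of the initial branch is read from Git rather than assumed, since newer installations may call it something other than master.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository with one commit on the initial branch.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
echo "project" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Remember the initial branch name (master on older setups).
main_branch=$(git symbolic-ref --short HEAD)

# Create the feature branch and commit some work on it.
git checkout -q -b find-command
echo "find" > find.txt
git add find.txt
git commit -q -m "Start find command"

# Pause the feature by returning to the initial branch.
# find.txt disappears from the working directory.
git checkout -q "$main_branch"
ls

# When the feature is finished, merge it back.
git merge -q find-command
ls
```

The work on `find-command` is never lost when switching away. It simply lives on its own branch until it is merged.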
Commands
Here are a few must-know commands to get you started.
Command | Description |
---|---|
git init | Creates an empty Git repository |
git clone <repository> | Clones a repository |
git pull <repository> | Pulls changes from a repository |
git status | Shows any modified, added, removed and staged files since the last commit |
git diff | Shows file changes between the staged files and the unstaged files |
git diff --cached | Shows file changes between the last commit and the staged files |
git add <file1> [<file2> ...] | Stages one or more files for commit |
git add -A | Stages all changes in the repository |
git commit -m <message> | Commits the staged files with a message |
git push | Pushes commits to a remote repository |
git log | Shows a list of commits made to the repository |
git help | Shows a list of Git commands |
git help <command> | Shows help on <command> |
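The difference between `git diff` and `git diff --cached` is easiest to see with one staged and one unstaged change to the same file. The sketch below uses a made-up `file.txt` in a throwaway repository, and passes `--no-pager` so the output is printed directly.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository with one committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
echo "line 1" > file.txt
git add file.txt
git commit -q -m "Add file"

# Stage one change, then make a second change without staging it.
echo "line 2" >> file.txt
git add file.txt
echo "line 3" >> file.txt

# Staged vs. unstaged: reports only "line 3" as added.
git --no-pager diff

# Last commit vs. staged: reports only "line 2" as added.
git --no-pager diff --cached
```

Running `git commit` at this point would record only "line 2", because "line 3" has not been staged.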
Do I really need to use a command line?
No, you don’t. There is a range of GUIs available on the official Git site. I actually recommend you start somewhere like GitHub and use a graphical client first to get yourself familiar with all the lingo. I personally started out with GitHub for Windows.
Once you’re a little more confident, have a shot at using the command line if you want more control and a deeper understanding of the system.
Final words
This article is just the tip of the iceberg. Git is an incredibly powerful source control system, and the enormous rise in usage over the last few years can likely be attributed to the development community realising that.