In this third and final look at the Git source control system, I will introduce some more advanced concepts and show you some tricks employed by experienced Git users.
As with most things, and as anyone who has worked with Git for a while knows, there’s more than one way to skin a cat. A lot of tasks can be performed with a couple of basic commands. However, a few advanced concepts and tricks will sometimes help you achieve your goals more elegantly.
Advanced Concepts
Stash: The Clipboard
The “stash” is one such feature. In some situations, such as switching the current branch, a “clean” working directory is recommended (or even essential). That means there shouldn’t be any local changes.
Imagine the following scenario. You’ve been working on a new feature for several hours, when suddenly a critical bug report comes in. Of course, you’ve already changed a couple of files. But now you must switch branches to be able to work on your current production code. You could simply commit your changes, but they’re only half done (and committing stuff that is only half done is bad karma!).
The stash helps you solve precisely this dilemma. All current changes are saved on this clipboard, and your working directory is left in a clean condition. As soon as you’re done fixing that bug, you can return to working on your feature — and simply restore all stashed changes.
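The scenario can be tried out safely in a throwaway repository. The following is a minimal sketch, assuming a temporary directory created with mktemp; the file name app.txt and the commit message are made up for illustration.

```shell
set -e
repo="$(mktemp -d)"            # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable code" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Half-finished feature work that is not ready to be committed.
echo "half-done feature" >> app.txt

git stash                                  # save the changes on the stash
status_clean="$(git status --short)"       # empty: working directory is clean

# ... switch branches, fix the bug, commit, switch back ...

git stash pop                              # restore the changes, drop the stash entry
status_restored="$(git status --short)"    # app.txt is listed as modified again

echo "after stash: '${status_clean}'"
echo "after pop:   '${status_restored}'"
```

Using git stash pop removes the entry from the stash once it has been applied; git stash apply would restore the changes and keep the entry around.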
Staging Parts Of Files
A large commit that mixes a lot of different topics is hard for other developers to understand, and hard to roll back if problems occur. That’s why creating granular commits that contain only related changes is so important in version control.
Git helps you do this by enabling you to add parts of a changed file to the staging area. If you execute git add with the -p parameter, Git lets you choose for every part of the file whether you want to stage it or not. This way, you can control very precisely which changes should go into your next commit, and which should remain for a later commit.
Tracking Branches
If you’ve already glanced at the configuration file of one of your local Git repositories (.git/config), you might have spotted a section such as [branch "master"] that sets remote to origin and merge to refs/heads/master.
Git saves some metadata about the relationship between two branches; in this case, our local “master” branch tracks the same-named branch on the remote “origin.” This metadata is used by a couple of commands in Git, such as push, pull and status.
In general, though, you don’t have to worry about managing all of this metadata. If you create a new local branch based on a remote branch, Git will set up the tracking relationship for you.
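To see the tracking metadata appear, here is a minimal sketch in which a bare repository on the local disk stands in for the remote. The directory names (seed, remote.git, local) and the branch name feature are assumptions made for the example.

```shell
set -e
work="$(mktemp -d)"
cd "$work"

# Build a small repository with two branches, then publish it as a bare "remote".
git init -q seed
cd seed
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Initial commit"
git branch -M master           # make sure the branch is called "master"
git branch feature             # a second branch that will exist on the remote
cd "$work"
git clone -q --bare seed remote.git

# Clone the remote; the clone's master branch tracks origin/master.
git clone -q remote.git local
cd local

# Creating a local branch from a remote branch sets up tracking automatically.
git checkout -q -b feature origin/feature

# Show the tracking metadata stored in .git/config.
git config --get-regexp '^branch\.'
```

The same information is visible in plain text by opening .git/config in the local clone.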
Undoing Things
Most mistakes that you make in Git can be corrected pretty easily.
Let’s take a simple case. You have mistyped your last commit message and now want to correct this typo. Git offers an --amend parameter for its commit command. This will overwrite the last commit and make it look as if your little mistake never happened. Amending also allows you to change the set of committed files by adding items to the commit or removing them from it. But remember one golden rule: don’t amend commits that you’ve already pushed to a remote!
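A minimal sketch of both uses of amending, in a throwaway repository; the file names and commit messages are made up for illustration.

```shell
set -e
repo="$(mktemp -d)"            # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "some notes" > notes.txt
git add notes.txt
git commit -q -m "Add notse file"          # note the typo in the message
before="$(git rev-parse HEAD)"

# Fix the message of the most recent commit.
git commit -q --amend -m "Add notes file"

# Add a forgotten file to the same commit, keeping the message as it is.
echo "forgotten" > forgotten.txt
git add forgotten.txt
git commit -q --amend --no-edit
after="$(git rev-parse HEAD)"

git log --oneline                          # still a single commit
```

The commit hash differs before and after amending, which is why an amended commit that was already pushed causes trouble for anyone who has the old one.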
The revert command lets you “take back” a commit (and this time it doesn’t have to be your most recent commit). Reverting, however, will not delete any commits. Quite the opposite: a new commit will be created that reverses the effects of the corresponding commit.
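As a minimal sketch, the following reverts the middle one of three commits in a throwaway repository; the file names are made up for illustration.

```shell
set -e
repo="$(mktemp -d)"            # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Three commits, each adding one file.
for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "Add $name.txt"
done

bad="$(git rev-parse HEAD~1)"   # the commit that added b.txt

# Create a new commit that undoes the changes of the chosen commit.
git revert --no-edit "$bad"

git log --oneline               # four commits: the original three plus the revert
```

Because the history only grows, reverting is safe to use on commits that have already been pushed.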
The reset command is useful if you truly regret your most recent commit(s). It takes advantage of the fact that branches are really nothing more than pointers to a certain commit. This command rolls the pointer back to an older commit. In fact, reset will not even delete any commits; but your project’s history will look like it has done exactly that.
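A minimal sketch of moving the branch pointer back by two commits. It uses the --hard option, which is one of several modes and also discards changes in the working directory, so it is best tried in a throwaway repository like this one.

```shell
set -e
repo="$(mktemp -d)"            # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "Add $name.txt"
done

old_tip="$(git rev-parse HEAD)"     # remember where the branch pointed

# Roll the branch pointer back two commits and reset the working directory.
git reset -q --hard HEAD~2

git log --oneline                   # only the first commit is left in the history

# The "removed" commit still exists as an object in the repository.
old_type="$(git cat-file -t "$old_tip")"
echo "old tip is still a: $old_type"
```

As long as the old commit has not been garbage-collected, it can be found again through git reflog.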
Integrating Selected Commits
Usually, you would integrate changes into a branch by merging with another one. In those rare cases where a merge is undesired, Git offers an alternative with the cherry-pick command. Instead of integrating a complete branch (as a merge does), cherry-pick allows you to integrate any desired commit. You can even integrate multiple selected commits in one go, but remember to start with the oldest one to avoid problems.
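A minimal sketch that picks the first and third of three commits from another branch, oldest first. The branch names main and feature and the file names are assumptions made for the example.

```shell
set -e
repo="$(mktemp -d)"            # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"
git branch -M main

# Three independent commits on a feature branch.
git checkout -q -b feature
for name in one two three; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "Add $name.txt"
done

first="$(git rev-parse feature~2)"   # commit that added one.txt
third="$(git rev-parse feature)"     # commit that added three.txt

# Back on main, integrate only the two selected commits, oldest first.
git checkout -q main
git cherry-pick "$first" "$third"

git log --oneline                    # base commit plus the two picked commits
```

The picked commits are copies with new hashes; the originals stay on the feature branch untouched.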
GUI applications like Tower make tasks like these a lot easier by allowing you to simply drag and drop the desired commits.
Rebase Instead Of Merge
The most common method to integrate one branch into another is to perform a “merge.” For an ordinary three-way merge, Git takes the endpoints of the branches to be merged and the common parent commit as the basis for the integration. This results in a so-called “merge commit” that connects both branches like a knot.
A “rebase” is an alternative to an ordinary merge. A rebase does not result in a separate merge commit and therefore produces no “bumps” in your project’s history. It will look as if the history has run linearly and all commits have happened on the same branch.
Let’s look at a concrete scenario to better understand what a rebase does. Two branches, branch_A and branch_B, share a last common commit (C1), and each has gained new commits since then.
Here, branch_B is our current HEAD branch. If we execute git rebase branch_A, the following things will happen. First, all new commits on branch_B (C2 and C4) that originated after the last common commit (C1) will be temporarily removed. Now, branch_A’s new commits will be applied on branch_B. This means that both branches are now on the same position: on branch_A’s position.
Right at the beginning, branch_B’s new commits were removed temporarily. Now is the time to reapply them: one after the other, and in their original order.
In the end, by rebasing the branches, no merge commit was necessary, and the history has remained linear.
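The scenario can be reproduced in a throwaway repository. In this minimal sketch, the commit named C3 on branch_A is an assumption (the text only names C1, C2 and C4), and commit_file is a hypothetical helper defined just for the example.

```shell
set -e
repo="$(mktemp -d)"            # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: commit one new file, named after the commit message.
commit_file() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

commit_file C1                 # last common commit
git branch -M branch_A

git checkout -q -b branch_B
commit_file C2                 # new commit on branch_B

git checkout -q branch_A
commit_file C3                 # new commit on branch_A (assumed name)

git checkout -q branch_B
commit_file C4                 # another new commit on branch_B

# Reapply branch_B's commits on top of branch_A.
git rebase branch_A

git log --format=%s            # newest first: C4, C2, C3, C1
merges="$(git rev-list --merges --count HEAD)"
echo "merge commits in history: $merges"
```

The log shows a single straight line of commits, with branch_B's work sitting on top of everything from branch_A.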
Rewriting History
A handful of commands in Git will change a project’s history. In addition to amending a commit, rebase also falls into this category.
Note that the reapplied commits in our sample scenario aren’t completely identical to the original commits (which is why they’re named C2’ and C4’). Their SHA-1 hashes have changed because we rewrote history with the rebase.
Always follow this golden rule when using commands that change your history: local commits that haven’t been published can safely be changed by using rebase or amend. However, if they’ve already been pushed to a remote repository, you should not use these tools anymore. Your teammates will thank you for it.
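One way to check the golden rule before rewriting anything is to ask whether a remote branch already contains the commit. The following is a minimal sketch, assuming a bare repository on the local disk as the remote; the names remote.git, local and main are made up for the example.

```shell
set -e
work="$(mktemp -d)"
cd "$work"
git init -q --bare remote.git   # stands in for the remote server

git init -q local
cd local
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$work/remote.git"

echo "one" > one.txt
git add one.txt
git commit -q -m "Published commit"
git branch -M main
git push -q -u origin main      # this commit is now on the remote

echo "two" > two.txt
git add two.txt
git commit -q -m "Local-only commit"

# List the remote branches that contain each commit.
published="$(git branch -r --contains HEAD~1)"   # lists origin/main
unpublished="$(git branch -r --contains HEAD)"   # empty: not pushed yet

echo "older commit is on: ${published:-no remote branch}"
echo "newest commit is on: ${unpublished:-no remote branch}"
```

An empty result means the commit exists only locally, so amending or rebasing it will not affect anyone else. The check relies on the local copy of the remote branches, so running git fetch first gives a more reliable answer.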
Housekeeping In Git
A Git repository can accumulate quite a number of objects over its lifetime, be they commits, files or file trees. Organizing these objects optimally is crucial to keeping Git fast. The git gc command (where gc stands for “garbage collect”) was made for just this purpose. Although it’s executed automatically in the background when running certain commands, running gc from time to time is still a good idea (optionally with the --aggressive parameter, which takes considerably longer but optimizes more thoroughly).
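The effect of git gc can be observed with git count-objects, which reports how many objects are stored loosely and how many are packed. A minimal sketch in a throwaway repository; the file name and the number of commits are arbitrary.

```shell
set -e
repo="$(mktemp -d)"            # throwaway repository for the demonstration
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Create a handful of commits so there are some loose objects.
for i in 1 2 3 4 5; do
  echo "revision $i" > data.txt
  git add data.txt
  git commit -q -m "Revision $i"
done

echo "before gc:"
git count-objects -v            # "count" is the number of loose objects

git gc -q                       # add --aggressive for a slower, more thorough run

echo "after gc:"
git count-objects -v            # loose objects have been moved into a pack

loose_after="$(git count-objects -v | awk '/^count:/ {print $2}')"
packs_after="$(git count-objects -v | awk '/^packs:/ {print $2}')"
```

On a repository this small the difference is cosmetic; the benefit shows on repositories with a long history.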
Desktop Tools And External Code Hosting
Tools
If you spend a lot of time in the command line, you can spice things up by using plug-ins. Nice little helpers include Tab auto-completion for branch names, and directly displaying the current branch in your shell’s prompt. Plug-ins are available for Bash and zsh.
As alternatives to the command line, various desktop clients might be worth a look. A lot of tasks can be performed more easily and comfortably using a graphical user interface, if only to avoid memorizing all of the commands and parameters. Windows users might want to look at Tortoise Git, while Mac OS users can try Tower (disclaimer: this is the author’s product).
Repository Hosting
More and more companies and individual developers are opting not to host their own code anymore. Providing and maintaining expensive server infrastructure is not everyone’s cup of tea: special know-how is necessary, resources are tied up, and a high degree of security and availability must be guaranteed.
Meanwhile, some companies are already offering code hosting as “software as a service.” Some of the most popular ones currently are GitHub, Beanstalk and Codebase. If hosting your code externally is an option, then check out one of these services.
Conclusion
Git is an extremely versatile tool. In its early days, it consisted of more than 140 binaries that could be combined flexibly with one another. While using Git has become much easier with recent versions, it has managed to maintain this flexibility. As a result, Git can be viewed as a toolset for creating your very own version control workflow.
But the advanced tools are not all that make Git interesting. Its extraordinary speed and unique branching concept, for example, are great to have, too.