Falling in love with Git

Ever since watching Linus Torvalds’ talk about two years ago, I have been excited about Git. However, it was only half a year ago, during an internet outage, that I first used it for one of my own projects. I’ve been working with Git extensively for a while now, and I’m still impressed.

Here’s what I love about Git:

  • Distributed. I didn’t know about distributed version control before watching Torvalds’ talk, but it all made sense to me immediately – especially when thinking about open source projects. Clone a repository, work with it, create your own branches, collaborate on something with someone else and contribute back to the project by having them pull your changes or by creating a patch with the built-in command. Just brilliant.
  • Powerful. I’ve never seen a source code management system as powerful as Git before. You can create and merge branches easily (git checkout/branch), you can store your current changes, work on something else and restore them later (git stash), you can employ a Subversion-like workflow by pushing to one or more remote repositories (git remote/push/pull), you can use Git to work with CVS (git cvsimport), Subversion (git svn) and lots of other SCMs, you can create and apply patches (git format-patch/apply), you can do regression testing (git bisect) and more. Despite all its features, I find it remarkably easy to use, probably because I grew up with Unix, and Git is Unix philosophy at its finest.
  • Fast. It’s hard to believe how fast Git is. Cloning a whole repository, including the complete history of all branches, takes only seconds for my typical project size. And since you’re working locally 99% of the time, you hardly ever have to wait on the network. The first time I switched branches, I thought something had gone wrong because it took just one second.
  • GitHub. GitHub is definitely one of the great things about Git. Besides hosting your projects conveniently, you can follow other projects and developers to see what’s happening in a Twitter-like timeline, and you can fork other projects at the click of a button. For instance: I had just cloned a project from GitHub and made some local changes. Since I had only read access to that repository, I simply forked it on GitHub, added the fork as a remote in my local repository and pushed my changes to it. I then removed the original repository from the list of remotes and could continue working as if nothing had happened. All of that took about a minute.

You can see how amazed I am. However, I have little hope of introducing Git where I work. The Eclipse plugin EGit is simply not there yet, nor is Windows support, and considering how difficult it was to replace CVS with Subversion, it seems surreal to think about using Git there. At least I’m Git-only for all of my private projects by now.

But if you can choose your own tools, I suggest you give Git a try even if your team is using a different source code management system. If you’re a Subversion user (I’m not a Subversion hater like Torvalds – in fact, I still like it), I can point you to a very nice introductory series of screencasts demonstrating how you can collaborate with Subversion users.

Test-driven development

I’m probably horribly late to the party, but I’m currently exploring test-driven development after reading Kent Beck’s classic Test-Driven Development: By Example. Although most developers claim to know TDD, I have met only a few who knew what it was actually about – the majority seems to confuse it with unit testing.

Here’s the surprise: TDD is not a testing technique. It’s a development technique, and a good one at that. Here’s why I like it:

  • It forces you to carefully think about what you want to implement before you dive into the how, leading to simpler APIs.
  • It pushes you towards loose coupling and modularisation – virtues in object-oriented design.
  • It makes you focus your energy on solving actual problems, away from just-in-case code and bloated, prophetic designs.

This is how TDD works in a nutshell:

  1. Think about what you want to create and hence what you want to test. Make a TODO list of all the test cases you plan to write.
  2. Pick a test case and implement it. Use imaginary classes and methods – it doesn’t have to compile.
  3. Make the test case compile as fast as possible. Create empty classes, implement methods that return constants, just make it compile.
  4. Run the test case. It will most likely fail. Now make it pass as fast as possible; using constants and duplication is explicitly allowed.
  5. As soon as the test case passes, refactor the code. Remove constants and duplication, refactor your dreamt-up API to match reality, write Javadoc, etc. Run all test cases often and make sure they still pass once you’re done. If you find a problem that isn’t covered by the existing test cases, add it to the TODO list and ignore it for now.
  6. Pick the next test case to implement. If you are unsure whether the code will only work with the specific test case you just wrote, write a similar one with different data. Beck calls this triangulation and it is supposed to increase the confidence in your code. TDD is all about confidence.
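The cycle above, including triangulation, can be sketched with a deliberately tiny example (a hypothetical Multiplier class of my own invention, not code from the book): the first test case could be passed by returning a hard-coded constant, and a second test case with different data forces the real implementation.

```java
// Steps 2-4: a test written against an imaginary Multiplier.times() method,
// then made to compile and pass as fast as possible. Returning the constant
// 10 would have satisfied the first test case alone.
// Step 6: a second test case with different data (triangulation) kills the
// constant and forces the general implementation below.

public class Multiplier {
    static int times(int factor) {
        // After triangulation, "return 10;" had to become a real computation.
        return 5 * factor;
    }

    public static void main(String[] args) {
        // First test case: would also have passed with "return 10;".
        if (times(2) != 10) throw new AssertionError("times(2) should be 10");
        // Triangulating test case: only the real implementation passes both.
        if (times(3) != 15) throw new AssertionError("times(3) should be 15");
        System.out.println("all tests pass");
    }
}
```

The point is not the arithmetic but the rhythm: each step is small enough that you always know why a test fails.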

I’ve investigated TDD with two problems so far:

I first tackled an interview-style question about combinatorics. Although I was not a hundred percent sure what I was doing at all times, I was able to create a working solution astonishingly fast. Whenever I was in doubt (quite often), I would write another test case and fix the code until all test cases passed again. Now I know why it’s called test-driven; it does indeed drive development, e.g. by breaking down complicated problems into smaller, solvable ones. I never had to stop and agonise – if I couldn’t make a test case work, I threw it away and wrote one that made a smaller step towards my goal.

That really aroused my interest in TDD, but it was a neat little isolated problem, something for which TDD is known to work very well. How about a more realistic problem?

I recently started to port a Tetris clone I wrote in Java to GWT. There were performance problems because I drew the whole grid again after each change instead of just reacting to the changes, so I decided to rewrite the game logic. I figured that this would be just the right problem for my next TDD experiment, plus I was curious how it would fare in game development, so I began.

I just implemented the last test case from my list and I’m pretty happy with the results. However, it was more complicated than my first experiment, and there were many design issues. For instance: In the game (I assume you know Tetris), new pieces are placed at random horizontal positions. I could safely ignore this fact for a while, but as soon as I had to write test cases for moving pieces horizontally, things got complicated.

My test cases asked the pieces for their horizontal position and tested movement based on that information. For example, I asserted that a piece, after being moved left, had actually moved left – unless it was already on the left edge of the grid, in which case it shouldn’t move at all. I noticed that this test case would sometimes pass and sometimes fail, depending on the random horizontal position of the piece (e.g. if my code was broken and didn’t move the piece at all, the test case would only pass if the piece happened to be on the left edge of the grid).

I was not willing to tolerate a test case that would sometimes pass and sometimes fail, because I wouldn’t be able to trust the test cases anymore. Should I run all test cases a hundred times in a row after each change, just to make sure every situation is covered? I had to get the randomness out of the game logic. I did that by moving random number generation into its own class with methods like createRandomPiecePosition(). I then extracted an interface from that class and wrote a mock implementation that returned fixed values, which could be set by the test cases as required. This solved the problem, and it’s really nice in terms of modularisation.
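A minimal sketch of that extraction (the interface, class and method names beyond createRandomPiecePosition() are my guesses, not the post’s actual code): the game logic depends only on an interface, the production implementation wraps java.util.Random, and the test double returns whatever position the test case needs.

```java
import java.util.Random;

// The game logic talks to this interface, never to Random directly.
interface PositionGenerator {
    int createRandomPiecePosition(int gridWidth);
}

// Production implementation: real randomness.
class RandomPositionGenerator implements PositionGenerator {
    private final Random random = new Random();
    public int createRandomPiecePosition(int gridWidth) {
        return random.nextInt(gridWidth); // 0 .. gridWidth-1
    }
}

// Test double: a fixed position chosen by the test case.
class FixedPositionGenerator implements PositionGenerator {
    private final int position;
    FixedPositionGenerator(int position) { this.position = position; }
    public int createRandomPiecePosition(int gridWidth) { return position; }
}

public class PositionGeneratorDemo {
    // A sliver of game logic: move a piece one column to the left,
    // unless it is already on the left edge of the grid.
    static int moveLeft(int position) {
        return position > 0 ? position - 1 : position;
    }

    public static void main(String[] args) {
        // Deterministic test 1: piece pinned to the left edge must not move.
        PositionGenerator gen = new FixedPositionGenerator(0);
        int pos = gen.createRandomPiecePosition(10);
        if (moveLeft(pos) != 0) throw new AssertionError("edge piece must not move");

        // Deterministic test 2: piece in the middle must move one column left.
        gen = new FixedPositionGenerator(5);
        pos = gen.createRandomPiecePosition(10);
        if (moveLeft(pos) != 4) throw new AssertionError("piece must move left");
        System.out.println("deterministic tests pass");
    }
}
```

With the fixed generator, both branches of the movement logic can be exercised on every run instead of only when the random position happens to cooperate.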

I really enjoyed being able to refactor whenever I felt like it – rerunning my test cases gave me confidence that I hadn’t broken anything. I was also surprisingly fast, even though I spent much more time on design issues than anticipated, because of the tight workflow: I could write some code and instantly see whether it worked. When I worked on the Java version of the game, I had to create a specific situation (e.g. game over) by actually playing the game, which was annoyingly time consuming. What’s more, every unexpected bug I fixed back then cost me a little confidence in the code, making me test more often. With the TDD workflow, whenever I noticed an unexpected bug, I wrote a test case to identify it and made it pass. I never had to think about that bug again, because I knew there was a test case that would catch it should it ever come back.

All in all, I’m really impressed by TDD, and I plan to use it for the game logic of my latest project.