Programming

80 columns

I tend to wrap most code I write at 80 columns. Many programmers I interact with (especially those contributing to my projects) don’t really see the point of that, so here’s why I do it:

1. It works well with UNIX tools like less, grep etc. at standard terminal sizes. Sure, you can resize your terminal emulator, so I guess it’s not a very strong argument in this day and age.

2. It works well with pretty much every diff and review tool. That’s a better argument, because many of these tools have some kind of line limit, and 80 is certainly the lowest common denominator that works well everywhere.

3. Most lines are shorter than 80 columns anyway; this rule is mostly about how to deal with exceptionally long lines. Java code is a notable exception here, as wrapping that at 80 columns is pretty hard. I’d still argue for a line length limit in Java code; 100 or 120 columns seem to be popular.

4. From my experience, short lines result in more readable code overall. Fewer nested expressions and more (reasonably named) temporary variables always appear to make things easier to follow.

5. You need less horizontal space for your editor. So you can show other windows (browser, terminal, other files, …) next to it.

I came up with only one downside of wrapping at 80 columns: You have to think about how to wrap long lines, or preferably refactor them into multiple shorter lines. It’s definitely some extra effort, but as I noted above, I strongly believe this aids the overall readability of the code, so I’d say it’s time well spent.
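To illustrate the kind of refactoring I mean, here’s a small sketch in JavaScript. The order data and both function names are made up for this example; the point is only that the second version fits into 80 columns by naming the intermediate results.

```javascript
// Hypothetical order data, made up for this example.
const order = {
  items: [
    { price: 4.5, quantity: 2 },
    { price: 10, quantity: 1 },
  ],
  discount: 0.1,
  taxRate: 0.2,
};

// One long line with nested expressions (well over 80 columns).
function totalLong(order) {
  return Math.round(order.items.reduce((sum, item) => sum + item.price * item.quantity, 0) * (1 - order.discount) * (1 + order.taxRate) * 100) / 100;
}

// The same calculation, split up using named temporary variables.
function totalShort(order) {
  const subtotal = order.items.reduce(
    (sum, item) => sum + item.price * item.quantity, 0);
  const discounted = subtotal * (1 - order.discount);
  const withTax = discounted * (1 + order.taxRate);
  return Math.round(withTax * 100) / 100;
}
```

Both functions do the same thing, but in the second one every step has a name you can read, diff and evaluate on its own.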

Reimplementing the wheel

Assuming you’re a programmer, I bet that you’ve heard the phrase reinventing the wheel before. I usually cringe when I hear someone say it, for two reasons:

1. It’s actually reimplementing the wheel

The phrase is almost always used when someone is implementing something that has been implemented by a library or framework. (That’s at least what my anecdotal evidence suggests.) So it’s reimplementing, not reinventing. Reinventing would be someone implementing a sort function, but completely ignoring all research done in that area, trying to come up with their own algorithm. I don’t think that’s common.

But there are valid reasons for implementing something that has been implemented before, here are just a few:

  • You want to actually understand what’s going on, and be able to tune freely, because it’s an important part of your application
  • Libraries age far more quickly than research, and are not available in all environments
  • You just want to keep things simple (note that I’m not using simple synonymously with easy here)

Of course, there are valid reasons against doing that as well, for example:

  • Using an existing implementation is faster
  • You and your colleagues don’t need to learn and remember the theory
  • Depending on the library and programmers in question, often higher reliability

So it’s sometimes the right thing to do, sometimes not.

2. It’s condescending

It might be said with the best of intentions to save someone else from wasting time, but it’s still saying they’re actually wasting their time. Without having a clue about why they chose to implement something themselves instead of using an existing implementation, that’s downright condescending.

Making assumptions about something you don’t know is generally not a good idea. Why not assume people actually know what they’re doing unless evidence suggests otherwise?

Unit tests versus code quality

Do unit tests improve code quality? Some famous consultants might disagree, but I think they don’t. Testable code isn’t automatically better code. Depending on the capabilities of your language, it’s probably worse.

Now don’t get me wrong, unit testing is a good thing. But I think we need to realise that we’re often making a trade-off between simplicity and testability. To me, simplicity is the most important factor of code quality, but many people lean towards testability and are very successful with that. Maybe a complex, well tested system is better than a simpler system with less test coverage, I can’t answer that. But you can’t have both, at least not in static languages.

Let me explain what I mean: In a unit test, you’re testing a small part of your system in isolation. You’re ensuring that a single module, class or function works as expected, now and in the future. You only test the unit as it can be used from the outside, and that’s good, because its implementation details shouldn’t matter to the rest of the system. But it’s also a problem.

It’s a problem because you often have to actually change the unit to make it testable. There are plenty of examples for this, but I’ll stick to one in this post: You’re testing a unit that uses a random number generator. Since the behaviour of the unit will be different every time you use it, you need a way to take control of that random number generator in order to test it reliably.

If your language supports object-oriented programming, the common approach is to introduce an interface for the thing you need to control and inject it from the outside. Let’s say we’re creating a RandomNumberGenerator interface and pass an instance of it to our unit. You can then create a fake implementation that does just what you want, and pass that one to the unit from the tests. Now you can make sure that your unit works fine for various random numbers.
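Here’s a minimal sketch of that approach. I’m using JavaScript for brevity, which has no interface keyword, so the “interface” is just the convention that a generator has a next() method. All class names here (DefaultRandomNumberGenerator, FakeRandomNumberGenerator, Die) are made up for this example.

```javascript
// The "interface": anything with a next() method returning a number in
// the range [0, 1). In JavaScript this is only a convention.

// Production implementation, a thin facade over the standard library.
class DefaultRandomNumberGenerator {
  next() {
    return Math.random();
  }
}

// Fake implementation for tests: replays a fixed sequence of numbers.
class FakeRandomNumberGenerator {
  constructor(numbers) {
    this.numbers = numbers;
    this.index = 0;
  }

  next() {
    const number = this.numbers[this.index % this.numbers.length];
    this.index += 1;
    return number;
  }
}

// The unit under test (hypothetical): a die that gets its generator
// injected instead of calling Math.random() directly.
class Die {
  constructor(randomNumberGenerator) {
    this.randomNumberGenerator = randomNumberGenerator;
  }

  roll() {
    return Math.floor(this.randomNumberGenerator.next() * 6) + 1;
  }
}

// In a test, inject the fake to get predictable rolls.
const die = new Die(new FakeRandomNumberGenerator([0, 0.5, 0.99]));
const rolls = [die.roll(), die.roll(), die.roll()];
```

With the fake in place, the rolls are the same on every run, so the test can assert on them.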

However, we have just added to the system’s complexity. We have created a facade for a random number generator, which is very likely already available in your language’s standard library. Anyone working on your code base will now have to know that your facade has to be used instead of the standard method. We have also introduced an interface that doesn’t make much sense right now: Ignoring the tests, there is only one random number generator – why have an interface if there is only one implementation of it? That’s nothing but unnecessary code other poor souls will have to wrap their head around. Maybe you even introduced a dependency injection framework – this is going to make your code base a lot more complex.

Languages that support monkey patching (most dynamic languages, e.g. JavaScript) are an entirely different matter: You can simply rebind the dependencies in your tests. I think that’s how testing is supposed to be: We should just write simple and clean code and be able to test it, without having to think about how to test it and what trade-offs to make. But static languages are still around, popular, and the only option for many applications, so I guess we will need to make such trade-offs for quite some time. Let’s at least be honest about it: It sucks.
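For comparison, this is roughly what rebinding looks like in JavaScript. It’s only a sketch: rollDie and the withRandom helper are made up for this example, and a real test suite would probably use whatever stubbing facility its test framework offers.

```javascript
// The unit under test (hypothetical) uses the standard library directly.
function rollDie() {
  return Math.floor(Math.random() * 6) + 1;
}

// Test helper (also hypothetical): temporarily rebinds Math.random,
// runs the given function and restores the original afterwards.
function withRandom(fakeRandom, fn) {
  const originalRandom = Math.random;
  Math.random = fakeRandom;
  try {
    return fn();
  } finally {
    Math.random = originalRandom;
  }
}

// In a test, control the "random" numbers without touching rollDie.
const lowest = withRandom(() => 0, rollDie);
const highest = withRandom(() => 0.99, rollDie);
```

The production code stays as simple as it was: no interface, no facade, no injection.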

Why we used Clojure and ClojureScript for Flurfunk

Why have you decided to use Clojure and are you still happy with your choice?

This question has been asked more than once now, and although I answered it in a Google+ comment, it seems you can’t link to those, so here’s a blog post.

I wouldn’t call myself an “old Lisper” (as Thomas did), but I had some experience with Clojure and other Lisps and thought it might be a good choice. We went with Clojure in particular because it runs on the JVM, and we wanted to integrate well with our company’s Java environment.

Clojure code tends to be succinct, readable and easy to change, which was useful since we didn’t have a very clear picture of where Flurfunk should go when we started, nor did we have much time.

ClojureScript, which was only a month old when we began to work on Flurfunk, is a different matter. It wasn’t exactly hassle-free. I think we might have been faster if we had used JavaScript, plus I wouldn’t have been the only one working on it. But I’ve seen ClojureScript get better every month, and was pretty productive after the initial problems were solved. A rewrite wouldn’t have paid off so far.

So to answer the question: Yes, we’re happy with our choices. I don’t think Clojure was indispensable, we could have done it without it, but it certainly played its part. If nothing else, it kept me motivated. On the other hand, I think we would have seen more collaboration (both in our team and now that it’s open source) if we had picked (J)Ruby and JavaScript.

Those decisions are way too hard and I don’t think there are really right and wrong choices. We both like Clojure, it probably comes down to that.

What’s with all these parentheses?

I’m fairly certain that every programmer will, at some point in his career, say something along the lines of:

I don’t know, what’s with all these parentheses?

And he will be talking about Lisp. I certainly did a few years ago. It’s quite strange when you’re used to other languages: Why would anyone want all these parentheses?

Why have something like this:

(filter (lambda (price)
          (< price 10))
        prices)

When you can have it like this:

prices.select do |price|
    price < 10
end

Indeed, there is not much syntactic sugar, as Rubyists say, in most Lisp dialects. If you think about it, there isn’t really much syntax at all: You’re really just writing a syntax tree in Lisp’s list literals. Considering that, the syntax is actually pretty clear. Let’s look at what this might look like written in JavaScript’s array literals:

[filter, [lambda, [price],
           [lt, price, 10]],
         prices]

Not that much different, eh? That’s what Lisp is. And that’s why Lisp macros are so powerful: You can easily transform these data structures into something else, changing your code. I can’t think of a single other language that lets me do that. Another benefit is that when looking at Lisp code, you normally don’t have to wonder about things like operator precedence and strange literals you didn’t know, the syntax is unambiguous.
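To make the “code is data” point a bit more tangible, here’s a toy evaluator for the array version above. It’s a sketch under a few assumptions of mine: symbols are written as strings, the environment (filter, lt, not, prices) is made up for this example, and negateLambda only imitates what a macro does (code in, code out). It’s nowhere near a real Lisp.

```javascript
// Symbols are represented as strings, lists as arrays.
const code = ["filter", ["lambda", ["price"],
                          ["lt", "price", 10]],
                        "prices"];

// Evaluates an expression in an environment, a plain object that maps
// symbol names to values.
function evaluate(expr, env) {
  if (typeof expr === "string") {
    return env[expr]; // symbol lookup
  }
  if (!Array.isArray(expr)) {
    return expr; // numbers evaluate to themselves
  }
  const [head, ...rest] = expr;
  if (head === "lambda") {
    const [params, body] = rest;
    return (...args) => {
      // Bind the parameters in a new environment that inherits from
      // the enclosing one.
      const localEnv = Object.create(env);
      params.forEach((param, i) => { localEnv[param] = args[i]; });
      return evaluate(body, localEnv);
    };
  }
  // Everything else is a function call: evaluate head and arguments.
  const fn = evaluate(head, env);
  return fn(...rest.map((arg) => evaluate(arg, env)));
}

// Hypothetical environment for this example.
const env = {
  filter: (predicate, list) => list.filter(predicate),
  lt: (a, b) => a < b,
  not: (value) => !value,
  prices: [5, 12, 8, 20],
};

const cheap = evaluate(code, env);

// Since the code is plain data, transforming it is ordinary array
// manipulation. This wraps the body of a lambda in (not ...).
function negateLambda([lambda, params, body]) {
  return [lambda, params, ["not", body]];
}

const [fn, predicate, list] = code;
const expensive = evaluate([fn, negateLambda(predicate), list], env);
```

The interesting part is negateLambda: it changes what the program does without parsing a single character of source text.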

So if the parentheses are what’s separating you and Lisp, give it a chance. No one I know who gave Lisp the chance it deserves ever regretted it, whether or not it became their main language.

The Mustache way

Mustache is my favourite template engine.

It’s the only one I know that tries to keep logic out. There is a wee bit of logic, but just the bits without which it wouldn’t make sense to use it.

The traditional approach to templating looks like this:

  1. Have a model of the data relevant for a view
  2. Insert bits of that model into the view

The problem with this is that the second step is quite complicated. In most cases, you’re not rendering the model 1:1 onto the view, you convert data types, format dates, bring some bits of data together and separate others.

There are many powerful template engines that are very good at that. Some allow you to insert code, others have their own expression language. Some have mechanisms for extending the template language so you can add all the features you need.

When people who are used to that see Mustache, they usually want to add features, or they simply dismiss it as not being powerful enough.

Here’s the thing: This is code. Why do it in the template at all costs? Why not do this in actual code? Why not use your preferred language?

This is my approach to templating with Mustache:

  1. Have a model of the data relevant for a view
  2. Prepare it for the template (I usually copy and adjust the model)
  3. Render the prepared model 1:1 on the view

With this, you have the view logic in your code and your templates stay clean. How it ought to be, if you ask me.
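The three steps could be sketched like this in JavaScript. The model, prepareForView and renderVariables are all made up for this example. In particular, renderVariables only substitutes plain {{name}} variables and is not a Mustache implementation; it merely stands in for a real Mustache library to keep the sketch self-contained.

```javascript
// Step 1: the model, as it might come out of the application
// (hypothetical data).
const model = {
  title: "Hello",
  published: new Date(Date.UTC(2012, 0, 15)),
  tags: ["programming", "style"],
};

// Step 2: prepare a copy of the model for the template. All view logic
// (formatting dates, joining lists, ...) lives here, in actual code.
function prepareForView(model) {
  return {
    title: model.title,
    published: model.published.toISOString().slice(0, 10),
    tags: model.tags.join(", "),
  };
}

// Step 3: render the prepared model 1:1. This stand-in only replaces
// {{name}} variables and does no escaping; a real Mustache library
// would take its place.
function renderVariables(template, view) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => view[name]);
}

const template = "{{title}} ({{published}}), tagged {{tags}}";
const output = renderVariables(template, prepareForView(model));
```

Note that the template needs nothing but variable substitution, because the date formatting and the joining of tags already happened in prepareForView.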

JVM language popularity

Lately, I was interested in how the major JVM languages compare in popularity, so I did some quick tests.

I compared Java, Scala, Groovy, Clojure and JRuby. I included both JRuby and Ruby in my queries, because JRuby isn’t really a distinct language.

The tests

Google

Quite obvious, eh? I searched for “x language” where x is one of the languages and wrote down the number of results. I’m fully aware that this isn’t a very good test.

Ruby 97,400,000
Java 46,200,000
Scala 29,200,000
Groovy 17,700,000
Clojure 3,460,000
JRuby 1,770,000

I thought Java would come out on top, so this surprised me.

Tiobe

The good old programming language popularity index.

Java #1
Ruby #13
Groovy #31
Clojure >#50
Scala >#50
JRuby Not listed

I thought Scala would do way better than Groovy.

GitHub

The most popular project hosting service.

Ruby #2
Java #5
Scala #18
Clojure #21
Groovy #22
JRuby Not listed

Scala, Clojure and Groovy are pretty close here.

StackOverflow

Probably the most important Q/A site for programmers.

Java 218,432
Ruby 41,435
Scala 8,104
Groovy 3,772
Clojure 2,762
JRuby 1,051

Java and Ruby are quite popular, the others less so.

Conclusion

Unsurprisingly, Java is by far the most popular language. So if alternative JVM languages are the future, the future doesn’t seem to be quite here yet.

The second place goes to Ruby. Ruby, not JRuby – it’s hard to figure out what percentage of the Ruby community is using JRuby.

Scala, Groovy and Clojure are similar in popularity. Sometimes Scala is on top, sometimes Groovy. Nonetheless, I’m actually most impressed by Clojure. It did pretty well, considering that it’s radically different to Java/Groovy/Scala and only 5 years old. (Groovy and Scala are both 9 years old, Java and Ruby both 18.)

Bottom line: When considering which JVM language (other than Java) to use, popularity can’t really be a factor. That’s good.

Clostache 1.0 – now spec compliant

I normally merely tweet about new releases of my pet projects, but Clostache, a Mustache parser for Clojure, is hitting 1.0, which I think warrants a blog post.

I changed quite a bit, in order to achieve compliance with the Mustache spec. The spec wasn’t around when I first wrote Clostache, and I learned about its existence just a few weeks ago from someone ranting on Twitter. Having a non-compliant Mustache parser didn’t feel right, so I changed that.

I had to fix a few bugs (mostly whitespace issues that don’t matter when producing HTML) and implement two features I never needed and thus ignored: partials and set delimiters. I also had to implement two features I didn’t know about (nor do they seem to be documented anywhere): dotted names and implicit iterators. See the README for examples. I kept the parser core mostly intact because it’s stable and mature by now, but some of the features would have been easier to implement with imperative parsing logic. It would also be faster, but Clostache has no performance issues, so that reason isn’t good enough either.

Anyway, if you’re using Clostache, please update to 1.0 and let me know if you run into any problems.

Make Emacs evaluate Clojure in 5 minutes

Emacs is my editor of choice for Clojure development (as for all Lisps), and according to the State of Clojure 2011 survey, that’s true for 68% of all Clojure developers.

Yet from what I’ve seen, some Emacs-using Clojure developers don’t evaluate their code in Emacs, and the survey shows that 20% of all Clojure developers use the command-line REPL. Read on to figure out how to change that in just about 5 minutes.

“But why would I want to evaluate Clojure in Emacs?”, you might ask. Here’s why I can’t live without it: When I was new to Lisp, I missed the step-by-step debugger I knew from imperative languages. That changed when I started to write Emacs Lisp and got used to evaluating code in the editor. In functional programming, you avoid mutable state, especially global mutable state, so you can normally locate and fix bugs by evaluating suspicious parts of your program in isolation. Proper runtime inspection is still useful and necessary in some cases, and there are tools for that, but evaluating code in Emacs has been sufficient for me so far.

While most Clojurians seem to swear by SLIME to achieve this, I’m not particularly comfortable with it. It’s big, annoying to set up, and there has been no release or tag to date, so you have to get the latest code from CVS. It’s nothing you’re likely to get up and running in 5 minutes. If you’ve got some free time, do try SLIME, you might like it. If you, like me, would rather avoid it: Read on to learn how to set up inferior-lisp-mode (where inferior-lisp stands for Lisps other than Emacs Lisp) for Clojure.

Setting up inferior-lisp-mode

All you really need to do is set the variable inferior-lisp-program to the command that invokes the Clojure REPL.

Leiningen

If you use Leiningen for all projects, add the following to your init.el:

(add-hook 'clojure-mode-hook
          (lambda ()
            (setq inferior-lisp-program "lein repl")))

Maven

For Maven users, it takes a bit more effort. The inferior-lisp command will execute the configured command in the same directory as the file you are currently visiting, which is probably not the directory where your Maven pom.xml resides. Maven, however, wants you to execute most commands in that directory. To circumvent this, I wrote a wrapper script that locates the pom.xml before invoking Maven. Armed with that, you can set up inferior-lisp-program as follows:

(add-hook 'clojure-mode-hook
          (lambda ()
            (setq inferior-lisp-program "smvn clojure:repl")))

Both Leiningen and Maven

When you are working with both Leiningen and Maven projects (like I do), things get a little more complicated. I thought about writing an Emacs Lisp function that automatically discovers whether a project is a Leiningen or a Maven project and invokes the respective command, but decided to shave that yak later. Instead, I use Emacs’ dir local variables to set the inferior-lisp-program for each project. For a Leiningen project, just add a .dir-locals.el file with the following content to your project’s root directory:

((clojure-mode . ((inferior-lisp-program . "lein repl"))))

For Maven projects, do the same with the following content:

((clojure-mode . ((inferior-lisp-program . "smvn clojure:repl"))))

And to eliminate the annoying warning that pops up whenever the dir locals are set, add the following to your init.el:

(add-hook 'clojure-mode-hook
          (lambda ()
            (setq safe-local-variable-values
                  '((inferior-lisp-program . "lein repl")
                    (inferior-lisp-program . "smvn clojure:repl")))))

Using inferior-lisp-mode

Now that you’ve set up inferior-lisp-mode, you can just open a Clojure file and type: M-x inferior-lisp, which starts a Clojure REPL in the current Emacs window. You can then use the REPL conveniently from within Emacs, and evaluate s-expressions in your code by placing the cursor directly after a closing parenthesis and pressing C-x C-e.

That’s it, have fun 🙂