Regular expressions

Earlier this month, I expressed my astonishment that the majority of software developers I’ve worked with in the last seven years don’t know the first thing about regular expressions:

fhd: It's amazing how many actual developers don't know regular expressions. It's like carpenters who don't know about hammers.

As you might have guessed, I regard regular expressions as a fundamental element of every programmer’s toolbox. However, I’m not good with metaphors, and I don’t know the first thing about carpentry, so the hammer missed the point. Thomas Ferris Nicolaisen found a better analogy:

tfnico: @fhd I would rather say it's like carpenters who don't know circle-saws ;)

He’s right: regular expressions are a specialised way to work with text, mostly relevant to programmers, not to everyone who works with text in general.

Most of the other replies I got came from people who knew (or had once known) regular expressions but rarely used them nowadays. I think that’s a shame, so I decided to share what I do with regular expressions on a daily basis. Maybe you’ll find it useful. I do use them in code occasionally, but what I do all the time, whether I use an editor or an IDE, is searching and replacing. If you’re not at all familiar with regular expressions, I suggest this reference to make sense of the remainder of this post.

Searching

I sometimes mention that I grew up with UNIX and that’s true. One of the first things I learned about programming was how to use the tools of the Linux command-line, like grep, which is a command that allows you to search the contents of one or more files with a regular expression.

I can’t come up with a convincing example because I mostly use regular expression searching in conjunction with replacing, rarely alone. But imagine you’re trying to search for a specific string in JavaScript and have forgotten which string delimiter (' or ") you used. Here’s the grep command:

grep -R "[\"']Some string[\"']" /path/to/your/webapp

Naturally, you don’t have to grow a beard and become a CLI geek to harness the power of regular expression searching. Here’s how you do the exact same thing in Eclipse:

[Screenshot: Regular expression search in Eclipse]
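The same character class also works when you search from code. Here is a minimal sketch in Java; the class name QuoteSearch, the method containsQuoted() and the sample lines are made up for illustration. Like the grep command, it accepts either quote character on each side, so it would also match a mismatched pair.

```java
import java.util.regex.Pattern;

public class QuoteSearch {
    // Same idea as the grep command: either quote character,
    // then the text, then either quote character again.
    private static final Pattern QUOTED =
            Pattern.compile("[\"']Some string[\"']");

    // Returns true if the line contains the text wrapped in ' or ".
    static boolean containsQuoted(String line) {
        return QUOTED.matcher(line).find();
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, not taken from a real project.
        String[] lines = {
            "var a = 'Some string';",
            "var b = \"Some string\";",
            "var c = Some string;"
        };
        for (String line : lines) {
            System.out.println(containsQuoted(line) + "  " + line);
        }
    }
}
```

Only the first two sample lines contain the quoted string, so only those are reported as matches.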

Replacing

As mentioned above, I use regular expressions mostly for searching and replacing, a very powerful technique that has saved me countless hours of mind-numbing, repetitive typing. Have you ever heard a co-worker make the same keyboard sound many times in a row? Like moving the cursor to the next line, placing it at the beginning and pressing CTRL+V? I’m a lazy person, and I can’t stand repetitive typing tasks. Fortunately, you can avoid the majority of these with regular expressions.

Here’s an example of how regular expression search and replace speeds up refactoring. We had a whole lot of test cases that looked like this:

assertThat(RomanNumerals.convert(1), is("I"));
...
assertThat(RomanNumerals.convert(5), is("V"));
...
assertThat(RomanNumerals.convert(10), is("X"));

Too much duplication, so we created a method assertRomanNumeralEquals() to get rid of that:

private static void assertRomanNumeralEquals(String roman, int arab) {
    assertThat(RomanNumerals.convert(arab), is(roman));
}

Eclipse was able to extract the method for us, but it wasn’t able to make all the assertThat() invocations use the new method instead. So that’s where regular expression replacement comes in handy even in a sophisticated IDE. I replaced the following expression:

assertThat\(RomanNumerals.convert\((.*)\),\ is\((".*")\)\);

With this:

assertRomanNumeralEquals(\2, \1);

This is how it looks in Eclipse (select the lines to which you want to apply this before opening the find/replace dialog):

[Screenshot: Regular expression search and replace in Eclipse]

The expression might look a bit intimidating if you’re not used to regular expressions, but with a little practice you will be able to write something like this in no time.

In case you’re wondering, this is also possible on the command-line, with the sed command.
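The same replacement can also be scripted. Here is a minimal sketch in Java, assuming the test source is available as a string; the class name RomanAssertRewrite and the method rewrite() are invented for this example. Two details differ from the expressions shown above: every backslash has to be doubled inside a Java string literal, and Java's replaceAll() refers to captured groups as $1 and $2 rather than \1 and \2. I also escaped the dot in RomanNumerals.convert, which is stricter but not required for the replacement to work.

```java
import java.util.regex.Pattern;

public class RomanAssertRewrite {
    // Group 1 captures the number, group 2 the quoted Roman numeral.
    private static final Pattern OLD_ASSERT = Pattern.compile(
            "assertThat\\(RomanNumerals\\.convert\\((.*)\\), is\\((\".*\")\\)\\);");

    // Swap the two groups so the Roman numeral comes first.
    private static final String NEW_ASSERT = "assertRomanNumeralEquals($2, $1);";

    // Rewrites every matching assertion; other text is left untouched.
    static String rewrite(String source) {
        return OLD_ASSERT.matcher(source).replaceAll(NEW_ASSERT);
    }

    public static void main(String[] args) {
        String before =
                "assertThat(RomanNumerals.convert(1), is(\"I\"));\n"
              + "assertThat(RomanNumerals.convert(5), is(\"V\"));";
        System.out.println(rewrite(before));
    }
}
```

Because the dot does not match line breaks by default, each assertion is matched on its own line, so the greedy .* cannot run into the next statement.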

Conclusion

Regular expressions are a powerful tool for processing and editing text, automatically or interactively. If you use them habitually, you will have learned something for life, because every reasonable editor and IDE supports them. However, regular expressions are not standardised, so there are slight differences between Perl, Java etc. You might have noticed that there are also some minor differences between grep and Eclipse in the first example above. This occasionally causes some brief confusion, but it has never noticeably hurt my productivity.

Speaking of productivity: although regular expressions will probably not make you write code faster, they can significantly speed up refactoring, a task I find myself working on most of the time. How much time do you spend actually writing code down? And how much time do you spend editing existing code? I think the ratio is at least 1:10 in my case. If you are able to refactor fast, you will refactor more often, which is likely to improve the design and maintainability of your code.

If you, however, decide to ignore regular expressions until you find a situation in which you really need them (that might never happen, since you can always find a workaround), you are entering a vicious circle: you are not very familiar with them, so when you are faced with a problem, they don’t come to mind and you don’t use them. If you don’t use them regularly, you will never become familiar with them. Searching and replacing is an ideal way to break that circle, so I suggest you try it.
