by Nick
Friday, June 23, 2006 10:49 AM
As I'm prone to do from time to time... here is another programming rant. One of the books on my shelf at work is Mastering Regular Expressions from O'Reilly. It's a very good book on the topic, and if you're interested in learning regular expressions, I highly recommend it. Recently a coworker came into my cube, noticed the book and said "Wow, you have a regular expressions book. I really want to learn about them, because they look so great!" Her voice was filled with excitement. I had to be honest, but I felt bad killing that excitement when I answered, "They're not. They're evil. They should be used as little as possible."
Regular expressions are very versatile and very powerful, and they can do an amazing number of things. Still, if I had my way, I'd never use them. From my experience in industry, the programmer who likes regular expressions is generally also the type of programmer who never comments their code. They're also the type of programmer who tries to combine as many operations as possible into one statement. They like compact. Compact looks cool... but compact can't be understood easily.
I think regular expressions are evil because, unless you are a true regular expressions expert (and few really exist), you can't look at a regular expression and quickly say what it does. It's a weird black box of strange characters strung together that's difficult to interpret. It probably works, and it probably does the job very well. But when you come back to that section of code in three months, will you be able to say what it does? Will someone else who didn't write that expression be able to come in tomorrow and quickly know what it does? Or will they have to spend an hour parsing through it to determine its purpose?
I do use regular expressions sparingly... but I comment the hell out of them. I tend to write three times as many comment lines for an embedded regular expression as I do for normal code. You have to if you want to know what you did a month later. I say regular expressions are evil because my goal is always to write code that someone can come in, look at, and understand without having to tear their hair out.
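To make that concrete, here is a minimal sketch of what heavy commenting can look like. The language (Python), the phone-number pattern, and the names PHONE_RE and parse_phone are all assumptions made up for illustration; nothing here comes from real project code. It relies on Python's re.VERBOSE flag, which lets a pattern be spread over several lines with a comment beside each piece.

```python
import re

# Hypothetical example: recognize a phone number written like 555-123-4567.
# re.VERBOSE makes the regex engine ignore unescaped whitespace and treat
# everything after '#' as a comment, so each piece can be explained in place.
PHONE_RE = re.compile(
    r"""
    (\d{3})     # group 1: three-digit area code
    [-.\s]      # one separator: dash, dot, or whitespace
    (\d{3})     # group 2: three-digit exchange
    [-.\s]      # one separator again
    (\d{4})     # group 3: four-digit line number
    """,
    re.VERBOSE,
)


def parse_phone(text):
    """Return (area, exchange, line) as strings, or None if text doesn't match."""
    # fullmatch requires the whole string to fit the pattern, so no ^ or $
    # anchors are needed inside the expression itself.
    match = PHONE_RE.fullmatch(text)
    return match.groups() if match else None
```

Calling parse_phone("555-123-4567") gives back the three groups of digits, while a string with no separators, such as "5551234567", gives None. The pattern is no smarter than the one-line version; the only difference is that the next person can read it.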
I don't always succeed in that goal, but at least I try.