Monday, March 29, 2010

Regular Expressions

I’m sure I’ve mentioned regular expressions (regex) on this blog before. I love ’em. (Note: If you’re not a computer nerd, you don’t need to know what regular expressions are, and can ignore this post. If you are a computer nerd, in any area of computer science, you definitely should know what regular expressions are. But… you can probably still skip this post.) Such a powerful technology, and it’s already built into most programming environments. (Or even the command line, if you use any operating system other than Windows. grep, anyone?)

However, much as I enjoy the power of regex, there is no doubt that the syntax is a little… opaque. For example, suppose you want to validate that an email address is in a “correct” format. You could write some code that does the following:

  • check for the presence of the @ character (there should be one and only one)
  • see if there are any dots (and whether or not those dots occur before or after the @, because there may or may not be some before, but there has to be at least one after—but a dot can’t be the last character)
  • Check for special characters like the dash, and make sure it doesn’t come right before the @, or right before a dot. (It can exist, it just can’t exist in those special spots. e.g. you can have serna-ferna@somewhere.com but you can’t have sernaferna-@somewhere.com or sernaferna@somewhere-.com.)
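
To make the comparison concrete, here is a rough sketch of what those hand-written checks might look like in Java. The class and method names (ManualEmailCheck, looksValid) are made up for illustration, and it only covers the three rules in the list above, so treat it as a sketch rather than a real validator.

```java
class ManualEmailCheck {
    // Hypothetical helper: applies only the three rules from the list above.
    static boolean looksValid(String address) {
        // Rule 1: there should be one and only one @ character.
        int at = address.indexOf('@');
        if (at < 0 || at != address.lastIndexOf('@')) {
            return false;
        }
        // Rule 2: at least one dot after the @, and a dot can't be the last character.
        if (address.indexOf('.', at + 1) < 0 || address.endsWith(".")) {
            return false;
        }
        // Rule 3: a dash can't come right before the @ or right before a dot.
        if (address.contains("-@") || address.contains("-.")) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {
            "serna-ferna@somewhere.com",
            "sernaferna-@somewhere.com",
            "sernaferna@somewhere-.com"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + looksValid(sample));
        }
    }
}
```

Even this small subset takes a handful of separate checks, and every new rule means another branch.
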
And there are various other rules you might need to check. You’d probably need one or more lines of code to check each of these rules. Or you can just validate the address against a single regular expression, in one fell swoop. One line of code (in most programming environments), and you can do some very complex pattern matching.

For example, in Java, assuming we have a string called emailAddress with the address we want to validate, and a string called EMAIL_REGEX_STRING with our regular expression, we could do the following:
if(!emailAddress.matches(EMAIL_REGEX_STRING)) {
  // handle error
}
From a coding perspective, this is a lot simpler. With one line of code we can validate that email address, and the validation can be as complex as we want it to be. The regular expression can include all of the rules mentioned above, and more, all in one string.
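
Here is a slightly fuller sketch of the same idea, in case you want something you can paste and run. The pattern below is a deliberately simple stand-in (something, then @, then something with a dot in it), not a real email rule set, and the class name EmailValidator is just an assumption for the example. It also compiles the pattern once up front, since String.matches compiles the expression again on every call.

```java
import java.util.regex.Pattern;

class EmailValidator {
    // Stand-in pattern for illustration only; swap in whatever expression you actually use.
    static final String EMAIL_REGEX_STRING = "[^@\\s]+@[^@\\s]+\\.[^@\\s]+";

    // Compiled once and reused, instead of recompiling on each String.matches call.
    private static final Pattern EMAIL_PATTERN = Pattern.compile(EMAIL_REGEX_STRING);

    static boolean isValid(String emailAddress) {
        // matches() requires the whole string to fit the pattern, not just a part of it.
        return emailAddress != null && EMAIL_PATTERN.matcher(emailAddress).matches();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"someone@somewhere.com", "not-an-address"}) {
            if (!isValid(candidate)) {
                // handle error
                System.out.println("Rejected: " + candidate);
            }
        }
    }
}
```
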

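I bring this up because I was given just such an expression today, to validate an email address. It is meant to check all of the rules mentioned above (although, as written, it allows a dash anywhere in the part before the @, including right before the @). Unfortunately, it looks like this: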

(?i)^[a-z0-9`!#\$%&\*\+\/=\?\^\'\-_]+((\.)+[a-z0-9`!#\$%&\*\+\/=\?\^\'\-_]+)*@([a-z0-9]+([\-][a-z0-9])*)+([\.]([a-z0-9]+([\-][a-z0-9])*)+)+$

Wow. Not so readable, eh? Just to understand it, I had to try and break it up, piece by piece, and figure out what’s going on. This is the result, with some pseudo comments in there:

(?i)                                // make the regex case-insensitive
^[a-z0-9`!#\$%&\*\+\/=\?\^\'\-_]+ // string must begin with 1 or more of the characters between the [ and ]
( // next section...
(\.)+ // if there are one or more dots...
[a-z0-9`!#\$%&\*\+\/=\?\^\'\-_]+ // must be followed by one or more of the characters between the [ and ]
)* // ... section happens 0 or more times
@ // followed by an @ symbol
( // next section...
[a-z0-9]+ // one or more characters of a-z or 0-9
([\-][a-z0-9])* // optionally followed by a dash plus one a-z or 0-9 character, 0 or more times
)+ // ... section happens 1 or more times
( // next section...
[\.] // a dot
( // followed by...
[a-z0-9]+ // 1 or more a-z or 0-9 characters
([\-][a-z0-9])* // optionally followed by a dash plus one a-z or 0-9 character, 0 or more times
)+ // ... 1 or more times
)+ // ... section happens 1 or more times
$ // must end here


Still pretty bad. It’s no wonder that people take a look at regex syntax and decide they don’t have the time to learn it.
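
One way to keep a breakdown like that next to the code, rather than in a blog post, is to assemble the expression from smaller named pieces and put ordinary Java comments beside each one. This is only a sketch: the names LOCAL_CHARS and LABEL and the class name are mine, not part of the expression I was given, but the assembled string should come out identical to the one-line version.

```java
class CommentedEmailRegex {
    // One or more of the characters allowed before the @.
    static final String LOCAL_CHARS = "[a-z0-9`!#\\$%&\\*\\+\\/=\\?\\^\\'\\-_]+";

    // Letters/digits, optionally followed by (dash + one letter/digit) pairs.
    static final String LABEL = "[a-z0-9]+([\\-][a-z0-9])*";

    static final String EMAIL_REGEX_STRING =
          "(?i)"                              // make the regex case-insensitive
        + "^" + LOCAL_CHARS                   // must begin with 1 or more allowed characters
        + "((\\.)+" + LOCAL_CHARS + ")*"      // then 0 or more groups of dot(s) + allowed characters
        + "@"                                 // followed by an @ symbol
        + "(" + LABEL + ")+"                  // first part of the domain, 1 or more times
        + "([\\.](" + LABEL + ")+)+"          // then 1 or more groups of a dot + another part
        + "$";                                // must end here

    public static void main(String[] args) {
        // Prints the assembled expression so it can be compared with the original.
        System.out.println(EMAIL_REGEX_STRING);
    }
}
```

It is still not pretty, but at least each piece has a name and a comment that travels with it.
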

The worst part is, I think there are some mistakes in this expression, but I can’t even be sure! Can you really have a ` character or a dollar sign or an ampersand in an email address?!? Or am I even reading that right?
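
For what it’s worth, RFC 5322 does list the backtick, the dollar sign and the ampersand among the characters allowed before the @, so that part is not necessarily a mistake. A quick way to check a reading of the expression is to throw a few addresses at it and see what comes back. The sketch below does that; the class name EmailRegexProbe and the sample addresses are my own assumptions, chosen to poke at the rules listed earlier.

```java
import java.util.regex.Pattern;

class EmailRegexProbe {
    // The expression from this post, escaped for a Java string literal.
    static final Pattern EMAIL = Pattern.compile(
        "(?i)^[a-z0-9`!#\\$%&\\*\\+\\/=\\?\\^\\'\\-_]+"
        + "((\\.)+[a-z0-9`!#\\$%&\\*\\+\\/=\\?\\^\\'\\-_]+)*"
        + "@([a-z0-9]+([\\-][a-z0-9])*)+"
        + "([\\.]([a-z0-9]+([\\-][a-z0-9])*)+)+$");

    static boolean matches(String address) {
        return EMAIL.matcher(address).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "serna-ferna@somewhere.com",   // dash in the middle of the name
            "sernaferna-@somewhere.com",   // dash right before the @
            "sernaferna@somewhere-.com",   // dash right before a dot in the domain
            "a$b@somewhere.com",           // dollar sign before the @
            "a..b@somewhere.com"           // two dots in a row before the @
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + matches(sample));
        }
    }
}
```

If I am reading the results right, the expression rejects sernaferna@somewhere-.com but accepts sernaferna-@somewhere.com and a..b@somewhere.com, so it seems to be looser about the part before the @ than the rules listed earlier.
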

Saturday, March 27, 2010

Two Videos From Shad

Andrea showed these to me today, and I thought I’d share them with y’all.



Friday, March 26, 2010

Pretty Lattes

For a while, when I was looking for a change, I was using the following display picture in MSN Messenger:

coffee

(It may even have been during Roll Up the Rim to Win season.)

When we were in Ottawa this week, for a short mini vacation, we stopped in at a coffee place that actually decorates their lattes like that.

Latte 01
Latte 02
Latte 03
Latte 04

Google Wave and Novell Pulse

I have long been saying—possibly here but definitely in other circles—that one of the key things that will help Wave technologies take off is when other companies/organizations start incorporating the federation protocol into their services. People aren’t going to log onto Google Wave to communicate for business—or at least, not many of them will—but they will log onto their corporate wave servers to do so. The example I keep using is when Microsoft incorporates the technologies into Exchange Server (if that ever happens—obviously it’s early days).

I was pleasantly surprised to see today that Novell has already started in this direction, with their Novell Pulse product. I hadn’t expected it to start happening this early, but I’m glad it is.

Thursday, March 18, 2010

Gnome 3

For the Linux geeks out there, Gnome 3 has been released, and there is talk of making it the default desktop environment for Ubuntu. You can take a tour of it at the Gnome site, or view the Cheat Sheet to see descriptions of a whole bunch of features. (I first read about it on the Works With U blog, complete with screenshots and a screencast.)

It looks like it has the potential to be either:

  1. Very cool, or
  2. Very annoying, or
  3. Both
It would be nice to see a new paradigm for desktop environments, instead of the old taskbar at the bottom (or top) of the screen with buttons for all of the open windows. So I’m hoping that this will be one of those things that people get used to quickly, and wonder how they ever lived without it. But only time will tell.

Monday, March 15, 2010

Twitter

I haven’t yet become a twit—is that the term for someone who uses Twitter?—so I have avoided saying much on this blog about Twitter. However, Joel mentioned something in the context of a larger post and it made me smile, so I thought I’d share it:

Although I appreciate that many people find Twitter to be valuable, I find it a truly awful way to exchange thoughts and ideas. It creates a mentally stunted world in which the most complicated thought you can think is one sentence long. It’s a cacophony of people shouting their thoughts into the abyss without listening to what anyone else is saying. Logging on gives you a page full of little hand grenades: impossible-to-understand, context-free sentences that take five minutes of research to unravel and which then turn out to be stupid, irrelevant, or pertaining to the television series Battlestar Galactica. I would write an essay describing why Twitter gives me a headache and makes me fear for the future of humanity, but it doesn’t deserve more than 140 characters of explanation, and I’ve already spent 820.

Thursday, March 04, 2010

Joel’s Taking it Offline

I’m not sure if I should actually write this post. Everyone else who reads his blog is probably doing the same thing, and I’ll just be one voice amongst a million. But I’ve already started typing, I might as well finish…

Joel Spolsky has decided to stop blogging at Joel on Software. Which gives me mixed emotions.

On the one hand, it was a great blog, with many great, classic posts on software development. Many of those classic posts became part of the books, Joel on Software and More Joel on Software. I can’t remember how I first came across Joel’s blog, but as soon as I read it, I was hooked.

On the other hand, Joel hasn’t been writing that much on the blog lately anyway. By lately, I mean, oh… a year? Two years? More? The thing is, I started following him too late. However it was that I came across Joel’s blog, it was through some link to one of his classic posts, which I read, and thought, “Ah, now there’s an intelligent post on [whatever it was about].” Then I read his first book, and enjoyed it thoroughly all the way through. So I added his blog to Google Reader, and prepared myself for the brilliance that would inevitably follow. Only to find that he rarely posted, and when he did, it was either a post about Stack Overflow, or a link to his latest column on Inc. Magazine. It turns out that Joel seems to have done most of his great writing before I ever got on the bandwagon. (I did like the articles for Inc., don’t get me wrong.)

So I’m sorry to see him go, and I wish him luck, but at the same time, he’s got good reasons for stopping, and he hasn’t posted as much in recent times so I guess I’m not missing much anyway.

I just wish there was a slew of other bloggers out there for me to read, ready to take his place.

Wednesday, March 03, 2010

Another Wave Concept Video

The last video I posted illustrated how a bot might be put to good use in a wave, along with a clever way that wave technologies could be embedded within a larger business process.

Today I include a video from SAP with a prototype of a system called Gravity, which illustrates a clever way that an extension can be built to integrate Wave with other systems.


I can easily see Sparx Systems coming up with an extension for Enterprise Architect UML models (or similar extensions for Rational Rose or other UML tools), or extensions for embedding Microsoft Office documents/spreadsheets/presentations, or a million other ways of integrating software with Wave through extensions.

Or, to go back to the previous example, Salesforce.com envisioned embedding a wave in their ticketing system, but they could also have gone the other way, and created an extension to embed a trouble ticket into the wave. When you load a wave, the extension could dip into the trouble ticketing system to load the information.