Monday, June 21, 2010

Voice Recognition

I read a nice post on Coding Horror today about voice recognition—and the fact that it’s still not here yet. (And may never come.) I’m not sure what I can add to his post, though, other than just saying “me too” over and over.

Atwood mentions the Google app you can get for your iPhone, and its voice recognition feature. My experience has been slightly different from his (I’ve found that it works remarkably well), but at the same time, I think of it more as a novelty than a really useful thing. It’s fun to pull it out and speak into it, and have it automatically perform a Google search for me, especially if I want to show it off to others, but when I want to actually search for something… I pull up Safari, and type it into the Google search text box. (If I had my dream phone, I’d just do the search right from the phone’s “desktop” and not even pull up the browser until I found a result I needed!) I use Safari instead of the Google app with its voice recognition for a couple of reasons:

  1. It’s faster. Safari loads faster than the Google app on my iPhone, and if I find the result I want, it’s going to end up loading Safari anyway. So it’s much faster to just start with Safari in the first place, and cut out the middle man.
  2. It’s more reliable. When the voice recognition gets it wrong, you have to do the search over and over, vs. just typing it in and getting it right the first time.

Plus there’s the whole speaking-at-your-phone thing. Aside from the coolness factor, is there actually any benefit to saying your Google search at your phone, instead of typing it in? Any benefit whatsoever?

I was also right there with Atwood when it comes to dictation. He mentions that someone had had the idea of having him and Joel Spolsky use voice recognition software to transcribe their podcast, and I was thinking of when I was looking into doing something similar for our church. We were going to start putting our sermons online, and I was thinking that having a textual version of the sermon would be very handy for things like Google searches, so I was playing around with Microsoft Word’s speech recognition. Which, again, is very good. But… not good enough, it turns out. In fact, when I ran some tests using Microsoft Word, one word that it could just never get right was “verse”. Imagine trying to transcribe a sermon without using the word “verse”! (To get a feel for why this is important, go back through any of the sermons we’ve got online, and see how often the pastor is referring to this or that verse, as he refers to passage after passage.) It’s possible that the speech recognition might have done a good job, and I’d just have to go through and correct it, but I’m with Atwood on this one, too:
Maybe it’s just me, but the friction of the huge error rate inherent in the machine transcript seems far more intimidating than a blank slate human transcription. The humans may not be particularly efficient, but they all add value along the way—collective human judgment can editorially improve the transcript, by removing all the duplication, repetition, and “ums” of a literal, by-the-book transcription.
I actually approached it very optimistically, but in my testing quickly came away with the idea that it wouldn’t work out well in practice. (What we ended up doing is coupling the pastor’s sermon notes with the audio for the sermon. It’s not the best solution—pastors often end up straying from their notes, so the notes won’t always match up with the actual sermon—but I think it’s a good compromise.)

So even though I sometimes find the iPhone’s text input kind of annoying, I’ll still choose it over the Google voice recognition any day. And do—every day.

Tuesday, June 01, 2010

Incentivizing Work

I read an article by Joel Spolsky in which he was talking about gaming incentive plans. If you haven’t read it, go and check it out.

Joel was looking at one side of the incentive problem: If you try to get a result by incentivizing people, they’ll learn how to get the incentives, regardless of whether they’ve actually produced the result; they learn to game the system. But what about the other side: What if people didn’t game the system? What if they really did try to produce the result you were looking for? And what if you promised them that the quicker they produced it, the higher the reward? That is essentially the free market system in action, right?

Actually, it turns out that doesn’t work either. It’s not just that people will game the system and work around the incentive model you’re creating, it’s worse than that: the incentive model doesn’t work in the first place. That is the subject of a talk by a guy named Daniel H. Pink. I found out about him from a post on Coding Horror; if you head on over to that post, you’ll see two videos from Daniel:

  1. The first one is from TED, and is longer, but more informative.
    • As a side note, I just recently heard about TED, and I’ve found a lot of interesting talks on a variety of topics. After you’ve seen one or both of these Daniel H. Pink videos, feel free to browse around the TED website for other talks. I’m sure you’ll find something interesting.
  2. The second is a shortened version, which is sort of a summary of the things said in the first one (but has prettier pictures).