Review: ViaVoice Millennium Edition
Developer: IBM Corporation
Price: $79.95 (street)
Requirements: G3- or G4-based Mac, 48 MB of RAM, 200 MB available disk space, audio input jack compatible with Andrea NC-71 microphone.
“When Did Retesting…”
Hi folks, don’t pay the ransom, I escaped.
I know, you’ve not heard from me since ATPM 6.04 (which I said the old-fashioned way, “six point zero four,” rather than the modern/trendy “dot” term, which leads to ViaVoice misquotes such as “six not 04” and “six dots 04,” and the literally correct but not what I wanted “6 dot 04”), but life is what happens when you’re making other plans. For seven weeks I was traveling so light I was (shudder) UNABLE TO TAKE MY iMAC, hence this two-month-late conclusion to this review.
So…I’ve given ViaVoice a month of regular use, and become a moderately competent user thereof, and now it’s time to tell you if you should spend your time and money to do the same.
And the answer is (may I have the envelope, please? riiip)…it depends a lot on your needs.
ViaVoice is a speech recognition program that takes your spoken word and puts it on the screen for you, all typed out and ready for cut-and-paste into the text program of your choice. It works well. In fact, I’d say it works well enough. It does not work great, at least not for a speaker of my skills on an early (333 MHz) iMac, but I’ve seen a professional do a ViaVoice demo on some G4 hot rod, and his output was as fast and accurate as my keyboard work.
As you’d expect from IBM, all the bases are covered. For example, the ViaVoice Setup Assistant uses 3D animation to show you how to put the headphone on your head.
I’m still faster with keyboard and fingers than I am with ViaVoice and my mouth. Still, it is a remarkable product and worthy of attention, and it’s possible we’ll look back on it as the herald to a new paradigm. Because ViaVoice does bring speech recognition to a useful level—it’s inexpensive, it can be mastered by people lacking computer science degrees, and it works—every step forward from here is icing on the cake. Once a technology is made possible, and available to the masses, the reverse of the “dancing dog” effect comes into play—no longer impressed by the fact that the dog dances at all, we want to know why this poodle’s Argentine Tango lacks feeling, and why that collie is still doing the Macarena (“Honestly, it’s almost the 21st Century—did you see that ridiculous collie?”).
Once again, it’s your chance to get in on the ground floor of a computer revolution. Like text editors twenty years ago, desktop publishers fifteen years ago, 3D animators ten years ago, and video editors five years ago, you can get into a new technology while it’s still young and quirky. Using ViaVoice, you’ll get the same sort of amazement from the crowd that you got in ’84 when you changed a whole page from Chicago to Helvetica with a swoop of the mouse.
So despite my modest Good rating from a practical standpoint, ViaVoice has some fringe benefits that put it in a Virtual Very Nice category. It isn’t really Very Nice, but it provides an interesting simulation of Very Nice.
What’s Good Today
Hey, I’m a writer. When stuff happens in my brain (hey, I told you I was a writer, I can turn eloquent phrases like that ’til the cows come home), I plonk away at the keyboard, and the stuff in my brain goes into my Mac’s RAM, where it can go up on the screen, or into the printer, or across the phone line, and perhaps eventually go to a printing press and eventually to the Out Of Print section of Powell’s Book Store. Tah-dah!
But the plonk-away-at-the-keyboard part is a means, not an end. I’ve never thought of myself as a typist, I think of myself as a writer. If all my fingers fell off, I’d keep writing, thanks to ViaVoice. I’d be as good a writer as ever, just a little slower. Actually, as far as output per day goes, I might even be a bit faster, since nobody would expect me to do the dishes any more.
Okay, I’m not as qualified as Callihan to make jokes like that, so seriously, folks, there are a whole bunch of physical challenges in this world. Many of them—from arthritis to quadriplegia, with nine letters left to go—interfere with keyboard use. If there’s some reason you can’t operate a keyboard efficiently, and you have a consistent speaking voice, ViaVoice can do an adequate job of getting your thoughts on the screen.
One such reason you can’t operate a keyboard might be…well, maybe you don’t know how to type. ViaVoice can have you plugging away at an accurate 20 words a minute in a day or two, as opposed to a semester of Typing I (mind you, they didn’t have delete keys when I was in high school), which doesn’t sound like much, but even today there are college kids hunting and pecking their way through term papers who would love to see 20 wpm.
I started typing in ’84 because there was No Other Way. For half the price of a 1984 Macintosh you can now get a blueberry iMac with 1000 times more RAM (I’m including the free 64 MB upgrade available currently from catalog houses) and 15,000 times the storage (a 6 GB hard drive instead of a 400K floppy), and while the clock speed has merely gone from 8 MHz to 350 MHz, there are clocks and there are clocks and there’s far more difference between a 68000 chip and a G3 than clock speed alone would indicate. It barely makes sense to compare the processing speed between the first Mac and the current iMac, because what are you going to compare? How fast Photoshop will perform an Unsharp Mask operation to a full page print quality color photo? Uh, no you’re not. If the software had been written back in ’84 (it hadn’t been), it would have taken hours and hours and over 100 disk swaps.
Yet MacWrite 1.0, running on an original ’84 Macintosh, could process my words as fast as I could type them in. Big deal, right? It was only 30 words per minute (yeah, but highly creative words!), so what’s my point?
What’s Good Coming Up?
Glad you asked. My point is, it’s about time we had a new way of entering text in our machines, because the technology has outrun the human ability to press keys, by a factor of…what? Hundreds at least, maybe thousands. For every moment that my little old iMac is noting I pressed a key, logging my entry, and processing its display on the monitor, it’s probably spending thousands of moments muttering, “Dum de dum, wonder when that guy’s going to press another key so we can get some work done, dum de deedle de dum…” Even if 20 wpm had been the maximum the Mac could have accepted from the keyboard in ’84, it would be good for at least 900 words a minute by now.
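For anyone who wants to check my arithmetic, here is a back-of-the-envelope sketch in Python. The storage, clock, and 20 wpm figures come from this review; the 128 KB of RAM in the original Macintosh and the 64 MB stock configuration of the iMac are assumptions I've supplied, and scaling typing throughput by clock speed alone is a deliberate oversimplification, not a benchmark.

```python
# Rough ratios between a 1984 Macintosh and the iMac described in the review.
# Values marked "assumption" are not stated in the review text.

ORIGINAL_MAC_RAM_KB = 128            # assumption: original Macintosh shipped with 128 KB
IMAC_RAM_KB = (64 + 64) * 1024       # assumption: 64 MB stock, plus the free 64 MB upgrade

FLOPPY_KB = 400                      # 400K floppy, per the review
HARD_DRIVE_KB = 6 * 1000 * 1000      # 6 GB hard drive, using decimal units

OLD_CLOCK_MHZ = 8                    # 68000 clock, per the review
NEW_CLOCK_MHZ = 350                  # G3 clock, per the review

BASELINE_WPM = 20                    # hypothetical 1984 keyboard ceiling from the review

ram_ratio = IMAC_RAM_KB / ORIGINAL_MAC_RAM_KB
storage_ratio = HARD_DRIVE_KB / FLOPPY_KB
clock_ratio = NEW_CLOCK_MHZ / OLD_CLOCK_MHZ

# Naive estimate: assume throughput scales with clock speed and nothing else.
scaled_wpm = BASELINE_WPM * clock_ratio

print(f"RAM:     {ram_ratio:,.0f}x")
print(f"Storage: {storage_ratio:,.0f}x")
print(f"Clock:   {clock_ratio:.2f}x")
print(f"Scaled typing ceiling: {scaled_wpm:.0f} wpm")
```

Clock speed alone gets you to 875 words a minute, a shade under 900. Since a G3 does far more per tick than a 68000 did, the real multiplier is higher, which is why I'm comfortable saying "at least."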
So how long before ViaVoice for Macintosh version x.x can process my speech as fast as I can speak it? According to my calculations, not long, and that’s only an estimate.
There’s one big difference between keystroke recognition and voice recognition—keystrokes are not open to interpretation. Did you type A-L-L, or did you type O-I-L? That’s no great challenge to the software. But did you say “all” or did you say “oil”? That depends on where you spent your formative years. I know a Florida gal, and I swear the only reason nobody’s put laundry detergent in her Buick is because of context.
Apparently, ViaVoice also uses context to deal with ambiguities. Unlike the voice recognizers of yore, ViaVoice analyzes whole clumps of words at a time. I am sure it takes a lot of Deep Thought, and all this thinking noticeably slows the output, but the results are quite astounding. For example, ViaVoice successfully interpreted “an early show was held from 1 to 3 for 5 participants who could not attend the evening show,” where a less sophisticated program would have ignored the beginning and the end, and printed “12345” in the middle.
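I have no idea what actually goes on inside ViaVoice, but the general idea of letting neighboring words settle an "all" versus "oil" dispute can be sketched as a toy in Python. Everything here is hypothetical: the word-pair counts are invented, the function names are mine, and scoring adjacent pairs is just one simple way to use context, not IBM's method.

```python
# Toy context scoring: choose between sound-alike transcriptions by how
# often their adjacent word pairs have been seen before.
# The counts below are invented for illustration only.

PAIR_COUNTS = {
    ("change", "the"): 50,
    ("the", "oil"): 30,
    ("the", "all"): 1,
    ("oil", "in"): 20,
    ("all", "in"): 15,
    ("in", "her"): 40,
    ("her", "buick"): 5,
}


def context_score(words):
    """Multiply (count + 1) over adjacent word pairs; unseen pairs contribute 1."""
    score = 1
    for pair in zip(words, words[1:]):
        score *= PAIR_COUNTS.get(pair, 0) + 1
    return score


def pick_transcription(candidates):
    """Return the candidate word list with the highest context score."""
    return max(candidates, key=context_score)


# Two candidate readings of the same ambiguous sound.
candidates = [
    "change the all in her buick".split(),
    "change the oil in her buick".split(),
]

best = pick_transcription(candidates)
print(" ".join(best))
```

With these made-up counts, "the oil" has been seen far more often than "the all," so the Buick gets motor oil rather than detergent. A scheme like this will also cheerfully stitch random noise into plausible-looking word pairs.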
Why is this quirky? Because ViaVoice tries to make sense of everything it hears—even slurred words, even random background noises—so its typos range from wondrous to bizarre. Your spell checker won’t be much good here, because every word, no matter how inappropriate, will be properly spelled. For example, the title of this piece was a genuine attempt at speaking, “One two three testing.” It didn’t come out exactly as I’d intended.
Typos on a keyboard create nonsense words. Typos with primitive voice recognition programs put real words in nonsensical places. Typos with ViaVoice put the ghost of James Joyce in your computer. You can’t trust your spell checker, you can’t trust the grammar checker, you can’t even trust a copy editor. You need to go carefully over your own work or your readers will say, “Jamie’s been writing some strange stuff lately.”
This is important. When your friends come back from their honeymoon, they won’t believe your “typo” excuse if “…united in holy matrimony” gets in the papers as “…one night of unholy matrimony.”
To see just how hard ViaVoice tries to make intelligible prose from a sow’s ear, place the headset by a speaker and put on a favorite music CD. Here’s ViaVoice’s take on Mussorgsky’s “Pictures at an Exhibition,” exactly as transcribed:
The of aha and and and and and a of a harem level and again assume athletic if her hundred home and and and and long her her away if inhale and ratified a if Hanoi-1/5 if handoff with halfway if this oaths one will if but one which punishes fishhook hours all of law
And here’s a snippet from Carmina Burana, one of Orff’s greatest hits, in my humble opinion:
Is it is all the eight B or an bourbon long though a bunny or five House knows of Roger Gardner of a great man takes so in late in I saw somebody what I need to be doing is assuredness it could not get the yen a result of the interesting other walking legs the euro plastic glove was released to put up a possibly happen if if current write looking until your best academic year when the people run wild bird that done a share and its next here at if the king and a technique so it’s so do care
I don’t fully grasp what’s going on, but clearly it is making some complex assumptions about context. Even simple pairs like “walking legs” and “wild bird” go beyond the words-in-a-dice-cup you’d expect from random sound input, and a lot of monkeys could pound a lot of typewriters before “until your best academic year” popped up on the screen. And did you see how it capitalized “Gardner,” presuming it to be Roger’s last name? I expect future versions will get real good at figuring out what you mean to say, particularly if you give it a fair chance and spend some time training it to your own set of quirks.
You load the software off the CD, clicking the usual I Agree and Continue buttons on installer alerts. You plug in the headphone. Are you ready to go? Not quite; this is one program where you have to (shudder, gasp) read the manual. It tells you how to set a software reference for background noise and measure your voice levels, and then the training begins. No, not your training, ViaVoice’s training. You read it a couple of stories in a normal voice, and it follows you along like a preschooler on your lap. When you’re done, it has a pretty good idea of how you talk.
Pretty good, but not perfect. To improve its understanding, you need to correct its mistakes. Highlight the incorrect word (which you can do through voice commands), open the Correction window (with mouse or voice command) and correct the misunderstanding. Next time you say what you said, ViaVoice will get it right.
This provides splendid opportunities for Stupid Mac Tricks, since ViaVoice will respond however you train it. For example, I put on my finest Inspector Clouseau accent and taught my copy of ViaVoice that the sound zee-voo-play is written “if you please,” and if I say karkee-zroo-frax in a thick Boris Badenov accent, instead of reading “car keys roof racks” it prints, “get moose and squirrel.”
Thus if my Intel Inside friends aren’t sufficiently impressed by ViaVoice’s legitimate features, or say, “Yeah, what can it do that Dragon Dictate won’t do?” I say, “It has a five-language translation module. Watch this!
“Translate French to English, zee-voo-play,” I say, and ViaVoice dutifully writes “Translate French to English if you please.”
And the spectators say, “Whoa, how does it do that? My PC can’t do that!”
So I shrug and say, “Yeah, well, it’s the Mac version. IBM had to dumb it down for the PC version so it would run okay on a Pentium. I’ve got mine set up for English, French, Twee, Russian and Farsi. I don’t know any Twee or Farsi, but I’ve picked up a little Russian from some old television shows…”
ViaVoice is good already. It’s more fun than a Furby, plus it’s a practical data entry device if you’ll accept a few limitations. But I can see the writing on the wall. Soon ViaVoice will be earning a Very Nice rating, and computer writers will find their keyboards gathering dust, like all the computer artists who find their watercolor boxes are getting all dried out. And around the time ViaVoice earns an Excellent, keyboards can go the way of hot lead and movable type printing presses. You can wait for ViaVoice to mature, or get it while it’s Good and get cheap upgrades later.