Casper Speaks

Anyone who has been to a Macworld Expo has probably seen the Voice Navigator people demonstrating how easy it is to create their corporate logo with Voice Navigator, even when speaking at the speed of a trained auctioneer. Despite the fact that most of us couldn't give a hoot about creating the Articulate Systems logo and few people even want to talk that fast, it's an impressive demo. Heck, I want to test one.

Apple may have just upped the ante in terms of demos with the demonstration of Casper, the listening Mac. (Sounds a bit like a cross between Casper the friendly ghost and Mr. Ed the talking horse, no?) Casper is essentially some very sophisticated software that allows a Mac to recognize vague commands from almost any speaker. From what I've heard from people who have seen the demos, it really does recognize continuous speech, and can even respond with voice output.

Casper must be doing quite a bit more processing on the incoming voice data than the Voice Navigator does. The Voice Navigator merely matches a voice waveform to an entry in a command dictionary, whereas Casper has several hundred words attached to each command, though I'm not entirely sure how that is set up (partly because Apple isn't saying). In several of the demos, Casper was asked to do relatively complex things like looking up and dialing phone numbers, acting as a voice interface to a VCR, and paying bills electronically. Apparently, in development Casper was fed a large number of sentences spoken by many different people, which avoids the Voice Navigator's requirement of training the software for each individual. Of course, this is a technology demo, which means that it might be a couple of years before it becomes a commercial product, but it's still incredibly promising, especially for everyone who has been lusting after a Star Trek-style communicator and computer interface.

Casper does not require any special hardware, unlike all of the other speech-recognition products on the market for Macs and DOS machines, although the demo was done on a Quadra 900 with a digital signal processing (DSP) chip and a better microphone than currently comes with the Mac. I imagine that the technology could be made to work on a plain 68030, but it might be too slow to really use. Just another incentive to upgrade, I guess, and Apple may very well start building the necessary hardware into upcoming Macs. One place that the speech recognition technology will almost certainly appear is in Apple's Personal Digital Assistants (PDAs), which are reputed to use the RISC technology developed by Advanced RISC Machines Ltd., the British company Apple helped form a while back (see TidBITS-033). Anything with a 3" x 5" screen and no keyboard needs a better method of working with data than a stylus. Casper does not currently do dictation (or Windows, for that matter, but more on that next week!), which will limit its use for actually entering data, but merely being able to recognize commands should be quite useful.

I'm curious to see how Apple will handle the interface to Casper, because if it can recognize any voice, it will have to be able to block out surrounding voices. Data muggers could appear too - people who would make comments over your shoulder to your Mac or PDA running Casper. "Oh, you mean I shouldn't have said 'Erase the hard disk... yes, I'm sure I want to do that' to your Mac? I'm sorry." I'm sure that Apple will work out safeguards for that sort of thing, but it's certainly something to think about. In the meantime, we all have one more future to drool over.

