Speaking Out in Favor of Voice Software

Trends: Recognition programs have cost some people their jobs and helped others keep them--including engineers still searching for the dream machine.


As a legal secretary, Regina Schneider could once type 105 words per minute. But in 1992 she fell victim to an occupational injury: carpal tunnel syndrome, an injury to the wrists increasingly suffered by typists and other keyboard operators.

"Despite surgery, use of my hands was absolutely minimal and typing was out of the question," the Novato, Calif., woman said recently. That seemed to rule out secretarial work or any office job requiring the use of a computer.

Until 1995, that is, when she persuaded her workers' compensation insurer to buy her a $700 version of Kurzweil VOICE, a computer program that can recognize tens of thousands of spoken words and thus enable users to write, calculate and even program computers.

"Technology put me out of work, and now technology is going to put me back to work," said Schneider, who uses a word processor, database and other computer programs in her full-time studies to become a rehabilitation counselor.

People like Schneider who suffer from keyboard-related injuries represent one of the few specialized groups that have benefited so far from voice-recognition software. Others include doctors and lawyers, who use voice-activated programs to give dictation, and Wall Street brokers, who use them to execute trade orders.

But the dream machine--one that can understand spoken English just like a regular person--is still stuck on the drawing board after decades of dedicated research. Most consumers have had to settle for glitch-prone programs that execute a limited number of rudimentary spoken commands.

Building a computer that responds to the sound of the human voice has been the computer industry's goal since before popular culture allowed "Star Trek's" Capt. Kirk and the astronauts of "2001: A Space Odyssey" to speak to their machines. Many in the computer industry see a speech-based interface as the missing link that will convert the last of today's technophobes into computer users.

Optimists confidently predict that a usable voice program will be on the market by about 2000. But substantial hurdles still exist on both the hardware and software sides.

" 'Star Trek' takes place in the 24th century, so we've got a little time yet," said Roger Matus, director of marketing for Dragon Systems, a Newton, Mass., company that makes one of the leading computer-dictation programs.

Speech-recognition systems work in the reverse of a person turning a thought into a sentence. A human being starts with an idea, chooses the words to express it, strings them together and then creates the sounds necessary to utter them.

A computer, by contrast, starts with a "heard" sentence and subjects it to a series of statistical algorithms that model human speech to determine the string of words that was spoken. Finally, it checks the positions of the words in context to confirm that the sentence makes sense according to the rules of English syntax.

While the approach sounds simple, the task itself can be overwhelming: The number of possible 20-word sentences that draw on a 25,000-word vocabulary is expressed as nine followed by 87 zeros. There are additional complications when accents and homonyms--"to," "too" and "two"--are added to the mix.
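The figure above can be checked with a few lines of arithmetic. The sketch below assumes that any of the 25,000 words may fill any of the 20 positions, with repeats allowed; the variable names are chosen here for illustration.

```python
# Count every possible 20-word sequence drawn from a 25,000-word vocabulary,
# assuming each position can hold any word (repeats allowed).
vocabulary_size = 25_000
sentence_length = 20

sentence_count = vocabulary_size ** sentence_length

# Python integers are exact, so the digits can be inspected directly.
digits = str(sentence_count)
leading_digit = digits[0]
zeros_after_leading = len(digits) - 1  # order of magnitude

print(f"about {leading_digit} followed by {zeros_after_leading} zeros")
```

The exact total is roughly 9.09 times 10 to the 87th power, so "nine followed by 87 zeros" is a rounded figure.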

Rather than consider so many possibilities, speech-recognition software makes a series of assumptions and educated guesses to narrow things down. Systems for sale today, promoters say, manage to get it right about 95% of the time.
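One common way to make such educated guesses is to score word pairs by how often they occur together and keep only the few best partial sentences at each step, a technique known as beam search. The toy sketch below illustrates the idea only; it is not how any product named in this article works, and the candidate words, probabilities and function name are all invented for the example.

```python
import math

# Hypothetical candidate words for each "heard" sound; homonyms share a slot.
CANDIDATES = [["I"], ["want", "wont"], ["to", "too", "two"], ["go"]]

# Made-up word-pair (bigram) log-probabilities, for illustration only.
BIGRAM_LOGP = {
    ("<s>", "I"): math.log(0.5),
    ("I", "want"): math.log(0.4),
    ("I", "wont"): math.log(0.001),
    ("want", "to"): math.log(0.6),
    ("want", "too"): math.log(0.01),
    ("want", "two"): math.log(0.05),
    ("to", "go"): math.log(0.3),
    ("too", "go"): math.log(0.001),
    ("two", "go"): math.log(0.001),
}
# Word pairs missing from the table are treated as very unlikely.
UNSEEN_LOGP = math.log(1e-6)


def decode(candidates, beam_width=2):
    """Return the best-scoring word sequence, keeping only
    `beam_width` partial sentences after each slot."""
    beams = [(0.0, ["<s>"])]  # (cumulative log-probability, words so far)
    for options in candidates:
        expanded = [
            (score + BIGRAM_LOGP.get((words[-1], word), UNSEEN_LOGP),
             words + [word])
            for score, words in beams
            for word in options
        ]
        # Narrow things down: discard all but the most likely guesses.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:beam_width]
    best_score, best_words = beams[0]
    return best_words[1:]  # drop the start-of-sentence marker


print(" ".join(decode(CANDIDATES)))
```

With these invented numbers, the decoder settles on "I want to go" after scoring only a handful of partial sentences rather than every combination of homonyms.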

In large measure, that degree of success is the result of the systems' narrow functions. Programs that need to recognize only a limited number of words and phrases are comparatively easy to design and require a relatively small amount of computing power.

Indeed, simple applications of voice recognition have already infiltrated most people's daily lives.

Telephone companies are replacing directory assistance operators with electronic versions that can often find a phone number based on a spoken request. (Human operators are supposed to step in if the computer is confounded.) Other operator-assisted functions are being transferred to computers as well.

"AT&T has a system which handles collect calls automatically using speech recognition," said William Meisel, editor of the Encino-based Speech Recognition Update newsletter. "The system only has to recognize a handful of words, and it's saving them $100 million a year and the need for about 75,000 operators."

With voice-recognition technology, phone customers can place calls by simply picking up the phone and saying, "Call Mom." That capability could be critical for cellular phone companies, since support is growing in several states for laws requiring drivers to dial by voice, rather than push-button, for safety's sake.

Dictation systems that turn spoken words into computer text are gaining popularity as functionality goes up and prices come down. Most dictation systems have been aimed at narrow markets where the vocabulary is limited and the sentence structure is relatively easy to predict.
