At MIT's Computer Science and Artificial Intelligence Laboratory, speech-recognition expert Victor Zue is more interested in smart, voice-responsive computers than in Waibel's Pentium-packed knapsacks.
As part of the Oxygen Project, he is studying speech interfaces that could bring a flood of computerized information to the vast majority of the world that has yet to see PCs and modems.
Someday, Zue predicts, people around the world may carry cheap, souped-up cell phones that will enable them to tap into nearly limitless networked resources.
All the information on the Internet (literature, news, chat rooms, medical research) will then be accessible to anyone who picks up a handset and asks for it.
Before such grand changes can take place, there's a great deal of work to be done. Current systems make a lot of mistakes when they respond to a broad range of speakers.
The problem, Waibel says, is that people speak "dysfluently": They repeat themselves, don't complete sentences and sometimes don't even complete words.
Computer scientists are striving to develop software that can handle "ers," "umms" and fragmentary sentences, then select crucial words from garbled ramblings.
Most voice interfaces also have limited vocabularies, confining themselves to restricted subject areas in which each stage of conversation allows only a few sensible responses.
True conversations with computers may require overhauling the way speech-recognition programs work.
Possible solutions range from new mathematical methods for analyzing spoken sounds to computer programs that draw on vast databases of abstract knowledge about the world.