Background
In theory, speech-to-text would be a natural alternative: one could simply speak into a computer or device and have the text appear on the screen. The speech-to-text problem, acknowledged to be "the holy grail of computing", has historically been plagued by problems including: (a) infinite language perplexity, (b) background and channel noise, (c) varied pronunciations, (d) unacceptable speaker-training methods, and (e) lack of intuitive error correction. In practice, commercial speech recognition has succeeded only in limited command-and-control applications, such as call-center automation, where the lexicon is small and constrained.
Voice Powered Text Prediction™ by TravellingWave
• VoicePredict Multimodal Platform (patent-pending) fuses hand, finger, or stylus input with the microphone's speech input, in real time, to produce voice powered text prediction. If a user decides not to speak, or if background noise is too high for reliable speech recognition, the system automatically falls back to plain text prediction using keypad input, which makes the speech interface close to 100% reliable.
• Frequency Localized Temporal Speech Processing (proprietary) is based on the company's RAGs (Rao-Aronov-Garafutdinov Speech-Processing) algorithm, which in turn is based on published research (1-3) on compact features that model the travelling wave phenomena in the human cochlea. This module extracts modulation information from speech, as opposed to performing traditional power spectrum analysis. This enables the VoicePredict system to function robustly in noisy environments.
• Acoustic and Language Modeling (patent-pending) techniques exploit the multimodal user interface and are optimized for the mobile text input problem.
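The fusion and fallback behavior described in the first bullet can be illustrated with a toy sketch. This is not TravellingWave's algorithm: the lexicon, the `predict` function, the `speech_scores` input, and the `min_snr_db` threshold are all hypothetical names and values chosen for illustration. The sketch ranks words that match the typed prefix by frequency, and reweights them by a per-word speech score only when speech is present and the estimated signal-to-noise ratio is high enough.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexicon with word counts (illustrative values only).
LEXICON: Dict[str, int] = {
    "hello": 50, "help": 30, "held": 10, "hero": 5, "world": 40,
}


def predict(
    prefix: str,
    speech_scores: Optional[Dict[str, float]] = None,
    snr_db: Optional[float] = None,
    min_snr_db: float = 10.0,  # assumed threshold, not from the source
    top_k: int = 3,
) -> List[str]:
    """Rank lexicon words that start with the typed prefix.

    If speech scores are given and the estimated SNR is acceptable, each
    word's frequency prior is multiplied by its speech score. Otherwise
    the function falls back to plain prefix-based text prediction.
    """
    candidates = {w: c for w, c in LEXICON.items() if w.startswith(prefix)}
    total = sum(candidates.values())
    if total == 0:
        return []

    use_speech = (
        speech_scores is not None
        and snr_db is not None
        and snr_db >= min_snr_db
    )

    scored: Dict[str, float] = {}
    for word, count in candidates.items():
        prior = count / total
        if use_speech:
            # Small floor so words the recognizer did not score are
            # demoted rather than eliminated.
            scored[word] = prior * speech_scores.get(word, 1e-3)
        else:
            scored[word] = prior

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda w: (-scored[w], w))[:top_k]
```

With only the typed prefix, `predict("he")` ranks candidates by frequency. Passing `speech_scores={"hero": 0.9, "hello": 0.05}` with `snr_db=20.0` moves "hero" to the top, while the same call with `snr_db=3.0` ignores the speech scores and returns the keypad-only ranking.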
REFERENCES:
(1) Research supported by the National Science Foundation under Small Business Innovation Research Phase-I, Phase-IB, and Phase-II grants; currently active.
(2) "On Decomposing Speech into Modulated Components", Ashwin Rao and Ramdas Kumaresan, IEEE Transactions on Speech and Audio Processing, May 2000.
(3) "Model Based Approach to Envelope and Positive Instantaneous Frequency Estimation of Signals", Ramdas Kumaresan and Ashwin Rao, Journal of the Acoustical Society of America, March 1999.
© 2009 - TravellingWave Inc. All rights reserved.