Speech Recognition HOWTO
5. Speech Recognition Software

5.1. Free Software

Much of the free software listed here is available for download at: http://sunsite.uio.no/pub/Linux/sound/apps/speech/

5.1.11. More Free Software?

If you know of free software that isn't included in the list, please send me a note at: scook@gear21.com. If you're in the mood, you can also tell me where to get a copy of the software and share any impressions you have of it. Thanks!

5.2. Commercial Software

5.2.1. IBM ViaVoice

IBM has made good on its promise to support Linux with its series of ViaVoice products for Linux, though the future of its SDKs isn't set in stone (the licensing agreement for developers hasn't been officially released as of this date - more to come).

Their commercial (non-free) product, IBM ViaVoice Dictation for Linux (available at http://www-4.ibm.com/software/speech/linux/dictation.html), performs very well, but has sizeable system requirements compared to the more basic ASR systems (64 MB RAM and a 233 MHz Pentium). For the $59.95 US price tag you also get an Andrea NC-8 microphone. It also supports multiple users (I haven't tried it with multiple users, so if anyone has any experience please give me a shout). The package includes documentation (PDF), a trainer, the dictation system, and installation scripts. The latest release also supports additional Linux distributions based on 2.2 kernels.

The ASR SDK is available for free, and includes IBM's SMAPI, a grammar API, documentation, and a variety of sample programs. The ViaVoice Run Time Kit provides an ASR engine and data files for dictation functions, plus user utilities. The ViaVoice Command & Control Run Time Kit includes the ASR engine and data files for command and control functions, plus user utilities. The SDK and Kits require 128 MB RAM and a Linux 2.2 or later kernel.
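
If you want to check a machine against the SDK requirements quoted above (128M RAM, Linux 2.2 or better kernel) before downloading anything, a rough sketch in Python might look like the following. This is not part of IBM's tools; the function names are my own, and it assumes a /proc/meminfo that contains a "MemTotal:" line in kB. Note that MemTotal is usually a little lower than the installed RAM, so a machine with exactly 128M fitted may be reported as falling short.

```python
import platform
import re

MIN_KERNEL = (2, 2)  # "Linux 2.2 or better kernel", from the text above
MIN_RAM_MB = 128     # "128M RAM", from the text above


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '2.2.14-5.0'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def parse_meminfo_total_mb(meminfo_text):
    """Return MemTotal in whole MB from /proc/meminfo-style text (kB values)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) // 1024
    raise ValueError("MemTotal line not found")


def meets_sdk_requirements(kernel_release, ram_mb):
    """True if both the kernel version and the RAM figure reach the minimums."""
    kernel_ok = parse_kernel_release(kernel_release) >= MIN_KERNEL
    return kernel_ok and ram_mb >= MIN_RAM_MB


def check_local_machine():
    """Hypothetical helper: read the values from the running Linux system."""
    with open("/proc/meminfo") as fh:
        ram_mb = parse_meminfo_total_mb(fh.read())
    return meets_sdk_requirements(platform.release(), ram_mb)
```

Calling check_local_machine() on the target box returns True or False; the two parsing functions can also be fed values copied from another machine.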

The SDKs and Kits are available for free at: http://www-4.ibm.com/software/speech/dev/sdk_linux.html

5.2.2. Vocalis Speechware

More information on Vocalis and Vocalis Speechware is available at: http://www.vocalisspeechware.com and http://www.vocalis.com.

5.2.3. Babel Technologies

Babel Technologies has a Linux SDK available called Babear. It is a speaker-independent system based on hybrid Hidden Markov Model and Artificial Neural Network technology. They also have a variety of products for text-to-speech, speaker verification, and phoneme analysis. More information is available at: http://www.babeltech.com.

5.2.4. SpeechWorks

I didn't see anything on their website that specifically mentioned Linux, but their "OpenSpeech Recognizer" uses VoiceXML, which is an open standard. More information is available at: http://www.speechworks.com.

5.2.5. Nuance

Nuance offers a speech recognition/natural language product (currently Nuance 8.0) for a variety of *nix platforms. It can handle very large vocabularies and uses a unique distributed architecture for scalability and fault tolerance. More information is available at: http://www.nuance.com.

5.2.6. Abbot/AbbotDemo

Abbot is a very large vocabulary, speaker-independent ASR system. It was originally developed by the Connectionist Speech Group at Cambridge University and was later transferred to SoftSound for commercialization. More information is available at: http://www.softsound.com.

AbbotDemo is a demonstration package of Abbot. It has a vocabulary of about 5000 words and uses the connectionist/HMM continuous speech algorithm. It is distributed without source code.
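
For readers wondering what "connectionist/HMM" means in practice, the usual idea is that a neural network estimates the probability of each HMM state for every audio frame, those posteriors are divided by the state priors to get scaled likelihoods, and an ordinary Viterbi search then picks the best state sequence. The toy sketch below illustrates only that general idea. It is not Abbot's code (no source is distributed), the function name and inputs are made up for illustration, and it assumes every state prior is greater than zero.

```python
import math


def viterbi_hybrid(posteriors, priors, trans, init):
    """Return the most likely state sequence for a hybrid network/HMM model.

    posteriors[t][s]: P(state s | frame t), as a network might output
    priors[s]:        P(state s), assumed > 0 for every state
    trans[i][j]:      P(next state j | current state i)
    init[s]:          P(first state is s)
    """
    n_states = len(priors)

    def log(x):
        return math.log(x) if x > 0 else float("-inf")

    def emit(t, s):
        # Scaled likelihood: P(frame | state) is proportional to
        # P(state | frame) / P(state).
        return log(posteriors[t][s]) - log(priors[s])

    score = [log(init[s]) + emit(0, s) for s in range(n_states)]
    back = []  # back[t-1][j] = best predecessor of state j at frame t
    for t in range(1, len(posteriors)):
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log(trans[i][j]))
            new_score.append(score[best_i] + log(trans[best_i][j]) + emit(t, j))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With two states, a left-to-right transition table, and posteriors that favor state 0 for the first two frames and state 1 for the last two, the function returns the path 0, 0, 1, 1. A real recognizer works with thousands of states and a pronunciation dictionary and language model on top.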

5.2.7. Entropic

The fine people over at Entropic have been bought out by Micro$oft... Their products and support services have all but disappeared. Their support for HTK and ESPS/waves+ is gone, and their future is in the hands of M$. Their old website at http://www.entropic.com has more information.

K.K. Chin advised me that the original developers of HTK (the Speech, Vision and Robotics Group at Cambridge) are still providing support for it. There is also a "free" version available at: http://htk.eng.cam.ac.uk. Also note that Microsoft still owns the copyright to the current HTK code...

5.2.8. More Commercial Products

There are rumors of more commercial ASR products becoming available in the near future (including from L&H). I talked with a couple of L&H representatives at Comdex 2000 (Vegas), and none of them could give me any information on a Linux release, or even say whether they planned on releasing any products for Linux. If you have any further information, please send the details to me at scook@gear21.com.

5.1. Free Software

5.1.1. XVoice

5.1.2. CVoiceControl/kVoiceControl

5.1.3. Open Mind Speech

5.1.4. GVoice

5.1.5. ISIP

5.1.6. CMU Sphinx

5.1.7. Ears

5.1.8. NICO ANN Toolkit

5.1.9. Myers' Hidden Markov Model Software

5.1.10. Jialong He's Speech Recognition Research Tool