Speech recognition software; not for everyone
Any science fiction movie worth its salt features computers or robots that not only recognise human speech but also understand what the speaker means. This isn’t likely to stay fiction for much longer*.
Right now, some of the best brains in the computer industry are working to develop ‘natural language processing’ products. That’s the jargon term for software that turns vocal sounds into meaningful data.
When natural language processing arrives, it will be the biggest breakthrough in the history of computing. Apart from anything else, it will mean we can do away with keyboards and screens – or at least relegate them to occasional use.
But intelligent voice recognition isn’t just about computers. Imagine telephones, TV sets and microwave ovens that understand spoken instructions.
Natural language processing
Natural language processing won’t all arrive at once. We are probably ten, and maybe 15, years away from computers hearing the difference between war and Waugh. And computers capable of meaningful conversation, like Star Wars’ C-3PO, won’t be available for 20 years or more.
The early fruits of language processing have been on the market for years. In 1981, ACT, a now-defunct British computer maker, sold a system that responded to a limited set of voice commands. Today, we have fourth, possibly even fifth, generation voice recognition products that can turn the spoken word into typewritten text – most of the time. You may see them referred to as speech recognition applications.
The slow progress to date has not been due to software issues; it has had more to do with the available computer power. You need a hefty processor to run voice recognition. Make that hefty spare processing capacity, left over after all the fancy graphics and other cycle-chewing work required by modern operating systems.
When they work as advertised, today’s voice recognition products are impressive. It was Arthur C Clarke who wrote that any sufficiently advanced technology is indistinguishable from magic. It’s hard not to believe in the supernatural the first time you see your own speech appear as type on a PC screen.
Speech better for small business
Unlike many computer products on the market today, voice recognition offers more to small businesses than large companies. There are two reasons for this. First, voice recognition requires a degree of effort on the part of the user. Each user has to train the software to understand his or her own voice and speech patterns.
Further refinement is needed over the first few weeks of use. Because of this, it only works well for highly motivated people. Mischievous or reluctant users can make sure their systems never work effectively.
A second reason is that voice recognition products need lots of support. The cost of supporting an individual PC user tends to rise with organisation size. In a big company, the cost overheads of voice recognition can outweigh the productivity gains.
The biggest name in voice recognition is Dragon, part of Nuance. The company’s NaturallySpeaking software comes in a variety of packages, costing from around NZ$200 for a student edition to NZ$1500 for a corporate package. (Prices are in New Zealand dollars; one NZ dollar is worth roughly 50 US cents.)
In addition to a powerful computer, voice recognition systems need a good microphone. Nuance recommends a 2.4GHz Pentium Dual Core; anything less will deliver disappointing results. In theory the software can work with a PC’s internal microphone. In practice it usually isn’t worth the bother.
Memory is less important
Memory isn’t so important if you’re running an older operating system, but if you’re running Windows Vista, you’ll need well over 1GB of RAM. Pretty much any PC sound card will do so long as it can handle recording.
All the commercial specialist voice recognition programs are available in packages that include microphones, usually on headsets. You can also buy digital voice recorders bundled with speech recognition software – these can be great for taking voice notes when you are out and about.
You may already have voice recognition on your computer. The latest versions of Microsoft Windows and Office have baked-in speech recognition. It’s not as quick, as polished or as customisable as NaturallySpeaking, but at least Microsoft's tools give you an opportunity to test the technology before parting with any cash.
You couldn’t realistically use any existing voice recognition product to write a book unless you were patient. Nor are these products likely to replace good typing skills in the near future. Nevertheless, they are more than adequate for composing emails and short memos. What’s more, when voice recognition tools are integrated into a computer’s operating system, they can control functions such as opening and closing files or selecting commands from menus. In fact, voice recognition tools are widely used by people with disabilities – especially blind people and others with impaired vision.
* Not likely to be fiction much longer?
There’s a bit of poetic licence here. The ACT voice recognition system I saw in London in 1981 could only ‘learn’ ten words. A salesman told me proper voice recognition was around “two years away”. It’s nearly 30 years on, and while the programs are massively better, they still need polish before being acceptable to mainstream users. Maybe two more years will do the trick. As for ‘natural language processing’… that’ll take longer.