Taming the Dragon
How to master voice recognition technology without the pain and frustration.
By Tim Cooper
In 2003 I was diagnosed with repetitive strain injury (RSI). As a freelance journalist, I had very little choice but to carry on working, which made the RSI steadily worse. By 2006 I could barely type for more than an hour each day before the pain brought me to a halt.
Voice recognition technology came to my rescue and, as part of a package of measures, saved my career. I am now writing (ie voicing) this article at more than 100 words per minute with about 99% accuracy. Previously my best touch typing produced about 65 words per minute, so not only has voice recognition enabled me to carry on working, it has also made me much more productive.
Now I can even program the software to carry out shortcuts for complex multi-step procedures across applications with just a single word or phrase. For example, when I say ‘open BT mail’, my computer opens Internet Explorer, finds BT Yahoo in my favourites, types in my password and then opens my inbox. The latest version of my software is also full of handy pre-written shortcuts such as ‘search Google for Andrew Brown’ and ‘send email to Andrew Brown’. Another bonus – using voice recognition means you can work with your feet up, standing up, or in whichever position takes your fancy!
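To illustrate the idea of one spoken phrase standing in for a multi-step procedure, here is a minimal sketch in Python. It is not how Dragon itself is programmed (DNS has its own command tools). The `VOICE_MACROS` table, the `expand_command` function and the step descriptions are all hypothetical, based only on the ‘open BT mail’ example above.

```python
# Hypothetical illustration only: this is not Dragon's own scripting language.
# Each spoken phrase maps to the ordered steps it replaces.
VOICE_MACROS = {
    "open bt mail": [
        "open Internet Explorer",
        "select BT Yahoo from favourites",
        "type password",
        "open inbox",
    ],
}


def expand_command(phrase):
    """Return the steps for a spoken phrase, or an empty list if unknown."""
    # Normalise case and spacing so "Open BT  mail" matches "open bt mail".
    key = " ".join(phrase.lower().split())
    return list(VOICE_MACROS.get(key, []))


if __name__ == "__main__":
    for number, step in enumerate(expand_command("Open BT mail"), start=1):
        print(number, step)
```

The point of the table is that the speaker only has to remember the phrase; the order and detail of the steps are recorded once and replayed every time.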
What I Wish I’d Known When I Started
But it hasn’t always been this way – far from it. When I first started using voice recognition, it was more a part of the problem than the solution. I bought Dragon NaturallySpeaking (DNS) – the most commonly used voice recognition software – back in 2004, and getting to grips with it was a horribly frustrating ordeal, one that I know many other people who use it out of necessity have shared.
The following is what I wish I had known then. It applies to all potential users but especially those who have been forced into heavy usage due to injury or disability. The comments apply to DNS only and not any of the other voice recognition packages. Please also bear in mind that everyone comes to voice recognition with a unique set of problems and needs. These are my experiences, but yours may be completely different.
The performance of your voice recognition package will depend on a wide range of factors, including the quality of your microphone and of the soundcard in your computer, the speed of your processor and the amount of RAM (buy the best spec you can afford), the clarity of your speech and your accent, ambient noise, and which programs you are using it in. Not surprisingly, the vendor, Nuance, doesn’t tell you much about this until you get near the back of the manual! It can be a long and frustrating learning curve, especially if you’re climbing it alone.
The first package I bought was DNS Preferred version 7. Back then, the microphone that came in the DNS box was of inferior quality. I eventually found that investing in a better-quality microphone vastly improved the accuracy of recognition and transformed the experience from frustrating to satisfying at a stroke. I believe that the poor microphone alone has been enough to make people give up using DNS.
However, I have just tested DNS version 10 Professional, the latest version. I am happy to report that the microphone supplied in the box (models vary, I got the Andrea NC-91) is vastly improved. Dictating in a Word document, it was 99% accurate after only 10 minutes training, compared to 99.5% with my expensive, high quality microphone (GN Netcom 2200).
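As a rough, back-of-envelope sketch (in Python), the gap between 99% and 99.5% accuracy can be turned into misrecognised words per minute. It assumes the dictation speed of about 100 words per minute mentioned earlier and treats accuracy as the share of words recognised correctly. The function name `errors_per_minute` is purely illustrative.

```python
def errors_per_minute(words_per_minute, accuracy_percent):
    """Estimate misrecognised words per minute of dictation."""
    error_rate = (100 - accuracy_percent) / 100
    return words_per_minute * error_rate


if __name__ == "__main__":
    # Assumed speed of 100 wpm; the accuracy figures are those quoted above.
    for label, accuracy in [("boxed microphone", 99.0), ("upgraded microphone", 99.5)]:
        print(label, round(errors_per_minute(100, accuracy), 2))
```

On those assumptions, 99% accuracy works out at roughly one misrecognised word a minute, and 99.5% at roughly one every two minutes.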
Despite the good performance of the Andrea mic, I would still recommend potential heavy users speak to a consultant or knowledgeable supplier about their microphone needs. A different microphone from the one in the box may still be appropriate. For example, because I’m frequently on the phone during the day, it’s well worth me spending money on a separate good quality microphone with a switchbox (such as the Netcom), so I can switch between Dragon and talking on the phone easily and without removing my headset. The Netcom switchbox also needs a compatible telephone, such as the BT Converse.
After you have spent only 10 minutes training DNS, all versions are impressively easy to use for simple dictation in Word. But if, like most people, you use a range of applications for work, you will probably need the more expensive but function-rich DNS Professional edition, rather than DNS Preferred. You will probably also need at least one session with a trainer if you can afford it (mine cost around £500 a day!). Of course you can get the main concepts from reading the manual, but for those using DNS for work, a trainer/programmer is recommended. On the advice of my Access to Work consultant (see below), I had two morning sessions of training and programming in DNS, which helped enormously.
Training and Programming
As well as Word, I am a heavy user of Outlook, Excel and Internet Explorer, including some complex web-based programs, so I needed to learn how DNS Professional worked in these applications to avoid more frustration. My trainer also wrote some programs to help me perform many of my common daily tasks without touching the keyboard – especially the ones that Dragon and I would otherwise struggle to cope with.
Heavy users must have ongoing help for when things start to go wrong. For example, I found that DNS v8 quickly became riddled with bugs, some of which were caused by the way it interacts with Windows. Left unrepaired, these bugs can drive you crazy, but they can easily be fixed if you know how.
Obviously you should check the troubleshooting guide in the manual first. I see there is a welcome mention of how to deal with one of the most common bug-causing operations, ‘CTFmon’ (don’t ask!), in the V10 manual. But in my experience, it’s likely that you will need help beyond that – I still call my suppliers for advice from time to time. This could come from a specialist shop – such as those listed below – your consultant, your trainer/programmer, or friends or colleagues who are experienced with DNS (you will meet plenty if you join an RSI support group, for example). I’ve also found that joining the discussion forum on the ‘Know Brainer’ website is essential. You’ll be amazed at what free advice you can get from other users and experts.
Heavy users of voice recognition software are at risk of voice strain. I was starting to experience this but, by coincidence, met a speech therapist who explained to me why it happened and how to prevent it.
I was lucky. Software writer Andy says ‘After developing RSI, I started using Dragon NaturallySpeaking a lot at work. After about six months I noticed that my voice was getting tired very quickly. Soon it got to a stage where I would open my mouth and nothing came out. I was diagnosed with acute muscular stress of the larynx. One year later, I still haven’t been able to go back to using Dragon.’
I know from RSI forums that, although it may be rare, other people have had similar experiences to Andy.
There are some tips in the back of the DNS manual on avoiding voice strain (see tips/avoiding vocal strain). I think it is essential that you read them before you start using the product, so just to be sure, I’ve pasted them in here.
When dictating for long periods of time, posture, correct breathing, and regular breaks are important.
Use good posture: sit up straight or stand in front of your computer.
Do not speak in a loud voice or in any way that is stressful to you.
Breathe deeply from your abdomen and not from the top of your chest.
Loosen up and relax: stretch your arms, shoulders, neck, and jaw muscles.
Take occasional breaks: get up, move around, and stretch.
Keep your vocal cords moist: take sips of water (you can use a straw so you do not need to move the microphone).
Do not dictate for longer than is comfortable.
I would add to this that you should try to avoid using the smaller vocal muscles near the top of your throat, which are more likely to strain. Like a stage actor, relax, take a good breath from your abdomen before you start dictating and use your deeper throat muscles to speak as you exhale. Although you do need to enunciate more than usual to get good recognition in DNS, you should otherwise speak as naturally as you can. If you still think your voice is being strained, stop using Dragon and get an appointment with a speech therapist.
Finally, the tips in the DNS manual are all there for a good reason – you must read them to improve your performance and avoid frustration. Heavy users should read the whole manual and could even learn to write programs in it themselves. Voice recognition is like chess – it takes minutes to learn but a lifetime to master!
A spokesperson for Nuance, the provider of DNS, gave this response to the comments made in this article: ‘Nuance is constantly looking at best practice advice and tips that it can share with customers, through product documentation, industry forums and events.
‘In the last five years DNS and speech technology has come on leaps and bounds in terms of ease of use and accuracy. This has been made possible through the significant advances we have seen in base level processing power and on board memory in today’s computers, advances in sound card design (particularly with laptops) and the availability of high quality noise cancelling headsets at reasonable cost. There has also been significant development in Dragon’s core speech engine, which provides much faster training and virtually out-of-the-box use, together with heightened accuracy.’
Access to Work
In the UK, if you have a health or disability issue which affects the way you do your job, you may be eligible for help from the Access to Work scheme, obtained through Jobcentre Plus. I qualified for this in 2004 and received a grant of 80% of the cost of the consultant, equipment, training and programming mentioned above, plus some other ergonomic equipment – which in total cost around £2500. I believe the experience was easier for me because I worked for myself from home (no need to consult with bosses or HR departments), but Access to Work is also available to the employers of those who need it and to the unemployed.
Feel free to contact me at firstname.lastname@example.org if you want to discuss any of these issues. Voice recognition is frequently discussed at the RSI Support Groups and Awareness Days advertised at www.rsiaction.co.uk.
Speech Recognition Specialist Providers:
The Speech Centre
voice recognition support forum
Use the filter or just dip into discussions; otherwise the forum will send you dozens of emails about new discussion threads every day.
© 2011 RSI Action…
Registered Charity No. 1114977 Company No. 05697873