Microsoft have been researching voice recognition and synthesis for a long time, and for much of that time there was very little public evidence of the results. Then they brought out some voice functions for Office XP, and around a year ago came Voice Command for Media Player, as part of the Windows XP Plus pack. This week they released Voice Command for mobile devices (PDAs, mobile phones, etc.) running Windows Mobile 2003, which gives you voice control over a subset of the device's functions.
One advance that they claim for the newest version is speaker independence, meaning that you do not need to train the software to recognise your voice. The need for training has been a major barrier for previous voice software, as it didn’t work “out of the box”. We imagine that speaker independence has been helped by the fact that the available commands are limited, and may have been selected for unique sounds or patterns.
The mobile version adds functionality to the address book and calendar. Contacts can be dialled from the address book by voice command alone: “Call Harry Perkins Mobile” will work, as will simply saying the number to be called. Contact details can also be recalled and displayed with “Show William Wonker”.
The calendar can be interrogated too: asking “What is my next appointment?” gets a response giving the time, date and all of the text associated with the entry. When a calendar reminder alarm appears, its details are also read out.
For us, the most interesting part is the voice control of the media player, which works on both the PDA version and the XP Plus pack. Not only are there the expected track controls (“Pause”, “Play”, “Previous track”, “Next track”), but music can be selected by Artist, Album or Genre at the first level, then started by speaking the chosen detail.
An interesting addition is the ability to ask “What song is this?” and have the synthesised voice read the answer back to you. We feel it is a short step for this to also work with radio content, with the obvious follow-on question from the computer, “Do you want to buy it?” or “Add this to your music collection?”, which would then connect you to a music download service.
The list of voice commands on the XP version is considerably longer than that of the Mobile version, numbering over forty, so they have provided a useful “What can I say?” command that reads out all of the commands available at that point.
The voice that speaks back to you is still a bit too computer-like. We have heard much better quality, in fact near human-sounding, voices from other companies, but the processing power they need is greater than is currently offered on mobile devices.
On top of the memory that the program itself takes up, additional device memory (RAM) is required depending on the number of contacts or songs you want to be able to request. A hefty 7 MB of RAM is required for every 500 contacts and 100 songs you want to be able to use.
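To get a feel for what that means on a real device, here is a rough back-of-the-envelope sketch of the arithmetic. It rests on our own assumption, not anything Microsoft has stated: that memory is consumed in whole 7 MB blocks, each covering up to 500 contacts and up to 100 songs, so whichever list needs more blocks sets the total. The function name and the rounding rule are ours, for illustration only.

```python
import math

# Figures quoted in the article: 7 MB per block of 500 contacts and 100 songs.
MB_PER_BLOCK = 7
CONTACTS_PER_BLOCK = 500
SONGS_PER_BLOCK = 100


def estimate_extra_ram_mb(contacts: int, songs: int) -> int:
    """Estimate the additional RAM (in MB) needed for the given library size.

    Assumption (ours, not Microsoft's): one 7 MB block covers up to
    500 contacts AND up to 100 songs, and blocks are whole units, so the
    block count is set by whichever list needs more of them.
    """
    if contacts < 0 or songs < 0:
        raise ValueError("contacts and songs must be non-negative")
    contact_blocks = math.ceil(contacts / CONTACTS_PER_BLOCK)
    song_blocks = math.ceil(songs / SONGS_PER_BLOCK)
    return MB_PER_BLOCK * max(contact_blocks, song_blocks)
```

Under this reading, a device holding 200 contacts and 350 songs would need four blocks, or 28 MB, which is a lot to ask of a PDA.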
Selling for just short of $40, it sounds like reasonable value. It is slightly surprising that Microsoft is charging money for these features, as it would normally integrate them into the operating system, and it may well do so in the future, but the company has to try to claw back all of those millions of R&D dollars that it has spent.
We imagine that they are working away at additional country packs, but currently it is only available in the US, and their press office tells us there are no dates for anywhere else yet.
For the future
When the XP Plus pack was released, it was not well received, with nearly all of the reviews dismissing the value of being able to control your media player with voice commands. We think they missed the significance of this first step.
The potential here lies not in talking to the PC sitting on the desk in front of you, but in navigating music, or indeed video content, in the rooms of your house or in the car simply by speaking. It eliminates the need to have a touch screen, keyboard or mouse in each location.