Microphone Bar Extensions
i've been using the speech recognition built into the Vista shell. it works great, but there were times when it didn't expose commands that i wanted. so i wrote to MS asking for these commands to be added in a future version, or to tell me whether they already existed and i was just speaking them incorrectly. shortly after sending that message ... it hit me ... i'm a developer ... i should be able to write that myself. so this short article is about extending Vista's speech recognition capabilities.
the extensions run as a system tray application. all the app really does is create an instance of the shared SpeechRecognizer and add some grammars to it. if a command from one of the grammars is recognized, then the app executes the command. the app also has an options window where you can enable/disable grammars, and it sends messages to the Microphone bar. you can tell which commands were executed by the extensions, because their Microphone bar text is all lower-case.
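as a rough sketch of that idea (not the actual source), the core could look something like the code below. the class name, the grammar name and the two phrases are made up for illustration; SpeechRecognizer, GrammarBuilder, Choices, Grammar and SpeechUI.SendTextFeedback are the real System.Speech.Recognition types. it needs a reference to System.Speech.dll and assumes Windows speech recognition is set up on the machine.

```csharp
using System;
using System.Speech.Recognition;

// hypothetical sketch: class, grammar and phrase names are illustrative only
public static class MicBarExtendSketch
{
    // builds a simple command grammar from a fixed list of phrases
    public static Grammar BuildGrammar(string name, params string[] phrases)
    {
        GrammarBuilder builder = new GrammarBuilder(new Choices(phrases));
        Grammar grammar = new Grammar(builder);
        grammar.Name = name;
        return grammar;
    }

    // text to show on the Microphone bar; lower-cased so extension commands stand out
    public static string ToMicBarText(string recognizedText)
    {
        return recognizedText.ToLowerInvariant();
    }

    public static void Main()
    {
        // the shared recognizer is the one that sits behind Vista's Microphone bar
        using (SpeechRecognizer recognizer = new SpeechRecognizer())
        {
            Grammar media = BuildGrammar("media", "next track", "previous track");
            recognizer.LoadGrammar(media);

            recognizer.SpeechRecognized += delegate(object sender, SpeechRecognizedEventArgs e)
            {
                string text = ToMicBarText(e.Result.Text);

                // report back to the Microphone bar, then run the command
                SpeechUI.SendTextFeedback(e.Result, text, true);
                Console.WriteLine("would execute: " + text);
            };

            Console.WriteLine("listening ... press Enter to quit");
            Console.ReadLine();
        }
    }
}
```

an options window could then turn a grammar on or off through its Enabled property (for example media.Enabled = false) without unloading it.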
these are the commands that are currently exposed by the extensions ... please write me if you have suggestions for other commands.
one problem i had with Vista's speech reco is that it's almost always listening. so if i'm playing music over external speakers, that audio might trigger a false positive recognition, especially if it's a podcast. this pretty much forces you to use speech reco with a headset, so that the audio out does not interfere with the microphone in. to get around this, i've added tap-and-talk support to /micBarExtend. this allows you to use speech reco with a desktop microphone and external speakers. first, you tap a button, which mutes the external volume and starts listening for commands. then, you speak your commands. finally, you tap the button again to stop speech reco from listening and unmute the external volume.
NOTE: if tap-and-talk is not working for you, you might have to set /micBarExtend to 'run as administrator' by right-clicking on the executable and changing its compatibility settings.
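the tap-and-talk steps above boil down to a two-state toggle. here is a minimal sketch of that logic, not the actual /micBarExtend code. TapAndTalk and the two callbacks are hypothetical names; in a real app the callbacks would wrap whatever mutes the speakers and whatever turns recognition on and off.

```csharp
using System;

// hypothetical sketch of the tap-and-talk toggle; names are illustrative only
public sealed class TapAndTalk
{
    private readonly Action<bool> setSpeakersMuted; // assumed wrapper around the volume calls
    private readonly Action<bool> setListening;     // assumed wrapper, e.g. enabling/disabling grammars

    private bool isListening;

    public TapAndTalk(Action<bool> setSpeakersMuted, Action<bool> setListening)
    {
        if (setSpeakersMuted == null) throw new ArgumentNullException("setSpeakersMuted");
        if (setListening == null) throw new ArgumentNullException("setListening");
        this.setSpeakersMuted = setSpeakersMuted;
        this.setListening = setListening;
    }

    public bool IsListening
    {
        get { return isListening; }
    }

    // call this on every tap of the button
    public void Tap()
    {
        if (!isListening)
        {
            setSpeakersMuted(true);  // mute first, so the speakers can't trigger a recognition
            setListening(true);
        }
        else
        {
            setListening(false);     // stop listening before the audio comes back
            setSpeakersMuted(false);
        }
        isListening = !isListening;
    }
}
```

the ordering is the only design choice here: the speakers are already muted whenever the recognizer is listening, so the music never reaches the microphone while commands are being accepted.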
i've been trying different hardware with speech recognition. a promising looking device was the Snowball. the problem is its output level is way too low. checking the manufacturer's forums shows that many people have had this problem and that there is a firmware update on their website. except i couldn't get the firmware program to work on Vista ... crap. tried some other mics i've got lying around too ... and have had the best luck with headsets. the problem is i don't want to always have to wear a headset.
i need some hardware options. notebook manufacturers need to start putting array mics on board. otherwise, we need to be able to purchase array mic setups that can be mounted onto monitors / notebook screens. and there needs to be a better selection of USB desktop mics ... that actually work. i'd also like keyboards / mice to get 'tap and talk' buttons: a button that i can quickly press to initiate speech recognition and possibly mute audio output. so if i'm playing music and want to switch to the next track, i can just hold the 'tap and talk' button, which cuts the music on the external speakers, say 'next track', and release the button. then the music would start playing again and the media player would switch to the next track.
i still need to see if Bluetooth 2 headsets will work with Vista speech reco ... the problem is i can't get the drivers to work yet :( honestly, i really like Vista, but i can't believe XP drivers just don't work out of the box. anyway, the speech recognition software is great ... but the hardware sucks. it just so happens that MS is great at hardware ... how about getting the hardware teams to put together some reference implementations ...
here is a video showing how to use some of the extended commands
it ended up being really easy to extend Vista's speech recognition capabilities.
here is the C# source code. the little bit of UI is WPF; the rest of the code is System.Speech.Recognition and some P/Invokes.
i plan on updating this with more commands, so let me know if there are any commands you need/want
more speech related articles. later