attempts to fix speech reco attack
Fri, 02 Feb 2007 05:33:39 GMT

alright, so i kicked around /micBarExtend to try and prevent the hack :

first attempt was to add a user-defined keyword, so that a user would say 'computer open notepad' instead of just 'open notepad', with 'computer' being the user-defined keyword. it would ignore all other audio input until hearing the keyword. after hearing the keyword, it would listen for commands for only a short period of time, and then go back to listening only for the keyword. the problem is it doesn't look like the System.Speech API gives us a way to hook the built-in Speech UI commands, so i have no way of intercepting those calls and preventing them from happening.
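here is a rough sketch of the keyword gating logic described above, in plain python rather than System.Speech. it only models the state machine (keyword opens a short window, one command is accepted, then the window closes). the class name, the 5 second window and the `handle` method are made up for illustration. note this would only gate commands handled by my own code; as said above, it would not stop the built-in Speech UI commands.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeywordGate:
    """Ignore recognized phrases unless they follow the user-defined keyword."""

    keyword: str = "computer"          # user-defined keyword (example value)
    window_seconds: float = 5.0        # hypothetical length of the listening window
    _open_until: float = float("-inf")  # time at which the current window closes

    def handle(self, phrase: str, now: float) -> Optional[str]:
        """Return the command to act on, or None if the phrase is ignored."""
        words = phrase.lower().split()
        if not words:
            return None

        if words[0] == self.keyword:
            rest = " ".join(words[1:])
            if rest:
                # keyword and command in one utterance: accept and close the window
                self._open_until = float("-inf")
                return rest
            # keyword alone: open the short listening window
            self._open_until = now + self.window_seconds
            return None

        if now <= self._open_until:
            # command spoken inside the window: accept it, then close the window
            self._open_until = float("-inf")
            return " ".join(words)

        # no keyword and no open window: ignore the audio
        return None
```

usage would look something like `gate.handle("computer open notepad", now)` returning `"open notepad"`, while a bare `"open notepad"` with no open window returns `None`.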

the next attempt was to extend the tap-and-talk functionality so that listening can only be activated by a user key press. so i made a grammar that only listens for 'start listening'. the problem is that the built-in Speech UI has a higher priority than my 3rd party grammar, even if i max out the priority of my grammar. so there is no way for me to intercept that event and handle it.

next idea is a total hack. have a timer polling to see if the Speech UI is listening. if it's listening, and wasn't tap-and-talk triggered, then programmatically send a 'stop listening' command to prevent the hack from occurring. the problem is i don't get an event when 'start listening' occurs. also, it doesn't look like there is a property for me to check to see if the speech recognizer is listening or not.

the only other thing i could think of is a bit of a hack too. the idea is to turn off microphone input until the tap-and-talk button is pressed. and only allow mic input between tap-and-talk key presses. now i just need to figure out how i can programmatically control mic input in Vista ... but if i can figure that out, then that will defeat the attack :)
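a minimal sketch of that gating idea, again in plain python. it assumes some code sits between the mic and the recognizer and sees raw audio frames as bytes, which is an assumption on my part; it does not show how to actually mute the mic in Vista, since that part is still unsolved. `TapAndTalkGate`, `tap` and `filter` are hypothetical names.

```python
class TapAndTalkGate:
    """Pass mic audio through only between two tap-and-talk key presses."""

    def __init__(self) -> None:
        self.is_open = False  # mic input is blocked until the first tap

    def tap(self) -> None:
        """Toggle the gate: first press opens it, the next press closes it."""
        self.is_open = not self.is_open

    def filter(self, frame: bytes) -> bytes:
        """Return the frame unchanged when open, or silence of equal length."""
        if self.is_open:
            return frame
        return bytes(len(frame))  # all-zero bytes stand in for silence
```

with this in place, audio played through the speakers while the gate is closed would reach the recognizer as silence, so only speech captured between two key presses could trigger a command.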

of course MS can easily fix it a # of ways : 1) user-defined keyword 2) tap-and-talk mode. actually, i would prefer for keyboards/mice/remotes to have dedicated buttons to trigger speech reco 3) filter audio output, so that it cannot be used as speech reco input 4) speaker recognition, so that it will only listen to audio input from a specific user's voice ... that's all i can think of at the moment