eva : speech vision
Thu, 12 Feb 2009 15:56:04 GMT

wired.com asks "Why Can't We Control Gadgets by Voice Alone?" of course there is the holy grail, where the computer properly recognizes everything you say. but we are not there yet ... plus, shortly after that, robots start hunting humans. until then, command and control works great. but there are still problems for both developers and end-users.

for developers, the hardest part is creating grammars. there needs to be a library of grammars that developers can pull from. it needs basic rules like dates and integers, along with higher-level grammars for specific tasks (e.g. a music player grammar). it would also need to support multiple languages, because there are a ton of different ways somebody can say a date. i shouldn't have to create that from scratch, and then create it from scratch again for a different language.
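
to make the library idea concrete, here is a minimal sketch in python. everything in it is an assumption for illustration: the names (GRAMMAR_LIBRARY, get_rule, parse), the rule names, and the tiny phrase tables are hypothetical, and a real library would use a proper grammar format and cover far more phrasings. it only shows the shape: rules are looked up by name and language, so the developer never writes the date or integer rule themselves.

```python
# hypothetical shared grammar library: rule name -> language -> phrase -> value.
# the phrase tables are deliberately tiny; a real library would be much larger.
GRAMMAR_LIBRARY = {
    "small_integer": {
        "en": {"zero": 0, "one": 1, "two": 2, "three": 3},
        "es": {"cero": 0, "uno": 1, "dos": 2, "tres": 3},
    },
    "relative_date": {
        "en": {"yesterday": -1, "today": 0, "tomorrow": 1},
        "es": {"ayer": -1, "hoy": 0, "mañana": 1},
    },
}


def get_rule(name, lang):
    """Return the phrase table for a named rule in a given language."""
    try:
        return GRAMMAR_LIBRARY[name][lang]
    except KeyError:
        raise LookupError(f"no rule {name!r} for language {lang!r}") from None


def parse(name, lang, utterance):
    """Map a recognized utterance to its value, or None if the rule does not cover it."""
    rule = get_rule(name, lang)
    return rule.get(utterance.strip().lower())
```

with this in place, parse("relative_date", "en", "tomorrow") and parse("relative_date", "es", "mañana") go through the same call, and adding a language means adding a table to the library instead of rewriting the application.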

for end-users, the benefit is that they would become accustomed to speaking the same commands to different gadgets. e.g. there is the scene in 'I, Robot' where the girl is trying to control the CD player, and she keeps speaking different commands, trying to find one that will work. for basic tasks, the commands need to be standardized, so a user can say 'next song' to any speech-enabled radio, instead of radio A listening for 'next track', radio B listening for 'next', and radio C listening for 'next song' (just an example; the grammar could support all of those phrases). those standardized commands that a user expects for certain items would be specific grammars in the library for developers. so if a developer was creating a music playback device, they would support the basic radio grammar and then extend it with features specific to their device.
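
the "support the basic grammar, then extend it" idea could look something like the sketch below. again, the names (BASE_MUSIC_GRAMMAR, extend_grammar, interpret), the command tokens, and the 'scan stations' feature are all made up for illustration; this is not any existing standard or speech API. the standard grammar maps several phrasings onto one command, and a device adds its own phrases without being allowed to redefine the standard ones.

```python
# hypothetical standardized grammar for music playback: phrase -> command.
# several phrasings map to the same command, so 'next', 'next track',
# and 'next song' all work on every device that supports this grammar.
BASE_MUSIC_GRAMMAR = {
    "next": "NEXT",
    "next track": "NEXT",
    "next song": "NEXT",
    "previous": "PREVIOUS",
    "previous track": "PREVIOUS",
    "previous song": "PREVIOUS",
    "play": "PLAY",
    "pause": "PAUSE",
    "stop": "STOP",
}


def extend_grammar(base, extra):
    """Return a new grammar with device-specific phrases added to the base.

    A device may add phrases, but may not change what a standard phrase means.
    """
    merged = dict(base)
    for phrase, command in extra.items():
        if phrase in merged and merged[phrase] != command:
            raise ValueError(f"{phrase!r} is already standardized as {merged[phrase]!r}")
        merged[phrase] = command
    return merged


def interpret(grammar, utterance):
    """Normalize case and whitespace, then look the utterance up in the grammar."""
    normalized = " ".join(utterance.lower().split())
    return grammar.get(normalized)


# a radio supports the base grammar plus one feature of its own
RADIO_GRAMMAR = extend_grammar(BASE_MUSIC_GRAMMAR, {"scan stations": "SCAN"})
```

rejecting a conflicting redefinition is a design choice in this sketch: it is what keeps 'next song' meaning the same thing on every device, which is the whole point for the end-user.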