VoiceXML is the HTML of the voice web, the open standard markup
language for voice applications. VoiceXML harnesses
the massive web infrastructure developed for HTML
to make it easy to create and deploy voice applications.
Like HTML, VoiceXML has opened up huge business opportunities.
While HTML assumes a graphical web browser with display,
keyboard, and mouse, VoiceXML assumes a voice browser
with audio output, audio input, and keypad input.
Audio input is handled by the voice browser's speech
recognizer. Audio output consists of both recordings
and speech synthesized by the voice browser's text-to-speech
system.
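As a rough illustration of the voice browser model described above, the following Python sketch builds a small VoiceXML 2.0 page. The page plays a recording (with synthesized fallback text), then accepts a choice either by speech or by keypad. The function name build_menu_page, the file name welcome.wav, and the menu words are assumptions made for this example only; they do not come from any particular product.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C VoiceXML 2.0 Recommendation.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_menu_page(choices, greeting_audio="welcome.wav"):
    """Return a VoiceXML 2.0 page (as a string) offering the given choices.

    Hypothetical helper for illustration. Each choice can be spoken or
    selected with the keypad digit matching its position (1, 2, ...).
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "main"})
    field = ET.SubElement(form, "field", {"name": "choice"})

    # Audio output: a recording, with text-to-speech as the fallback
    # if the recording cannot be fetched, followed by synthesized text.
    prompt = ET.SubElement(field, "prompt")
    audio = ET.SubElement(prompt, "audio", {"src": greeting_audio})
    audio.text = "Welcome."
    audio.tail = " Say or key in your choice."

    # Audio input and keypad input: each <option> accepts the spoken
    # word or the DTMF digit given in its dtmf attribute.
    for digit, word in enumerate(choices, start=1):
        option = ET.SubElement(
            field, "option", {"dtmf": str(digit), "value": word}
        )
        option.text = word

    # Once the field is filled, read the recognized value back.
    filled = ET.SubElement(field, "filled")
    reply = ET.SubElement(filled, "prompt")
    reply.text = "You chose "
    ET.SubElement(reply, "value", {"expr": "choice"})

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_menu_page(["sports", "weather"]))
```

The generated page is ordinary text, so it can be served by any web server in the same way as an HTML page.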
Why is the VoiceXML approach important?
First, the phone is important. There are over 1.5 billion
phones in use, far more than there are Internet-connected
computers. Phones are easy to use and don't need to
be booted up. Telephone networks are much more reliable
than data networks.
Mobile phones are achieving large penetration rates
too: unlike notebook computers and many PDAs, mobile
phones are highly portable, inexpensive, and have
long battery lives. Mobiles are a natural match for
location-based applications. They can be used while
driving (though not always safely).
Second, voice is important on the phone. Voice has
always been the natural mode of communication for
phones. Even though some mobiles have WAP/XHTML browsers,
their small screens and keypads make micro browsers
hard to use, especially while driving. The i-mode
system is more compelling, though it shares the same
limitations.
But there are advantages to combining visual browsing
and voice browsing. For instance, complex information
is hard to remember when spoken to the user, but easy
to remember if it is presented in a persistent visual
form. And some misrecognitions of spoken input are
easy to correct with keypad entry. Therefore, we should
soon begin to see multi-modal applications deployed
alongside pure visual applications and pure voice
applications.
Third, the Internet is important to voice applications:
· Voice application development is easier because
VoiceXML is a high-level, domain-specific markup language,
and because voice applications can now be constructed
with plentiful, inexpensive, and powerful web application
development tools.
· Voice applications are now far easier to deploy. No
longer must they reside on a special-purpose voice
server in a proprietary "walled garden":
they can be placed anywhere on the Internet and accessed
from any VoiceXML-compliant voice server.
· Applications can be cleanly structured into service
logic on the web server and presentation logic in
VoiceXML pages delivered to the voice browser. This
has many advantages, not the least of which is that
a common application back end on the web server can
serve up different types of presentation logic based
on the user's device. This factoring leads to huge
savings.
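The split between service logic and presentation logic can be sketched in a few lines of Python. This is a minimal sketch, not a real framework: the function names, the "voice" and "visual" device labels, and the hard-coded forecast data are all hypothetical, chosen only to show one back end feeding two kinds of presentation.

```python
from xml.sax.saxutils import escape


def get_forecast(city):
    """Service logic: hypothetical lookup, hard-coded for illustration."""
    data = {"Boston": "sunny, 21 degrees"}
    return data.get(city, "unavailable")


def render_html(city, forecast):
    """Presentation logic for a graphical web browser."""
    return (
        "<html><body>"
        f"<h1>{escape(city)}</h1><p>{escape(forecast)}</p>"
        "</body></html>"
    )


def render_vxml(city, forecast):
    """Presentation logic for a voice browser (VoiceXML 2.0)."""
    return (
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">'
        "<form><block><prompt>"
        f"The forecast for {escape(city)} is {escape(forecast)}."
        "</prompt></block></form>"
        "</vxml>"
    )


# Assumed device labels; a real server would derive the device type
# from the incoming request in whatever way suits its environment.
RENDERERS = {"voice": render_vxml, "visual": render_html}


def handle_request(city, device):
    """Run the shared service logic once, then pick a presentation."""
    forecast = get_forecast(city)
    return RENDERERS[device](city, forecast)


if __name__ == "__main__":
    print(handle_request("Boston", "voice"))
    print(handle_request("Boston", "visual"))
```

Only the two render functions differ by device; get_forecast is written once and shared, which is where the savings from this factoring come from.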
Finally, voice, and therefore VoiceXML, is important
for web devices other than the phone. For example,
a voice-actuated "universal remote" could
have an on-board voice browser and VoiceXML content
generated from all the devices in its vicinity. You
could walk into your family room, pull the remote
from your shirt pocket, press its push-to-talk button
and say "stereo: off; television: what action
movies are playing?"
BLiSS has launched Hindi VoiceXML Interpreter and Hindi
VoiceXML Gateway.