
VoiceXML with XI: Speak out loud so you could be heard

Ever since my bank started using automated call centers, I have swung between enthusiasm and annoyance. The annoyance has mostly come from the voice application not understanding my ‘Indian’ accent, but I have been pleasantly surprised by the wide range of services these systems provide, and I have always wondered how much effort goes into developing them.

On a closer look, a voice application has two parts:

1. The interaction with the user (a two-way conversation)
2. Fetching the data the user asks for from the backend application

For someone like me who deals with interfaces, developing an application that interprets voice sounds like something out of a science fiction movie, but the challenges of fetching data from the backend are well known. This is where VoiceXML comes to our help. VoiceXML couples voice applications with XML, opening a bigger avenue for developing voice-enabled interfaces. XML support also gives greater scope for using XI in the data fetching process, but more on that later.

*What is VoiceXML?*

VoiceXML is a W3C-endorsed markup language designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. In simple terms, it provides an effective means of coupling web-based development with voice response applications.

*How does VoiceXML work?*

[image]

At a very high level, the picture above shows how VoiceXML is used to fetch data from a backend application and turn it into a voice response. To dive deeper, I will use some of the examples from the W3C site to explain some important components of VoiceXML. Various service providers offer VoiceXML interpreters, and some of them also provide a development platform on which we can test our applications.
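
As an illustrative sketch of these components (written in the style of the W3C examples, not copied from the specification), the document below defines one form with a single field, a prompt, and handlers for silence and unrecognized input. The form id, the field name, and the prompt wording are made up for this example. The built-in `digits` type is an optional platform feature in VoiceXML 2.0, so a given interpreter may or may not support it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- A form is the basic dialog unit; "balance" is an illustrative id -->
  <form id="balance">
    <!-- A field collects one value, spoken or keyed in as DTMF.
         type="digits" relies on an optional built-in grammar (assumption:
         the platform provides it) -->
    <field name="account" type="digits">
      <prompt>Please say or key in your account number.</prompt>
      <!-- Raised when the caller stays silent -->
      <noinput>I did not hear anything. <reprompt/></noinput>
      <!-- Raised when the input does not match the grammar -->
      <nomatch>Sorry, I did not understand that. <reprompt/></nomatch>
    </field>
    <!-- Runs once the field has been filled -->
    <filled>
      <prompt>You entered <value expr="account"/>.</prompt>
    </filled>
  </form>
</vxml>
```

A document of this kind is what you would load into one of the development platforms mentioned above to hear the dialog in action.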
For example, you can create a free account at Voxeo to test such examples. More information on testing is available on that site.

h4. Implementation

Having seen the basics of VoiceXML, let us now build a proof of concept with XI as the integration engine for fetching data from the backend. The picture below shows the process flow. Many service providers support VoiceXML, and a comprehensive list can be obtained from the W3C site. To save the hassle of calling a telephone number during our test phase, we will use JVoiceXML, an open source VoiceXML interpreter built on Java that supports JSAPI and JTAPI. With JVoiceXML, you can test the application using the speaker and microphone of your laptop. You will need JDK 1.5 for this to work properly.

[image]
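
The hand-off from the dialog to the backend could be sketched roughly as follows. Once the field is filled, the `submit` element sends the collected value over HTTP, and the interpreter continues with whatever VoiceXML document comes back. The URL is only a placeholder and not a real XI endpoint. How XI actually exposes the service is an assumption here, as are the form id and field name.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="lookup">
    <!-- Collect the value that the backend needs -->
    <field name="account" type="digits">
      <prompt>Please say your account number.</prompt>
    </field>
    <!-- Executed after the field is filled -->
    <block>
      <!-- Hypothetical URL: stands in for an HTTP service fronted by XI.
           namelist names the variables sent as request parameters. -->
      <submit next="http://xi.example.com/voice/balance"
              namelist="account" method="get"/>
    </block>
  </form>
</vxml>
```

In this sketch the response is itself expected to be a VoiceXML document whose prompt reads out the fetched data, so the integration side only has to produce XML.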