Author’s guide

If you want to get your hands dirty right away, skip chapter 1 “Motivation” and go straight to chapter 2 “How To Do”.
You can listen to my recorded mp3: VoiceXML_PHP_SAP_20060804.mp3
After reading and practicing this weblog, you should be able to answer the following questions.

  1. Is it possible to practice VoiceXML on my own computer?
  2. Do I really have to make a phone call whenever I test my VoiceXML application?
  3. Is there a free or demo version with which I can build the entire voice technology stack?
  4. How can I integrate voice technologies with an SAP system?
  5. What are VoiceXML, CCXML and CallXML about?
  6. I have no idea about voice technology, but my supervisor is forcing me to develop VoiceXML applications. How can I start?
  7. Is it really difficult to build applications that use voice technologies?
  8. By the way, how can my organization use these technologies?
1 Motivation
1.1 Why Voice?

Are you a digital native or a digital immigrant? Believe it or not, when elderly people are trained to use a computer or the Internet, the most difficult task is “double-clicking”. Surprising, isn’t it? Ironically, whether I am a digital native or a digital immigrant, I have had to be trained in “how to use” digital devices.
Thanks to pervasive VoIP applications and mobile phones, I am becoming familiar with the voice interface as a way of communicating with digital devices and systems.
“Voice” is the basic method of communication between people. What about between a person and a system such as a computer?
By using voice technologies and multimodal applications, some business process steps and processing time can be reduced (Marc Erich, 2005). Furthermore, “how to use” training is not required as much as it used to be. From the developer’s point of view, is the technology mature enough to learn and apply to real business cases? Please feel free to vote (before reading this weblog).
Some case studies are mentioned at the end of this weblog.

1.2 Technical Terms

When you start to learn a new technology, you face many new terms. For example:

  1. VoIP: Voice over Internet Protocol, Mobile VoIP
  2. SIP: Session Initiation Protocol
  3. WSIP: Web Service SIP (Feng et al., 2004)
  4. VoiceXML: http://en.wikipedia.org/wiki/VoiceXML
  5. CCXML: Call Control eXtensible Markup Language
  6. ASR: Automatic Speech Recognition
  7. TTS: Text To Speech
1.3 Roadmap of Voice Technologies

This is my personal view of where voice technologies are heading.

  1. Text To Speech: In the past, voice technologies emerged as a “speaker” of text, for example reading a book aloud.
  2. Interaction with human beings: As voice recognition technologies developed, large organizations started to adopt them to automate business processes such as answering and transferring calls and handling customers’ requests.
  3. Advisor based on KM, BI: Based on the user’s input or situation, the system not only answers the question but also gives advice.
  4. Multimodal Artificial Intelligence under Heterogeneous ERP systems: Through EAI, the system easily collects information at the enterprise-wide level and responds intelligently through various channels.
1.4 Pie in the Sky

Recently, SAP announced “SAP NetWeaver Voice”. When I saw it, I was excited, but a few minutes later I realized it looked like “pie in the sky”. My eyes and my brain were happy, but my fingers and my mouth were not. I researched, googled and tested how I could “make my hands dirty and my mouth melting”. Finally, I managed to put together this weblog, covering a VoiceXML engine, VoiceXML applications and PHP + SAP from beginning to end, in order to share my experiment with you.

2 How To Do
2.1 Scenario

This is a simple imaginary scenario.
Victoria University has a voice clock-in/out system for mobile workers.
In the morning, “DAVID” calls the system, logs on to clock in and gets the list of tasks for today.

The overview diagram of the system structure for this demo labels the following components:
[SJphone] [FPG518-03] [NW04SHOST] [FPG518-01]

2.2 Prerequisite

You need SAP Web Application Server ABAP for this demo. If you don’t have it, go to http://www.sdn.sap.com/irj/sdn/downloaditem?rid=/library/uuid/cfc19866-0401-0010-35b2-dc8158247fb6 and get the Full ABAP Edition trial.

2.3 Download Prophecy

Go to http://www.voxeo.com/prophecy/ and download “Prophecy 7.0 GA (build 160) – 387mb download with ASR + Best quality TTS (requires 1GB of RAM)”. (I haven’t tried the others.)
The installer file is prophecy-126-large-tts.msi.

2.4 Installing Prophecy

After you install prophecy-126-large-tts.msi successfully, the installer takes you to a new web page.
First, get a free 30-day license key by clicking the “Registration” link. You may need to register on the Voxeo site (http://evolution.voxeo.com/). The installation steps are quite straightforward, so I don’t explain them here. For further learning, please visit the “Documentation” link.

2.5 Testing Prophecy

Click “Quick Start Guide” to see how to run a quick test of Prophecy.

2.6 Download SAPRFC for PHP

Fortunately, the PHP version built into Prophecy is 5.1.1, so you can use saprfc-1.4.1-5.1.1.Win32.zip. You can download it from http://sourceforge.net/project/showfiles.php?group_id=29190

2.7 Configuring SAPRFC for PHP

Under the “C:\Program Files\Voxeo\www” folder, create a saprfc directory and unzip saprfc-1.4.1-5.1.1.Win32.zip into it.

  1. Copy C:\Program Files\Voxeo\www\saprfc\php_saprfc.dll to C:\Program Files\Voxeo
  2. Edit the C:\Program Files\Voxeo\php.ini file. Add extension=php_saprfc.dll at line 598.

2.8 Restarting Prophecy

You can restart the Prophecy services by clicking Start-Programs-Voxeo-Restart all services,
or by typing the prophecy restart command in a command window.

2.9 Checking SAPRFC for PHP

Open http://127.0.0.1:9990/phpinfo.php in your browser. In the middle of the page you should see a section with saprfc information.
When you open http://127.0.0.1:9990/saprfc/saprfc_test.php you should also see a test page. If you get errors, please refer to Scripting Languages for help. There are also weblogs related to PHP + SAP.
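
The same check can also be scripted rather than read off the phpinfo page. This is only a minimal sketch: it assumes the extension registers itself under the module name saprfc, and the helper name check_saprfc_extension is made up for illustration.

```php
<?php
// Hypothetical helper: reports whether the SAPRFC extension is usable.
// Assumption: the extension registers under the module name "saprfc".
function check_saprfc_extension()
{
    if (!extension_loaded('saprfc')) {
        return 'saprfc extension is NOT loaded; check the extension= line in php.ini';
    }
    if (!function_exists('saprfc_open')) {
        return 'saprfc extension is loaded, but saprfc_open() is missing';
    }
    return 'saprfc extension is loaded';
}

echo check_saprfc_extension(), "\n";
```

Saved under the www folder next to phpinfo.php, a script like this gives a one-line answer after each restart of the Prophecy services.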

2.10 VoiceXML, PHP and SAP

If you followed the previous steps without any problems, you are ready for an exciting experience.
I created the user DAVID in client 000 by copying BCUSER. Client 000 therefore contains the users BCUSER, DAVID, DDIC and TMSADM.
“DAVID” is more human-friendly than “BCUSER”: the machine pronounces “BCUSER” as “B C U S E R” rather than “BC user”.

  1. Download http://mobilian.org/SDN/test_vxml.phps and save it as test_vxml.php under C:\Program Files\Voxeo\www\saprfc. Customize the connection details for your own SAP Web AS.
  2. Go to the Prophecy Management Console (http://127.0.0.1:9995/mc.php).
  3. Click the Call Routing menu on the left-hand side.
  4. Change the Route 7 values. For instance,
    Route 7 URL: http://127.0.0.1:9990/saprfc/test_vxml.php
    Route 7 Type: VXML
    To make the Prophecy system accessible from other computers on my network, I typed my own public IP address instead (if you want to do this, you can log on to the Prophecy Management Console with your own public IP address).
  5. Save the change by clicking the save button.
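
Before dialing, it can help to confirm that the Route 7 URL really returns VoiceXML. The sketch below is an assumption-laden illustration, not part of the original demo: looks_like_vxml is a hypothetical helper, and the check is a plain string test rather than real XML validation.

```php
<?php
// Hypothetical helper: true when $body looks like a VoiceXML document.
function looks_like_vxml($body)
{
    if (!is_string($body)) {
        return false; // e.g. file_get_contents() returned false
    }
    $body = ltrim($body);
    return strpos($body, '<?xml') === 0 && strpos($body, '<vxml') !== false;
}

// Route 7 URL from the steps above; change the host if you used your own IP.
$url  = 'http://127.0.0.1:9990/saprfc/test_vxml.php';
$body = @file_get_contents($url); // false when the server is not reachable

echo looks_like_vxml($body)
    ? "OK: the route URL returns VoiceXML\n"
    : "Problem: no VoiceXML at $url\n";
```

If the script reports a problem, opening the same URL in a browser usually shows the PHP or SAPRFC error message directly.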
2.11 Testing an example on the same machine as Prophecy

To call the Prophecy system, you need to run the Voxeo SIP softphone.

  1. Go to Start-Programs-Voxeo-SIP Phone
  2. Or type the prophecy run phone command in a command window.
  3. In the Dial String field, enter sip:7@127.0.0.1
  4. Click the Dial button

You can listen to my recorded mp3 and compare it with yours: VoiceXML_PHP_SAP_20060804.mp3

2.12 Testing an example on another machine

If you want to access the Prophecy system from other machines, the Prophecy system must have a public IP address or a network IP address (not 127.0.0.1).
You also need a Voice over IP phone such as SJphone (or a copy of the Voxeo SIP softphone).

  1. Go to http://www.sjphone.org/sjp.html and download “SJphone for Windows” (XP/2000/ME/98, v.1.60.289a, 06.19.05)
  2. Install it and run it.
  3. Before you test the demo, I strongly recommend running the “Audio Wizard..” to tune the softphone audio settings.
  4. Enter the SIP address of the demo, e.g. sip:7@xxx.xxx.xx.xx
  5. Click the dial icon.

I actually planned to test it with my old PDA (Pocket PC, Windows CE version 3.0) over the wireless network. Unfortunately, SJphone is only available for Pocket PC 2003 or higher. However, I believe some readers of this weblog can experiment with it. Please feel free to share your experience.

3 Source Code Brief Explanations
3.1 PHP and VoiceXML Header

Line 3: You must first output the XML header from inside the PHP code.
Line 5: VoiceXML is declared here.
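
As a rough illustration of these two steps (not the author’s original file), the sketch below builds a minimal VoiceXML 2.0 page in PHP. The helper name vxml_page, the prompt text and the text/xml content type are assumptions made for this example.

```php
<?php
// Hypothetical helper that builds a minimal VoiceXML 2.0 page as a string.
function vxml_page($promptText)
{
    // The XML declaration is produced from PHP code. Written literally
    // outside the PHP tags, it could be mistaken for a PHP open tag
    // when short_open_tag is enabled.
    $page  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    // The vxml root element declares the document as VoiceXML 2.0.
    $page .= '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">' . "\n";
    $page .= "  <form>\n    <block>\n";
    $page .= '      <prompt>' . htmlspecialchars($promptText) . "</prompt>\n";
    $page .= "    </block>\n  </form>\n</vxml>\n";
    return $page;
}

// Assumption: text/xml is acceptable to the VoiceXML engine.
if (!headers_sent()) {
    header('Content-Type: text/xml');
}
echo vxml_page('Hello from PHP and VoiceXML.');
```

Opening such a page in a browser shows the raw XML, which is a quick way to spot PHP errors before pointing a call route at it.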

3.2 Connection to SAP Web AS

Lines 6 to 23: This part is the same as the given example file (example_userlist.php), but you should edit each value to match your own SAP system.
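
For orientation, a connection with the SAPRFC extension typically looks like the sketch below. It is not a copy of example_userlist.php: every logon value is a placeholder, and connect_to_sap is a hypothetical wrapper added so the script still runs (and reports the problem) when the extension is missing.

```php
<?php
// Placeholder logon data: replace every value with your own system's.
$login = array(
    'ASHOST' => 'your.sap.host', // application server host (placeholder)
    'SYSNR'  => '00',            // system number (placeholder)
    'CLIENT' => '000',           // client used in this demo
    'USER'   => 'DAVID',         // demo user copied from BCUSER
    'PASSWD' => 'secret',        // placeholder password
);

// Hypothetical wrapper: returns an RFC handle, or false plus a message.
function connect_to_sap($login, &$message)
{
    if (!function_exists('saprfc_open')) {
        $message = 'saprfc extension not loaded';
        return false;
    }
    $rfc = saprfc_open($login);
    if (!$rfc) {
        $message = 'RFC connection failed: ' . saprfc_error();
        return false;
    }
    $message = 'connected';
    return $rfc;
}

$rfc = connect_to_sap($login, $message);
echo $message, "\n";
if ($rfc) {
    saprfc_close($rfc); // always release the connection
}
```

Keeping the logon data in one array makes it easy to see which values need to change for your own SAP Web AS.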

3.3 Grammar file

Once the SAPRFC connection is established, the source code creates the list of SAP users in client 000.
For instance,
BCUSER | DAVID | DDIC | TMSADM
I’d like to demonstrate two different grammar formats in this demo.
The first is the Java Speech Grammar Format (JSGF).
Krishnakumar wrote the excellent weblog “VoiceXML with XI: Speak out loud so you could be heard”, which explains VoiceXML very well and shows another example of JSGF.
The second is the grXML grammar format.
For more information, please refer to http://docs.voxeo.com/voicexml/2.0/ and http://www.w3.org/Voice/2003/srgs-ir/
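
To make the two formats concrete, here is a minimal sketch that turns a list of user names into JSGF-style alternatives and into a grXML (SRGS XML) grammar. The function names, the rule id sapuser and the en-US language tag are assumptions for illustration; in the demo the names come from SAP, whereas here they are hard-coded so that the script runs on its own.

```php
<?php
// JSGF-style alternatives, e.g. "BCUSER | DAVID | DDIC | TMSADM"
function jsgf_alternatives($names)
{
    return implode(' | ', $names);
}

// grXML (SRGS XML) grammar with one <item> per name
function grxml_grammar($names, $ruleId)
{
    $xml  = '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"' . "\n";
    $xml .= '         xml:lang="en-US" mode="voice" root="' . $ruleId . '">' . "\n";
    $xml .= '  <rule id="' . $ruleId . '">' . "\n";
    $xml .= "    <one-of>\n";
    foreach ($names as $name) {
        // Escape each name so special characters cannot break the XML.
        $xml .= '      <item>' . htmlspecialchars($name) . "</item>\n";
    }
    $xml .= "    </one-of>\n  </rule>\n</grammar>\n";
    return $xml;
}

// Hard-coded stand-in for the user list read from client 000.
$users = array('BCUSER', 'DAVID', 'DDIC', 'TMSADM');
echo jsgf_alternatives($users), "\n";
echo grxml_grammar($users, 'sapuser');
```

Both functions express the same thing, a choice between the listed names; the grXML form is more verbose but is plain XML, so it can be generated and checked with ordinary XML tools.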

3.4 VoiceXML Variables and IF statement

[1] shows one of the VoiceXML variable formats. This variable contains the phrase from the grammar that the caller spoke and the system recognized.
[2] shows how the IF statement works inside VoiceXML.
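
A minimal sketch of how a field variable and an IF statement can work together is shown below. It is an illustration under assumptions, not the code from the original screenshots: the form id, the field name user, the prompts and the helper login_form are all invented for this example, and the grammar is passed in as ready-made text.

```php
<?php
// Hypothetical helper: a VoiceXML form whose field variable "user" is
// filled by the recognizer and then tested with <if>.
function login_form($grammarXml)
{
    return <<<VXML
<form id="login">
  <field name="user">
    <prompt>Please say your user name.</prompt>
    $grammarXml
    <filled>
      <if cond="user == 'DAVID'">
        <prompt>Good morning, <value expr="user"/>.</prompt>
      <else/>
        <prompt>You said <value expr="user"/>.</prompt>
      </if>
    </filled>
  </field>
</form>

VXML;
}

// Placeholder comment where a JSGF or grXML grammar would be inserted.
echo login_form('<!-- grammar goes here -->');
```

Inside the filled element, the field name acts as the variable holding the recognized value, so the same name can be used in the cond attribute and in the value element.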

4 Case Studies

You can easily find many case studies related to VoiceXML; here I list some that impressed me. Some of them fall outside the scope of VoiceXML itself, but I think they are still worth reading with the future in mind.

4.1 In the Lab
  • The SNOW (Services for Nomadic Workers) project (CHEVASSUS, 2005) aims to support nomadic workers in performing maintenance and production tasks. The aeronautic maintenance business was selected for this project. Speech, gestures and handwriting were considered as input modes.
  • Mikael Drugge et al. (2006) conducted research on nursing home care. By using multimodal applications, nurses can simplify work such as taking notes on patients’ status and inserting new tasks into their schedules or charts.
4.2 In the Commercial Areas
  • IBM and Teges Corporation built speech-enabled applications at Miami Children’s Hospital in 2005 (Owens, 2005). These applications were deployed in areas such as the Cardiac Intensive Care Unit (CICU) and the Operating Room (OP), where clinicians’ hands are busy with surgery or other tasks, or where they cannot use computers because of hygiene requirements.
  • SAP successfully tested a multimodal application in a warehouse environment (Samir and Janaki Mythily, 2006). The multimodal inputs and outputs helped increase the warehouse workers’ productivity.
4.3 In the Movie

In the movie I, Robot, when Detective Del Spooner went outside, he met a FedEx robot.
While he was driving a car (?), he tried to access the U.S. Robotics system by voice command.
Doctor Susan Calvin tried to play music on a JVC audio player with the voice command “Play”.
VIKI (Virtual Interactive Kinetic Intelligence) is everywhere inside the U.S.R. organization. I think this is a kind of Multimodal Artificial Intelligence product for Heterogeneous ERP systems.

4.4 In my mind

Well, most of these cases look too expensive to build right now. However, after completing this demo, you may feel how easy they are (?). I can certainly imagine useful business cases such as employee clock-in/out and automated call answering and transfer. Many call centres have already started to benefit from this kind of technology.

5 Conclusion

To sum up, a lot of voice technologies are out there. Unfortunately, it has been difficult to practice them in our own development environments. With Prophecy, PHP and SAP Web AS, we can get our hands dirty easily and without special costs (domestic phone calls, international phone calls or license fees). Whether the organization is large or small, I believe voice technologies can reduce business costs and provide better services to employees and customers alike.
If you are interested, I strongly recommend visiting http://www.kenrehor.com/voicexml/
If you want to implement a voice service in your organization, have a close look at the sections “VoiceXML Voice Service Providers” and “VoiceXML-based Deployed Applications”.

6 Asking for help

There is more than one language in the world. On Ken Rehor’s World of VoiceXML website, in the VoiceXML Development Tools section, most of the tools seem to be based on English (maybe, maybe not).
I’d like to build a table sorted by language (if possible), so I am asking for your help (I guess this is an advantage of an international SDN).

Language | Vendor | Description | Update/contributor
Korean | Korea Telecom | VoiceXML 1.0, PC-based VoiceXML tool | 2006.08.11/David Kang
Korean | Widerthan | VoiceXML 1.0, PC-based VoiceXML tool. Test phones don’t work. | 2006.08.11/David Kang
Language | many companies | I’m waiting for you ^__^ | yyyy.mm.dd/contributor
7 Next Weblog

Multimodal Application: http://mmi.mobilian.org/ (this demo is based on a multimodal markup language, XHTML+Voice). For more information, please visit http://www.w3.org/2002/mmi/

8 Bibliographies

How about now? What do you think of voice technologies? Please vote (after reading this weblog).

2 Comments

  1. Darren Hague
    Excellent blog, David – this is amazing technology.

    All of the examples I have seen so far have been based on the user calling the system, but at the Tech Ed 2006 demo, the system called Shai in response to a workflow notification. Do you have any idea how this might be done?

    Regards,
    Darren
