Aural Interfaces to Databases based on VoiceXML

Presentation given at VDB6, 6th IFIP Workshop on Visual Database Systems, Brisbane, Australia.

ABSTRACT: As part of a general framework for the development of global information systems, we include support for the development of aural interfaces. The framework uses an object-oriented database for the management of application, document content and presentation data. The access layer is based around an XML server and XSLT for document generation from default and customised templates. Specifically, aural interfaces are supported through a VoiceXML server that provides the speech recognition and synthesis mechanisms, together with XSLT templates for the generation of VoiceXML. In this paper, we describe the implementation of a generic voice browser for application databases as well as the development of a customised aural interface for a community diary managing appointments and events.

Research paper: https://beatsigner.com/publications/signer_VDB6.pdf

Beat Signer

May 29, 2002

Transcript

  1. Aural Interfaces to Databases based on VoiceXML

    Beat Signer, Moira C. Norrie, Peter Geissbuehler and Daniel Heiniger
    Global Information Systems Group, Department of Computer Science, ETH Zurich, Switzerland
  2. Outline

    Motivation
    Architecture
    Voice Interfaces
    Application Development
  3. Avalanche Forecasting System

    Project to provide WAP and Voice Access
  4. Avalanche Forecasting System ...

    Information model (OM model) for SLF forecast data
    Application user interfaces for WAP and voice access
      national bulletin with maps and glossary
      local bulletin based on a region's start letter, GPS or Swiss Coordinates
      WAP responses for voice requests (mixed-mode) or triggered events
  5. Requirements

    Platform supporting universal client access to databases
      → eXtensible Information Management Architecture (XIMA)
    Use of a technology which allows the separation of content and presentation
      → XML and XSL
    Minimise effort to support new types of client devices, e.g. XML, HTML, WML, CHTML, VXML, ?
  6. XIMA

    [Architecture diagram]
    Components: OMS Java API; OMS Java Workspace; XML Server; Main Entry Servlet; HTML Servlet, WML Servlet, VXML Servlet; HTML Browser, WML Browser, VXML Browser
    Annotations: Delegation; Builds XML based on JDOM; XML + XSLT → Response
    OM Model: Collections, Associations, multiple inheritance and multiple instantiation
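
The "Delegation" and "XML + XSLT → Response" annotations on the XIMA slide can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration: the class name `ClientDispatchSketch`, the client type keys and the three toy stylesheets are hypothetical, and the real XIMA servlets and templates are not shown in the slides. It only uses the standard `javax.xml.transform` API to show how one XML response could be rendered differently per client type.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: pick a stylesheet per client type, then apply XML + XSLT. */
final class ClientDispatchSketch {

    private static final String XSL_HEAD =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>";
    private static final String XSL_TAIL = "</xsl:template></xsl:stylesheet>";
    // Selects the "name" attribute of the first instance in the XML response.
    private static final String NAME =
        "<xsl:value-of select=\"oms/instance/attribute[@name='name']/string\"/>";

    // Toy stylesheets (assumptions); they stand in for the default XIMA templates.
    private static final Map<String, String> STYLESHEETS = Map.of(
        "html", XSL_HEAD + "<p>" + NAME + "</p>" + XSL_TAIL,
        "wml", XSL_HEAD + "<wml><card><p>" + NAME + "</p></card></wml>" + XSL_TAIL,
        "vxml", XSL_HEAD + "<vxml version='2.0'><form><block>" + NAME
              + "</block></form></vxml>" + XSL_TAIL);

    /** Delegates to the stylesheet registered for the client type and returns the response. */
    static String render(String clientType, String omsXml) {
        String stylesheet = STYLESHEETS.get(clientType);
        if (stylesheet == null) {
            throw new IllegalArgumentException("Unsupported client type: " + clientType);
        }
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(omsXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<oms><instance id='OM_4077' type='person'>"
                   + "<attribute name='name'><string>Moira Norrie</string></attribute>"
                   + "</instance></oms>";
        System.out.println(render("html", xml));
        System.out.println(render("vxml", xml));
    }
}
```

Under these assumptions, supporting a new client device amounts to registering one more stylesheet, which matches the stated requirement of minimising the effort per device type.
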
  7. XML Response valid?

    XML Response

      <?xml version="1.0" encoding="ISO-8859-1"?>
      <oms>
        <instance id="OM_4077" last="true" pos="1" type="person">
          <dressedWith type="person"/>
          <attribute name="name">
            <string>Moira Norrie</string>
          </attribute>
          …
          <attribute name="picture">
            <mime>/globis/staff/moira.jpg</mime>
          </attribute>
          <method name="age"/>
          …
          <link idref="OM_2693" inv="false" name="Workplace"/>
        </instance>
        …
      </oms>

    XML Schema

      <xsd:element name="oms">
        <xsd:complexType>
          <xsd:choice minOccurs="0" maxOccurs="unbounded">
            <xsd:element name="workspace" type="workspaceType"/>
            <xsd:element name="instance" type="instanceType"/>
            <xsd:element name="collection" type="collectionType"/>
            <xsd:element name="association" type="associationType"/>
            <xsd:element name="result" type="resultType"/>
            <xsd:element ref="warning"/>
          </xsd:choice>
        </xsd:complexType>
      </xsd:element>
      <xsd:complexType name="instanceType">
        <xsd:sequence>
          <xsd:element name="dressedWith" type="dressedWithType" …>
          …
          <xsd:element name="link" type="linkType" minOccurs="0" …>
        </xsd:sequence>
        <xsd:attribute name="id" type="xsd:string" use="required"/>
        …
      </xsd:complexType>
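
The validity check raised on this slide could be sketched in Java with the standard `javax.xml.validation` API. The schema below is a deliberately reduced, hypothetical stand-in for the one on the slide (whose type definitions are partly elided): it keeps only the `oms` root, the `instance` element and the required `id` attribute, and accepts any child content. The class and method names are illustrative assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical sketch: validate an XML response against a reduced schema. */
final class ResponseValidationSketch {

    // Reduced schema (assumption): only "instance" with a required "id" is modelled.
    private static final String REDUCED_SCHEMA =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='oms'>"
      + "<xsd:complexType>"
      + "<xsd:choice minOccurs='0' maxOccurs='unbounded'>"
      + "<xsd:element name='instance' type='instanceType'/>"
      + "</xsd:choice>"
      + "</xsd:complexType>"
      + "</xsd:element>"
      + "<xsd:complexType name='instanceType'>"
      + "<xsd:sequence>"
      + "<xsd:any processContents='skip' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xsd:sequence>"
      + "<xsd:attribute name='id' type='xsd:string' use='required'/>"
      + "<xsd:attribute name='type' type='xsd:string'/>"
      + "</xsd:complexType>"
      + "</xsd:schema>";

    /** Compiles the reduced schema; a failure here is a bug in the sketch itself. */
    private static Schema loadSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(REDUCED_SCHEMA)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema could not be compiled", e);
        }
    }

    /** Returns true if the document is well-formed and valid against the reduced schema. */
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed XML or schema violation
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<oms><instance id='OM_4077' type='person'/></oms>"));
        System.out.println(isValid("<oms><instance type='person'/></oms>"));
    }
}
```

The second call in `main` omits the required `id` attribute, so it is the case the sketch is meant to reject.
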
  8. VoiceXML

    Development: IBM WebSphere Voice Server SDK
    Deployment: BeVocal Cafe
    Voice Portal [diagram]
      Speech Recogniser: converts voice input into text (speech model)
      Language Analyser: extracts meaning from text (grammar)
      Application Server: gets data (text) from database (application database)
      Speech Synthesiser: generates speech output (pronunciation rules)
      Flow labels: Voice Input, Speech, Text, Meaning, Text, Speech, Voice Output
  9. VoiceXML ...

    VoiceXML is an application of XML
    Describes call flows and human-machine dialogues
    Use advantages of web-based development and content delivery to build interactive voice response applications

    Hello World Example

      <?xml version="1.0" encoding="ISO-8859-1"?>
      <vxml version="2.0">
        <form id="f1">
          <block>Hello World</block>
        </form>
      </vxml>
  10. XML to VXML Example

    XML Response

      <?xml version="1.0" encoding=… ?>
      <oms>
        <instance id="OM_4077" type="person" …>
          <dressedWith type="person"/>
          <attribute name="name">
            <string>Moira Norrie</string>
          </attribute>
          …
          <method name="age"/>
          …
        </instance>
      </oms>

    XSLT Stylesheet

      <xsl:template match="instance">
        <form id="instance_entry">
          <block>
            <xsl:choose>
              <xsl:when test="count(dressedWith)=1">
                Object
                <xsl:call-template name="removeUnderscore">
                  <xsl:with-param name="label" select="@id"/>
                </xsl:call-template>
                is dressed with type <xsl:value-of select="./@type"/>
              </xsl:when>
              …
      </xsl:template>
      …

    VXML Result

      <?xml version="1.0" encoding="ISO-8859-1"?>
      <vxml application="http://macbain/xima/omsmain_root.vxml" version="2.0">
        <form id="instance_entry"><block>
          Object 4077 is dressed with type person and is viewed as type person.
          <prompt>It contains 8 attributes, 5 links, and 1 method</prompt>
          <goto next="#instance_process"/></block></form>
        <form id="instance_process"><field name="Member_Choice"><prompt>Would you like to hear the attributes, the links or the methods or go back?</prompt>
        …
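
The XML to VXML step on this slide can be tried out with a small, self-contained Java sketch. It is not the stylesheet from the talk: the slide elides most of the template, so the version below is a simplified reconstruction under stated assumptions. In particular, the body of `removeUnderscore` is not shown in the slides; it is assumed here to return the part of the id after the underscore (via `substring-after`), which would be consistent with `OM_4077` being spoken as "Object 4077". The prompt wording and the class name `InstanceToVxmlSketch` are also hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: turn an OMS instance response into a VoiceXML form. */
final class InstanceToVxmlSketch {

    // Simplified stylesheet (assumption); only the dressedWith branch is modelled.
    private static final String INSTANCE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/oms'>"
      + "<vxml version='2.0'><xsl:apply-templates select='instance'/></vxml>"
      + "</xsl:template>"
      + "<xsl:template match='instance'>"
      + "<form id='instance_entry'><block>"
      + "<xsl:text>Object </xsl:text>"
      + "<xsl:call-template name='removeUnderscore'>"
      + "<xsl:with-param name='label' select='@id'/>"
      + "</xsl:call-template>"
      + "<xsl:if test='count(dressedWith)=1'>"
      + "<xsl:text> is dressed with type </xsl:text><xsl:value-of select='@type'/>"
      + "</xsl:if>"
      + "<prompt>Attributes: <xsl:value-of select='count(attribute)'/>"
      + ", links: <xsl:value-of select='count(link)'/>"
      + ", methods: <xsl:value-of select='count(method)'/></prompt>"
      + "</block></form>"
      + "</xsl:template>"
      // Assumed behaviour: keep only the text after the first underscore of the id.
      + "<xsl:template name='removeUnderscore'>"
      + "<xsl:param name='label'/>"
      + "<xsl:value-of select=\"substring-after($label, '_')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Sample response assembled from the fragments visible on the slides.
    static final String SAMPLE =
        "<oms><instance id='OM_4077' type='person'><dressedWith type='person'/>"
      + "<attribute name='name'><string>Moira Norrie</string></attribute>"
      + "<attribute name='picture'><mime>/globis/staff/moira.jpg</mime></attribute>"
      + "<method name='age'/>"
      + "<link idref='OM_2693' inv='false' name='Workplace'/>"
      + "</instance></oms>";

    /** Applies the sketch stylesheet to an OMS XML response and returns the VoiceXML text. */
    static String toVxml(String omsXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(INSTANCE_XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(omsXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toVxml(SAMPLE));
    }
}
```

The counts in the prompt are computed with XPath `count()`, which is one plausible way to obtain numbers such as the "8 attributes, 5 links, and 1 method" in the slide's result; the slide itself does not show how they are derived.
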
  11. Design Phase

    Define the required functionality
    User analysis
      motivation, expertise
    High level decisions
      full-duplex (barge-in)
      simple grammars (dynamic)
      only synthesised speech (TTS)
    Representation of base types
    Information flow
  12. [Dialogue flow diagram] associations, collections, objects

    The database contains #Collections #Associations
    Would you like to go to the collections, to the associations, directly to an object or back to the main menu?
    The database contains the following # associations
    Choose an association
    Association 'name' contains #A
    Would you like to list the members or go back?
    Association 'name' contains the following # associations
    Choose a 'domaintype' or a 'rangetype' or say back
    Object 'oID' is dressed with type 'type' and currently viewed as type 'type'. It contains #Attr, #Links, #Methods
    Choose a link or say back
    The object contains the following # attributes
    Would you like to hear the attributes, the links or the methods, change the type or go back?
    You can choose among the following links
    You can choose among the following methods
    You can view the object as the following types
    The database contains the following # collections
    Choose a collection
    Collection 'name' contains #M
    Would you like to list the members or go back?
    Collection 'name' contains the following # members
    Choose one of the members
    The database contains #Objects
    Choose an object or say back
    Choose a method or say back
    Choose one of the types or say back
    The result of the method is Result
  13. Test and Refinement Phase

    Recognition problems
      elimination of similar sounding words from the grammar
      addition of optional words to the grammar (e.g. "please")
    Insufficient help functionality
      introduction of prompt-specific help instead of always active command list
    Immediate feedback after input has been processed ("OK" prompt)
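
The effect of adding optional words such as "please" to a grammar can be illustrated outside VoiceXML with a small Java sketch. This is only an analogy: in a voice application the optional words would be declared in the speech grammar itself, whereas here a regular expression over already recognised text plays that role. The command set is taken from the prompt "Would you like to hear the attributes, the links or the methods or go back?"; the class name and the exact pattern are assumptions.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: a command "grammar" with an optional polite word. */
final class OptionalWordGrammarSketch {

    // "please" may appear before or after the command, or not at all.
    private static final Pattern COMMAND = Pattern.compile(
        "(?:please\\s+)?(attributes|links|methods|back)(?:\\s+please)?");

    /** Returns the recognised command, or empty if the utterance is not in the grammar. */
    static Optional<String> interpret(String utterance) {
        Matcher matcher = COMMAND.matcher(utterance.trim().toLowerCase(Locale.ROOT));
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(interpret("links"));
        System.out.println(interpret("Links please"));
        System.out.println(interpret("please go home"));
    }
}
```

Without the two optional groups, an utterance like "links please" would fall outside the grammar and be rejected, which is the kind of recognition problem the slide describes.
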
  14. OMS Database Development Suite

    OM: Semantic Object Data Model (Application Modelling)
    OMS Pro: Rapid Prototyping System and Lightweight DBMS (Database and Application Design)
    OMS Java: Data Management System and Application Framework (Implementation)
  15. XIMA Application Development

    Prototype the application's information model in prototyping system OMS Pro
    Export model (and data) to OMS Java
    Installation of XML Server with default XSLT stylesheets and servlets
      database immediately accessible by generic object browser
    Customisation of stylesheets
  16. Conclusions

    Database driven development of voice-enabled applications
    Rapid prototyping supported by OMS Pro and XIMA's generic object browser
    Multi-mode access provided by generic object browser (HTML, WAP, VXML)
    Customised user interfaces (stepwise refinement of XSLT stylesheets)
    New potential user communities
  17. Questions?