Aural Interfaces to Databases based on VoiceXML

Presentation given at VDB6, 6th IFIP Workshop on Visual Database Systems, Brisbane, Australia.

ABSTRACT: As part of a general framework for the development of global information systems, we include support for the development of aural interfaces. The framework uses an object-oriented database for the management of application, document content and presentation data. The access layer is based around an XML server and XSLT for document generation from default and customised templates. Specifically, aural interfaces are supported through a VoiceXML server that provides the speech recognition and synthesis mechanisms, together with XSLT templates for the generation of VoiceXML. In this paper, we describe the implementation of a generic voice browser for application databases as well as the development of a customised aural interface for a community diary managing appointments and events.

Research paper: https://beatsigner.com/publications/signer_VDB6.pdf

Beat Signer

May 29, 2002

Transcript

  1. Global Information Systems Group
    Department of Computer Science
    ETH Zurich, Switzerland
    Aural Interfaces to Databases
    based on VoiceXML
    Beat Signer, Moira C. Norrie,
    Peter Geissbuehler and Daniel Heiniger

  2. Outline
    Motivation
    Architecture
    Voice Interfaces
    Application Development

  3. Avalanche Forecasting System
    Project to provide WAP and voice access

  4. Avalanche Forecasting System ...
    Information model (OM model) for SLF forecast data
    Application user interfaces for WAP and voice access
      national bulletin with maps and glossary
      local bulletin based on a region's start letter, GPS or Swiss Coordinates
    WAP responses for voice requests (mixed-mode) or triggered events

  5. Requirements
    Platform supporting universal client access to databases
    → eXtensible Information Management Architecture (XIMA)
    Use of a technology which allows the separation of content and presentation
    → XML and XSL
    Minimise effort to support new types of client devices, e.g. XML, HTML, WML, CHTML, VXML, ?

  6. XIMA
    (Architecture diagram; labels recovered from the slide)
    OMS Java API
    OMS Java Workspace
    XML Server (builds XML based on JDOM)
    HTML Servlet, WML Servlet, VXML Servlet (XML + XSLT → Response)
    HTML Browser, WML Browser, VXML Browser
    Main Entry Servlet (delegation)
    OM Model: collections, associations, multiple inheritance and multiple instantiation
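
The delegation step in this diagram can be illustrated with a rough Python sketch: a main entry function picks an output format and hands the same data to a format-specific renderer. The routing rule (matching MIME types in an `Accept` header), the MIME-type table, the `<object><name>` input shape and all function names are assumptions made for this sketch. The system on the slide is built on Java servlets, JDOM and XSLT, none of which are reproduced here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical mapping from MIME types in an Accept header to output formats.
FORMATS = {
    "text/html": "html",
    "text/vnd.wap.wml": "wml",
    "application/voicexml+xml": "vxml",
}


def choose_format(accept_header: str, default: str = "html") -> str:
    """Return the first output format whose MIME type appears in the header."""
    for part in accept_header.split(","):
        mime = part.split(";")[0].strip().lower()
        if mime in FORMATS:
            return FORMATS[mime]
    return default


def render_html(name: str) -> str:
    return f"<html><body><h1>{name}</h1></body></html>"


def render_wml(name: str) -> str:
    return f"<wml><card><p>{name}</p></card></wml>"


def render_vxml(name: str) -> str:
    return f'<vxml version="2.0"><form><block>{name}</block></form></vxml>'


# One renderer per client type; supporting a new client means adding one entry.
RENDERERS = {"html": render_html, "wml": render_wml, "vxml": render_vxml}


def handle_request(accept_header: str, xml_response: str) -> str:
    """Main entry point: pick a format, then delegate to its renderer."""
    obj = ET.fromstring(xml_response)
    name = escape(obj.findtext("name", default=""))
    return RENDERERS[choose_format(accept_header)](name)
```

In this sketch the data source stays the same for every client; only the final rendering step differs, which mirrors the separation of content and presentation named in the requirements.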

  7. XML Response
    XML Response validated against an XML Schema ("valid?")
    (Markup lost in extraction; surviving values: "Moira Norrie", "/globis/staff/moira.jpg")

  8. VoiceXML
    Development: IBM WebSphere Voice Server SDK
    Deployment: BeVocal Cafe Voice Portal
    Voice input (speech)
    → Speech Recogniser: converts voice input into text (speech model)
    → Language Analyser: extracts meaning from text (grammar)
    → Application Server: gets data (text) from the application database
    → Speech Synthesiser: generates speech output (pronunciation rules)
    → Voice output (speech)
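
The four processing stages on this slide can be mimicked with a toy Python pipeline. Audio is replaced by plain strings, and the grammar, the database content and all function names are invented for illustration; this is a sketch of the data flow only, not of any real speech engine.

```python
# Toy stand-ins for the four stages; a real system processes audio, not strings.
GRAMMAR = {
    "collections": "LIST_COLLECTIONS",
    "associations": "LIST_ASSOCIATIONS",
}
DATABASE = {
    "LIST_COLLECTIONS": "The database contains 3 collections.",
    "LIST_ASSOCIATIONS": "The database contains 2 associations.",
}


def recognise(voice_input: str) -> str:
    """Speech recogniser: voice input -> text (here: normalise a transcript)."""
    return voice_input.strip().lower()


def analyse(text: str):
    """Language analyser: text -> meaning, looked up in the grammar."""
    for word in text.split():
        if word in GRAMMAR:
            return GRAMMAR[word]
    return None


def fetch(meaning) -> str:
    """Application server: meaning -> text taken from the application database."""
    return DATABASE.get(meaning, "Sorry, I did not understand.")


def synthesise(text: str) -> str:
    """Speech synthesiser: text -> speech (here: text wrapped in a marker)."""
    return f"<speech>{text}</speech>"


def handle_utterance(voice_input: str) -> str:
    """Chain the stages: voice -> text -> meaning -> text -> voice."""
    return synthesise(fetch(analyse(recognise(voice_input))))
```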

  9. VoiceXML ...
    VoiceXML is an application of XML
    Describes call flows and human-machine dialogues
    Uses the advantages of web-based development and content delivery to build interactive voice response applications
    Hello World Example (markup lost in extraction; prompt text: "Hello World")
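
The markup of the Hello World example is not preserved in this transcript. As a stand-in, the Python sketch below builds a minimal VoiceXML document of the commonly used shape: a `vxml` root with a `form` whose `block` holds the text to be spoken. The `version` value is an assumption, the namespace declaration is omitted for brevity, and the result is not claimed to match the slide's original listing.

```python
import xml.etree.ElementTree as ET


def hello_world_vxml(text: str = "Hello World") -> str:
    """Build a minimal VoiceXML document that speaks a single line of text."""
    vxml = ET.Element("vxml", {"version": "2.0"})  # version is an assumption
    form = ET.SubElement(vxml, "form")
    block = ET.SubElement(form, "block")
    block.text = text
    return ET.tostring(vxml, encoding="unicode")


print(hello_world_vxml())
# prints: <vxml version="2.0"><form><block>Hello World</block></form></vxml>
```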


  10. XML to VXML Example
    XML Response (markup lost in extraction; surviving value: "Moira Norrie")
    XSLT Stylesheet (markup lost in extraction; surviving template text: "Object ... is dressed with type ...")
    VXML Result
    Object 4077 is dressed with type person and is viewed as type person. It contains 8 attributes, 5 links, and 1 method.
    Would you like to hear the attributes, the links or the methods or go back?
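
To make the XML-to-VXML step concrete, here is a hedged Python sketch that turns a small XML description of an object into a VoiceXML prompt with the wording shown in the VXML result. It does the mapping by hand with `xml.etree.ElementTree` rather than with XSLT, because the Python standard library has no XSLT processor. The input element and attribute names (`object`, `dress`, `view`, `count`) are invented, since the slide's original markup is not preserved.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML response; element and attribute names are invented.
XML_RESPONSE = """
<object id="4077" dress="person" view="person">
  <attributes count="8"/>
  <links count="5"/>
  <methods count="1"/>
</object>
"""


def plural(n: int, noun: str) -> str:
    """Return e.g. '1 method' or '5 links'."""
    return f"{n} {noun}" if n == 1 else f"{n} {noun}s"


def to_vxml(xml_response: str) -> str:
    """Map the object description to a VoiceXML form with a summary and a question."""
    obj = ET.fromstring(xml_response)
    counts = {
        tag: int(obj.find(tag).get("count"))
        for tag in ("attributes", "links", "methods")
    }
    summary = (
        f"Object {obj.get('id')} is dressed with type {obj.get('dress')} "
        f"and is viewed as type {obj.get('view')}. "
        f"It contains {plural(counts['attributes'], 'attribute')}, "
        f"{plural(counts['links'], 'link')}, "
        f"and {plural(counts['methods'], 'method')}."
    )
    vxml = ET.Element("vxml", {"version": "2.0"})  # version is an assumption
    form = ET.SubElement(vxml, "form")
    ET.SubElement(form, "block").text = summary
    field = ET.SubElement(form, "field", {"name": "choice"})
    ET.SubElement(field, "prompt").text = (
        "Would you like to hear the attributes, the links or the methods or go back?"
    )
    return ET.tostring(vxml, encoding="unicode")
```

A real deployment would also attach a grammar to the `field` so that the recogniser knows which answers to accept; that part is left out here.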

  11. Design Phase
    Define the required functionality
    User analysis
    motivation, expertise
    High level decisions
    full-duplex (barge-in)
    simple grammars (dynamic)
    only synthesised speech (TTS)
    Representation of base types
    Information flow

  12. Dialogue flow (diagram) with three branches: collections, associations, objects
    Main menu
      The database contains #Collections #Associations. Would you like to go to the collections, to the associations, directly to an object or back to the main menu?
    Collections
      The database contains the following # collections. Choose a collection
      Collection 'name' contains #M. Would you like to list the members or go back?
      Collection 'name' contains the following # members. Choose one of the members
    Associations
      The database contains the following # associations. Choose an association
      Association 'name' contains #A. Would you like to list the members or go back?
      Association 'name' contains the following # associations. Choose a 'domaintype' or a 'rangetype' or say back
    Objects
      The database contains #Objects. Choose an object or say back
      Object 'oID' is dressed with type 'type' and currently viewed as type 'type'. It contains #Attr, #Links, #Methods. Would you like to hear the attributes, the links or the methods, change the type or go back?
      The object contains the following # attributes
      You can choose among the following links. Choose a link or say back
      You can choose among the following methods. Choose a method or say back
      The result of the method is Result
      You can view the object as the following types. Choose one of the types or say back
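
The prompts on this slide describe a menu-driven dialogue that can be read as a small state machine. The Python sketch below is one possible reading: the state names, the transition table and the pseudo-command `choose` (standing for "the caller names one of the listed items") are assumptions, not taken from the slide.

```python
# One possible reading of the dialogue flow; states and commands are assumptions.
TRANSITIONS = {
    "database": {"collections": "collections", "associations": "associations",
                 "objects": "objects"},
    "collections": {"choose": "collection", "back": "database"},
    "collection": {"members": "members", "back": "collections"},
    "members": {"choose": "object", "back": "collection"},
    "associations": {"choose": "association", "back": "database"},
    "association": {"members": "pairs", "back": "associations"},
    "pairs": {"choose": "object", "back": "association"},
    "objects": {"choose": "object", "back": "database"},
    "object": {"attributes": "attributes", "links": "links",
               "methods": "methods", "types": "types", "back": "database"},
    "attributes": {"back": "object"},
    "links": {"choose": "object", "back": "object"},
    "methods": {"choose": "result", "back": "object"},
    "types": {"choose": "object", "back": "object"},
    "result": {"back": "object"},
}


def next_state(state: str, command: str) -> str:
    """Return the follow-up state; unknown commands keep the current state."""
    return TRANSITIONS.get(state, {}).get(command, state)


def run(commands, start: str = "database"):
    """Replay a list of spoken commands and return every state visited."""
    state = start
    visited = [state]
    for command in commands:
        state = next_state(state, command)
        visited.append(state)
    return visited
```

Staying in the current state on an unrecognised command corresponds to repeating the prompt, which is a common fallback in voice dialogues.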

  13. Test and Refinement Phase
    Recognition problems
      elimination of similar sounding words from the grammar
      addition of optional words to the grammar (e.g. "please")
    Insufficient help functionality
      introduction of prompt-specific help instead of always active command list
    Immediate feedback after input has been processed ("OK" prompt)
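
The effect of adding optional words to a grammar can be imitated on recognised text. In the Python sketch below, filler words are dropped before the remaining word is compared with the command list; the word lists and the function name are assumptions, and a real VoiceXML grammar would express the same idea declaratively inside the recogniser.

```python
# Hypothetical word lists; a real grammar would be defined for the recogniser.
OPTIONAL_WORDS = {"please", "the", "go", "to"}
COMMANDS = {"attributes", "links", "methods", "back"}


def match_command(utterance: str):
    """Return the command in the utterance, ignoring optional filler words."""
    words = [w.strip(".,!?") for w in utterance.lower().split()]
    content = [w for w in words if w and w not in OPTIONAL_WORDS]
    if len(content) == 1 and content[0] in COMMANDS:
        return content[0]
    return None  # no command, or an ambiguous utterance


print(match_command("The links, please!"))  # links
print(match_command("go back"))             # back
print(match_command("links and methods"))   # None
```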

  14. OMS Database Development Suite
    OM: Semantic Object Data Model (Application Modelling)
    OMS Pro: Rapid Prototyping System and Lightweight DBMS (Database and Application Design)
    OMS Java: Data Management System and Application Framework (Implementation)

  15. XIMA Application Development
    Prototype the application's information model in prototyping system OMS Pro
    Export model (and data) to OMS Java
    Installation of XML Server with default XSLT stylesheets and servlets
      database immediately accessible by generic object browser
    Customisation of stylesheets
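
The step from default stylesheets to customised ones can be pictured as a lookup with a fallback: a customised template is used where one exists, and the default template for the output format is used otherwise. The Python sketch below assumes that templates are selected by output format and object type; that selection key and all file names are hypothetical.

```python
# Hypothetical template registry; file names and lookup keys are invented.
DEFAULT_TEMPLATES = {
    "html": "default-html.xsl",
    "wml": "default-wml.xsl",
    "vxml": "default-vxml.xsl",
}
CUSTOM_TEMPLATES = {
    ("vxml", "appointment"): "diary-appointment-vxml.xsl",
}


def select_stylesheet(fmt: str, object_type: str,
                      custom=CUSTOM_TEMPLATES, default=DEFAULT_TEMPLATES) -> str:
    """Prefer a customised stylesheet, otherwise fall back to the format default."""
    key = (fmt, object_type)
    if key in custom:
        return custom[key]
    return default[fmt]
```

With such a lookup, a freshly installed database is browsable through the defaults alone, and each customised stylesheet refines one view at a time.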

  16. Conclusions
    Database driven development of voice-enabled applications
    Rapid prototyping supported by OMS Pro and XIMA's generic object browser
    Multi-mode access provided by generic object browser (HTML, WAP, VXML)
    Customised user interfaces (stepwise refinement of XSLT stylesheets)
    New potential user communities
