Voice Extensible Markup Language (VoiceXML) Version 2.0

W3C Recommendation 16 March 2004

This Version:
http://www.w3.org/TR/2004/REC-voicexml20-20040316/
Latest Version:
http://www.w3.org/TR/voicexml20/
Previous Version:
http://www.w3.org/TR/2004/PR-voicexml20-20040203/
Editors:
Scott McGlashan, Hewlett-Packard (Editor-in-Chief)
Daniel C. Burnett, Nuance Communications
Jerry Carter, Invited Expert
Peter Danielsen, Lucent (until October 2002)
Jim Ferrans, Motorola
Andrew Hunt, ScanSoft
Bruce Lucas, IBM
Brad Porter, Tellme Networks
Ken Rehor, Vocalocity
Steph Tryphonas, Tellme Networks

Please refer to the errata for this document, which may include some normative corrections.

See also translations.

Copyright © 2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved.

W3C liability, trademark, document use and software licensing rules apply.


Abstract

This document specifies VoiceXML, the Voice Extensible Markup Language. VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed initiative conversations.

Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document has been reviewed by W3C Members and other interested parties, and it has been endorsed by the Director as a W3C Recommendation.

W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.

This specification is part of the W3C Speech Interface Framework and has been developed within the W3C Voice Browser Activity by participants in the Voice Browser Working Group (W3C Members only).

The design of VoiceXML 2.0 has been widely reviewed (see the disposition of comments) and satisfies the Working Group's technical requirements.

A list of implementations is included in the VoiceXML 2.0 implementation report, along with the associated test suite.

Comments are welcome on [email protected] (archive).

See W3C mailing list and archive usage guidelines.

The W3C maintains a list of any patent disclosures related to this work.

Conventions of this Document

In this document, the key words "must", "must not", "required", "shall", "shall not", "should", "should not", "recommended", "may", and "optional" are to be interpreted as described in [RFC2119] and indicate requirement levels for compliant VoiceXML implementations.

Table of Contents

Abbreviated Contents

Full Contents


1. Overview

This document defines VoiceXML, the Voice Extensible Markup Language. Its background, basic concepts and use are presented in Section 1. The dialog constructs of form, menu and link, and the mechanism (Form Interpretation Algorithm) by which they are interpreted are then introduced in Section 2.

User input using DTMF and speech grammars is covered in Section 3, while Section 4 covers system output using speech synthesis and recorded audio. Mechanisms for manipulating dialog control flow, including variables, events, and executable elements, are explained in Section 5.

Environment features such as parameters and properties as well as resource handling are specified in Section 6. The appendices provide additional information including the VoiceXML Schema, a detailed specification of the Form Interpretation Algorithm and timing, audio file formats, and statements relating to conformance, internationalization, accessibility and privacy.

VoiceXML originated in 1995 as an XML-based dialog design language intended to simplify the speech recognition application development process within an AT&T project called Phone Markup Language (PML).

As AT&T reorganized, teams at AT&T, Lucent and Motorola continued working on their own PML-like languages.

In 1998, W3C hosted a conference on voice browsers. By this time, AT&T and Lucent had different variants of their original PML, while Motorola had developed VoxML, and IBM was developing its own SpeechML. Many other attendees at the conference were also developing similar languages for dialog design, such as HP's TalkML and PipeBeach's VoiceHTML.

The VoiceXML Forum was then formed by AT&T, IBM, Lucent, and Motorola to pool their efforts.

The mission of the VoiceXML Forum was to define a standard dialog design language that developers could use to build conversational applications. They chose XML as the basis for this effort because it was clear to them that this was the direction technology was going.

In 2000, the VoiceXML Forum released VoiceXML 1.0 to the public. Shortly thereafter, VoiceXML 1.0 was submitted to the W3C as the basis for the creation of a new international standard.

VoiceXML 2.0 is the result of this work based on input from W3C Member companies, other W3C Working Groups, and the public.

Developers familiar with VoiceXML 1.0 are particularly directed to Changes from Previous Public Version, which summarizes how VoiceXML 2.0 differs from VoiceXML 1.0.

1.1 Introduction

VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed initiative conversations.

Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications.

Here are two short examples of VoiceXML. The first is the venerable "Hello World":

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <form>
    <block>Hello World!</block>
  </form>
</vxml>

The top-level element is <vxml>, which is mainly a container for dialogs.

There are two types of dialogs: forms and menus. Forms present information and gather input; menus offer choices of what to do next. This example has a single form, which contains a block that synthesizes and presents "Hello World!" to the user. Since the form does not specify a successor dialog, the conversation ends.

Our second example asks the user for a choice of drink and then submits it to a server script:

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <form>
    <field name="drink">
      <prompt>Would you like coffee, tea, milk, or nothing?</prompt>
      <grammar src="drink.grxml" type="application/srgs+xml"/>
    </field>
    <block>
      <submit next="http://www.drink.example.com/drink2.asp"/>
    </block>
  </form>
</vxml>

A field is an input field.

The user must provide a value for the field before proceeding to the next element in the form. A sample interaction is:

C (computer): Would you like coffee, tea, milk, or nothing?

H (human): Orange juice.

C: I did not understand what you said. (a platform-specific default message.)

C: Would you like coffee, tea, milk, or nothing?

H: Tea

C: (continues in document drink2.asp)

1.2 Background

This section contains a high-level architectural model, whose terminology is then used to describe the goals of VoiceXML, its scope, its design principles, and the requirements it places on the systems that support it.

1.2.1 Architectural Model

The architectural model assumed by this document has the following components:


Figure 1: Architectural Model

A document server (e.g. a Web server) processes requests from a client application, the VoiceXML Interpreter, through the VoiceXML interpreter context. The server produces VoiceXML documents in reply, which are processed by the VoiceXML interpreter.

The VoiceXML interpreter context may monitor user inputs in parallel with the VoiceXML interpreter. For example, one VoiceXML interpreter context may always listen for a special escape phrase that takes the user to a high-level personal assistant, and another may listen for escape phrases that alter user preferences like volume or text-to-speech characteristics.

The implementation platform is controlled by the VoiceXML interpreter context and by the VoiceXML interpreter.

For instance, in an interactive voice response application, the VoiceXML interpreter context may be responsible for detecting an incoming call, acquiring the initial VoiceXML document, and answering the call, while the VoiceXML interpreter conducts the dialog after answer.

The implementation platform generates events in response to user actions (e.g. spoken or character input received, disconnect) and system events (e.g. timer expiration). Some of these events are acted upon by the VoiceXML interpreter itself, as specified by the VoiceXML document, while others are acted upon by the VoiceXML interpreter context.

1.2.2 Goals of VoiceXML

VoiceXML's main goal is to bring the full power of Web development and content delivery to voice response applications, and to free the authors of such applications from low-level programming and resource management.

It enables integration of voice services with data services using the familiar client-server paradigm. A voice service is viewed as a sequence of interaction dialogs between a user and an implementation platform.

The dialogs are provided by document servers, which may be external to the implementation platform. Document servers maintain overall service logic, perform database and legacy system operations, and produce dialogs. A VoiceXML document specifies each interaction dialog to be conducted by a VoiceXML interpreter.

User input affects dialog interpretation and is collected into requests submitted to a document server. The document server replies with another VoiceXML document to continue the user's session with other dialogs.

VoiceXML is a markup language that:

  • Minimizes client/server interactions by specifying multiple interactions per document.

  • Shields application authors from low-level and platform-specific details.

  • Separates user interaction code (in VoiceXML) from service logic (e.g. CGI scripts).

  • Promotes service portability across implementation platforms. VoiceXML is a common language for content providers, tool providers, and platform providers.

  • Is easy to use for simple interactions, and yet provides language features to support complex dialogs.

While VoiceXML strives to accommodate the requirements of a majority of voice response services, services with stringent requirements may best be served by dedicated applications that employ a finer level of control.

1.2.3 Scope of VoiceXML

The language describes the human-machine interaction provided by voice response systems, which includes:

  • Output of synthesized speech (text-to-speech).

  • Output of audio files.

  • Recognition of spoken input.

  • Recognition of DTMF input.

  • Recording of spoken input.

  • Control of dialog flow.

  • Telephony features such as call transfer and disconnect.

The language provides means for collecting character and/or spoken input, assigning the input results to document-defined request variables, and making decisions that affect the interpretation of documents written in the language.

A document may be linked to other documents through Universal Resource Identifiers (URIs).

1.2.4 Principles of Design

VoiceXML is an XML application [XML].

  1. The language promotes portability of services through abstraction of platform resources.

  2. The language accommodates platform diversity in supported audio file formats, speech grammar formats, and URI schemes.

    While producers of platforms may support various grammar formats, the language requires a common grammar format, namely the XML Form of the W3C Speech Recognition Grammar Specification [SRGS], to facilitate interoperability.

    Similarly, while various audio formats for playback and recording may be supported, the audio formats described in Appendix E must be supported.

  3. The language supports ease of authoring for common types of interactions.

  4. The language has well-defined semantics that preserves the author's intent regarding the behavior of interactions with the user.

    Client heuristics are not required to determine document element interpretation.

  5. The language recognizes semantic interpretations from grammars and makes this information available to the application.

  6. The language has a control flow mechanism.

  7. The language enables a separation of service logic from interaction behavior.

  8. It is not intended for intensive computation, database operations, or legacy system operations.

    These are assumed to be handled by resources outside the document interpreter, e.g. a document server.

  9. General service logic, state management, dialog generation, and dialog sequencing are assumed to reside outside the document interpreter.

  10. The language provides ways to link documents using URIs, and also to submit data to server scripts using URIs.

  11. VoiceXML provides ways to identify exactly which data to submit to the server, and which HTTP method (GET or POST) to use in the submittal.

  12. The language does not require document authors to explicitly allocate and deallocate dialog resources, or deal with concurrency.

    Resource allocation and concurrent threads of control are to be handled by the implementation platform.
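
As an illustration of principles 10 and 11, the following minimal sketch links to a server script by URI and selects both the data to submit and the HTTP method. The URI, the grammar file drink.grxml, and the variable names are placeholders chosen for this sketch only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="order">
    <!-- Scratch variable that is not sent to the server -->
    <var name="attempts" expr="0"/>
    <field name="drink">
      <prompt>Would you like coffee, tea, or milk?</prompt>
      <grammar src="drink.grxml" type="application/srgs+xml"/>
    </field>
    <block>
      <!-- namelist selects exactly which variables are submitted;
           method selects HTTP GET or POST -->
      <submit next="http://www.example.com/order"
              namelist="drink" method="post"/>
    </block>
  </form>
</vxml>
```

Only drink is sent in the request; attempts stays local to the form because it is not named in namelist.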

1.2.5 Implementation Platform Requirements

This section outlines the requirements on the hardware/software platforms that will support a VoiceXML interpreter.

Document acquisition. The interpreter context is expected to acquire documents for the VoiceXML interpreter to act on.

The "http" URI scheme must be supported. In some cases, the document request is generated by the interpretation of a VoiceXML document, while other requests are generated by the interpreter context in response to events outside the scope of the language, for example an incoming phone call.

When issuing document requests via HTTP, the interpreter context identifies itself using the "User-Agent" header variable with the value "<name>/<version>", for example, "acme-browser/1.2".

Audio output. An implementation platform must support audio output using audio files and text-to-speech (TTS).

The platform must be able to freely sequence TTS and audio output. If an audio output resource is not available, an error.noresource event must be thrown. Audio files are referred to by a URI. The language specifies a required set of audio file formats which must be supported (see Appendix E); additional audio file formats may also be supported.
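
A minimal sketch of sequenced output, assuming a hypothetical audio file at www.example.com: synthesized text and a recorded clip are interleaved within one prompt, and the text inside <audio> serves as the alternative if the file cannot be played.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form>
    <block>
      <prompt>
        <!-- Rendered by text-to-speech -->
        Welcome to the drink service.
        <!-- Recorded audio, referred to by URI; the enclosed text
             is spoken instead if the file cannot be played -->
        <audio src="http://www.example.com/audio/special.wav">
          Here is today's special.
        </audio>
        <!-- Text-to-speech again -->
        Please stay on the line.
      </prompt>
    </block>
  </form>
</vxml>
```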

Audio input. An implementation platform is required to detect and report character and/or spoken input simultaneously and to control input detection interval duration with a timer whose length is specified by a VoiceXML document.

If an audio input resource is not available, an error.noresource event must be thrown.

  • It must report characters (for example, DTMF) entered by a user.

    Platforms must support the XML form of DTMF grammars described in the W3C Speech Recognition Grammar Specification [SRGS]. They should also support the Augmented BNF (ABNF) form of DTMF grammars described in the W3C Speech Recognition Grammar Specification [SRGS].

  • It must be able to receive speech recognition grammar data dynamically. It must be able to use speech grammar data in the XML Form of the W3C Speech Recognition Grammar Specification [SRGS].

    It should be able to receive speech recognition grammar data in the ABNF form of the W3C Speech Recognition Grammar Specification [SRGS], and may support other formats such as the JSpeech Grammar Format [JSGF] or proprietary formats. Some VoiceXML elements contain speech grammar data; others refer to speech grammar data through a URI. The speech recognizer must be able to accommodate dynamic update of the spoken input for which it is listening through either method of speech grammar data specification.

  • It must be able to record audio received from the user.

    The implementation platform must be able to make the recording available to a request variable. The language specifies a required set of recorded audio file formats which must be supported (see Appendix E); additional formats may also be supported.
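
The recording requirement can be pictured with the following minimal sketch, which records a short message into a variable and submits it. The server URI and the timing values are assumptions made for illustration; audio/x-wav is used on the assumption that it is among the formats listed in Appendix E.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="leave_message">
    <!-- The recording is made available in the variable "msg" -->
    <record name="msg" beep="true" maxtime="10s"
            finalsilence="4000ms" dtmfterm="true" type="audio/x-wav">
      <prompt>At the tone, please record your message.</prompt>
      <noinput>I did not hear anything, please try again.</noinput>
    </record>
    <block>
      <!-- Send the recorded audio to a (hypothetical) server script -->
      <submit next="http://www.example.com/save_message"
              method="post" enctype="multipart/form-data"
              namelist="msg"/>
    </block>
  </form>
</vxml>
```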

Transfer. The platform should be able to support making a third party connection through a communications network, such as the telephone.
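
A minimal sketch of a transfer, assuming the platform supports it. The telephone number and timeouts are placeholders, and only two of the possible outcomes are examined.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="call_operator">
    <!-- bridge="true": the interpreter stays connected and resumes
         the dialog when the third-party call ends -->
    <transfer name="call" dest="tel:+1-555-555-0100" bridge="true"
              connecttimeout="20s" maxtime="600s">
      <prompt>Please hold while I connect you.</prompt>
      <filled>
        <if cond="call == 'busy'">
          <prompt>The line is busy. Please try again later.</prompt>
        <elseif cond="call == 'noanswer'"/>
          <prompt>Nobody answered the call.</prompt>
        </if>
      </filled>
    </transfer>
  </form>
</vxml>
```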

1.3 Concepts

A VoiceXML document (or a set of related documents called an application) forms a conversational finite state machine.

The user is always in one conversational state, or dialog, at a time. Each dialog determines the next dialog to transition to. Transitions are specified using URIs, which define the next document and dialog to use. If a URI does not refer to a document, the current document is assumed. If it does not refer to a dialog, the first dialog in the document is assumed.

Execution is terminated when a dialog does not specify a successor, or if it has an element that explicitly exits the conversation.
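
The transition rules above can be sketched as follows. The document news.vxml, the dialog names, and the grammar yesno.grxml (assumed here to return the string 'yes' or 'no') are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="start">
    <block>
      Welcome.
      <!-- Fragment only: dialog "ask" in the current document -->
      <goto next="#ask"/>
    </block>
  </form>

  <form id="ask">
    <field name="more">
      <prompt>Do you want to hear the news?</prompt>
      <grammar src="yesno.grxml" type="application/srgs+xml"/>
      <filled>
        <if cond="more == 'yes'">
          <!-- Document and fragment: dialog "headlines" in news.vxml.
               Without the fragment, the first dialog in news.vxml
               would be used -->
          <goto next="news.vxml#headlines"/>
        <else/>
          <!-- Explicitly exits the conversation -->
          <exit/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```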

1.3.1 Dialogs and Subdialogs

There are two kinds of dialogs: forms and menus.

Forms define an interaction that collects values for a set of form item variables. Each field may specify a grammar that defines the allowable inputs for that field. If a form-level grammar is present, it can be used to fill several fields from one utterance. A menu presents the user with a choice of options and then transitions to another dialog based on that choice.
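
A minimal sketch of a menu, assuming hypothetical target documents at www.example.com. Each choice names the dialog or document to transition to when it is selected.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <menu id="main">
    <!-- <enumerate/> lists the choices in document order -->
    <prompt>Welcome home. Say one of: <enumerate/></prompt>
    <choice next="http://www.example.com/sports.vxml">Sports</choice>
    <choice next="http://www.example.com/weather.vxml">Weather</choice>
    <choice next="#goodbye">Goodbye</choice>
    <noinput>Please say one of: <enumerate/></noinput>
  </menu>

  <form id="goodbye">
    <block>Goodbye!</block>
  </form>
</vxml>
```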

A subdialog is like a function call, in that it provides a mechanism for invoking a new interaction, and returning to the original form.

Variable instances, grammars, and state information are saved and are available upon returning to the calling document. Subdialogs can be used, for example, to create a confirmation sequence that may require a database query; to create a set of components that may be shared among documents in a single application; or to create a reusable library of dialogs shared among many applications.
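
The function-call analogy can be sketched with a small confirmation subdialog. The dialog names, the parameter question, and the grammar /grammars/boolean.grxml (assumed to return an ECMAScript boolean) are placeholders for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="order">
    <!-- "Call" the confirm dialog, passing one parameter -->
    <subdialog name="result" src="#confirm">
      <param name="question" expr="'Shall I place the order?'"/>
      <filled>
        <!-- Returned variables are properties of "result" -->
        <if cond="result.answer">
          <prompt>Your order has been placed.</prompt>
        <else/>
          <prompt>Your order has been cancelled.</prompt>
        </if>
      </filled>
    </subdialog>
  </form>

  <form id="confirm">
    <!-- Declared so that the <param> above can initialize it -->
    <var name="question"/>
    <field name="answer">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt><value expr="question"/></prompt>
      <filled>
        <!-- "Return" to the calling form with the collected value -->
        <return namelist="answer"/>
      </filled>
    </field>
  </form>
</vxml>
```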

1.3.2 Sessions

A session begins when the user starts to interact with a VoiceXML interpreter context, continues as documents are loaded and processed, and ends when requested by the user, a document, or the interpreter context.

1.3.3 Applications

An application is a set of documents sharing the same application root document.

Whenever the user interacts with a document in an application, its application root document is also loaded. The application root document remains loaded while the user is transitioning between other documents in the same application, and it is unloaded when the user transitions to a document that is not in the application. While it is loaded, the application root document's variables are available to the other documents as application variables, and its grammars remain active for the duration of the application, subject to the grammar activation rules discussed in Section 3.1.4.

Figure 2 shows the transition of documents (D) in an application that share a common application root document (root).


Figure 2: Transitioning between documents in an application.

1.3.4 Grammars

Each dialog has one or more speech and/or DTMF grammars associated with it.

In machine directed applications, each dialog's grammars are active only when the user is in that dialog. In mixed initiative applications, where the user and the machine alternate in determining what to do next, some of the dialogs are flagged to make their grammars active (i.e., listened for) even when the user is in another dialog in the same document, or on another loaded document in the same application.

In this situation, if the user says something matching another dialog's active grammars, execution transitions to that other dialog, with the user's utterance treated as if it were said in that dialog.

Mixed initiative adds flexibility and power to voice applications.
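
A minimal sketch of a mixed initiative form, assuming three hypothetical grammar files and a placeholder server URI. The scope attribute flags the form so that its form-level grammar is also active while the user is in other dialogs of the same document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="weather" scope="document">
    <!-- Form-level grammar: one utterance may fill both fields -->
    <grammar src="cityandstate.grxml" type="application/srgs+xml"/>
    <initial name="start">
      <prompt>For what city and state would you like the weather?</prompt>
    </initial>
    <!-- Field-level grammars collect whatever is still missing -->
    <field name="state">
      <prompt>What state?</prompt>
      <grammar src="state.grxml" type="application/srgs+xml"/>
    </field>
    <field name="city">
      <prompt>What city?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
    <block>
      <submit next="http://www.example.com/weather" namelist="city state"/>
    </block>
  </form>
</vxml>
```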

1.3.5 Events

VoiceXML provides a form-filling mechanism for handling "normal" user input. In addition, VoiceXML defines a mechanism for handling events not covered by the form mechanism.

Events are thrown by the platform under a variety of circumstances, such as when the user does not respond, doesn't respond intelligibly, requests help, etc.

The interpreter also throws events if it finds a semantic error in a VoiceXML document. Events are caught by catch elements or their syntactic shorthand. Each element in which an event can occur may specify catch elements. Furthermore, catch elements are also inherited from enclosing elements "as if by copy". In this way, common event handling behavior can be specified at any level, and it applies to all lower levels.
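
A minimal sketch of event handling at two levels, assuming a hypothetical grammar file drink.grxml. The document-level handler is inherited by the field, while the field-level handlers apply only there.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- Document level: inherited by every dialog below -->
  <noinput>I did not hear you. <reprompt/></noinput>

  <form id="order">
    <field name="drink">
      <prompt>Would you like coffee, tea, or milk?</prompt>
      <grammar src="drink.grxml" type="application/srgs+xml"/>
      <!-- Field level: shorthand for <catch event="nomatch"> -->
      <nomatch>Please say coffee, tea, or milk. <reprompt/></nomatch>
      <!-- Field level: shorthand for <catch event="help"> -->
      <help>Say the name of the drink you want. <reprompt/></help>
      <!-- General form, here catching a fetch error -->
      <catch event="error.badfetch">
        Sorry, something went wrong.
        <exit/>
      </catch>
    </field>
  </form>
</vxml>
```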

1.3.6 Links

A link supports mixed initiative.

It specifies a grammar that is active whenever the user is in the scope of the link. If user input matches the link's grammar, control transfers to the link's destination URI. A link can be used to throw an event or go to a destination URI.
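
Both uses of a link can be sketched as follows. The destination URI and the grammar file are hypothetical; the links are placed at document level so that they are in scope for every dialog in the document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- Go to a destination URI when the user says "main menu" -->
  <link next="http://www.example.com/main_menu.vxml">
    <grammar type="application/srgs+xml" root="root" version="1.0">
      <rule id="root" scope="public">main menu</rule>
    </grammar>
  </link>

  <!-- Throw an event instead: pressing 0 raises "help" -->
  <link event="help" dtmf="0"/>

  <form id="order">
    <field name="drink">
      <prompt>Would you like coffee, tea, or milk?</prompt>
      <grammar src="drink.grxml" type="application/srgs+xml"/>
      <help>Say coffee, tea, or milk. <reprompt/></help>
    </field>
  </form>
</vxml>
```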

1.4 VoiceXML Elements

Element       Purpose                                                           Section
<assign>      Assign a variable a value                                         5.3.2
<audio>       Play an audio clip within a prompt                                4.1.3
<block>       A container of (non-interactive) executable code                  2.3.2
<catch>       Catch an event                                                    5.2.2
<choice>      Define a menu item                                                2.2.2
<clear>       Clear one or more form item variables                             5.3.3
<disconnect>  Disconnect a session                                              5.3.11
<else>        Used in <if> elements                                             5.3.4
<elseif>      Used in <if> elements                                             5.3.4
<enumerate>   Shorthand for enumerating the choices in a menu                   2.2.4
<error>       Catch an error event                                              5.2.3
<exit>        Exit a session                                                    5.3.9
<field>       Declares an input field in a form                                 2.3.1
<filled>      An action executed when fields are filled                         2.4
<form>        A dialog for presenting information and collecting data           2.1
<goto>        Go to another dialog in the same or different document            5.3.7
<grammar>     Specify a speech recognition or DTMF grammar                      3.1
<help>        Catch a help event                                                5.2.3
<if>          Simple conditional logic                                          5.3.4
<initial>     Declares initial logic upon entry into a (mixed initiative) form  2.3.3
<link>        Specify a transition common to all dialogs in the link's scope    2.5
<log>         Generate a debug message                                          5.3.13
<menu>        A dialog for choosing amongst alternative destinations            2.2.1
<meta>        Define a metadata item as a name/value pair                       6.2.1
<metadata>    Define metadata information using a metadata schema               6.2.2
<noinput>     Catch a noinput event                                             5.2.3
<nomatch>     Catch a nomatch event                                             5.2.3
<object>      Interact with a custom extension                                  2.3.5
<option>      Specify an option in a <field>                                    2.3.1.3
<param>       Parameter in <object> or <subdialog>                              6.4
<prompt>      Queue speech synthesis and audio output to the user               4.1
<property>    Control implementation platform settings                          6.3
<record>      Record an audio sample                                            2.3.6
<reprompt>    Play a field prompt when a field is re-visited after an event     5.3.6
<return>      Return from a subdialog                                           5.3.10
<script>      Specify a block of ECMAScript client-side scripting logic         5.3.12
<subdialog>   Invoke another dialog as a subdialog of the current one           2.3.4
<submit>      Submit values to a document server                                5.3.8
<throw>       Throw an event                                                    5.2.1
<transfer>    Transfer the caller to another destination                        2.3.7
<value>       Insert the value of an expression in a prompt                     4.1.4
<var>         Declare a variable                                                5.3.1
<vxml>        Top-level element in each VoiceXML document                       1.5.1

1.5 Document Structure and Execution

A VoiceXML document is primarily composed of top-level elements called dialogs.

There are two types of dialogs: forms and menus.

A document may also have <meta> and <metadata> elements, <var> and <script> elements, <property> elements, <catch> elements, and <link> elements.

1.5.1 Execution within One Document

Document execution begins at the first dialog by default.

As each dialog executes, it determines the next dialog. When a dialog doesn't specify a successor dialog, document execution stops.

Here is "Hello World!" expanded to illustrate some of this.

It now has a document level variable called "hi" which holds the greeting. Its value is used as the prompt in the first form.

Once the first form plays the greeting, it goes to the form named "say_goodbye", which prompts the user with "Goodbye!" Because the second form does not transition to another dialog, it causes the document to be exited.

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <meta name="author" content="John Doe"/>
  <meta name="maintainer" content="[email protected]"/>
  <var name="hi" expr="'Hello World!'"/>
  <form>
    <block>
      <value expr="hi"/>
      <goto next="#say_goodbye"/>
    </block>
  </form>
  <form id="say_goodbye">
    <block>
      Goodbye!
    </block>
  </form>
</vxml>

Alternatively the forms can be combined:

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <meta name="author" content="John Doe"/>
  <meta name="maintainer" content="[email protected]"/>
  <var name="hi" expr="'Hello World!'"/>
  <form>
    <block>
      <value expr="hi"/>
      Goodbye!
    </block>
  </form>
</vxml>

Attributes of <vxml> include:

version      The version of VoiceXML of this document (required). The current version number is 2.0.
xmlns        The designated namespace for VoiceXML (required). The namespace for VoiceXML is defined to be http://www.w3.org/2001/vxml.
xml:base     The base URI for this document as defined in [XML-BASE]. As in [HTML], a URI which all relative references within the document take as their base.
xml:lang     The language identifier for this document. If omitted, the value is a platform-specific default.
application  The URI of this document's application root document, if any.

Language information is inherited down the document hierarchy: the value of "xml:lang" is inherited by elements which also define the "xml:lang" attribute, such as <grammar> and <prompt>, unless these elements specify an alternative value.
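
A minimal sketch of language inheritance; the language tags and prompt text are chosen for illustration only, and a platform that does not support a requested language may not render the second prompt as intended.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0" xml:lang="en-US">
  <form>
    <block>
      <!-- Inherits en-US from the <vxml> element -->
      <prompt>Thank you for calling.</prompt>
      <!-- Specifies an alternative value for this prompt only -->
      <prompt xml:lang="fr-FR">Merci de votre appel.</prompt>
    </block>
  </form>
</vxml>
```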

1.5.2 Executing a Multi-Document Application

Normally, each document runs as an isolated application.

In cases where you want multiple documents to work together as one application, you select one document to be the application root document, and the rest to be application leaf documents. Each leaf document names the root document in its <vxml> element.

When this is done, every time the interpreter is told to load and execute a leaf document in this application, it first loads the application root document if it is not already loaded.

The application root document remains loaded until the interpreter is told to load a document that belongs to a different application. Thus one of the following two conditions always holds during interpretation:

  • The application root document is loaded and the user is executing in it: there is no leaf document.

  • The application root document and a single leaf document are both loaded and the user is executing in the leaf document.

If there is a chain of subdialogs defined in separate documents, then there may be more than one leaf document loaded although execution will only be in one of these documents.

When a leaf document load causes a root document load, none of the dialogs in the root document are executed.

Execution begins in the leaf document.

There are several benefits to multi-document applications.

  • The root document's variables are available for use by the leaf documents, so that information can be shared and retained.
  • Root document <property> elements specify default values for properties used in the leaf documents.
  • Common ECMAScript code can be defined in root document <script> elements and used in the leaf documents.
  • Root document <catch> elements define default event handling for the leaf documents.
  • Document-scoped grammars in the root document are active when the user is in a leaf document, so that the user is able to interact with forms, links, and menus in the root document.

Here is a two-document application illustrating this:

Application root document (app-root.vxml)

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <var name="bye" expr="'Ciao'"/>
  <link next="operator_xfer.vxml">
    <grammar type="application/srgs+xml" root="root" version="1.0">
      <rule id="root" scope="public">operator</rule>
    </grammar>
  </link>
</vxml>

Leaf document (leaf.vxml)

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0" application="app-root.vxml">
  <form id="say_goodbye">
    <field name="answer">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Shall we say <value expr="application.bye"/>?</prompt>
      <filled>
        <if cond="answer">
          <exit/>
        </if>
        <clear namelist="answer"/>
      </filled>
    </field>
  </form>
</vxml>

In this example, the application is designed so that leaf.vxml must be loaded first.

Its application attribute specifies that app-root.vxml should be used as the application root document. So, app-root.vxml is then loaded, which creates the application variable bye and also defines a link that navigates to operator_xfer.vxml whenever the user says "operator".

The user starts out in the say_goodbye form:

C: Shall we say Ciao?

H: Si.

C: I did not understand what you said. (a platform-specific default message.)

C: Shall we say Ciao?

H: Ciao

C: I did not understand what you said.

H: Operator.

C: (Goes to operator_xfer.vxml, which transfers the caller to a human operator.)

Note that when the user is in a multi-document application, at most two documents are loaded at any one time: the application root document and, unless the user is actually interacting with the application root document, an application leaf document.

A root document's <vxml> element does not have an application attribute specified. A leaf document's <vxml> element does have an application attribute specified. An interpreter always has an application root document loaded; it does not always have an application leaf document loaded.

The name of the interpreter's current application is the application root document's absolute URI.

The absolute URI includes a query string, if present, but it does not include a fragment identifier. The interpreter remains in the same application as long as the name remains the same. When the name changes, a new application is entered and its root context is initialized.

The application's root context consists of the variables, grammars, catch elements, scripts, and properties in application scope.

During a user session an interpreter transitions from one document to another as requested by <choice>, <goto>, <link>, <subdialog>, and <submit> elements.

Some transitions are within an application, others are between applications. The preservation or initialization of the root context depends on the type of transition:

Root to Leaf Within Application
A root to leaf transition within the same application occurs when the current document is a root document and the target document's application attribute's value resolves to the same absolute URI as the name of the current application.

The application root document and its context are preserved.

Leaf to Leaf Within Application
A leaf to leaf transition within the same application occurs when the current document is a leaf document and the target document's application attribute's value resolves to the same absolute URI as the name of the current application. The application root document and its context are preserved.

Leaf to Root Within Application
A leaf to root transition within the same application occurs when the current document is a leaf document and the target document's absolute URI is the same as the name of the current application.

The current application root document and its context are preserved when the transition is caused by a <choice>, <goto>, or <link> element. The root context is initialized when a <submit> element causes the leaf to root transition, because a <submit> always results in a fetch of its URI.

Root to Root
A root to root transition occurs when the current document is a root document and the target document is a root document, i.e. it does not have an application attribute. The root context is initialized with the application root document returned by the caching policy in Section 6.1.2. The caching policy is consulted even when the name of the target application and the current application are the same.

Subdialog
A subdialog invocation occurs when a root or leaf document executes a <subdialog> element. As discussed in Section 2.3.4, subdialog invocation creates a new execution context.

The application root document and its context in the calling document's execution context are preserved untouched during subdialog execution, and are used again once the subdialog returns. A subdialog's new execution context has its own root context and, possibly, leaf context. When the subdialog is invoked with a non-empty URI reference, the caching policy in Section 6.1.2 is used to acquire the root and leaf documents that will be used to initialize the new root and leaf contexts.

If a subdialog is invoked with an empty URI reference and a fragment identifier, e.g. "#sub1", the root and leaf documents remain unchanged, and therefore the current root and leaf documents will be used to initialize the new root and leaf contexts.

Inter-Application Transitions
All other transitions are between applications, which cause the application root context to be initialized with the next application's root document.

If a document refers to a non-existent application root document, an error.badfetch event is thrown.

If a document's application attribute refers to a document that also has an application attribute specified, an error.semantic event is thrown.
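
The leaf to root rules above can be illustrated with a minimal sketch. The leaf document below is hypothetical: the file name root.vxml, the form id, the field name and the grammar path are assumptions, and root.vxml is assumed to be a root document that contains at least one dialog. Following the rules stated above, the <goto> branch should return to the root document with its context preserved, while the <submit> branch should cause the root context to be initialized, because a <submit> always results in a fetch of its URI. The sketch uses method="post" so that no query string is appended; since the application name includes any query string, a GET submission carrying variables would presumably no longer target the same application.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0"
      application="root.vxml">
  <!-- Hypothetical leaf document whose root is root.vxml. -->
  <form id="back_to_root">
    <field name="keep_state">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Return to the main menu and keep your settings?</prompt>
      <filled>
        <if cond="keep_state">
          <!-- Leaf to root via goto: the root context is preserved. -->
          <goto next="root.vxml"/>
        <else/>
          <!-- Leaf to root via submit: the URI is fetched and the
               root context is initialized. -->
          <submit next="root.vxml" method="post" namelist="keep_state"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```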

The following diagrams illustrate the effect of the transitions between root and leaf documents on the application root context.

In these diagrams, boxes represent documents, box texture changes identify root context initialization, solid arrows symbolize transitions to the URI in the arrow's label, and dashed vertical arrows indicate an application attribute whose URI is the arrow's label.


Figure 3: Transitions that Preserve the Root Context

In this diagram, all the documents belong to the same application.

The transitions are identified by the numbers 1-4 across the top of the figure. They are:

  1. A transition to URI A results in document 1; the application context is initialized from document 1's content. Assume that this is the first document in the session. The current application's name is A.
  2. Document 1 specifies a transition to URI B, which yields document 2. Document 2's application attribute equals URI A. The root is document 1 with its context preserved. This is a root to leaf transition within the same application.
  3. Document 2 specifies a transition to URI C, which yields another leaf document, document 3. Its application attribute also equals URI A. The root is document 1 with its context preserved. This is a leaf to leaf transition within the same application.
  4. Document 3 specifies a transition to URI A using a <choice>, <goto>, or <link>. Document 1 is used with its root context intact. This is a leaf to root transition within the same application.

The next diagram illustrates transitions which initialize the root context.


Figure 4: Transitions that Initialize the Root Context

  1. Document 1 specifies a transition to its own URI A. The resulting document 4 does not have an application attribute, so it is considered a root document, and the root context is initialized. This is a root to root transition.
  2. Document 4 specifies a transition to URI D, which yields a leaf document 5. Its application attribute is different: URI E. A new application is being entered. URI E produces the root document 6. The root context is initialized from the content of document 6. This is an inter-application transition.
  3. Document 5 specifies a transition to URI A. The cache check returns document 4 which does not have an application attribute and therefore belongs to application A, so the root context is initialized. Initialization occurs even though this application and this root document were used earlier in the session. This is an inter-application transition.

1.5.3 Subdialogs

A subdialog is a mechanism for decomposing complex sequences of dialogs to better structure them, or to create reusable components.

For example, the solicitation of account information may involve gathering several pieces of information, such as account number and home telephone number. A customer care service might be structured with several independent applications that could share this basic building block; thus it would be reasonable to construct it as a subdialog. This is illustrated in the example below.

The first document, app.vxml, seeks to adjust a customer's account, and in doing so must get the account information and then the adjustment level. The account information is obtained by using a subdialog element that invokes another VoiceXML document to solicit the user input.

While the second document is being executed, the calling dialog is suspended, awaiting the return of information. The second document provides the results of its user interactions using a <return> element, and the resulting values are accessed through the variable defined by the name attribute on the <subdialog> element.

Customer Service Application (app.vxml)

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <form id="billing_adjustment">
    <var name="account_number"/>
    <var name="home_phone"/>
    <subdialog name="accountinfo" src="acct_info.vxml#basic">
      <filled>
        <!-- Note the variable defined by "accountinfo" is
             returned as an ECMAScript object and it contains two
             properties defined by the variables specified in the
             "return" element of the subdialog. -->
        <assign name="account_number" expr="accountinfo.acctnum"/>
        <assign name="home_phone" expr="accountinfo.acctphone"/>
      </filled>
    </subdialog>

    <field name="adjustment_amount">
      <grammar type="application/srgs+xml" src="/grammars/currency.grxml"/>
      <prompt>
        What is the value of your account adjustment?
      </prompt>
      <filled>
        <submit next="/cgi-bin/updateaccount"/>
      </filled>
    </field>
  </form>
</vxml>

Document Containing Account Information Subdialog (acct_info.vxml)

<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml http://www.w3.org/TR/voicexml20/vxml.xsd"
  version="2.0">
  <form id="basic">
    <field name="acctnum">
      <grammar type="application/srgs+xml" src="/grammars/digits.grxml"/>
      <prompt> What is your account number? </prompt>
    </field>
    <field name="acctphone">
      <grammar type="application/srgs+xml" src="/grammars/phone_numbers.grxml"/>
      <prompt> What is your home telephone number? </prompt>
      <filled>
        <!-- The values obtained by the two fields are supplied
             to the calling dialog by the "return" element. -->
        <return namelist="acctnum acctphone"/>
      </filled>
    </field>
  </form>
</vxml>

Subdialogs add a new execution context when they are invoked. The subdialog could be a new dialog within the existing document, or a new dialog within a new document.

Subdialogs can be composed of several documents.
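
As a minimal sketch of the first case (a subdialog that is a dialog within the existing document), the hypothetical document below invokes a second form in the same file through a fragment-only URI reference. The form ids, variable names and prompts are assumptions made for illustration; only the grammar path is reused from the example above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="main">
    <!-- "#confirm_pin" refers to a dialog in this same document. -->
    <subdialog name="check" src="#confirm_pin">
      <filled>
        <!-- check.pin is the variable named in the subdialog's
             return element. -->
        <prompt>You entered <value expr="check.pin"/>.</prompt>
      </filled>
    </subdialog>
  </form>

  <!-- Runs in its own execution context when invoked as a subdialog. -->
  <form id="confirm_pin">
    <field name="pin">
      <grammar type="application/srgs+xml" src="/grammars/digits.grxml"/>
      <prompt>Please say your PIN.</prompt>
      <filled>
        <return namelist="pin"/>
      </filled>
    </field>
  </form>
</vxml>
```

While confirm_pin is executing, the main form is suspended; it resumes in the <filled> element of the <subdialog> once the <return> is executed.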

Figure 5 shows the execution flow where a sequence of documents (D) transitions to a subdialog (SD) and then back.


Figure 5: Subdialog composed of several documents returning from the last subdialog document.

The execution context in dialog D2 is suspended when it invokes the subdialog SD1 in document sd1.vxml.

This subdialog specifies execution is to be transferred to the dialog in sd2.vxml (using <goto>). Consequently, when the dialog in sd2.vxml returns, control is returned directly to dialog D2.

Figure 6 shows an example of a multi-document subdialog where control is transferred from one subdialog to another.


Figure 6: Subdialog composed of several documents returning from the first subdialog document.

The subdialog in sd1.vxml specifies that control is to be transferred to a second subdialog, SD2, in sd2.vxml.

When executing SD2, there are two suspended contexts: the dialog context in D2 is suspended awaiting SD1 to return; and the dialog context in SD1 is suspended awaiting SD2 to return. When SD2 returns, control is returned to SD1. It in turn returns control to dialog D2.

1.5.4 Final Processing

Under certain circumstances (in particular, while the VoiceXML interpreter is processing a disconnect event) the interpreter may continue executing in the final processing state after there is no longer a connection to allow the interpreter to interact with the end user.

The purpose of this state is to allow the VoiceXML application to perform any necessary final cleanup, such as submitting information to the application server. For example, the following <catch> element will catch the connection.disconnect.hangup event and execute in the final processing state:

<catch event="connection.disconnect.hangup">
  <submit namelist="myExit" next="http://mysite/exit.jsp"/>
</catch>

While in the final processing state the application must remain in the transitioning state and may not enter the waiting state (as described in Section 4.1.8).

Thus for example the application should not enter <field>, <record>, or <transfer> while in the final processing state. The VoiceXML interpreter must exit if the VoiceXML application attempts to enter the waiting state while in the final processing state.

Aside from this restriction, execution of the VoiceXML application continues normally while in the final processing state.

Thus for example the application may transition between documents while in the final processing state, and the interpreter must exit if no form item is eligible to be selected (as described in Section 2.1.1).

2. Dialog Constructs

2.1 Forms

Forms are the key component of VoiceXML documents. A form contains:

  • A set of form items, elements that are visited in the main loop of the form interpretation algorithm. Form items are subdivided into input items that can be 'filled' by user input and control items that cannot.

  • Declarations of non-form item variables.

  • Event handlers.

  • "Filled" actions, blocks of procedural logic that execute when certain combinations of input item variables are assigned.

Form attributes are:

id
The name of the form. If specified, the form can be referenced within the document or from another document. For instance <form id="weather">, <goto next="#weather">.

scope
The default scope of the form's grammars. If it is dialog then the form grammars are active only in the form. If the scope is document, then the form grammars are active during any dialog in the same document. If the scope is document and the document is an application root document, then the form grammars are active during any dialog in any document of this application. Note that the scope of individual form grammars takes precedence over the default scope; for example, if a form in a non-root document has the default scope "dialog" and one of its form grammars has the scope "document", then that grammar is active in any dialog in the document.
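
As a sketch of the scope attribute, the hypothetical document below contains two forms. The form ids, field names and grammar file names are assumptions, and the field names are assumed to match values returned by the form-level grammars. Because car_rental declares scope="document", its grammar should remain active while the caller is in the hotel form; the hotel form omits the attribute and is assumed to take the dialog scope, so its grammar is active only inside that form.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <!-- No scope attribute: grammars are active only inside this form. -->
  <form id="hotel">
    <grammar type="application/srgs+xml" src="/grammars/hotel.grxml"/>
    <field name="hotel_city">
      <prompt>In which city do you need a hotel?</prompt>
    </field>
  </form>

  <!-- scope="document": this form's grammars are active during any
       dialog in this document, including the hotel form above. -->
  <form id="car_rental" scope="document">
    <grammar type="application/srgs+xml" src="/grammars/car_rental.grxml"/>
    <field name="car_class">
      <prompt>What class of car would you like?</prompt>
    </field>
  </form>
</vxml>
```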

This section describes some of the concepts behind forms, and then gives some detailed examples of their operation.

2.1.1 Form Interpretation

Forms are interpreted by an implicit form interpretation algorithm (FIA).

The FIA has a main loop that repeatedly selects a form item and then visits it. The selected form item is the first in document order whose guard condition is not satisfied. For instance, a field's default guard condition tests to see if the field's form item variable has a value, so that if a simple form contains only fields, the user will be prompted for each field in turn.

Interpreting a form item generally involves:

  • Selecting and playing one or more prompts;

  • Collecting a user input, either a response that fills in one or more input items, or a throwing of some event (help, for instance); and

  • Interpreting any <filled> actions that pertained to the newly filled in input items.

The FIA ends when it interprets a transfer of control statement (e.g. a <goto> to another dialog or document, or a <submit> of data to the document server). It also ends with an implied <exit> when no form item remains eligible to select.

The FIA is described in more detail in Section 2.1.6.

2.1.2 Form Items

Form items are the elements that can be visited in the main loop of the form interpretation algorithm.

Input items direct the FIA to gather a result for a specific element. When the FIA selects a control item, the control item may contain a block of procedural code to execute, or it may tell the FIA to set up the initial prompt-and-collect for a mixed initiative form.

2.1.2.1 Input Items

An input item specifies an input item variable to gather from the user. Input items have prompts to tell the user what to say or key in, grammars that define the allowed inputs, and event handlers that process any resulting events.

An input item may also have a <filled> element that defines an action to take just after the input item variable is filled.

Input items consist of:

<field>
An input item whose value is obtained via ASR or DTMF grammars.

<record>
An input item whose value is an audio clip recorded by the user. A <record> element could collect a voice mail message, for instance.

<transfer>
An input item which transfers the user to another telephone number. If the transfer returns control, the field variable will be set to the result status.

<object>
This input item invokes a platform-specific "object" with various parameters. The result of the platform object is an ECMAScript Object. One platform object could be a builtin dialog that gathers credit card information. Another could gather a text message using some proprietary DTMF text entry method. There is no requirement for implementations to provide platform-specific objects, although implementations must handle the <object> element by throwing error.unsupported.objectname if the particular platform-specific object is not supported (note that 'objectname' in error.unsupported.objectname is a fixed string, so not substituted with the name of the unsupported object; more specific error information may be provided in the event "_message" special variable as described in Section 5.2.2).

<subdialog>
A <subdialog> input item is roughly like a function call. It invokes another dialog on the current page, or invokes another VoiceXML document. It returns an ECMAScript Object as its result.
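
As a rough sketch of two of the input items listed above, the hypothetical form below records a message and then attempts a transfer. The form id, variable names, prompts, timing values and the telephone number are assumptions made for illustration, and a real application would normally add handlers for other outcomes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="message_then_transfer">
    <!-- record: the input item variable msg holds the recorded audio. -->
    <record name="msg" beep="true" maxtime="30s" dtmfterm="true">
      <prompt>Please leave a message after the beep.</prompt>
      <filled>
        <prompt>Your message has been recorded.</prompt>
      </filled>
    </record>

    <!-- transfer: when the transfer returns control, the variable
         call is set to the result status. -->
    <transfer name="call" dest="tel:+15555550100" bridge="true"
              connecttimeout="20s">
      <prompt>Transferring you to an agent.</prompt>
      <filled>
        <if cond="call == 'busy'">
          <prompt>The line is busy.</prompt>
        <elseif cond="call == 'noanswer'"/>
          <prompt>There was no answer.</prompt>
        </if>
      </filled>
    </transfer>
  </form>
</vxml>
```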

2.1.2.2 Control Items

There are two types of control items:

<block>
A sequence of procedural statements used for prompting and computation, but not for gathering input. A block has a (normally implicit) form item variable that is set to true just before it is interpreted.

<initial>
This element controls the initial interaction in a mixed initiative form. Its prompts should be written to encourage the user to say something matching a form level grammar. When at least one input item variable is filled as a result of recognition during an <initial> element, the form item variable of <initial> becomes true, thus removing it as an alternative for the FIA.

2.1.3 Form Item Variables and Conditions

Each form item has an associated form item variable, which by default is set to undefined when the form is entered.

This form item variable will contain the result of interpreting the form item. An input item's form item variable is also called an input item variable, and it holds the value collected from the user. A form item variable can be given a name using the name attribute, or left nameless, in which case an internal name is generated.

Each form item also has a guard condition, which governs whether or not that form item can be selected by the form interpretation algorithm.

The default guard condition just tests to see if the form item variable has a value. If it does, the form item will not be visited.

Typically, input items are given names, but control items are not.

Generally form item variables are not given initial values and additional guard conditions are not specified. But sometimes there is a need for more detailed control. One form may have a form item variable initially set to hide a field, and later cleared (e.g., using <clear>) to force the field's collection. Another field may have a guard condition that activates it only when it has not been collected, and when two other fields have been filled.

A block item could execute only when some condition holds true. Thus, fine control can be exercised over the order in which form items are selected and executed by the FIA; however, in general, many dialogs can be constructed without resorting to this level of complexity.

In summary, all form items have the following attributes:

name
The name of a dialog-scoped form item variable that will hold the value of the form item.

expr
The initial value of the form item variable; default is ECMAScript undefined. If initialized to a value, then the form item will not be executed unless the form item variable is cleared.

cond
An expression to evaluate in conjunction with the test of the form item variable. If absent, this defaults to true, or in the case of <initial>, a test to see if any input item variable has been filled in.
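
The expr and cond attributes can be sketched with a small hypothetical form. The form id, variable names, grammar file size.grxml, the literal value 'large' and the target /servlet/order are all assumptions made for illustration. The gift_wrap field starts with a value, so its default guard condition hides it; the offer_wrap block is guarded by cond and clears gift_wrap only for large orders, which forces that field to be collected on the next iteration of the FIA.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="order">
    <field name="size">
      <grammar type="application/srgs+xml" src="/grammars/size.grxml"/>
      <prompt>What size would you like?</prompt>
    </field>

    <!-- expr gives the form item variable an initial value, so this
         field is not visited unless the variable is cleared. -->
    <field name="gift_wrap" expr="false">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Would you like gift wrapping?</prompt>
    </field>

    <!-- cond is evaluated together with the test of the form item
         variable: this block runs at most once, and only when the
         caller chose the large size. -->
    <block name="offer_wrap" cond="size == 'large'">
      <clear namelist="gift_wrap"/>
    </block>

    <block>
      <submit next="/servlet/order" namelist="size gift_wrap"/>
    </block>
  </form>
</vxml>
```

For a large order the expected visiting order is size, offer_wrap, gift_wrap, then the final block; for any other size the gift_wrap field is never visited and its initial value false is submitted.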

2.1.4 Directed Forms

The simplest and most common type of form is one in which the form items are executed exactly once in sequential order to implement a computer-directed interaction.

Here is a weather information service that uses such a form.

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml http://www.w3.org/TR/voicexml20/vxml.xsd">
  <form id="weather_info">
    <block>Welcome to the weather information service.</block>
    <field name="state">
      <prompt>What state?</prompt>
      <grammar src="state.grxml" type="application/srgs+xml"/>
      <catch event="help">
        Please speak the state for which you want the weather.
      </catch>
    </field>
    <field name="city">
      <prompt>What city?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
      <catch event="help">
        Please speak the city for which you want the weather.
      </catch>
    </field>
    <block>
      <submit next="/servlet/weather" namelist="city state"/>
    </block>
  </form>
</vxml>

This dialog proceeds sequentially:

C (computer): Welcome to the weather information service.

What state?

H (human): Help

C: Please speak the state for which you want the weather.

H: Georgia

C: What city?

H: Tbilisi

C: I did not understand what you said. What city?

H: Macon

C: The conditions in Macon Georgia are sunny and clear at 11 AM ...

The form interpretation algorithm's first iteration selects the first block, since its (hidden) form item variable is initially undefined.

This block outputs the main prompt, and its form item variable is set to true. On the FIA's second iteration, the first block is skipped because its form item variable is now defined, and the state field is selected because the dialog variable state is undefined. This field prompts the user for the state, and then sets the variable state to the answer. A detailed description of the filling of form item variables from a field-level grammar may be found in Section 3.1.6.

The third form iteration prompts and collects the city field. The fourth iteration executes the final block and transitions to a different URI.

Each field in this example has a prompt to play in order to elicit a response, a grammar that specifies what to listen for, and an event handler for the help event.

The help event is thrown whenever the user asks for assistance. The help event handler catches these events and plays a more detailed prompt.

Here is a second directed form, one that prompts for credit card information:

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml http://www.w3.org/TR/voicexml20/vxml.xsd">
  <form id="get_card_info">
    <block>We now need your credit card type, number, and expiration date.</block>
    <field name="card_type">
      <prompt count="1">What kind of credit card do you have?</prompt>
      <prompt count="2">Type of card?</prompt>
      <!-- This is an inline grammar. -->
      <grammar type="application/srgs+xml" root="r2" version="1.0">
        <rule id="r2" scope="public">
          <one-of>
            <item>visa</item>
            <item>master <item repeat="0-1">card</item></item>
            <item>amex</item>
            <item>american express</item>
          </one-of>
        </rule>
      </grammar>
      <help> Please say Visa, MasterCard, or American Express.</help>
    </field>

    <field name="card_num">
      <grammar type="application/srgs+xml" src="/grammars/digits.grxml"/>
      <prompt count="1">What is your card number?</prompt>
      <prompt count="2">Card number?</prompt>
      <catch event="help">
        <if cond="card_type =='amex' || card_type =='american express'">
          Please say or key in your 15 digit card number.
        <else/>
          Please say or key in your 16 digit card number.
        </if>
      </catch>
      <filled>
        <if cond="(card_type == 'amex' || card_type =='american express')
              &amp;&amp; card_num.length != 15">
          American Express card numbers must have 15 digits.
          <clear namelist="card_num"/>
          <throw event="nomatch"/>
        <elseif cond="card_type != 'amex'
              &amp;&amp; card_type !='american express'
              &amp;&amp; card_num.length != 16"/>
          MasterCard and Visa card numbers have 16 digits.
          <clear namelist="card_num"/>
          <throw event="nomatch"/>
        </if>
      </filled>
    </field>

    <field name="expiry_date">
      <grammar type="application/srgs+xml" src="/grammars/digits.grxml"/>
      <prompt count="1">What is your card's expiration date?</prompt>
      <prompt count="2">Expiration date?</prompt>
      <help>
        Say or key in the expiration date, for example one two oh one.
      </help>
      <filled>
        <!-- validate the mmyy -->
        <var name="mm"/>
        <var name="i" expr="expiry_date.length"/>
        <if cond="i == 3">
          <assign name="mm" expr="expiry_date.substring(0,1)"/>
        <elseif cond="i == 4"/>
          <assign name="mm" expr="expiry_date.substring(0,2)"/>
        </if>
        <if cond="mm == '' || mm &lt; 1 || mm &gt; 12">
          <clear namelist="expiry_date"/>
          <throw event="nomatch"/>
        </if>
      </filled>
    </field>

    <field name="confirm">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>
        I have <value expr="card_type"/> number
        <value expr="card_num"/>, expiring on
        <value expr="expiry_date"/>.
        Is this correct?
      </prompt>
      <filled>
        <if cond="confirm">
          <submit next="place_order.asp"
            namelist="card_type card_num expiry_date"/>
        </if>
        <clear namelist="card_type card_num expiry_date confirm"/>
      </filled>
    </field>
  </form>
</vxml>

Note that the grammar alternatives 'amex' and 'american express' return literal values which need to be handled separately in the conditional expressions.

Section 3.1.5 describes how semantic attachments in the grammar can be used to return a single representation of these inputs.

The dialog might go something like this:

C: We now need your credit card type, number, and expiration date.

C: What kind of credit card do you have?

H: Discover

C: I did not understand what you said. (a platform-specific default message.)

C: Type of card?

(the second prompt is used now.)

H: Shoot. (fortunately treated as "help" by this platform)

C: Please say Visa, MasterCard, or American Express.

H: Uh, Amex. (this platform ignores "uh")

C: What is your card number?

H: One two three four ... wait ...

C: I did not understand what you said.

C: Card number?

H: (uses DTMF) 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 #

C: What is your card's expiration date?

H: one two oh one

C: I have Amex number 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 expiring on 1 2 0 1.

Is this correct?

H: Yes

Fields are the major building blocks of forms. A field declares a variable and specifies the prompts, grammars, DTMF sequences, help messages, and other event handlers that are used to obtain it.

Each field declares a VoiceXML form item variable in the form's dialog scope. These may be submitted once the form is filled, or copied into other variables.

Each field has its own speech and/or DTMF grammars, specified explicitly using <grammar> elements, or implicitly using the type attribute. The type attribute is used for builtin grammars, like digits, boolean, or number.

Each field can have one or more prompts. If there is one, it is repeatedly used to prompt the user for the value until one is provided.

If there are many, prompts are selected for playback according to the prompt selection algorithm (see Section 4.1.6). The count attribute can be used to determine which prompts to use on each attempt. In the example, prompts become shorter. This is called tapered prompting.
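
Tapered prompting and the type attribute can be combined in a short sketch. The form id, field name and prompt wording below are assumptions made for illustration, and the sketch assumes a platform that supports the builtin digits grammar. Each time the field is prompted again, a prompt with a higher count becomes eligible, so the wording gets shorter; the help messages are tapered in the same way.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <form id="get_pin">
    <!-- type="digits" implicitly supplies a builtin grammar, so no
         grammar element is needed in this field. -->
    <field name="pin" type="digits">
      <prompt count="1">Please say or key in your PIN.</prompt>
      <prompt count="2">Your PIN, please?</prompt>
      <prompt count="3">PIN?</prompt>
      <help count="1">
        Say or key in the digits of your personal identification number.
      </help>
      <help count="2">Just the digits, please.</help>
    </field>
  </form>
</vxml>
```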

The <catch event="help"> elements are event handlers that define what to do when the user asks for help. Help messages can also be tapered. These can be abbreviated, so that the following two elements are equivalent:

<catch event="help">
  Please say visa, mastercard, or amex.
</catch>

<help>
  Please say visa, mastercard, or amex.
</help>

The <filled> element defines what to do when the user provides a recognized input for that field.

One use is to specify integrity constraints over and above the checking done by the grammars, as with the date field above.

2.1.5 Mixed Initiative Forms

The last section talked about forms implementing rigid, computer-directed conversations.

To make a form mixed initiative, where both the computer and the human direct the conversation, it must have one or more form-level grammars. The dialog may be written in several ways. One common authoring style combines an <initial> element that prompts for a general response with <field> elements that prompt for specific information.

This is illustrated in the example below. More complex techniques, such as using the 'cond' attribute on <field> elements, may achieve a similar effect.

If a form has form-level grammars:

  • Its input items can be filled in any order.

  • More than one input item can be filled as a result of a single user utterance.

Only input items (and not control items) can be filled as a result of matching a form-level grammar.

The filling of field variables when using a form-level grammar is described in Section 3.1.6.

Also, the form's grammars can be active when the user is in other dialogs. If a document has two forms on it, say a car rental form and a hotel reservation form, and both forms have grammars that are active for that document, a user could respond to a request for hotel reservation information with information about the car rental, and thus direct the computer to talk about the car rental instead.

The user can speak to any active grammar, and have input items set and actions taken in response.

Example. Here is a second version of the weather information service, showing mixed initiative.

It has been "enhanced" for illustrative purposes with advertising and with a confirmation of the city and state:

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml http://www.w3.org/TR/voicexml20/vxml.xsd">
  <form id="weather_info">
    <grammar src="cityandstate.grxml" type="application/srgs+xml"/>
    <!-- Caller can't barge in on today's advertisement. -->
    <block>
      <prompt bargein="false">
        Welcome to the weather information service.
        <audio src="http://www.online-ads.example.com/wis.wav"/>
      </prompt>
    </block>
    <initial name="start">
      <prompt>
        For what city and state would you like the weather?
      </prompt>
      <help>
        Please say the name of the city and state
        for which you would like a weather report.
      </help>
      <!-- If user is silent, reprompt once, then
           try directed prompts. -->
      <noinput count="1"> <reprompt/></noinput>
      <noinput count="2"> <reprompt/>
        <assign name="start" expr="true"/></noinput>
    </initial>
    <field name="state">
      <prompt>What state?</prompt>
      <help>
        Please speak the state for which you want the weather.
      </help>
    </field>
    <field name="city">
      <prompt>Please say the city in <value expr="state"/>
        for which you want the weather.</prompt>
      <help>Please speak the city for which you want the weather.</help>
      <filled>
        <!-- Most of our customers are in LA. -->
        <if cond="city == 'Los Angeles' &amp;&amp; state == undefined">
          <assign name="state" expr="'California'"/>
        </if>
      </filled>
    </field>
    <field name="go_ahead" modal="true">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Do you want to hear the weather for
        <value expr="city"/>, <value expr="state"/>?
      </prompt>
      <filled>
        <if cond="go_ahead">
          <prompt bargein="false">
            <audio src="http://www.online-ads.example.com/wis2.wav"/>
          </prompt>
          <submit next="/servlet/weather" namelist="city state"/>
        </if>
        <clear namelist="start city state go_ahead"/>
      </filled>
    </field>
  </form>
</vxml>

Here is a transcript showing the advantages for even a novice user:

C: Welcome to the weather information service. Buy Joe's Spicy Shrimp Sauce.

C: For what city and state would you like the weather?

H: Uh, California.

C: Please say the city in California for which you want the weather.

H: San Francisco, please.

C: Do you want to hear the weather for San Francisco, California?

H: No

C: For what city and state would you like the weather?

H: Los Angeles.

C: Do you want to hear the weather for Los Angeles, California?

H: Yes

C: Don't forget, buy Joe's Spicy Shrimp Sauce tonight!

C: Mostly sunny today with highs in the 80s. Lows tonight from the low 60s ...

The go_ahead field has its modal attribute set to true. This causes all grammars to be disabled except the ones defined in the current form item, so that the only grammar active during this field is the grammar for boolean.

An experienced user can get things done much faster (but is still forced to listen to the ads):

C: Welcome to the weather information service. Buy Joe's Spicy Shrimp Sauce.

C: What ...

H (barging in): LA

C: Do you ...

H (barging in): Yes

C: Don't forget, buy Joe's Spicy Shrimp Sauce tonight!

C: Mostly sunny today with highs in the 80s. Lows tonight from the low 60s ...

2.1.5.1 Controlling the order of field collection.

The form interpretation algorithm can be customized in several ways.

One way is to assign a value to a form item variable, so that its form item will not be selected. Another is to use <clear> to set a form item variable to undefined; this forces the FIA to revisit the form item.

Another method is to explicitly specify the next form item to visit using <goto nextitem>. This forces an immediate transfer to that form item even if any cond attribute present evaluates to "false".

No variables, conditions or counters in the targeted form item will be reset. The form item's prompt will be played even if it has already been visited. If the <goto nextitem> occurs in a <filled> action, the rest of the <filled> action and any pending <filled> actions will be skipped.
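The first two techniques can be sketched as follows. This is only an illustration: the form id, field names, grammar URIs, and the /servlet/order target are hypothetical, and the size grammar is assumed to return the string 'small' for a small order.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order_demo">
    <field name="size">
      <!-- Hypothetical grammar, assumed to return 'small', 'medium' or 'large'. -->
      <grammar type="application/srgs+xml" src="/grammars/size.grxml"/>
      <prompt>What size would you like?</prompt>
      <filled>
        <!-- Giving gift_wrap a value here means the FIA will not
             select the gift_wrap field. -->
        <if cond="size == 'small'">
          <assign name="gift_wrap" expr="false"/>
        </if>
      </filled>
    </field>
    <field name="gift_wrap">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Would you like gift wrapping?</prompt>
    </field>
    <field name="confirm">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Shall I place the order?</prompt>
      <filled>
        <if cond="confirm">
          <submit next="/servlet/order" namelist="size gift_wrap"/>
        <else/>
          <!-- Setting the variables back to undefined makes the FIA
               visit the corresponding fields again. -->
          <clear namelist="size gift_wrap confirm"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

If the caller asks for a small order, gift_wrap already has a value when the next form item is chosen, so that field is skipped. Answering "no" to the confirmation clears all three variables, and collection starts over at size.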

Here is an example <goto nextitem> executed in response to the exit event:

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.w3.org/2001/vxml
   http://www.w3.org/TR/voicexml20/vxml.xsd">
  <link event="exit">
    <grammar type="application/srgs+xml" src="/grammars/exit.grxml"/>
  </link>
  <form id="survey_2000_03_30">
    <catch event="exit">
      <reprompt/>
      <goto nextitem="confirm_exit"/>
    </catch>
    <block>
      <prompt>
        Hello, you have been called at random to answer questions
        critical to U.S. foreign policy.
      </prompt>
    </block>
    <field name="q1">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Do you agree with the IMF position on
        privatizing certain functions of Burkina Faso's
        agriculture ministry?</prompt>
    </field>
    <field name="q2">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>If this privatization occurs, will its
        effects be beneficial mainly to Ouagadougou and
        Bobo-Dioulasso?</prompt>
    </field>
    <field name="q3">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>Do you agree that sorghum and millet output
        might thereby increase by as much as four percent
        per annum?</prompt>
    </field>
    <block>
      <submit next="register" namelist="q1 q2 q3"/>
    </block>
    <field name="confirm_exit">
      <grammar type="application/srgs+xml" src="/grammars/boolean.grxml"/>
      <prompt>You have elected to exit. Are you
        sure you want to do this, and perhaps adversely affect
        U.S. foreign policy vis-a-vis sub-Saharan Africa for
        decades to come?</prompt>
      <filled>
        <if cond="confirm_exit">
          Okay, but the U.S. State Department is displeased.
          <exit/>
        <else/>
          Good, let's pick up where we left off.
          <clear namelist="confirm_exit"/>
        </if>
      </filled>
      <catch event="noinput nomatch">
        <throw event="exit"/>
      </catch>
    </field>
  </form>
</vxml>

If the user says "exit" in response to any of the survey questions, an exit event is thrown by the platform and caught by the <catch> event handler.

This handler directs that confirm_exit be the next visited field. The confirm_exit field would not be visited during normal completion of the survey because the preceding <block> element transfers control to the registration script.

2.1.6 Form Interpretation Algorithm

We've presented the form interpretation algorithm (FIA) at a conceptual level.

In this section we describe it in more detail. A more formal description is provided in Appendix C.

2.1.6.1 Initialization Phase

Whenever a form is entered, it is initialized. Internal prompt counter variables (in the form's dialog scope) are reset to 1. Each variable (form-level <var> elements and form item variables) is initialized, in document order, to undefined or to the value of the relevant expr attribute.
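A small sketch of how these initialization rules play out is given here. All names and grammar URIs are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="init_demo">
    <!-- No expr attribute: initialized to undefined. -->
    <var name="last_error"/>
    <!-- expr attribute: initialized to the value of the expression. -->
    <var name="max_retries" expr="3"/>
    <!-- Initialization runs in document order, so max_retries
         already has its value when this expression is evaluated. -->
    <var name="retries_left" expr="max_retries"/>
    <!-- Form item variable with an initial value. Because it is not
         undefined, this field is not visited unless it is cleared. -->
    <field name="country" expr="'USA'">
      <grammar type="application/srgs+xml" src="/grammars/country.grxml"/>
      <prompt>Which country?</prompt>
    </field>
    <!-- Form item variable without expr: starts out undefined,
         so this field is visited. -->
    <field name="city">
      <grammar type="application/srgs+xml" src="/grammars/city.grxml"/>
      <prompt>Which city?</prompt>
    </field>
  </form>
</vxml>
```

Each time the form is entered, these values are set afresh, so a caller returning to init_demo is asked for the city again.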

2.1.6.2 Main Loop

The main loop of the FIA has three phases:

The select phase: the next unfilled form item is selected for visiting.

The collect phase: the selected form item is visited, which prompts the user for input, enables the appropriate grammars, and then waits for and collects an input (such as a spoken phrase or DTMF key presses) or an event (such as a request for help or a no input timeout).

The process phase: an input is processed by filling form items and executing <filled> elements to perform actions such as input validation. An event is processed by executing the appropriate event handler for that event type.

Note that the FIA may be given an input (a set of grammar slot/slot value pairs) that was collected while the user was in a different form's FIA. In this case the first iteration of the main loop skips the select and collect phases, and goes right to the process phase with that input. Also note that if an error occurs in the select or collect phase that causes an event to be generated, the event is thrown and the FIA moves directly into the process phase.
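To relate the three phases to markup, the single-field sketch below is annotated with the phase in which each part comes into play. The form id, field name, and grammar URI are hypothetical, and the grammar is assumed to return the recognized digits as a string.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="phases_demo">
    <!-- Select phase: zip is undefined, so this field is chosen. -->
    <field name="zip">
      <!-- Collect phase: the prompt is played and this grammar is
           enabled while the interpreter waits for input or an event. -->
      <grammar type="application/srgs+xml" src="/grammars/zip.grxml"/>
      <prompt>Please say your five digit zip code.</prompt>
      <!-- Process phase, event case: the matching handler runs. -->
      <noinput>I did not hear anything. <reprompt/></noinput>
      <help>Say the five digits of your zip code, one at a time.</help>
      <!-- Process phase, input case: zip has been filled, and this
           action validates it. -->
      <filled>
        <if cond="zip.length != 5">
          <prompt>That does not look like a five digit zip code.</prompt>
          <clear namelist="zip"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

When validation fails, <clear> leaves zip undefined, so the next iteration of the main loop selects the same field again.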

2.1.6.2.1 Select phase

The purpose of the select phase is to select the next form item to visit.

This is done as follows:

If a <goto> from the last main loop iteration's process phase specified a <goto nextitem>, then the specified form item is selected.

Otherwise the first form item whose guard condition is false is chosen to be visited. If an error occurs while checking guard conditions, the resulting event is thrown, the collect phase is skipped, and the event is handled in the process phase.

If no guard condition is false, and the last iteration completed the form without encountering an explicit transfer of control, the FIA does an implicit <exit> operation (similarly, if execution proceeds outside of a form, such as when an error is generated outside of a form, and there is no explicit transfer of control, the interpreter will perform an implicit <exit> operation).
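The selection rules above can be illustrated with a short form. This is a sketch under assumptions: the names and grammar URIs are hypothetical, and a form item is treated as eligible when its form item variable is undefined and any cond attribute evaluates to true.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="select_demo">
    <var name="is_member" expr="false"/>
    <!-- cond evaluates to false, so the select phase passes over
         this field even though member_id is undefined. -->
    <field name="member_id" cond="is_member">
      <grammar type="application/srgs+xml" src="/grammars/member_id.grxml"/>
      <prompt>What is your member number?</prompt>
    </field>
    <!-- Undefined variable and no cond: this is the first eligible
         form item, so it is selected first. -->
    <field name="email">
      <grammar type="application/srgs+xml" src="/grammars/email.grxml"/>
      <prompt>What is your email address?</prompt>
    </field>
    <!-- Selected once email has been filled. -->
    <block>
      <prompt>Thank you.</prompt>
    </block>
    <!-- After the block has run, no form item remains eligible and
         there has been no explicit transfer of control, so the FIA
         performs an implicit exit. -->
  </form>
</vxml>
```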

2.1.6.2.2 Collect phase

The purpose of the collect phase is to collect an input or an event.

The selected form item is visited, which performs actions that depend on the type of form item:

Source: http://www.w3.org/TR/voicexml20/