<h2 id="overview">Overview</h2>
<p>An extension can register itself as a speech engine. By doing so, it
can intercept some or all calls to functions such as
$(ref:tts.speak) and
$(ref:tts.stop)
and provide an alternate implementation.
Extensions are free to use any available web technology
to provide speech, including streaming audio from a server, HTML5 audio,
Native Client, or Flash. An extension could even do something different
with the utterances, like display closed captions in a pop-up window or
send them as log messages to a remote server.</p>
<h2 id="manifest">Manifest</h2>
<p>To implement a TTS engine, an extension must
declare the "ttsEngine" permission and then declare all voices
it provides in the extension manifest, like this:</p>
<pre data-filename="manifest.json">
{
  "name": "My TTS Engine",
  "version": "1.0",
  <b>"permissions": ["ttsEngine"],
  "tts_engine": {
    "voices": [
      {
        "voice_name": "Alice",
        "lang": "en-US",
        "gender": "female",
        "event_types": ["start", "marker", "end"]
      },
      {
        "voice_name": "Pat",
        "lang": "en-US",
        "event_types": ["end"]
      }
    ]
  },</b>
  "background": {
    "page": "background.html",
    "persistent": false
  }
}
</pre>
<p>An extension can specify any number of voices.</p>
<p>The <code>voice_name</code> parameter is required. The name should be
descriptive enough to identify both the voice and the
engine used. In the unlikely event that two extensions register voices
with the same name, a client can specify the ID of the extension that
should do the synthesis.</p>
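<p>As a rough illustration of the client side, the sketch below builds the
options for $(ref:tts.speak) and adds an extension ID only when one is
supplied. The helper name <code>buildSpeakOptions</code> and the extension ID
are placeholders invented for this example; <code>voiceName</code> and
<code>extensionId</code> are the option names assumed from the
<code>chrome.tts</code> API.</p>

```javascript
// Hypothetical helper: builds the options object for chrome.tts.speak().
function buildSpeakOptions(voiceName, extensionId) {
  var options = {voiceName: voiceName};
  if (extensionId) {
    // Only needed to disambiguate voices that share a name.
    options.extensionId = extensionId;
  }
  return options;
}

// Placeholder ID; a real client would use the ID of the engine extension.
var speakOptions = buildSpeakOptions('Alice', 'abcdefghijklmnopabcdefghijklmnop');

// chrome.tts only exists inside an extension, so guard the call.
if (typeof chrome !== 'undefined' && chrome.tts) {
  chrome.tts.speak('Hello, world.', speakOptions);
}
```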
<p>The <code>gender</code> parameter is optional. If your voice corresponds
to a male or female voice, you can use this parameter to help clients
choose the most appropriate voice for their application.</p>
<p>The <code>lang</code> parameter is optional, but highly recommended.
In almost all cases, a voice can synthesize speech in just a single language.
When an engine supports more than one language, it can register a
separate voice for each language. In the rare case where a single
voice can handle more than one language, it's easiest to list two
separate voices and handle them using the same logic internally. However,
if you want to create a voice that will handle utterances in any language,
leave out the <code>lang</code> parameter from your extension's manifest.</p>
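<p>One way to handle two per-language voices with the same internal logic is
a small lookup from voice name to language, sketched below. The voice names,
the <code>VOICE_LANGUAGES</code> table, and both function names are
hypothetical; the sketch also assumes that the options passed to the speak
listener carry <code>voiceName</code> and <code>lang</code>, and that events
use a <code>type</code> field.</p>

```javascript
// Hypothetical table: two manifest voices backed by one synthesizer.
var VOICE_LANGUAGES = {
  'Alice (English)': 'en-US',
  'Alice (French)': 'fr-FR'
};

// Picks the language to synthesize in from the options given to onSpeak.
function resolveLanguage(options) {
  var name = options.voiceName;
  if (name && Object.prototype.hasOwnProperty.call(VOICE_LANGUAGES, name)) {
    return VOICE_LANGUAGES[name];
  }
  // Fall back to the language the client asked for, if any.
  return options.lang || null;
}

// Both voices share this listener; only the resolved language differs.
function multiLanguageSpeakListener(utterance, options, sendTtsEvent) {
  var lang = resolveLanguage(options);
  sendTtsEvent({type: 'start', charIndex: 0});
  // (synthesize `utterance` in `lang` with the shared engine)
  sendTtsEvent({type: 'end', charIndex: utterance.length});
}
```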
<p>Finally, the <code>event_types</code> parameter is required if the engine can
send events to update the client on the progress of speech synthesis.
At a minimum, supporting the <code>'end'</code> event type to indicate
when speech is finished is highly recommended; otherwise, Chrome cannot
schedule queued utterances.</p>
<p class="note">
<strong>Note:</strong> If your TTS engine does not support
the <code>'end'</code> event type, Chrome cannot queue utterances
because it has no way of knowing when your utterance has finished. To
help mitigate this, Chrome passes an additional boolean <code>enqueue</code>
option to your engine's <code>onSpeak</code> handler, giving you the option of
implementing your own queueing. This is discouraged because
clients are then unable to queue utterances that should be spoken by different
speech engines.</p>
<p>The possible event types that you can send correspond to the event types
that the <code>speak()</code> method receives:</p>
<ul>
  <li><code>'start'</code>: The engine has started speaking the utterance.
  <li><code>'word'</code>: A word boundary was reached. Use
      <code>event.charIndex</code> to determine the current speech
      position.
  <li><code>'sentence'</code>: A sentence boundary was reached. Use
      <code>event.charIndex</code> to determine the current speech
      position.
  <li><code>'marker'</code>: An SSML marker was reached. Use
      <code>event.charIndex</code> to determine the current speech
      position.
  <li><code>'end'</code>: The engine has finished speaking the utterance.
  <li><code>'error'</code>: An engine-specific error occurred and
      this utterance cannot be spoken.
      Pass more information in <code>event.errorMessage</code>.
</ul>
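<p>To make the role of <code>charIndex</code> concrete, the sketch below
computes where each word starts and builds the sequence of
<code>'start'</code>, <code>'word'</code>, and <code>'end'</code> events for
one utterance. Both function names are hypothetical, words are assumed to be
runs of non-whitespace characters, and the <code>type</code> field name is
assumed from the <code>chrome.tts</code> event object. A real engine would
send each event as playback reaches that position rather than all at once.</p>

```javascript
// Returns the character index at which each word of the utterance starts.
// Assumption: a "word" is any run of non-whitespace characters.
function wordStartIndices(utterance) {
  var indices = [];
  var pattern = /\S+/g;
  var match;
  while ((match = pattern.exec(utterance)) !== null) {
    indices.push(match.index);
  }
  return indices;
}

// Builds the ordered list of events for one utterance.
function buildProgressEvents(utterance) {
  var events = [{type: 'start', charIndex: 0}];
  wordStartIndices(utterance).forEach(function(index) {
    events.push({type: 'word', charIndex: index});
  });
  events.push({type: 'end', charIndex: utterance.length});
  return events;
}
```

<p>An engine that sends <code>'word'</code> events this way would also list
<code>"word"</code> under <code>event_types</code> for that voice in its
manifest.</p>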
<p>The <code>'interrupted'</code> and <code>'cancelled'</code> events are
not sent by the speech engine; they are generated automatically by Chrome.</p>
<p>Text-to-speech clients can get the voice information from your
extension's manifest by calling
$(ref:tts.getVoices),
assuming you've registered speech event listeners as described below.</p>
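<p>From the client's point of view, this might look like the sketch below,
which lists only the voices provided by one engine. The helper name
<code>voicesFromExtension</code> and the extension ID are placeholders; the
voice fields <code>voiceName</code>, <code>lang</code>,
<code>extensionId</code>, and <code>eventTypes</code> are assumed from the
<code>chrome.tts</code> API.</p>

```javascript
// Hypothetical helper: keeps only the voices registered by one extension.
function voicesFromExtension(voices, extensionId) {
  return voices.filter(function(voice) {
    return voice.extensionId === extensionId;
  });
}

// chrome.tts only exists inside an extension, so guard the call.
if (typeof chrome !== 'undefined' && chrome.tts) {
  chrome.tts.getVoices(function(voices) {
    // Placeholder ID; use the ID of the engine extension of interest.
    var mine = voicesFromExtension(voices, 'abcdefghijklmnopabcdefghijklmnop');
    mine.forEach(function(voice) {
      console.log(voice.voiceName, voice.lang, voice.eventTypes);
    });
  });
}
```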
<h2 id="handling_speech_events">Handling speech events</h2>
<p>To generate speech at the request of clients, your extension must
register listeners for both <code>onSpeak</code> and <code>onStop</code>,
like this:</p>
<pre>var speakListener = function(utterance, options, sendTtsEvent) {
  sendTtsEvent({'type': 'start', 'charIndex': 0});

  // (start speaking)

  sendTtsEvent({'type': 'end', 'charIndex': utterance.length});
};

var stopListener = function() {
  // (stop all speech)
};

chrome.ttsEngine.onSpeak.addListener(speakListener);
chrome.ttsEngine.onStop.addListener(stopListener);</pre>
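<p>The stop listener usually has to cancel work that the speak listener
started. The sketch below stands in for real synthesis with a timer, assuming
an arbitrary 50 ms per character, so that the cancellation path is visible.
The names <code>timedSpeakListener</code>, <code>timedStopListener</code>, and
<code>pendingTimer</code> are hypothetical, and the <code>type</code> field
name is assumed from the <code>chrome.tts</code> event object. No event is
sent on stop, since Chrome generates <code>'interrupted'</code> and
<code>'cancelled'</code> itself.</p>

```javascript
// Handle of the timer that stands in for the utterance being spoken.
var pendingTimer = null;

function timedStopListener() {
  if (pendingTimer !== null) {
    clearTimeout(pendingTimer);
    pendingTimer = null;
  }
}

function timedSpeakListener(utterance, options, sendTtsEvent) {
  // Defensive: drop anything still in progress before starting again.
  timedStopListener();
  sendTtsEvent({type: 'start', charIndex: 0});
  // Placeholder for synthesis: pretend each character takes 50 ms.
  pendingTimer = setTimeout(function() {
    pendingTimer = null;
    sendTtsEvent({type: 'end', charIndex: utterance.length});
  }, utterance.length * 50);
}

// chrome.ttsEngine only exists inside an extension, so guard registration.
if (typeof chrome !== 'undefined' && chrome.ttsEngine) {
  chrome.ttsEngine.onSpeak.addListener(timedSpeakListener);
  chrome.ttsEngine.onStop.addListener(timedStopListener);
}
```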
<p class="warning">
<b>Important:</b>
If your extension does not register listeners for both
<code>onSpeak</code> and <code>onStop</code>, it will not intercept any
speech calls, regardless of what is in the manifest.</p>
<p>The decision of whether or not to send a given speech request to an
extension is based solely on whether the extension supports the given voice
parameters in its manifest and has registered listeners
for <code>onSpeak</code> and <code>onStop</code>. In other words,
there's no way for an extension to receive a speech request and
dynamically decide whether to handle it.</p>
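<p>Because a request cannot be declined, an engine that receives an
utterance it cannot speak can report that through the <code>'error'</code>
event instead. The sketch below shows one possible shape, under the
assumption of a made-up length limit; <code>MAX_UTTERANCE_LENGTH</code>,
<code>findProblem</code>, and <code>guardedSpeakListener</code> are
hypothetical names, and the <code>type</code> and <code>errorMessage</code>
fields are assumed from the <code>chrome.tts</code> event object.</p>

```javascript
// Hypothetical engine limit, chosen only for illustration.
var MAX_UTTERANCE_LENGTH = 1000;

// Returns a description of why the utterance cannot be spoken, or null.
function findProblem(utterance) {
  if (utterance.length > MAX_UTTERANCE_LENGTH) {
    return 'Utterance exceeds ' + MAX_UTTERANCE_LENGTH + ' characters.';
  }
  return null;
}

function guardedSpeakListener(utterance, options, sendTtsEvent) {
  var problem = findProblem(utterance);
  if (problem !== null) {
    // The request was already routed here, so report the failure.
    sendTtsEvent({type: 'error', errorMessage: problem});
    return;
  }
  sendTtsEvent({type: 'start', charIndex: 0});
  // (start speaking)
  sendTtsEvent({type: 'end', charIndex: utterance.length});
}
```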