<meta charset="utf-8" />
<title>meSpeak – Voices & Languages</title>
<link href="http://fonts.googleapis.com/css?family=Open+Sans&subset=latin" rel="stylesheet" type="text/css" />
<link href="http://fonts.googleapis.com/css?family=Lato:300&subset=latin" rel="stylesheet" type="text/css" />
<style type="text/css">
padding: 2em 1.5em 4.5em 1.5em;
background-color: #e2e3e4;
padding: 2px 40px 60px 40px;
margin: 0 auto 0 auto;
background-color: #fafafb;
font-family: 'Open Sans',sans-serif;
font-family: 'Lato',sans-serif;
margin: 1.5em 0 1em 0.25em;
h1 span.pict { font-size: 38px; color: #ccc; margin-left: 0.5em; letter-spacing: -2px; }
padding: 1em 0 1em 2em;
font-family: monospace;
background-color: #f2f3f5;
p.codesample strong { color: #222; }
a:hover,a:focus { color: #2681a7; }
a:active { color: #cd360e; }
p.action { margin: 1em 0 1em 1.5em; }
li { margin-bottom: 0.5em; }
<h1>meSpeak <span class="pict">(( • ))</span></h1>
<h2>Voices & Languages</h2>
<p>A short guide to setting up languages and voices for meSpeak.<br />
Please note that meSpeak is based on an Emscripten port of <a href="http://espeak.sourceforge.net/" target="_blank">eSpeak</a>, so all of eSpeak's grammar also applies to meSpeak.</p>
<h3>Standard Language Files</h3>
<p>meSpeak's language-files provide eSpeak's language- and voice-files in a single package.<br />
(Since a voice usually refers to a language and its dictionary, it seems suitable to bundle them together in a single file.)<br />
The language-files have the following structure (JSON):</p>
"voice_id": "<filename>",
"dict_id": "<filename>",
"dict": "<base64-encoded octet stream>",
"voice": "<base64-encoded octet stream>"
<p>The values of <em>voice_id</em> and <em>dict_id</em> are actually UNIX-filenames: <code>dict_id</code> is relative to the path of eSpeak's data-directory "<code>espeak-data/</code>", while <em>voice_id</em> is relative to "<code>espeak-data/voices/</code>".</p>
<p>If we were to embed the files for the language "<code>en-en</code>", these would be:</p>
<li>"<code>en/en-en</code>" for the voice and</li>
<li>"<code>en_dict</code>" for the dictionary used by "en-en"</li>
<p>For a standard language-file, you would add base64-representations of the respective eSpeak-files as the string values of <em>dict</em> and <em>voice</em>.</p>
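<p>As a rough illustration of this structure, the following Node.js sketch assembles a standard language-file from the raw bytes of an eSpeak voice-file and dictionary. The helper name <code>buildLanguageFile</code> and the placeholder byte values are assumptions made for this example and are not part of meSpeak; in practice the bytes would be read from the respective files below "<code>espeak-data/</code>".</p>

```javascript
// Hypothetical helper (not part of meSpeak): packs the contents of an
// eSpeak voice-file and dictionary into the JSON structure shown above.
function buildLanguageFile(voiceId, dictId, voiceBytes, dictBytes) {
  return {
    voice_id: voiceId,
    dict_id: dictId,
    dict: Buffer.from(dictBytes).toString('base64'),
    voice: Buffer.from(voiceBytes).toString('base64')
  };
}

// Placeholder data standing in for the real file contents.
const voiceBytes = Buffer.from('name english\nlanguage en\n', 'utf8');
const dictBytes = Uint8Array.from([0, 1, 2, 255]);

const languageFile = buildLanguageFile('en/en-en', 'en_dict', voiceBytes, dictBytes);

// The serialized object is what would be saved as the language-file.
console.log(JSON.stringify(languageFile, null, 2));
```

<p>Both payloads are encoded the same way here, since a standard language-file treats the voice as well as the dictionary as an octet stream.</p>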
<p>There is an alternate layout for meSpeak's language-files, which is especially useful for customizing and testing:</p>
"voice_id": "<filename>",
"dict_id": "<filename>",
"dict": "<base64-encoded octet stream>",
"voice": "<text-string>",
"voice_encoding": "text"
<p>Since eSpeak's voice-files are actually plain-text files, you may use a simple string for these, if you provide an additional property <code>"voice_encoding": "text"</code> at the same time.</p>
<p><em>For dictionaries, which are binary files in eSpeak, see the note at the end of the page.</em></p>
<p>As an example, we will configure a basic female voice for "en-us", which will be named "en-us-f".</p>
<li>Make a copy of the meSpeak language-file (JSON) that you want to modify (in this case "<code>voices/en/en-us.json</code>").</li>
<li>Rename the file (e.g.: "<code>en-us-f.json</code>") and open it in an editor.</li>
<li>Download the source of <a href="http://espeak.sourceforge.net/" target="_blank">eSpeak</a> and go to the "<code>espeak-data/</code>" directory.</li>
<li>The eSpeak-file "<code>espeak-data/voices/en-us</code>" looks like this:
<xmp>// moving towards US English
// and more, skipped here
<li>Change the value of the "<code>name</code>" parameter to make it unique (e.g.: "<code>name english-us-f</code>").</li>
<li>Change any parameters as you wish, in this case change "<code>gender male</code>" to "<code>gender female</code>" for a female voice.</li>
<li>You should have arrived at something like this (first line removed, since it is just a comment):
<xmp>name english-us-f
<li>Replace any line-breaks with "<code>\n</code>" in order to get a valid JSON-string:
<xmp>"name english-us-f\nlanguage en-us 2\nlanguage en-r\nlanguage en 3\ngender female"</xmp>
Then use this as the value of the "<code>voice</code>"-property of the JSON-file.</li>
<li>Add the line <code>"voice_encoding": "text"</code> to the JSON to indicate that the voice is plain-text.<br />
Your voice file should now look like this:
<xmp>Content of file: "en-us-f.json":
"voice_id": "en-us-f",
"dict_id": "en_dict",
"dict": "<base64-encoded octet stream>",
"voice": "name english-us-f\nlanguage en-us 2\nlanguage en-r\nlanguage en 3\ngender female",
"voice_encoding": "text"
<li>Save it and load it into meSpeak.</li>
<p><em>Please note that eSpeak does not handle syntax errors in a voice-definition gracefully: it will just throw an error, which, in the case of meSpeak, shows up in the console-log.</em></p>
<p>For further details on voice-parameters and fine-tuning, please refer to the eSpeak-documentation: <a href="http://espeak.sourceforge.net/voices.html" target="_blank">http://espeak.sourceforge.net/voices.html</a>.</p>
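<p>The manual steps for the "en-us-f" example can also be scripted. The following Node.js sketch is only an illustration: the helper <code>buildTextVoiceFile</code> is hypothetical, and the object <code>original</code> merely stands in for the parsed content of the copied language-file (its <code>voice_id</code> and the base64 values are placeholders). <code>JSON.stringify</code> takes care of escaping the line-breaks as "<code>\n</code>".</p>

```javascript
// Hypothetical helper (not part of meSpeak): returns a copy of a language-file
// object that carries a plain-text voice definition instead of base64 data.
function buildTextVoiceFile(base, voiceId, voiceText) {
  return Object.assign({}, base, {
    voice_id: voiceId,
    voice: voiceText,
    voice_encoding: 'text'
  });
}

// Stand-in for the parsed original file; all values are placeholders.
const original = {
  voice_id: 'en-us',
  dict_id: 'en_dict',
  dict: '<base64-encoded octet stream>',
  voice: '<base64-encoded octet stream>'
};

// The edited voice definition, one parameter per line.
const voiceText = [
  'name english-us-f',
  'language en-us 2',
  'language en-r',
  'language en 3',
  'gender female'
].join('\n');

const custom = buildTextVoiceFile(original, 'en-us-f', voiceText);

// Serializing escapes each line-break in the voice definition as \n.
const json = JSON.stringify(custom);
console.log(json);
```

<p>The dictionary is carried over unchanged, so only the voice-related properties differ from the original file.</p>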
<h3>Custom Dictionaries</h3>
<p>eSpeak's dictionaries are binary files, which must be compiled with eSpeak first.<br />
You would have to install eSpeak and compile a file following the <a href="http://espeak.sourceforge.net/docindex.html" target="_blank">eSpeak documentation</a>.<br />
Next, you would insert a base64-encoded string of the resulting object-file's content as the value of the <em>dict</em> property of a meSpeak-language-file.<br />
Finally, you would set a suitable and unique value for the property <em>dict_id</em> (UNIX file path).</p>
<p>There is no shortcut to this. Sorry.</p>
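<p>While the compilation itself cannot be avoided, the embedding step is mechanical. The Node.js sketch below shows one possible way to do it, under the assumption that the compiled dictionary is already available as a byte array; the helper <code>withCustomDictionary</code>, the id "<code>xx_dict</code>" and the sample bytes are all hypothetical.</p>

```javascript
// Hypothetical helper (not part of meSpeak): returns a copy of a language-file
// object with its dictionary replaced by the given compiled dictionary bytes.
function withCustomDictionary(languageFile, dictId, dictBytes) {
  return Object.assign({}, languageFile, {
    dict_id: dictId,
    dict: Buffer.from(dictBytes).toString('base64')
  });
}

// Stand-in for an existing language-file; the values are placeholders.
const existing = {
  voice_id: 'en-us-f',
  dict_id: 'en_dict',
  dict: '<base64-encoded octet stream>',
  voice: 'name english-us-f\nlanguage en-us 2\ngender female',
  voice_encoding: 'text'
};

// Placeholder bytes; in practice these would be the content of the
// dictionary file compiled with eSpeak.
const compiledDict = Uint8Array.from([10, 20, 30, 40, 50]);

const updated = withCustomDictionary(existing, 'xx_dict', compiledDict);

// Decoding the stored string yields the original bytes again.
const decoded = Buffer.from(updated.dict, 'base64');
console.log(updated.dict_id, decoded.length);
```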
<p>Please see also the section on the <em>extended voice format</em> on the <a href="./">main-page</a>.</p>
<p>Norbert Landsteiner<br />
Vienna, July 2013</p>