VoxCommando
VoxNastics (User Guides and Mods) => XML Exchange => Topic started by: nime5ter on April 03, 2016, 12:55:36 PM
-
Yandex offers a free (or, free for now) cloud-based TTS API. It supports only Russian and English. (The documentation says Russian only, but English works for me.)
Of course, the TTS will not work unless you're connected to the internet.
1. You need to register by email in order to get your own API key. https://tech.yandex.com/speechkit/cloud/
2. You will need to replace "{M:API.ydx.speechkit.key}" in the XML below with your own API key.
3. Note: For the Russian TTS to work, you will need to have VC 2.2.1.7 or later installed.
This is because Russian-language text needs to be properly URL encoded. In 2.2.1.7 (http://voxcommando.com/mediawiki/index.php?title=ChangeLog#Version_2.2.1.7), James has added a new Tools.Encode.URI action.
<?xml version="1.0" encoding="utf-16"?>
<!--VoxCommando 2.2.1.7-->
<commandGroup open="True" name="yandex" enabled="True" prefix="" priority="0" requiredProcess="" description="">
<command id="304" name="++Speak to me Yandex (use an action like this example in any command that requires TTS)" enabled="true" alwaysOn="False" confirm="False" requiredConfidence="0" loop="False" loopDelay="0" loopMax="0" description="">
<action>
<cmdType>VC.TriggerEvent</cmdType>
<params>
<param>Yandex.TTS</param>
<param>Greetings, earth being.</param>
<param>en</param>
<param>jane</param>
</params>
<cmdRepeat>1</cmdRepeat>
</action>
<action>
<cmdType>VC.Pause</cmdType>
<params>
<param>2000</param>
</params>
<cmdRepeat>1</cmdRepeat>
</action>
<action>
<cmdType>VC.TriggerEvent</cmdType>
<params>
<param>Yandex.TTS</param>
<param>Я плохо говорю по-русски</param>
<param>ru</param>
<param>zahar</param>
</params>
<cmdRepeat>1</cmdRepeat>
</action>
<phrase>Speak to me Yandex</phrase>
</command>
<command id="159" name="Yandex TTS" enabled="true" alwaysOn="False" confirm="False" requiredConfidence="0" loop="False" loopDelay="0" loopMax="0" description="{1} phrase you want spoken (Russian or English only)
{2} --> lang code options are ru-RU, en-GB. Officially, only Russian is supported, but English seems to work.
{3} --> speaker options are: 
female voices: jane or omazh
male voices: zahar or ermil
You will need your own API key to replace {M:API.ydx.speechkit.key}. See: https://tech.yandex.com/speechkit/cloud/doc/dg/concepts/speechkit-dg-tts-docpage/">
<action>
<cmdType>Tools.Encode.URI</cmdType>
<params>
<param>{1}</param>
</params>
<cmdRepeat>1</cmdRepeat>
</action>
<action>
<cmdType>Sound.PlayStream</cmdType>
<params>
<param>https://tts.voicetech.yandex.net/generate?text={LastResult}&amp;format=mp3&amp;lang={2}&amp;speaker={3}&amp;key={M:API.ydx.speechkit.key}&amp;emotion=mixed</param>
</params>
<cmdRepeat>1</cmdRepeat>
</action>
<event>Yandex.TTS</event>
</command>
</commandGroup>
To customize the TTS voice, see the documentation here: https://tech.yandex.com/speechkit/cloud/doc/dg/concepts/speechkit-dg-tts-docpage/
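For anyone who wants to check a request outside VoxCommando, here is a minimal Python sketch of what the two actions in the "Yandex TTS" command do together: URL-encode the phrase, then build the stream URL. The endpoint and parameter names are copied from the Sound.PlayStream action above. The function name build_tts_url and the placeholder YOUR_API_KEY are made up for illustration, and I am assuming Tools.Encode.URI does ordinary UTF-8 percent-encoding, which is what urllib.parse.quote does here.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the Sound.PlayStream action in the XML above.
BASE_URL = "https://tts.voicetech.yandex.net/generate"


def build_tts_url(text, lang, speaker, key, emotion="mixed"):
    """Hypothetical helper: return the TTS request URL for `text`.

    Every value is percent-encoded as UTF-8, which is assumed to be
    roughly what Tools.Encode.URI does inside VoxCommando.
    """
    params = [
        ("text", text),
        ("format", "mp3"),
        ("lang", lang),
        ("speaker", speaker),
        ("key", key),
        ("emotion", emotion),
    ]
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    # YOUR_API_KEY is a placeholder; substitute your own key.
    print(build_tts_url("Я плохо говорю по-русски", "ru", "zahar", "YOUR_API_KEY"))
```

Pasting the printed URL into a browser should be a quick way to confirm that your key and the Russian encoding work before wiring it into a command.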
-
In case it's not self-evident: to use this as your main TTS solution, replace the TTS.Speak actions in your commands with VC.TriggerEvent actions that call the Yandex TTS command (as I demonstrate in my XML example above).
-
TTS is nice, of course, but I'm much more interested in how to use speech recognition. Both Google and Yandex require you to send them a WAV file, and I can't figure out how to do that with Vox. Starting the recording with a third-party program, monitoring the folder with Watcher, and then still having to send the file in a POST request... It ends up too convoluted, and the more complex a system is, the more easily it breaks. A form using webkitSpeechRecognition probably won't work in the robobrowser... Is the simplest option to script a mouse click on the microphone in Google and then copy the text from the element? I really don't like any of these GUI-based hacks; there must be a more reliable way.