Topic: Yakitome web-based Text-to-Speech (TTS) -- some good free voices, but a bit slow

nime5ter

  • Administrator
  • Hero Member
  • *****
  • Posts: 2012
  • Karma: 61
rio14 has alerted me to a web-based text-to-speech option that may be of interest. The free voices I've tested are very good, though I don't know whether all of them are of equal quality.

The problem is that Yakitome's server is pretty slow, so there is often a wait of several seconds before the voice responds. I have run multiple tests with different browsers, etc., and so far I have not found a way to get consistently faster responses. Still, this service may interest some of you, because some of its voices are better than other free TTS voices.

What you need to do:

1. Sign up for a free API key on the site: https://www.yakitome.com/
2. Look through their documentation (https://www.yakitome.com/documentation), including their list of voices (https://www.yakitome.com/documentation/tts_voices), to try different free voices.

In my example below, I am using Mike. You can change this in the URL of the Scrape.Post action.

This solution involves generating the voice synthesis on the Yakitome site's web servers, and then playing that audio file in a web browser. (We can do this in a hidden RoboBrowser window, rather than opening a browser window for each request.)

The Yakitome API works by having users send an HTTP POST request to their server. The API key is needed for this request.

Their server's response to that POST request contains the URL for the streaming TTS. As always, this response is stored in {LastResult}. We use a regular expression to capture that URL, and pass it to our hidden RoboBrowser window.

Code:
<?xml version="1.0" encoding="utf-16"?>
<!--VoxCommando 2.2.1.6-->
<commandGroup open="True" name="Yakitome" enabled="True" prefix="" priority="0" requiredProcess="" description="">
  <command id="121" name="++Yakitome talks back (wait for it)" enabled="true" alwaysOn="False" confirm="False" requiredConfidence="0" loop="False" loopDelay="0" loopMax="0" description="">
    <action>
      <cmdType>Scrape.Post</cmdType>
      <params>
        <param>https://www.yakitome.com/api/call/json/tts?api_key={M:logins.yakkey}&amp;voice=Mike&amp;speed=5&amp;text={1}</param>
        <param />
        <param />
        <param />
        <param>application/x-www-form-urlencoded</param>
      </params>
      <cmdRepeat>1</cmdRepeat>
    </action>
    <action>
      <cmdType>Results.RegEx</cmdType>
      <params>
        <param>"iframe":."(.*?)"</param>
      </params>
      <cmdRepeat>1</cmdRepeat>
    </action>
    <action>
      <cmdType>RoboB.Select</cmdType>
      <params>
        <param>yak</param>
      </params>
      <cmdRepeat>1</cmdRepeat>
    </action>
    <action>
      <cmdType>RoboB.Show</cmdType>
      <params>
        <param>True</param>
      </params>
      <cmdRepeat>0</cmdRepeat>
    </action>
    <action>
      <cmdType>VC.Pause</cmdType>
      <params>
        <param>1000</param>
      </params>
      <cmdRepeat>1</cmdRepeat>
    </action>
    <action>
      <cmdType>RoboB.Navigate</cmdType>
      <params>
        <param>{Match.1}</param>
      </params>
      <cmdRepeat>1</cmdRepeat>
    </action>
    <event>yak</event>
  </command>
  <command id="153" name="Tell yakitome to say {1}" enabled="true" alwaysOn="False" confirm="False" requiredConfidence="0" loop="False" loopDelay="0" loopMax="0" description="">
    <action>
      <cmdType>VC.TriggerEvent</cmdType>
      <params>
        <param>yak</param>
        <param>{1}</param>
      </params>
      <cmdRepeat>1</cmdRepeat>
    </action>
    <phrase optional="true">Tell yakitome to</phrase>
    <phrase>Say</phrase>
    <payloadList>how are you, what is your name, my name is Mike</payloadList>
  </command>
</commandGroup>
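
If you want to check the regular expression from the Results.RegEx action outside VoxCommando, here is a minimal Python sketch. The sample response is made up: the only thing taken from the command above is that the response contains an "iframe" key followed by a quoted URL. The example.com address and the function name are placeholders, not anything from Yakitome.

```python
import re

# Hypothetical response body. Only the "iframe" key is implied by the
# pattern in the command above; the URL itself is an invented placeholder.
sample_response = '{"iframe": "https://example.com/player/abc"}'

# Same pattern as the Results.RegEx action: "." matches the single
# character after the colon (a space here), and (.*?) lazily captures
# everything up to the next double quote.
IFRAME_PATTERN = re.compile(r'"iframe":."(.*?)"')


def extract_iframe_url(response_text):
    """Return the captured URL, or None if the pattern does not match."""
    match = IFRAME_PATTERN.search(response_text)
    return match.group(1) if match else None


print(extract_iframe_url(sample_response))
```

Note that the pattern expects exactly one character between the colon and the opening quote, so a response formatted without that space would not match.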

My API key is stored in a Map Table, and the Scrape.Post URL in the ++Yakitome talks back command reads it from there.

You can either paste your own API key in place of "{M:logins.yakkey}" or use your own map variable.
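
To see what the finished request URL looks like once the key and text are filled in, here is a rough Python sketch. The endpoint and the parameter names (api_key, voice, speed, text) are copied from the Scrape.Post URL above. The function name and the "YOUR_API_KEY" placeholder are mine, and the URL-encoding step is an assumption on my part: the post does not say whether VoxCommando encodes {1} for you.

```python
from urllib.parse import urlencode

# Endpoint taken from the Scrape.Post action above.
BASE_URL = "https://www.yakitome.com/api/call/json/tts"


def build_tts_url(api_key, text, voice="Mike", speed=5):
    """Assemble the request URL, URL-encoding every parameter value."""
    query = urlencode(
        {"api_key": api_key, "voice": voice, "speed": speed, "text": text}
    )
    return f"{BASE_URL}?{query}"


# "YOUR_API_KEY" is a placeholder, not a real key.
print(build_tts_url("YOUR_API_KEY", "how are you"))
```

Encoding matters mostly for text containing characters such as "&" or "?", which would otherwise be read as part of the URL itself.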

Enjoy.
« Last Edit: February 20, 2016, 09:52:56 AM by nime5ter »
TIPS: POST VC VERSION #. Explain what you want VC to do. Say what you've tried & what happened, or post a video demo. Attach VC log. Link to instructions followed.  Post your command (xml)

nime5ter

If you have an API key, you can use an HTTP GET request to find out what free or metered voices are available to you.

Code:
<?xml version="1.0" encoding="utf-16"?>
<!--VoxCommando 2.2.1.6-->
<command id="133" name="Get list of {1} TTS voices" enabled="true" alwaysOn="False" confirm="False" requiredConfidence="0" loop="False" loopDelay="0" loopMax="0" description="">
  <action>
    <cmdType>Scrape</cmdType>
    <params>
      <param>https://www.yakitome.com/api/call/xml/voices?api_key={M:logins.yakkey}</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>Results.RegExSingle</cmdType>
    <params>
      <param>&lt;{1}&gt;(.*?)&lt;/{1}&gt;</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>Results.RegEx</cmdType>
    <params>
      <param>&lt;item&gt;&lt;item&gt;(.*?)&lt;/item&gt;&lt;item&gt;(.*?)&lt;/item&gt;&lt;item&gt;(.*?)&lt;/item&gt;&lt;/item&gt;</param>
      <param />
      <param>{Match.1}</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>OSD.ShowText</cmdType>
    <params>
      <param>{1} Voices (lang --&gt; M/F --&gt; voice name):</param>
      <param>10000</param>
      <param>-5</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>OSD.AddText</cmdType>
    <params>
      <param>{Match.{i}.1} --&gt; {Match.{i}.2} --&gt; {Match.{i}.3}</param>
    </params>
    <cmdRepeat>{#M}</cmdRepeat>
  </action>
  <phrase>Get list of</phrase>
  <payloadList>free,metered</payloadList>
  <phrase>voices</phrase>
</command>
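
The two regular expressions in this command work in stages: the first pulls out the "free" or "metered" section, and the second splits that section into rows of three items. Here is a small Python sketch of the same idea. The sample XML is hypothetical, shaped only by what the two patterns expect, and every voice entry in it is invented for illustration (including the language and gender I attached to Mike). The function name is mine as well.

```python
import re

# Hypothetical response: the nesting follows the patterns in the command
# above, but all of the values are invented.
sample_xml = (
    "<response>"
    "<free>"
    "<item><item>en</item><item>M</item><item>Mike</item></item>"
    "<item><item>en</item><item>F</item><item>ExampleVoice</item></item>"
    "</free>"
    "<metered>"
    "<item><item>fr</item><item>F</item><item>OtherExample</item></item>"
    "</metered>"
    "</response>"
)

# Same row pattern as the second Results.RegEx action.
ROW_PATTERN = re.compile(
    r"<item><item>(.*?)</item><item>(.*?)</item><item>(.*?)</item></item>"
)


def list_voices(xml_text, category):
    """Return (lang, gender, name) tuples for one category, e.g. 'free'."""
    tag = re.escape(category)
    # Stage 1: isolate the section, like the Results.RegExSingle action.
    section = re.search(rf"<{tag}>(.*?)</{tag}>", xml_text, re.DOTALL)
    if section is None:
        return []
    # Stage 2: split the section into rows of three captured items.
    return ROW_PATTERN.findall(section.group(1))


for lang, gender, name in list_voices(sample_xml, "free"):
    print(f"{lang} --> {gender} --> {name}")
```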

You could also use this request to generate a payload XML file, which would make it easier to change TTS voices for announcements, randomly select a TTS voice each time, etc.