VoxCommando
General Discussion => General Discussion => Topic started by: tria on February 23, 2012, 06:40:53 AM
-
Hi,
I am trying to create a command composed of three actions:
(1) The real system action (irrelevant)... Done.
(2) Read one random phrase from an XML file using PayloadXML.GetRandomP. The phrases XML file will contain audio file names, one per phrase. I specify the XML file in the first parameter and 1 in the second parameter.
(3) Play the random phrase obtained in (2) using Sound.PlayWav.
Here are my problems:
When executing step (2), VC reports an error (Unknown Action GetRandomP), even though I picked the action from the drop-down list in the GUI.
Also, I don't know how to get the random phrase from step (2) and use it as a parameter in step (3). Do I use "{1}", for example, or maybe {last_result}?
Thank you.
-
Tria, I did not fully understand what you want to do, but to use random phrases, separate them with the pipe character "|".
Here is an example:
Fine | Excellent | Very Good | Nice
Vox will pick one of these at random.
I use this to read out info about the song, artist and album, using a TTS voice in my native language.
(https://voxcommando.com/forum/proxy.php?request=http%3A%2F%2Fdl.dropbox.com%2Fu%2F25170804%2FLCBrandom.jpg&hash=4ba2bc1f3ace02449c5ae32d4bc11762dec1bd04)
-
Hi Wanilton,
tria wants to read from a payload.xml (with random phrases that have an "audio file path" as their "value") in order to play a wav file. (At least I think so ::hmm )
I'm not familiar with the "PayloadXML.GetRandomP" action.
Kalle
-
Thanks Wanilton. I am aware of | for TTS, but I could only use it the way you do if my language were supported like yours (PT) :(
So the best workaround for now is what I am trying to do, and Kalle understood correctly what I am after.
Currently I use a third-party TTS to generate different phrases, save them as wav files, and list them in a payload.xml file. I then try to load a random phrase from that file using the mentioned GetRandomP, and finally play it as a wav file using PlayWav.
For some reason, GetRandomP does not seem to be working (probably a bug?).
Also, I was wondering how to pass the result from GetRandomP to another action like PlayWav as a parameter.
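For readers who want to see the idea outside of VC, here is a minimal Python sketch of "pick a random entry from a phrases XML file". The element names (PayloadXML, payload, value, phrase) and the file names are assumptions made for illustration, not a confirmed VoxCommando schema, and this is not how GetRandomP itself is implemented.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical payload file: element names and wav names are assumptions.
SAMPLE = """<PayloadXML>
  <payload><value>done1.wav</value><phrase>done one</phrase></payload>
  <payload><value>done2.wav</value><phrase>done two</phrase></payload>
  <payload><value>done3.wav</value><phrase>done three</phrase></payload>
</PayloadXML>"""


def random_values(xml_text, count=1, rng=random):
    """Return up to `count` distinct <value> entries chosen at random."""
    root = ET.fromstring(xml_text)
    values = [p.findtext("value") for p in root.iter("payload")]
    values = [v for v in values if v]  # skip payloads without a value
    return rng.sample(values, min(count, len(values)))


picked = random_values(SAMPLE, 1)
print(picked[0])  # the wav file name you would hand to a player
```

Asking for more entries than the file contains simply returns all of them in random order.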
-
I will check that everything is OK and suggest some answers shortly. I did not test this action extensively since creating it so it is possible it is buggy.
One question though: What is the ultimate goal you have in mind? i.e. What are you trying to accomplish? TTS without a TTS engine installed?
Also, it would help us to better understand if you could export a group containing your command and upload it here (along with the payloadXML file). Right-click a group in the command tree to export it.
-
sorry, I didn't read the whole thread before I replied. :-[
It's clear now what you want to do.
Are you looking, specifically, for a way to do random Arabic TTS phrases?
Yes there is a bug with the getRandomP action. In fact all the "PayloadXML.*" actions are suspect at this point ::confused.
I will look into a fix ASAP.
-
Thank you for looking into this bug.
Yes, I want to play random phrases from wav files specified in a payload xml file. This will be my "poor-man" implementation for randomized TTS :)
If I were using English, I would surely use TTS actions that support randomization.
But nobody answered the second part of the question. Assuming that GetRandomP gets fixed, how do I take its output (the phrase) and pass it as a parameter to the next action in the list?
Thanks
-
I didn't answer because somehow that code all got lobotomised. It's just not there and I can't find it, so I'm recreating it. So I won't have an answer for you until I make it again.
The data is either going to be available in {LastResult} or, more likely, in {Match.1}, since we are able to return multiple results.
Don't worry, I'll post some samples when it's fixed, and I plan to do this today. I'm working on it right now.
On a side note, I think I may have an alternate poor man's solution for you; however, it may take a while before it is ready, and it will only work if you have an internet connection. Go to http://translate.google.com/ and try out their Arabic TTS. Is that good enough quality for you? It sounds good to me but I don't speak Arabic so I have no idea really.
There should also be Arabic TTS available for Windows that would be compatible with VC, but I don't know where to get them or how much they cost. Maybe Rebel or Zizos could suggest something.
-
Okay, that would be great.
As for Google Translate, haha, I like how you claim it sounds good without knowing Arabic (it is impossible to evaluate a language you don't understand, right?). But yeah, you are correct, it is fairly good (especially considering it is relatively new). But the Loquendo implementation is far better. It almost sounds like a real person talking (even compared to English engines that have existed for decades). They have a text-to-speech version for individuals, but it doesn't work with the system TTS engine.
What is good with Google is their Arabic speech recognition used in mobile phones. I managed to get hold of the API (even though it is not published), and was able to convert what I say in Arabic to text. The output was relatively good for a recent release. The only problem is that there is a delay due to the HTTP request/response and the need to encode in FLAC. Also, you have to send the audio file in chunks if it is too big. This will break the experience for real-time applications, or when there is no internet connection. It would be great if they released it for one of the operating systems out there, but I am not holding my breath.
-
please try this and let me know
-
Now it is working properly. You are the greatest.
But one thing to note for anyone else who stumbles on this thread from a search engine: use {Match.1} to pass the result as a parameter to the following action (assuming you want the first of the random phrases).
Thanks!
-
Can I ask you to implement another feature?
Can you make another action like "PlayWav", let's call it "PlayRandomWav", where you specify the directory containing the wav files and it plays one of them at random? More ideas: allow the user to specify a prefix for the files (and maybe several prefixes separated with | or ; or ,) so that only files with these prefixes get selected from the directory. Also return the played file name as a result (i.e. in {LastResult} or {Match.1}, etc.).
This would be really helpful for the hopeless people without TTS, and it would save us a lot of time (instead of creating payload XML files and updating them each time). It also qualifies as a great feature for other people and can be used for other purposes.
Thanks
-
Yes you can.
I will add two new commands:
- Files.GetRandFile
- Files.GetFiles
both will accept a path and a filter (i.e. *.wav or *.*)
GetRandFile will return a single filepath in {LastResult}
GetFiles will return all matching files as {Match.1}...{Match.N}
later on I will probably add an option to search in subdirectories as well.
Then you can do what you want with the files. Play a wav, show a jpg etc.
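As an illustration of the behaviour described above (a path plus a filter, returning either one random file or all matches), here is a minimal Python sketch. The function names get_files and get_rand_file are hypothetical stand-ins and say nothing about how Files.GetFiles or Files.GetRandFile are actually implemented in VC.

```python
import glob
import os
import random


def get_files(directory, pattern="*.*"):
    """Return all files in `directory` whose names match `pattern`, sorted."""
    # glob.escape protects against wildcard characters in the directory name.
    search = os.path.join(glob.escape(directory), pattern)
    return sorted(p for p in glob.glob(search) if os.path.isfile(p))


def get_rand_file(directory, pattern="*.*", rng=random):
    """Return one matching file path chosen at random, or None if none match."""
    files = get_files(directory, pattern)
    return rng.choice(files) if files else None
```

Returning file paths rather than playing them directly keeps the helper general: the caller decides whether to play a wav, show a jpg, and so on.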
-
but as an alternative poor man's TTS you can try the secret command in the attached group.
you must enable the Bing plugin for this to work, but you don't need to enter an API key in the plugin settings (I don't think).
The reason it is not officially included in the plugin is that I may move it into another plugin later which we will also be able to use to listen to internet radio etc. since it uses a similar streaming audio approach. (also it has nothing to do with Bing :biglaugh)
You can also use many TTS voices with Bing.SpeakSync but Arabic is not available in Bing's TTS for some reason.
-
I really appreciate you looking into my request. You should note that what I meant by "prefix" is "prefix" and not "postfix" :) I meant a common string that the file names start with (while they differ in the ending). However, your postfix or extension idea is also great.
As for the secret command, I am not sure whether you are serious or not, as the secret command doesn't work even if I enable the Bing addon (only speaksync/speak work). Unless you are hinting at an upcoming function for the thing I explained to you in another place and you are going to call it that secret command? ;)
If that is the case, then I might help by providing a command line utility that automates it, if you like. I was intending to create it for other things, but if it helps here then why not. Can you call other applications and pass parameters to them from VC? Also, although not required, can VC read the output stream from such a called application (it would be a great way to pass the result back)?
-
yeah I knew you meant prefix, but I'm more interested in creating commands that can serve multiple purposes, which is why I created a command to return filenames rather than one to play random wavs.
Anyway, the (*.wav) was just an example. You should be able to do (hi??.wav) or (?all*.wav) etc. to match against filenames.
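To show what those two patterns select, here is a small Python sketch using the standard fnmatch module, where "?" matches exactly one character and "*" matches any run of characters. The file names are made up for the example, and Windows filename matching (which VC presumably relies on) is case-insensitive and may differ in edge cases, so treat this as an approximation.

```python
from fnmatch import fnmatchcase

# Made-up file names for illustration only.
NAMES = ["hi01.wav", "hi1.wav", "hello.wav", "ball_a.wav", "call.wav", "all.wav"]


def matching(pattern, names):
    """Return the names that match a wildcard pattern (case-sensitive)."""
    return [n for n in names if fnmatchcase(n, pattern)]


print(matching("hi??.wav", NAMES))   # "hi" followed by exactly two characters
print(matching("?all*.wav", NAMES))  # any one character, then "all", then anything
```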
I was not joking about the command I uploaded. I thought it would work. It works fine for me. Note that after you enable a plugin you must restart VoxCommando.
You can send all sorts of stuff to VC using UDP and in the future via the http server plugin.
Check out Tellvox which is in the "extras" found here: http://voxcommando.com/downloads.asp
Tellvox uses UDP. With it you can trigger events, call any action, or send text to emulate recognition on. You can also drop .wav files in a folder and have VC recognize the speech in the .wav
There is also a commandline program called udpsender that does the same thing as tellvox if you want to call it from another program. http://voxcommando.com/forum/index.php?topic=414.msg2815#msg2815
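For anyone who would rather send UDP from their own program, the sending side really is only a few lines. Below is a minimal Python sketch. The port number (33000) and the message text are assumptions for illustration only; check the Tellvox/udpsender documentation and your VC settings for the real port and the message format VC expects.

```python
import socket


def send_udp(message, host="127.0.0.1", port=33000):
    """Send `message` as one UTF-8 UDP datagram; returns the bytes sent.

    The default port is an assumption, not a documented VC value.
    """
    data = message.encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(data, (host, port))


if __name__ == "__main__":
    # Hypothetical message text; the real format depends on what VC expects.
    send_udp("hypothetical message for VC")
```

UDP is fire-and-forget, so the sender gets no confirmation that anything was listening on the other end.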
-
As for the secret command, I am not sure whether you are serious or not, as the secret command doesn't work even if I enable the Bing addon (only speaksync/speak work). Unless you are hinting at an upcoming function for the thing I explained to you in another place and you are going to call it that secret command? ;)
My mistake. You need to put the attached .dll in the main VC directory for that command to work.
Unless you are hinting at an upcoming function for the thing I explained to you in another place and you are going to call it that secret command? ;)
If that is the case, then I might help by providing a command line utility that automates it, if you like. I was intending to create it for other things, but if it helps here then why not.
Yes. Please share whatever you come up with.
-
I added the new file commands and created a new install version 0.944
http://voxcommando.com/forum/index.php?topic=751.0
Here are some sample commands using the new File actions:
-
Thanks man, this is really great. I'll make sure to test them, and use them instead of the payload method. I'll also try the Bing plugin again.
As for Arabic not working, did you encode the URL parameters before you send the HTTP request? You see, most Latin-based languages are easy to pass, but other languages require you to escape the characters (they end up looking something like %D0%F5%66%45...). This way the server receives them properly and unescapes them (if the server supports other languages, UTF-8, etc.). Most current browsers do this transparently: the URL you see is unchanged, but they escape it when they send the request.
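To make the escaping concrete, here is a short Python sketch using the standard urllib.parse module. It percent-encodes the UTF-8 bytes of an Arabic word and decodes it again; the sample word is just an example, and whether a given server expects UTF-8 is an assumption you would need to verify.

```python
from urllib.parse import quote, unquote

text = "مرحبا"  # sample Arabic word ("hello"), chosen only for illustration

# Percent-encode every byte of the UTF-8 representation.
encoded = quote(text, safe="")
print(encoded)

# Decoding restores the original string.
decoded = unquote(encoded)
print(decoded == text)
```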
Your Arabic text example in the XML file for googlespeak was really funny, let me translate it back to you: "This is the biggest program any time passed create" :D
I know, I know, blame the translator ;)
I will see if I can come up with anything, but the problem is that I must use UDP for message passing. Can't you just implement a way to listen to the output stream of another process (mine), and get the result as a binary audio file (for TTS) or as text (for speech recognition)? For the former I could play the audio on behalf of VC, but for the latter it is impossible to pass the text back to VC without writing a plugin and/or using UDP. I'll see what I can do.
Thanks again for the update.
-
As for Arabic not working, did you encode the url parameters before you query/send the HTTP request?
Yes, I am familiar with this concept. It is not that it "does not work". Bing does not offer TTS for Arabic. If you go to http://www.microsofttranslator.com/ you will see that you can translate to French, and then click the speaker icon to get the TTS. You can also translate to Arabic, but then you will notice that there is no speaker icon.
Your Arabic text example in the XML file for googlespeak was really funny, let me translate it back to you: "This is the biggest program any time passed create" :D
That's funny, because that is EXACTLY what I wanted to say. :P
I will see if I can come up with anything, but the problem is that I must use UDP for message passing. Can't you just implement a way to listen to the output stream of another process (mine), and get the result as a binary audio file (for TTS) or as text (for speech recognition)? For the former I could play the audio on behalf of VC, but for the latter it is impossible to pass the text back to VC without writing a plugin and/or using UDP. I'll see what I can do.
I'm not sure I fully follow your line of thought here, but I cringe at your use of the word "just". It also sounds much more complicated than writing the few lines of code necessary to send UDP.
As far as audio streams go, I'm not too clear on the implementation you have in mind or why this is necessary to pass audio in either direction. I can't think of a reason why audio would need to be passed in either direction.
-
It's a bit of a complex topic(s) -- the audio streams thing.
If you want to continue the conversation maybe we should do it via email, or even Skype.
-
Just tested the GetRandomFile with PlayWav, and it is way better than my previous approach.
Yes, I guess we are going off topic. I'll finish the utility first, and then think about how to communicate with VC. I'll make sure to contact you by then.
Thanks jitterjames.