Author Topic: Multiple issues with VC SP 1.9.2.1 (Read 3502 times)

Filipok · « **on:** April 15, 2014, 12:20:57 PM »

1. Very slow microphone initialization. After startup or rebuilding it takes 1-2 minutes for the mic to become active.
This is what I see in log file:

Code: [Select]

15.04.2014 18:41:10	504	unable to reset speech engine: System.InvalidOperationException: Cannot perform this operation while the recognizer is doing recognition.
   в Microsoft.Speech.Recognition.RecognizerBase.SetInputToDefaultAudioDevice()
   в eval_e.eval_ᜎ()

2. When I was creating payload XML, I had an issue that one of the payload phrases was too long.
When that happened, none of the phrases from this XML worked, and Vox crashed when I tried to edit the XML through the built-in editor.

Btw, what is maximum phrase length? My phrase was:

Code: [Select]

<phrase>инфо баннер, инфо баннеры, информационный баннер, информационные баннеры</phrase>

I had to remove the last option for the phrase to get accepted.

Filipok · « **Reply #1 on:** April 15, 2014, 12:36:47 PM »

Also, something is going on with the OSD display of commands in XBMC. Very often I get gibberish for both the title and the message. See attached screenshot.

I've tested GUI.ShowNotification and it properly displays both English and Russian. Actually, I am not even sure what sends the info to the OSD, as for example the take screenshot command doesn't have any notification actions.

jitterjames · « **Reply #2 on:** April 15, 2014, 12:37:03 PM »

If you want to post your xml of what you have so far I can try to see if I have the same issue here. If it is not too much trouble for you, please use the "backup current configuration" option from the File menu and send the zip that is generated.

There is no maximum phrase length really; it is limited more by what can be displayed in a treeview on a Windows form. It could certainly be much longer than what you have shown here, so it may not be the length but something else that it does not like. The comma-separated aliases are ultimately treated in a way that is similar to a payload XML, which can have thousands of items in it. Perhaps the Russian speech engine is not as robust as the English ones.

You can try to do a "purge cache" and then restart VC. If it is acting oddly you might want to reboot in case something did not close properly. I am just throwing ideas out there since I don't really know what is going on.

Filipok · « **Reply #3 on:** April 15, 2014, 12:42:49 PM »

Here is the backup.

The problem phrase is in the .\payloads\xbmc_aeon_mq5_viewmodes.xml

jitterjames · « **Reply #4 on:** April 15, 2014, 01:56:19 PM »

The maximum phrase length for payload XML files is defined in the options on the advanced tab. Normally this is used to catch problems with media tagging and long names. You could increase this number or split your phrases into two rows.
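The "split your phrases into two rows" option can be sketched in a few lines of Python. This is only an illustration and not part of VC: the function name `split_aliases` and the limit of 40 characters are assumptions, and the real limit is whatever is set on the advanced tab.

```python
def split_aliases(phrase: str, max_len: int) -> list[str]:
    """Split a comma-separated alias list into rows no longer than max_len.

    A single alias that is itself longer than max_len is kept whole on its
    own row, since it cannot be split without changing what is spoken.
    """
    rows: list[str] = []
    current = ""
    for alias in (part.strip() for part in phrase.split(",")):
        if not alias:
            continue  # skip empty entries such as a trailing comma
        candidate = f"{current}, {alias}" if current else alias
        if current and len(candidate) > max_len:
            rows.append(current)  # current row is full, start a new one
            current = alias
        else:
            current = candidate
    if current:
        rows.append(current)
    return rows


# The phrase from the first post; 40 is a hypothetical limit.
phrase = "инфо баннер, инфо баннеры, информационный баннер, информационные баннеры"
for row in split_aliases(phrase, 40):
    print(f"<phrase>{row}</phrase>")
```

Each printed row would then go into its own payload row with the same payload value.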

You have hand edited the mq view modes payloadXML and added comments that VC is not able to understand.

There seems to be a significant problem with the way your music library has been indexed by XBMC (or possibly in the way VC is reading the database - but I don't think so). You have the same album names over and over again, so each album which should appear only once and only have one ID is appearing once for each song and with multiple IDs. I also really don't know how the Russian SR engine will deal with thousands of English phrases, so you might just want to remove the xbmc albums payload XML or edit it to remove most of the items.

The OSD messages you are seeing are based on recognized speech and possible alternates. This is enabled in the settings that are in the xJson plugin settings. If you want to you can turn it off. I'm not sure about the garbled text, but it could be related to the other problems. I would try to fix them first and then see if the OSD problems persist.

You have set in options to "check mic status" every 30 seconds. It is generally recommended to leave this setting at 0. This setting combined with problems with your albums xml could be causing unexpected behaviour, exposing a weakness in the program that does not normally affect most users.

jitterjames · « **Reply #5 on:** April 15, 2014, 02:13:14 PM »

Also, I should say thanks again for your work on this, and for taking the time to post your xml files etc. I got wrapped up in trying to solve the problem and list the possible issues, but I hope I did not sound rude. I appreciate your help and effort getting VC to work in a new language.

Since we have not tested this very much in languages other than English, and since the new SP engine was only recently added into the mix and has not been tested too extensively either, it is not very surprising that new issues will pop up. Some of the issues are old ones we have seen before, like bad XBMC media tags or people hand editing xml files. The fact that your comments are causing problems is odd... I will have to look at the code to see why this is.

Filipok · « **Reply #6 on:** April 15, 2014, 03:32:27 PM »

You were definitely not rude.

I enabled the recheck mic option just to see if it would help with the mic initialization issue. It did not make any difference. Btw, I never had this issue playing with VC1 using English SR.

I think most issues with my music collection are "bad" tags in mp3 files (some are UTF-8 while some are not). I don't really listen to music through XBMC, so I never bothered to fix it. I will remove all songs from the XMLs to see if it helps.

So comments can't be added to payload XMLs? I put them there so it will be easier to understand later what each view actually is (for people who don't read Russian).

Dealing with English text on a non-English SR engine is definitely something to take a look at, as I suspect many people would have the same issue: very often the scrapers do not find localized translations for movie/song/artist info. I was under the impression that it just ignores it, as I do not have any issues with recognition. If there is a problem, then maybe when building grammar files you can filter the entries with a regex or something similar to remove all characters that are not present in the selected SR culture. It could be that you did not have this issue earlier because the German or Portuguese alphabet is almost the same as English and SR just tries to recognize it phonetically.

Regarding the xJson OSD notifications, I saw earlier on the forum that someone asked about a similar issue and you provided a fixed xjson.dll, but that was for VC1. Was the same fix applied for version 2?
In the screenshot I posted I am using the voice command "Take screenshot", and there are no alternatives offered. So most likely it should show "сделай снимок" (which is "take screenshot" in Russian) in the top left corner in XBMC. It looks like some issue with UTF-8 encoding/decoding.
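To illustrate the kind of encoding mismatch suspected here, a short Python sketch: UTF-8 bytes decoded with the wrong single-byte codec turn Cyrillic text into gibberish. The choice of `latin-1` as the wrong codec is purely an assumption for the demonstration; which codec (if any) is actually involved between VC and XBMC is not known from this thread.

```python
def garble(text: str, wrong_codec: str = "latin-1") -> str:
    """Encode as UTF-8, then decode with the wrong codec (simulated bug)."""
    return text.encode("utf-8").decode(wrong_codec)


def repair(garbled: str, wrong_codec: str = "latin-1") -> str:
    """Undo garble() by reversing the wrong decode, then decoding as UTF-8."""
    return garbled.encode(wrong_codec).decode("utf-8")


title = "сделай снимок"
broken = garble(title)

# Each Cyrillic letter becomes two unrelated characters.
print(repr(broken))
# The original text is still recoverable because no bytes were lost.
print(repair(broken))
```

If the gibberish on the OSD follows this two-characters-per-letter pattern, that would point to a decode step using the wrong codec rather than to lost data.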

jitterjames · « **Reply #7 on:** April 15, 2014, 04:04:24 PM »

I don't know if the engine will mind the English or not. It is possible it will ignore it, or that certain words might still work, or that it could cause delays in processing the data if the engine is struggling to make sense of it. It was for this last reason that I mentioned it. It might be mostly the repeated album names that were causing the most problems.

You think too highly of me as a programmer if you believe that I can reliably determine and discard languages using regex!

Hopefully it will not be an issue for you, but if it were, the solution would unfortunately probably be up to the user to separate their content somehow. It is a big problem that so much media has only an English title. It makes it difficult for anyone who is trying to use another language with their media libraries in XBMC, MediaMonkey etc. I am not sure what the best solution is in this case. Certainly with languages that share most of the same alphabet there is at least the potential for partial success. So long story short, I know it is an issue but I don't know that there is an easy fix. I have a similar issue with people who have huge media libraries. I am aware of the problem, but ultimately the solution requires a bit of work for the end user because there is no software solution for processing that many phrases.

I believe I implemented the character encoding correction for XBMC for both VC1 and VC2 and I tested with Cyrillic but it is possible that I only tested with the user generated OSD messages and not the event driven ones used for displaying recognized commands so I will check this.

Filipok · « **Reply #8 on:** April 15, 2014, 04:16:28 PM »

I tested the user generated OSD today and it works perfectly. Only the event driven ones have an issue.

I'll take a look at creating a regex to strip unneeded characters, but I am not a regex expert either.

jitterjames · « **Reply #9 on:** April 15, 2014, 04:35:57 PM »

Part of my concern here is that even if I removed all non-Cyrillic characters, what was left might be nonsense or worse. Also, if this were to be automated in the software it would need to detect the language selected, then filter based on that character set. There are quite a few languages to deal with. In some cases, I think that the engine can handle other languages. With Russian, we tested some English phrases and it appears as if they are completely ignored, but they may still slow it down during loading.

By the way, with the xml comments, it is important to note that even if VC can ignore the comments, if you ever edit the file in VC the comments would be erased when you save it. Most users do not edit the xml directly. I personally only do it if I need to carefully do multiple find/replaces.

Filipok · « **Reply #10 on:** April 15, 2014, 05:56:01 PM »

I have the same concern, but in theory it looks doable.
There are not that many languages. If I remember correctly there are 21, and some of them are just different dialects (en-US, en-CA, en-GB), so the alphabet would be the same.

The logic I am thinking, when building grammar:
  parse the phrases
  remove all characters that do not conform to the selected SR language (you would just need to have a regex for each language)
  if remaining phrase is empty (or less than defined threshold) then completely ignore it
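The steps above can be sketched in Python roughly as follows. Everything here is an assumption for illustration, not VC code: the `DISALLOWED` table, the `filter_phrases` name, the choice to keep digits, spaces, apostrophes and hyphens, and the threshold of two remaining letters.

```python
import re

# Hypothetical table: for each SR culture, a regex matching characters to REMOVE.
# Cultures without an entry are left unfiltered (the "empty regex" case).
DISALLOWED = {
    # Keep the Cyrillic block, digits, whitespace, apostrophes and hyphens.
    "ru-RU": re.compile(r"[^\u0400-\u04FF0-9\s'\-]"),
}


def filter_phrases(phrases, culture, min_letters=2):
    """Strip characters foreign to the culture and drop phrases left too short."""
    pattern = DISALLOWED.get(culture)
    if pattern is None:
        return list(phrases)  # nothing to filter for this culture
    kept = []
    for phrase in phrases:
        # Replace foreign characters with spaces, then collapse whitespace.
        cleaned = " ".join(pattern.sub(" ", phrase).split())
        # Count remaining letters so that leftovers like "1969" are ignored.
        if sum(ch.isalpha() for ch in cleaned) >= min_letters:
            kept.append(cleaned)
    return kept


print(filter_phrases(["инфо баннер", "Abbey Road 1969", "Кино 2"], "ru-RU"))
```

Mixed phrases would still be reduced to whatever Cyrillic fragments remain, so the threshold only partly guards against leftovers that are nonsense.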

Of course all of this should be optional for the user. But it is very important, especially if the extra phrases slow down the engine.

For Russian it should be very straightforward, as the alphabets in Russian and English are completely different (even though some letters look the same, they have different character codes).
For European languages such as German it should be no problem either, as I believe German has all the English letters plus a few accented and national ones. In any case, for some languages the regex can be empty if it is not required.
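The point about look-alike letters is easy to check with Python's standard `unicodedata` module. This is just a quick illustration; the `describe` helper is a made-up name.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Latin letters paired with the Cyrillic letters that look the same on screen.
# The Cyrillic ones are written as escapes so they cannot be mistyped.
look_alikes = [("a", "\u0430"), ("c", "\u0441"), ("o", "\u043e")]

for latin, cyrillic in look_alikes:
    print(f"{describe(latin):<32} vs  {describe(cyrillic)}")
```

Because the code points differ, a character-range regex can tell the two alphabets apart even where the glyphs are identical.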

And you already know what language is selected for SR, so there is no need to detect it.

Btw, can you create a grammar xml file (not compiled) from the backup I've sent you, there we can probably see if the SR ignores the non-cyrillic characters or not? If it does, then nothing needs to be done.

jitterjames · « **Reply #11 on:** April 15, 2014, 06:08:00 PM »

Quote from: Filipok on April 15, 2014, 05:56:01 PM

And you already know what language is selected for SR, so no need to detect it

HA. You got me. I just meant check, not detect.

I think the first thing to do is to find a way to test if the engine cares or if it already just ignores characters that it doesn't like. Depending on what Microsoft has done in that black box, my trying to analyse every character for each entry in each payloadXML file might actually slow things down.

Filipok · « **Reply #12 on:** April 15, 2014, 06:09:09 PM »

Yes, see the line I added at the end of my previous post.

jitterjames · « **Reply #13 on:** April 15, 2014, 06:48:14 PM »

Quote from: Filipok on April 15, 2014, 05:56:01 PM

Btw, can you create a grammar xml file (not compiled) from the backup I've sent you, there we can probably see if the SR ignores the non-cyrillic characters or not? If it does, then nothing needs to be done.

It took me a while to figure out what you were saying here. Yes, that is a genius idea and I can do that, but I'm not sure it will give us a definitive answer, because the xml generated by Microsoft might not actually look too closely at the content. That might only happen when loading or compiling an srgs.

Going with your train of thought though, what might make more sense would be to create two groups: one with only Russian payload XMLs, and then the exact same group with the same commands etc. but with a bunch of English-only phrases added. Compile both (just saving and closing the tree will do that) and see if there is a big difference in the size of the srgs file. This will tell us if it is discarding the English phrases anyway. We could probably also compare how long they take to compile. I will try to devise a way to do this properly, but not today...
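The size comparison itself could be done with a small Python helper like the one below. It is only a sketch: the function name and the file names in the usage comment are hypothetical, and it measures file size only, not compile time.

```python
from pathlib import Path


def compare_grammar_sizes(baseline: Path, mixed: Path) -> dict:
    """Compare the sizes of two compiled grammar files.

    baseline: grammar built from the Russian-only group.
    mixed:    grammar built from the same group plus English-only phrases.
    """
    base_size = baseline.stat().st_size
    mixed_size = mixed.stat().st_size
    # Relative growth; close to 0 would suggest the extra phrases were discarded.
    growth = (mixed_size - base_size) / base_size if base_size else float("inf")
    return {
        "baseline_bytes": base_size,
        "mixed_bytes": mixed_size,
        "growth": growth,
    }


# Example call (hypothetical file names):
# compare_grammar_sizes(Path("russian_only.srgs"), Path("russian_plus_english.srgs"))
```

A growth figure near zero would hint that the English phrases are dropped at compile time, while a large figure would suggest they are being kept in the grammar.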

Filipok · « **Reply #14 on:** April 16, 2014, 06:11:31 AM »

I found out (the hard way) what breaks by having comments inside XML:

- impossible to generate XML through Xsql
- impossible to generate Voice menu html
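As a possible workaround until VC handles comments, they can be stripped from a hand-edited payload XML before VC reads it. The sketch below uses Python's standard `xml.etree.ElementTree`, whose default parser discards comments. The `<payloads>` root element in the sample is an assumption; only `<phrase>` appears in this thread.

```python
import xml.etree.ElementTree as ET


def strip_comments(xml_text: str) -> str:
    """Re-serialize XML without comments.

    ElementTree's default parser skips comments, so parsing the text and
    writing it back out is enough to remove them.
    """
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


# Sample input; the <payloads> root element is hypothetical.
sample = (
    "<payloads>"
    "<!-- info banner view -->"
    "<phrase>инфо баннер, инфо баннеры</phrase>"
    "</payloads>"
)
print(strip_comments(sample))
```

Keeping the commented original as a separate reference copy would preserve the notes for people who don't read Russian.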
