I have the same concern, but in theory it looks doable.
There are not that many languages: if I remember correctly, 21, and some of them are just different dialects (en-US, en-CA, en-GB), so the alphabet would be the same.
The logic I am thinking of when building the grammar:
parse the phrases
remove all characters that do not conform to the selected SR language (you would just need one regex per language)
if the remaining phrase is empty (or shorter than a defined threshold), ignore it completely
Of course, all of this should be optional for the user. But it is very important, especially if it slows down the engine.
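The steps above can be sketched roughly like this in Python. This is only an illustration, not actual code from the project: the `NON_CONFORMING` table, the `filter_phrases` function, the `min_length` threshold and the `enabled` switch are all names I made up, and the character ranges are my guess at what each language would need.

```python
import re

# Hypothetical table: for each SR language, a regex matching the characters
# to REMOVE. None means "no filtering required for this language".
NON_CONFORMING = {
    "ru-RU": re.compile(r"[^\u0400-\u04FF\s'\-]"),  # keep Cyrillic, spaces, ' and -
    "en-US": re.compile(r"[^A-Za-z\s'\-]"),         # keep basic Latin letters
    "de-DE": None,                                  # assumed: nothing to strip
}

def filter_phrases(phrases, language, min_length=1, enabled=True):
    """Return the phrases with non-conforming characters stripped.

    Phrases shorter than min_length after stripping are dropped.
    """
    pattern = NON_CONFORMING.get(language)
    if not enabled or pattern is None:
        # Filtering is optional: pass everything through untouched.
        return list(phrases)
    kept = []
    for phrase in phrases:
        cleaned = pattern.sub("", phrase)
        cleaned = " ".join(cleaned.split())  # collapse leftover whitespace
        if len(cleaned) >= min_length:
            kept.append(cleaned)
    return kept
```

For example, `filter_phrases(["привет world", "hello", "да"], "ru-RU", min_length=2)` would keep "привет" and "да" and drop "hello" entirely. The language key is passed in directly because the SR language is already known at that point.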
For Russian it should be very straightforward, as the alphabets in Russian and English are completely different (even though some letters look the same, they have different character codes; Cyrillic letters are not in ASCII at all, they have their own Unicode code points).
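To illustrate the point about look-alike letters, here is a quick Python check (just a demonstration, the variable names are mine) showing that the Latin "a" and the Cyrillic "а" are different characters even though they render the same:

```python
import unicodedata

latin_a = "\u0061"     # Latin small letter a
cyrillic_a = "\u0430"  # Cyrillic small letter a, looks identical on screen

# The two characters have different code points and different Unicode names.
print(hex(ord(latin_a)), unicodedata.name(latin_a))
print(hex(ord(cyrillic_a)), unicodedata.name(cyrillic_a))

# So a simple comparison or a character-range regex can tell them apart.
same = latin_a == cyrillic_a
print(same)
```

So a regex based on character ranges will separate the two alphabets reliably, no matter how similar the letters look.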
For European languages such as German it should also be no problem, as I believe German has all the English letters plus a few accented and national ones. In any case, for some languages the regex can be empty if filtering is not required.
And you already know which language is selected for SR, so there is no need to detect it.
Btw, can you create a grammar XML file (not compiled) from the backup I sent you? There we can probably see whether the SR ignores the non-Cyrillic characters or not. If it does, then nothing needs to be done.