Topic: Features to help with typesetting and other applications

nathanrm

  • Jr. Member
  • **
  • Posts: 4
  • Karma: 0
Features to help with typesetting and other applications
« on: January 05, 2012, 03:08:04 PM »
Hi,
I recently purchased VoxCommando, and really like it.  I'm using it to typeset equations in Mathematica while I'm writing my physics PhD thesis.  Mathematica has hundreds of keyboard shortcuts for typesetting, but they're annoying to type and would take a while to memorize.  With VoxCommando I can access them verbally using the standard names I already know.

I'm using version 0.921 of VoxCommando on Windows 7.

I have some suggestions for how it could be improved for typesetting and other applications.  Essentially, I would like to fluidly dictate equations using standard math terminology.  The neat thing about this is that it's easy to do, works even for advanced math, and seems to be a somewhat unexplored frontier.  The closest existing product is Mathtalk, which is expensive because it's targeted at people with disabilities.

Specific suggestions:
1.) I couldn't achieve reliable entry of alphabet characters.  In particular, "a" and "k" aren't distinguished from each other, and neither are "b" and "d".  Windows Speech Recognition works well in spelling mode, though (not going through VoxCommando).  I'm using a good headset with a noise cancelling microphone.  I tried turning on the "Learn" checkbox, which helped some.  I also tried prefixing letters with the word "letter", which didn't help.

2.) Would it be possible for VoxCommando to recognize continuously dictated commands?  It doesn't seem to be able to now.

3.) Command autocompletion: This would be really neat for typesetting, and certainly other uses.  After saying a command, the user could say other words to modify the command.  The next time the command was invoked, it would use the modifications applied to the last invocation.  I would use this to automatically apply font and adornment options to letters.  For example, on the first entry of a symbol I could say something like "s, capital, bold, gothic".  After that I could just say "s", and the right symbol would be entered.

The modifiers (capital, bold, gothic) could simply be other commands that the first command ("s" in the example) would be told to "listen" for, remember, and enter automatically the next time.  When the first command hears another command that it hasn't been told to remember, it would stop listening.  If you like this idea and want to implement it, it would be a good idea to first check that it's not patented.
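A minimal Python sketch of the "remembered modifiers" idea, to make the behaviour concrete.  It is not VoxCommando code; the class name `StickyModifiers` and the method `resolve` are made up for illustration, and each utterance is assumed to arrive as a list of already-recognized words.

```python
class StickyModifiers:
    """Remembers the modifiers last spoken after each base command.

    Hypothetical illustration only; nothing here is part of VoxCommando.
    """

    def __init__(self, modifiers):
        self.modifiers = set(modifiers)  # words a command "listens" for
        self.remembered = {}             # base command -> last modifiers

    def resolve(self, words):
        """Return (base, modifiers) for one utterance given as a word list."""
        base, *rest = words
        spoken = []
        for word in rest:
            # Stop listening at the first word that is not a known modifier.
            if word not in self.modifiers:
                break
            spoken.append(word)
        if spoken:
            # New modifiers replace whatever was remembered before.
            self.remembered[base] = spoken
        # With no modifiers spoken, fall back to the remembered ones.
        return base, list(self.remembered.get(base, []))


if __name__ == "__main__":
    memory = StickyModifiers({"capital", "bold", "gothic"})
    print(memory.resolve(["s", "capital", "bold", "gothic"]))
    print(memory.resolve(["s"]))  # reuses capital, bold, gothic
```

In this sketch a command with no remembered modifiers simply comes back with an empty list, so plain letters keep working as before.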

4.) Import/Export of command groups:  I would use this to post my VoxCommando commands on the web.

Great program!
Nathan

jitterjames

  • Administrator
  • Hero Member
  • *****
  • Posts: 7713
  • Karma: 116
Re: Features to help with typesetting and other applications
« Reply #1 on: January 05, 2012, 06:06:56 PM »
4)  Easiest answer first... this exists already.  Right click a group to export it.

If you want to export multiple groups, you need to arrange them in the bin (right-hand tree in the editor) and then "save the bin to a new file".

To import a group, use method a or b:

a) Drag the XML file into the bin, which will open it in the bin, and then you can drag the commands you want over to your main tree.

b) Drag the XML file into the main tree, which will import everything to your main tree.

Either way, you then have the option to save or not...

1) When you say "Windows Speech Recognition works well in spelling mode", do you mean when you are able to say "s as in sam" or "a as in apple"?  Or are you just talking about saying "a"?  For this point, and possibly for others, I'd like to see your voicecommands.xml file to see how you are currently doing things, and maybe discuss it with you on Skype or even share a screen with TeamViewer.  I'm sure it would be a benefit to both of us.

2) I'd like to discuss this.  I'm not 100% sure what you mean by "continuous dictation".  I don't think there is a way to do it with VC the way it is now, but it might be possible to create something.

3) This one is tricky, but it may actually be possible with VC as it is now, though it would probably also be possible to make it easier.  I'm not sure I've completely wrapped my head around what you want, and it would help if I could see what you are already doing.

Glad to have you aboard!

nathanrm

Re: Features to help with typesetting and other applications
« Reply #2 on: January 06, 2012, 04:50:26 PM »
1)  If I spell words out like "c", "a", "t", Windows Speech Recognition works okay with its standard training.  I've gotten VoxCommando to recognize letters well by training it, but the letter "a" keeps being recognized as "k" no matter what I do.  I don't think I say it like "k".

Some example command groups are attached to this post.

2) Currently VC seems to require a slight pause between commands.  It's still great for use in typesetting, but it would be more natural if VC could take a stream of commands without pauses.
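To illustrate what "a stream of commands without pauses" would involve, here is a rough Python sketch that splits one transcribed string into known command phrases by greedy longest match.  This is not how VoxCommando works; the function name `split_commands` and the phrase list are assumptions for illustration, and the sketch takes for granted that the recognizer has already produced the right words.

```python
def split_commands(transcript, phrases):
    """Split a transcript into known command phrases (greedy longest match).

    Hypothetical helper: `phrases` stands in for whatever command phrases
    the user has defined. Words that match no phrase are skipped.
    """
    known = {tuple(p.lower().split()) for p in phrases}
    longest = max((len(p) for p in known), default=0)
    words = transcript.lower().split()
    commands = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(longest, len(words) - i), 0, -1):
            candidate = tuple(words[i:i + n])
            if candidate in known:
                commands.append(" ".join(candidate))
                i += n
                break
        else:
            i += 1  # no phrase starts here; skip this word
    return commands


if __name__ == "__main__":
    phrases = ["alpha", "beta", "squared", "plus", "square root"]
    print(split_commands("alpha squared plus square root beta", phrases))
```

The hard part in practice is the recognition itself, not the splitting: this only helps if the words in the transcript are already correct.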

3) I'm working on a more detailed explanation of the idea.

4) Sorry, I should have seen that.

I'm willing to do either a Skype or TeamViewer session.  I don't have either installed yet; I'm guessing TeamViewer would be easier.  My schedule is pretty flexible, so you can suggest a time.

Thanks for your help!

jitterjames

Re: Features to help with typesetting and other applications
« Reply #3 on: January 06, 2012, 06:37:02 PM »
1)  This requires more discussion and explanation, but basically VC uses the same engine as Windows' built-in speech recognition (WSR), and when you train one it trains the other.  So you should not see a significant difference between the two.  It may come down to context.  The difference is that you are using WSR in dictation mode.  Try using VC in dictation mode and you should expect to get exactly the same results.

Although it is unlikely, it is possible that for some reason VC is using different input levels than WSR.  You could try lowering the input level on your mic/headset to see if it helps with the k vs a.  Obviously the k and a have identical vowel sounds, so maybe it is hearing some kind of distortion.  If you are wearing a headset, also make sure the mic is off to the side of your mouth and not too close.

2) This is not really possible.  At least I don't think so.  I've never really seen it done in speech recognition except when doing straight dictation with no commands.  If you are using WSR in MS Word and you say "one two three delete", it will type exactly that.  You need to pause before issuing the delete command.  Either you are in dictation mode or command mode.  You can't switch back seamlessly from one to the other in a single sentence and expect it to know the difference.  If you want to be able to speak continuously, you would need to just use dictation, and then look at the string that is produced and try to extract commands from it, but it is not something that is likely to work well.

There is a payload in VC called payload dictation that lets you speak continuously and returns a long string when you stop talking.  I will be adding an option to do spelling dictation so it will interpret everything you say as a letter.  Then you will be able to say "c a t e as in Edward colon" and it should give you "c a t e :".  But you won't be able to jump directly into Greek letters from this, and as with any payload in VC, it needs to be preceded by a phrase such as "type".
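As a rough illustration of the spelling example above, this Python sketch turns a dictated string such as "c a t e as in Edward colon" into "c a t e :".  It is only a guess at the kind of post-processing involved, not the actual VoxCommando implementation; the function `normalize_spelling` and the `PUNCTUATION` table are invented for the example.

```python
# Hypothetical table of spoken punctuation names; not taken from VoxCommando.
PUNCTUATION = {"colon": ":", "comma": ",", "period": ".", "semicolon": ";"}


def normalize_spelling(transcript):
    """Reduce a spelled-out transcript to letters and punctuation marks."""
    words = transcript.lower().split()
    out = []
    i = 0
    while i < len(words):
        word = words[i]
        is_letter = len(word) == 1 and word.isalpha()
        # "<letter> as in <word>": keep the letter, skip the helper words.
        if (is_letter
                and words[i + 1:i + 3] == ["as", "in"]
                and i + 3 < len(words)
                and words[i + 3].startswith(word)):
            out.append(word)
            i += 4
        elif is_letter:
            out.append(word)
            i += 1
        elif word in PUNCTUATION:
            out.append(PUNCTUATION[word])
            i += 1
        else:
            i += 1  # anything else is dropped in this sketch
    return " ".join(out)


if __name__ == "__main__":
    print(normalize_spelling("c a t e as in Edward colon"))  # c a t e :
```

Dropping unknown words is a simplification; a real version would more likely reject the whole utterance or ask for it to be repeated.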

3-4) OK.  We can also talk on the phone if you want.  TeamViewer is easier to install than Skype.  In fact, I don't think you even need to install it.  You used to be able to just download and run it.  Installation was optional.  If you are in North America I can call you for almost free on my VoIP.  You can PM or email me to set it up, though I think I need a few days to deal with other stuff.

nathanrm

Re: Features to help with typesetting and other applications
« Reply #4 on: January 10, 2012, 12:36:38 PM »
1) and 2)   My current setup with VC is already way better than using keyboard shortcuts or other options.  I think these items can be left alone.

I've written a description of item 3).  Could you send me an email, so I can send you the Word document?