Yes, you need to combine the groups from the two standard configurations, and you need to remove the overlap between the two groups (i.e. the commands to listen, etc.).
You can create commands to launch both / either program
You can create commands to focus both / either program
You can control MediaMonkey entirely without it needing to have focus or even be visible. Some things in XBMC require focus, but some do not.
The tricky bit is that if you have commands that use the same language for both programs, then VC won't know which one you want. For example if you say "play" it may not know which program you want to play.
There are two basic approaches to this problem.
1) use different phrases for each program, or use a prefix. Each group of commands can be given its own prefix that overrides the standard prefix. So you could use "Monkey" as the prefix for all your MediaMonkey commands if you like.
2) turn groups of commands on and off depending on which program you want to use. Vox won't listen for commands in a group that has been disabled. There are now a number of different ways to accomplish this:
A) Using actions: From a command that is triggered by a focus event or by a voice command, you can call some of these commands to turn various groups on and off:
http://voxcommando.com/mediawiki/index.php?title=Actions#EnableGroup
B) Using group properties: Edit group properties and where it says "active only for program" you can enter the process name. The commands will automatically switch on and off depending on which program has focus. You can set your XBMC groups to only run when XBMC has focus by entering xbmc in this group property field.
Note that "active only for program" can be inverted. If you want a group to be active only when XBMC does NOT have focus you can enter !xbmc (the ! means 'not')
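To make the two approaches a bit more concrete, here is a rough Python sketch of the logic. This is only a toy model for illustration and not how VC is actually implemented: the Group class, is_active and match are made-up names, the standard prefix is left out, and I am assuming that process names are compared without regard to case.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical stand-in for a command group (not VC's real data model)."""
    name: str
    phrases: list
    prefix: str = ""           # optional group prefix, e.g. "monkey"
    active_only_for: str = ""  # process name, "!name" to invert, "" = always on


def is_active(group, focused_process):
    """Apply the 'active only for program' rule, including the ! inversion."""
    rule = group.active_only_for.strip().lower()
    if not rule:
        return True  # no rule set: the group is always listening
    focused = focused_process.lower()
    if rule.startswith("!"):
        return focused != rule[1:]  # active when that program does NOT have focus
    return focused == rule          # active only when that program has focus


def match(groups, spoken, focused_process):
    """Return the names of the active groups that would respond to a phrase."""
    spoken = spoken.strip().lower()
    hits = []
    for group in groups:
        if not is_active(group, focused_process):
            continue
        for phrase in group.phrases:
            # A group prefix (if any) must be spoken before the phrase.
            if f"{group.prefix} {phrase}".strip().lower() == spoken:
                hits.append(group.name)
                break
    return hits


# Approach 1: a prefix makes "play" unambiguous.
xbmc_plain = Group("XBMC", ["play", "pause"])
monkey_prefixed = Group("MediaMonkey", ["play", "pause"], prefix="monkey")

# Approach 2: groups switch on and off with focus.
xbmc_focus = Group("XBMC", ["play", "pause"], active_only_for="xbmc")
monkey_focus = Group("MediaMonkey", ["play", "pause"], active_only_for="!xbmc")
```

With the prefix setup, match([xbmc_plain, monkey_prefixed], "monkey play", "xbmc") only hits the MediaMonkey group. With the focus setup, match([xbmc_focus, monkey_focus], "play", "xbmc") only hits the XBMC group, and the same phrase goes to MediaMonkey whenever some other program has focus. If neither a prefix nor a focus rule is set, both groups match "play", which is exactly the ambiguity described above.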
I can put something together for you, but I need to know more about the details of how you want it to work. There is a lot of flexibility in how this can be accomplished, so you can get it exactly how you want it...
Another consideration is that if you have a very large music library, you might want to remove all the commands for requesting music by name from XBMC to save memory and reduce load times. At the very least you will probably want to remove the "play song" commands since they are usually the biggest in terms of PayloadXML files.
Sorry if this is a bit more complicated than you expected. With power and flexibility we usually also end up with things being a bit more complicated. Of course once you set it up the way you want it, it should still be easy to use.