Does VoxCommando really perform as fast as shown in the videos when using the app on Android? I notice that when I use Google Now to 'play movie x', there is a delay while it interprets my command. VoxCommando seems instantaneous.
Usually it is almost instant, yes. The main reason the Google Now method is probably slower is that it uploads your audio to a remote server, where it is processed (without a specific context), and the server then returns a bunch of possible sentences to your phone. Your phone then looks at those sentences, picks the one it thinks is a command it can understand, and acts on it. So depending on your internet upload speed and how busy the Google servers are, you may see some delays. You are also donating your voice data to Google to do with as they please.
With VoxCommando you are streaming your audio to a Windows machine on your local area network, not uploading anything to a remote server. Also, because VoxWav allows (but does not force) you to use a press-and-hold-to-speak button, it knows you have stopped speaking the moment you release the button, so after that it should only take a fraction of a second to determine what command you issued. Any delay after that is the time required to tell XBMC what you want and wait for XBMC to do it. On a fast system this should appear instantaneous, and in any case it will be similar to what you see with something like Yatse when pressing a button.
The main difference is that controlling XBMC is only one of many things you can do with VoxCommando, and you presumably have much more control over how you do everything (I don't know much about Yatse). The downside, from some people's perspective (others will see it only as a perk), is that it is more work to set up and requires Windows.