I believe that so far no one has found a way to directly connect Alexa or Google mics to VoxCommando? Also, what did you mean by what you said about the wake word and the cloud? I thought VoxCommando could also be triggered by a wake word?
As far as I know, no one has found a way to hijack the audio stream directly from these microphones. But even if you could, you might not get very good results feeding this audio into VoxCommando. With Alexa, you can go to their webpage and listen to the audio that was captured, and it generally sounds very poor. The thing with Alexa is that they have 100% control over every stage of the process: the engineering of the microphone, the digital signal processing, the compression used in uploading the stream, and the powerful cloud-based recognition, which can be engineered to get the most out of that very specific-sounding audio. VoxCommando, on the other hand, uses the Microsoft speech engine, which came with Windows Vista and has not changed very much since it was first released. It is not designed to work with a specific sound profile, and it requires decent audio quality without too much background noise. With good audio input it does work extremely well, but it was pretty much designed to be used with a headset.
VoxCommando does use a prefix to reduce false positives. It works reasonably well, but it is not the same thing as a wake word. It might be possible to create something like a wake word for VoxCommando, but you would need to be able to switch the recognition engine on instantly, and it usually does not do well with that. When first turning the speech engine on, I find I usually have to wait a couple of seconds before speaking in order to get good recognition.
I found a lot of multi-mic arrays on Amazon (Jabra, eMeet) that should function like an Echo or Echo Dot. They are used for conference meetings. But I have no idea how inferior these are to the mic arrays used in an Echo Dot.
I have also read that boundary mics should be good for VoxCommando. What about the current beyerdynamic ones? https://www.beyerdynamic.de/bm-32-b.html
The multi-mic arrays that you find will probably not work like the Echo, or if they do, they will probably be extremely expensive. Many of them also use Bluetooth, which is generally not great for Microsoft's speech engine. With the SP engine you might be OK using Bluetooth. If you find a specific array mic that is affordable and you think it might work, tell us the model number and we can take a look. At the end of the day, the only way to know will be to try it.
Some of the boundary mics could probably work well with VoxCommando, but they will pick up EVERYTHING, which, as I mentioned earlier, is only good if you live in a completely silent environment. In reality this is almost never the case, and it certainly makes listening to music at the same time impossible.