Kinect audio
Requirements
Hardware
Kinect sensor version 1 for Xbox 360 or Windows is supported.
Kinect sensor version 2 and Kinect for Xbox One are not supported.
Software
Although the Kinect can be used with either version of VoxCommando, our experience has been that the advanced features such as beam forming and acoustic echo cancellation work best when using VoxCommandoSP with one of the Kinect language packs as the "Specific Speech Recognizer" (see the General Options tab in VoxCommandoSP).
However, we still prefer the regular VoxCommando SAPI5 engine. We recommend trying these settings with the non-SP version of VC first, and only switching to SP if the regular engine is not working well enough in your environment.
Kinect language packs are available as free downloads from Microsoft for the following languages: German, English (Australian, Canadian, GB, NZ, Irish), Spanish (Spanish, Mexican), French (France, Québécois), Japanese, Italian.
Important notes
1. Please read this Microsoft discussion of known issues with the Kinect SDK.
2. By default, the Kinect sets the microphone input level to 100%. As with all other microphones used with VC, you should turn the input level way down to reduce the probability of false positives. Start with something like 10%, and if the Kinect fails to hear you reliably, then you can gradually raise the input level.
Settings
Attempt to use Kinect device with Audio Streaming
Enable (checkmark) this setting in order to use the Kinect as an audio input with the following settings.
Basic settings:
Adjust tilt on initialization
Physically adjusts the vertical tilt of the Kinect device when VoxCommando first connects to it.
Enable AEC (acoustic echo cancellation)
When working correctly, this should subtract the audio that your computer is playing from the audio picked up by the Kinect, allowing you to be heard better when your computer is playing audio, such as music.
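To make the idea of "subtracting what the computer is playing" more concrete, here is a minimal sketch of a generic adaptive echo canceller (a normalized LMS filter). It is not the Kinect SDK's or VoxCommando's implementation; the function name, the simulated room (a 2-sample delay at 60% volume), and all parameter values are assumptions chosen only for illustration.

```python
import random


def nlms_echo_cancel(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively filtered copy of `ref` from `mic`.

    mic: samples picked up by the microphone (echo, plus any speech)
    ref: samples the computer is playing (the known reference signal)
    Returns the residual signal after the estimated echo is removed.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        e = m - echo_estimate                      # what is left after subtraction
        norm = sum(x * x for x in history) + eps   # normalize by reference power
        weights = [w + mu * e * x / norm for w, x in zip(weights, history)]
        residual.append(e)
    return residual


def energy(samples):
    return sum(s * s for s in samples)


rng = random.Random(0)
playback = [rng.uniform(-1.0, 1.0) for _ in range(2000)]

# Hypothetical room: the mic hears the playback 2 samples late at 60% volume.
mic_signal = [0.0, 0.0] + [0.6 * s for s in playback[:-2]]

cleaned = nlms_echo_cancel(mic_signal, playback)

print("echo energy (last 200 samples):    ", energy(mic_signal[-200:]))
print("residual energy (last 200 samples):", energy(cleaned[-200:]))
```

Once the filter has adapted, the residual energy should be far smaller than the echo energy, which is why speech becomes easier to pick out while music is playing. Real rooms have much longer and time-varying echo paths, so a practical canceller is considerably more involved than this sketch.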
Enable AGC (automatic gain control)
This should normally be left unchecked because automatic gain control is not good for speech recognition. When no one is speaking, the AGC increases the input gain and starts picking up quiet sounds, which can lead to more false positives.
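The effect described above can be illustrated with a toy gain controller. This is only a sketch of a generic AGC, not the Kinect's actual algorithm; the function name, target level, maximum gain, and signal amplitudes are all hypothetical values.

```python
def simple_agc(samples, target=0.5, max_gain=50.0, rate=0.05):
    """Toy AGC: steer the gain so the smoothed input level approaches `target`."""
    gain = 1.0
    level = 0.0   # smoothed estimate of the input amplitude
    output = []
    gains = []
    for s in samples:
        level += rate * (abs(s) - level)
        desired = min(max_gain, target / max(level, 1e-6))
        gain += rate * (desired - gain)   # move gradually toward the desired gain
        output.append(gain * s)
        gains.append(gain)
    return output, gains


# Hypothetical input: 200 samples of "speech" at amplitude 0.5,
# followed by 400 samples of quiet background noise at amplitude 0.02.
speech = [0.5 if i % 2 == 0 else -0.5 for i in range(200)]
background = [0.02 if i % 2 == 0 else -0.02 for i in range(400)]

output, gains = simple_agc(speech + background)

print("gain at end of speech: ", round(gains[199], 2))
print("gain at end of silence:", round(gains[-1], 2))
print("background level after AGC:", round(abs(output[-1]), 2))
```

During the quiet stretch the gain climbs until the background noise is boosted to roughly the same level as speech, which is the situation that tends to produce false positives.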
Beam forming settings:
- Automatic: Default - recommended setting.
- Adaptive: Similar to Automatic, may work better in noisy environments.
- Manual angle: Allows you to set the beam to a fixed angle. (0 degrees is directly in front of the device.)
Beam angle events
- This option cannot be selected if the beam forming mode is set to manual.
- When selected, VoxCommando will generate events whenever the Kinect device detects that the audio source has moved and adjusts the beam angle.
- Warning: this may result in frequent events depending on the sound in your environment.
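If the beam angle events turn out to be too frequent, one general way to cope is to throttle them before acting on them. The sketch below is a standalone illustration only: it does not use any VoxCommando API, and the event format (timestamp in seconds, angle in degrees), the function name, and the thresholds are all assumptions.

```python
def throttle_angle_events(events, min_change_deg=10.0, min_interval_s=1.0):
    """Keep only events that are far enough apart in both time and angle.

    events: iterable of (timestamp_seconds, angle_degrees), in time order.
    """
    kept = []
    last_time = None
    last_angle = None
    for t, angle in events:
        if last_time is not None:
            if t - last_time < min_interval_s:
                continue   # too soon after the last accepted event
            if abs(angle - last_angle) < min_change_deg:
                continue   # beam has not moved enough to matter
        kept.append((t, angle))
        last_time, last_angle = t, angle
    return kept


# Hypothetical burst of beam angle changes in a noisy room.
raw_events = [
    (0.0, 0.0), (0.2, 10.0), (0.4, 0.0), (1.5, 2.0),
    (3.0, 30.0), (3.1, 20.0), (5.0, 30.0),
]

filtered = throttle_angle_events(raw_events)
print(filtered)
```

With these made-up values, only the first event and the later large move to 30 degrees pass through; the rapid back-and-forth changes are dropped.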