Kinect audio

From VoxCommando

Requirements

Hardware

Supported: Kinect sensor version 1 (for Xbox 360 or for Windows).
Not supported: Kinect sensor version 2 and Kinect for Xbox One.

Software

Kinect for Windows SDK v1.8

The Kinect can be used with either version of VoxCommando. In our experience, advanced features such as beam forming and acoustic echo cancellation work best in VoxCommandoSP with one of the Kinect language packs selected as the "Specific Speech Recognizer" (see the General Options tab in VoxCommandoSP).

Even so, we still prefer the regular VoxCommando SAPI5 engine. We recommend trying these settings with the non-SP version of VC first, and switching to SP only if the regular engine does not work well enough in your environment.

Kinect language packs are available as free downloads from Microsoft for: German, English (Australia, Canada, Great Britain, New Zealand, Ireland), Spanish (Spain, Mexico), French (France, Quebec), Japanese, and Italian.

Important notes

1. Please read this Microsoft discussion of known issues with the Kinect SDK.

2. By default, the Kinect sets the microphone input level to 100%. As with all other microphones used with VC, you should turn the input level way down to reduce the probability of false positives. Start with something like 10%, and if the Kinect fails to hear you reliably, then you can gradually raise the input level.

Settings

Options-Kinect.png

Attempt to use Kinect device with Audio Streaming

Enable (check) this setting to use the Kinect as an audio input. The settings described below apply only when it is enabled.

Basic settings:

Adjust tilt on initialization

Physically adjusts the vertical tilt of the Kinect device when VoxCommando first connects to it.

Enable AEC (acoustic echo cancellation)

When working correctly, this should subtract the audio that your computer is playing from the audio picked up by the Kinect, allowing you to be heard better when your computer is playing audio, such as music.
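
To make the idea of "subtracting what the computer is playing" more concrete, here is a minimal sketch of a textbook adaptive echo canceller (normalized LMS). It is not the algorithm used by the Kinect SDK, which is not described on this page. The function name nlms_echo_cancel, the three-tap "room" echo path, and all parameter values are assumptions made up for illustration.

```python
import random

def nlms_echo_cancel(playback, mic, taps=4, mu=0.5, eps=1e-8):
    """Return the mic signal with an adaptive estimate of the playback echo removed."""
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent playback samples, newest first
    cleaned = []
    for x, d in zip(playback, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate            # what is left after removing the echo
        cleaned.append(e)
        norm = sum(h * h for h in history) + eps
        # Nudge the echo-path estimate in the direction that reduces the residual.
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
    return cleaned

random.seed(0)
playback = [random.uniform(-1, 1) for _ in range(2000)]   # audio the PC is playing
room = [0.6, 0.3, 0.1]                                    # hypothetical echo path
mic = [
    sum(room[k] * playback[n - k] for k in range(len(room)) if n - k >= 0)
    for n in range(len(playback))
]
cleaned = nlms_echo_cancel(playback, mic)

def energy(samples):
    return sum(s * s for s in samples)

# Compare how much echo remains once the filter has had time to adapt.
print(energy(mic[-200:]), energy(cleaned[-200:]))
```

In this toy setup the microphone hears only the echo, so the cleaned signal should approach silence once the filter has adapted. With a real voice mixed in, the voice is what would remain.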

Enable AGC (automatic gain control)

This should normally be left unchecked because automatic gain control is not good for speech recognition. When no one is speaking the AGC will increase the gain on the input and start picking up quiet sounds which can lead to more false positives.
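
The effect described above can be illustrated with a toy gain controller. This is only a sketch of the general AGC idea, not the Kinect's implementation. The function simulate_agc and its target, rate, and max_gain values are hypothetical.

```python
def simulate_agc(levels, target=0.5, rate=0.1, max_gain=20.0):
    """Toy AGC: adjust the gain each frame so the output level approaches `target`."""
    gain = 1.0
    outputs = []
    for level in levels:
        out = level * gain
        outputs.append(out)
        if out < target:
            # Output too quiet: raise the gain (up to a ceiling).
            gain = min(max_gain, gain * (1 + rate))
        else:
            # Output loud enough: lower the gain (never below 1.0).
            gain = max(1.0, gain * (1 - rate))
    return outputs, gain

speech = [0.5] * 20     # frames where someone is speaking
silence = [0.02] * 60   # frames with only quiet background noise
outputs, final_gain = simulate_agc(speech + silence)

print("first silent frame:", outputs[20])
print("last silent frame: ", outputs[-1])
print("final gain:        ", final_gain)
```

While speech is present the gain stays at 1.0. Once the room goes quiet, the gain climbs to its ceiling and the same background noise reaches the recognizer many times louder, which is why AGC tends to produce false positives.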

Beam forming settings:

  • Automatic: The default and recommended setting.
  • Adaptive: Similar to Automatic, may work better in noisy environments.
  • Manual angle: Allows you to set the beam to a fixed angle. (0 degrees is directly in front of the device.)
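
As a rough illustration of what a beam angle means, the sketch below computes the arrival-time difference between two microphones for a sound coming from a given angle. A beam former compensates for such delays to favour one direction. The 0.1 m spacing is an assumed example value rather than the Kinect's actual microphone layout, and steering_delay is a hypothetical helper.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, in air at room temperature

def steering_delay(angle_deg, spacing_m):
    """Arrival-time difference (seconds) between two mics for a far-away source.

    0 degrees is directly in front of the device, so both mics hear the
    sound at the same time and the delay is zero.
    """
    return spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND

for angle in (0, 15, 30):
    delay_us = steering_delay(angle, spacing_m=0.1) * 1e6
    print(f"{angle:>3} degrees: {delay_us:.1f} microseconds")
```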

Beam angle events

  • This option cannot be selected if the beam forming mode is set to Manual angle.
  • When selected, VoxCommando generates an event whenever the Kinect device detects that the audio source has moved and adjusts the beam angle.
  • Warning: this may result in frequent events depending on the sound in your environment.