Author Topic: Can sox pre-processing be applied to everything I speak?  (Read 1247 times)


marcusvdt

  • Sr. Member
Can sox pre-processing be applied to everything I speak?
« on: July 22, 2015, 02:22:04 PM »
either via PC mic or via Voxwav? I mostly use Voxwav anyway and will probably keep preferring it.

In the Advanced Options there are options for sox pre-processing, but as I understand it, they apply only to the directory of watched wav files.
My house is always noisy with kids and televisions. VC works great when the house is quiet, but not when the house is in its usual state of chaos  :biglaugh

I know I can't magically teach the computer to ignore the background noise and hear only my voice, but perhaps some pre-processing of the audio could help. I'm thinking of frequency filtering, noise removal, and perhaps other processing that reduces the background noise so the voice is easier to recognize.
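A minimal sketch of the kind of frequency filtering mentioned above, assuming sox is installed and on the PATH. The cutoff values (200 Hz and 3400 Hz), the -3 dB normalization level, and the file names are illustrative assumptions, not settings taken from VoxCommando, and there is no guarantee they improve recognition.

```python
import shutil
import subprocess


def build_sox_command(in_path, out_path, low_hz=200, high_hz=3400, norm_db=-3):
    """Return a sox argument list that band-limits and normalizes a wav file.

    All default values are assumptions chosen for illustration only.
    """
    return [
        "sox", in_path, out_path,
        "highpass", str(low_hz),   # attenuate content below low_hz
        "lowpass", str(high_hz),   # attenuate content above high_hz
        "norm", str(norm_db),      # normalize the peak level to norm_db dB
    ]


def run_sox(cmd):
    """Run the command only if its binary is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


# Show the command line that would be executed (hypothetical file names).
print(" ".join(build_sox_command("command.wav", "command_filtered.wav")))
```

Building the argument list separately from running it makes it easy to try different effect chains on the same saved wav file and compare the results by ear or by recognition rate.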

For example, my kid can't be recognized, no matter how he talks. I think it's because his voice is a bit too high-pitched, but I'm not sure. So I would like to experiment with different effects applied to wav files.
Yes, I know the processing will add a delay, but I think it's more important to have the computer hear me correctly 99% of the time than to get a quick but unrecognized command.
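One effect that could be tried for a high-pitched voice is sox's `pitch` effect, which takes a shift in cents (hundredths of a semitone). The sketch below only builds the command line. The three-semitone shift and the file names are assumptions for illustration, and whether lowering the pitch helps recognition at all is untested.

```python
def build_pitch_command(in_path, out_path, semitones):
    """Return a sox argument list that shifts pitch by the given semitones.

    Negative values lower the pitch. sox expects the shift in cents.
    """
    cents = int(round(semitones * 100))
    return ["sox", in_path, out_path, "pitch", str(cents)]


# Hypothetical example: lower a recording by three semitones.
print(" ".join(build_pitch_command("kid.wav", "kid_lower.wav", -3)))
```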

The TCPMic plugin has an option to save wav files, but I suspect I can't use it together with the Watched folder setting in the Advanced Options. I guess that would trigger two concurrent interpretations of each command sent via Voxwav, resulting in a big mess.


Thanks.
« Last Edit: July 22, 2015, 04:40:09 PM by marcusvdt »

jitterjames

  • Administrator
  • Hero Member
Re: Can sox pre-processing be applied to everything I speak?
« Reply #1 on: July 23, 2015, 10:07:24 PM »
Currently this is only an option that can be used with the wavwatch folder.

I can almost guarantee that there is no processing you can do with sox that will actually improve the accuracy of your recognition in a noisy environment.  The only thing that helps without degrading the quality of the signal is to lower the volume if it is too high, and the TCPMic plugin can already do this.

I don't actually think that using sox to process the audio would create a noticeable delay.  On a decent machine it should happen in a fraction of a second anyway.
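The delay can be measured directly rather than guessed. This is a rough sketch that times any external command. It uses the Python interpreter as a stand-in so it runs without sox installed, and the sox command shown in the comment is a hypothetical example.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd to completion and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


# Stand-in command. To time sox, pass something like
# ["sox", "in.wav", "out.wav", "highpass", "200"] instead (hypothetical files).
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"elapsed: {elapsed:.3f} s")
```

The figure includes process start-up time, which is the relevant number here because the processing would be launched once per spoken command.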

You can do a test yourself using two instances of VoxCommando if you want to be sure.  Use the first VoxCommando running TCPMic to create your wav files.  This VC does not need to have any commands (I don't think), or it could have a couple of simple commands if necessary.  Set the second VC to watch the folder where TCPMic from the first VC is saving the wav files.
« Last Edit: July 23, 2015, 10:23:27 PM by jitterjames »