Jay Miller edited this page May 18, 2020 · 3 revisions

Usage

In the simplest case, instantiate a WebRtcVad object and pass it a byte[] containing one frame of audio.

bool DoesFrameContainSpeech(byte[] audioFrame)
{
  using var vad = new WebRtcVad();
  return vad.HasSpeech(audioFrame, SampleRate.Is8kHz, FrameLength.Is10ms);
}

⚠️ Note that WebRtcVad implements IDisposable, so dispose of each instance when you are done with it, for example with a using declaration as in the snippet above.

A more complicated use case might pre-configure the VAD for a particular audio format, then stream frames over a full audio file.

using var vad = new WebRtcVad()
{
    OperatingMode = OperatingMode.Aggressive,
    FrameLength = FrameLength.Is30ms,
    SampleRate = SampleRate.Is16kHz,
};

var frameSize = (int)vad.SampleRate / 1000 * 2 * (int)vad.FrameLength;
var buffer = new byte[frameSize];
using var audio = OpenAudioFile(filename);

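// Process each complete frame; a trailing partial frame is ignored.
for (long offset = 0; offset + frameSize <= audio.Length; offset += frameSize)
{
    // Stream.Read may return fewer bytes than requested, so read until the buffer is full.
    int read = 0, n;
    while (read < buffer.Length && (n = audio.Read(buffer, read, buffer.Length - read)) > 0)
        read += n;
    if (read < buffer.Length) break; // unexpected end of stream

    var hasSpeech = vad.HasSpeech(buffer);
    Console.WriteLine($"Frame {offset / frameSize}: {hasSpeech}");
}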

In the above example, we are:

  • configuring the operating mode, frame length and sample rate,
  • computing the frame size (in bytes) and allocating a buffer,
  • opening the audio file stream,
  • reading each frame into our buffer, and
  • passing the buffer to the pre-configured VAD.
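
The frame-size arithmetic can be checked on its own, without the VAD. The sketch below assumes 16-bit mono PCM input (which is what the factor of 2 in the example implies) and uses a hypothetical helper, FrameSizeInBytes, that is not part of the library. It takes plain integers where the example casts the SampleRate and FrameLength enum values.

```csharp
using System;

// Hypothetical helper, not part of WebRtcVad: bytes in one frame of PCM audio.
// Assumes mono input with bytesPerSample bytes per sample (2 for 16-bit PCM).
static int FrameSizeInBytes(int sampleRateHz, int frameLengthMs, int bytesPerSample = 2)
    => sampleRateHz / 1000 * frameLengthMs * bytesPerSample;

Console.WriteLine(FrameSizeInBytes(16000, 30)); // 16 samples/ms * 30 ms * 2 bytes = 960
Console.WriteLine(FrameSizeInBytes(8000, 10));  // 8 samples/ms * 10 ms * 2 bytes = 160
```

So the 16 kHz, 30 ms configuration above reads 960 bytes per frame, while the 8 kHz, 10 ms call in the first example expects 160 bytes.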