Home
Jay Miller edited this page May 18, 2020 · 3 revisions
In the simplest case, you just need to instantiate a WebRtcVad object and supply it with a byte[] of audio.
bool DoesFrameContainSpeech(byte[] audioFrame)
{
    using var vad = new WebRtcVad();
    return vad.HasSpeech(audioFrame, SampleRate.Is8kHz, FrameLength.Is10ms);
}
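WebRtcVad implements IDisposable, so a using block (or a using declaration, as in the example above) is necessary to release it when you are done.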
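To illustrate what the using declaration does, here is a minimal sketch built around a stand-in class. FakeVad, UsingDemo, and RunOnce are hypothetical names invented for this illustration and are not part of the library; the sketch relies only on standard C# using semantics, where Dispose is called when the enclosing scope exits.

```csharp
using System;

// Hypothetical stand-in for a disposable detector; NOT the real WebRtcVad class.
public sealed class FakeVad : IDisposable
{
    public bool IsDisposed { get; private set; }

    // Placeholder logic: reports "speech" if any byte is non-zero.
    public bool HasSpeech(byte[] audioFrame)
    {
        if (IsDisposed) throw new ObjectDisposedException(nameof(FakeVad));
        return Array.Exists(audioFrame, b => b != 0);
    }

    public void Dispose() => IsDisposed = true;
}

public static class UsingDemo
{
    // Returns the instance only so the caller can observe that it was disposed on scope exit.
    public static FakeVad RunOnce(byte[] frame, out bool result)
    {
        using var vad = new FakeVad(); // Dispose() runs automatically when RunOnce returns
        result = vad.HasSpeech(frame);
        return vad;
    }
}
```

After RunOnce returns, the returned instance reports IsDisposed as true, and calling HasSpeech on it again would throw ObjectDisposedException. Without the using declaration, Dispose would have to be called by hand (typically in a try/finally).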
A more complicated use case might pre-configure the VAD for a particular audio format, then stream frames over a full audio file.
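using var vad = new WebRtcVad()
{
    OperatingMode = OperatingMode.Aggressive,
    FrameLength = FrameLength.Is30ms,
    SampleRate = SampleRate.Is16kHz,
};
// Bytes per frame: (samples per ms) * (2 bytes per sample) * (frame length in ms).
var frameSize = (int)vad.SampleRate / 1000 * 2 * (int)vad.FrameLength;
var buffer = new byte[frameSize];
using var audio = OpenAudioFile(filename);
// Use <= so the last complete frame is not skipped; a trailing partial frame is ignored.
for (long offset = 0; offset <= audio.Length - frameSize; offset += frameSize)
{
    // Stop if the stream returns fewer bytes than a full frame.
    if (audio.Read(buffer, 0, buffer.Length) < buffer.Length) break;
    var hasSpeech = vad.HasSpeech(buffer);
    Console.WriteLine($"Frame at byte offset {offset}: {hasSpeech}");
}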
In the above example, we are:
- configuring the operating mode, frame length and sample rate,
- computing the frame size (in bytes) and allocating a buffer,
- opening the audio file stream,
- reading each frame into our buffer, and
- passing the buffer to the pre-configured VAD.
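The frame size arithmetic can be checked in isolation. The sketch below assumes 16-bit (2-byte) mono PCM samples, which is what the factor of 2 in the frame size formula above suggests; FrameMath, FrameSizeInBytes, and CountWholeFrames are hypothetical helper names used for illustration, not part of the library.

```csharp
using System;

public static class FrameMath
{
    // Assumes 16-bit (2-byte) mono PCM samples, matching the factor of 2 in the formula above.
    public static int FrameSizeInBytes(int sampleRateHz, int frameLengthMs)
    {
        const int bytesPerSample = 2;
        return sampleRateHz / 1000 * bytesPerSample * frameLengthMs;
    }

    // Number of complete frames in a stream of the given length;
    // a trailing partial frame is not counted.
    public static long CountWholeFrames(long totalBytes, int frameSize)
    {
        if (frameSize <= 0) throw new ArgumentOutOfRangeException(nameof(frameSize));
        return totalBytes / frameSize;
    }
}
```

For the configuration used above (16 kHz, 30 ms), FrameSizeInBytes(16000, 30) gives 16 * 2 * 30 = 960 bytes, and for the simplest-case settings (8 kHz, 10 ms) it gives 160 bytes. A 2000-byte stream at 960 bytes per frame holds 2 complete frames, with the remaining 80 bytes left over.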