Decoding the model #8
Comments
Hi! Thanks for sharing. I am also learning about the project, and when I try to generate denoise_data9.h5 using bin2hdf5.py, something goes wrong. Could you please show me how to use bin2hdf5.py? Thanks.
@zhaoforever you can use the main function in denoise.c to get the features for training.
@liuanping so you've been able to train a model with my code -- cool! As for loading it, I've had all kinds of problems getting Keras to do that. I'll try the change you suggested (once I'm done with the ICASSP paper).
@jmvalin I have trained a model to try ASR in noisy conditions, and the result is not so good. I think it is because the bands in the high-frequency range are a little too wide. So I am also trying a 40-band feature to train a new model, and I constrain the frequency range to 0 Hz-8 kHz because the sampling rate is 16 kHz. I hope it will be better.
@liuanping I think a 16 kHz sampling rate will be enough for both the ASR task and communication.
@zhaoforever I found that the speech information is damaged in the high-frequency band, which may hurt ASR because the machine can hear the high-frequency information. So I want to preserve the high-frequency information by using more bands, e.g. 40 instead of 22.
@liuanping As @jmvalin mentioned, this project mainly targets speech communication instead of speech recognition, so the algorithm architecture is based on speech codec algorithms. I don't think RNNoise will work well for the ASR task without changing the sampling rate and band features. I think you are on the right track, and I am looking forward to your results.
Please keep the discussion public (if possible) so we can all benefit, thanks.
@liuanping Thanks for your Keras code; it works. But as you said, you changed the band features to 40 bands instead of 22. As far as I know, the features from the C code use some algorithms from Opus, which means you have to make a lot of changes. Have you finished the test yet?
Yeah, I have done it, but it does not improve ASR. Some people advised me to train it jointly with a DNN or BLSTM. Also, I think this may hurt the voice. Now I think it could be combined with beamforming techniques.
First of all, thanks for publishing the RNNoise code... I prepared speech_only.wav, noise_only.wav, speech_noise.wav.
~/rnnoise$ git clone https://github.com/smallmuou/wavutils
and then,
step 1)
step 2) ~/rnnoise/src/denoise_training speech_only.pcm noise_only.pcm output.f32
step 3) ~/rnnoise/training/bin2hdf5.py output.f32 500000 87 denoise_data9.h5
step 4) ~/rnnoise/training/rnn_train.py
step 5) ~/rnnoise/training/dump_rnn.py newweights9i.hdf5 rnn_data.c rnn_data.h
step 6) make clean & make
step 7) ~/rnnoise/examples/rnnoise_demo speech_noise.pcm denoised_speech_noise.pcm
step 8) ~/rnnoise/wavutils/bin/pcm2wav 1 48000 16 denoised_speech_noise.pcm denoised_speech_noise.wav
Good Luck ~~ From Jin
@Gram2017 Thanks for sharing your experience with RNNoise. As you mentioned, the size of speech_only.wav is 3564480, which seems to be a big data file. Does the file come from a public database? Thanks.
step 2) ~/rnnoise/src/denoise_training speech_only.pcm noise_only.pcm output.f32
@18307612949, it is generated by running './compile.sh' in the terminal in /src.
@liuanping Hey. How are you optimizing this method for ASR? Which improvements are useful for ASR?
@Gram2017/everyone The recipe in this comment shows how to generate the data file for a single wave file. Say I have a bunch of speech_only and noise_only PCM files. How do I prep the data for the trainer?
@delip normalize, resample and combine them into one PCM file.
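For anyone unsure what "normalize and combine" could look like in practice, here is a minimal sketch, not the project's own tooling. It assumes every input is headerless, mono, signed 16-bit PCM in the machine's native byte order and already at the same sampling rate; resampling is not covered and would need a separate tool. The function names and the target_peak value are hypothetical choices for illustration.

```python
import array


def load_pcm16(path):
    """Read a headerless file of native-endian signed 16-bit samples."""
    with open(path, "rb") as f:
        data = f.read()
    samples = array.array("h")
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    samples.frombytes(data[: len(data) - len(data) % 2])
    return samples


def normalize_peak(samples, target_peak=29000):
    """Scale samples so the loudest one reaches target_peak (assumed value)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return array.array("h", samples)  # silence or empty: nothing to scale
    gain = target_peak / peak
    # Clamp to the int16 range in case rounding pushes a value over the edge.
    return array.array(
        "h", (max(-32768, min(32767, round(s * gain))) for s in samples)
    )


def combine_pcm(paths, out_path, target_peak=29000):
    """Peak-normalize each file and append it to out_path; return sample count."""
    total = 0
    with open(out_path, "wb") as out:
        for path in paths:
            scaled = normalize_peak(load_pcm16(path), target_peak)
            scaled.tofile(out)
            total += len(scaled)
    return total
```

Calling combine_pcm(["a.pcm", "b.pcm"], "speech_only.pcm") would then give a single file to pass to denoise_training in step 2. Peak normalization is only one possible reading of "normalize"; an RMS-based level may suit some datasets better.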
@Gram2017 Thanks for the detailed description. When I try to execute step 2, it runs continuously with the terminal output shown in the attached screenshot. Do you have any suggestions to solve it?
Hi guys, I learned many things from your discussion. I am wondering: when I use 16 kHz training samples, do I need to change the code or not?
The 'count' variable is set to a very high value in denoise.c, and that is the reason for the long execution time. Reducing the count value will help you see the output.
The output is meant to be redirected to a file and saved. The count variable just determines how many training samples are generated.
I am getting the error below while running "python dump_rnn.py newweights9i.hdf5 rnn_data.c rnn_data.h": File "training/dump_rnn.py", line 88, in I have also tried the code posted by @liuanping, but again I am getting the error below: File "training/dump_rnn.py", line 95, in
I solved that error by just replacing init with __init__ in @liuanping's code.
@venkat-kittu Replacing init with __init__?
@KrishnanParameswaran I replaced init with __init__ in the following [image]
Tried to follow both @liuanping's and @venkat-kittu's suggestions, but I still have the following result:
I would like to summarize a few fixes that might be needed for some of the issues highlighted in this thread.
Step-1: Edit rnnoise/src/denoise.c line 653 for the following. NOTE: you may optionally reduce the count to a value lower than 50000000. I used 50000 for quick validation purposes. I guess 500000 samples should be good enough; please correct me if not.
Step-2: Run './compile.sh'.
Follow the first comment by @liuanping in this discussion thread, with these corrections:
Correction 1: missing multiplication operators in 10K.square and 0.01K.binary_crossentropy in the mycost method (they should read 10*K.square and 0.01*K.binary_crossentropy).
Correction 2: invalid class initializer 'init' in class WeightClip(Constraint); it should be __init__.
Correction 3: indentation issues, if any.
Do I have to change the code if I train on data with a different sampling rate, like 44100 Hz or 16 kHz?
@Gram2017 thank you for your detailed example. It is not clear to me how to train on multiple file samples. I mean, suppose I have several
@akshayaCap Hey, I met the same error. It generates an array of size zero. Have you fixed it? Plus, it reports a segmentation fault, but I didn't find an overflow in denoise.c. Thanks a lot.
@zhly0 @venkat-kittu As far as I understand, the application for feature extraction doesn't know whether you're processing a 48 kHz file or a 16 kHz file. The number of samples you feed in per frame must be based on the sampling rate of the input file. So play around with the frame size to suit the sampling frequency of your audio files.
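As a rough worked example of the frame-size arithmetic described above: assuming the feature extractor works on 10 ms frames (which would correspond to 480 samples per frame at 48 kHz; treat that figure as an assumption to check against FRAME_SIZE in your copy of denoise.c), the per-frame sample count for other rates follows directly. The helper name samples_per_frame is hypothetical.

```python
def samples_per_frame(sample_rate_hz, frame_ms=10):
    """Number of samples in one frame of frame_ms milliseconds."""
    samples, remainder = divmod(sample_rate_hz * frame_ms, 1000)
    if remainder:
        # A fractional frame length cannot be used as an array size.
        raise ValueError("frame length is not a whole number of samples")
    return samples


for rate in (48000, 44100, 16000):
    print(rate, "Hz ->", samples_per_frame(rate), "samples per frame")
```

Changing the frame size may not be the only adjustment needed, since the band layout in the C code may also be tied to the sampling rate; this sketch does not address that.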
Firstly, thanks for your answer, but I still have questions. I want to train the model using my own dataset. I have two 30 s wav files, HumanVoice_only.wav and Noise_only.wav. Their format is as shown in the picture above; the sample rate is 48 kHz. First I converted these two wav files into pcm files following the steps that @Gram2017 mentioned above. And then I use
@nerv3890 have you changed stdout in lines 653, 654, 655, 656 in denoise.c to fout?
Sorry, it seems like a simple question. I am not familiar with C.
Hi, I'm still unclear whether it is possible to obtain the original training data: denoise_data9.h5? Thanks. Jon
Is it necessary to normalize? Can't we just combine them? Thanks
Does anyone else have the dump_rnn.py problem?
Hey, I was trying to replicate some results. As soon as I ran compile.sh, I got this in return
Hey, I am also getting an empty output file. When I saw this comment I went to denoise.c and tried to change it, but could not, because there are only 642 lines in total. Can you give me a snapshot?
Hi, @jmvalin mentioned that this project mainly targets speech communication instead of speech recognition. I'd like to know which paper this sentence comes from. Thank you, @zhaoforever.
Hi @pranshurastogi29, have you solved this problem? I have the same problem...
Why? I always get matrix size: 0 x 87
I get the same issue as you. Have you solved this?
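One way to narrow down a "matrix size: 0 x 87" result is to check how many rows the feature file actually holds before running bin2hdf5.py. The sketch below assumes the file is a flat stream of 32-bit floats with 87 values per row, as the bin2hdf5.py command earlier in the thread suggests; feature_rows is a hypothetical helper, not part of the repository.

```python
import os

FLOAT32_BYTES = 4  # assumed element size of the .f32 feature file


def feature_rows(path, columns=87):
    """Return (complete_rows, leftover_bytes) for a raw float32 feature file."""
    row_bytes = columns * FLOAT32_BYTES
    return divmod(os.path.getsize(path), row_bytes)
```

If feature_rows("output.f32") reports zero rows, the file is empty or shorter than one row, which points at the feature extraction step (for example, output that was never written or redirected to the file) rather than at the conversion script. The row count is also the value to pass as the second argument to bin2hdf5.py.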
When I try to use the provided Python script dump_rnn.py to decode the newweights9i.hdf5 model, I found that it does not work. So I changed a lot of it to make it work. I am not sure whether my approach is right, so I want to share the changes with you. If you have some free time, please help me check them. I have tried it to decode the model I got. I added the code below at the beginning.
from keras import backend as K  # K is used by the loss functions below
from keras.constraints import Constraint

def mean_squared_sqrt_error(y_true, y_pred):
    return K.mean(K.square(K.sqrt(y_pred) - K.sqrt(y_true)), axis=-1)

def my_crossentropy(y_true, y_pred):
    return K.mean(2*K.abs(y_true-0.5) * K.binary_crossentropy(y_pred, y_true), axis=-1)

def mymask(y_true):
    return K.minimum(y_true+1., 1.)

def msse(y_true, y_pred):
    return K.mean(mymask(y_true) * K.square(K.sqrt(y_pred) - K.sqrt(y_true)), axis=-1)

def mycost(y_true, y_pred):
    return K.mean(mymask(y_true) * (10*K.square(K.square(K.sqrt(y_pred) - K.sqrt(y_true))) + K.square(K.sqrt(y_pred) - K.sqrt(y_true)) + 0.01*K.binary_crossentropy(y_pred, y_true)), axis=-1)

def my_accuracy(y_true, y_pred):
    return K.mean(2*K.abs(y_true-0.5) * K.equal(y_true, K.round(y_pred)), axis=-1)

class WeightClip(Constraint):
    def __init__(self, c=2, name='WeightClip'):
        self.c = c

That is, add the argument name='WeightClip' to __init__,
and change load_model from
model = load_model('./newweights9i.h5', custom_objects={'msse': mean_squared_sqrt_error, 'mean_squared_sqrt_error':mean_squared_sqrt_error, 'my_crossentropy':mean_squared_sqrt_error, 'mycost':mean_squared_sqrt_error, 'WeightClip':foo})
to
model = load_model(sys.argv[1], custom_objects={'msse':msse, 'mean_squared_sqrt_error': mean_squared_sqrt_error, 'my_crossentropy':my_crossentropy, 'mycost':mycost, 'WeightClip':WeightClip})
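For readers who find the one-line mycost expression hard to parse, it can be written out as below. This is only a transcription of the code as posted, reading 10K.square and 0.01K.binary_crossentropy as multiplications (as the corrections elsewhere in the thread suggest). N is the size of the last axis, and BCE stands for K.binary_crossentropy with its arguments in the order the code gives them; its exact definition depends on the Keras version and is not expanded here.

```latex
\[
\mathrm{mycost}(y, \hat{y})
  = \frac{1}{N}\sum_{i=1}^{N} m(y_i)\,
    \Bigl[\,10\,d_i^{4} + d_i^{2} + 0.01\,\mathrm{BCE}(\hat{y}_i, y_i)\Bigr],
\qquad
d_i = \sqrt{\hat{y}_i} - \sqrt{y_i},
\qquad
m(y_i) = \min(y_i + 1,\ 1)
\]
```

Note that m(y_i) is 0 when y_i = -1 and 1 when y_i is 0 or greater, so any target equal to -1 contributes nothing to the cost.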