Skip to content

dahliycia/ZURAW

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ZURAW

Python Sound Pattern Recognition System

What is ZURAW

ZURAW is a small project which recognizes (to it's best) sounds using previously created patterns. It means it is first learning, then recognizing. The ZURAW project consists of two programs which are run separately, but are dependant on each other:

  • zuraw.py - the learning program. It goes through each file from the data/ directory, and creates a pattern for each category (see more about how to add your data below)
  • zuraw_recognize.py - the recognizing program, which takes your input file, processes it and matches to a computed pattern.

##Requirements

  • python 2.7 or higher
  • scipy (needs numpy)
  • pylab
  • your sound files in .wav

For Windows, you can install scipy and numpy from http://www.lfd.uci.edu/~gohlke/pythonlibs/

For Linux, go with pip install numpy and pip install scipy

##How to use

ZURAW is easy to use. It takes general two steps to use it properly:

###1. Learn the patterns

ZURAW needs to learn from examples before it can recognize sounds for you. All your examples should be added in the data/ directory under separate directories for each category.

For ex.:

  • data/DOG/dog1.wav
  • data/DOG/dog2.wav
  • data/CAT/cat1.wav

etc. There are some files already in the repo's data/ directory, you can look there for reference and delete them if not needed.

When you have already added your patterns, simply run python zuraw.py. It will create pattern files in the patterns/ directory. It may take a few minutes to run.

###2. Recognize your example ZURAW by default recognizes all examples found in test/ directory.

Add your files to test/ directory, for ex.: test/ex1.wav, test/ex.2wav.

Then run python zuraw_recognize.py and it will inform you how the files were recognized.

What you can implement with it

Q: I would like to use some methods from ZURAW in my project. What methods can I use? How?

zuraw.py methods

  • find_types() - goes though the data/ directory and produces patterns under patterns/ directory. This is the main function which calls other functions to do all the computing work.

    • returns: dictionary with all patterns and examples data
  • save_pattern(pattern, name, pattern_path) - saves *pattern* in *pattern_path*/*name*.txt file.

  • get_pattern(type_dict) - creates a pattern for one type(category) from examples in the *type_dict*.

    • returns: pattern (array of float) - pattern is a mean of filtered fourier transforms of your examples
  • get_example(example_name, type_path) - for a single *example_name* it reads the .wav file and computes data.

    • returns: ex_dict - a dictionary with 'ex': normalized example data, 'ex_fourier': computed filtered fft
  • get_fourier(example, ex_length, sampFreq) - for a normalized *example* computes the filtered fourier transform

    • returns: new_fourier - filtered fourier transform array
  • normalize_data(data, sampFreq) - normalizes different types of input data

    • returns: new_data - normalized data array

zuraw_recognize.py methods

  • main(file_name, file_path) - recognizes *file_path*\*file_name* comparing to patterns in patterns/

  • load_patterns(patterns_dir) - loads pattern files from *patterns_dir*

    • returns: patterns_dict - a dictionary of patterns under their names
  • recognize(my_dict, patterns_dict) - recognizes a single file IMPORTANT You need to compute my_dict with zuraw.get_example(file_name, file_path)

  • measure_similarity(item, pattern) - measures similarity between the *item* and a single *pattern*

How it works

In this section you can find how (technically) the most important modules work.

normalize_data(data, sampFreq)

data should be an array of intiger/float numbers, read from the audio file. sampFreq is the sasmpling frequency of the file.

If two channels are available, ZURAW will only operate on the first one.

Then, if the data is of type unsigned, it will convert to signed by subtracting 128.

Data is filtered with a highpass filter implemented in butter_highpass_filter(data, cutoff, fs), with cutoff = 30 and fs = 11025 (this is the frequency we normalize to).

Filtered data is then normalized to fit [-1:1] by finding the maximum absolute value and dividing the array by it.

The sampling frequency is normalized to 11025.

The function returns normalized data as an array.

get_fourier(data, n, sampFreq)

data should be a normalized array of numbers [-1,1] n should be the number of samples sampFreq should be the sampling frequency

The Fourier transform is calculated with the fft(data, norm) function from pylab library, with norm = "ortho".

Only the first half of the calculated transform is used then, so the latter half, which is only a mirror of the first, is cut off.

Moreover, the program takes the absolute of the transform result, so it operates only on real values.

Then, to increase the likeliness of pattern recognition, the transform result is powered by 2. The very high frequencies, which are only a noise, are cut off by finding a place in the result after which there is no value high enough to be significant - it is set as the maximum value of the result divided by 20.

Then, the result is normalized to range [0:1] by dividing each number by the array maximum.

To smooth out the transform result it is scaled to 100 points by calculating an average for each point.

The function returns normalized Fourier transform result as an array of 100 points.

How it was tested

The program was tested on two sets of samples, each containing 4 animal types.

In first set of samples the samples used for learing were tested, to check if the program will recognize the sounds it learned on. The success rate was 100%.

The second set of samples consisted of new samples for each of 4 categories, and the success rate was 75%.

Authors

Olga Borgula (nnnnodahlia@gmail.com)

About

Python Pattern Recognition system

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages