
Hi! Is there any example implementation of streaming for this model: https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 #178

Closed
Caet-pip opened this issue Jun 18, 2023 · 46 comments


@Caet-pip

Hi, I saw this model trained for streaming: https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04. Is there any example implementation, or is it the same as implementing the sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 model?

Are the steps used for that model reproducible for the icefall libri giga model?

@csukuangfj
Collaborator

Please see k2-fsa/icefall#984 if you want to train it yourself.

If you want to use it in sherpa-onnx, please follow
https://k2-fsa.github.io/icefall/model-export/export-onnx.html
to export the model. You can find export-onnx.py at
https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/export-onnx.py
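Put together, the download-then-export flow from those docs might look roughly like the sketch below. This is an outline under assumptions, not the exact procedure: the repo URL comes from the question above, while the flag values mirror the icefall export docs and may differ for this checkpoint.

```shell
# Sketch of the export flow, assuming an icefall checkout.
# Epoch/avg values are taken from the icefall export docs and
# may need adjusting for this particular checkpoint.
cd /path/to/icefall/egs/librispeech/ASR

repo_url=https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04
GIT_LFS_SKIP_SMUDGE=1 git clone $repo_url
repo=$(basename $repo_url)

pushd $repo/exp
git lfs pull --include "*.pt"   # fetch the checkpoint weights
popd

./pruned_transducer_stateless7_streaming/export-onnx.py \
  --bpe-model $repo/data/lang_bpe_500/bpe.model \
  --use-averaged-model 0 \
  --epoch 99 \
  --avg 1 \
  --decode-chunk-len 32 \
  --exp-dir $repo/exp/
```

The exported encoder/decoder/joiner .onnx files land in $repo/exp.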

@Caet-pip
Author

Thanks a lot! Can this be used for streaming with a microphone in real time, like the sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 model?

I had previously tried icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04 for streaming, but its output is not real-time streaming and it requires input to start and stop the recording; the output quality is really good, however.

Please let me know if I am approaching this in the right way, thanks in advance!

@csukuangfj
Collaborator

Can this be used for streaming purpose by using a microphone in real time like sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 model.

Yes, you can.

The URL name contains streaming, so you can use it for streaming purposes.

@Caet-pip
Author

Caet-pip commented Jun 18, 2023

Okay, I had used speech-recognition-from-microphone.py in sherpa-onnx for real-time streaming transcription without needing to press Enter to start/stop recording. When I replace the encoder, decoder, and joiner with the multidataset model, I get the following error:

(base) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % python3 ./python-api-examples/speech-recognition-from-microphone.py \
  --tokens=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/data/lang_bpe_500/tokens.txt \
  --encoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/encoder-epoch-30-avg-4.onnx \
  --decoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/decoder-epoch-30-avg-4.onnx \
  --joiner=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/joiner-epoch-30-avg-4.onnx

0 MacBook Pro Microphone, Core Audio (1 in, 0 out)
< 1 MacBook Pro Speakers, Core Audio (0 in, 2 out)
Use default device: MacBook Pro Microphone
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/online-zipformer-transducer-model.cc:InitEncoder:99 encoder_dims does not exist in the metadata

Is this because the model was not made for real-time streaming inference?

@csukuangfj
Collaborator

Is this because the model was not made for real-time streaming inference?

The reason is that you didn't export the model correctly.

Could you describe how you exported the model in detail?

@Caet-pip
Author

Caet-pip commented Jun 18, 2023

I am using this model https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#icefall-asr-multidataset-pruned-transducer-stateless7-2023-05-04-englis

This is already exported to be used with ONNX right?

I wanted to use it with the speech-recognition-from-microphone.py Python file, so I replaced the code given in the example

cd /path/to/sherpa-onnx

python3 ./python-api-examples/speech-recognition-from-microphone.py \
  --tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \
  --encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx

with the icefall code

(base) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % python3 ./python-api-examples/speech-recognition-from-microphone.py \
  --tokens=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/data/lang_bpe_500/tokens.txt \
  --encoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/encoder-epoch-30-avg-4.onnx \
  --decoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/decoder-epoch-30-avg-4.onnx \
  --joiner=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/joiner-epoch-30-avg-4.onnx

I exported the model as described on the website

GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/yfyeung/icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04
cd icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp
git lfs pull --include "*.onnx"

But I want to use the model with the speech-recognition-from-microphone.py file for real-time ASR without having to press the Enter key.

The icefall model is a transducer model, right? I wanted to use it with speech-recognition-from-microphone.py.

@csukuangfj
Collaborator

But I want to use the model with the speech-recognition-from-microphone.py file for real time ASR without having to press 'enter' key

Previously, you were asking

Hi I saw this model trained for streaming https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 is there any example implementation

so I thought you wanted to use it for streaming recognition, and I replied yes:

Yes, you can.
The URL name contains streaming, so you can use it for streaming purpose.

But now you are switching to a different model:

https://huggingface.co/yfyeung/icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04

The above model is not a streaming model (there is no streaming in the URL, so it is a non-streaming model); thus you cannot use it for streaming purposes.

@csukuangfj
Collaborator

@Caet-pip
Author

Okay, got it; sorry for the miscommunication. Thanks for clearing it up.

I wanted a model for real-time streaming ASR, and the icefall-asr-multidataset model was very good, hence my question about it.

As you said the icefall-libri-giga-pruned-transducer-stateless7-streaming model is for streaming, I will use it for my purpose. But as that model hasn't been exported to ONNX, I wanted to know whether the other icefall model could be used.

I was looking at ways to export the icefall-libri-giga-pruned model to ONNX but wanted a solution in the meantime, hence I started looking at other models. My bad for not noticing it was offline.

Could you let me know of any ONNX model that is as good as the icefall model and that I can use for real-time streaming immediately? Thanks!

@csukuangfj
Collaborator

As you said that the icefall-libri-giga-pruned-transducer-stateless7-streaming model is for streaming I will use it for my purpose. But as the model hasn't been exported as onnx

Then please export it by yourself by following #178 (comment)

If you have any issues during export, we can help you.

@Caet-pip
Author

Okay, sure, thanks, I will do that.

I wanted to know: do I need to install icefall to export the model to ONNX?

@csukuangfj
Collaborator

Okay sure, thanks I will do that

I wanted to know, do I need to install Icefall for exporting model to ONNX?

Yes, please follow the icefall installation doc to set up the environment.

@Caet-pip
Author

Do I need to follow step (0) Install CUDA toolkit and cuDNN for running icefall on a macOS ARM chip?

@csukuangfj
Collaborator

No, you don't have to.

You can install a cpu version of PyTorch and k2 for exporting models from icefall.
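For reference, a minimal CPU-only setup might look like the sketch below. This is an assumption-laden outline, not the official procedure: the exact k2 wheel and versions must come from the icefall/k2 installation docs for your platform.

```shell
# Illustrative CPU-only environment for exporting models from icefall.
# Package versions are assumptions; follow the icefall/k2 install docs
# for the torch/k2 combination that matches your platform.
python3 -m venv icefall-env
source icefall-env/bin/activate

pip install torch --index-url https://download.pytorch.org/whl/cpu   # CPU-only PyTorch
# Install k2 built against the SAME torch version (see the k2 docs),
# then lhotse and icefall's requirements:
pip install lhotse
git clone https://github.com/k2-fsa/icefall
cd icefall
pip install -r requirements.txt
export PYTHONPATH=$PWD:$PYTHONPATH   # icefall is used from the source tree
```

The key point is that k2 must be built against the same PyTorch version you install, which is what the later errors in this thread come down to.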

@Caet-pip
Author

I followed the steps to install icefall: first I installed CPU PyTorch, then k2, then lhotse, all in a virtual environment. But after installing icefall's requirements and trying to run the test, I get this error:

(test-icefall) (base) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./prepare.sh
2023-06-19 13:47:01 (prepare.sh:27:main) dl_dir: /Users/fawazahamedshaik/icefall/egs/yesno/ASR/download
2023-06-19 13:47:01 (prepare.sh:30:main) Stage 0: Download data
/Users/fawazahamedshaik/icefall/egs/yesno/ASR/download/waves_yesno.tar.gz: 100%|██████████████████████████████████████████████| 4.70M/4.70M [00:01<00:00, 2.42MB/s]
2023-06-19 13:47:13 (prepare.sh:39:main) Stage 1: Prepare yesno manifest
2023-06-19 13:47:14 (prepare.sh:45:main) Stage 2: Compute fbank for yesno
Traceback (most recent call last):
  File "/Users/fawazahamedshaik/icefall/egs/yesno/ASR/./local/compute_fbank_yesno.py", line 18, in <module>
    from icefall.utils import get_executor
ModuleNotFoundError: No module named 'icefall'

@danpovey
Collaborator

danpovey commented Jun 19, 2023 via email

@Caet-pip
Author

Hello, I ran export PYTHONPATH=$PYTHONPATH:/Users/fawazahamedshaik/icefall and this solved the "module not found" issue, but I am getting a new error when running ./prepare.sh

logs:
(test-icefall) (base) fawazahamedshaik@Fawazs-MacBook-Pro ASR % export PYTHONPATH=$PYTHONPATH:/Users/fawazahamedshaik/icefall
(test-icefall) (base) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./prepare.sh
2023-06-19 15:12:54 (prepare.sh:27:main) dl_dir: /Users/fawazahamedshaik/icefall/egs/yesno/ASR/download
2023-06-19 15:12:54 (prepare.sh:30:main) Stage 0: Download data
2023-06-19 15:12:54 (prepare.sh:39:main) Stage 1: Prepare yesno manifest
2023-06-19 15:12:57 (prepare.sh:45:main) Stage 2: Compute fbank for yesno
Traceback (most recent call last):
  File "/Users/fawazahamedshaik/icefall/egs/yesno/ASR/./local/compute_fbank_yesno.py", line 18, in <module>
    from icefall.utils import get_executor
  File "/Users/fawazahamedshaik/icefall/icefall/__init__.py", line 3, in <module>
    from . import (
  File "/Users/fawazahamedshaik/icefall/icefall/decode.py", line 20, in <module>
    import k2
  File "/Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/k2/__init__.py", line 23, in <module>
    from _k2 import DeterminizeWeightPushingType
ImportError: dlopen(/Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/_k2.cpython-39-darwin.so, 0x0002): Symbol not found: __ZN2at4_ops10select_int4callERKNS_6TensorExx
  Referenced from: /Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/_k2.cpython-39-darwin.so
  Expected in: <89972BE7-3028-34DA-B561-E66870D59767> /Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/torch/lib/libtorch_cpu.dylib

Is this related to PyTorch?

@csukuangfj
Collaborator

What is the output of

python3 -m torch.utils.collect_env

@Caet-pip
Author

This is the output

(test-icefall) (base) fawazahamedshaik@Fawazs-MacBook-Pro ASR % python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 2.0.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 13.3.1 (x86_64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: version 3.26.4
Libc version: N/A

Python version: 3.9.13 (main, Aug 25 2022, 18:29:29) [Clang 12.0.0 ] (64-bit runtime)
Python platform: macOS-10.16-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M2

Versions of relevant libraries:
[pip3] k2==1.24.3.dev20230619+cpu.torch2.0.1
[pip3] numpy==1.22.4
[pip3] torch==2.0.1
[pip3] torchaudio==2.0.2
[pip3] torchvision==0.15.2
[conda] k2 1.24.3.dev20230619+cpu.torch1.13.1 pypi_0 pypi
[conda] mkl 2023.1.0 h59209a4_43558
[conda] mkl-service 2.4.0 py39h6c40b1e_1
[conda] numpy 1.22.4 pypi_0 pypi
[conda] numpydoc 1.5.0 py39hecd8cb5_0
[conda] pytorch 1.13.1 cpu_py39h9e40b02_0
[conda] pytorch-lightning 1.9.4 pypi_0 pypi
[conda] pytorch-wpe 0.0.1 pypi_0 pypi
[conda] torch-complex 0.4.3 pypi_0 pypi
[conda] torchaudio 0.13.1 pypi_0 pypi
[conda] torchmetrics 0.11.4 pypi_0 pypi
[conda] torchvision 0.14.0 pypi_0 pypi

@csukuangfj
Collaborator

Please read the output carefully.

You have installed two versions of k2, each of which is compiled with a different version of PyTorch, i.e., torch 1.13.1 and torch 2.0.1.

Please don't do that.

Please make sure there is only one version of k2 in your current environment.
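A quick way to audit the environment is sketched below (the grep patterns are illustrative):

```shell
# There should be exactly one k2 visible, from one package manager.
pip list 2>/dev/null | grep -i '^k2'
conda list 2>/dev/null | grep -i '^k2'

# Uninstall whichever copy you don't want, e.g.:
#   pip uninstall k2
#   conda remove k2

# Then check which torch/k2 pair Python actually picks up:
python3 -c "import torch; print(torch.__version__)"
python3 -m k2.version   # reports the torch version k2 was built with
```

If the torch version printed by Python differs from the one k2 was built against, you will hit dlopen/symbol errors like the one above.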

@csukuangfj
Collaborator

I suggest that if you're not familiar with conda, you switch to pip install and don't use conda install. Most, if not all, users who are having issues are using conda.

@Caet-pip
Author

Okay, I will uninstall the one installed with conda. I cannot see icefall in the installed libraries; is that fine?

@Caet-pip
Author

Caet-pip commented Jun 20, 2023

I followed the installation guide on the icefall website and only used pip; I also deactivated the conda base env. Still, when I run ./prepare.sh:

(test-icefall) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./prepare.sh
2023-06-19 22:28:06 (prepare.sh:27:main) dl_dir: /Users/fawazahamedshaik/icefall/egs/yesno/ASR/download
2023-06-19 22:28:06 (prepare.sh:30:main) Stage 0: Download data
2023-06-19 22:28:06 (prepare.sh:39:main) Stage 1: Prepare yesno manifest
2023-06-19 22:28:09 (prepare.sh:45:main) Stage 2: Compute fbank for yesno
Traceback (most recent call last):
  File "/Users/fawazahamedshaik/icefall/egs/yesno/ASR/./local/compute_fbank_yesno.py", line 18, in <module>
    from icefall.utils import get_executor
  File "/Users/fawazahamedshaik/icefall/icefall/__init__.py", line 3, in <module>
    from . import (
  File "/Users/fawazahamedshaik/icefall/icefall/decode.py", line 20, in <module>
    import k2
  File "/Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/k2/__init__.py", line 23, in <module>
    from _k2 import DeterminizeWeightPushingType
ImportError: dlopen(/Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/_k2.cpython-39-darwin.so, 0x0002): Symbol not found: __ZN2at4_ops10select_int4callERKNS_6TensorExx
  Referenced from: /Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/k2-1.24.3.dev20230619+cpu.torch2.0.1-py3.9-macosx-10.9-x86_64.egg/_k2.cpython-39-darwin.so
  Expected in: <89972BE7-3028-34DA-B561-E66870D59767> /Users/fawazahamedshaik/test-icefall/lib/python3.9/site-packages/torch/lib/libtorch_cpu.dylib

One thing I noticed when installing k2 is this in the logs:

(test-icefall) fawazahamedshaik@Fawazs-MacBook-Pro k2 % export K2_MAKE_ARGS="-j6"
python3 setup.py install

CMake Warning (dev) at /Users/fawazahamedshaik/opt/anaconda3/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/mkl.cmake:1 (find_package):
Policy CMP0074 is not set: find_package uses _ROOT variables.
Run "cmake --help-policy CMP0074" for policy details. Use the cmake_policy
command to set the policy and suppress this warning.

CMake variable MKL_ROOT is set to:

/Users/fawazahamedshaik/opt/anaconda3

For compatibility, CMake is ignoring the variable.

I assume the problem is with CMake

when I run python -m torch.utils.collect_env I get:
(test-icefall) fawazahamedshaik@Fawazs-MacBook-Pro k2 % python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 2.0.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 13.3.1 (x86_64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: version 3.26.4
Libc version: N/A

Python version: 3.9.13 (main, Aug 25 2022, 18:29:29) [Clang 12.0.0 ] (64-bit runtime)
Python platform: macOS-10.16-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M2

Versions of relevant libraries:
[pip3] k2==1.24.3.dev20230620+cpu.torch2.0.1
[pip3] numpy==1.22.4
[pip3] torch==2.0.1
[pip3] torchaudio==2.0.2
[pip3] torchvision==0.15.2
[conda] mkl 2023.1.0 h59209a4_43558
[conda] mkl-service 2.4.0 py39h6c40b1e_1
[conda] numpy 1.22.4 pypi_0 pypi
[conda] numpy-base 1.22.3 py39he782bc1_0
[conda] numpydoc 1.5.0 py39hecd8cb5_0
[conda] pytorch 1.13.1 cpu_py39h9e40b02_0
[conda] pytorch-lightning 1.9.4 pypi_0 pypi
[conda] pytorch-wpe 0.0.1 pypi_0 pypi
[conda] torch-complex 0.4.3 pypi_0 pypi
[conda] torchaudio 0.13.1 pypi_0 pypi
[conda] torchmetrics 0.11.4 pypi_0 pypi
[conda] torchvision 0.14.0 pypi_0 pypi

Here I notice that mkl and mkl-service are both installed with conda; is that why I am getting the error?

Sorry for the repeated errors; I want to know what I am doing wrong in the install process.

@csukuangfj
Collaborator

Here I notice that mkl and mcl-service is both installed in conda, is it because of this that I am getting error?

Please make sure you have deactivated conda completely.

There are multiple versions of PyTorch in your current environment, i.e.,

[pip3] torch==2.0.1
[conda] pytorch 1.13.1 cpu_py39h9e40b02_0

Please don't do that.


Please see #178 (comment)

I suggest that if you're not familiar with conda, please switch to pip install and don't use conda install. Most if not all users who are having issues are using conda.

Make sure you only have one version of PyTorch in your current environment.
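Concretely, that cleanup might look like this sketch (package names are taken from the collect_env output above; the virtualenv path is an assumption):

```shell
# Step out of conda entirely so its torch 1.13.1 cannot shadow pip's 2.0.1.
conda deactivate   # repeat until no "(base)" prefix appears in the prompt

# Optionally remove the conda-installed PyTorch stack (names from the env report):
#   conda remove pytorch torchaudio torchvision

# Re-enter only the pip virtualenv and confirm a single torch remains:
source ~/test-icefall/bin/activate
pip list | grep -i '^torch'   # expect exactly one torch line
python3 -c "import torch; print(torch.__version__, torch.__file__)"
```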

@Caet-pip
Author

Hello, I was able to export the model following the demo:

https://k2-fsa.github.io/icefall/model-export/export-onnx.html#export-the-model-to-onnx

At the end, the guide says it will generate the following 3 files in $repo/exp:

encoder-epoch-99-avg-1.onnx
decoder-epoch-99-avg-1.onnx
joiner-epoch-99-avg-1.onnx

but I cannot find that directory or the files.

@csukuangfj
Collaborator

csukuangfj commented Jun 21, 2023

Could you post the last few lines of the export log or post the export command you are using?

The exported files are in the exp directory you specified.

@Caet-pip
Author

Caet-pip commented Jun 21, 2023

Okay, I followed the guide and exported as per its instructions.

It will generate the following 3 files in $repo/exp

This is my command:

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./pruned_transducer_stateless7_streaming/export-onnx.py \
  --bpe-model $repo/data/lang_bpe_500/bpe.model \
  --use-averaged-model 0 \
  --epoch 99 \
  --avg 1 \
  --decode-chunk-len 32 \
  --exp-dir $repo/exp/

So it must be in $repo/exp. Can I change this to another directory?

When I use my own directory I get this error:

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./pruned_transducer_stateless7_streaming/export-onnx.py \
  --bpe-model $repo/data/lang_bpe_500/bpe.model \
  --use-averaged-model 0 \
  --epoch 99 \
  --avg 1 \
  --decode-chunk-len 32 \
  --exp-dir $Users/fawazahamedshaik/icefall
Traceback (most recent call last):
  File "/Users/fawazahamedshaik/icefall/egs/librispeech/ASR/./pruned_transducer_stateless7_streaming/export-onnx.py", line 669, in <module>
    main()
  File "/Users/fawazahamedshaik/my_env/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/fawazahamedshaik/icefall/egs/librispeech/ASR/./pruned_transducer_stateless7_streaming/export-onnx.py", line 480, in main
    setup_logger(f"{params.exp_dir}/log-export/log-export-onnx")
  File "/Users/fawazahamedshaik/icefall/icefall/utils.py", line 138, in setup_logger
    os.makedirs(os.path.dirname(log_filename), exist_ok=True)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/os.py", line 225, in makedirs
    mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/fawazahamedshaik'

Can you also guide me on how to use this in sherpa-onnx, i.e., how I should modify the command to run it there?

@csukuangfj
Collaborator

this is my code:
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % ./pruned_transducer_stateless7_streaming/export-onnx.py \
--bpe-model $repo/data/lang_bpe_500/bpe.model

What is the value of $repo?


Okay, I followed the guide and exported as per the lines in the guide

Could you please post the link to the guide you are following? Make sure you don't miss any command in the guide.

@Caet-pip
Author

I followed the guide in documentation, link: https://k2-fsa.github.io/icefall/model-export/export-onnx.html#export-the-model-to-onnx

$repo gives me
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % $repo
zsh: command not found: icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29

I found the files in the exp folder of the cloned repo, icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29.

But there are 6 files generated; here is the photo:

Screenshot 2023-06-20 at 10 54 55 PM

Which files should I use, and can you explain how I can use them in sherpa-onnx?

@csukuangfj
Collaborator

$repo gives me
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro ASR % $repo

Please show the output of

echo $repo

@csukuangfj
Collaborator

Which files should I use, and can you explain how I can use them in sherpa-onnx

So you have managed to find the generated files, congratulations!

Please refer to
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-2023-02-21-english

Screenshot 2023-06-21 at 11 03 21

@Caet-pip
Author

Caet-pip commented Jun 21, 2023

Do I need to move these files to the sherpa-onnx directory?

The generated files are currently in the icefall directory.

I am planning to export https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 next; will the output be the same?

I wanted to know the syntax of the command I need to build with the generated files: I can see the encoder, decoder, and joiner are present in the syntax, so do I need to replace them with the generated ONNX files?

@csukuangfj
Collaborator

do I need to replace them with the generated onnx files?

Yes, please use absolute path names if you are not sure.

Do I need to move these files to sherpa-onnx directory?

No, you don't need to do that. You can place them anywhere as long as you pass the correct paths to ./build/bin/sherpa-onnx

@csukuangfj
Collaborator

and I am planning to export the https://huggingface.co/marcoyang/icefall-libri-giga-pruned-transducer-stateless7-streaming-2023-04-04 next, will the output be the same?

Yes, it should be the same.

@Caet-pip
Author

Thank you so much!

I was also looking forward to using my exported model, so I will try that as well... I have a problem when running the model: it says file not found, but the files are in the directory.

Command for microphone ASR:
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % ./build/bin/sherpa-onnx-microphone \
  --tokens=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt \
  --encoder=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder-epoch-99-avg-1.onnx \
  --decoder=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder-epoch-99-avg-1.onnx \
  --joiner=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner-epoch-99-avg-1.onnx

logs with error:
OnlineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OnlineTransducerModelConfig(encoder_filename="--encoder=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder-epoch-99-avg-1.onnx", decoder_filename="--decoder=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder-epoch-99-avg-1.onnx", joiner_filename="--joiner=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner-epoch-99-avg-1.onnx", tokens="--tokens=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt", num_threads=2, provider="cpu", debug=False), lm_config=OnlineLMConfig(model="", scale=0.5), endpoint_config=EndpointConfig(rule1=EndpointRule(must_contain_nonsilence=False, min_trailing_silence=2.4, min_utterance_length=0), rule2=EndpointRule(must_contain_nonsilence=True, min_trailing_silence=1.2, min_utterance_length=0), rule3=EndpointRule(must_contain_nonsilence=False, min_trailing_silence=0, min_utterance_length=300)), enable_endpoint=True, max_active_paths=4, decoding_method="greedy_search")
/Users/fawazahamedshaik/sherpa-onnx/sherpa-onnx/csrc/online-transducer-model-config.cc:Validate:29 --tokens=./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt does not exist
Errors in config!

checking if files are present in the dir:
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % cd ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/
(my_env) fawazahamedshaik@Fawazs-MacBook-Pro exp % ls
cpu_jit.pt epoch-30.pt joiner_jit_trace.pt
decode.sh epoch-99.pt log
decoder-epoch-99-avg-1.int8.onnx export.sh pretrained.pt
decoder-epoch-99-avg-1.onnx jit_pretrained.sh pretrained.sh
decoder_jit_trace.pt jit_trace_export.sh tensorboard
encoder-epoch-99-avg-1.int8.onnx jit_trace_pretrained.sh tokens.txt
encoder-epoch-99-avg-1.onnx joiner-epoch-99-avg-1.int8.onnx train.sh
encoder_jit_trace.pt joiner-epoch-99-avg-1.onnx

As you can see, tokens.txt is present in the given directory, but the program says it is missing.

@csukuangfj
Collaborator

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % ls -lh ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt

What does it output?

@Caet-pip
Author

This is the output

(my_env) fawazahamedshaik@Fawazs-MacBook-Pro sherpa-onnx % ls -lh ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt
-rw-r--r-- 1 fawazahamedshaik staff 4.9K Jun 19 05:47 ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt

@Caet-pip
Author

I got this tokens.txt file from icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/data/lang_bpe_500, and I moved it to the main directory along with the decoder, encoder, and joiner files.

@csukuangfj
Collaborator

I see.

Please run

./build/bin/sherpa-onnx-microphone

and read the output carefully.

You don't need to use --tokens, --encoder, etc. Please pass the paths directly.
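For example, the invocation would become something like the sketch below. The positional order tokens/encoder/decoder/joiner is an inference from the config dump earlier in this thread, so double-check it against the usage text the binary prints when run without arguments.

```shell
# Pass plain paths, not --flag=... options; the binary reads them positionally.
# Argument order is assumed from the config dump above.
./build/bin/sherpa-onnx-microphone \
  ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/tokens.txt \
  ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/encoder-epoch-99-avg-1.onnx \
  ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/decoder-epoch-99-avg-1.onnx \
  ./icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29/exp/joiner-epoch-99-avg-1.onnx
```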

@Caet-pip
Author

Caet-pip commented Jun 21, 2023

This worked, thanks a lot!!!

And thank you so much for exporting the giga libri pruned transducer model! I was planning to do that next.

Next, I am planning to use other models for streaming, namely NVIDIA NeMo streaming models (I'm not sure if they have those, but I'll look), by converting them to ONNX. Hopefully that goes well.

Thanks again!

@csukuangfj
Collaborator

I am planning to use other models, namely Nvidia NeMo streaming models

Is there a link to the NeMo streaming model? Is it trained by CTC or transducer loss?

@Caet-pip
Author

I am checking, but so far I see models that only decode audio files (wav files).

@Caet-pip
Author

Caet-pip commented Jun 21, 2023

It appears that they are using Conformer models for cache-aware streaming, as per this link:

https://github.com/NVIDIA/NeMo/blob/main/examples/asr/asr_cache_aware_streaming/speech_to_text_cache_aware_streaming_infer.py

They also have buffered ASR/chunked inference for both Conformer and RNN-T models at this link, but I do not think this is streaming:

https://github.com/NVIDIA/NeMo/tree/main/examples/asr/asr_chunked_inference

@Caet-pip
Author

After some digging, I found that the QuartzNet15x5Base-En model, which is an EncDecCTCModel, can be used for streaming ASR with a mic, as per their demo notebook: https://github.com/NVIDIA/NeMo/blob/stable/tutorials/asr/Online_ASR_Microphone_Demo.ipynb

Model details are at: https://catalog.ngc.nvidia.com/orgs/nvidia/models/nemospeechmodels

I think other NeMo EncDecCTCModels can be used for streaming ASR; do you think these can be exported to ONNX and used in sherpa-onnx?

@csukuangfj
Collaborator

If you can find a way to export it to ONNX, we can change sherpa-onnx to support that.

Support for streaming CTC models is on the roadmap.
