Add doc for FireRedAsr AED models
csukuangfj committed Feb 17, 2025
1 parent 7df3e46 commit 1fa1772
Showing 7 changed files with 143 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -177,6 +177,7 @@ def get_version():
.. _Lazarus: https://www.lazarus-ide.org/
.. _Moonshine: https://github.com/usefulsensors/moonshine
.. _moonshine: https://github.com/usefulsensors/moonshine
.. _FireRedAsr: https://github.com/FireRedTeam/FireRedASR
"""


14 changes: 14 additions & 0 deletions docs/source/onnx/FireRedAsr/code/2025-02-16.txt
@@ -0,0 +1,14 @@
/star-fj/fangjun/open-source/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:375 ./build/bin/sherpa-onnx-offline --tokens=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt --fire-red-asr-encoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx --fire-red-asr-decoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx --num-threads=1 ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav

OfflineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80, low_freq=20, high_freq=-400, dither=0), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="", decoder="", language="", task="transcribe", tail_paddings=-1), fire_red_asr=OfflineFireRedAsrModelConfig(encoder="./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx", decoder="./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx"), tdnn=OfflineTdnnModelConfig(model=""), zipformer_ctc=OfflineZipformerCtcModelConfig(model=""), wenet_ctc=OfflineWenetCtcModelConfig(model=""), sense_voice=OfflineSenseVoiceModelConfig(model="", language="auto", use_itn=False), moonshine=OfflineMoonshineModelConfig(preprocessor="", encoder="", uncached_decoder="", cached_decoder=""), telespeech_ctc="", tokens="./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt", num_threads=1, debug=False, provider="cpu", model_type="", modeling_unit="cjkchar", bpe_vocab=""), lm_config=OfflineLMConfig(model="", scale=0.5), ctc_fst_decoder_config=OfflineCtcFstDecoderConfig(graph="", max_active=3000), decoding_method="greedy_search", max_active_paths=4, hotwords_file="", hotwords_score=1.5, blank_penalty=0, rule_fsts="", rule_fars="")
Creating recognizer ...
Started
Done!

./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav
{"lang": "", "emotion": "", "event": "", "text": "昨天是 MONDAY TODAY IS礼拜二 THE DAY AFTER TOMORROW是星期三", "timestamps": [], "tokens":["昨", "天", "是", " MO", "ND", "AY", " TO", "D", "AY", " IS", "礼", "拜", "二", " THE", " DAY", " AFTER", " TO", "M", "OR", "ROW", "是", "星", "期", "三"], "words": []}
----
num threads: 1
decoding method: greedy_search
Elapsed seconds: 19.555 s
Real time factor (RTF): 19.555 / 10.053 = 1.945
20 changes: 20 additions & 0 deletions docs/source/onnx/FireRedAsr/huggingface-space.rst
@@ -0,0 +1,20 @@
Huggingface space
=================

You can try `FireRedAsr`_ with `sherpa-onnx`_ in the following Hugging Face space:

`<https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition>`_


.. hint::

   You don't need to install anything. All you need is a browser.

   You can even run it on your phone or tablet.

.. figure:: ./pic/fire-red-asr-hf-space.jpg
   :alt: screenshot of hf space for FireRedAsr
   :align: center
   :width: 600

   Try `FireRedAsr`_ in our Huggingface space with `sherpa-onnx`_
41 changes: 41 additions & 0 deletions docs/source/onnx/FireRedAsr/index.rst
@@ -0,0 +1,41 @@
FireRedAsr
==========

This section describes how to use models from `<https://github.com/FireRedTeam/FireRedASR>`_.

Note that this model supports Chinese and English.

.. hint::

   The model supports Mandarin and several dialects (for example, the Sichuan, Henan, and Tianjin dialects).

We have converted `FireRedASR`_ to ONNX and provide APIs for the following programming languages:

- 1. C++
- 2. C
- 3. Python
- 4. C#
- 5. Go
- 6. Kotlin
- 7. Java
- 8. JavaScript (supports `WebAssembly`_ and `Node`_)
- 9. Swift
- 10. `Dart`_ (supports `Flutter`_)
- 11. Object Pascal

Note that you can use `FireRedASR`_ with `sherpa-onnx`_ on the following platforms:

- Linux (x64, aarch64, arm, riscv64)
- macOS (x64, arm64)
- Windows (x64, x86, arm64)
- Android (arm64-v8a, armeabi-v7a, x86, x86_64)
- iOS (arm64)

In the following, we describe how to download pre-trained `FireRedASR`_ models
and use them in `sherpa-onnx`_.

.. toctree::
   :maxdepth: 5

   ./huggingface-space.rst
   ./pretrained.rst
66 changes: 66 additions & 0 deletions docs/source/onnx/FireRedAsr/pretrained.rst
@@ -0,0 +1,66 @@
Pre-trained Models
==================

This page describes how to download pre-trained `FireRedAsr`_ models.

sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16 (Chinese + English; Mandarin, Sichuan dialect, Henan dialect, etc.)
------------------------------------------------------------------------------------------------------------------------

This model was converted from `<https://huggingface.co/FireRedTeam/FireRedASR-AED-L>`_.

It supports the following two languages:

- Chinese (Mandarin, plus dialects such as the Sichuan, Tianjin, and Henan dialects)
- English

In the following, we describe how to download it.

Download
^^^^^^^^

Please use the following commands to download it::

   cd /path/to/sherpa-onnx

   wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
   tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
   rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
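
As an alternative to ``wget`` and ``tar``, the same three steps can be sketched in Python using only the standard library. This is a minimal sketch and not part of `sherpa-onnx`_: the helper names ``archive_url`` and ``download_and_extract`` are hypothetical, and the URL layout is assumed to match the command above.

```python
import tarfile
import urllib.request
from pathlib import Path

# Release location and model name copied from the wget command above.
RELEASE_URL = "https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models"
MODEL = "sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16"


def archive_url(model=MODEL):
    """Build the download URL of the .tar.bz2 archive (hypothetical helper)."""
    return f"{RELEASE_URL}/{model}.tar.bz2"


def download_and_extract(model=MODEL, dest="."):
    """Download the archive, unpack it into dest, and delete the archive.

    Returns the path of the extracted model directory.
    Note: the archive is large, so this call can take a while.
    """
    dest = Path(dest)
    archive = dest / f"{model}.tar.bz2"
    urllib.request.urlretrieve(archive_url(model), str(archive))
    # Only extract archives from a source you trust.
    with tarfile.open(str(archive), "r:bz2") as tar:
        tar.extractall(str(dest))
    archive.unlink()  # same effect as the `rm` step above
    return dest / model
```

Calling ``download_and_extract(dest="/path/to/sherpa-onnx")`` would mirror the shell commands; nothing is downloaded until the function is called.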

After downloading, you should find the following files::

   ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/
   total 1.7G
   -rw-r--r-- 1 kuangfangjun root 188 Feb 16 16:22 README.md
   -rw-r--r-- 1 kuangfangjun root 425M Feb 16 16:21 decoder.int8.onnx
   -rw-r--r-- 1 kuangfangjun root 1.3G Feb 16 16:21 encoder.int8.onnx
   drwxr-xr-x 10 kuangfangjun root 0 Feb 16 16:26 test_wavs
   -rw-r--r-- 1 kuangfangjun root 70K Feb 16 16:21 tokens.txt

   ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/
   total 1.9M
   -rw-r--r-- 1 kuangfangjun root 315K Feb 16 16:24 0.wav
   -rw-r--r-- 1 kuangfangjun root 160K Feb 16 16:24 1.wav
   -rw-r--r-- 1 kuangfangjun root 147K Feb 16 16:24 2.wav
   -rw-r--r-- 1 kuangfangjun root 245K Feb 16 16:25 3-sichuan.wav
   -rw-r--r-- 1 kuangfangjun root 276K Feb 16 16:24 3.wav
   -rw-r--r-- 1 kuangfangjun root 245K Feb 16 16:25 4-tianjin.wav
   -rw-r--r-- 1 kuangfangjun root 250K Feb 16 16:26 5-henan.wav
   -rw-r--r-- 1 kuangfangjun root 276K Feb 16 16:24 8k.wav
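
Before decoding, it can be useful to confirm that the three files the recognizer needs are actually in place. The following is a minimal sketch under the assumption that the directory layout matches the listing above; ``missing_model_files`` is a hypothetical helper, not a `sherpa-onnx`_ API.

```python
from pathlib import Path

# File names taken from the directory listing above.
REQUIRED_FILES = ("encoder.int8.onnx", "decoder.int8.onnx", "tokens.txt")


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumes the script runs from /path/to/sherpa-onnx.
    model_dir = "./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16"
    missing = missing_model_files(model_dir)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required model files are present.")
```

An empty list means the encoder, the decoder, and the token table were all found.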

Decode a file
^^^^^^^^^^^^^

Please use the following command to decode a wave file:

.. code-block:: bash

   ./build/bin/sherpa-onnx-offline \
     --tokens=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt \
     --fire-red-asr-encoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx \
     --fire-red-asr-decoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx \
     --num-threads=1 \
     ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav

You should see the following output:

.. literalinclude:: ./code/2025-02-16.txt
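
The output contains one JSON object per decoded file, followed by a line reporting the real time factor (RTF), which is the elapsed decoding time divided by the audio duration. The sketch below shows one way to read both from the captured text. It assumes the output format stays exactly as in the log above; ``parse_result_line`` and ``parse_rtf`` are hypothetical helper names, not `sherpa-onnx`_ APIs.

```python
import json
import re

# Result line and RTF line copied from the log above.
RESULT_LINE = (
    '{"lang": "", "emotion": "", "event": "", '
    '"text": "昨天是 MONDAY TODAY IS礼拜二 THE DAY AFTER TOMORROW是星期三", '
    '"timestamps": [], '
    '"tokens":["昨", "天", "是", " MO", "ND", "AY", " TO", "D", "AY", " IS", '
    '"礼", "拜", "二", " THE", " DAY", " AFTER", " TO", "M", "OR", "ROW", '
    '"是", "星", "期", "三"], "words": []}'
)
RTF_LINE = "Real time factor (RTF): 19.555 / 10.053 = 1.945"


def parse_result_line(line):
    """Parse the JSON object printed for one decoded wave file."""
    return json.loads(line)


def parse_rtf(line):
    """Return (elapsed_seconds, audio_seconds, rtf) from the RTF line."""
    match = re.search(
        r"\(RTF\):\s*([\d.]+)\s*/\s*([\d.]+)\s*=\s*([\d.]+)", line
    )
    if match is None:
        raise ValueError(f"not an RTF line: {line!r}")
    elapsed, duration, rtf = (float(x) for x in match.groups())
    return elapsed, duration, rtf


result = parse_result_line(RESULT_LINE)
# In this log, concatenating the tokens reproduces the recognized text.
joined = "".join(result["tokens"])

elapsed, duration, rtf = parse_rtf(RTF_LINE)
# The reported RTF is elapsed / duration, rounded to three decimals.
recomputed = round(elapsed / duration, 3)
```

An RTF above 1, as in this single-threaded run, means that decoding took longer than the audio itself lasts.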
1 change: 1 addition & 0 deletions docs/source/onnx/index.rst
@@ -51,6 +51,7 @@ Also, we show how to use it for speech recognition with pre-trained models.
   ./pretrained_models/index
   ./moonshine/index
   ./sense-voice/index
   ./FireRedAsr/index

.. toctree::
   :maxdepth: 5