Add doc for FireRedAsr AED models

k2-fsa · Feb 17, 2025 · 1fa1772
1 parent 7df3e46
commit 1fa1772
Showing 7 changed files with 143 additions and 0 deletions.
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -177,6 +177,7 @@ def get_version():
 .. _Lazarus: https://www.lazarus-ide.org/
 .. _Moonshine: https://github.com/usefulsensors/moonshine
 .. _moonshine: https://github.com/usefulsensors/moonshine
+.. _FireRedAsr: https://github.com/FireRedTeam/FireRedASR
 """
 
 

diff --git a/docs/source/onnx/FireRedAsr/code/2025-02-16.txt b/docs/source/onnx/FireRedAsr/code/2025-02-16.txt
@@ -0,0 +1,14 @@
+/star-fj/fangjun/open-source/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:375 ./build/bin/sherpa-onnx-offline --tokens=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt --fire-red-asr-encoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx --fire-red-asr-decoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx --num-threads=1 ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav 
+
+OfflineRecognizerConfig(feat_config=FeatureExtractorConfig(sampling_rate=16000, feature_dim=80, low_freq=20, high_freq=-400, dither=0), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="", decoder="", language="", task="transcribe", tail_paddings=-1), fire_red_asr=OfflineFireRedAsrModelConfig(encoder="./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx", decoder="./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx"), tdnn=OfflineTdnnModelConfig(model=""), zipformer_ctc=OfflineZipformerCtcModelConfig(model=""), wenet_ctc=OfflineWenetCtcModelConfig(model=""), sense_voice=OfflineSenseVoiceModelConfig(model="", language="auto", use_itn=False), moonshine=OfflineMoonshineModelConfig(preprocessor="", encoder="", uncached_decoder="", cached_decoder=""), telespeech_ctc="", tokens="./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt", num_threads=1, debug=False, provider="cpu", model_type="", modeling_unit="cjkchar", bpe_vocab=""), lm_config=OfflineLMConfig(model="", scale=0.5), ctc_fst_decoder_config=OfflineCtcFstDecoderConfig(graph="", max_active=3000), decoding_method="greedy_search", max_active_paths=4, hotwords_file="", hotwords_score=1.5, blank_penalty=0, rule_fsts="", rule_fars="")
+Creating recognizer ...
+Started
+Done!
+
+./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav
+{"lang": "", "emotion": "", "event": "", "text": "昨天是 MONDAY TODAY IS礼拜二 THE DAY AFTER TOMORROW是星期三", "timestamps": [], "tokens":["昨", "天", "是", " MO", "ND", "AY", " TO", "D", "AY", " IS", "礼", "拜", "二", " THE", " DAY", " AFTER", " TO", "M", "OR", "ROW", "是", "星", "期", "三"], "words": []}
+----
+num threads: 1
+decoding method: greedy_search
+Elapsed seconds: 19.555 s
+Real time factor (RTF): 19.555 / 10.053 = 1.945
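
The last line of the log reports the real time factor (RTF) as elapsed time divided by audio duration. As a minimal sketch, assuming the values are copied by hand from the log above (the helper name `real_time_factor` is hypothetical and not part of sherpa-onnx), the figure can be reproduced and the result line inspected like this:

```python
import json


def real_time_factor(elapsed_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (hypothetical helper)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return elapsed_seconds / audio_seconds


# Values copied from the log above.
ELAPSED_SECONDS = 19.555
AUDIO_SECONDS = 10.053

# Result line copied from the log above, reduced to the two fields used here.
RESULT_JSON = (
    '{"text": "昨天是 MONDAY TODAY IS礼拜二 THE DAY AFTER TOMORROW是星期三", '
    '"tokens": ["昨", "天", "是", " MO", "ND", "AY", " TO", "D", "AY", " IS", '
    '"礼", "拜", "二", " THE", " DAY", " AFTER", " TO", "M", "OR", "ROW", '
    '"是", "星", "期", "三"]}'
)

result = json.loads(RESULT_JSON)
rtf = real_time_factor(ELAPSED_SECONDS, AUDIO_SECONDS)

# For this sample, concatenating the tokens reproduces the "text" field.
joined = "".join(result["tokens"])

if __name__ == "__main__":
    print(f"RTF: {rtf:.3f}")
    print("tokens join to text:", joined == result["text"])
```

An RTF above 1 means decoding took longer than the audio lasts, which is the case for this single-threaded CPU run.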
diff --git a/docs/source/onnx/FireRedAsr/huggingface-space.rst b/docs/source/onnx/FireRedAsr/huggingface-space.rst
@@ -0,0 +1,20 @@
+Huggingface space
+=================
+
+You can try `FireRedAsr`_ with `sherpa-onnx`_ in the following Huggingface space:
+
+  `<https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition>`_
+
+
+.. hint::
+
+   You don't need to install anything. All you need is a browser.
+
+   You can even run it on your phone or tablet.
+
+.. figure:: ./pic/fire-red-asr-hf-space.jpg
+   :alt: screenshot of hf space for FireRedAsr
+   :align: center
+   :width: 600
+
+   Try `FireRedAsr`_ in our Huggingface space with `sherpa-onnx`_
diff --git a/docs/source/onnx/FireRedAsr/index.rst b/docs/source/onnx/FireRedAsr/index.rst
@@ -0,0 +1,41 @@
+FireRedAsr
+==========
+
+This section describes how to use models from `<https://github.com/FireRedTeam/FireRedASR>`_.
+
+Note that this model supports Chinese and English.
+
+.. hint::
+
+  This model supports Mandarin and some dialects (Sichuanese, Henan dialect, Tianjin dialect, etc.).
+
+We have converted `FireRedASR`_ to ONNX and provide APIs for the following programming languages:
+
+  - 1. C++
+  - 2. C
+  - 3. Python
+  - 4. C#
+  - 5. Go
+  - 6. Kotlin
+  - 7. Java
+  - 8. JavaScript (supports `WebAssembly`_ and `Node`_)
+  - 9. Swift
+  - 10. `Dart`_ (supports `Flutter`_)
+  - 11. Object Pascal
+
+Note that you can use `FireRedASR`_ with `sherpa-onnx`_ on the following platforms:
+
+  - Linux (x64, aarch64, arm, riscv64)
+  - macOS (x64, arm64)
+  - Windows (x64, x86, arm64)
+  - Android (arm64-v8a, armeabi-v7a, x86, x86_64)
+  - iOS (arm64)
+
+In the following, we describe how to download pre-trained `FireRedASR`_ models
+and use them in `sherpa-onnx`_.
+
+.. toctree::
+   :maxdepth: 5
+
+   ./huggingface-space.rst
+   ./pretrained.rst
diff --git a/docs/source/onnx/FireRedAsr/pic/fire-red-asr-hf-space.jpg b/docs/source/onnx/FireRedAsr/pic/fire-red-asr-hf-space.jpg
diff --git a/docs/source/onnx/FireRedAsr/pretrained.rst b/docs/source/onnx/FireRedAsr/pretrained.rst
@@ -0,0 +1,66 @@
+Pre-trained Models
+==================
+
+This page describes how to download pre-trained `FireRedAsr`_ models.
+
+sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16 (Chinese + English, Mandarin, Sichuanese, Henan dialect, etc.)
+------------------------------------------------------------------------------------------------------------------------
+
+This model is converted from `<https://huggingface.co/FireRedTeam/FireRedASR-AED-L>`_.
+
+It supports the following 2 languages:
+
+  - Chinese (Mandarin, and dialects such as Sichuanese, Tianjin dialect, and Henan dialect)
+  - English
+
+In the following, we describe how to download it.
+
+Download
+^^^^^^^^
+
+Please use the following commands to download it::
+
+  cd /path/to/sherpa-onnx
+
+  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
+  tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
+  rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
+
+After downloading, you should find the following files::
+
+  ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/
+  total 1.7G
+  -rw-r--r--  1 kuangfangjun root  188 Feb 16 16:22 README.md
+  -rw-r--r--  1 kuangfangjun root 425M Feb 16 16:21 decoder.int8.onnx
+  -rw-r--r--  1 kuangfangjun root 1.3G Feb 16 16:21 encoder.int8.onnx
+  drwxr-xr-x 10 kuangfangjun root    0 Feb 16 16:26 test_wavs
+  -rw-r--r--  1 kuangfangjun root  70K Feb 16 16:21 tokens.txt
+
+  ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/
+  total 1.9M
+  -rw-r--r-- 1 kuangfangjun root 315K Feb 16 16:24 0.wav
+  -rw-r--r-- 1 kuangfangjun root 160K Feb 16 16:24 1.wav
+  -rw-r--r-- 1 kuangfangjun root 147K Feb 16 16:24 2.wav
+  -rw-r--r-- 1 kuangfangjun root 245K Feb 16 16:25 3-sichuan.wav
+  -rw-r--r-- 1 kuangfangjun root 276K Feb 16 16:24 3.wav
+  -rw-r--r-- 1 kuangfangjun root 245K Feb 16 16:25 4-tianjin.wav
+  -rw-r--r-- 1 kuangfangjun root 250K Feb 16 16:26 5-henan.wav
+  -rw-r--r-- 1 kuangfangjun root 276K Feb 16 16:24 8k.wav
+
+Decode a file
+^^^^^^^^^^^^^
+
+Please use the following command to decode a wave file:
+
+.. code-block:: bash
+
+  ./build/bin/sherpa-onnx-offline \
+    --tokens=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt \
+    --fire-red-asr-encoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx \
+    --fire-red-asr-decoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx \
+    --num-threads=1 \
+    ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav
+
+You should see the following output:
+
+.. literalinclude:: ./code/2025-02-16.txt
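
The decode command in `pretrained.rst` can also be assembled programmatically, for example to loop over the files in `test_wavs`. The following is a minimal sketch, assuming the binary path and the four flags shown in that command; the helper `build_decode_command` is hypothetical and not part of sherpa-onnx.

```python
import shlex
from pathlib import Path
from typing import List


def build_decode_command(
    model_dir: str,
    wav: str,
    binary: str = "./build/bin/sherpa-onnx-offline",
    num_threads: int = 1,
) -> List[str]:
    """Build the argument list for decoding one wave file (hypothetical helper).

    Only the flags that appear in the documented command are used.
    """
    d = Path(model_dir)
    return [
        binary,
        f"--tokens={d / 'tokens.txt'}",
        f"--fire-red-asr-encoder={d / 'encoder.int8.onnx'}",
        f"--fire-red-asr-decoder={d / 'decoder.int8.onnx'}",
        f"--num-threads={num_threads}",
        str(wav),
    ]


MODEL_DIR = "sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16"
cmd = build_decode_command(MODEL_DIR, f"{MODEL_DIR}/test_wavs/0.wav")

if __name__ == "__main__":
    # Print the command rather than running it, so the sketch works
    # even when the binary and the model files are not present.
    print(shlex.join(cmd))
```

To actually run the decoder, the list can be passed to `subprocess.run(cmd, check=True)` from the sherpa-onnx source directory once the binary has been built and the model downloaded.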
diff --git a/docs/source/onnx/index.rst b/docs/source/onnx/index.rst
@@ -51,6 +51,7 @@ Also, we show how to use it for speech recognition with pre-trained models.
    ./pretrained_models/index
    ./moonshine/index
    ./sense-voice/index
+   ./FireRedAsr/index
 
 .. toctree::
    :maxdepth: 5