Add Streaming zipformer #267

yaozengwei · 2023-01-06T02:25:41Z

This PR adds streaming zipformer (see k2-fsa/icefall#787) as an online model.

csukuangfj

Thanks! Looks great! Just left some minor comments.

csukuangfj · 2023-01-06T02:27:44Z

sherpa/cpp_api/bin/online-recognizer.cc

@@ -68,7 +68,19 @@ To use fast_beam_search with an LG, use
    foo.wav \
    bar.wav

-(4) To decode wav.scp
+(4) To use an streaming Zipformer model for recognition


Suggested change

-(4) To use an streaming Zipformer model for recognition
+(4) To use a streaming Zipformer model for recognition

csukuangfj · 2023-01-06T02:29:44Z

sherpa/cpp_api/online-recognizer.cc

+        int32_t chunk_size = config.chunk_size;
+        // It is used after feature embedding, that does (T-7)//2
+        int32_t model_chunk_size = encoder.attr("decode_chunk_size").toInt();
+        SHERPA_CHECK_EQ(chunk_size / 2, model_chunk_size);


If we can get model_chunk_size from the model, is it still necessary to require the user to specify it?

It is only there to ensure that the given chunk_size equals the exported one. I will update the code in streaming zipformer. It supports using different chunk sizes.
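The check discussed above can be sketched without libtorch as follows. This is a minimal sketch, not the code merged in this PR: `ResolveChunkSize` is a hypothetical helper, and treating a non-positive user value as "use the chunk size stored in the model" is an assumption made only to illustrate the reviewer's question.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of sherpa). It mirrors
//   SHERPA_CHECK_EQ(chunk_size / 2, model_chunk_size);
// but also lets the caller leave chunk_size unset (<= 0), in which case the
// value is derived from the chunk size stored in the exported model.
inline int32_t ResolveChunkSize(int32_t user_chunk_size,
                                int32_t model_chunk_size) {
  if (user_chunk_size <= 0) {
    // Assumption: an unset value means "trust the model".
    return 2 * model_chunk_size;
  }
  if (user_chunk_size / 2 != model_chunk_size) {
    throw std::invalid_argument(
        "chunk_size / 2 must equal the model's decode_chunk_size (" +
        std::to_string(model_chunk_size) + ")");
  }
  return user_chunk_size;
}
```

With a model exported with a decode chunk size of 16, `ResolveChunkSize(0, 16)` returns 32, `ResolveChunkSize(32, 16)` returns 32, and `ResolveChunkSize(64, 16)` throws.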

csukuangfj · 2023-01-06T03:40:12Z

sherpa/csrc/online-zipformer-transducer-model.cc

+  }
+
+  int32_t num_encoders = num_elements / 7;
+  int32_t batch_size = static_cast<const torch::Tensor &>(states[0]).size(1);


Suggested change

-  int32_t batch_size = static_cast<const torch::Tensor &>(states[0]).size(1);
+  int32_t batch_size = states[0].size(1);

csukuangfj · 2023-01-06T03:51:10Z

docs/source/cpp/pretrained_models/online_transducer.rst

@@ -20,6 +20,49 @@ This sections lists models trained using `icefall`_.
 English
 ^^^^^^^

+icefall-asr-librispeech-pruned-transducer-stateless7-streaming-2022-12-29


Could you also update
https://github.com/k2-fsa/sherpa/blob/master/.github/scripts/run-online-transducer.sh

csukuangfj · 2023-01-06T03:51:55Z

sherpa/cpp_api/bin/online-recognizer.cc

@@ -68,7 +68,19 @@ To use fast_beam_search with an LG, use
    foo.wav \
    bar.wav

-(4) To decode wav.scp
+(4) To use an streaming Zipformer model for recognition


Could you also update the comment in
https://github.com/k2-fsa/sherpa/blob/master/sherpa/cpp_api/bin/online-recognizer-microphone.cc

csukuangfj · 2023-01-06T03:59:00Z

.github/scripts/run-online-transducer.sh

+git lfs pull --include "exp/decoder_jit_trace.pt"
+git lfs pull --include "exp/joiner_jit_trace.pt"
+git lfs pull --include "data/lang_bpe_500/LG.pt"
+


Suggested change

popd

csukuangfj · 2023-01-06T04:02:54Z

sherpa/cpp_api/online-recognizer.cc

      }
    }
+    if (!is_supported) {


Suggested change

-    if (!is_supported) {
+    if (!model_) {

and remove is_supported.
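The idea behind this suggestion can be sketched as follows: the model pointer itself records whether a supported model was found, so a separate boolean is redundant. This is a minimal sketch under stated assumptions: `Model`, `ConvEmformer`, `Zipformer`, and `CreateModel` are hypothetical stand-ins, not the actual sherpa classes or the code in online-recognizer.cc.

```cpp
#include <memory>
#include <string>

// Hypothetical model interface, for illustration only.
struct Model {
  virtual ~Model() = default;
  virtual std::string Name() const = 0;
};

struct ConvEmformer : Model {
  std::string Name() const override { return "ConvEmformer"; }
};

struct Zipformer : Model {
  std::string Name() const override { return "Zipformer"; }
};

// Returns a null pointer when class_name matches no known model.
// No is_supported flag is needed: the pointer carries that information.
inline std::unique_ptr<Model> CreateModel(const std::string &class_name) {
  std::unique_ptr<Model> model;  // stays null unless a branch below matches
  if (class_name == "ConvEmformer") {
    model = std::make_unique<ConvEmformer>();
  } else if (class_name == "Zipformer") {
    model = std::make_unique<Zipformer>();
  }
  return model;
}
```

A caller then writes `auto model_ = CreateModel(name); if (!model_) { /* report unsupported model */ }`, which keeps the check and the state it depends on in a single variable.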

yaozengwei added 6 commits January 3, 2023 10:43

Merge remote-tracking branch 'k2-fsa/master' into streaming_zipformer

562b04e

add online-zipformer-transducer-model.{cc,h}

8c0ef06

Merge remote-tracking branch 'k2-fsa/master' into streaming_zipformer

c41b142

fix bug in online-zipformer-transducer-model.{h,cc}

5536508

add online-recognizer

2740a81

add online_transducer.rst

0158aad

csukuangfj reviewed Jan 6, 2023


yaozengwei added 2 commits January 6, 2023 11:52

add CI scripts

9fba169

fix typo

ba89296

csukuangfj reviewed Jan 6, 2023


add comment, remove 'is_supported'

f1777d7

yaozengwei added the ready label Jan 6, 2023

fix CI test, add comment in online-recognizer-microphone.cc

a101ff6

yaozengwei added ready and removed ready labels Jan 6, 2023

rm

ce2bce8

yaozengwei added ready and removed ready labels Jan 6, 2023

csukuangfj approved these changes Jan 6, 2023


yaozengwei merged commit f59887b into k2-fsa:master Jan 6, 2023
