
Support using OpenFst to compile HLG. #606

Merged · 3 commits · Dec 9, 2022

Conversation

csukuangfj (Collaborator)

This PR avoids OOM when determinizing LG in cases where G is very large.

See also

@Jarvan-Wang @wwx007121 Could you help test this PR? It should be able to handle large LMs without throwing OOM, as long as Kaldi can also build LG without OOM, since both use OpenFst to build LG.
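As background on why determinizing LG can exhaust memory in the first place, here is a toy illustration (pure Python, not the icefall or kaldifst code): subset construction can blow up exponentially. The classic worst case is an NFA over {a, b} accepting strings whose n-th symbol from the end is 'a'; the NFA has n + 1 states, but its determinization has 2**n.

```python
def determinize(n):
    """Subset construction on the (n + 1)-state NFA; returns the DFA state count.

    NFA: state 0 loops on both symbols and guesses the crucial 'a' by also
    moving to state 1; states 1..n-1 advance on any symbol; state n accepts.
    """
    def step(subset, sym):
        nxt = set()
        for q in subset:
            if q == 0:
                nxt.add(0)          # keep waiting
                if sym == "a":
                    nxt.add(1)      # guess: this 'a' is the n-th from the end
            elif q < n:
                nxt.add(q + 1)      # advance; state n has no successors
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    frontier = [start]
    while frontier:
        cur = frontier.pop()
        for sym in "ab":
            nxt = step(cur, sym)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)

for n in (2, 4, 8):
    print(n, determinize(n))   # 2 -> 4, 4 -> 16, 8 -> 256
```

Weighted FST determinization as done by OpenFst/Kaldi is more sophisticated than this plain subset construction, but the same intermediate-state growth is what makes a very large G expensive to determinize.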


I have tested the generated HLG with the pretrained model from
https://huggingface.co/Zengwei/icefall-asr-librispeech-lstm-transducer-stateless3-2022-09-28

It produces WERs identical to those of the HLG generated by the master branch on the first 30 utterances of test-clean and test-other.

One thing to note is that the file size of the resulting HLG is much smaller.

-rw-r--r-- 1 kuangfangjun root 806M Oct  8 10:51 data/lang_bpe_500/HLG.pt
-rw-r--r-- 1 kuangfangjun root 583M Oct  8 21:03 data/lang_bpe_500/HLG_fst.pt

HLG.pt is generated by the master code, while HLG_fst.pt is generated by this PR.

You have to install kaldifst to use this PR.

pip install kaldifst

or

conda install -c kaldifst kaldifst

The documentation for kaldifst is available at
https://k2-fsa.github.io/kaldifst/

@csukuangfj csukuangfj merged commit 4501821 into k2-fsa:master Dec 9, 2022
@csukuangfj csukuangfj deleted the hlg-openfst branch December 9, 2022 08:46
@wangtiance (Contributor)

Glad I found this. Would it further help reduce the size of HLG if I determinize and minimize after composing H with LG?

@wwx007121 (Contributor)

> Glad I found this. Would it further help reduce the size of HLG if I determinize and minimize after composing H with LG?

The size here comes from the FST itself, and shrinking the FST can affect recognition accuracy. So if you want to decode with phone-level modeling/WFST, it is better to compose a lattice with LG instead of H.

@wangtiance (Contributor)

@wwx007121 I thought an FST after minimization is equivalent to the original, so it shouldn't affect the recognition result?
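For what it's worth, this point can be demonstrated on a toy automaton (pure Python, not kaldifst or the icefall code): DFA minimization only merges indistinguishable states, so the minimized machine accepts exactly the same language as the original. In that sense minimization by itself cannot change recognition results; any WER change would have to come from other steps (e.g. pruning or weight manipulation), not from minimization. The `minimize` below is a sketch of Moore-style partition refinement, not the OpenFst implementation.

```python
def accepts(dfa, s):
    """Run a DFA given as (start_state, accepting_states, transitions)."""
    start, accept, trans = dfa
    q = start
    for c in s:
        q = trans.get((q, c))
        if q is None:          # missing transition: reject
            return False
    return q in accept

def minimize(dfa, alphabet):
    """Moore-style partition refinement; returns an equivalent DFA."""
    start, accept, trans = dfa
    states = {start} | set(accept) | {q for (q, _) in trans} | set(trans.values())
    # Start by separating accepting from non-accepting states.
    part = {q: (q in accept) for q in states}
    while True:
        # Two states stay in one class iff they agree on acceptance and on
        # the class of every successor.
        sig = {q: (part[q],) + tuple(part.get(trans.get((q, c)))
                                     for c in alphabet)
               for q in states}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values()), key=repr))}
        new_part = {q: ids[sig[q]] for q in states}
        if len(set(new_part.values())) == len(set(part.values())):
            part = new_part
            break                       # partition is stable
        part = new_part
    new_trans = {(part[q], c): part[r] for (q, c), r in trans.items()}
    return (part[start], {part[q] for q in accept}, new_trans)

# A 4-state DFA for "even number of 'a's" with two redundant duplicate states.
trans = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 2, (1, "b"): 3,
    (2, "a"): 3, (2, "b"): 0,
    (3, "a"): 0, (3, "b"): 1,
}
dfa = (0, {0, 2}, trans)
mdfa = minimize(dfa, "ab")   # collapses to the minimal 2-state DFA
```

Note that for *weighted* FSTs like HLG, equivalence means identical path weights, and weighted minimization is usually preceded by weight pushing and an encoding step; the language-preservation argument still holds.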
