Force decode timestamp after speaker turn #11

akashmjn · 2023-06-07T17:36:46Z

Fix for #10 by updating the logit filter ApplyTimestampRules in decoding.py.
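The idea behind the filter change can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual ApplyTimestampRules code: the function name, the plain-list logits, and the token ids are hypothetical, and it assumes (as in Whisper's tokenizer) that timestamp tokens occupy the ids at or above a `timestamp_begin` index.

```python
import math


def force_timestamp_after_speaker_turn(logits, tokens, speaker_turn_id, timestamp_begin):
    """Return a copy of `logits` where, if the last sampled token is a
    speaker turn, every non-timestamp token is masked out.

    logits:          list of floats, one per vocabulary entry
    tokens:          list of token ids sampled so far
    speaker_turn_id: id of the speaker-turn token (hypothetical value)
    timestamp_begin: first id of the timestamp token range (assumption)
    """
    if tokens and tokens[-1] == speaker_turn_id:
        # Only timestamp tokens (ids >= timestamp_begin) stay sampleable.
        return [x if i >= timestamp_begin else -math.inf for i, x in enumerate(logits)]
    # Otherwise leave the distribution untouched.
    return list(logits)


# Toy vocabulary of 5 tokens: id 1 is the speaker turn, ids 3-4 are timestamps.
logits = [0.5, 2.0, 1.0, 0.1, 0.3]
masked = force_timestamp_after_speaker_turn(logits, [0, 1], speaker_turn_id=1, timestamp_begin=3)
```

With the mask applied, both greedy and sampled decoding can only pick a timestamp as the next token after a speaker turn.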

Re-ran run_pipelines.py and updated the analysis notebook and metrics. The numbers for tdrz did not change much.

Also includes the following minor improvements:

Fix a small bug in run_pipelines.py
Decoding of <|speakerturn|> tokens is now also configurable via an added DecodingOption, e.g. whisper.transcribe(with_speaker_turns=False).
Speaker turns are better formatted in verbose output and are also included in the written JSON segments, in the field before_speaker_turn.
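As a rough sketch of how the new JSON field might be consumed downstream: the helper below groups segment texts into speaker turns. It assumes each segment is a dict with a `text` string and a boolean `before_speaker_turn` that is true when a speaker turn follows the segment. The helper name and the sample segments are made up for illustration.

```python
def split_into_turns(segments):
    """Group segment texts into turns, closing a turn after every
    segment flagged with before_speaker_turn (assumed boolean)."""
    turns, current = [], []
    for seg in segments:
        current.append(seg["text"].strip())
        if seg.get("before_speaker_turn", False):
            turns.append(" ".join(current))
            current = []
    if current:
        # Trailing segments with no closing speaker turn form the last turn.
        turns.append(" ".join(current))
    return turns


# Illustrative data only, not real model output.
segments = [
    {"text": " Hello there.", "before_speaker_turn": False},
    {"text": " How are you?", "before_speaker_turn": True},
    {"text": " Fine, thanks.", "before_speaker_turn": False},
]
turns = split_into_turns(segments)
```

Using `seg.get(...)` with a default keeps the helper usable on JSON written without the field, for example when speaker turns are disabled.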


akashmjn and others added 4 commits June 7, 2023 10:04

force sample timestamp after speaker turn

c243f8f

update metrics & analysis after decoder update

a181694

small bugfix of chdir in run_pipelines

76bf427

fix isort complaint

a6d97c2

akashmjn mentioned this pull request Jun 8, 2023

whisper : mark speakers/voices (diarization) ggerganov/whisper.cpp#64

Open

akashmjn added 2 commits June 12, 2023 23:13

controllable output via with_speaker_turns DecodingOption, autoset for tdrz models

044cb88

update PR link in readme

61edf29

akashmjn merged commit 094b709 into main Jun 13, 2023

akashmjn mentioned this pull request Aug 16, 2023

HF Transformers Weights #15

Open

akashmjn mentioned this pull request Nov 1, 2023

Encountering Value Error When Running the Provided Colab Example #18

Open
