This repository was archived by the owner on Oct 10, 2022. It is now read-only.
 Open issues, collaborate, submit a PR, contribute, share your datasets!
 Let's make STT in Russian (and more) as open and available as CV models.
@@ -50,14 +52,20 @@ Let's make STT in Russian (and more) as open and available as CV models.
 This alignment was performed using Yuri's alignment tool.
 [Contact him](mailto:open_stt@googlegroups.com) if you need alignment for your own dataset.
 
-# **_update 2019-05-07_ Help needed!**
+## **_Update 2019-05-10_**
+
+Quickly converted the dataset to MP3 thanks to the community!
+Waiting for our Academic Torrents account to be approved.
+v0.4 will boast MP3 download links.
+
+## **_Update 2019-05-07_ Help needed!**
 
 **If you want to support the project, you can:**
 - Help us with hosting (create a mirror) / provide a reliable torrent node;
 - Help us with writing some [helper](https://github.com/snakers4/open_stt/issues/2) functions;
 - [Donate](https://buymeacoff.ee/8oneCIN) (each coffee pays for several full downloads) / use our DO referral [link](https://sohabr.net/habr/post/357748/) to help;
 
-We are converting the dataset to MP3 now.
+~~We are converting the dataset to MP3 now.~~
 Please contact us using the contacts below if you would like to help.
 
 # **Downloads**
@@ -66,22 +74,22 @@ Please contact us using the below contacts, if you would like to help.
 
 Meta data [file](https://ru-open-stt.ams3.digitaloceanspaces.com/public_meta_data_v03.csv).
+| audiobook_2 | 166 | 21.0 | down |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/private_buriy_audiobooks_2_mp3.tar.gz)| Sources from the Internet + alignment |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/private_buriy_audiobooks_2.csv)|
+| asr_public_phone_calls_2 | 66 | 7.5 | down |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_2_mp3.tar.gz)| Sources from the Internet + ASR |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_2.csv)|
+| asr_public_stories_2 | 9 (7.5) | NA |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_stories_2.tar.gz)| NA | Sources from the Internet + alignment |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_stories_2.csv)|
+| asr_public_phone_calls_1 | 22.7 | 2.6 | down |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_1_mp3.tar.gz)| Sources from the Internet + ASR |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_1.csv)|
+| asr_public_stories_1 | 4.1 | 0.5 | down |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_stories_1_mp3.tar.gz)| Public stories |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_stories_1.csv)|
+| public_series_1 | 1.9 | 0.2 | down |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/public_series_1_mp3.tar.gz)| Public series |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/public_series_1.csv)|
+| russian_single | 0.9 | 0.1 | down |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/russian_single_mp3.tar.gz)| Russian single speaker [dataset](https://www.kaggle.com/bryanpark/russian-single-speaker-speech-dataset)|[link](https://ru-open-stt.ams3.digitaloceanspaces.com/russian_single.csv)|
+| public_lecture_1 | 0.7 | 0.1 | down |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/public_lecture_1_mp3.tar.gz)| Sources from the Internet |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/public_lecture_1.csv)|
-| audiobook_2 | 166 | 131.7 |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/audiobooks_2.tar.gz_aa), [part2](https://ru-open-stt.ams3.digitaloceanspaces.com/audiobooks_2.tar.gz_ab), [part3](https://ru-open-stt.ams3.digitaloceanspaces.com/audiobooks_2.tar.gz_ac), [part4](https://ru-open-stt.ams3.digitaloceanspaces.com/audiobooks_2.tar.gz_ad), [part5](https://ru-open-stt.ams3.digitaloceanspaces.com/audiobooks_2.tar.gz_ae), [part6](https://ru-open-stt.ams3.digitaloceanspaces.com/audiobooks_2.tar.gz_af), [part7](https://ru-open-stt.ams3.digitaloceanspaces.com/audiobooks_2.tar.gz_ag)| Sources from the Internet + alignment |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/private_buriy_audiobooks_2.csv)|
-| asr_public_phone_calls_2 | 66 | 51.7 |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_2.tar.gz_aa), [part2](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_2.tar.gz_ab), [part3](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_2.tar.gz_ac)| Sources from the Internet + ASR |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_2.csv)|
-| asr_public_stories_2 | 9 | 7.5 |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_stories_2.tar.gz)| Sources from the Internet + alignment |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_stories_2.csv)|
-| asr_public_phone_calls_1 | 22.7 | 19.0 |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_1.tar.gz)| Sources from the Internet + ASR |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_phone_calls_1.csv)|
-| asr_public_stories_1 | 4.1 | 3.8 |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_stories_1.tar.gz)| Public stories |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/asr_public_stories_1.csv)|
-| public_series_1 | 1.9 | 1.7 |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/public_series_1.tar.gz)| Public series |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/public_series_1.csv)|
-| public_lecture_1 | 0.7 | 0.6 |[part1](https://ru-open-stt.ams3.digitaloceanspaces.com/public_lecture_1.tar.gz)| Sources from the Internet |[link](https://ru-open-stt.ams3.digitaloceanspaces.com/public_lecture_1.csv)|
-| Total | 190 | 163 |||||
 
 
 ## **Download instructions**
@@ -108,6 +116,7 @@ Meta data [file](https://ru-open-stt.ams3.digitaloceanspaces.com/public_meta_dat
 
 ## **Check md5sum**
 
+Including links to deprecated files.
 `md5sum /path/to/downloaded/file`
 
 <details>
@@ -118,6 +127,62 @@ Meta data [file](https://ru-open-stt.ams3.digitaloceanspaces.com/public_meta_dat
@@ -316,9 +381,11 @@ Meta data [file](https://ru-open-stt.ams3.digitaloceanspaces.com/public_meta_dat
 </table>
 </details>
 
+
 ## **End to end download scripts**
 
 You can use this [script](https://github.com/snakers4/open_stt/blob/master/download.sh) with this config [file](https://github.com/snakers4/open_stt/blob/master/md5sum.lst).
+Please check the config first.
 You can also [contribute](https://github.com/snakers4/open_stt/issues/2) a similar script in Python.
 
 # **Annotation methodology**
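The "Check md5sum" and "End to end download scripts" hunks above describe a download-then-verify workflow and invite a Python counterpart to `download.sh`. What follows is only a minimal sketch of that idea, not the repository's script: the function names `md5_of_file` and `download_and_verify` are hypothetical, and the expected digest is passed in as a plain string because the layout of `md5sum.lst` is not shown in this diff.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_and_verify(url, dest, expected_md5):
    """Download url to dest (skipped if dest already exists), then check its MD5.

    Hypothetical helper: returns True when the digest matches expected_md5.
    """
    dest = Path(dest)
    if not dest.exists():
        with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
            shutil.copyfileobj(response, out)
    return md5_of_file(dest) == expected_md5.strip().lower()
```

A caller would pass one of the archive URLs from the Downloads table together with the digest published for that file, and re-download when the function returns `False`. Skipping files that already exist keeps an interrupted run cheap to resume, at the cost of trusting a partial file until its digest is checked.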
@@ -404,11 +471,102 @@ Please contact us [here](mailto:open_stt@googlegroups.com) or just create a GitH
 
 # **FAQ**
 
-## **0. Why not MP3?**
+## **0. ~~Why not MP3?~~ MP3 encoding / decoding**
+
+#### **Encoding**
+
+We mostly used `pydub` (via ffmpeg) to convert the files to MP3.
+We omitted blank files (mostly from YouTube).
+We used the following parameters:
+- 16 kHz;
+- 32 kbps;
+- Mono;
+
+Usually 128-192 kbps is enough for music with a sampling rate of 44.1 kHz, and 64-96 kbps is enough for speech.
+Here, however, the audio is mono, 16 kHz, and usually has only one speaker, so 32 kbps was a good choice.
+We did not use other formats such as `.ogg`, because `.mp3` is much more popular.
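The encoding parameters listed in the hunk above (16 kHz, 32 kbps, mono) can be illustrated with a short sketch. This is not the authors' conversion code, which the diff does not show: the function names are hypothetical, and the size comparison assumes the original WAV files were 16-bit PCM, which the source does not state. `pydub` is imported inside the function because it is a third-party package that also needs ffmpeg available at run time.

```python
def pcm_bitrate_kbps(sample_rate_hz=16000, bits_per_sample=16, channels=1):
    """Bitrate of uncompressed PCM audio in kbps (16-bit depth is an assumption)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def estimated_size_bytes(duration_s, bitrate_kbps):
    """Approximate size of a constant-bitrate stream, ignoring container overhead."""
    return duration_s * bitrate_kbps * 1000 / 8


def convert_to_mp3(src_path, dst_path, sample_rate_hz=16000, bitrate="32k"):
    """Re-encode an audio file as mono MP3 at the given sample rate and bitrate."""
    from pydub import AudioSegment  # third-party; requires ffmpeg on PATH

    audio = AudioSegment.from_file(src_path)
    audio = audio.set_frame_rate(sample_rate_hz).set_channels(1)
    audio.export(dst_path, format="mp3", bitrate=bitrate)


if __name__ == "__main__":
    wav_kbps = pcm_bitrate_kbps()                 # 256.0 kbps under the 16-bit assumption
    print(wav_kbps / 32)                          # compression factor versus 32 kbps MP3
    print(estimated_size_bytes(3600, 32) / 1e6)   # megabytes per hour of audio at 32 kbps
```

Under the 16-bit assumption the expected reduction is about a factor of 8, which is in line with the Downloads table, where `audiobook_2` goes from 166 to 21.0.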