This repository was archived by the owner on Oct 10, 2022. It is now read-only.

Commit b8c2c48

Merge pull request #45 from EmilPi/master
Update README.md : use AzCopy if wget fails
2 parents 8e0cc6a + febda6b commit b8c2c48

1 file changed: +14 -0 lines changed

README.md

@@ -268,6 +268,20 @@ or
2. Download the meta data and manifests for each dataset:
3. Merge files (where applicable), unpack and enjoy!

### Manually (using AzCopy) (2022-03-10)

When downloading large files from Azure, a `wget` download may restart so often that the largest file, `archives/radio_v4_manifest.tar.gz` (176 GB), never finishes.

In that case you can use the [AzCopy](https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10) utility.

Instructions for downloading files with it are [here](https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-blobs-download). For the large file mentioned above, run

`azcopy[.exe] copy https://azureopendatastorage.blob.core.windows.net/openstt/ru_open_stt_opus/archives/radio_v4_manifest.tar.gz radio_v4_manifest.tar.gz`

Because the destination is a relative path, the file is saved to the current working directory, which is the folder containing `azcopy[.exe]` if you run the command from there.
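
As a rough sketch, the AzCopy command for the large file could be wrapped in a small POSIX shell helper so that other files from the same storage location can be fetched the same way. The function names `blob_url` and `fetch_with_azcopy` are hypothetical and not part of AzCopy or this repository. The base URL is the one from the `radio_v4_manifest.tar.gz` command, and the sketch assumes `azcopy` is available on `PATH`.

```shell
#!/bin/sh
# Base URL copied from the radio_v4_manifest.tar.gz download command.
BASE_URL="https://azureopendatastorage.blob.core.windows.net/openstt/ru_open_stt_opus"

# Hypothetical helper: print the full blob URL for a path relative to the dataset root.
blob_url() {
    printf '%s/%s\n' "$BASE_URL" "$1"
}

# Hypothetical helper: download one file into a destination directory
# (default: the current directory), keeping its original file name.
fetch_with_azcopy() {
    rel_path="$1"
    dest_dir="${2:-.}"
    # Fail early with a clear message if azcopy cannot be found.
    if ! command -v azcopy >/dev/null 2>&1; then
        echo "azcopy not found on PATH" >&2
        return 1
    fi
    azcopy copy "$(blob_url "$rel_path")" "$dest_dir/$(basename "$rel_path")"
}

# Usage (this starts the 176 GB download):
# fetch_with_azcopy archives/radio_v4_manifest.tar.gz
```

Nothing is downloaded until `fetch_with_azcopy` is called explicitly, so the script can be sourced first and the URL checked with `blob_url` before starting a large transfer.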
# **Annotation methodology**

The dataset is compiled using open domain sources.
