Create DJL Time Series Dataset #1590
Comments
Hello, I'd like to work on this issue. Could you assign it to me? Thanks!
@WHALEEYE Just did. Thanks for your contribution!
@lanking520 @zachgk Hello, I'm trying to add the Daily Climate time series dataset to the project. Since the dataset is hosted on Kaggle, users may need to log in before they can download it. Should the dataset be stored somewhere else so that it can be downloaded automatically when people use DJL?
Yeah @WHALEEYE, it would be good to not have to log in to get the dataset. What we can do is store the dataset along with the metadata file in S3 and distribute it that way. Of course, the prerequisite is that the license permits us to redistribute the dataset. Some licenses, such as those for MNIST or ImageNet, do not. Fortunately, the climate dataset is under the CC0 license, which does allow redistribution. Here's what you should do. When you are creating the metadata file and testing it locally, set up a directory structure something like this:
Then, in the metadata, you can use a relative URI such as "1.0/datasetFile1". The banana dataset is an example of this. Once it is working and you open the PR, don't add the dataset files to git. Just remind us in the PR message to upload the files along with the metadata, and we will try to get it to match the metadata format. You can also compress any of the dataset files individually with gzip, and they will be extracted automatically.
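For the gzip step mentioned above, here is a minimal sketch of compressing a single dataset file using only the Java standard library. The class name `GzipDatasetFile`, the file name `datasetFile1.csv`, and its contents are hypothetical placeholders, not part of DJL or of the climate dataset. Whether the metadata entry should reference the compressed or the uncompressed file name is not covered here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipDatasetFile {

    /** Compresses {@code source} into a sibling file named source + ".gz". */
    public static Path gzip(Path source) throws IOException {
        Path target = source.resolveSibling(source.getFileName() + ".gz");
        try (InputStream in = Files.newInputStream(source);
                OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            in.transferTo(out); // stream the bytes through the gzip encoder
        }
        return target;
    }

    /** Decompresses a gzip file into memory, to sanity-check the round trip. */
    public static byte[] gunzip(Path compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(compressed))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a real dataset file.
        Path dir = Files.createTempDirectory("dataset");
        Path csv = dir.resolve("datasetFile1.csv");
        Files.writeString(csv, "a,b\n1,2\n");

        Path gz = gzip(csv);
        System.out.println("Wrote " + gz);
    }
}
```

Running the `gzip` command-line tool on the file should give an equivalent result. The Java version is only useful if you want to script the step alongside your local tests.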
Thanks! I also have one question: this dataset does not seem to have labels, but the
For unlabeled data like time series, you can just use an empty NDList for the labels.
OK, got it.
Description
A time series dataset contains a sequence of events recorded over time. Examples include climate records, stock prices, and other data used for forecasting. This issue is to add any time series dataset to the DJL basicdatasets.