Feature subsample #34
base: develop
…into feature-subsample
…k on supersampling
@clegaard, now that the caching branch has been merged into develop, there are some merge conflicts that need fixing before this branch can be merged. Also, stricter linting checks were added, so the branch needs to be checked against these.
…into feature-subsample
Also consolidated shuffle tests and updated from_iterable
Hey, a couple of things:
Aside from this, I noticed and fixed some typing problems and snuck in a few issue fixes: Also, I've renamed |
SubsampleDataset and SuperSampleDataset
- Prefixed private attributes with an underscore
- Changed n_ss to sampling_ratio
- Changed name of argument func to subsample_func and supersample_func
- Use _parent reference rather than self.dataset
Dataset
- Free version of supersample added
Documentation
- Doctest snippets modified to reflect new argument names
test_dataset
- test now uses indices to refer to attributes rather than fields
test_loaders
- test now uses indices to refer to attributes rather than fields
Hey, slightly better commit names. However, you're still not following the standard (did you actually read the document?). The names should be:
I can see you have added slicing. While it seems to work for Dataset, it doesn't work for the child classes that override __getitem__ (StreamDataset, SubsampleDataset, SupersampleDataset). Moreover, the ItemGetter should also be updated to take slices if you want to be able to generically use self._parent[1:2] within the dataset class.
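To illustrate the point about child classes, here is a minimal sketch of a wrapper that overrides __getitem__ and still accepts slices. The class MappedView and its helper _get_one are hypothetical names made up for this example; they are not the project's StreamDataset, SubsampleDataset or ItemGetter, and the parent is assumed to be any object supporting len() and integer indexing.

```python
from typing import Any, Callable, List, Sequence, Union


class MappedView:
    """Hypothetical child dataset that wraps a parent and overrides __getitem__."""

    def __init__(self, parent: Sequence[Any], func: Callable[[Any], Any]) -> None:
        self._parent = parent
        self._func = func

    def __len__(self) -> int:
        return len(self._parent)

    def _get_one(self, i: int) -> Any:
        # Single-item path: all child-specific logic lives here.
        return self._func(self._parent[i])

    def __getitem__(self, i: Union[int, slice]) -> Union[Any, List[Any]]:
        if isinstance(i, slice):
            # Expand the slice into concrete indices and reuse the single-item path,
            # so the override cannot silently break slicing.
            return [self._get_one(j) for j in range(*i.indices(len(self)))]
        return self._get_one(i)


view = MappedView([1, 2, 3, 4], lambda x: x * 10)
print(view[1:3])
```

Routing slices through the integer path keeps each override small: a child class only has to get the single-item case right.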
Built-in Python sequences all support slicing, e.g.
>>> [1,2,3][0:2]
[1, 2]
Implementation of Dataset.__getitem__ has changed:
- add index out of bounds check
- improve conversion from slice to indices using slice.indices(), now the following is possible:
>>> ds = from_iterable(list(range(10)))
>>> ds[:]
[0, 1, 2, ..., 9]
Add slicing support for SubsampleDataset and SupersampleDataset
Testing:
- add test case for Dataset.__getitem__
- add test case for SubsampleDataset.__getitem__
- add test case for SupersampleDataset.__getitem__
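For readers unfamiliar with slice.indices(), the following is a rough sketch of how an out-of-bounds check and slice conversion could fit together in a __getitem__. SliceableDataset is a hypothetical stand-in written for this example, not the actual Dataset implementation from this branch.

```python
from typing import Any, List, Sequence, Union


class SliceableDataset:
    """Hypothetical stand-in for a dataset class, backed by a plain list."""

    def __init__(self, items: Sequence[Any]) -> None:
        self._items = list(items)

    def __len__(self) -> int:
        return len(self._items)

    def __getitem__(self, i: Union[int, slice]) -> Union[Any, List[Any]]:
        if isinstance(i, slice):
            # slice.indices(n) fills in missing start/stop/step and clamps
            # them to a sequence of length n, so range() can consume them.
            return [self._items[j] for j in range(*i.indices(len(self)))]
        if not -len(self) <= i < len(self):
            raise IndexError(
                f"index {i} out of bounds for dataset of length {len(self)}"
            )
        return self._items[i]


ds = SliceableDataset(range(10))
print(ds[:])
print(ds[2:5])
```

As with built-in lists, an out-of-range slice simply yields fewer items, while an out-of-range integer index raises IndexError.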
Update __getitem__ typehints to reflect that slicing is supported
Update typehints for bound transform methods of dataset using forward declaration of types. See: https://www.python.org/dev/peps/pep-0484/#forward-references
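A small sketch of the forward-reference technique from PEP 484, for anyone reviewing the typehint changes. The class Chain and its transform method are hypothetical and exist only to show the pattern; they do not mirror the project's method names.

```python
from typing import Any, Callable, List


class Chain:
    """Hypothetical class used only to illustrate PEP 484 forward references."""

    def __init__(self, items: List[Any]) -> None:
        self._items = items

    # The name Chain is not bound yet while the class body is executing,
    # so the return annotation is written as a string literal. Type checkers
    # resolve it to the class once the definition is complete.
    def transform(self, func: Callable[[Any], Any]) -> "Chain":
        return Chain([func(x) for x in self._items])


result = Chain([1, 2]).transform(lambda x: x + 1)
print(result._items)
```

Without the quotes, the annotation would be evaluated while the class is still being defined and would fail with a NameError on Python versions that evaluate annotations eagerly.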
Summary:
The current implementation requires the user to specify the number of subsamples produced by each sample, which also limits it to subsampling processes that always produce a fixed number of subsamples.
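One plausible reason for that limitation, sketched below: with a fixed ratio, both the total length and the mapping from a flat index to a (parent item, subsample) pair can be computed without evaluating the subsampling function on every item. FixedRatioSubsample is a hypothetical class written for this summary; only the argument names subsample_func and sampling_ratio are taken from the commit message above, and the real SubsampleDataset may work differently.

```python
from typing import Any, Callable, Sequence


class FixedRatioSubsample:
    """Hypothetical sketch: every parent item yields exactly sampling_ratio subsamples."""

    def __init__(
        self,
        parent: Sequence[Any],
        subsample_func: Callable[[Any], Sequence[Any]],
        sampling_ratio: int,
    ) -> None:
        self._parent = parent
        self._subsample_func = subsample_func
        self._sampling_ratio = sampling_ratio

    def __len__(self) -> int:
        # Known up front only because the ratio is fixed.
        return len(self._parent) * self._sampling_ratio

    def __getitem__(self, i: int) -> Any:
        if not 0 <= i < len(self):
            raise IndexError(f"index {i} out of bounds")
        # Flat index -> (which parent item, which subsample of that item).
        parent_idx, offset = divmod(i, self._sampling_ratio)
        return self._subsample_func(self._parent[parent_idx])[offset]


# Each two-character string is split into its two characters.
sub = FixedRatioSubsample(["ab", "cd"], list, sampling_ratio=2)
print([sub[i] for i in range(len(sub))])
```

If the number of subsamples varied per item, the divmod step would no longer be valid, and the dataset would need cumulative counts or a pass over the parent to build the index mapping.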