-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
【Hackathon 6th No. 27】为 paddle.io.RandomSampler/random_split /Layer.clear_gradients 进行功能增强 -part #62966
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
Sorry to inform you that 8906223's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually. |
is True, then it will take samples according to the number you set. Default None, disabled. | ||
num_samples(int, optional): Set sample number to draw. If :attr:`replacement` | ||
is True, it will take samples according to the number you set. If | ||
:attr:`replacement` is False, then it will yield randonly permutated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里为何写的这么复杂呢,不管replacement为多少,不都抽取 num_samples个不就可以了吗。
同时默认情况下 是否是replacemetn=False, num_samples=len(data_source),下面那些用公式描述的语句,不太容易看懂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是的,默认情况是 replacemetn=False, num_samples=len(data_source)。
@@ -245,8 +243,13 @@ def __iter__(self): | |||
).tolist(): | |||
yield index | |||
else: | |||
for _ in range(self.num_samples // n): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
torch也支持这种num_samples比n还大的情况吗,这个场景看起来不是很合理?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import torch
from torch.utils.data import RandomSampler
sampler = RandomSampler(list(range(10)), replacement=False, num_samples=20)
print(list(sampler))
# [5, 6, 1, 7, 0, 3, 2, 8, 4, 9, 2, 4, 7, 9, 8, 3, 6, 1, 0, 5]
torch是支持的,是不太合理,需要改成 replacement=False 时 num_samples 必须小于等于 len(data_source) 吗?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NKNaN 按照这个合理的思路来实现吧,文档、具体实现都需要改一下,映射文档里提一下有这个差异
@@ -87,6 +87,16 @@ def test_with_num_samples(self): | |||
rets.append(i) | |||
assert i >= 0 and i < 100 | |||
|
|||
def test_with_num_samples_and_without_replacement(self): | |||
dataset = RandomDataset(100, 10) | |||
sampler = RandomSampler(dataset, num_samples=150, replacement=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_samples是不是应该比len(dataset)小,才合理一些
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是的
@NKNaN 除了CI Static-Check和Approve之外,其他CI需要通过,另外需要修改 API中文文档、API映射文档 |
@NKNaN 按照更合理的 |
好的 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR Category
User Experience
PR Types
Improvements
Description