-
Notifications
You must be signed in to change notification settings - Fork 222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add segsort from moderngpu #181
Conversation
k2/csrc/moderngpu_context.cu
Outdated
|
||
namespace { | ||
|
||
class ModernGpuContext : public mgpu::standard_context_t { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wonder if this should be called ModernGpuAllocator to clarify that it's not an instance of k2::Context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was fast! Thanks!
There exists an issue for the current commit: Lines 114 to 129 in 5a9fc1c
The cuda error code for |
Thanks!! |
std::unique_ptr<mgpu::context_t> context = | ||
GetModernGpuAllocator(src->Context()->GetDeviceId()); | ||
|
||
Array1<int32_t> &segment = src->shape.RowSplits(src->NumAxes() - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's better to add K2_CHECK_EQ(src->shape.NumAxes() == 2
at entry as we just sort values in src
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It supports ragged arrays with more than two axes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, at least check >=2
here otherwise there will be an error when calling shape.RowSplits(NumAxes() - 1)
, note that you should add such requirements in the documentation
Lines 577 to 584 in a178524
Sort a ragged array in-place. | |
@param [inout] The input array to be sorted. | |
CAUTION: it is sorted in-place. | |
@param [out] The indexes mapping from the sorted | |
array to the input array. The caller | |
has to pre-allocate memory for it | |
on the same device as `src`. |
BTW, in other APIs, such as MaxPerSublist
, we require NumAxes() ==2
now. @danpovey I think we should make those api consistent, should we require exactly NumAxes() == 2
or NumAxes() >=2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I'd like to make the APIs as general as they can be made without requiring extra work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has no requirement about the number of axes a ragged array has. It always sorts the last axis.
This function works as long as the ragged array is not empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mm.. definitely it requires NumAxes >=2...It's meaningless to call RowSplits(1) on a shape with NumAxes < 2
, actually it will crash..
Then I suggest we add documentation and requirements for those APIs with axes >= 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give an example to show what a ragged array with NumAxes < 2
looks like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mm, the point here is, now we allow user define a empty RaggedShape technically,
Line 137 in a178524
RaggedShape() = default; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a K2_DCHECK
in the next commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's better to add K2_CHECK_EQ(src->shape.NumAxes() == 2 at entry as we just sort values in src
Line 111 in d291daf
K2_DCHECK_GE(src->NumAxes(), 2); |
mgpu::segmented_sort_indices(src->values.Data(), // keys | ||
order->Data(), // indices | ||
src->values.Dim(), // count | ||
segment.Data() + 1, // segments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why +1
here? the api of mgpu is strange,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also tried segment.Data()
and got the same result.
According to https://github.com/moderngpu/moderngpu/blob/2b3985541c8e88a133769598c406c33ddde9d0a5/tests/test_segsort.cu#L9-L15
std::vector<int> cpu_segsort(const std::vector<int>& data,
const std::vector<int>& segments) {
std::vector<int> copy = data;
int cur = 0;
for(int seg = 0; seg < segments.size(); ++seg) {
int next = segments[seg];
std::sort(copy.data() + cur, copy.data() + next);
cur = next;
}
std::sort(copy.data() + cur, copy.data() + data.size());
return copy;
}
The start index of the first segment is 0. That's why I use segment.Data() + 1
.
Checking is fine.
I dont think we need to document it because any valid RaggedShape
automatically has NumAxes() > 2. But of course clarify
what the function does if it's not already clear.
…On Sun, Sep 27, 2020 at 1:27 PM Haowen Qiu ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In k2/csrc/ragged_inl.h
<#181 (comment)>:
> @@ -95,6 +100,35 @@ Ragged<T> RandomRagged(T min_value, T max_value, int32_t min_num_axes,
return Ragged<T>(shape, values);
}
+template <typename T, typename Op /* = LessThan<T> */>
+void SortSublists(Ragged<T> *src, Array1<int32_t> *order) {
+ K2_DCHECK(IsCompatible(src->values, *order));
+ K2_DCHECK_EQ(src->values.Dim(), order->Dim());
+ K2_DCHECK_EQ(src->Context()->GetDeviceType(), kCuda)
+ << "It supports only CUDA at present";
+
+ std::unique_ptr<mgpu::context_t> context =
+ GetModernGpuAllocator(src->Context()->GetDeviceId());
+
+ Array1<int32_t> &segment = src->shape.RowSplits(src->NumAxes() - 1);
mm, the point here is, now we allow user define a empty RaggedShape
technically,
https://github.com/k2-fsa/k2/blob/a17852434c87f76e2dfc837918197c98b7cb947d/k2/csrc/ragged.h#L137,
so it's better if we can check this and report error as early as possible
(i.e. at entry of the function).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#181 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAZFLO56M3UEIBCUNEQEJSTSH3EM7ANCNFSM4R2U4NOQ>
.
|
Are arcs sorted by |
There is only one label in the arc.
…On Sun, Oct 4, 2020 at 1:55 PM Fangjun Kuang ***@***.***> wrote:
At some point it should be possible to implement ArcSort (see fsa_algo.h).
@danpovey <https://github.com/danpovey>
Are arcs sorted by ilabel or by olabel?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#181 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAZFLO7RRV7ASD7HTDOHHYTSJAE6LANCNFSM4R2U4NOQ>
.
|
BTW I have declared ArcSort in fsa_algo.h.
…On Sun, Oct 4, 2020 at 2:06 PM Daniel Povey ***@***.***> wrote:
There is only one label in the arc.
On Sun, Oct 4, 2020 at 1:55 PM Fangjun Kuang ***@***.***>
wrote:
> At some point it should be possible to implement ArcSort (see fsa_algo.h).
>
> @danpovey <https://github.com/danpovey>
>
> Are arcs sorted by ilabel or by olabel?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#181 (comment)>, or
> unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AAZFLO7RRV7ASD7HTDOHHYTSJAE6LANCNFSM4R2U4NOQ>
> .
>
|
Guys, IMO it would have been more convenient to allow SortSublists to
accept arrays with any num-axes but operate on the last one.
So it would access the last row-ids or row-splits.
Not urgent though.
…On Sun, Oct 4, 2020 at 3:58 PM Fangjun Kuang ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In k2/csrc/ragged_inl.h
<#181 (comment)>:
> @@ -95,6 +100,35 @@ Ragged<T> RandomRagged(T min_value, T max_value, int32_t min_num_axes,
return Ragged<T>(shape, values);
}
+template <typename T, typename Op /* = LessThan<T> */>
+void SortSublists(Ragged<T> *src, Array1<int32_t> *order) {
+ K2_DCHECK(IsCompatible(src->values, *order));
+ K2_DCHECK_EQ(src->values.Dim(), order->Dim());
+ K2_DCHECK_EQ(src->Context()->GetDeviceType(), kCuda)
+ << "It supports only CUDA at present";
+
+ std::unique_ptr<mgpu::context_t> context =
+ GetModernGpuAllocator(src->Context()->GetDeviceId());
+
+ Array1<int32_t> &segment = src->shape.RowSplits(src->NumAxes() - 1);
it's better to add K2_CHECK_EQ(src->shape.NumAxes() == 2 at entry as we
just sort values in src
@qindazhu <https://github.com/qindazhu>
fixed in #218 <#218>
https://github.com/k2-fsa/k2/blob/d291daf7a88d50907d2b1039ca6033f307995a65/k2/csrc/ragged_inl.h#L111
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#181 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAZFLO7OXHPBYV3XLYRGGP3SJATI3ANCNFSM4R2U4NOQ>
.
|
... for instance, the way it is now would require adding extra code in the
implementation of ArcSort (else it wouldn't work
for FsaVec).
…On Sun, Oct 4, 2020 at 4:41 PM Daniel Povey ***@***.***> wrote:
Guys, IMO it would have been more convenient to allow SortSublists to
accept arrays with any num-axes but operate on the last one.
So it would access the last row-ids or row-splits.
Not urgent though.
On Sun, Oct 4, 2020 at 3:58 PM Fangjun Kuang ***@***.***>
wrote:
> ***@***.**** commented on this pull request.
> ------------------------------
>
> In k2/csrc/ragged_inl.h
> <#181 (comment)>:
>
> > @@ -95,6 +100,35 @@ Ragged<T> RandomRagged(T min_value, T max_value, int32_t min_num_axes,
> return Ragged<T>(shape, values);
> }
>
> +template <typename T, typename Op /* = LessThan<T> */>
> +void SortSublists(Ragged<T> *src, Array1<int32_t> *order) {
> + K2_DCHECK(IsCompatible(src->values, *order));
> + K2_DCHECK_EQ(src->values.Dim(), order->Dim());
> + K2_DCHECK_EQ(src->Context()->GetDeviceType(), kCuda)
> + << "It supports only CUDA at present";
> +
> + std::unique_ptr<mgpu::context_t> context =
> + GetModernGpuAllocator(src->Context()->GetDeviceId());
> +
> + Array1<int32_t> &segment = src->shape.RowSplits(src->NumAxes() - 1);
>
> it's better to add K2_CHECK_EQ(src->shape.NumAxes() == 2 at entry as we
> just sort values in src
>
> @qindazhu <https://github.com/qindazhu>
> fixed in #218 <#218>
>
>
> https://github.com/k2-fsa/k2/blob/d291daf7a88d50907d2b1039ca6033f307995a65/k2/csrc/ragged_inl.h#L111
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#181 (comment)>, or
> unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AAZFLO7OXHPBYV3XLYRGGP3SJATI3ANCNFSM4R2U4NOQ>
> .
>
|
This is what the current implementation does, except that it requires a NON-empty input. |
Closes #176