-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Combine and optimize mad!
and mad
#839
Conversation
Bump, this seems pretty clearcut to me. |
Maybe a good idea to add a test that counts the allocations of this case. |
Trying to understand the code. Is |
No. |
Can add a rough test, but this calls |
And then, next question, if the center is float and the array is integer, how can |
Good catch. For collections that are not AbstractArrays but do have concrete eltypes that cannot hold their own middle this will error. I'll have to make a larger diff to cover all cases without any regressions. |
I don't think there is an issue with that. There is another branch https://github.com/JuliaStats/StatsBase.jl/pull/839/files#diff-f1f51914b0e1897b4ede4d6024d011c6aa91c7e74e483bae63a33f1e892d8eb9R545 that doesn't work in-place if the element type isn't concrete. |
The counterexample is |
There is also no point in collecting anything into x2 if c is provided. Maybe it's better to rewrite than to repair. |
I also arrived at that conclusion and have done a rewrite. Indeed, I have avoided |
I'd like to rebase this after #841 so that the diff can be more clear here. |
mad!
and mad
dcd6603
to
1569471
Compare
Less source code, more tests, more concrete types, fewer allocations, faster runtime, and hopefully not terribly buggy :) What more can you ask for? This is ready for review! |
c = center === nothing ? median!(x) : center | ||
T = promote_type(typeof(c), eltype(x)) | ||
U = eltype(x) | ||
x2 = U == T ? x : isconcretetype(U) && isconcretetype(T) && sizeof(U) == sizeof(T) ? reinterpret(T, x) : similar(x, T) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me just understand this: If U
is an Int and T
is Float64
, this would allow to overwrite the Int
array by reinterpreting it filling it with intermediate floats, but x
interpreted as Ints is garbage afterwords. We should warn that x
cannot be used afterwards in mad!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring begins with "Compute the median absolute deviation (MAD) of array x around center (by default, around the median), overwriting x in the process." so this warning does exist.
x
can be reused, it has simply been overwritten with gibberish.
Previously, mad!
began with x .= abs.(x .- center)
and then partially sorted x
, which makes the contents of x
fairly unusable, though not quite as bizarre as x
can end up now.
We can make the warning more visible if you'd like, but it's also contained in the function name itself. mad!
clearly mutates its input, the warning is just that it probably mutates it in a way you aren't expecting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I am happy with "overwriting x in the process"
Is there anything else needed from me here? |
No, if there is no protest I will merge this. |
Generators don't have eltype (have eltype Any) so this explicitly makes an
Any[]
which, ironically, isn't even reused in the following code due to the isconcretetype check. A standardcollect
is better.