-
Notifications
You must be signed in to change notification settings - Fork 756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Fix regression introduced by f537293/part of #1018 fix. #1019
Conversation
This fixes the immediate problem, where diagnostics were being missed in kernels. The problem ends up being that diagnostics can happen at three different times during translation for templates: 1- During Phase 1 2- During Phase 2 instantiation 3- During ConstructOpenCLKernels. f537293 suppressed 3, and did a better job at suppressing 1. However, CallExprs are always rebuilt during instantation (for technical reasons). The result was the variadic call was being diagnosed in all 3 while developing f537293. I erronously thought the best way was to simply wait for instantiation to diagnose all SYCL diagnostics. Unfortunately, this was short-sighted. Not everything in tree-transform gets rebuilt every time. One perfect example is the inline-asm (see the added example). It doesn't get rebuilt in some cases, so it was never diagnosed during instantiation. Because I had fixed Phase 1/ConstructOpenCLKernels, this left no time for it to be diagnosed! The fix was to keep suppressing 3, and simply fix the call-expr check by moving the check to Sema::checkCall. This is how Clang works around this issue for warnings. The downside is that the diagnostics will happen slightly later for this (see the sycl-restrict.cpp test, which reverted to the previous version before f537293), but invalid programs still won't be accepted. Signed-off-by: Erich Keane <erich.keane@intel.com>
kernelFunc(); | ||
a_type ab; | ||
a_type *p; | ||
// expected-note@+1 {{called by 'kernel_single_task'}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really get why these notes are moving...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we are now diagnosing earlier. We aren't instantiating the template, we are doingil it during phase 1, so we don't know the call stack yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still is a bit unclear for me, but it's because I don't really know a lot about templates instantiating in clang, I suppose.
It seems like we previously added note above the kernelFunc()
call, but now we are adding note (for same error, I suppose?) above the use2()
call, because we only know call stack for path through use2
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I believe that is because we're just hitting a different error 'first', and thus getting the dump from it. This file is a 'revert' of the changes I made to this in the last patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I understand that it's revert, but I'm afraid it's something wrong here too. So, doesn't that mean that we are still missing some error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of? Its more that the call-stack now is just different because we got there in a different way. We have a deferred diagnostic for that exception throw, and now the order that we're visiting the functions called by the kernel aren't the same. If you comment out the call to eh_not_ok in use2, it blames it on the call to useage. I think we're just doing a 'bredth-first' descent down the functions, so it is catching the 'shorter' path first (and thus taking up the 'delayed diagnostic'.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its more that the call-stack now is just different because we got there in a different way.
It's what I tried to understand :)
Just was a bit confused by "hitting a different error 'first'" phrase in your previous comment.
Now, let's consider the following example (sorry for typos, writing on the fly):
// let's have regular "sycl_kernel" function
template <typename name, typename Func>
void kernel_single_task(Func kernelFunc) {
kernelFunc();
}
// and template function
template<typename T> void some_foo(T t) {
// Just do some bad things
throw t; // or throw 3 if it's wrong, whatever
}
// And call our template function with bag things inside only from a kernel:
kernel_single_task<class fake_kernel>([]() { some_foo(3); });
Will this test case diagnosed through delayed diagnostics after the current patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this example would print the call stack kernel_single_task/operator()/some_foo. Since some_foo isn't a 'sycl_kernel' function (I'm imagining kernel_single_task has the attribute?), its diagnostic is delayed. The initial pass of kernel_single task doesn't know that it is going to be called with something that throws, so it doesn't diagnose.
That said, i tried your example, and it diagnoses 2x, so I want to figure out why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah... it happens 2x because we're checking 'throw' without checking to see whether it is dependent. So we see it in both phases of the template. I don't know if openmp or CUDA have the same problem, or if they've worked around it. I'm guessing this: #1018 becomes more important to figure out now.
Signed-off-by: Erich Keane <erich.keane@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
* Fix SPIR-V friendly IR for OpGenericCastToPtr instruction This instruction was translated directly to OCL even if SPIR-V friendly LLVM IR was requested. Original commit: KhronosGroup/SPIRV-LLVM-Translator@76c76ef
This fixes the immediate problem, where diagnostics were being missed
in kernels. The problem ends up being that diagnostics can happen
at three different times during translation for templates:
1- During Phase 1
2- During Phase 2 instantiation
3- During ConstructOpenCLKernels.
f537293 suppressed 3, and did a better job at suppressing 1. However,
CallExprs are always rebuilt during instantation (for technical
reasons). The result was the variadic call was being diagnosed in all 3
while developing f537293. I erronously thought the best way was to
simply wait for instantiation to diagnose all SYCL diagnostics.
Unfortunately, this was short-sighted. Not everything in tree-transform
gets rebuilt every time. One perfect example is the inline-asm (see the
added example). It doesn't get rebuilt in some cases, so it was never
diagnosed during instantiation. Because I had fixed Phase
1/ConstructOpenCLKernels, this left no time for it to be diagnosed!
The fix was to keep suppressing 3, and simply fix the call-expr check by
moving the check to Sema::checkCall. This is how Clang works around this
issue for warnings. The downside is that the diagnostics will happen
slightly later for this (see the sycl-restrict.cpp test, which reverted
to the previous version before f537293), but invalid programs still
won't be accepted.
Signed-off-by: Erich Keane erich.keane@intel.com