[SYCL] Disable vectorization and loop transformation passes #2458

Merged: bader merged 31 commits into intel:sycl from loop-opts on May 12, 2021

bader (Contributor) commented Sep 10, 2020

No description provided.

Loop unrolling in "SYCL optimization mode" uses the default heuristic, which
is tuned for CPUs and might not be profitable for other devices.
This change seems to hide issues with broadcast tests on CPU.
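
For readers less familiar with the transformation, here is a minimal sketch in plain C++ of what loop unrolling does. It is not SYCL device code and is not taken from this patch; the function names are illustrative. The unrolled form performs fewer loop-condition checks at the cost of more code and, typically, more live values, which is the kind of trade-off a CPU-tuned heuristic may weigh differently than another device would want.

```cpp
#include <cstddef>
#include <vector>

// Rolled form: one addition and one loop-condition check per element.
int sum_rolled(const std::vector<int>& v) {
    int acc = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        acc += v[i];
    }
    return acc;
}

// Hand-unrolled by 4: one loop-condition check per four elements,
// followed by a remainder loop for the last (size % 4) elements.
int sum_unrolled_by_4(const std::vector<int>& v) {
    const std::size_t n = v.size();
    const std::size_t main_end = n - (n % 4);
    int acc = 0;
    std::size_t i = 0;
    for (; i < main_end; i += 4) {
        acc += v[i];
        acc += v[i + 1];
        acc += v[i + 2];
        acc += v[i + 3];
    }
    for (; i < n; ++i) {
        acc += v[i];
    }
    return acc;
}
```

Both functions return the same value for any input; only the shape of the generated loop differs.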

bader commented Sep 10, 2020

/summary:run

bader added the "performance" (Performance related issues) label Sep 25, 2020
bader marked this pull request as ready for review September 25, 2020 14:53
bader requested a review from a team as a code owner September 25, 2020 14:53

bader commented Sep 25, 2020

/summary:run

bader commented Sep 26, 2020

/summary:run

bader commented Sep 28, 2020

/summary:run

bader commented Sep 29, 2020

/summary:run

bader commented Sep 29, 2020

/summary:run

MrSidims previously approved these changes Oct 5, 2020

bader commented Oct 5, 2020

This PR exposes a regression in test_stream from Khronos SYCL-CTS on GPU.
The issue is addressed in https://github.com/intel/compute-runtime/releases/tag/20.39.17972, so we need to update the driver first.

bader commented Apr 26, 2021

@DenisBakhvalov, once we removed the special flag for ESIMD mode, this patch disables loop optimization transformations for ESIMD mode as well, and there are a few failures in ESIMD-specific tests. Could you suggest a way to fix them?

bader commented Apr 26, 2021

/summary:run

bader commented Apr 28, 2021

/summary:run

bader added a commit to bader/llvm-test-suite that referenced this pull request Apr 29, 2021

bader commented Apr 29, 2021

> @DenisBakhvalov, once we removed the special flag for ESIMD mode, this patch disables loop optimization transformations for ESIMD mode as well, and there are a few failures in ESIMD-specific tests. Could you suggest a way to fix them?

Recent fixes and a new GPU driver for Windows resolved all ESIMD-specific issues except for 3 failing llvm-test-suite tests on Windows (no failures on Linux).

Failed Tests (3):
SYCL :: ESIMD/private_memory/pm_access_1.cpp
SYCL :: ESIMD/private_memory/pm_access_2.cpp
SYCL :: ESIMD/private_memory/pm_access_3.cpp

[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi1EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi1EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi1EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi1EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi2EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi2EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi2EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi2EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi2EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi2EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi2EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi2EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi3EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi3EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi3EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi3EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi3EEEE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi3EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi3EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi3EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi3EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] warning: GenXPromoteArray: _ZTS8KernelIDILi3EE allocation size is too big: using TPM
[2021-04-28T20:30:26.851Z] LLVM ERROR: Cannot find pointer replacement 

@NikitaRudenkoIntel, do you know if this is a known issue in the GPU compiler? I can't find where this diagnostic could be reported in DPC++.

NikitaRudenkoIntel (Contributor) commented Apr 29, 2021

Hi, yes there are some issues in TPM. Actually, there is even a ticket with these exact tests and this exact failure. It is fixed. Can you check if your driver is up to date?

bader commented Apr 29, 2021

> Can you check if your driver is up to date?

We are using 27.20.100.9466 from https://downloadmirror.intel.com/30381/a08/igfx_win10_100.9466.zip.
Is there a more recent publicly available version?

bader commented Apr 29, 2021

/summary:run

bader commented Apr 29, 2021

> Hi, yes there are some issues in TPM. Actually, there is even a ticket with these exact tests and this exact failure. It is fixed. Can you check if your driver is up to date?

I managed to get past this issue, probably thanks to the recent optimization pipeline adjustments.
We can get back to the driver version question if we encounter it again.

bader commented May 12, 2021

I looked at the regressions, and all of them are issues in external dependencies: OpenCL CPU, Level Zero GPU, and llvm-test-suite tests.
I'll address the llvm-test-suite test issue as soon as this patch is merged. Low-level runtime issues should be addressed by updating the corresponding runtimes.

@mdtoguchi, @AGindinson, @intel/llvm-reviewers-runtime, please review this change.

bader requested a review from MrSidims May 12, 2021 12:03

bader commented May 12, 2021

Sorry, I didn't notice that there are no runtime changes anymore, so I only need a review from Mike or Artem.

mdtoguchi (Contributor) left a comment

Driver OK

bader merged commit ff6929e into intel:sycl May 12, 2021
bader deleted the loop-opts branch May 12, 2021 18:40
againull pushed a commit to intel/llvm-test-suite that referenced this pull request May 14, 2021
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Mar 27, 2023
Chenyang-L pushed a commit that referenced this pull request Feb 18, 2025
[benchmarks] add ability to filter benchmarks by suite
Labels
performance Performance related issues
5 participants