Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from MaxMood96:main #736

Merged
merged 25 commits into from
Oct 6, 2022
Merged

[pull] main from MaxMood96:main #736

merged 25 commits into from
Oct 6, 2022

Conversation

pull[bot]
Copy link

@pull pull bot commented Oct 6, 2022

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

wrengr and others added 25 commits October 5, 2022 16:14
Handle more cases of singleton DLT including direct sparse2sparse conversion.  (Followup to D134096)

Depends On D134926

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D134933
If the scaling factor is by 1 with no offset or border, then the
resize is a no-op.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D135329
In HIP a library is usually compiled with default target ID e.g. gfx906 so that
it can be used in all GPU configurations. The bitcode is saved in bundled
bitcode with gfx906 in entry ID.

In runtime compilation, a HIP program is compiled with a target ID matching
the GPU configuration, e.g. gfx906:xnack-. This program needs to link with
a library bundled bitcode with target ID gfx906.

For example:

  clang --offload-arch=gfx906 -o lib.o lib.hip
  clang --offload-arch=gfx906:xnack- program.hip lib.o

This common use case requires that clang-offlod-bundler to be able to extract
entry with compatible target ID, e.g. extracting an gfx906 entry when requesting
gfx906:xnack-.

Currently clang-offload-bundler only allow extracting entry with exact match
of target ID. This patch relaxes that so that it can extract entries with compatible
target ID.

Reviewed by: Artem Belevich, Saiyedul Islam

Differential Revision: https://reviews.llvm.org/D134546
Fix typo in error message. No other changes

Differential Revision: https://reviews.llvm.org/D135318
vector.extract, if the result is a scalar) only if all reduction
dimensions are of size 1.

Differential Revision: https://reviews.llvm.org/D135333
Invariant loads can always be sunk.

Reviewed By: foad, arsenm

Differential Revision: https://reviews.llvm.org/D135133
…cates

This differential adjusts the numeric values for DimLevelType values: using the low-order two bits for recording the "No" and "Nu" properties, and the high-order bits for the formats per se.  (The choice of encoding may seem a bit peculiar, since the bits are mapped to negative properties rather than positive properties.  But this was done in order to preserve the collation order of DimLevelType values.  If we don't care about collation order, then we may prefer to flip the semantics of the property bits, so that they're less surprising to readers.)

Using distinguished bits for the properties and formats enables faster implementation for the predicates detecting those properties/formats, which matters because this is in the runtime library itself (rather than on the codegen side of things).  This differential pushes through the changes to the enum values, and optimizes the basic predicates.  However it does not optimize all the places where we check compound predicates (e.g., "is compressed or singleton"), to help reduce rebasing conflict with D134933.  Those optimizations will be done after this differential and D134933 are landed.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135004
The patch introduces reading the attributes of kernel arguments both from
function-attached and module-level metadata, during kernel arguments lowering.
Two tests are added to show the improvement.

Differential Revision: https://reviews.llvm.org/D135106

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey.tretyakov@mail.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
This is a followup to D135004, to correct one of the tests that didn't get caught by the buildbot.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135336
The ConvertVectorToGpu pass implementation contained a small private
support library for performing various calculations during conversion
between `vector` and `nvgpu.mma.sync` and `nvgpu.ldmatrix` operations.
The support library is moved under `Dialect/NVGPU/Utils` because the
functions have wider utility. Some documentation comments are added or
improved.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D135303
NOTE: this is probably not the long term organization
      that you want to keep after the refactoring to new
      directories, but this fixes the breakage for now;
      I leave proper refactoring of build to the NVGPU
      bazel team.

Differential Revision: https://reviews.llvm.org/D135344
The llvm-driver, enabled with LLVM_TOOL_LLVM_DRIVER_BUILD combines many llvm executables
into one to save overall toolchain size. This patch adds a few more llvm tools to the
llvm-driver.

Differential Revision: https://reviews.llvm.org/D135281
Evidently * and [] are mangled differently by MSVC...
…n-place

Inserting a tensor into an equivalent tensor is a no-op after bufferization. No alloc is needed.

Differential Revision: https://reviews.llvm.org/D132662
This symbol shouldn't have default visibility.

Differential Revision: https://reviews.llvm.org/D135346
Previously we would be unable to legalize V2S16 BUILD_VECTOR_TRUNC on GFX8 & below as the custom legalization was missing.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D135149
This should be NFC and ensure the sign of the constraint is used
consistently in the future.
We can still get a NaN even if none of the operands are NaN,
e.g. from +inf/-inf. D50804 didn't catch that.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D134854
Remove the ability to disable opaque pointers by default in clang.
It is still possible to explicitly disable them via cc1
-no-opaque-pointers.

Differential Revision: https://reviews.llvm.org/D135259
This is in preparation for D94363, as we will need AAQI to
perform the recursive call to the function variant.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.