Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from MaxMood96:main #153

Merged
merged 35 commits into from
Oct 14, 2021
Merged

[pull] main from MaxMood96:main #153

merged 35 commits into from
Oct 14, 2021

Conversation

pull[bot]
Copy link

@pull pull bot commented Oct 14, 2021

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

benshi001 and others added 30 commits October 14, 2021 02:24
Use LUI+SLLI.UW to compose the upper bits instead of LUI+SLLI.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111705
The 24-bit mul intrinsics yields the low-order 32 bits. We should only
do the transformation if the operands are known to be not wider than 24
bits and the result is known to be not wider than 32 bits.

Differential Revision: https://reviews.llvm.org/D111523
Opitimize immediate materialisation in the following way if profitable:
1. Use BCLRI for upper 32 bits if the lower 32 bits are negative int32.
2. Use BSETI for upper 32 bits if the lower 32 bits are positive int32.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111508
Remove unsused variable that break Werror on some buildbots
gcc does not support __has_feature(), so this was accidentally changed
in D111581 when compiling with gcc.
Check lightweight getter condition before calling all_of.
Replace check with
    if ((ExitIfTrue && CI->isZero()) || (!ExitIfTrue && CI->isOne()))
with equivalent and simpler version
    if (ExitIfTrue == CI->isZero())
When we know the bounds of the array, print any embedded nuls instead of
treating them as terminators. An exception to this rule is made for the
nul character at the very end of the string. We don't print that, as
otherwise 99% of the strings would end in \0. This way the strings
usually come out the same as how the user typed it into the compiler
(char foo[] = "with\0nuls"). It also matches how they come out in gdb.

This resolves a FIXME left from D111399, and leaves another FIXME for dealing
with nul characters in "escape-non-printables=false" mode. In this mode the
characters cause the entire summary string to be terminated prematurely.

Differential Revision: https://reviews.llvm.org/D111634
… file

This makes the compiler generated code for accessing the thread local
variable much simpler (no need for wrapper functions and weak pointers
to potential init functions), and can avoid toolchain bugs regarding how
to access TLS variables.

In particular, this fixes LLDB when built with current GCC/binutils for
MinGW, see msys2/MINGW-packages#8868.

Differential Revision: https://reviews.llvm.org/D111779
This patch fixes the bug that consisted of treating variable / immediate
length mem operations (such as memcpy, memset, ...) differently. The variable
length case needs to have the length minus 1 passed due to the use of EXRL
target instructions. However, the DAGCombiner can convert a register length
argument into a constant one, and whenever that happened one byte too little
would end up being performed.

This is also a refactorization by reducing the number of opcodes and variants
involved. For any opcode (variable or constant length), only the length minus
one is passed on to the ISD node. The rest of the logic is now instead
handled during isel pseudo expansion.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D111729
This reverts 3562076 and includes some refactoring as well.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D111733
After removing the last LinalgOps that have no region attached we can verify there is a region. The patch performs the following changes:
- Move the SingleBlockImplicitTerminator trait further up the the structured op base class.
- Adapt the LinalgOp verification since the trait only check if there is 0 or 1 block.
- Introduce a getBlock method on the LinalgOp interface.
- Access the LinalgOp body using either getBlock() or getBody() if the concrete operation type is known.

This patch is a follow up to https://reviews.llvm.org/D111233.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111393
This patch replaces all uses of std::vector with llvm::SmallVector in the flang-omp-report plugin.
This is a one of several patches focusing on switching containers from STL to LLVM's ADT library.

Reviewed By: Leporacanthicus

Differential Revision: https://reviews.llvm.org/D111709
Setting the nofold attribute enables packing an operand. At the moment, the attribute is set by default. The pack introduces a callback to control the flag.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111718
Fix assert crash when an unregistered dialect op is encountered during
parsing and `-allow-unregistered-dialect' isn't on. Instead, emit an
error.

While on this, clean up "registered" vs "loaded" on `getDialect()` and
local clang-tidy warnings.

https://llvm.discourse.group/t/assert-behavior-on-unregistered-dialect-ops/4402

Differential Revision: https://reviews.llvm.org/D111628
Some functions get opted out of instruction referencing if they're being
compiled with no optimisations, however the LiveDebugValues pass picks one
implementation and then sticks with it through the rest of compilation.
This leads to a segfault if we encounter a function that doesn't use
instr-ref (because it's optnone, for example), but we've already decided
to use InstrRefBasedLDV which expects to be passed a DomTree.

Solution: keep both implementations around in the pass, and pick whichever
one is appropriate to the current function.
MemRefType was using a wrong `isa` function in the bindings code, which
could lead to invalid IR being constructed. Also run the verifier in
memref dialect tests.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111784
Improve support for variadic regions in ODS-generated operation view classes.
In particular, make generated constructors take an extra argument that
specifies the number of variadic regions if the operation has them. Previously,
there was no mechanism to specify a non-zero number of variadic regions. Also
generate named accessors to regions.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D111783
2 returns, one after the other - reported by coverity
…> i32

Without SSE41 sext/zext instructions the extensions will be split, meaning that the MUL->PMADDWD fold will split the sext_i32(x) into zext_i32(sext_i16(x))
…ntFuncRef.get() calls. NFCI.

Fixes scan-build warning about dead initialization
…T::i8 indices. NFCI.

Avoids unused assignment scan-build warning.
The patch attempts to optimize a sequence of SIMD loads from the same
base pointer:

    %0 = gep float*, float* base, i32 4
    %1 = bitcast float* %0 to <4 x float>*
    %2 = load <4 x float>, <4 x float>* %1
    ...
    %n1 = gep float*, float* base, i32 N
    %n2 = bitcast float* %n1 to <4 x float>*
    %n3 = load <4 x float>, <4 x float>* %n2

For AArch64 the compiler generates a sequence of LDR Qt, [Xn, #16].
However, 32-bit NEON VLD1/VST1 lack the [Wn, #imm] addressing mode, so
the address is computed before every ld/st instruction:

    add r2, r0, #32
    add r0, r0, #16
    vld1.32 {d18, d19}, [r2]
    vld1.32 {d22, d23}, [r0]

This can be improved by computing address for the first load, and then
using a post-indexed form of VLD1/VST1 to load the rest:

    add r0, r0, #16
    vld1.32 {d18, d19}, [r0]!
    vld1.32 {d22, d23}, [r0]

In order to do that, the patch adds more patterns to DAGCombine:

  - (load (add ptr inc1)) and (add ptr inc2) are now folded if inc1
    and inc2 are constants.

  - (or ptr inc) is now recognized as a pointer increment if ptr is
    sufficiently aligned.

In addition to that, we now search for all possible base updates and
then pick the best one.

Differential Revision: https://reviews.llvm.org/D108988
fhahn and others added 5 commits October 14, 2021 14:03
Running -vector-combine early can introduce new vector operations,
blocking loop/SLP vectorization. The added test case could be better
optimized by the SLPVectorizer if no new vector operations are added
early.
These registers are used as operands for instructions that expect an
integer register, so they should be added to Int32Regs or Int64Regs
register classes. Otherwise the machine verifier emits an error for
the following LIT tests when LLVM_ENABLE_MACHINE_VERIFIER=1
environment variable is set:

*** Bad machine code: Illegal physical register for instruction ***
- function:    kernel_func
- basic block: %bb.0 entry (0x55c8903d5438)
- instruction: %3:int64regs = LEA_ADDRi64 $vrframelocal, 0
- operand 1:   $vrframelocal
$vrframelocal is not a Int64Regs register.

    CodeGen/NVPTX/call-with-alloca-buffer.ll
    CodeGen/NVPTX/disable-opt.ll
    CodeGen/NVPTX/lower-alloca.ll
    CodeGen/NVPTX/lower-args.ll
    CodeGen/NVPTX/param-align.ll
    CodeGen/NVPTX/reg-types.ll
    DebugInfo/NVPTX/dbg-declare-alloca.ll
    DebugInfo/NVPTX/dbg-value-const-byref.ll

Differential Revision: https://reviews.llvm.org/D110164
This commit adds the system reg/regpair definitions and the corresponding
register transfer instructions.
This patch is very similar to D110173 / a3936a6, but for variable
values rather than machine values. This is for the second instr-ref
problem, calculating the correct variable value on entry to each block.
The previous lattice based implementation was broken; we now use LLVMs
existing PHI placement utilities to work out where values need to merge,
then eliminate un-necessary ones through value propagation.

Most of the deletions here happen in vlocJoin: it was trying to pick a
location for PHIs to happen in, badly, leading to an infinite loop in the
MIR test added, where it would repeatedly switch between register
locations. The new approach is simpler: either PHIs can be eliminated, or
they can't, and the location of the value is a different problem.

Various bits and pieces move to the header so that they can be tested in
the unit tests. The DbgValue class grows a "VPHI" kind to represent
variable value PHIS that haven't been eliminated yet.

Differential Revision: https://reviews.llvm.org/D110630
@pull pull bot added the ⤵️ pull label Oct 14, 2021
@pull pull bot merged commit b5426ce into Mement-Mori:main Oct 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.