Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from llvm:main #256

Closed
wants to merge 1,885 commits into from
Closed

[pull] main from llvm:main #256

wants to merge 1,885 commits into from

Conversation

pull[bot]
Copy link

@pull pull bot commented Nov 9, 2021

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

Chen Zheng and others added 30 commits November 5, 2021 10:04
…ze object. NFCI

This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in lib/External/isl/include/isl/isl-noxceptions.h and the official isl C++ interface.
In the official interface the type `isl::size` cannot be casted to an unsigned without previously having checked if it contains a valid value with the function `isl::size::is_error()`.
For this reason two helping functions have been added:
 - `IslAssert`: assert that no errors are present in debug builds and just disables the mandatory error check in non-debug builds
 - `unisgnedFromIslSIze`: cast the `isl::size` object to `unsigned`

Changes made:
 - Add the functions `IslAssert` and `unsignedFromIslSize`
 - Add the utility function `rangeIslSize()`
 - Retype `MaxDisjunctsInDomain` from `int` to `unsigned`
 - Retype `RunTimeChecksMaxAccessDisjuncts` from `int` to `unsigned`
 - Retype `MaxDimensionsInAccessRange` from `int` to `unsigned`
 - Replaced some usages of `isl_size` to `unsigned` since we aim not to use `isl_size` anymore
 - `isl-noexceptions.h` has been generated by patacca/isl@e704f73

No functional change intended.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113101
This patch fleshes out the missing documentation for two of the VP
intrinsics introduced in D99355: `llvm.vp.load` and `llvm.vp.store`. It
does so mostly by deferring to the `llvm.masked.load` and
`llvm.masked.store` intrinsics, respectively.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D112930
Fixes one of the regressions in D113192
…esence of segment attribute

The ODS-based Python op bindings generator has been generating incorrect
specification of the operand segment in presence if both optional and variadic
operand groups: optional groups were treated as variadic whereas they require
separate treatement. Make sure it is the case. Also harden the tests around
generated op constructors as they could hitherto accept the code for both
optional and variadic arguments.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D113259
This patch fleshes out the missing documentation for the final two VP
intrinsics introduced in D99355: `llvm.vp.gather` and `llvm.vp.scatter`.
It does so mostly by deferring to the `llvm.masked.gather` and
`llvm.masked.scatter` intrinsics, respectively.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D112997
…tterns

The `fir.select` and `fir.select_rank` are lowered to llvm.switch.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D113089

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
…> bitcast(bitlogic(x,y))

To constant fold bitwise logic ops where we've legalized constant build vectors to a different type (e.g. v2i64 -> v4i32), this patch adds a basic ability to peek through the bitcasts and perform the constant fold on the inner operands.

The MVE predicate v2i64 regressions will be addressed by future support for basic v2i64 type support.

One of the yak shaving fixes for D113192....

Differential Revision: https://reviews.llvm.org/D113202
This symbol is defined in libc.so so it is definitely not DSO-Local.
Marking it as such causes problems on some platforms (such as PowerPC).

Differential revision: https://reviews.llvm.org/D109090
A pattern has selected wrong uaddlv MI. It should be as below.

uaddv(uaddlp(v8i8)) ==> uaddlv(v8i8)

Differential Revision: https://reviews.llvm.org/D113263
NumOps represents the number of elements for vector constant folding, rename this NumElts so in future we can the consistently use NumOps to represent the number of operands of the opcode.

Minor cleanup before trying to begin generalizing FoldConstantArithmetic to support opcodes other than binops.
[NFC] This patch fixes URLs containing "master". Old URLs were either broken or
redirecting to the new URL.

Reviewed By: #libc, ldionne, mehdi_amini

Differential Revision: https://reviews.llvm.org/D113186
When created a UUNPKLO/HI node with an undef input then the
output should also be undef. I've added a target DAG combine
function to ensure we avoid creating an unnecessary uunpklo/hi
instruction.

Differential Revision: https://reviews.llvm.org/D113266
Avid readers of this saga may recall from previous installments,
that replication mask replicates (lol) each of the `VF` elements
in a vector `ReplicationFactor` times. For example, the mask for
`ReplicationFactor=3` and `VF=4` is: `<0,0,0,1,1,1,2,2,2,3,3,3>`.
More importantly, replication mask is used by LoopVectorizer
when using masked interleaved memory operations.

As discussed in previous installments, while it is used by LV,
and we **seem** to support masked interleaved memory operations on X86,
it's support in cost model leaves a lot to be desired:
until basically yesterday even for AVX512 we had no cost model for it.

As it has been witnessed in the recent
AVX2 `X86TTIImpl::getInterleavedMemoryOpCost()`
costmodel patches, while it is hard-enough to query the cost
of a particular assembly sequence [from llvm-mca],
afterwards the check lines LV costmodel tests must be updated manually.
This is, at the very least, boring.

Okay, now we have decent costmodel coverage for interleaving shuffles,
but now basically the same mind-killing sequence has to be performed
for replication mask. I think we can improve at least the second half
of the problem, by teaching
the `TargetTransformInfoImplCRTPBase::getUserCost()` to recognize
`Instruction::ShuffleVector` that are repetition masks,
adding exhaustive test coverage
using `-cost-model -analyze` + `utils/update_analyze_test_checks.py`

This way we can have good exhaustive coverage for cost model,
and only basic coverage for the LV costmodel.

This patch adds precise undef-aware `isReplicationMask()`,
with exhaustive test coverage.
* `InstructionsTest.ShuffleMaskIsReplicationMask` shows that
   it correctly detects all the known masks.
* `InstructionsTest.ShuffleMaskIsReplicationMask_undef`
  shows that replacing some mask elements in a known replication mask
  still allows us to recognize it as a replication mask.
  Note, with enough undef elts, we may detect a different tuple.
* `InstructionsTest.ShuffleMaskIsReplicationMask_Exhaustive_Correctness`
  shows that if we detected the replication mask with given params,
  then if we actually generate a true replication mask with said params,
  it matches element-wise ignoring undef mask elements.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D113214
(X s< 0) ? Y : 0 --> (X s>> BW-1) & Y

We canonicalize to the icmp+select form in IR, and we already have this fold
for scalar select in SDAG, so I think it's an oversight that we don't have
the fold for vectors. It seems neutral for AArch64 and saves some instructions
on x86.

Whether we should also have the sibling folds for the inverse condition or
all-ones true value may depend on target-specific factors such as whether
there's an "and-not" instruction.

Differential Revision: https://reviews.llvm.org/D113212
…t i1

Even though AVX512's masked mem ops (unlike AVX1/2) have a mask
that is a `VF x i1`, replication of said masks happens after
promotion of it to `VF x i8`, so we should use `i8`, not `i1`,
when calculating the cost of mask replication.
Another minor step towards merging FoldConstantVectorArithmetic into FoldConstantArithmetic.

We don't use SDNodeFlags in any constant folding inside DAG, so passing the Flags argument is a waste of time - an alternative would be to wire up FoldConstantArithmetic to take SDNodeFlags just-in-case we someday start using it, but we don't have any way to test it and I'd prefer to avoid dead code.

Differential Revision: https://reviews.llvm.org/D113276
This introduces a new ComputeMinSignedBits method for ValueTracking that
returns the BitWidth - SignBits + 1 from ComputeSignBits, and represents
the minimum bit size for the value as a signed integer.  Similar to the
existing APInt::getMinSignedBits method, this can make some of the
reasoning around ComputeSignBits more natural.

See https://reviews.llvm.org/D112298
This test became much slower after 01d8759
This is possible after D106314 / 8773822.

Makes the required prepare-code-coverage-artifact.py invocation a bit longer,
but that seems like a good tradeoff.

Differential Revision: https://reviews.llvm.org/D113282
JonChesterfield and others added 26 commits November 8, 2021 20:28
Like rLLD354040.

Before: `error: unrecognized relocation Unknown (254)`
Now:    `error: unknown relocation (254) against symbol foo`
When an array's shape involves references to symbols that are not
invariant in a scope -- the classic example being a dummy array
with an explicit shape involving other dummy arguments -- the
compiler was creating shape expressions that referenced those
symbols.  This might be valid if those symbols are somehow
captured and copied at each entry point to a subprogram, and
the copies referenced in the shapes instead, but that's not
the case.

This patch introduces a new expression predicate IsScopeInvariantExpr(),
which defines a class of expressions that contains constant expressions
(in the sense that the standard uses that term) as well as references
to items that may be safely accessed in a context-free way throughout
their scopes.   This includes dummy arguments that are INTENT(IN)
and not VALUE, descriptor inquiries into descriptors that cannot
change, and bare LEN type parameters within the definitions of
derived types.  The new predicate is then used in shape analysis
to winnow out results that would have otherwise been contextual.

Differential Revision: https://reviews.llvm.org/D113309
Generate static function for matching the type/attribute to reduce the
memory footprint.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D110199
…ates.

- CUDA cannot associate memory space with pointer types. Even though Clang could add extra attributes to specify the address space explicitly on a pointer type, it breaks the portability between Clang and NVCC.
- This change proposes to assume the address space from a pointer from the assumption built upon target-specific address space predicates, such as `__isGlobal` from CUDA. E.g.,

```
  foo(float *p) {
    __builtin_assume(__isGlobal(p));
    // From there, we could assume p is a global pointer instead of a
    // generic one.
  }
```

This makes the code portable without introducing the implementation-specific features.

Note that NVCC starts to support __builtin_assume from version 11.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D112041
Merge definition visibility the same way we do for other decls. Without
the fix the added test emits `-Wobjc-method-access` as it cannot find a
visible protocol. Make this warning `-Werror` so the test would fail
when protocol visibility regresses.

rdar://83600696

Differential Revision: https://reviews.llvm.org/D111860
This ensures that the c++ test gets the right CXXFLAGS if required.
At this point, every supported compiler that claims a -std=c++17 mode
should also support `if constexpr`. This was an issue for GCC 5
and GCC 6, but hasn't been an issue since GCC 7. (Our current
minimum supported GCC version, IIUC, is GCC 10 or 11.)

Differential Revision: https://reviews.llvm.org/D113348
Currently, LOAD_STACK_GUARD on ARM is only implemented for Mach-O targets, and
other targets rely on the generic support which may result in spilling of the
stack canary value or address, or may cause it to be kept in a callee save
register across function calls, which means they essentially get spilled as
well, only by the callee when it wants to free up this register.

So let's implement LOAD_STACK GUARD for other targets as well. This ensures
that the load of the stack canary is rematerialized fully in the epilogue.

This code was split off from

  D112768: [ARM] implement support for TLS register based stack protector

for which it is a prerequisite.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D112811
getting the tls base address. unlike linux arm64, the tpidr_el0 returns always 0 (aka unused)
thus using tpidrro_el0 instead clearing up the cpu id encoded in the lower bits.

Reviewed-By: yln

Differential Revision: https://reviews.llvm.org/D112866
instruction the key points to is deleted

Use weak value handles for both the key and the value. The entry is
invalid if either value handle is null.

This fixes an assertion failure in BasicAAResult::alias that is caused
by UnderlyingObjCPtrCache returning a wrong value.

I don't have a test case for this patch that fails reliably.

rdar://83984790
A new tool that compares TargetLibraryInfo's opinion of the availability
of library function calls against the functions actually exported by a
specified set of libraries. Can be helpful in verifying the correctness
of TLI for a given target, and avoid mishaps such as had to be addressed
in D107509 and 94b4598.

The tool currently supports ELF object files only, although it's unlikely
to be hard to add support for other formats.

Differential Revision: https://reviews.llvm.org/D111358
This is in preparation for only invalidating analyses on changed
functions.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D113303
This is consistent with what we do for other operands that are required
to be constants.

I don't think this results in any real changes. The pattern match
code for isel treats ConstantSDNode and TargetConstantSDNode the same.
Add a new directive `either` to specify the operands can be matched in either order

Reviewed By: jpienaar, Mogball

Differential Revision: https://reviews.llvm.org/D110666
The outdated documentation diverges a lot from the current state of
COFF/Mach-O/ELF/wasm ports and may just confuse users. It is better rewriting
some if useful.

Tested with `ninja docs-lld-html`

Reviewed By: #lld-macho, lhames, Jez Ng

Differential Revision: https://reviews.llvm.org/D113432
This resulted in the final argument being dropped from the output, which
can be rather important.
Not all bots have ld.lld available.
This reverts commit 62dd488.
There are several aspects of the API that either aren't easy to use, or are
deceptively easy to do the wrong thing. The main change of this commit
is to remove all of the `getValue<T>`/`getFlatValue<T>` from ElementsAttr
and instead provide operator[] methods on the ranges returned by
`getValues<T>`. This provides a much more convenient API for the value
ranges. It also removes the easy-to-be-inefficient nature of
getValue/getFlatValue, which under the hood would construct a new range for
the type `T`. Constructing a range is not necessarily cheap in all cases, and
could lead to very poor performance if used within a loop; i.e. if you were to
naively write something like:

```
DenseElementsAttr attr = ...;
for (int i = 0; i < size; ++i) {
  // We are internally rebuilding the APFloat value range on each iteration!!
  APFloat it = attr.getFlatValue<APFloat>(i);
}
```

Differential Revision: https://reviews.llvm.org/D113229
A new tool that compares TargetLibraryInfo's opinion of the availability
of library function calls against the functions actually exported by a
specified set of libraries. Can be helpful in verifying the correctness
of TLI for a given target, and avoid mishaps such as had to be addressed
in D107509 and 94b4598.

The tool currently supports ELF object files only, although it's unlikely
to be hard to add support for other formats.

Re-commits 62dd488 with changes to use pre-generated objects, as not all
bots have ld.lld available.

Differential Revision: https://reviews.llvm.org/D111358
When emitting a reloc for the Wasm global __stack_pointer, it was inadvertedly added to the symbols used for generating aranges, which caused some aranges to use it as the end symbol in a symbol diff, which caused a reloc for it to be emitted, which then caused an assert in `wasm64` since we have no 64-bit relocs for Wasm globals.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=52376
Differential Revision: https://reviews.llvm.org/D113438
@pull pull bot added the ⤵️ pull label Nov 9, 2021
@github-actions
Copy link

github-actions bot commented Nov 9, 2021

This repository does not accept pull requests. Please follow http://llvm.org/docs/Contributing.html#how-to-submit-a-patch for contribution to LLVM.

@github-actions github-actions bot closed this Nov 9, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Nov 9, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.