[AArch64] New subtarget features to control ldp and stp formation, fo… #66098

manosanaggh · 2023-09-12T15:17:29Z

…cused on ampere1 and ampere1a.

On some AArch64 cores, including Ampere's ampere1 and ampere1a architectures, load-pair and store-pair instructions are faster than simple loads/stores only when the alignment of the pair is at least twice that of the individual element being loaded or stored.

Based on that, this patch introduces four new subtarget features, two for controlling ldp and two for controlling stp, to cover the ampere1 and ampere1a alignment needs and to enable optional fine-grained control over ldp and stp generation in general. Other CPUs can use these features as well, if a policy different from the compiler's default would benefit them.

More specifically, for each of ldp and stp we have:

- disable-ldp/disable-stp: Do not emit ldp/stp.
- ldp-aligned-only/stp-aligned-only: Emit ldp/stp only if the source pointer is aligned to at least double the alignment of the type.

Therefore, for -mcpu=ampere1 and -mcpu=ampere1a, ldp-aligned-only/stp-aligned-only become the defaults because of the benefit from the alignment, whereas for all other CPUs the compiler's default behaviour is maintained.
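For readers who want the policy spelled out, here is a minimal, self-contained C++ sketch of the decision the two feature pairs describe. It is not the code from this patch: `PairPolicy` and `mayFormPair` are hypothetical names made up for illustration, and the real check lives in AArch64LoadStoreOptimizer.cpp and works on machine instructions and their memory operands.

```cpp
#include <cstdint>

// Hypothetical illustration only; these names do not exist in LLVM.
// Default     -> keep the compiler's usual pairing behaviour.
// AlignedOnly -> models ldp-aligned-only / stp-aligned-only.
// Disabled    -> models disable-ldp / disable-stp.
enum class PairPolicy { Default, AlignedOnly, Disabled };

// Returns true if two adjacent accesses may be merged into one ldp/stp.
// ptrAlign:  known alignment of the accessed address, in bytes.
// typeAlign: alignment of a single element of the accessed type, in bytes.
bool mayFormPair(PairPolicy policy, uint64_t ptrAlign, uint64_t typeAlign) {
  switch (policy) {
  case PairPolicy::Disabled:
    // Never emit a pair instruction.
    return false;
  case PairPolicy::AlignedOnly:
    // Pair only when the address is aligned to at least twice the
    // alignment of the element type.
    return ptrAlign >= 2 * typeAlign;
  case PairPolicy::Default:
    // No extra restriction from these features.
    return true;
  }
  return true; // Unreachable for valid enum values.
}
```

Under this sketch, two 8-byte loads from a 16-byte-aligned address would still be paired with the aligned-only policy, while the same loads from an address known only to be 8-byte aligned would stay as separate loads.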

manosanaggh · 2023-09-12T15:19:04Z

This is a new version, which was requested here: https://reviews.llvm.org/D159480

manosanaggh · 2023-09-12T15:20:14Z

@ptomsich @davemgreen

davemgreen

Thanks for moving this over. It looks sensible to me.

llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

llvm/test/CodeGen/AArch64/disable-ldp-ampere1.ll

manosanaggh · 2023-09-12T21:43:23Z

Thanks for moving this over. It looks sensible to me.

Thank you too for the tips! I am wondering whether these x64 regressions in the checks are real or false positives; I am going to run a quick test tomorrow on an x64 machine.

manosanaggh · 2023-09-13T07:57:16Z

Thanks for moving this over. It looks sensible to me.

Thank you too for the tips! I am wondering whether these x64 regressions in the checks are real or false positives; I am going to run a quick test tomorrow on an x64 machine.

After looking at the log generated by the CI, I realize I have to switch to a debug build on x64 to reproduce this. It should be a test issue.

llvm/test/CodeGen/AArch64/ldp-stp-control-features.ll

llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

manosanaggh · 2023-09-13T20:05:23Z

The regression on my test case, caught by the CI, also happens with upstream LLVM (main branch) and has nothing to do with my changes. I see it is triggered only for -mcpu=ampere1/ampere1a with the right test case. The Machine Instruction Scheduler pass hits an assertion that is enabled in a debug build. I should probably file a report on it.

manosanaggh · 2023-09-14T05:51:23Z

Filed a report for the crash: #66328

llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

davemgreen

Sorry, I missed that the check lines are still a bit off. If you fix that, then this LGTM. Thanks

llvm/test/CodeGen/AArch64/ldp-stp-control-features.ll

davemgreen

Thanks LGTM

manosanaggh · 2023-09-14T12:55:21Z

Thanks LGTM

Thank you too!

davemgreen · 2023-09-14T14:51:40Z

There is hopefully a fix for the issues with the ampere1 model in #66384

@manosanaggh I didn't seem to be able to add you as a reviewer, but it would be good if you could take a look. Thanks

manosanaggh · 2023-09-14T15:13:38Z

There is hopefully a fix for the issues with the ampere1 model in #66384

@manosanaggh I didn't seem to be able to add you as a reviewer, but it would be good if you could take a look. Thanks

Thanks for addressing this quickly. I am going to have a look soon.

Yes, you can't add me as a reviewer because I don't have reviewer rights in the project.

manosanaggh requested a review from a team as a code owner September 12, 2023 15:17

llvmbot added the backend:AArch64 label Sep 12, 2023

davemgreen reviewed Sep 12, 2023

llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

llvm/test/CodeGen/AArch64/disable-ldp-ampere1.ll

manosanaggh force-pushed the ldp-stp-aligned-only-ampere1-series branch from b08b40f to df192f7 Compare September 13, 2023 07:07

davemgreen reviewed Sep 13, 2023

llvm/test/CodeGen/AArch64/ldp-stp-control-features.ll

llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

manosanaggh force-pushed the ldp-stp-aligned-only-ampere1-series branch from df192f7 to 487a01d Compare September 13, 2023 19:10

manosanaggh force-pushed the ldp-stp-aligned-only-ampere1-series branch from 487a01d to f48c624 Compare September 13, 2023 20:12

manosanaggh requested a review from davemgreen September 14, 2023 05:53

manosanaggh force-pushed the ldp-stp-aligned-only-ampere1-series branch from f48c624 to c878ea7 Compare September 14, 2023 07:06

davemgreen reviewed Sep 14, 2023

llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

manosanaggh force-pushed the ldp-stp-aligned-only-ampere1-series branch from c878ea7 to 4791aad Compare September 14, 2023 10:18

manosanaggh requested a review from davemgreen September 14, 2023 10:19

davemgreen reviewed Sep 14, 2023

llvm/test/CodeGen/AArch64/ldp-stp-control-features.ll

manosanaggh force-pushed the ldp-stp-aligned-only-ampere1-series branch from 4791aad to 651ce7c Compare September 14, 2023 11:33

manosanaggh requested a review from davemgreen September 14, 2023 11:35

manosanaggh force-pushed the ldp-stp-aligned-only-ampere1-series branch from 651ce7c to 2cf4f1d Compare September 14, 2023 12:24

davemgreen approved these changes Sep 14, 2023

ptomsich merged commit 008f26b into llvm:main Sep 14, 2023
