Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redundant load-acquire elimination discrepancy between CoreCLR and Clang #47261

Closed
alexrp opened this issue Jan 21, 2021 · 9 comments
Closed
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI memory model issues associated with memory model question Answer questions and provide assistance, not an issue with source code or documentation.
Milestone

Comments

@alexrp
Copy link
Contributor

alexrp commented Jan 21, 2021

C code:

void foo(int *p)
{
    (void)__atomic_load_n(p, __ATOMIC_ACQUIRE);
    (void)__atomic_load_n(p, __ATOMIC_ACQUIRE);
    (void)__atomic_load_n(p, __ATOMIC_ACQUIRE);
}

C# code:

public static void Foo(ref int p)
{
    _ = Volatile.Read(ref p);
    _ = Volatile.Read(ref p);
    _ = Volatile.Read(ref p);
}

clang-11 -O3 atomic.c results in this code for foo:

mov eax, dword ptr [rdi]
mov eax, dword ptr [rdi]
mov eax, dword ptr [rdi]
ret

(Note: I'm specifically using Clang because it optimizes atomics much more aggressively than GCC. In any case, GCC emits the same code.)

The C# code for Foo results in this:

cmp [rcx], ecx
ret

Given that both snippets are using acquire semantics, who's right? Is CoreCLR wrong to eliminate the redundant loads, or is Clang being overly conservative?

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jan 21, 2021
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@GrabYourPitchforks GrabYourPitchforks added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 21, 2021
@GrabYourPitchforks
Copy link
Member

GrabYourPitchforks commented Jan 21, 2021

Also /cc @stephentoub, who understands the Volatile type better than anybody else. I suspect this has to do with that volatile in CLR land is for acquire / release semantics (and thus can be elided if there are no observable side effects), while volatile in C++ is more for controlling low-level hardware memory access.

@alexrp
Copy link
Contributor Author

alexrp commented Jan 21, 2021

Note that the C code makes no use of volatile.

@jkotas jkotas added the question Answer questions and provide assistance, not an issue with source code or documentation. label Jan 21, 2021
@stephentoub
Copy link
Member

This looks like #6280.

@JulieLeeMSFT
Copy link
Member

CC @AndyAyersMS
Is this related to #10268

@JulieLeeMSFT JulieLeeMSFT added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner labels Jan 22, 2021
@AndyAyersMS
Copy link
Member

Yes, they are related.

Per ECMA-355, I.12.6.7 Volatile reads and writes:

The volatile. prefix on certain instructions shall guarantee cross-thread memory ordering rules.
They do not provide atomicity, other than that guaranteed by the specification of §I.12.6.6.

and subsequently

An optimizing compiler that converts CIL to native code shall not remove any volatile operation,
nor shall it coalesce multiple volatile operations into a single operation.

and the jit is not honoring this last "shall not," either here or in #6280 or in #10268.

In this case the latter two of the three reads are removed because we can clear their side effect flags during assertion prop, so only the first read remains:

Non-null prop for index #01 in BB01:
N002 (  3,  2) [000018] V--XGO-N----              *  IND       int    $1c1
Propagating 0000000000000001 assertions for BB01, stmt STMT00003, tree [000008], tree -> 0
Propagating 0000000000000001 assertions for BB01, stmt STMT00003, tree [000009], tree -> 0
Re-morphing this stmt:
STMT00003 (IL   ???...  ???)
N004 (  3,  2) [000009] ---XGO------              *  COMMA     void   $241
N002 (  3,  2) [000018] V---GO-N----              +--*  IND       int    $1c1
N001 (  1,  1) [000005] ------------              |  \--*  LCL_VAR   byref  V00 arg0         u:1 $80
N003 (  0,  0) [000008] ------------              \--*  NOP       void   $201


optAssertionPropMain removed tree:
N003 (  0,  0) [000008] ------------              *  NOP       void   $201

The fix should be fairly simple, morph must check for volatile, but currently this involves a tree walk. Perhaps we should model volatile as a general side effect so it tends to get checked along with all the other side effects, and propagates up to the roots of trees.

@AndyAyersMS
Copy link
Member

I don't see us fixing this for 6.0. Willing to reconsider if we have evidence this is causing a problem somewhere.

@AndyAyersMS AndyAyersMS removed the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Jun 7, 2021
@AndyAyersMS AndyAyersMS added this to the Future milestone Jun 7, 2021
@VSadov VSadov added the memory model issues associated with memory model label Sep 7, 2022
@VSadov
Copy link
Member

VSadov commented Sep 22, 2022

I think this is ByDesign. Clang may be used with device memory with unknown sideeffects of volatile reads, so it can't drop them.

In .NET reads just read ordinary memory. Since noone looks at the results, the reads can be dropped.

@VSadov
Copy link
Member

VSadov commented Sep 22, 2022

There are multiple existing versions of the runtime, in and out of support, that exhibit this assumption/behavior, trying to fix it will only introduce inconsistencies.

Since targeting device memory is not a supported scenario, there is a very little value in changing this (as opposed to documenting).

@VSadov VSadov closed this as completed Sep 22, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 22, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI memory model issues associated with memory model question Answer questions and provide assistance, not an issue with source code or documentation.
Projects
Archived in project
Development

No branches or pull requests

7 participants