Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tier0 allocates on a simple cast (boxing) #88987

Closed
EgorBo opened this issue Jul 17, 2023 · 9 comments · Fixed by #88989
Closed

Tier0 allocates on a simple cast (boxing) #88987

EgorBo opened this issue Jul 17, 2023 · 9 comments · Fixed by #88989
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@EgorBo
Copy link
Member

EgorBo commented Jul 17, 2023

Minimal repro for https://github.com/dotnet/runtime/pull/88560/files#r1264791578

class Program
{
    static void Main()
    {
        Test(42); // <int> specialization
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Test<T>(T value)
    {
        if (value is IEnumerable enumerable) // always false for <int>
            Consume(enumerable);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Consume(object o) { }
}

Tier1 codegen for Test<int>:

; Assembly listing for method Program:Test[int](int) (FullOpts)
       ret      
; Total bytes of code 1

Tier0 codegen for Test<int>:

; Assembly listing for method Program:Test[int](int) (Tier0)
       push     rbp
       sub      rsp, 48
       lea      rbp, [rsp+30H]
       xor      eax, eax
       mov      qword ptr [rbp-08H], rax
       mov      qword ptr [rbp-10H], rax
       mov      dword ptr [rbp+10H], ecx
       mov      rcx, 0xD1FFAB1E      ; System.Int32
       call     CORINFO_HELP_NEWSFAST ;;;;; <-------- allocation (boxing)
       mov      gword ptr [rbp-10H], rax
       mov      rdx, gword ptr [rbp-10H]
       mov      ecx, dword ptr [rbp+10H]
       mov      dword ptr [rdx+08H], ecx
       mov      rdx, gword ptr [rbp-10H]
       mov      rcx, 0xD1FFAB1E      ; System.Collections.IEnumerable
       call     [CORINFO_HELP_ISINSTANCEOFINTERFACE]
       mov      gword ptr [rbp-08H], rax
       cmp      gword ptr [rbp-08H], 0
       je       SHORT G_M5413_IG03
       mov      rcx, gword ptr [rbp-08H]
       call     [Program:Consume(System.Object)]
G_M5413_IG03:
       nop      
       add      rsp, 48
       pop      rbp
       ret      
; Total bytes of code 100
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 17, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 17, 2023
@ghost
Copy link

ghost commented Jul 17, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Minimal repro for https://github.com/dotnet/runtime/pull/88560/files#r1264791578

class Program
{
    static void Main()
    {
        Test(42);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Test<T>(T? value)
    {
        if (value is IEnumerable enumerable)
            Consume(enumerable);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Consume(object o) { }
}

Tier1 codegen for Test<int>:

; Assembly listing for method Program:Test[int](int) (FullOpts)
       ret      
; Total bytes of code 1

Tier0 codegen for Test<int>:

; Assembly listing for method Program:Test[int](int) (Tier0)
       push     rbp
       sub      rsp, 48
       lea      rbp, [rsp+30H]
       xor      eax, eax
       mov      qword ptr [rbp-08H], rax
       mov      qword ptr [rbp-10H], rax
       mov      dword ptr [rbp+10H], ecx
       mov      rcx, 0xD1FFAB1E      ; System.Int32
       call     CORINFO_HELP_NEWSFAST ;;;;; <-------- allocation (boxing)
       mov      gword ptr [rbp-10H], rax
       mov      rdx, gword ptr [rbp-10H]
       mov      ecx, dword ptr [rbp+10H]
       mov      dword ptr [rdx+08H], ecx
       mov      rdx, gword ptr [rbp-10H]
       mov      rcx, 0xD1FFAB1E      ; System.Collections.IEnumerable
       call     [CORINFO_HELP_ISINSTANCEOFINTERFACE]
       mov      gword ptr [rbp-08H], rax
       cmp      gword ptr [rbp-08H], 0
       je       SHORT G_M5413_IG03
       mov      rcx, gword ptr [rbp-08H]
       call     [Program:Consume(System.Object)]
G_M5413_IG03:
       nop      
       add      rsp, 48
       pop      rbp
       ret      
; Total bytes of code 100
Author: EgorBo
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jul 17, 2023
@EgorBo EgorBo self-assigned this Jul 17, 2023
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Jul 17, 2023
@EgorBo EgorBo added this to the 8.0.0 milestone Jul 17, 2023
@stephentoub
Copy link
Member

stephentoub commented Jul 17, 2023

@EgorBo, if I change your example slightly to enumerate the enumerable in the if block:

using System.Collections;
using System.Runtime.CompilerServices;

while (true)
{
    Test<int>(42);
}

[MethodImpl(MethodImplOptions.NoInlining)]
static void Test<T>(T value)
{
    if (value is IEnumerable enumerable)
    {
        foreach (object? e in enumerable)
        {
        }
    }
}

I get this for the "optimized" implementation:

; Assembly listing for method Program:<<Main>$>g__Test|0_0[int](int) (Tier1)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier1 code
; optimized code
; optimized using Dynamic PGO
; rbp based frame
; fully interruptible
; with Dynamic PGO: edge weights are valid, and fgCalledCount is 1.446707e+07
; BEGIN METHOD Program:<<Main>$>g__Test|0_0[int](int)

G_M000_IG01:                ;; offset=0000H
       push     rbp
       sub      rsp, 48
       lea      rbp, [rsp+30H]
       mov      qword ptr [rbp-10H], rsp

G_M000_IG02:                ;; offset=000EH
       add      rsp, 48
       pop      rbp
       ret

G_M000_IG03:                ;; offset=0014H
       int3

G_M000_IG04:                ;; offset=0015H
       push     rbp
       sub      rsp, 48
       mov      rbp, qword ptr [rcx+20H]
       mov      qword ptr [rsp+20H], rbp
       lea      rbp, [rbp+30H]

G_M000_IG05:                ;; offset=0027H
       mov      rdx, gword ptr [rbp-08H]
       mov      rcx, 0x7FF93AB130D0
       call     [CORINFO_HELP_ISINSTANCEOFINTERFACE]
       test     rax, rax
       je       SHORT G_M000_IG06
       mov      rcx, rax
       mov      r11, 0x7FF93A940098
       call     [r11]System.IDisposable:Dispose():this

G_M000_IG06:                ;; offset=0050H
       nop

G_M000_IG07:                ;; offset=0051H
       add      rsp, 48
       pop      rbp
       ret
 ; END METHOD Program:<<Main>$>g__Test|0_0[int](int)

; Total bytes of code 87

when in reality it should just be more like:

; Assembly listing for method Program:<<Main>$>g__Test|0_0[int](int) (Tier1)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier1 code
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; BEGIN METHOD Program:<<Main>$>g__Test|0_0[int](int)

G_M000_IG01:                ;; offset=0000H
       ret
 ; END METHOD Program:<<Main>$>g__Test|0_0[int](int)

; Total bytes of code 1

What's going on there?

@EgorBo
Copy link
Member Author

EgorBo commented Jul 17, 2023

What's going on there?

Haha, interesting, so your codegen looks fine - see, it just pushes something to stack (generic context?) and then just exits, but JIT wasn't able to delete a big chunk of unreachable code - presumably, because of exception handling from forerach - need to take a look

@stephentoub
Copy link
Member

Haha, interesting, so your codegen looks fine - see, it just pushes something to stack (generic context?) and then just exits

But why does it even need to do that? It should be no different from an empty method body, no?

@stephentoub
Copy link
Member

Also, removing the NoInlining on Test<T>, it still doesn't get inlined even though it should be a nop equivalent to an empty method.

@EgorBo
Copy link
Member Author

EgorBo commented Jul 17, 2023

Haha, interesting, so your codegen looks fine - see, it just pushes something to stack (generic context?) and then just exits

But why does it even need to do that? It should be no different from an empty method body, no?

I guess it thinks it's not an leaf method due to calls in the unreachable blocks , hence, it needs to setup a frame.

Also, removing the NoInlining on Test, it still doesn't get inlined even though it should be a nop equivalent to an empty method.

My bet that it's pretty hard for inliner to guess here - the main issue that our boxing optimizations are pretty simple pattern matchers, like BOX+ISINST+BRTRUE -> fold. It's not the case for e.g. if (obj is Foo foo) syntax because it produces BOX+ISINST+STLOC+LDLOC+BTRUE that our matcher no longer can optimize early, I'll check why the blocks were not removed in your case

@stephentoub
Copy link
Member

stephentoub commented Jul 17, 2023

the main issue that our boxing optimizations are pretty simple pattern matchers, like BOX+ISINST+BRTRUE -> fold. It's not the case for e.g. if (obj is Foo foo) syntax because it produces BOX+ISINST+STLOC+LDLOC+BTRUE that our matcher no longer can optimize early

Ugh. So, yeah, if I change the repro to instead be:

using System.Collections;
using System.Runtime.CompilerServices;

while (true)
{
    Test<int>(42);
}

static void Test<T>(T value)
{
    if (value is IEnumerable)
    {
        foreach (object? e in (IEnumerable)value)
        {
        }
    }
}

then it generates the desired (albeit still not inlined):

; Assembly listing for method Program:<<Main>$>g__Test|0_0[int](int) (Tier1)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier1 code
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; BEGIN METHOD Program:<<Main>$>g__Test|0_0[int](int)

G_M000_IG01:                ;; offset=0000H

G_M000_IG02:                ;; offset=0000H
       ret
 ; END METHOD Program:<<Main>$>g__Test|0_0[int](int)

; Total bytes of code 1

It's really unfortunate that the C# language, our style rules, analyzers, IDE suggestions, etc. all push us and others to use the is IEnumerable enumerable variant, but the JIT is unable to handle that as well. I'd thought this was only a deficiency around boxing in a case like:

T value = ...;
if (value is IFoo foo) foo.Bar();

and instead needing to write that as:

T value = ...;
if (value is IFoo) ((IFoo)foo).Bar();

to avoid the boxing, but it now sounds like the problem is much more extensive than that.

@EgorBo
Copy link
Member Author

EgorBo commented Jul 17, 2023

So, looks like it's a general issue with JIT today (#82335) that it afraids to remove unreachable blocks with EH, even in this simple example:

image

(in some cases, jit can do that, though, e.g. remove empty finally-block).

to avoid the boxing, but it now sounds like the problem is much more extensive than that.

Yeah, we're aware of it and have an idea to fix it (it's the same old "multi-use boxing" problem) but it's just not trivial from my understanding (e.g. escape analysis) - @AndyAyersMS @markples might know more about it

@AndyAyersMS
Copy link
Member

Yeah, we're aware of it and have an idea to fix it (it's the same old "multi-use boxing" problem) but it's just not trivial from my understanding (e.g. escape analysis) - @AndyAyersMS @markples might know more about it

We had ambitions of fixing (at least some) cases of these multi-use boxes in .NET 8 but didn't get to it ... so unfortunately it is going to have to wait.

The removal of unreachable EH might be easier to address, but not sure if it would fit into the remaining .NET 8 time either. Seems like moderate risk for (what currently feels like) small reward. Might not be too hard to assess the impact more quantitatively.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 18, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Aug 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants