Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect code generated for dynamic indexing into a row_major matrix #1634

Closed
tangent-vector opened this issue Oct 25, 2018 · 0 comments
Closed
Assignees
Labels
bug Bug, regression, crash

Comments

@tangent-vector
Copy link

The following compute shader code exposes a code generation issue on the version of dxc that ships in the RS5 SDK:

cbuffer C
{
    row_major int3x4 m;
};

RWStructuredBuffer<int> output;

[shader("compute")]
[numthreads(12,1,1)]
void main(uint3 tid : SV_DispatchThreadID)
{
    uint i = tid.x;
    
    output[i] = m[i / 4][i % 4];
}

The header comment on the generated DXIL shows that the matrix has been laid out properly, consuming 3 16-byte "registers" in the cbuffer:

; cbuffer C
; {
;
;   struct dx.alignment.legacy.C
;   {
;
;       row_major int3x4 m;                           ; Offset:    0
;
;   } C                                               ; Offset:    0 Size:    48
;
; }

That is, the data is stored as 3 consecutive float4 rows, which is what row_major implies.

The code for implementing the fetch, loads one row using a cbuffer fetch, but then only extracts three scalar values for the row (a row which should have four values...)

  %2 = call i32 @dx.op.threadId.i32(i32 93, i32 0)  ; ThreadId(component)
  %3 = lshr i32 %2, 2
  %4 = call %dx.types.CBufRet.i32 @dx.op.cbufferLoadLegacy.i32(i32 59, %dx.types.Handle %C_cbuffer, i32 %3)  ; CBufferLoadLegacy(handle,regIndex)
  %5 = extractvalue %dx.types.CBufRet.i32 %4, 0
  %6 = extractvalue %dx.types.CBufRet.i32 %4, 1
  %7 = extractvalue %dx.types.CBufRet.i32 %4, 2

The generated code then constructs a local 3-element array for indexing (even though the row has four values):

  %8 = getelementptr inbounds [3 x i32], [3 x i32]* %1, i32 0, i32 0
  store i32 %5, i32* %8, align 4
  %9 = getelementptr inbounds [3 x i32], [3 x i32]* %1, i32 0, i32 1
  store i32 %6, i32* %9, align 4
  %10 = getelementptr inbounds [3 x i32], [3 x i32]* %1, i32 0, i32 2
  store i32 %7, i32* %10, align 4

Then it indexes into the row using a value in 0-3 (generated by an and with 3), which can obviously index out of the bounds of the local array:

  %11 = getelementptr inbounds [3 x i32], [3 x i32]* %1, i32 0, i32 %13
  %12 = load i32, i32* %11, align 4
  %13 = and i32 %2, 3

(Aside: I'm also not clear on how the %11 getelementptr instruction is referencing %13 which is being computed later in the same block).

I have observed this miscompilation leading to garbage values being read in shader code I am testing.

@tex3d tex3d self-assigned this Oct 30, 2018
@tex3d tex3d added the bug Bug, regression, crash label Oct 30, 2018
@tex3d tex3d reopened this Oct 30, 2018
@tex3d tex3d closed this as completed Oct 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug, regression, crash
Projects
None yet
Development

No branches or pull requests

2 participants