Incorrect code generated for dynamic indexing into a row_major matrix #1634

tangent-vector · 2018-10-25T20:43:40Z

The following compute shader code exposes a code generation issue on the version of dxc that ships in the RS5 SDK:

cbuffer C
{
    row_major int3x4 m;
};

RWStructuredBuffer<int> output;

[shader("compute")]
[numthreads(12,1,1)]
void main(uint3 tid : SV_DispatchThreadID)
{
    uint i = tid.x;
    
    output[i] = m[i / 4][i % 4];
}

The header comment on the generated DXIL shows that the matrix has been laid out properly, consuming 3 16-byte "registers" in the cbuffer:

; cbuffer C
; {
;
;   struct dx.alignment.legacy.C
;   {
;
;       row_major int3x4 m;                           ; Offset:    0
;
;   } C                                               ; Offset:    0 Size:    48
;
; }

That is, the data is stored as 3 consecutive float4 rows, which is what row_major implies.

The code for implementing the fetch, loads one row using a cbuffer fetch, but then only extracts three scalar values for the row (a row which should have four values...)

  %2 = call i32 @dx.op.threadId.i32(i32 93, i32 0)  ; ThreadId(component)
  %3 = lshr i32 %2, 2
  %4 = call %dx.types.CBufRet.i32 @dx.op.cbufferLoadLegacy.i32(i32 59, %dx.types.Handle %C_cbuffer, i32 %3)  ; CBufferLoadLegacy(handle,regIndex)
  %5 = extractvalue %dx.types.CBufRet.i32 %4, 0
  %6 = extractvalue %dx.types.CBufRet.i32 %4, 1
  %7 = extractvalue %dx.types.CBufRet.i32 %4, 2

The generated code then constructs a local 3-element array for indexing (even though the row has four values):

  %8 = getelementptr inbounds [3 x i32], [3 x i32]* %1, i32 0, i32 0
  store i32 %5, i32* %8, align 4
  %9 = getelementptr inbounds [3 x i32], [3 x i32]* %1, i32 0, i32 1
  store i32 %6, i32* %9, align 4
  %10 = getelementptr inbounds [3 x i32], [3 x i32]* %1, i32 0, i32 2
  store i32 %7, i32* %10, align 4

Then it indexes into the row using a value in 0-3 (generated by an and with 3), which can obviously index out of the bounds of the local array:

  %11 = getelementptr inbounds [3 x i32], [3 x i32]* %1, i32 0, i32 %13
  %12 = load i32, i32* %11, align 4
  %13 = and i32 %2, 3

(Aside: I'm also not clear on how the %11 getelementptr instruction is referencing %13 which is being computed later in the same block).

I have observed this miscompilation leading to garbage values being read in shader code I am testing.

The text was updated successfully, but these errors were encountered:

tex3d mentioned this issue Oct 30, 2018

Fix dynamic indexed row_major matrix from cbuffer #1643

Merged

tex3d self-assigned this Oct 30, 2018

tex3d added the bug Bug, regression, crash label Oct 30, 2018

tex3d closed this as completed in #1643 Oct 30, 2018

tex3d reopened this Oct 30, 2018

tex3d closed this as completed Oct 30, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incorrect code generated for dynamic indexing into a row_major matrix #1634

Incorrect code generated for dynamic indexing into a row_major matrix #1634

tangent-vector commented Oct 25, 2018

Incorrect code generated for dynamic indexing into a row_major matrix #1634

Incorrect code generated for dynamic indexing into a row_major matrix #1634

Comments

tangent-vector commented Oct 25, 2018