ggml-metal: round up to 16 to fix setThreadgroupMemoryLength assertion #3938

psugihara · 2023-11-03T19:08:52Z

Without this fix, when running inside Xcode, MTLDebugComputeCommandEncoder throws the following assertion:

-[MTLDebugComputeCommandEncoder setThreadgroupMemoryLength:atIndex:]:817: failed assertion `length(4) must be a multiple of 16 bytes.'

Logging shows that a length of 4 is being passed on line 1020. This fix rounds the length up to 16 whenever it is smaller than that.

Here’s the relevant doc that discusses setThreadgroupMemoryLength being a multiple of 16: https://developer.apple.com/documentation/metal/mtlcomputecommandencoder/1443142-setthreadgroupmemorylength

ggerganov · 2023-11-03T19:32:22Z

ggml-metal.m

@@ -1348,7 +1348,7 @@ void ggml_metal_graph_compute(
                            [encoder setBytes:&ne00    length:sizeof( int64_t) atIndex:2];
                            [encoder setBytes:&nb01    length:sizeof(uint64_t) atIndex:3];
                            [encoder setBytes:&eps     length:sizeof(   float) atIndex:4];
-                            [encoder setThreadgroupMemoryLength:nth*sizeof(float) atIndex:0];
+                            [encoder setThreadgroupMemoryLength:MAX(16, nth*sizeof(float)) atIndex:0];


This could still fail the check when ne00 < 256 && ne00 % 4 != 0
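To illustrate the review comment above: clamping to a minimum of 16 only helps for lengths below 16, while a length such as 20 bytes (5 floats) passes through unchanged and is still not a multiple of 16. Below is a minimal sketch of the difference between clamping and padding. The helper names `clamp_min` and `pad_to_multiple` are hypothetical and not taken from this PR, and the sketch assumes the alignment is a power of two.

```c
#include <stddef.h>

/* Hypothetical helper mirroring MAX(16, len): raise len to at least min_len. */
static size_t clamp_min(size_t len, size_t min_len) {
    return len < min_len ? min_len : len;
}

/* Hypothetical helper: round len up to the next multiple of align.
 * Assumes align is a power of two (16 here), so a bit mask can be used. */
static size_t pad_to_multiple(size_t len, size_t align) {
    return (len + align - 1) & ~(align - 1);
}

/* Returns 1 if len satisfies the "multiple of 16 bytes" requirement. */
static int is_valid_threadgroup_length(size_t len) {
    return len % 16 == 0;
}
```

With 4-byte floats, one thread gives 4 bytes, which both helpers turn into 16. Five threads give 20 bytes: `clamp_min(20, 16)` stays at 20 and would still trip the assertion, whereas `pad_to_multiple(20, 16)` yields 32. Padding can over-allocate by up to 15 bytes of threadgroup memory per call.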

brittlewis12 · 2023-11-17T23:11:46Z

Are the same changes applicable in GGML_OP_RMS_NORM? See https://github.com/ggerganov/llama.cpp/pull/3938/files#diff-d189c5117255f368aa89f521bd523afa399ada0bbfb7043b74c65bf4ac2472fdR1332

ggml-metal: round up to 16 to fix MTLDebugComputeCommandEncoder asser…

8f312d9

…tion

psugihara changed the title ~~ggml-metal: round up to 16 to fix MTLDebugComputeCommandEncoder asser…~~ ggml-metal: round up to 16 to fix setThreadgroupMemoryLength assertion Nov 3, 2023

ggerganov approved these changes Nov 3, 2023

ggerganov merged commit d9b33fe into ggml-org:master Nov 3, 2023

ggerganov reviewed Nov 3, 2023

ggerganov mentioned this pull request Nov 6, 2023

Failed assertion in ggml-metal.m ggerganov/whisper.cpp#1435

Closed

brittlewis12 added a commit to brittlewis12/llmfarm_core.swift that referenced this pull request Nov 17, 2023

Fix memory length assertion for metal

eb6587c

* ggml-org/llama.cpp#3938

brittlewis12 added a commit to brittlewis12/llmfarm_core.swift that referenced this pull request Nov 18, 2023

Fix memory length assertion for metal

c542770

* ggml-org/llama.cpp#3938

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023

metal : round up to 16 to fix MTLDebugComputeCommandEncoder assertion (…

efd31a7

…ggml-org#3938)

brittlewis12 added a commit to brittlewis12/llmfarm_core.swift that referenced this pull request Nov 30, 2023

Fix memory length assertion for metal

32dbf8d

* ggml-org/llama.cpp#3938
