ggml-metal: round up to 16 to fix setThreadgroupMemoryLength assertion #3938

Merged
merged 1 commit into ggml-org:master on Nov 3, 2023

Conversation

psugihara (Contributor) commented Nov 3, 2023

Without this fix, when running inside Xcode, MTLDebugComputeCommandEncoder fails with the following assertion:

-[MTLDebugComputeCommandEncoder setThreadgroupMemoryLength:atIndex:]:817: failed assertion `length(4) must be a multiple of 16 bytes.'

Logging shows that the call on line 1020 is being passed a length of 4. This fix just rounds the length up to 16 when it is below that.

Here’s the relevant doc stating that the length passed to setThreadgroupMemoryLength must be a multiple of 16: https://developer.apple.com/documentation/metal/mtlcomputecommandencoder/1443142-setthreadgroupmemorylength
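For context, here is a minimal, self-contained C sketch of the numbers involved (the nth value and the MAX fallback definition are illustrative, not the exact llama.cpp code): a small per-row thread count makes the requested length 4 bytes, and the fix clamps it up to 16.

// Minimal sketch of the clamp described above (illustrative only).
#include <stdio.h>
#include <stddef.h>

#ifndef MAX
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#endif

int main(void) {
    size_t nth = 1;                       // hypothetical per-row thread count
    size_t len = nth * sizeof(float);     // 4 bytes: violates the 16-byte rule
    size_t fixed = MAX((size_t)16, len);  // the PR clamps the length to >= 16
    printf("requested %zu bytes, submitting %zu bytes\n", len, fixed);
    return 0;
}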

psugihara changed the title ggml-metal: round up to 16 to fix MTLDebugComputeCommandEncoder asser… ggml-metal: round up to 16 to fix setThreadgroupMemoryLength assertion Nov 3, 2023
ggerganov merged commit d9b33fe into ggml-org:master Nov 3, 2023
@@ -1348,7 +1348,7 @@ void ggml_metal_graph_compute(
     [encoder setBytes:&ne00 length:sizeof( int64_t) atIndex:2];
     [encoder setBytes:&nb01 length:sizeof(uint64_t) atIndex:3];
     [encoder setBytes:&eps length:sizeof( float) atIndex:4];
-    [encoder setThreadgroupMemoryLength:nth*sizeof(float) atIndex:0];
+    [encoder setThreadgroupMemoryLength:MAX(16, nth*sizeof(float)) atIndex:0];
A member reviewed the changed line and commented:
This could still fail the check when ne00 < 256 && ne00 % 4 != 0
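To illustrate the reviewer's point: MAX(16, nth*sizeof(float)) only covers lengths below 16, but if nth is derived from ne00 and ne00 is not a multiple of 4, the length can be above 16 yet still not 16-aligned (e.g. 36 bytes). A sketch of a more general guard that rounds the length up to the next multiple of 16; the PAD_TO macro here is my own illustration, not necessarily how llama.cpp addresses it.

// Round a byte length up to the next multiple of 16; handles both the
// too-small case (4 -> 16) and the misaligned case (36 -> 48).
#include <stdio.h>
#include <stddef.h>

// n must be a power of two for this bitmask form to be correct.
#define PAD_TO(x, n) (((x) + (n) - 1) & ~((size_t)(n) - 1))

int main(void) {
    size_t lens[] = { 4, 16, 36, 1024 };
    for (size_t i = 0; i < sizeof(lens) / sizeof(lens[0]); ++i) {
        printf("%4zu -> %4zu\n", lens[i], PAD_TO(lens[i], 16));
    }
    return 0;
}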

brittlewis12 commented Nov 17, 2023

brittlewis12 added a commit to brittlewis12/llmfarm_core.swift that referenced this pull request Nov 17, 2023
brittlewis12 added a commit to brittlewis12/llmfarm_core.swift that referenced this pull request Nov 18, 2023
olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023
brittlewis12 added a commit to brittlewis12/llmfarm_core.swift that referenced this pull request Nov 30, 2023