
[Feature Request] Frontend to emit i1 packed tensor for attention masks #19382

Open
lialan opened this issue Dec 5, 2024 · 1 comment
lialan commented Dec 5, 2024

Request description

As the title says: in #19354 we are adding an extra encoding attribute, #iree_encoding.packed_storage, to indicate that a tensor has a back-to-back packed memory layout, e.g. tensor<1024x1024xi1, #iree_encoding.packed_storage>.

The backend will then handle such tensors accordingly.
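For illustration, a frontend could emit a mask annotated this way (the function name and shape are hypothetical; the encoding attribute is the one proposed in #19354):

```mlir
// Illustrative sketch only: @consume_mask and the 1024x1024 shape are made up.
// The encoding attribute marks the i1 payload as bit-packed, so the backend
// stores 8 elements per byte (1024*1024/8 bytes) instead of one byte each.
func.func @consume_mask(%mask: tensor<1024x1024xi1, #iree_encoding.packed_storage>)
    -> tensor<1024x1024xi1, #iree_encoding.packed_storage> {
  return %mask : tensor<1024x1024xi1, #iree_encoding.packed_storage>
}
```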

What component(s) does this issue relate to?

No response

Additional context

No response

@lialan lialan added the enhancement ➕ New feature or request label Dec 5, 2024

lialan commented Dec 16, 2024

@rsuderman #19354 will enable packed tensor types, which the frontend can emit and the backend can process properly.
