
[Feature Request] Frontend to emit i1 packed tensor for attention masks #19382

Open
lialan opened this issue Dec 5, 2024 · 1 comment
lialan commented Dec 5, 2024

Request description

As the title says: in #19354 we are adding an extra encoding attribute, #iree_encoding.packed_storage, to indicate that a tensor has a back-to-back packed memory layout, e.g. tensor<1024x1024xi1, #iree_encoding.packed_storage>.

The backend will then handle such tensors accordingly.
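For illustration, a frontend could emit a mask annotated this way (the function name and shape are hypothetical; the encoding attribute is the one proposed in #19354):

```mlir
// Illustrative sketch only: @consume_mask and the 1024x1024 shape are made up.
// The encoding attribute marks the i1 payload as bit-packed, so the backend
// stores 8 elements per byte (1024*1024/8 bytes) instead of one byte each.
func.func @consume_mask(%mask: tensor<1024x1024xi1, #iree_encoding.packed_storage>)
    -> tensor<1024x1024xi1, #iree_encoding.packed_storage> {
  return %mask : tensor<1024x1024xi1, #iree_encoding.packed_storage>
}
```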

What component(s) does this issue relate to?

No response

Additional context

No response

@lialan lialan added the enhancement ➕ New feature or request label Dec 5, 2024

lialan commented Dec 16, 2024

@rsuderman #19354 will enable packed tensor types, which the frontend can emit and the backend can process properly.
