
Refactor layout implementation #491

Merged 1 commit into pytorch:main on Jul 16, 2024

Conversation

@jerryzh168 (Contributor) commented Jul 9, 2024:

Summary:
Cleans up the layout-type-related arguments. Previously we passed layout-specific args such as inner_k_tiles directly to from_float; now we pass a single layout_type argument, which can be any LayoutType instance and holds the arguments specific to that layout.
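A rough sketch of the change described above (a simplified reconstruction, not the actual torchao code; only from_float, inner_k_tiles, the LayoutType class names, and the isinstance-based dispatch come from this PR):

```python
from dataclasses import dataclass
import torch

@dataclass(frozen=True)
class LayoutType:
    pass

@dataclass(frozen=True)
class PlainLayoutType(LayoutType):
    pass

@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8

def from_float(input_float: torch.Tensor, layout_type: LayoutType = PlainLayoutType()) -> torch.Tensor:
    # Previously: from_float(..., extended_layout="tensor_core_tiled", inner_k_tiles=8)
    # Now: the LayoutType instance carries its own layout-specific arguments.
    if isinstance(layout_type, TensorCoreTiledLayoutType):
        print(f"tensor core tiled path, inner_k_tiles={layout_type.inner_k_tiles}")
    return input_float

from_float(torch.randn(8, 8), TensorCoreTiledLayoutType(inner_k_tiles=4))
```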

Test Plan:
regression tests:

python test/quantization/test_quant_api.py
python test/integration/test_integration.py

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot (bot) commented Jul 9, 2024:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/491

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a8d9218 with merge base 6e7cf71:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 9, 2024
@jerryzh168 force-pushed the layout branch 2 times, most recently from 3ffe60f to d181a77, on July 10, 2024 at 17:34
    ):
        original_shape = input_float.shape
-       if extended_layout == "tensor_core_tiled":
+       if isinstance(layout_type, TensorCoreTiledLayoutType):
Collaborator:

Is it possible to move this inside the TensorCoreTiledLayoutType constructor?

Contributor Author (jerryzh168):

Oh, this is padding the input though, so I'm not sure how we can do that in the constructor, but we can move the implementation to TensorCoreTiledLayoutType, I think.

@vayuda (Collaborator) commented Jul 12, 2024:

Yeah, I think it would be a bit cleaner if all layout-specific code were kept within the layout class. Otherwise, this PR looks good.
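A compact sketch of that suggestion (illustrative only; the real padding body appears in the diff hunks further down this thread):

```python
from dataclasses import dataclass
import torch

@dataclass(frozen=True)
class LayoutType:
    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        return input  # layouts that need no padding keep the no-op default

@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8

    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        # layout-specific padding lives on the layout type instead of in an
        # isinstance branch at the call site (body elided in this sketch)
        return input

def from_float(input_float: torch.Tensor, layout_type: LayoutType) -> torch.Tensor:
    return layout_type.pad_input(input_float)  # call site stays layout-agnostic
```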


aten = torch.ops.aten

@dataclass(frozen=True)
class PlainLayoutType(LayoutType):
@msaroufim (Member) commented Jul 15, 2024:

Add a comment, or raise an error, saying that this shouldn't be instantiated directly.

Contributor Author (jerryzh168):

This can be instantiated, I think. Are you talking about LayoutType?

@msaroufim (Member) commented Jul 15, 2024:

I see. I guess I'm a bit thrown off because a data class's primary goal is to store data, whereas this class stores nothing and is really just a name.

Member:

I would have instead done an enum like this

from enum import Enum
class Operations(Enum):
    ADD = (1,)
    SUBTRACT = (2,)
    MULTIPLY = (3,)
    DIVIDE = (4, 'precision') 

Enums are also classes, so you can override __init__ and define a function that only applies to DIVIDE, for example.
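Expanding on that point (a hedged sketch of the enum suggestion, not code from this PR): the value tuples above are unpacked into __init__, so a member like DIVIDE can carry an extra argument and a method can special-case it:

```python
from enum import Enum

class Operations(Enum):
    ADD = (1,)
    SUBTRACT = (2,)
    MULTIPLY = (3,)
    DIVIDE = (4, "precision")

    def __init__(self, code, extra=None):
        # the value tuple is unpacked into __init__, so members can carry data
        self.code = code
        self.extra = extra

    def describe(self) -> str:
        if self is Operations.DIVIDE:
            return f"divide (extra arg: {self.extra})"
        return self.name.lower()

print(Operations.DIVIDE.describe())  # divide (extra arg: precision)
print(Operations.ADD.describe())     # add
```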

Contributor Author (jerryzh168):

Sorry, I don't follow the DIVIDE part, can you elaborate a bit? Is this about how to support TensorCoreTiledLayoutType, which has an inner_k_tiles argument?


    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        orig_out_features, orig_in_features = input.shape
        in_features = find_multiple(orig_in_features, 1024)
Member:

Where are the in and out numbers coming from? I thought constants like this were a function of the dtype as well.

@jerryzh168 (Contributor, Author) commented Jul 15, 2024:

This comes from the tinygemm kernel, I think; this layout only applies to the uint4 dtype.
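Piecing the quoted hunks together, the padding helper presumably looks roughly like the sketch below. The 1024 multiple for in_features and the pad argument tuple come from the hunks quoted in this thread; the multiple of 8 for out_features and the find_multiple body are assumptions:

```python
import torch
import torch.nn.functional as F

def find_multiple(n: int, k: int) -> int:
    # round n up to the next multiple of k
    return n if n % k == 0 else n + k - (n % k)

def pad_input(input: torch.Tensor) -> torch.Tensor:
    # pad a 2D weight so its shape matches what the tinygemm int4 kernel expects
    orig_out_features, orig_in_features = input.shape
    in_features = find_multiple(orig_in_features, 1024)
    out_features = find_multiple(orig_out_features, 8)  # assumed multiple
    return F.pad(
        input,
        (0, in_features - orig_in_features, 0, out_features - orig_out_features),
    )

print(pad_input(torch.randn(10, 1000)).shape)  # torch.Size([16, 1024])
```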

@msaroufim (Member) left a review comment:

I would recommend either using an enum or abstract classes to get the desired behavior.
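One way to read the abstract-class option (a sketch under the assumption that the goal is to stop LayoutType itself from being instantiated; the concrete subclass name here is hypothetical):

```python
from abc import ABC, abstractmethod
import torch

class LayoutType(ABC):
    @abstractmethod
    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        ...

class NoPadLayoutType(LayoutType):  # hypothetical concrete layout
    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        return input

# LayoutType()  # would raise TypeError: can't instantiate abstract class
print(NoPadLayoutType().pad_input(torch.zeros(2, 2)).shape)  # torch.Size([2, 2])
```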


@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8
Contributor Author (jerryzh168):

@msaroufim see here: we have extra configurable arguments, so it's not just a name, and I'm not sure how an enum would work here.
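A small illustration of that point (sketch only): the frozen dataclass lets callers construct arbitrary configurations, each comparable and hashable, which a fixed set of enum members cannot express:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LayoutType:
    pass

@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8

a = TensorCoreTiledLayoutType(inner_k_tiles=8)
b = TensorCoreTiledLayoutType(inner_k_tiles=2)
print(a == b)  # False: different configurations compare unequal
print({a, b})  # frozen=True makes them hashable, so they work as dict/set keys
```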

@msaroufim self-requested a review on July 16, 2024 at 00:29
-        input_float,
-        (0, in_features - orig_in_features, 0, out_features - orig_out_features),
-    )
+    input_float = layout_type.pad_input(input_float)
@vayuda (Collaborator) commented Jul 16, 2024:

This is on the right track, but I think we can make it more generic to better serve future usage. LayoutTypes can implement two functions, pre_process() and post_process(). The new workflow would look like this:

input_float = layout.pre_process(input_float)
...
int_data = quantize_affine(...)
int_data = layout.post_process(int_data)
layout_tensor_ctr = get_layout_tensor_constructor(type(layout_type))
layout_tensor = layout_tensor_ctr(int_data, scale, zero_point, layout_type)

To motivate this: I would for sure use the post_process function while integrating my intx work. It lets me write just a layout_type instead of an entire layout class (I can reuse the plain layout). Additionally, if you look at #498, they call torch._cslt_compress(int_data), which would also be implemented as a post_process function. I think this also means they can reuse the plain layout, since I don't see them use any other arguments in the constructor (@jcaip, maybe you can confirm this).
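A toy, runnable sketch of the proposed hooks (not the merged implementation; quantize_affine and the layout tensor constructor are stubbed out so the control flow stands alone):

```python
from dataclasses import dataclass
import torch
import torch.nn.functional as F

@dataclass(frozen=True)
class LayoutType:
    def pre_process(self, input: torch.Tensor) -> torch.Tensor:
        return input  # default: no-op

    def post_process(self, input: torch.Tensor) -> torch.Tensor:
        return input  # default: no-op

@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8

    def pre_process(self, input: torch.Tensor) -> torch.Tensor:
        # layout-specific preparation (e.g. padding) hangs off the layout type
        return F.pad(input, (0, -input.shape[-1] % 1024))

def quantize(input_float: torch.Tensor, layout_type: LayoutType) -> torch.Tensor:
    input_float = layout_type.pre_process(input_float)
    int_data = input_float.to(torch.int8)       # stand-in for quantize_affine(...)
    int_data = layout_type.post_process(int_data)
    return int_data                             # real code would build the layout tensor here

print(quantize(torch.randn(8, 1000), TensorCoreTiledLayoutType()).shape)  # torch.Size([8, 1024])
```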

Contributor Author (jerryzh168):

Makes sense. By reusing PlainLayoutType, you mean inheriting from it and overriding post_process, right?
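A minimal sketch of that reuse-by-inheritance idea (the subclass name is hypothetical; torch._cslt_compress is the call mentioned for #498 above, left as a comment so the sketch runs standalone):

```python
from dataclasses import dataclass
import torch

@dataclass(frozen=True)
class PlainLayoutType:
    def post_process(self, input: torch.Tensor) -> torch.Tensor:
        return input  # default: no-op

@dataclass(frozen=True)
class SparseCompressedLayoutType(PlainLayoutType):  # hypothetical name
    def post_process(self, input: torch.Tensor) -> torch.Tensor:
        # e.g. return torch._cslt_compress(input), as discussed for #498
        return input.contiguous()

print(SparseCompressedLayoutType().post_process(torch.zeros(4, 4)).shape)  # torch.Size([4, 4])
```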

@jerryzh168 merged commit aef7e09 into pytorch:main on Jul 16, 2024
13 checks passed
@jerryzh168 deleted the layout branch on July 16, 2024 at 21:55
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024
* Minimal android app build

* Improve script

* Detect physical device as well