
Refactor layout implementation #491

Merged 1 commit into pytorch:main on Jul 16, 2024

Conversation

@jerryzh168 (Contributor) commented Jul 9, 2024:

Summary:
Cleans up the layout-type-related arguments. Previously we passed layout-specific args such as inner_k_tiles directly to from_float; now we pass a single layout_type argument, which can be any LayoutType instance and holds the arguments specific to that layout.
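A rough sketch of the change described above (a simplified reconstruction, not the actual torchao code; only from_float, inner_k_tiles, the LayoutType class names, and the isinstance-based dispatch come from this PR):

```python
from dataclasses import dataclass
import torch

@dataclass(frozen=True)
class LayoutType:
    pass

@dataclass(frozen=True)
class PlainLayoutType(LayoutType):
    pass

@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8

def from_float(input_float: torch.Tensor, layout_type: LayoutType = PlainLayoutType()) -> torch.Tensor:
    # Previously: from_float(..., extended_layout="tensor_core_tiled", inner_k_tiles=8)
    # Now: the LayoutType instance carries its own layout-specific arguments.
    if isinstance(layout_type, TensorCoreTiledLayoutType):
        print(f"tensor core tiled path, inner_k_tiles={layout_type.inner_k_tiles}")
    return input_float

from_float(torch.randn(8, 8), TensorCoreTiledLayoutType(inner_k_tiles=4))
```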

Test Plan:
regression tests:

python test/quantization/test_quant_api.py
python test/integration/test_integration.py

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot (bot) commented Jul 9, 2024:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/491

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a8d9218 with merge base 6e7cf71:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 9, 2024
@jerryzh168 force-pushed the layout branch 2 times, most recently from 3ffe60f to d181a77, on July 10, 2024 at 17:34
    ):
        original_shape = input_float.shape
-       if extended_layout == "tensor_core_tiled":
+       if isinstance(layout_type, TensorCoreTiledLayoutType):
Collaborator:

Is it possible to move this inside the TensorCoreTiledLayoutType constructor?

Contributor Author (jerryzh168):

Oh, this is padding the input though, so I'm not sure how we can do that in the constructor, but we can move the implementation to TensorCoreTiledLayoutType, I think.

@vayuda (Collaborator) commented Jul 12, 2024:

Yeah, I think it would be a bit cleaner if all layout-specific code were kept within the layout class. Otherwise, this PR looks good.
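A compact sketch of that suggestion (illustrative only; the real padding body appears in the diff hunks further down this thread):

```python
from dataclasses import dataclass
import torch

@dataclass(frozen=True)
class LayoutType:
    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        return input  # layouts that need no padding keep the no-op default

@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8

    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        # layout-specific padding lives on the layout type instead of in an
        # isinstance branch at the call site (body elided in this sketch)
        return input

def from_float(input_float: torch.Tensor, layout_type: LayoutType) -> torch.Tensor:
    return layout_type.pad_input(input_float)  # call site stays layout-agnostic
```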


aten = torch.ops.aten

@dataclass(frozen=True)
class PlainLayoutType(LayoutType):
@msaroufim (Member) commented Jul 15, 2024:

Add a comment, or raise an error, saying that this shouldn't be instantiated directly.

Contributor Author (jerryzh168):

This can be instantiated, I think. Are you talking about LayoutType?

@msaroufim (Member) commented Jul 15, 2024:

I see. I guess I'm a bit thrown off because a data class's primary goal is to store data, whereas this class stores nothing and is really just a name.

Member:

I would have instead done an enum like this

from enum import Enum
class Operations(Enum):
    ADD = (1,)
    SUBTRACT = (2,)
    MULTIPLY = (3,)
    DIVIDE = (4, 'precision') 

Enums are also classes, so you can override __init__ and define a function that only applies to DIVIDE, for example.
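Expanding on that point (a hedged sketch of the enum suggestion, not code from this PR): the value tuples above are unpacked into __init__, so a member like DIVIDE can carry an extra argument and a method can special-case it:

```python
from enum import Enum

class Operations(Enum):
    ADD = (1,)
    SUBTRACT = (2,)
    MULTIPLY = (3,)
    DIVIDE = (4, "precision")

    def __init__(self, code, extra=None):
        # the value tuple is unpacked into __init__, so members can carry data
        self.code = code
        self.extra = extra

    def describe(self) -> str:
        if self is Operations.DIVIDE:
            return f"divide (extra arg: {self.extra})"
        return self.name.lower()

print(Operations.DIVIDE.describe())  # divide (extra arg: precision)
print(Operations.ADD.describe())     # add
```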

Contributor Author (jerryzh168):

Sorry, I don't follow the DIVIDE part, can you elaborate a bit? Is this about how to support TensorCoreTiledLayoutType, which has an inner_k_tiles argument?


    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        orig_out_features, orig_in_features = input.shape
        in_features = find_multiple(orig_in_features, 1024)
Member:

Where are the in and out numbers coming from? I thought constants like this were a function of the dtype as well.

@jerryzh168 (Contributor, Author) commented Jul 15, 2024:

This comes from the tinygemm kernel, I think; this layout only applies to the uint4 dtype.
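Piecing the quoted hunks together, the padding helper presumably looks roughly like the sketch below. The 1024 multiple for in_features and the pad argument tuple come from the hunks quoted in this thread; the multiple of 8 for out_features and the find_multiple body are assumptions:

```python
import torch
import torch.nn.functional as F

def find_multiple(n: int, k: int) -> int:
    # round n up to the next multiple of k
    return n if n % k == 0 else n + k - (n % k)

def pad_input(input: torch.Tensor) -> torch.Tensor:
    # pad a 2D weight so its shape matches what the tinygemm int4 kernel expects
    orig_out_features, orig_in_features = input.shape
    in_features = find_multiple(orig_in_features, 1024)
    out_features = find_multiple(orig_out_features, 8)  # assumed multiple
    return F.pad(
        input,
        (0, in_features - orig_in_features, 0, out_features - orig_out_features),
    )

print(pad_input(torch.randn(10, 1000)).shape)  # torch.Size([16, 1024])
```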

@msaroufim (Member) left a review comment:

I would recommend either using an enum or abstract classes to get the desired behavior.
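One way to read the abstract-class option (a sketch under the assumption that the goal is to stop LayoutType itself from being instantiated; the concrete subclass name here is hypothetical):

```python
from abc import ABC, abstractmethod
import torch

class LayoutType(ABC):
    @abstractmethod
    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        ...

class NoPadLayoutType(LayoutType):  # hypothetical concrete layout
    def pad_input(self, input: torch.Tensor) -> torch.Tensor:
        return input

# LayoutType()  # would raise TypeError: can't instantiate abstract class
print(NoPadLayoutType().pad_input(torch.zeros(2, 2)).shape)  # torch.Size([2, 2])
```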


@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8
Contributor Author (jerryzh168):

@msaroufim see here: we have extra configurable arguments, so it's not just a name, and I'm not sure how an enum would work here.
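A small illustration of that point (sketch only): the frozen dataclass lets callers construct arbitrary configurations, each comparable and hashable, which a fixed set of enum members cannot express:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LayoutType:
    pass

@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8

a = TensorCoreTiledLayoutType(inner_k_tiles=8)
b = TensorCoreTiledLayoutType(inner_k_tiles=2)
print(a == b)  # False: different configurations compare unequal
print({a, b})  # frozen=True makes them hashable, so they work as dict/set keys
```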

@msaroufim self-requested a review on July 16, 2024 at 00:29
-        input_float,
-        (0, in_features - orig_in_features, 0, out_features - orig_out_features),
-    )
+    input_float = layout_type.pad_input(input_float)
@vayuda (Collaborator) commented Jul 16, 2024:

This is on the right track, but I think we can make it more generic to better serve future usage. LayoutTypes can implement two functions, pre_process() and post_process(). The new workflow would look like this:

input_float = layout.pre_process(input_float)
...
int_data = quantize_affine(...)
int_data = layout.post_process(int_data)
layout_tensor_ctr = get_layout_tensor_constructor(type(layout_type))
layout_tensor = layout_tensor_ctr(int_data, scale, zero_point, layout_type)

To motivate this: I would for sure use the post_process function while integrating my intx work. It lets me write just a layout_type instead of an entire layout class (I can reuse the plain layout). Additionally, if you look at #498, they call torch._cslt_compress(int_data), which would also be implemented as a post_process function. I think this also means they can reuse the plain layout, since I don't see them use any other arguments in the constructor (@jcaip, maybe you can confirm this).
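A toy, runnable sketch of the proposed hooks (not the merged implementation; quantize_affine and the layout tensor constructor are stubbed out so the control flow stands alone):

```python
from dataclasses import dataclass
import torch
import torch.nn.functional as F

@dataclass(frozen=True)
class LayoutType:
    def pre_process(self, input: torch.Tensor) -> torch.Tensor:
        return input  # default: no-op

    def post_process(self, input: torch.Tensor) -> torch.Tensor:
        return input  # default: no-op

@dataclass(frozen=True)
class TensorCoreTiledLayoutType(LayoutType):
    inner_k_tiles: int = 8

    def pre_process(self, input: torch.Tensor) -> torch.Tensor:
        # layout-specific preparation (e.g. padding) hangs off the layout type
        return F.pad(input, (0, -input.shape[-1] % 1024))

def quantize(input_float: torch.Tensor, layout_type: LayoutType) -> torch.Tensor:
    input_float = layout_type.pre_process(input_float)
    int_data = input_float.to(torch.int8)       # stand-in for quantize_affine(...)
    int_data = layout_type.post_process(int_data)
    return int_data                             # real code would build the layout tensor here

print(quantize(torch.randn(8, 1000), TensorCoreTiledLayoutType()).shape)  # torch.Size([8, 1024])
```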

Contributor Author (jerryzh168):

Makes sense. By reusing PlainLayoutType, you mean inheriting from it and overriding post_process, right?
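A minimal sketch of that reuse-by-inheritance idea (the subclass name is hypothetical; torch._cslt_compress is the call mentioned for #498 above, left as a comment so the sketch runs standalone):

```python
from dataclasses import dataclass
import torch

@dataclass(frozen=True)
class PlainLayoutType:
    def post_process(self, input: torch.Tensor) -> torch.Tensor:
        return input  # default: no-op

@dataclass(frozen=True)
class SparseCompressedLayoutType(PlainLayoutType):  # hypothetical name
    def post_process(self, input: torch.Tensor) -> torch.Tensor:
        # e.g. return torch._cslt_compress(input), as discussed for #498
        return input.contiguous()

print(SparseCompressedLayoutType().post_process(torch.zeros(4, 4)).shape)  # torch.Size([4, 4])
```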

@jerryzh168 merged commit aef7e09 into pytorch:main on Jul 16, 2024
13 checks passed
@jerryzh168 deleted the layout branch on July 16, 2024 at 21:55
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024
* Minimal android app build

* Improve script

* Detect physical device as well