
[RFC] HIP backend for AMD GPU support #3231

Closed
iotamudelta opened this issue Jan 31, 2024 · 4 comments

Comments

@iotamudelta
Contributor

We would like to contribute a HIP backend to Faiss to support AMD GPUs. We have a working prototype that passes all unit tests on Navi hardware (6800XT, 7900XTX). The prototype is a statically hipified version of the existing CUDA backend with manual AMD-specific changes (build system, PTX intrinsics replaced with amdgcn builtins, ...).

Assuming you are interested in this work, how would we best go about upstreaming it?

Would a static HIP backend (in the faiss::hip namespace) be preferred? If not, what architecture would be preferable (e.g., overriding faiss::gpu)?

Unlike the CUDA backend, we ultimately need to support multiple warp sizes (wavefronts) at runtime: 32 for Navi and 64 for the MI series. There are some uses of kWarpSize that will not work out of the box (sizing shared memory, some of the replacements in CMake, static uses at dispatch sites, ...). We will need guidance on how such support would best be architected and integrated.

Lastly, we have done only minor performance analysis and tuning on the current prototype. Are there public benchmarks or particular protocols you have used for the other backends that we should use as a reference? So far we have used the GPU benchmark scripts with the SIFT data sets to assess performance.

As part of this, we are currently using a GPU_MAX_SELECTION_K of 1024. What would a recommended protocol look like for deciding between 1024 and 2048? Ideally we would like to use a single value independent of the HW/SW generation.
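One conceivable protocol (a sketch under our own assumptions, not an established Faiss procedure) is to benchmark top-k selection at both candidate limits over the k values an application actually uses, and keep the smaller limit unless the larger one is required and not materially slower. The CPU/NumPy stand-in below only illustrates the shape of the measurement; a real run would use the Faiss GPU benchmark scripts on the target hardware:

```python
# Hypothetical measurement harness: time a two-stage top-k selection
# (argpartition, then sort of the partitioned slice) at the candidate
# k limits. Sizes and data are illustrative placeholders.
import time
import numpy as np

def time_topk(n_queries, n_db, k, repeats=3):
    """Return the best-of-N wall time for selecting the k smallest
    distances per query from a random (n_queries, n_db) matrix."""
    rng = np.random.default_rng(0)
    dists = rng.standard_normal((n_queries, n_db)).astype(np.float32)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        idx = np.argpartition(dists, k - 1, axis=1)[:, :k]
        part = np.take_along_axis(dists, idx, axis=1)
        order = np.argsort(part, axis=1)
        np.take_along_axis(idx, order, axis=1)
        best = min(best, time.perf_counter() - t0)
    return best

if __name__ == "__main__":
    for k in (1024, 2048):
        t = time_topk(64, 100_000, k)
        print(f"k={k}: {t * 1e3:.1f} ms")
```

If the k=2048 limit costs little over k=1024 on both Navi and MI parts, a single limit of 2048 would be defensible; otherwise the split would need to stay hardware-dependent.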

To browse the current prototype as a reference, please see https://github.com/iotamudelta/faiss/tree/wf32.

@mdouze
Contributor

mdouze commented Feb 1, 2024

FYI there is an open (but inactive) pull request for AMD:
#3126
Let's discuss how we can coordinate this effort.

@mdouze mdouze added the GPU label Feb 1, 2024
@iotamudelta
Contributor Author

@mdouze Thanks for the pointer! I see your concerns in that PR; we'll discuss internally how to address them.

@iotamudelta
Contributor Author

@mdouze We had some further discussions internally. @ItsPitt and I can maintain an AMD/HIP backend for now. AMD can contribute two servers for CI, with the same setup as the PyTorch CI.

I'd recommend closing #3126 as overcome by events. With that in mind, what's the preferred strategy to make this all happen? I assume we'll want multiple steps.

The current state of the AMD/HIP port is in https://github.com/iotamudelta/faiss/tree/rocm_support

@iotamudelta
Contributor Author

Solved: ROCm support is now integrated at the tip of tree.
