Multi-GPU support #4
@Lissanro wouldn't it be killed by PCIe latency?
I think PCIe latency is only relevant during training (and even then it can be quite low if PCIe 4.0 or 5.0 with a sufficient number of lanes is used, or NVLink in the case of a pair of 3090 cards). For inference, PCIe latency should not matter much: the experts do their work independently once they are fully loaded into VRAM. This is, for example, how running Mixtral (8x7B MoE) at 4-bit or higher quantization is possible with 24GB cards - since it cannot fit in the 24GB of a single card, it gets split across more than one GPU, and the speed is comparable to running on a single GPU. It could potentially be even faster if parallelism across multiple GPUs is implemented (for the case where one expert is fully allocated on one GPU, another expert on a different GPU, and the gate network decides it needs both). In any case, even a naive sequential implementation (processing experts one by one even if they are on different GPUs) is still better than crashing with OOM, and in terms of speed it should be at least comparable to running on a single GPU with more VRAM.
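To make the "naive sequential" idea above concrete, here is a minimal PyTorch sketch, not this repo's API: the module and variable names (`SimpleExpert`, `gate`, `experts`) are made up for illustration. Each expert sits on its own device, and only the routed activations cross PCIe.

```python
import torch
import torch.nn as nn

class SimpleExpert(nn.Module):
    """Toy feed-forward expert, standing in for a full expert model."""
    def __init__(self, dim: int):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        return self.ff(x)

dim = 64
# Fall back to CPU so the sketch also runs on a machine without two GPUs.
devices = ["cuda:0", "cuda:1"] if torch.cuda.device_count() > 1 else ["cpu", "cpu"]

# Place each expert on its own device so the whole model never has to fit on one GPU.
experts = [SimpleExpert(dim).to(dev) for dev in devices]
gate = nn.Linear(dim, len(experts)).to(devices[0])

with torch.no_grad():
    x = torch.randn(1, 16, dim, device=devices[0])
    weights = gate(x).softmax(dim=-1)   # routing weights, computed on the first device
    top = weights.argmax(dim=-1)        # top-1 routing: one expert per token

    out = torch.zeros_like(x)
    for i, (expert, dev) in enumerate(zip(experts, devices)):
        mask = top == i
        if mask.any():
            # Move only the routed tokens to the expert's device, run it there,
            # then bring the result back; PCIe traffic is just these activations.
            y = expert(x[mask].to(dev))
            out[mask] = y.to(devices[0])
```

The experts run one after another here, so throughput is roughly that of a single GPU; the gain is that models too large for one card no longer OOM.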
Thanks for the suggestion, we are working on optimizing the memory usage, but feel free to create a PR for Multi-GPU usage.
@Warlord-K Hi Admin, would it be possible for the homepage README file to state the GPU requirements or specifications?
@g29times I have added the GPU requirements, thanks for the suggestion! |
Since multiple experts can take a lot of VRAM, especially for SDXL, it would be useful to have a way to choose which experts to load onto which GPU (since GPUs can each have a different amount of VRAM).
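As a rough sketch of what such an option could look like (the names `expert_device_map` and `place_experts` below are hypothetical, not an existing option in this project): a user-supplied map from expert index to device, with a fallback that picks the GPU with the most free VRAM.

```python
import torch

def device_with_most_free_vram() -> str:
    """Return the CUDA device with the most free memory right now."""
    free = [torch.cuda.mem_get_info(i)[0] for i in range(torch.cuda.device_count())]
    return f"cuda:{free.index(max(free))}"

def place_experts(experts, expert_device_map=None):
    """Move each expert module to its assigned device, or to the emptiest GPU."""
    for idx, expert in enumerate(experts):
        device = (expert_device_map or {}).get(idx) or device_with_most_free_vram()
        expert.to(device)
    return experts

# Example: pin two large SDXL experts to the 24GB card and the rest to a smaller one.
# experts = place_experts(experts, {0: "cuda:0", 1: "cuda:0", 2: "cuda:1", 3: "cuda:1"})
```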