Re: Building llama.cpp for AMD GPU using only Debian packages?
Hi Jamie,
On 2025-01-30 17:39, Jamie Bainbridge wrote:
I am not sure this is actually needed anymore.
In the past I found Vulkan on RDNA1 (5600XT) was significantly slower
than ROCm, like half the text generation speed.
I was recently advised to try Vulkan again. I found RDNA1 (5600XT)
Vulkan runs the same speed as ROCm, and RDNA2 (6600XT) Vulkan runs
faster than ROCm by about 10%!
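For anyone who wants to reproduce that comparison, here is a rough sketch of how the two backends can be built and benchmarked side by side. The cmake flag names follow the current llama.cpp build documentation; the model path and gfx targets are placeholders for your own setup.

```shell
# Vulkan build of llama.cpp
cmake -B build-vulkan -DGGML_VULKAN=ON
cmake --build build-vulkan --config Release -j

# HIP/ROCm build (gfx1010 = RDNA1 5600 XT; gfx1032 = RDNA2 6600 XT)
cmake -B build-hip -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1010
cmake --build build-hip --config Release -j

# Run the same benchmark on each build, offloading all layers
# to the GPU (-ngl 99). llama-bench reports prompt-processing
# and token-generation throughput in tokens per second.
./build-vulkan/bin/llama-bench -m model.gguf -ngl 99
./build-hip/bin/llama-bench -m model.gguf -ngl 99
```

Comparing the tg (token generation) rows from the two runs is the number Jamie is describing above.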
The rocBLAS library depends heavily on tuned Tensile assembly kernels to
achieve good performance. What you're seeing on RDNA1 is the performance
of the rocBLAS library using what is basically the reference implementation.
On RDNA2, there are assembly kernels, but AFAIK there was never any
rocBLAS tuning done for llama-cpp workloads. It's likely that the
parameter space is not well-covered and Tensile is forced to select a
suboptimal assembly kernel. It's likely that the performance could be
significantly improved through tuning.
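For context, Tensile tuning is driven by a YAML problem description passed to the Tensile driver script. A very rough sketch of the workflow, purely illustrative (the config filename and sizes are placeholders, and the exact invocation can differ between ROCm releases):

```shell
# Illustrative sketch only: the general shape of a Tensile tuning run.
git clone https://github.com/ROCm/Tensile
cd Tensile
pip install -r requirements.txt

# The YAML config enumerates GEMM problem types and the exact
# problem sizes to benchmark; for llama.cpp, the relevant sizes
# come from the model's weight-matrix shapes and batch size.
Tensile/bin/Tensile my_llama_sizes.yaml tuning_output/
```

The tuning run benchmarks candidate kernels against each listed size and emits logic files that rocBLAS can then consume, which is why coverage of the parameter space matters so much.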
The Tensile library has a tremendous amount of technical debt and it's
not exactly easy to use. I've never done Tensile tuning before, but
three of my close friends from grad school were Tensile developers for a
few years [1], so I've asked for a favour. Benjamin Ulmer is going to
help tune rocBLAS for llama.cpp on RDNA1, though whether AMD upstream
accepts that tuning is an open question.
What models should we be tuning for?
On 2025-01-30 17:39, Jamie Bainbridge wrote:
AMD's marketing for RDNA3 (7900XTX) also uses Vulkan to spruik
performance gains over competing cards with CUDA. The XTX is even
officially supported in ROCm, so surely if ROCm was faster they'd use
that result instead.
I wouldn't read too much into that. It does suggest that the Vulkan
implementation was faster, but we don't know whether that result is
well-optimized. I suspect it's not.
Sincerely,
Cory Bloor
[1]: I'd hoped that with so many friends on the team, I'd be able to
have some influence on the technical direction of the library.
Unfortunately, that proved not to be the case. It was a bit of a life
lesson for me.