
Re: First experiments with gfx1100/gfx1101/gfx1102



Hi Christian,

On 2024-03-11 01:36, Christian Kastner wrote:
> I ran the first experiments with gfx1100 (via W7800), gfx1101 (via
> [W7700]), gfx1102 (via W7500) over the last few days.
>
> The good
> ========
>
> gfx1100 seems to work pretty well. If you look at the failing tests in
> unstable [1], we have
>   * rocprim: fixed in experimental
>   * rocfft: fixed in experimental
>   * rocsolver, hipsolver: substantial failures though other arches pass.
>     perhaps these would benefit from an update to 5.7? (These are two of
>     the few libraries that are < 5.7)
>   * hipcub: fails but so does everything else non-gfx1030
>   * rocsparse, hipsparse: fails everywhere
>   * rocthrust: fails everywhere
>   * rocblas, hipblas: interesting one. Pass everywhere else except for
>     gfx1035 on unstable (testing still fine)

Note that a few of the problems described as "fails everywhere" are relatively small issues, while others are more serious.

With respect to rocBLAS, the gfx1100 architecture was fully enabled upstream in all these libraries in ROCm 5.4. The performance should be better in later releases, but I'm pretty sure rocBLAS should be functionally correct when built for gfx1100 in ROCm 5.5. I think we should try to compare these results against the upstream ROCm stack on Ubuntu 22.04. Unfortunately, the upstream project does not ship the tests as binaries, but you can build them without too much trouble.

To test with the same userspace as upstream, you can build and run the tests in Docker:

# On the host, as root: start a container with access to the GPU device nodes.
docker run -it --device=/dev/dri --device=/dev/kfd --security-opt seccomp=unconfined ubuntu:22.04

# Inside the container: install build tools and the ROCm 5.5.1 packages.
cd ~
apt-get -y update
apt-get -y upgrade
apt-get -y install wget cmake git
wget https://repo.radeon.com/amdgpu-install/5.5.1/ubuntu/jammy/amdgpu-install_5.5.50501-1_all.deb
apt-get install -y ./amdgpu-install_5.5.50501-1_all.deb
apt-get -y update
apt-get -y install rocm-dev

# Build rocBLAS 5.5.1 with its test clients for the three targets and run the tests.
git clone https://github.com/ROCm/rocBLAS.git
cd rocBLAS
git checkout rocm-5.5.1
./install.sh -cda "gfx1100;gfx1101;gfx1102"
build/release/clients/staging/rocblas-test --gtest_filter='-*known_bug*'

# Build rocSOLVER 5.5.1 against the rocBLAS built above and run its tests.
cd ~
git clone https://github.com/ROCm/rocSOLVER.git
cd rocSOLVER
git checkout rocm-5.5.1
rocblas_DIR=$HOME/rocBLAS/build/release/rocblas-install ./install.sh -cda "gfx1100;gfx1101;gfx1102"
build/release/clients/staging/rocsolver-test
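
To compare a run like this with the Debian results, one possible approach is to save each test log and diff the names of the failing tests. The following is a minimal sketch, assuming the test binaries print GoogleTest's usual "[  FAILED  ]" lines. The helper names and log file names are hypothetical and not part of ROCm.

```shell
# Sketch only: helper names are illustrative, not part of ROCm or GoogleTest.

# Print the sorted, unique names of failed tests found in a GoogleTest log.
# Assumes lines of the form "[  FAILED  ] Suite.Test ..." (an assumption about
# the log format). Requiring a dot in the name skips the summary line
# "[  FAILED  ] N tests, listed below:".
failed_tests() {
    sed -n 's/^\[  FAILED  ] \([^ ,]*\.[^ ,]*\).*$/\1/p' "$1" | sort -u
}

# Print the tests that fail in the first log but not in the second.
only_failing_in() {
    a=$(mktemp) && b=$(mktemp) || return 1
    failed_tests "$1" > "$a"
    failed_tests "$2" > "$b"
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

For example, after capturing each run with something like "rocblas-test ... 2>&1 | tee upstream.log", calling "only_failing_in debian.log upstream.log" would list the failures that appear only in the Debian build.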

If you're doing the above on bare metal (rather than in Docker), you can also add "apt-get -y install amdgpu-dkms" to use the kernel driver that upstream was testing against.
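
Either way, it may be worth confirming which GPU targets the runtime actually sees before building. A minimal sketch, assuming rocminfo is installed (the upstream packages place it under /opt/rocm/bin) and that it reports agent names such as gfx1100. The helper name is hypothetical.

```shell
# Hypothetical helper: read rocminfo-style output on stdin and print the
# unique gfx target names that appear in it.
list_gfx_targets() {
    grep -o 'gfx[0-9a-f]\{3,\}' | sort -u
}

# Example use (needs a working ROCm install, so it is left commented out):
#   /opt/rocm/bin/rocminfo | list_gfx_targets
```

If the target you built for is missing from the output, the container most likely lacks access to /dev/kfd or /dev/dri.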

Sincerely,
Cory Bloor

