
Re: rocBLAS on arm64 and ppc64el



Hi Mo,

On 2023-07-17 21:22, M. Zhou wrote:
> Can we patch the program to let it output something to screen in
> some time interval (like 5 minutes). For buildd this is a
> straightforward workaround for timeout.

I didn't know that was a possibility! Thanks for the heads up. I don't have a log file with the real command on hand, but the part that is timing out is basically:

clang -O3 --offload-arch=gfx803 --offload-arch=gfx900 --offload-arch=gfx906:xnack- --offload-arch=gfx908:xnack- --offload-arch=gfx90a:xnack+ --offload-arch=gfx90a:xnack- --offload-arch=gfx1010 --offload-arch=gfx1030 -c -o Kernels.o Kernels.cpp

The problem is that Kernels.cpp is an 80 MB source file that clang must compile serially, once for each of the eight offload architectures. The majority of the rocBLAS build time is spent compiling this single file. On a Zen 2 CPU, it takes roughly 15 minutes per offload architecture, so about two hours in total for this one file.

In my next rocBLAS upload, I will add the --verbose flag to the clang build options for this file. That way, clang will print a message when it starts compiling for each offload architecture. That should result in eight messages spaced roughly evenly throughout the build of Kernels.cpp, which might be enough to prevent a timeout.
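
If that turns out not to be enough, the periodic-output idea could probably also be done outside the compiler. Here is a minimal sketch, assuming a POSIX shell. The with_heartbeat helper is hypothetical (it is not part of rocBLAS or its packaging), and the 300 second interval in the usage note is just the 5 minutes suggested above:

```sh
#!/bin/sh
# Hypothetical helper, for illustration only: run a long command in the
# background and print a heartbeat line every INTERVAL seconds until it exits.
# Usage: with_heartbeat INTERVAL COMMAND [ARGS...]
with_heartbeat() {
    interval="$1"
    shift
    "$@" &
    cmd_pid=$!
    elapsed=0
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$cmd_pid" 2>/dev/null; do
        sleep 1
        elapsed=$((elapsed + 1))
        if [ $((elapsed % interval)) -eq 0 ]; then
            echo "still running after ${elapsed}s: $1"
        fi
    done
    # Propagate the exit status of the wrapped command.
    wait "$cmd_pid"
}
```

The clang invocation above would then be wrapped as "with_heartbeat 300 clang ... -c -o Kernels.o Kernels.cpp". Polling once per second keeps the wrapper from outliving the compiler by more than a second, and the wrapper returns clang's own exit status so a failed compile still fails the build.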

Sincerely,
Cory Bloor

