
Bug#1064257: ITP: rocm-tensile -- ROCm tool for generating and benchmarking assembly kernels



Package: wnpp
Severity: wishlist
Owner: Cordell Bloor <cgmb@slerp.xyz>
X-Debbugs-Cc: debian-devel@lists.debian.org, cgmb@slerp.xyz, debian-ai@lists.debian.org

* Package name    : rocm-tensile
  Version         : 6.0.2
* URL             : https://github.com/ROCm/Tensile
* License         : Expat
  Programming Lang: Python, HIP
  Description     : ROCm tool for generating and benchmarking assembly kernels

 Tensile is a set of tools and libraries primarily for selecting
 parameters of GPU kernels implementing the general matrix multiply
 (GEMM) operation. Tensile consists of three components:
 .
  1. A command-line tool for generating kernels, benchmarking them, and
     saving the parameters used for generating the best kernels (a.k.a.
     "solutions") in YAML files.
  2. A build system component that reads YAML solution files, generates
     kernel source files, and invokes the compiler to turn them into code
     object files. The kernels are indexed by their parameters in either
     YAML or MessagePack format within a TensileLibrary file.
  3. A runtime library for loading and executing the best available
     solution for a given set of GEMM input parameters (a.k.a. "a problem").
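
To make the three roles above concrete, here is a minimal toy sketch in
Python. It is not Tensile's code or API: the names (Problem, best_solutions,
save_library, load_library, select), the kernel names, and the timings are
all hypothetical. JSON stands in for YAML/MessagePack to keep the example
within the standard library, and the nearest-size fallback is an assumption
made for illustration, not a description of Tensile's actual selection logic.

```python
import json
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """GEMM sizes: C (m x n) = A (m x k) * B (k x n)."""
    m: int
    n: int
    k: int


def best_solutions(timings):
    """Role 1: keep the fastest solution name for each benchmarked problem.

    `timings` maps Problem -> {solution name: measured seconds}.
    """
    return {p: min(t, key=t.get) for p, t in timings.items()}


def save_library(best):
    """Role 2: serialize the problem -> solution index (JSON here, not YAML)."""
    return json.dumps(
        [{"m": p.m, "n": p.n, "k": p.k, "solution": s} for p, s in best.items()]
    )


def load_library(text):
    """Rebuild the in-memory index from its serialized form."""
    return {Problem(e["m"], e["n"], e["k"]): e["solution"] for e in json.loads(text)}


def select(library, problem):
    """Role 3: exact match if benchmarked, else the nearest size in log space.

    The nearest-size rule is an illustrative assumption only.
    """
    if problem in library:
        return library[problem]

    def distance(p):
        return sum(
            (math.log2(a) - math.log2(b)) ** 2
            for a, b in ((p.m, problem.m), (p.n, problem.n), (p.k, problem.k))
        )

    return library[min(library, key=distance)]


# Made-up timings for two hypothetical kernel variants.
timings = {
    Problem(64, 64, 64): {"tile_32x32": 1.0e-5, "tile_128x128": 3.0e-5},
    Problem(4096, 4096, 4096): {"tile_32x32": 9.0e-2, "tile_128x128": 4.0e-2},
}
library = load_library(save_library(best_solutions(timings)))
print(select(library, Problem(64, 64, 64)))
print(select(library, Problem(3000, 3000, 3000)))
```

The point of the sketch is only the division of labour: benchmarking
decides which parameters win, the library file records that decision, and
the runtime looks it up for the problem at hand.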

The rocm-tensile library sources are currently packaged as part of
rocblas in a multi-upstream tarball package, but they should be split
out so that the command-line tool can be packaged. Tensile kernels are a
vital part of the performance of the rocblas library. It is often
necessary to add tuned kernels for particular problem sizes to achieve
optimal performance in a new application or on a new hardware
architecture. This is therefore an important development tool for BLAS
performance on AMD GPUs.

A fork of the Tensile library is also used by hipblaslt. Splitting
Tensile out from the rocblas package may be helpful in preventing the
duplication of embedded copies. The Tensile library can also be used by
MIOpen.

This package is part of AMD's ROCm stack and will be maintained under
the Debian AI team umbrella.

