
Bug#1068199: librocfft0: callback test failures on gfx900 and gfx1030



On 2024-04-02 00:35, Cordell Bloor wrote:
> I tried to reproduce the rocfft callback bug with a W6800 (gfx1030). I
> used a Debian Unstable docker container on an Ubuntu Noble host, but the
> tests all passed. This made me realize that the test failure pattern on
> the CI is that all the qemu-based workers are failing and all the
> podman-based workers are passing.
> 
> This issue seems to be somehow related to the qemu+rocm autopkgtest
> environment.

The cause is already visible with AMD_LOG_LEVEL=1. PCIe atomics are not
enabled in the guest, so hostcall is unsupported and the dispatch fails:

> half epsilon: 0.000977	single epsilon: 3.75e-05	double epsilon: 1e-15
> Random seed: 1392424582
> rocFFT version: 1.0.23.
> Note: Google Test filter = rocfft_UnitTest.default_load_callback_complex_single
> [==========] Running 1 test from 1 test suite.
> [----------] Global test environment set-up.
> [----------] 1 test from rocfft_UnitTest
> [ RUN      ] rocfft_UnitTest.default_load_callback_complex_single
> :1:rocvirtual.cpp           :2949: 1796815625 us: [pid:1917  tid:0x7f4a2102c980] Pcie atomics not enabled, hostcall not supported
> :1:rocvirtual.cpp           :3289: 1796816120 us: [pid:1917  tid:0x7f4a2102c980] AQL dispatch failed
> clients/tests/default_callbacks_test.cpp:280: Failure
> Expected equality of these values:
>   rocfft_execute(plan, &in_ptr, &out_ptr, info)
>     Which is: 1
>   rocfft_status_success
>     Which is: 0
> 
> clients/tests/default_callbacks_test.cpp:310: Failure
> Expected: (diff.l_inf) < (type_epsilon<Tbound>()), actual: 32.230823516845703 vs 3.75e-05
> 
> [  FAILED  ] rocfft_UnitTest.default_load_callback_complex_single (907 ms)
> [----------] 1 test from rocfft_UnitTest (908 ms total)

(I did not check all 130 failures, so strictly speaking there could be
additional causes, too.)
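
To check the remaining failures without reading each log by hand, something
like the following sketch might help. It assumes the runtime log lines follow
the layout seen in the output above and that the error text is exactly the
message quoted there; the helper name missing_atomics_pids is made up for
this illustration and is not part of any ROCm tooling.

```python
import re

# Layout of a ROCm runtime log line as it appears in the output above:
#   :<level>:<file> :<line>: <timestamp> us: [pid:<pid>  tid:<tid>] <message>
LOG_LINE = re.compile(
    r"^:(?P<level>\d):(?P<file>\S+)\s*:(?P<line>\d+)\s*: "
    r"(?P<ts>\d+) us: \[pid:(?P<pid>\d+)\s+tid:(?P<tid>0x[0-9a-f]+)\] "
    r"(?P<msg>.*)$"
)

# Message text copied from the failing run quoted above.
ATOMICS_MSG = "Pcie atomics not enabled, hostcall not supported"


def missing_atomics_pids(log_text):
    """Return the set of PIDs that logged the missing-PCIe-atomics error."""
    pids = set()
    for line in log_text.splitlines():
        if line.startswith("> "):  # tolerate mail-style quoting
            line = line[2:]
        match = LOG_LINE.match(line)
        if match and ATOMICS_MSG in match.group("msg"):
            pids.add(int(match.group("pid")))
    return pids
```

A failing test whose process never logs this message would be a candidate for
a different root cause.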

In an older ROCm ticket, a workaround to enable PCIe atomics in the
guest was discussed [1], but I never got this to work. The relevant bit
is not set after invoking setpci.
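
For anyone who wants to verify the state of that bit, here is a minimal
sketch. It assumes the bit in question is the AtomicOp Requester Enable flag
in the PCIe Device Control 2 register, which lspci -vv prints as
"AtomicOpsCtl: ReqEn+" or "ReqEn-". The function name is hypothetical, and the
text passed in is expected to be lspci output captured separately (as root,
since unprivileged lspci does not show the capability registers).

```python
import re

# lspci -vv prints the Device Control 2 flags on a line such as
#   AtomicOpsCtl: ReqEn+
# where "+" means the flag is set and "-" means it is clear.
_REQ_EN = re.compile(r"AtomicOpsCtl:.*?\bReqEn([+-])")


def atomics_requester_enabled(lspci_text):
    """Return True/False for ReqEn+/ReqEn-, or None if the field is absent."""
    match = _REQ_EN.search(lspci_text)
    if match is None:
        return None
    return match.group(1) == "+"
```

Running this on the output for the GPU before and after the setpci call would
show whether the write had any effect inside the guest.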

I don't know how best to address this. A workaround would be to skip
these tests when they run inside a guest VM, but that would reduce
coverage. However, switching everything to podman would reduce coverage
even more if we only use the latest kernel.
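
As a rough sketch of the skip workaround: the snippet below builds a Google
Test negative filter when atomics are unavailable. It assumes that every
affected test has "callback" in its name, which I have only confirmed for the
one test shown above, and both function names are invented for this example.
How the autopkgtest would determine the in_vm_without_atomics flag is left
open here.

```python
from fnmatch import fnmatchcase

# Assumption: all affected tests contain "callback" in their name.
CALLBACK_PATTERN = "*callback*"


def gtest_filter_args(in_vm_without_atomics):
    """Return extra test-runner arguments that exclude the callback tests."""
    if not in_vm_without_atomics:
        return []
    # A leading "-" in --gtest_filter turns the pattern into an exclusion.
    return ["--gtest_filter=-" + CALLBACK_PATTERN]


def would_skip(test_name):
    """Preview whether a full test name matches the exclusion pattern."""
    return fnmatchcase(test_name, CALLBACK_PATTERN)
```

The returned list can be appended to the test binary's command line, so the
podman-based workers keep running the full suite.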

Best,
Christian

PS: Full AMD_LOG_LEVEL=4 log attached, for reference.

[1] https://github.com/ROCm/ROCK-Kernel-Driver/issues/26#issuecomment-313857180
half epsilon: 0.000977	single epsilon: 3.75e-05	double epsilon: 1e-15
Random seed: 3631874771
rocFFT version: 1.0.23.
Note: Google Test filter = rocfft_UnitTest.default_load_callback_complex_single
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from rocfft_UnitTest
[ RUN      ] rocfft_UnitTest.default_load_callback_complex_single
:3:rocdevice.cpp            :442 : 1761239558 us: [pid:1890  tid:0x7f13ab583980] Initializing HSA stack.
:3:rocdevice.cpp            :208 : 1761720693 us: [pid:1890  tid:0x7f13ab583980] Numa selects cpu agent[0]=0x563d5daf67a0(fine=0x563d4cdd0020,coarse=0x563d5db0c3b0) for gpu agent=0x563d5db36800 CPU<->GPU XGMI=0
:3:rocdevice.cpp            :1680: 1761721192 us: [pid:1890  tid:0x7f13ab583980] Gfx Major/Minor/Stepping: 10/3/0
:3:rocdevice.cpp            :1682: 1761722377 us: [pid:1890  tid:0x7f13ab583980] HMM support: 0, XNACK: 0, Direct host access: 0
:3:rocdevice.cpp            :1684: 1761722570 us: [pid:1890  tid:0x7f13ab583980] Max SDMA Read Mask: 0x0, Max SDMA Write Mask: 0x0
:4:rocdevice.cpp            :2063: 1761722742 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa host memory 0x7f13a9dfc000, size 0x38
:4:rocdevice.cpp            :2063: 1761723175 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa host memory 0x7f1295700000, size 0x101000
:4:rocdevice.cpp            :2063: 1761723636 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa host memory 0x7f1295500000, size 0x101000
:4:runtime.cpp              :83  : 1761723779 us: [pid:1890  tid:0x7f13ab583980] init
:3:hip_context.cpp          :48  : 1761723839 us: [pid:1890  tid:0x7f13ab583980] Direct Dispatch: 1
:3:hip_memory.cpp           :1302: 1761724035 us: [pid:1890  tid:0x7f13ab583980]  hipMemcpyFromSymbol ( 0x563d4bfa06c8, 0x7fff49c710f8, 8, 0, hipMemcpyDeviceToHost ) 
:3:devprogram.cpp           :2681: 1761725089 us: [pid:1890  tid:0x7f13ab583980] Using Code Object V4.
:3:rocdevice.cpp            :2230: 1761730456 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3fefffff8
:3:rocdevice.cpp            :2732: 1761730568 us: [pid:1890  tid:0x7f13ab583980] number of allocated hardware queues with low priority: 0, with normal priority: 0, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp            :2810: 1761734314 us: [pid:1890  tid:0x7f13ab583980] created hardware queue 0x7f13a9d7c000 with size 16384 with priority 1, cooperative: 0
:3:rocdevice.cpp            :2902: 1761734476 us: [pid:1890  tid:0x7f13ab583980] acquireQueue refCount: 0x7f13a9d7c000 (1)
:4:rocdevice.cpp            :2063: 1761734816 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa host memory 0x7f1295100000, size 0x100000
:3:devprogram.cpp           :2684: 1761913902 us: [pid:1890  tid:0x7f13ab583980] Using Code Object V5.
:4:command.cpp              :349 : 1761916286 us: [pid:1890  tid:0x7f13ab583980] Command (CopyDeviceToHost) enqueued: 0x563d5a3fc130
:4:rocblit.cpp              :823 : 1761916988 us: [pid:1890  tid:0x7f13ab583980] HSA Async Copy staged D2H dst=0x7f1295500000, src=0x7f13a9deb4d8, size=8, completion_signal=0x7f13ab57e800
:4:command.cpp              :289 : 1761917330 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d4cdd1330
:4:command.cpp              :349 : 1761917378 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5dc5da20
:4:command.cpp              :179 : 1761917476 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5a3fc130 complete
:4:command.cpp              :173 : 1761917524 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5dc5da20 complete (Wall: 1761917523, CPU: 0, GPU: 0 us)
:4:command.cpp              :253 : 1761917619 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5a3fc130 to complete, current status 0
:4:command.cpp              :268 : 1761917714 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5a3fc130 wait completed
:3:hip_memory.cpp           :1303: 1761917764 us: [pid:1890  tid:0x7f13ab583980] hipMemcpyFromSymbol: Returned hipSuccess : : duration: 193729 us
:3:hip_device_runtime.cpp   :531 : 1761919964 us: [pid:1890  tid:0x7f13ab583980]  hipGetDevice ( 0x7fff49c706bc ) 
:3:hip_device_runtime.cpp   :539 : 1761920064 us: [pid:1890  tid:0x7f13ab583980] hipGetDevice: Returned hipSuccess : 
:3:hip_device.cpp           :381 : 1761920128 us: [pid:1890  tid:0x7f13ab583980]  hipGetDeviceProperties ( 0x7fff49c706c0, 0 ) 
:3:hip_device.cpp           :383 : 1761920230 us: [pid:1890  tid:0x7f13ab583980] hipGetDeviceProperties: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :531 : 1761920342 us: [pid:1890  tid:0x7f13ab583980]  hipGetDevice ( 0x7fff49c702ac ) 
:3:hip_device_runtime.cpp   :539 : 1761920436 us: [pid:1890  tid:0x7f13ab583980] hipGetDevice: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :531 : 1761920569 us: [pid:1890  tid:0x7f13ab583980]  hipGetDevice ( 0x7fff49c702ac ) 
:3:hip_device_runtime.cpp   :539 : 1761920671 us: [pid:1890  tid:0x7f13ab583980] hipGetDevice: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :561 : 1761920661 us: [pid:1890  tid:0x7f1294e006c0]  hipSetDevice ( 0 ) 
:3:hip_device_runtime.cpp   :565 : 1761920899 us: [pid:1890  tid:0x7f1294e006c0] hipSetDevice: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :561 : 1761920836 us: [pid:1890  tid:0x7f128ca006c0]  hipSetDevice ( 0 ) 
:3:hip_device_runtime.cpp   :565 : 1761921056 us: [pid:1890  tid:0x7f128ca006c0] hipSetDevice: Returned hipSuccess : 
:3:hiprtc.cpp               :69  : 1761924866 us: [pid:1890  tid:0x7f128ca006c0] hiprtcCreateProgram ( 0x7f128c9fe918, #define ROCFFT_CALLBACKS_ENABLED

// Copyright (C) 2021 - 2023 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.

#ifndef ROCFFT_COMPLEX_H
#define ROCFFT_COMPLEX_H

#if !defined(__HIPCC_RTC__)
#endif

#ifdef __HIP_PLATFORM_NVIDIA__
typedef __half _Float16;
#endif

template <typename Treal>
struct rocfft_complex
{

    Treal x; // Real part
    Treal y; // Imaginary part

    // Constructors
    // Do not initialize the members x or y by default, to ensure that it can
    // be used in __shared__ and that it is a trivial class compatible with C.
    __device__ __host__ rocfft_complex()                      = default;
    __device__ __host__ rocfft_complex(const rocfft_complex&) = default;
    __device__ __host__ rocfft_complex(rocfft_complex&&)      = default;
    __device__ __host__ rocfft_complex& operator=(const rocfft_complex& rhs) & = default;
    __device__ __host__ rocfft_complex& operator=(rocfft_complex&& rhs) & = default;
    __device__                          __host__ ~rocfft_complex()        = default;

    // Constructor from real and imaginary parts
    __device__ __host__ constexpr rocfft_complex(Treal real, Treal imag)
        : x{real}
        , y{imag}
    {
    }

    // Conversion from different precision
    template <typename U>
    __device__ __host__ explicit constexpr rocfft_complex(const rocfft_complex<U>& z)
        : x(z.x)
        , y(z.y)
    {
    }

    // Accessors
    __device__ __host__ constexpr Treal real() const
    {
        return x;
    }

    __device__ __host__ constexpr Treal imag() const
    {
        return y;
    }

    // Unary operations
    __forceinline__ __device__ __host__ rocfft_complex operator-() const
    {
        return {-x, -y};
    }

    __forceinline__ __device__ __host__ rocfft_complex operator+() const
    {
        return *this;
    }

    __device__ __host__ Treal asum(const rocfft_complex& z)
    {
        return abs(z.x) + abs(z.y);
    }

    // Internal real functions
    static __forceinline__ __device__ __host__ Treal abs(Treal x)
    {
        return x < 0 ? -x : x;
    }

    static __forceinline__ __device__ __host__ float sqrt(float x)
    {
        return ::sqrtf(x);
    }

    static __forceinline__ __device__ __host__ double sqrt(double x)
    {
        return ::sqrt(x);
    }

    // Addition operators
    __device__ __host__ auto& operator+=(const rocfft_complex& rhs)
    {
        return *this = {x + rhs.x, y + rhs.y};
    }

    __device__ __host__ auto operator+(const rocfft_complex& rhs) const
    {
        auto lhs = *this;
        return lhs += rhs;
    }

    // Subtraction operators
    __device__ __host__ auto& operator-=(const rocfft_complex& rhs)
    {
        return *this = {x - rhs.x, y - rhs.y};
    }

    __device__ __host__ auto operator-(const rocfft_complex& rhs) const
    {
        auto lhs = *this;
        return lhs -= rhs;
    }

    // Multiplication operators
    __device__ __host__ auto& operator*=(const rocfft_complex& rh
:3:hiprtc.cpp               :110 : 1761926814 us: [pid:1890  tid:0x7f128ca006c0] hiprtcCreateProgram: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :114 : 1761926889 us: [pid:1890  tid:0x7f128ca006c0] hiprtcCompileProgram ( 0x7f12840250f0, 3, 0x7f12840f8d80 )
:3:hiprtcInternal.cpp       :233 : 1761927003 us: [pid:1890  tid:0x7f128ca006c0] --gpu-architecture is nvcc option, transforming it to --offload-arch option
:3:hiprtc.cpp               :69  : 1761961673 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcCreateProgram ( 0x7ffc0a8d1798, #define ROCFFT_CALLBACKS_ENABLED

// Copyright (C) 2021 - 2023 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.

#ifndef ROCFFT_COMPLEX_H
#define ROCFFT_COMPLEX_H

#if !defined(__HIPCC_RTC__)
#endif

#ifdef __HIP_PLATFORM_NVIDIA__
typedef __half _Float16;
#endif

template <typename Treal>
struct rocfft_complex
{

    Treal x; // Real part
    Treal y; // Imaginary part

    // Constructors
    // Do not initialize the members x or y by default, to ensure that it can
    // be used in __shared__ and that it is a trivial class compatible with C.
    __device__ __host__ rocfft_complex()                      = default;
    __device__ __host__ rocfft_complex(const rocfft_complex&) = default;
    __device__ __host__ rocfft_complex(rocfft_complex&&)      = default;
    __device__ __host__ rocfft_complex& operator=(const rocfft_complex& rhs) & = default;
    __device__ __host__ rocfft_complex& operator=(rocfft_complex&& rhs) & = default;
    __device__                          __host__ ~rocfft_complex()        = default;

    // Constructor from real and imaginary parts
    __device__ __host__ constexpr rocfft_complex(Treal real, Treal imag)
        : x{real}
        , y{imag}
    {
    }

    // Conversion from different precision
    template <typename U>
    __device__ __host__ explicit constexpr rocfft_complex(const rocfft_complex<U>& z)
        : x(z.x)
        , y(z.y)
    {
    }

    // Accessors
    __device__ __host__ constexpr Treal real() const
    {
        return x;
    }

    __device__ __host__ constexpr Treal imag() const
    {
        return y;
    }

    // Unary operations
    __forceinline__ __device__ __host__ rocfft_complex operator-() const
    {
        return {-x, -y};
    }

    __forceinline__ __device__ __host__ rocfft_complex operator+() const
    {
        return *this;
    }

    __device__ __host__ Treal asum(const rocfft_complex& z)
    {
        return abs(z.x) + abs(z.y);
    }

    // Internal real functions
    static __forceinline__ __device__ __host__ Treal abs(Treal x)
    {
        return x < 0 ? -x : x;
    }

    static __forceinline__ __device__ __host__ float sqrt(float x)
    {
        return ::sqrtf(x);
    }

    static __forceinline__ __device__ __host__ double sqrt(double x)
    {
        return ::sqrt(x);
    }

    // Addition operators
    __device__ __host__ auto& operator+=(const rocfft_complex& rhs)
    {
        return *this = {x + rhs.x, y + rhs.y};
    }

    __device__ __host__ auto operator+(const rocfft_complex& rhs) const
    {
        auto lhs = *this;
        return lhs += rhs;
    }

    // Subtraction operators
    __device__ __host__ auto& operator-=(const rocfft_complex& rhs)
    {
        return *this = {x - rhs.x, y - rhs.y};
    }

    __device__ __host__ auto operator-(const rocfft_complex& rhs) const
    {
        auto lhs = *this;
        return lhs -= rhs;
    }

    // Multiplication operators
    __device__ __host__ auto& operator*=(const rocfft_complex& rh
:3:hiprtc.cpp               :110 : 1761963985 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcCreateProgram: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :114 : 1761964097 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcCompileProgram ( 0x559e15cfe520, 3, 0x559e15c55730 )
:3:hiprtcInternal.cpp       :233 : 1761964256 us: [pid:1896  tid:0x7f0a33acbf40] --gpu-architecture is nvcc option, transforming it to --offload-arch option
:3:hiprtc.cpp               :132 : 1762287160 us: [pid:1890  tid:0x7f128ca006c0] hiprtcCompileProgram: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :180 : 1762287472 us: [pid:1890  tid:0x7f128ca006c0] hiprtcGetCodeSize ( 0x7f12840250f0, 0x7f128c9fe8f8 )
:3:hiprtc.cpp               :189 : 1762287651 us: [pid:1890  tid:0x7f128ca006c0] hiprtcGetCodeSize: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :193 : 1762287738 us: [pid:1890  tid:0x7f128ca006c0] hiprtcGetCode ( 0x7f12840250f0,  )
:3:hiprtc.cpp               :202 : 1762287820 us: [pid:1890  tid:0x7f128ca006c0] hiprtcGetCode: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :168 : 1762287897 us: [pid:1890  tid:0x7f128ca006c0] hiprtcDestroyProgram ( 0x7f128c9fe918 )
:3:hiprtc.cpp               :176 : 1762287964 us: [pid:1890  tid:0x7f128ca006c0] hiprtcDestroyProgram: Returned HIPRTC_SUCCESS
:3:hip_module.cpp           :57  : 1762288076 us: [pid:1890  tid:0x7f128ca006c0]  hipModuleLoadData ( 0x7f12840cec00, 0x7f1284396ef0 ) 
:3:devprogram.cpp           :2681: 1762288297 us: [pid:1890  tid:0x7f128ca006c0] Using Code Object V4.
:3:hip_module.cpp           :58  : 1762290006 us: [pid:1890  tid:0x7f128ca006c0] hipModuleLoadData: Returned hipSuccess : 
:3:hip_module.cpp           :73  : 1762290081 us: [pid:1890  tid:0x7f128ca006c0]  hipModuleGetFunction ( 0x7f12840cec08, 0x7f12840ca8b0, fft_rtc_fwd_len256_factors_4_4_4_4_wgs_64_tpt_64_halfLds_sp_op_CI_CI_unitstride_sbrr_dirReg_CB ) 
:3:hip_module.cpp           :87  : 1762290210 us: [pid:1890  tid:0x7f128ca006c0] hipModuleGetFunction: Returned hipSuccess : 
:3:hiprtc.cpp               :132 : 1762390504 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcCompileProgram: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :180 : 1762390838 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcGetCodeSize ( 0x559e15cfe520, 0x7ffc0a8d1778 )
:3:hiprtc.cpp               :189 : 1762390993 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcGetCodeSize: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :193 : 1762391071 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcGetCode ( 0x559e15cfe520,  )
:3:hiprtc.cpp               :202 : 1762391134 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcGetCode: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :168 : 1762391202 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcDestroyProgram ( 0x7ffc0a8d1798 )
:3:hiprtc.cpp               :176 : 1762391302 us: [pid:1896  tid:0x7f0a33acbf40] hiprtcDestroyProgram: Returned HIPRTC_SUCCESS
:3:hip_module.cpp           :57  : 1762395389 us: [pid:1890  tid:0x7f1294e006c0]  hipModuleLoadData ( 0x7f12880f2120, 0x7f128801cb00 ) 
:3:devprogram.cpp           :2681: 1762395902 us: [pid:1890  tid:0x7f1294e006c0] Using Code Object V4.
:3:hip_module.cpp           :58  : 1762397708 us: [pid:1890  tid:0x7f1294e006c0] hipModuleLoadData: Returned hipSuccess : 
:3:hip_module.cpp           :73  : 1762397831 us: [pid:1890  tid:0x7f1294e006c0]  hipModuleGetFunction ( 0x7f12880f2128, 0x7f1288085990, fft_rtc_fwd_len256_factors_4_4_4_4_wgs_64_tpt_64_halfLds_sp_op_CI_CI_unitstride_sbrr_dirReg ) 
:3:hip_module.cpp           :87  : 1762397992 us: [pid:1890  tid:0x7f1294e006c0] hipModuleGetFunction: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :531 : 1762398143 us: [pid:1890  tid:0x7f13ab583980]  hipGetDevice ( 0x7fff49c70588 ) 
:3:hip_device_runtime.cpp   :539 : 1762398210 us: [pid:1890  tid:0x7f13ab583980] hipGetDevice: Returned hipSuccess : 
:3:hip_stream.cpp           :364 : 1762398269 us: [pid:1890  tid:0x7f13ab583980]  hipStreamCreate ( 0x563d59eec090 ) 
:3:rocdevice.cpp            :2732: 1762398329 us: [pid:1890  tid:0x7f13ab583980] number of allocated hardware queues with low priority: 0, with normal priority: 1, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp            :2810: 1762401959 us: [pid:1890  tid:0x7f13ab583980] created hardware queue 0x7f13a9d56000 with size 16384 with priority 1, cooperative: 0
:3:rocdevice.cpp            :2902: 1762402091 us: [pid:1890  tid:0x7f13ab583980] acquireQueue refCount: 0x7f13a9d56000 (1)
:4:rocdevice.cpp            :2063: 1762402444 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa host memory 0x7f1282300000, size 0x100000
:3:hip_stream.cpp           :370 : 1762402659 us: [pid:1890  tid:0x7f13ab583980] hipStreamCreate: Returned hipSuccess : stream:0x563d5e02e9f0
:3:hip_memory.cpp           :566 : 1762402774 us: [pid:1890  tid:0x7f13ab583980]  hipMalloc ( 0x7fff49c70408, 2016 ) 
:4:rocdevice.cpp            :2191: 1762402872 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa device memory 0x7f1282000000, size 0x7e0
:3:rocdevice.cpp            :2230: 1762402979 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3fefff818
:3:hip_memory.cpp           :568 : 1762403029 us: [pid:1890  tid:0x7f13ab583980] hipMalloc: Returned hipSuccess : 0x7f1282000000: duration: 255 us
:3:hiprtc.cpp               :69  : 1762403217 us: [pid:1890  tid:0x7f13ab583980] hiprtcCreateProgram ( 0x7fff49c6fdf8, #define ROCFFT_CALLBACKS_ENABLED

// Copyright (C) 2021 - 2023 Advanced Micro Devices, Inc. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.

#ifndef ROCFFT_COMPLEX_H
#define ROCFFT_COMPLEX_H

#if !defined(__HIPCC_RTC__)
#endif

#ifdef __HIP_PLATFORM_NVIDIA__
typedef __half _Float16;
#endif

template <typename Treal>
struct rocfft_complex
{

    Treal x; // Real part
    Treal y; // Imaginary part

    // Constructors
    // Do not initialize the members x or y by default, to ensure that it can
    // be used in __shared__ and that it is a trivial class compatible with C.
    __device__ __host__ rocfft_complex()                      = default;
    __device__ __host__ rocfft_complex(const rocfft_complex&) = default;
    __device__ __host__ rocfft_complex(rocfft_complex&&)      = default;
    __device__ __host__ rocfft_complex& operator=(const rocfft_complex& rhs) & = default;
    __device__ __host__ rocfft_complex& operator=(rocfft_complex&& rhs) & = default;
    __device__                          __host__ ~rocfft_complex()        = default;

    // Constructor from real and imaginary parts
    __device__ __host__ constexpr rocfft_complex(Treal real, Treal imag)
        : x{real}
        , y{imag}
    {
    }

    // Conversion from different precision
    template <typename U>
    __device__ __host__ explicit constexpr rocfft_complex(const rocfft_complex<U>& z)
        : x(z.x)
        , y(z.y)
    {
    }

    // Accessors
    __device__ __host__ constexpr Treal real() const
    {
        return x;
    }

    __device__ __host__ constexpr Treal imag() const
    {
        return y;
    }

    // Unary operations
    __forceinline__ __device__ __host__ rocfft_complex operator-() const
    {
        return {-x, -y};
    }

    __forceinline__ __device__ __host__ rocfft_complex operator+() const
    {
        return *this;
    }

    __device__ __host__ Treal asum(const rocfft_complex& z)
    {
        return abs(z.x) + abs(z.y);
    }

    // Internal real functions
    static __forceinline__ __device__ __host__ Treal abs(Treal x)
    {
        return x < 0 ? -x : x;
    }

    static __forceinline__ __device__ __host__ float sqrt(float x)
    {
        return ::sqrtf(x);
    }

    static __forceinline__ __device__ __host__ double sqrt(double x)
    {
        return ::sqrt(x);
    }

    // Addition operators
    __device__ __host__ auto& operator+=(const rocfft_complex& rhs)
    {
        return *this = {x + rhs.x, y + rhs.y};
    }

    __device__ __host__ auto operator+(const rocfft_complex& rhs) const
    {
        auto lhs = *this;
        return lhs += rhs;
    }

    // Subtraction operators
    __device__ __host__ auto& operator-=(const rocfft_complex& rhs)
    {
        return *this = {x - rhs.x, y - rhs.y};
    }

    __device__ __host__ auto operator-(const rocfft_complex& rhs) const
    {
        auto lhs = *this;
        return lhs -= rhs;
    }

    // Multiplication operators
    __device__ __host__ auto& operator*=(const rocfft_complex& rh
:3:hiprtc.cpp               :110 : 1762404935 us: [pid:1890  tid:0x7f13ab583980] hiprtcCreateProgram: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :114 : 1762405025 us: [pid:1890  tid:0x7f13ab583980] hiprtcCompileProgram ( 0x563d5e0269c0, 3, 0x563d59c99920 )
:3:hiprtcInternal.cpp       :233 : 1762405138 us: [pid:1890  tid:0x7f13ab583980] --gpu-architecture is nvcc option, transforming it to --offload-arch option
:3:hiprtc.cpp               :132 : 1762667262 us: [pid:1890  tid:0x7f13ab583980] hiprtcCompileProgram: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :180 : 1762667472 us: [pid:1890  tid:0x7f13ab583980] hiprtcGetCodeSize ( 0x563d5e0269c0, 0x7fff49c6fdd8 )
:3:hiprtc.cpp               :189 : 1762667613 us: [pid:1890  tid:0x7f13ab583980] hiprtcGetCodeSize: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :193 : 1762667673 us: [pid:1890  tid:0x7f13ab583980] hiprtcGetCode ( 0x563d5e0269c0,  )
:3:hiprtc.cpp               :202 : 1762667731 us: [pid:1890  tid:0x7f13ab583980] hiprtcGetCode: Returned HIPRTC_SUCCESS
:3:hiprtc.cpp               :168 : 1762667798 us: [pid:1890  tid:0x7f13ab583980] hiprtcDestroyProgram ( 0x7fff49c6fdf8 )
:3:hiprtc.cpp               :176 : 1762667860 us: [pid:1890  tid:0x7f13ab583980] hiprtcDestroyProgram: Returned HIPRTC_SUCCESS
:3:hip_module.cpp           :57  : 1762667966 us: [pid:1890  tid:0x7f13ab583980]  hipModuleLoadData ( 0x7fff49c700b0, 0x563d5db900a0 ) 
:3:devprogram.cpp           :2681: 1762668150 us: [pid:1890  tid:0x7f13ab583980] Using Code Object V4.
:3:hip_module.cpp           :58  : 1762669818 us: [pid:1890  tid:0x7f13ab583980] hipModuleLoadData: Returned hipSuccess : 
:3:hip_module.cpp           :73  : 1762669945 us: [pid:1890  tid:0x7f13ab583980]  hipModuleGetFunction ( 0x7fff49c700b8, 0x563d5e09d130, twiddle_gen_radices_sp ) 
:3:hip_module.cpp           :87  : 1762670051 us: [pid:1890  tid:0x7f13ab583980] hipModuleGetFunction: Returned hipSuccess : 
:3:hip_module.cpp           :433 : 1762670117 us: [pid:1890  tid:0x7f13ab583980]  hipModuleLaunchKernel ( 0x0x563d5b57fb80, 1, 2, 1, 32, 32, 1, 0, stream:0x563d5e02e9f0, char array:<null>, 0x7fff49c6ffd0 ) 
:3:rocdevice.cpp            :2651: 1762670224 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:command.cpp              :349 : 1762670281 us: [pid:1890  tid:0x7f13ab583980] Command (KernelExecution) enqueued: 0x563d5e1bb240
:3:rocvirtual.cpp           :783 : 1762670386 us: [pid:1890  tid:0x7f13ab583980] Arg0:   = val:256
:3:rocvirtual.cpp           :783 : 1762670432 us: [pid:1890  tid:0x7f13ab583980] Arg1:   = val:4
:3:rocvirtual.cpp           :783 : 1762670479 us: [pid:1890  tid:0x7f13ab583980] Arg2:   = val:4
:3:rocvirtual.cpp           :783 : 1762670526 us: [pid:1890  tid:0x7f13ab583980] Arg3:   = val:16
:3:rocvirtual.cpp           :783 : 1762670574 us: [pid:1890  tid:0x7f13ab583980] Arg4:   = val:0
:3:rocvirtual.cpp           :708 : 1762670622 us: [pid:1890  tid:0x7f13ab583980] Arg5:   = ptr:0x7f1282000000 obj:[0x7f1282000000-0x7f12820007e0]
:3:rocvirtual.cpp           :2901: 1762670717 us: [pid:1890  tid:0x7f13ab583980] ShaderName : twiddle_gen_radices_sp
:4:rocvirtual.cpp           :867 : 1762670767 us: [pid:1890  tid:0x7f13ab583980] HWq=0x7f1294300000, Dispatch Header = 0xb02 (type=2, barrier=1, acquire=1, release=1), setup=3, grid=[32, 64, 1], workgroup=[32, 32, 1], private_seg_size=0, group_seg_size=0, kernel_obj=0x7f13a9d30780, kernarg_address=0x7f1282300000, completion_signal=0x0
:3:hip_module.cpp           :453 : 1762670918 us: [pid:1890  tid:0x7f13ab583980] hipModuleLaunchKernel: Returned hipSuccess : 
:3:hip_module.cpp           :43  : 1762670967 us: [pid:1890  tid:0x7f13ab583980]  hipModuleUnload ( 0x563d5e09d130 ) 
:3:hip_module.cpp           :47  : 1762671016 us: [pid:1890  tid:0x7f13ab583980] hipModuleUnload: Returned hipSuccess : 
:3:hip_stream.cpp           :451 : 1762671069 us: [pid:1890  tid:0x7f13ab583980]  hipStreamSynchronize ( stream:0x563d5e02e9f0 ) 
:3:rocdevice.cpp            :2651: 1762671163 us: [pid:1890  tid:0x7f13ab583980] No HW event
:3:rocdevice.cpp            :2651: 1762671211 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:commandqueue.cpp         :140 : 1762671258 us: [pid:1890  tid:0x7f13ab583980] HW Event not ready, awaiting completion instead
:4:command.cpp              :289 : 1762671353 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d5e02e9f0
:4:command.cpp              :349 : 1762671401 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5de30fb0
:4:rocvirtual.cpp           :1013: 1762671499 us: [pid:1890  tid:0x7f13ab583980] HWq=0x7f1294300000, BarrierAND Header = 0x1503 (type=3, barrier=1, acquire=2, release=2), dep_signal=[0x0, 0x0, 0x0, 0x0, 0x0], completion_signal=0x7f13a9d65380
:4:command.cpp              :179 : 1762671604 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5e1bb240 complete
:4:command.cpp              :173 : 1762671652 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5de30fb0 complete (Wall: 1762671652, CPU: 0, GPU: 0 us)
:4:command.cpp              :253 : 1762671757 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5e1bb240 to complete, current status 0
:4:command.cpp              :268 : 1762671860 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5e1bb240 wait completed
:4:commandqueue.cpp         :153 : 1762672075 us: [pid:1890  tid:0x7f13ab583980] All commands finished
:3:hip_stream.cpp           :452 : 1762672136 us: [pid:1890  tid:0x7f13ab583980] hipStreamSynchronize: Returned hipSuccess : 
:3:hip_memory.cpp           :566 : 1762672199 us: [pid:1890  tid:0x7f13ab583980]  hipMalloc ( 0x7fff49c705d8, 384 ) 
:4:rocdevice.cpp            :2191: 1762672261 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa device memory 0x7f1282001000, size 0x180
:3:rocdevice.cpp            :2230: 1762672365 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3fefff698
:3:hip_memory.cpp           :568 : 1762672415 us: [pid:1890  tid:0x7f13ab583980] hipMalloc: Returned hipSuccess : 0x7f1282001000: duration: 216 us
:3:hip_memory.cpp           :641 : 1762672521 us: [pid:1890  tid:0x7f13ab583980]  hipMemcpy ( 0x7f1282001000, 0x7fff49c703c0, 384, hipMemcpyHostToDevice ) 
:3:rocdevice.cpp            :2651: 1762672621 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:command.cpp              :349 : 1762672671 us: [pid:1890  tid:0x7f13ab583980] Command (CopyHostToDevice) enqueued: 0x563d5e1b97a0
:4:rocblit.cpp              :791 : 1762673354 us: [pid:1890  tid:0x7f13ab583980] HSA Async Copy staged H2D dst=0x7f1282001000, src=0x7f1295700000, size=384, completion_signal=0x7f13ab57e780
:4:command.cpp              :289 : 1762673465 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d4cdd1330
:4:command.cpp              :349 : 1762673524 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5dc5da20
:4:command.cpp              :179 : 1762673620 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5e1b97a0 complete
:4:command.cpp              :173 : 1762673668 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5dc5da20 complete (Wall: 1762673667, CPU: 0, GPU: 0 us)
:4:command.cpp              :253 : 1762673763 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5e1b97a0 to complete, current status 0
:4:command.cpp              :268 : 1762673858 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5e1b97a0 wait completed
:3:hip_memory.cpp           :642 : 1762673907 us: [pid:1890  tid:0x7f13ab583980] hipMemcpy: Returned hipSuccess : : duration: 1386 us
:3:hip_stream.cpp           :364 : 1762674011 us: [pid:1890  tid:0x7f13ab583980]  hipStreamCreate ( 0x7fff49c70f38 ) 
:3:rocdevice.cpp            :2732: 1762674070 us: [pid:1890  tid:0x7f13ab583980] number of allocated hardware queues with low priority: 0, with normal priority: 2, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp            :2810: 1762677563 us: [pid:1890  tid:0x7f13ab583980] created hardware queue 0x7f13a9d34000 with size 16384 with priority 1, cooperative: 0
:3:rocdevice.cpp            :2902: 1762677695 us: [pid:1890  tid:0x7f13ab583980] acquireQueue refCount: 0x7f13a9d34000 (1)
:4:rocdevice.cpp            :2063: 1762678051 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa host memory 0x7f127ff00000, size 0x100000
:3:hip_stream.cpp           :370 : 1762678322 us: [pid:1890  tid:0x7f13ab583980] hipStreamCreate: Returned hipSuccess : stream:0x563d5e02f9f0
:3:hip_memory.cpp           :566 : 1762678434 us: [pid:1890  tid:0x7f13ab583980]  hipMalloc ( 0x7fff49c70f50, 2048 ) 
:4:rocdevice.cpp            :2191: 1762678496 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa device memory 0x7f1282002000, size 0x800
:3:rocdevice.cpp            :2230: 1762678600 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3feffee98
:3:hip_memory.cpp           :568 : 1762678649 us: [pid:1890  tid:0x7f13ab583980] hipMalloc: Returned hipSuccess : 0x7f1282002000: duration: 215 us
:3:hip_memory.cpp           :641 : 1762678745 us: [pid:1890  tid:0x7f13ab583980]  hipMemcpy ( 0x7f1282002000, 0x563d5e036950, 2048, hipMemcpyHostToDevice ) 
:3:rocdevice.cpp            :2651: 1762678839 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:command.cpp              :349 : 1762678889 us: [pid:1890  tid:0x7f13ab583980] Command (CopyHostToDevice) enqueued: 0x563d5e1b97a0
:4:rocblit.cpp              :791 : 1762678998 us: [pid:1890  tid:0x7f13ab583980] HSA Async Copy staged H2D dst=0x7f1282002000, src=0x7f1295700000, size=2048, completion_signal=0x7f13ab57e700
:4:command.cpp              :289 : 1762679104 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d4cdd1330
:4:command.cpp              :349 : 1762679151 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5dc5da20
:4:command.cpp              :179 : 1762679247 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5e1b97a0 complete
:4:command.cpp              :173 : 1762679295 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5dc5da20 complete (Wall: 1762679295, CPU: 0, GPU: 0 us)
:4:command.cpp              :253 : 1762679390 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5e1b97a0 to complete, current status 0
:4:command.cpp              :268 : 1762679486 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5e1b97a0 wait completed
:3:hip_memory.cpp           :642 : 1762679534 us: [pid:1890  tid:0x7f13ab583980] hipMemcpy: Returned hipSuccess : : duration: 789 us
:3:hip_memory.cpp           :566 : 1762679629 us: [pid:1890  tid:0x7f13ab583980]  hipMalloc ( 0x7fff49c70f20, 2048 ) 
:4:rocdevice.cpp            :2191: 1762679678 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa device memory 0x7f1282003000, size 0x800
:3:rocdevice.cpp            :2230: 1762679777 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3feffe698
:3:hip_memory.cpp           :568 : 1762679825 us: [pid:1890  tid:0x7f13ab583980] hipMalloc: Returned hipSuccess : 0x7f1282003000: duration: 196 us
:3:hip_memory.cpp           :1302: 1762679922 us: [pid:1890  tid:0x7f13ab583980]  hipMemcpyFromSymbol ( 0x7f13ac404b70, 0x563d5e010040, 8, 0, hipMemcpyDeviceToHost ) 
:3:devprogram.cpp           :2681: 1762680136 us: [pid:1890  tid:0x7f13ab583980] Using Code Object V4.
:3:rocdevice.cpp            :2230: 1762680466 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3feffe690
:3:rocdevice.cpp            :2651: 1762680529 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:command.cpp              :349 : 1762680586 us: [pid:1890  tid:0x7f13ab583980] Command (CopyDeviceToHost) enqueued: 0x563d5e1b97a0
:4:rocblit.cpp              :823 : 1762680693 us: [pid:1890  tid:0x7f13ab583980] HSA Async Copy staged D2H dst=0x7f1295500000, src=0x7f13a9d23af0, size=8, completion_signal=0x7f13ab57e680
:4:command.cpp              :289 : 1762680796 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d4cdd1330
:4:command.cpp              :349 : 1762680843 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5dc5da20
:4:command.cpp              :179 : 1762680938 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5e1b97a0 complete
:4:command.cpp              :173 : 1762680985 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5dc5da20 complete (Wall: 1762680985, CPU: 0, GPU: 0 us)
:4:command.cpp              :253 : 1762681080 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5e1b97a0 to complete, current status 0
:4:command.cpp              :268 : 1762681175 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5e1b97a0 wait completed
:3:hip_memory.cpp           :1303: 1762681223 us: [pid:1890  tid:0x7f13ab583980] hipMemcpyFromSymbol: Returned hipSuccess : : duration: 1301 us
:3:hip_module.cpp           :433 : 1762681326 us: [pid:1890  tid:0x7f13ab583980]  hipModuleLaunchKernel ( 0x0x7f12840b2fe0, 1, 1, 1, 64, 1, 1, 1024, stream:0x563d5e02f9f0, char array:<null>, 0x7fff49c70880 ) 
:3:rocdevice.cpp            :2651: 1762681433 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:command.cpp              :349 : 1762681489 us: [pid:1890  tid:0x7f13ab583980] Command (KernelExecution) enqueued: 0x563d5e1bb240
:3:rocvirtual.cpp           :708 : 1762681584 us: [pid:1890  tid:0x7f13ab583980] Arg0:   = ptr:0x7f1282000000 obj:[0x7f1282000000-0x7f12820007e0]
:3:rocvirtual.cpp           :783 : 1762681678 us: [pid:1890  tid:0x7f13ab583980] Arg1:   = val:1
:3:rocvirtual.cpp           :708 : 1762681725 us: [pid:1890  tid:0x7f13ab583980] Arg2:   = ptr:0x7f1282001000 obj:[0x7f1282001000-0x7f1282001180]
:3:rocvirtual.cpp           :708 : 1762681820 us: [pid:1890  tid:0x7f13ab583980] Arg3:   = ptr:0x7f1282001080 obj:[0x7f1282001000-0x7f1282001180]
:3:rocvirtual.cpp           :708 : 1762681915 us: [pid:1890  tid:0x7f13ab583980] Arg4:   = ptr:0x7f1282001100 obj:[0x7f1282001000-0x7f1282001180]
:3:rocvirtual.cpp           :783 : 1762682010 us: [pid:1890  tid:0x7f13ab583980] Arg5:   = val:1
:3:rocvirtual.cpp           :783 : 1762682058 us: [pid:1890  tid:0x7f13ab583980] Arg6:   = val:0
:3:rocvirtual.cpp           :783 : 1762682105 us: [pid:1890  tid:0x7f13ab583980] Arg9:   = val:0
:3:rocvirtual.cpp           :708 : 1762682153 us: [pid:1890  tid:0x7f13ab583980] Arg12:   = ptr:0x7f1282002000 obj:[0x7f1282002000-0x7f1282002800]
:3:rocvirtual.cpp           :708 : 1762682248 us: [pid:1890  tid:0x7f13ab583980] Arg13:   = ptr:0x7f1282003000 obj:[0x7f1282003000-0x7f1282003800]
:3:rocvirtual.cpp           :2901: 1762682344 us: [pid:1890  tid:0x7f13ab583980] ShaderName : fft_rtc_fwd_len256_factors_4_4_4_4_wgs_64_tpt_64_halfLds_sp_op_CI_CI_unitstride_sbrr_dirReg_CB
:1:rocvirtual.cpp           :2949: 1762682438 us: [pid:1890  tid:0x7f13ab583980] Pcie atomics not enabled, hostcall not supported
:1:rocvirtual.cpp           :3289: 1762682533 us: [pid:1890  tid:0x7f13ab583980] AQL dispatch failed!
:4:command.cpp              :179 : 1762682581 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5e1bb240 complete
:3:hip_module.cpp           :453 : 1762682628 us: [pid:1890  tid:0x7f13ab583980] hipModuleLaunchKernel: Returned hipErrorIllegalState : 
clients/tests/default_callbacks_test.cpp:280: Failure
Expected equality of these values:
  rocfft_execute(plan, &in_ptr, &out_ptr, info)
    Which is: 1
  rocfft_status_success
    Which is: 0

:3:hip_memory.cpp           :618 : 1762682983 us: [pid:1890  tid:0x7f13ab583980]  hipFree ( 0x7f1282003000 ) 
:3:rocdevice.cpp            :2651: 1762683040 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:commandqueue.cpp         :140 : 1762683086 us: [pid:1890  tid:0x7f13ab583980] HW Event not ready, awaiting completion instead
:4:commandqueue.cpp         :153 : 1762683181 us: [pid:1890  tid:0x7f13ab583980] All commands finished
:3:rocdevice.cpp            :2651: 1762683229 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:commandqueue.cpp         :140 : 1762683276 us: [pid:1890  tid:0x7f13ab583980] HW Event not ready, awaiting completion instead
:4:commandqueue.cpp         :153 : 1762683372 us: [pid:1890  tid:0x7f13ab583980] All commands finished
:3:rocdevice.cpp            :2651: 1762683419 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:commandqueue.cpp         :140 : 1762683467 us: [pid:1890  tid:0x7f13ab583980] HW Event not ready, awaiting completion instead
:4:commandqueue.cpp         :153 : 1762683562 us: [pid:1890  tid:0x7f13ab583980] All commands finished
:3:hip_memory.cpp           :620 : 1762683612 us: [pid:1890  tid:0x7f13ab583980] hipFree: Returned hipSuccess : 
:3:hip_memory.cpp           :618 : 1762683658 us: [pid:1890  tid:0x7f13ab583980]  hipFree ( 0x7f1282002000 ) 
:3:hip_memory.cpp           :620 : 1762683706 us: [pid:1890  tid:0x7f13ab583980] hipFree: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :531 : 1762683768 us: [pid:1890  tid:0x7f13ab583980]  hipGetDevice ( 0x7fff49c706bc ) 
:3:hip_device_runtime.cpp   :539 : 1762683826 us: [pid:1890  tid:0x7f13ab583980] hipGetDevice: Returned hipSuccess : 
:3:hip_device.cpp           :381 : 1762683874 us: [pid:1890  tid:0x7f13ab583980]  hipGetDeviceProperties ( 0x7fff49c706c0, 0 ) 
:3:hip_device.cpp           :383 : 1762683970 us: [pid:1890  tid:0x7f13ab583980] hipGetDeviceProperties: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :531 : 1762684091 us: [pid:1890  tid:0x7f13ab583980]  hipGetDevice ( 0x7fff49c702ac ) 
:3:hip_device_runtime.cpp   :539 : 1762684151 us: [pid:1890  tid:0x7f13ab583980] hipGetDevice: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :531 : 1762684239 us: [pid:1890  tid:0x7f13ab583980]  hipGetDevice ( 0x7fff49c702ac ) 
:3:hip_device_runtime.cpp   :539 : 1762684303 us: [pid:1890  tid:0x7f13ab583980] hipGetDevice: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :561 : 1762684286 us: [pid:1890  tid:0x7f128ca006c0]  hipSetDevice ( 0 ) 
:3:hip_device_runtime.cpp   :561 : 1762684427 us: [pid:1890  tid:0x7f1294e006c0]  hipSetDevice ( 0 ) 
:3:hip_device_runtime.cpp   :565 : 1762684545 us: [pid:1890  tid:0x7f1294e006c0] hipSetDevice: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :565 : 1762684483 us: [pid:1890  tid:0x7f128ca006c0] hipSetDevice: Returned hipSuccess : 
:3:hip_module.cpp           :57  : 1762684654 us: [pid:1890  tid:0x7f1294e006c0]  hipModuleLoadData ( 0x7f12843232b0, 0x7f1284323590 ) 
:3:hip_module.cpp           :57  : 1762684731 us: [pid:1890  tid:0x7f128ca006c0]  hipModuleLoadData ( 0x7f12880d1cd0, 0x7f1288106c30 ) 
:3:devprogram.cpp           :2681: 1762684870 us: [pid:1890  tid:0x7f1294e006c0] Using Code Object V4.
:3:devprogram.cpp           :2681: 1762685040 us: [pid:1890  tid:0x7f128ca006c0] Using Code Object V4.
:3:hip_module.cpp           :58  : 1762685292 us: [pid:1890  tid:0x7f1294e006c0] hipModuleLoadData: Returned hipSuccess : 
:3:hip_module.cpp           :73  : 1762685358 us: [pid:1890  tid:0x7f1294e006c0]  hipModuleGetFunction ( 0x7f12843232b8, 0x7f1284004840, fft_rtc_fwd_len256_factors_4_4_4_4_wgs_64_tpt_64_halfLds_sp_op_CI_CI_unitstride_sbrr_dirReg_CB ) 
:3:hip_module.cpp           :87  : 1762685463 us: [pid:1890  tid:0x7f1294e006c0] hipModuleGetFunction: Returned hipSuccess : 
:3:hip_module.cpp           :58  : 1762685508 us: [pid:1890  tid:0x7f128ca006c0] hipModuleLoadData: Returned hipSuccess : 
:3:hip_module.cpp           :73  : 1762685591 us: [pid:1890  tid:0x7f128ca006c0]  hipModuleGetFunction ( 0x7f12880d1cd8, 0x7f12880f0d00, fft_rtc_fwd_len256_factors_4_4_4_4_wgs_64_tpt_64_halfLds_sp_op_CI_CI_unitstride_sbrr_dirReg ) 
:3:hip_module.cpp           :87  : 1762685726 us: [pid:1890  tid:0x7f128ca006c0] hipModuleGetFunction: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :531 : 1762685826 us: [pid:1890  tid:0x7f13ab583980]  hipGetDevice ( 0x7fff49c70588 ) 
:3:hip_device_runtime.cpp   :539 : 1762685884 us: [pid:1890  tid:0x7f13ab583980] hipGetDevice: Returned hipSuccess : 
:3:hip_memory.cpp           :566 : 1762685933 us: [pid:1890  tid:0x7f13ab583980]  hipMalloc ( 0x7fff49c705d8, 384 ) 
:4:rocdevice.cpp            :2191: 1762685982 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa device memory 0x7f1282004000, size 0x180
:3:rocdevice.cpp            :2230: 1762686085 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3feffe510
:3:hip_memory.cpp           :568 : 1762686133 us: [pid:1890  tid:0x7f13ab583980] hipMalloc: Returned hipSuccess : 0x7f1282004000: duration: 200 us
:3:hip_memory.cpp           :641 : 1762686228 us: [pid:1890  tid:0x7f13ab583980]  hipMemcpy ( 0x7f1282004000, 0x7fff49c703c0, 384, hipMemcpyHostToDevice ) 
:4:command.cpp              :349 : 1762686324 us: [pid:1890  tid:0x7f13ab583980] Command (CopyHostToDevice) enqueued: 0x563d5e1b97a0
:4:rocblit.cpp              :791 : 1762686431 us: [pid:1890  tid:0x7f13ab583980] HSA Async Copy staged H2D dst=0x7f1282004000, src=0x7f1295700000, size=384, completion_signal=0x7f13ab57e600
:4:command.cpp              :289 : 1762686533 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d4cdd1330
:4:command.cpp              :349 : 1762686580 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5de30fb0
:4:command.cpp              :179 : 1762686676 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5e1b97a0 complete
:4:command.cpp              :173 : 1762686723 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5de30fb0 complete (Wall: 1762686722, CPU: 0, GPU: 0 us)
:4:command.cpp              :253 : 1762686818 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5e1b97a0 to complete, current status 0
:4:command.cpp              :268 : 1762686912 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5e1b97a0 wait completed
:3:hip_memory.cpp           :642 : 1762686960 us: [pid:1890  tid:0x7f13ab583980] hipMemcpy: Returned hipSuccess : : duration: 732 us
:3:hip_stream.cpp           :364 : 1762687060 us: [pid:1890  tid:0x7f13ab583980]  hipStreamCreate ( 0x7fff49c70f38 ) 
:3:rocdevice.cpp            :2732: 1762687117 us: [pid:1890  tid:0x7f13ab583980] number of allocated hardware queues with low priority: 0, with normal priority: 3, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp            :2810: 1762690840 us: [pid:1890  tid:0x7f13ab583980] created hardware queue 0x7f13a9d26000 with size 16384 with priority 1, cooperative: 0
:3:rocdevice.cpp            :2902: 1762690962 us: [pid:1890  tid:0x7f13ab583980] acquireQueue refCount: 0x7f13a9d26000 (1)
:4:rocdevice.cpp            :2063: 1762691316 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa host memory 0x7f127e100000, size 0x100000
:3:hip_stream.cpp           :370 : 1762691548 us: [pid:1890  tid:0x7f13ab583980] hipStreamCreate: Returned hipSuccess : stream:0x563d5e019c40
:3:hip_memory.cpp           :566 : 1762691657 us: [pid:1890  tid:0x7f13ab583980]  hipMalloc ( 0x7fff49c70f50, 2048 ) 
:4:rocdevice.cpp            :2191: 1762691715 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa device memory 0x7f1282005000, size 0x800
:3:rocdevice.cpp            :2230: 1762691821 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3feffdd10
:3:hip_memory.cpp           :568 : 1762691871 us: [pid:1890  tid:0x7f13ab583980] hipMalloc: Returned hipSuccess : 0x7f1282005000: duration: 214 us
:3:hip_memory.cpp           :641 : 1762691965 us: [pid:1890  tid:0x7f13ab583980]  hipMemcpy ( 0x7f1282005000, 0x563d5e036950, 2048, hipMemcpyHostToDevice ) 
:4:command.cpp              :349 : 1762692101 us: [pid:1890  tid:0x7f13ab583980] Command (CopyHostToDevice) enqueued: 0x563d5e1b97a0
:4:rocblit.cpp              :791 : 1762692210 us: [pid:1890  tid:0x7f13ab583980] HSA Async Copy staged H2D dst=0x7f1282005000, src=0x7f1295700000, size=2048, completion_signal=0x7f13ab57e580
:4:command.cpp              :289 : 1762692315 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d4cdd1330
:4:command.cpp              :349 : 1762692362 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5de30fb0
:4:command.cpp              :179 : 1762692457 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5e1b97a0 complete
:4:command.cpp              :173 : 1762692505 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5de30fb0 complete (Wall: 1762692504, CPU: 0, GPU: 0 us)
:4:command.cpp              :253 : 1762692600 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5e1b97a0 to complete, current status 0
:4:command.cpp              :268 : 1762692695 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5e1b97a0 wait completed
:3:hip_memory.cpp           :642 : 1762692743 us: [pid:1890  tid:0x7f13ab583980] hipMemcpy: Returned hipSuccess : : duration: 778 us
:3:hip_memory.cpp           :566 : 1762692838 us: [pid:1890  tid:0x7f13ab583980]  hipMalloc ( 0x7fff49c70f20, 2048 ) 
:4:rocdevice.cpp            :2191: 1762692887 us: [pid:1890  tid:0x7f13ab583980] Allocate hsa device memory 0x7f1282006000, size 0x800
:3:rocdevice.cpp            :2230: 1762692986 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3feffd510
:3:hip_memory.cpp           :568 : 1762693035 us: [pid:1890  tid:0x7f13ab583980] hipMalloc: Returned hipSuccess : 0x7f1282006000: duration: 197 us
:3:hip_module.cpp           :433 : 1762693135 us: [pid:1890  tid:0x7f13ab583980]  hipModuleLaunchKernel ( 0x0x7f1288021650, 1, 1, 1, 64, 1, 1, 1024, stream:0x563d5e019c40, char array:<null>, 0x7fff49c70880 ) 
:3:rocdevice.cpp            :2651: 1762693242 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:command.cpp              :349 : 1762693299 us: [pid:1890  tid:0x7f13ab583980] Command (KernelExecution) enqueued: 0x563d5e0ec020
:3:rocvirtual.cpp           :708 : 1762693394 us: [pid:1890  tid:0x7f13ab583980] Arg0:   = ptr:0x7f1282000000 obj:[0x7f1282000000-0x7f12820007e0]
:3:rocvirtual.cpp           :783 : 1762693488 us: [pid:1890  tid:0x7f13ab583980] Arg1:   = val:1
:3:rocvirtual.cpp           :708 : 1762693535 us: [pid:1890  tid:0x7f13ab583980] Arg2:   = ptr:0x7f1282004000 obj:[0x7f1282004000-0x7f1282004180]
:3:rocvirtual.cpp           :708 : 1762693630 us: [pid:1890  tid:0x7f13ab583980] Arg3:   = ptr:0x7f1282004080 obj:[0x7f1282004000-0x7f1282004180]
:3:rocvirtual.cpp           :708 : 1762693725 us: [pid:1890  tid:0x7f13ab583980] Arg4:   = ptr:0x7f1282004100 obj:[0x7f1282004000-0x7f1282004180]
:3:rocvirtual.cpp           :783 : 1762693820 us: [pid:1890  tid:0x7f13ab583980] Arg5:   = val:1
:3:rocvirtual.cpp           :783 : 1762693868 us: [pid:1890  tid:0x7f13ab583980] Arg6:   = val:0
:3:rocvirtual.cpp           :783 : 1762693915 us: [pid:1890  tid:0x7f13ab583980] Arg9:   = val:0
:3:rocvirtual.cpp           :708 : 1762693963 us: [pid:1890  tid:0x7f13ab583980] Arg12:   = ptr:0x7f1282005000 obj:[0x7f1282005000-0x7f1282005800]
:3:rocvirtual.cpp           :708 : 1762694058 us: [pid:1890  tid:0x7f13ab583980] Arg13:   = ptr:0x7f1282006000 obj:[0x7f1282006000-0x7f1282006800]
:3:rocvirtual.cpp           :2901: 1762694153 us: [pid:1890  tid:0x7f13ab583980] ShaderName : fft_rtc_fwd_len256_factors_4_4_4_4_wgs_64_tpt_64_halfLds_sp_op_CI_CI_unitstride_sbrr_dirReg
:4:rocvirtual.cpp           :867 : 1762694250 us: [pid:1890  tid:0x7f13ab583980] HWq=0x7f127fd00000, Dispatch Header = 0xb02 (type=2, barrier=1, acquire=1, release=1), setup=3, grid=[64, 1, 1], workgroup=[64, 1, 1], private_seg_size=0, group_seg_size=1024, kernel_obj=0x7f13a9d00a00, kernarg_address=0x7f127e100000, completion_signal=0x0
:3:hip_module.cpp           :453 : 1762694391 us: [pid:1890  tid:0x7f13ab583980] hipModuleLaunchKernel: Returned hipSuccess : 
:3:hip_memory.cpp           :641 : 1762694440 us: [pid:1890  tid:0x7f13ab583980]  hipMemcpy ( 0x563d5dfbada0, 0x7f1282006000, 2048, hipMemcpyDeviceToHost ) 
:3:rocdevice.cpp            :2651: 1762694533 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:command.cpp              :289 : 1762694581 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d5e019c40
:4:command.cpp              :349 : 1762694628 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5de30df0
:3:rocvirtual.cpp           :457 : 1762694729 us: [pid:1890  tid:0x7f13ab583980] Set Handler: handle(0x7f13a9d2ab80), timestamp(0x563d5b61b370)
:4:rocvirtual.cpp           :1013: 1762694834 us: [pid:1890  tid:0x7f13ab583980] HWq=0x7f127fd00000, BarrierAND Header = 0x1503 (type=3, barrier=1, acquire=2, release=2), dep_signal=[0x0, 0x0, 0x0, 0x0, 0x0], completion_signal=0x7f13a9d2ab80
:4:command.cpp              :349 : 1762694933 us: [pid:1890  tid:0x7f13ab583980] Command (Marker) enqueued: 0x563d5de30820
:3:rocvirtual.cpp           :457 : 1762694993 us: [pid:1890  tid:0x7f13ab583980] Set Handler: handle(0x7f13ab57e500), timestamp(0x563d5adb8de0)
:4:rocvirtual.cpp           :1013: 1762695101 us: [pid:1890  tid:0x7f13ab583980] HWq=0x7f1295300000, BarrierAND Header = 0x1503 (type=3, barrier=1, acquire=2, release=2), dep_signal=[0x0, 0x0, 0x0, 0x0, 0x0], completion_signal=0x7f13ab57e500
:4:command.cpp              :349 : 1762695208 us: [pid:1890  tid:0x7f13ab583980] Command (CopyDeviceToHost) enqueued: 0x563d5e1b97a0
:4:rocblit.cpp              :823 : 1762695317 us: [pid:1890  tid:0x7f13ab583980] HSA Async Copy staged D2H dst=0x7f1295500000, src=0x7f1282006000, size=2048, completion_signal=0x7f13ab57e480
:4:command.cpp              :289 : 1762695423 us: [pid:1890  tid:0x7f13ab583980] Queue marker to command queue: 0x563d4cdd1330
:4:command.cpp              :349 : 1762695481 us: [pid:1890  tid:0x7f13ab583980] Command (InternalMarker) enqueued: 0x563d5de30fb0
:4:command.cpp              :179 : 1762695586 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5e1b97a0 complete
:4:command.cpp              :173 : 1762695644 us: [pid:1890  tid:0x7f13ab583980] Command 0x563d5de30fb0 complete (Wall: 1762695643, CPU: 0, GPU: 0 us)
:4:command.cpp              :253 : 1762695755 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5e1b97a0 to complete, current status 0
:4:command.cpp              :268 : 1762695861 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5e1b97a0 wait completed
:3:hip_memory.cpp           :642 : 1762695920 us: [pid:1890  tid:0x7f13ab583980] hipMemcpy: Returned hipSuccess : : duration: 1480 us
:3:rocvirtual.cpp           :210 : 1762694922 us: [pid:1890  tid:0x7f1396e006c0] Handler: value(0), timestamp(0x563d5df6cb80), handle(0x7f13a9d2ab80)
:3:hip_module.cpp           :43  : 1762696035 us: [pid:1890  tid:0x7f13ab583980]  hipModuleUnload ( 0x7f1284004840 ) 
:4:command.cpp              :179 : 1762696186 us: [pid:1890  tid:0x7f1396e006c0] Command 0x563d5e0ec020 complete
:4:command.cpp              :173 : 1762696319 us: [pid:1890  tid:0x7f1396e006c0] Command 0x563d5de30df0 complete (Wall: 1762696318, CPU: 0, GPU: 1595 us)
:3:hip_module.cpp           :47  : 1762696420 us: [pid:1890  tid:0x7f13ab583980] hipModuleUnload: Returned hipSuccess : 
:3:hip_module.cpp           :43  : 1762696525 us: [pid:1890  tid:0x7f13ab583980]  hipModuleUnload ( 0x7f12880f0d00 ) 
:3:hip_module.cpp           :47  : 1762696590 us: [pid:1890  tid:0x7f13ab583980] hipModuleUnload: Returned hipSuccess : 
:3:hip_memory.cpp           :618 : 1762696649 us: [pid:1890  tid:0x7f13ab583980]  hipFree ( 0x7f1282004000 ) 
:4:commandqueue.cpp         :153 : 1762696707 us: [pid:1890  tid:0x7f13ab583980] All commands finished
:3:rocdevice.cpp            :2651: 1762696765 us: [pid:1890  tid:0x7f13ab583980] No HW event
:4:commandqueue.cpp         :140 : 1762696822 us: [pid:1890  tid:0x7f13ab583980] HW Event not ready, awaiting completion instead
:4:commandqueue.cpp         :153 : 1762696928 us: [pid:1890  tid:0x7f13ab583980] All commands finished
:3:hip_memory.cpp           :620 : 1762696987 us: [pid:1890  tid:0x7f13ab583980] hipFree: Returned hipSuccess : 
:3:hip_memory.cpp           :618 : 1762697046 us: [pid:1890  tid:0x7f13ab583980]  hipFree ( char array:<null> ) 
:3:hip_memory.cpp           :620 : 1762697103 us: [pid:1890  tid:0x7f13ab583980] hipFree: Returned hipSuccess : 
:3:hip_memory.cpp           :618 : 1762697162 us: [pid:1890  tid:0x7f13ab583980]  hipFree ( 0x7f1282006000 ) 
:3:hip_memory.cpp           :620 : 1762697220 us: [pid:1890  tid:0x7f13ab583980] hipFree: Returned hipSuccess : 
:3:hip_memory.cpp           :618 : 1762697277 us: [pid:1890  tid:0x7f13ab583980]  hipFree ( 0x7f1282005000 ) 
:3:hip_memory.cpp           :620 : 1762697335 us: [pid:1890  tid:0x7f13ab583980] hipFree: Returned hipSuccess : 
:3:rocvirtual.cpp           :210 : 1762696465 us: [pid:1890  tid:0x7f1396e006c0] Handler: value(0), timestamp(0x563d5e212490), handle(0x7f13ab57e500)
clients/tests/default_callbacks_test.cpp:310: Failure
Expected: (diff.l_inf) < (type_epsilon<Tbound>()), actual: 32.230823516845703 vs 3.75e-05

:4:command.cpp              :173 : 1762697540 us: [pid:1890  tid:0x7f1396e006c0] Command 0x563d5de30820 complete (Wall: 1762697539, CPU: 0, GPU: 2547 us)
[  FAILED  ] rocfft_UnitTest.default_load_callback_complex_single (1458 ms)
[----------] 1 test from rocfft_UnitTest (1458 ms total)

:4:rocdevice.cpp            :2207: 1762697742 us: [pid:1890  tid:0x7f1396e006c0] Free hsa memory 0x7f1282004000
[----------] Global test environment tear-down
:3:rocdevice.cpp            :2230: 1762697924 us: [pid:1890  tid:0x7f1396e006c0] device=0x563d5db5ff10, freeMem_ = 0x3feffd690
:4:rocdevice.cpp            :2207: 1762698057 us: [pid:1890  tid:0x7f1396e006c0] Free hsa memory 0x7f1282005000
:3:rocdevice.cpp            :2230: 1762698126 us: [pid:1890  tid:0x7f1396e006c0] device=0x563d5db5ff10, freeMem_ = 0x3feffde90
:4:rocdevice.cpp            :2207: 1762698195 us: [pid:1890  tid:0x7f1396e006c0] Free hsa memory 0x7f1282006000
:3:rocdevice.cpp            :2230: 1762698263 us: [pid:1890  tid:0x7f1396e006c0] device=0x563d5db5ff10, freeMem_ = 0x3feffe690
[==========] 1 test from 1 test suite ran. (1459 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] rocfft_UnitTest.default_load_callback_complex_single

 1 FAILED TEST
:3:hip_memory.cpp           :618 : 1762700909 us: [pid:1890  tid:0x7f13ab583980]  hipFree ( 0x7f1282000000 ) 
:3:hip_memory.cpp           :620 : 1762700969 us: [pid:1890  tid:0x7f13ab583980] hipFree: Returned hipSuccess : 
:3:hip_stream.cpp           :464 : 1762701026 us: [pid:1890  tid:0x7f13ab583980]  hipStreamDestroy ( stream:0x563d5e02e9f0 ) 
:4:command.cpp              :349 : 1762701120 us: [pid:1890  tid:0x7f13ab583980] Command (Marker) enqueued: 0x563d5de30fb0
:3:rocvirtual.cpp           :457 : 1762701176 us: [pid:1890  tid:0x7f13ab583980] Set Handler: handle(0x7f13a9d65300), timestamp(0x563d5bcf8c60)
:4:rocvirtual.cpp           :1013: 1762701281 us: [pid:1890  tid:0x7f13ab583980] HWq=0x7f1294300000, BarrierAND Header = 0x1503 (type=3, barrier=1, acquire=2, release=2), dep_signal=[0x0, 0x0, 0x0, 0x0, 0x0], completion_signal=0x7f13a9d65300
:4:command.cpp              :253 : 1762701380 us: [pid:1890  tid:0x7f13ab583980] Waiting for event 0x563d5de30fb0 to complete, current status 2
:3:rocvirtual.cpp           :210 : 1762701359 us: [pid:1890  tid:0x7f1396e006c0] Handler: value(0), timestamp(0x7f12843255a0), handle(0x7f13a9d65300)
:4:command.cpp              :268 : 1762701658 us: [pid:1890  tid:0x7f13ab583980] Event 0x563d5de30fb0 wait completed
:4:command.cpp              :173 : 1762701660 us: [pid:1890  tid:0x7f1396e006c0] Command 0x563d5de30fb0 complete (Wall: 1762701657, CPU: 0, GPU: 491 us)
:4:rocdevice.cpp            :2207: 1762701903 us: [pid:1890  tid:0x7f13ab583980] Free hsa memory 0x7f1282300000
:4:rocdevice.cpp            :2207: 1762701962 us: [pid:1890  tid:0x7f13ab583980] Free hsa memory (nil)
:3:rocdevice.cpp            :2914: 1762702018 us: [pid:1890  tid:0x7f13ab583980] releaseQueue refCount:0x7f13a9d56000 (0)
:3:hip_stream.cpp           :498 : 1762702077 us: [pid:1890  tid:0x7f13ab583980] hipStreamDestroy: Returned hipSuccess : 
:3:rocdevice.cpp            :2230: 1762703329 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3feffe698
:3:rocdevice.cpp            :2230: 1762790134 us: [pid:1890  tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3feffe6a0
Random seed: 3631874771
half precision max l-inf epsilon: 0
half precision max l2 epsilon:     0
single precision max l-inf epsilon: 0
single precision max l2 epsilon:     0
double precision max l-inf epsilon: 0
double precision max l2 epsilon:     0
Number of runtime issues: 0
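
Since the two relevant lines are easy to miss in a level-4 trace like the one above, here is a minimal sketch of a filter that keeps only the most severe runtime messages. It assumes the line layout seen in this log (`:<level>:<file> :<line>: <timestamp> us: [pid:… tid:…] <message>`). The regular expression and the function name `severe_messages` are illustrative assumptions, not part of ROCm or rocFFT.

```python
import re

# Layout inferred from the trace above (an assumption, not a documented format):
#   :<level>:<source file>   :<line>: <timestamp> us: [pid:<pid>  tid:<tid>] <message>
LOG_LINE = re.compile(
    r"^:(?P<level>\d):(?P<file>[^\s:]+)\s*:(?P<line>\d+)\s*:\s*"
    r"(?P<ts>\d+) us: \[pid:(?P<pid>\d+)\s+tid:(?P<tid>0x[0-9a-fA-F]+)\]\s?(?P<msg>.*)$"
)


def severe_messages(lines, max_level=1):
    """Yield (file, line, message) for runtime log lines at or below max_level.

    Lines that do not match the layout (for example Google Test output
    interleaved with the trace) are skipped.
    """
    for raw in lines:
        match = LOG_LINE.match(raw.rstrip("\n"))
        if match and int(match["level"]) <= max_level:
            yield match["file"], int(match["line"]), match["msg"].strip()


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python filter_log.py < rocfft-test.log
    for src, lineno, msg in severe_messages(sys.stdin):
        print(f"{src}:{lineno}: {msg}")
```

Fed the trace above, this would keep only the "Pcie atomics not enabled, hostcall not supported" and "AQL dispatch failed!" lines, which are the ones that matter here.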
