Test failures in rocm-device-libs-17
Hello,

I've prepared a rocm-device-libs-17 package [1] based on the
release/17.x branch that Jeremy Newton mentioned [2]. It is a separate
source package at the moment, but perhaps it should just be a new binary
package for rocm-device-libs? I don't know. The important thing is to
ensure that users can stay on rocm-device-libs for clang-15 until
rocm-compiler-support and rocm-hipamd have been updated to use clang-17,
otherwise all libraries written in the HIP language will FTBFS during
the transition. The creation of a new rocm-device-libs-17 package solves
that problem by enabling device libs for both clang-15 and clang-17 to
be installed during the transition, but perhaps there's a better way.

Unfortunately, I'm seeing some test failures. The logs are attached.
Fedora Rawhide apparently already packages this version of
rocm-device-libs, so I'm curious to know whether these failures are
specific to Debian.

Sincerely,
Cory Bloor

[1]: https://salsa.debian.org/cgmb/rocm-device-libs-17
[2]: https://lists.debian.org/debian-ai/2023/08/msg00157.html
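
Since the attached log is long, a small script can pull out just the per-test result lines for a quick overview. This is only a sketch: it assumes the ctest summary lines look like the ones in this log, and the function name `summarize` and the regular expression are my own, not part of ctest.

```python
import re

# Matches ctest result lines such as:
#   "1/21 Test #1: constant_fold_lgamma_r__gfx900 ....***Failed 0.17 sec"
#   "3/21 Test #3: compile_asin__gfx803 .............. Passed 0.16 sec"
SUMMARY = re.compile(
    r"^\s*\d+/\d+ Test\s+#\d+: (\S+) \.*\s*(\*\*\*Failed|Passed)\s+[\d.]+ sec"
)

def summarize(log_text):
    """Return a dict mapping test name to 'Passed' or 'Failed'."""
    results = {}
    for line in log_text.splitlines():
        m = SUMMARY.match(line)
        if m:
            status = "Passed" if m.group(2) == "Passed" else "Failed"
            results[m.group(1)] = status
    return results

# A few result lines copied from the log in this message.
sample = """\
1/21 Test #1: constant_fold_lgamma_r__gfx900 ....***Failed 0.17 sec
3/21 Test #3: compile_asin__gfx803 .............. Passed 0.16 sec
6/21 Test #6: compile_frexp__gfx600 .............***Failed 0.18 sec
"""

results = summarize(sample)
failed = sorted(name for name, status in results.items() if status == "Failed")
print(failed)
```

Lines that are not result lines (FileCheck dumps, CMake errors) are skipped, so the full log can be passed in unchanged.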
Test project /root/rocm-device-libs-17/rocm-device-libs-17/obj-x86_64-linux-gnu
Start 1: constant_fold_lgamma_r__gfx900
1/21 Test #1: constant_fold_lgamma_r__gfx900 ....***Failed 0.17 sec
CMake Error at /root/rocm-device-libs-17/rocm-device-libs-17/test/compile/RunConstantFoldTest.cmake:33 (message):
Error in test output:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/lgamma_r.cl:69:24:
error: CONSTANTFOLD-NEXT: expected string not found in input
// CONSTANTFOLD-NEXT: store volatile float 0x419DE28040000000,
^
output.lgamma_r.gfx900.s:34:23: note: scanning from here
store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
^
output.lgamma_r.gfx900.s:35:2: note: possible intended match here
store volatile float 0x419DE28060000000, ptr addrspace(1) %0, align 4, !tbaa !13
^
Input file: output.lgamma_r.gfx900.s
Check file:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/lgamma_r.cl
-dump-input=help explains the following input dump.
Input was:
<<<<<<
.
.
.
29: store volatile float 0x3FE62E4300000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:57 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:60 ^~~~~~~~~~~~~~~~~~~~~
31: store volatile float 0x3FE250D040000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:61 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:64 ^~~~~~~~~~~~~~~~~~~~~
33: store volatile float 0x405601E680000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:65 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:68 ^~~~~~~~~~~~~~~~~~~~~
next:69'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
35: store volatile float 0x419DE28060000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:69'1 ? possible intended match
36: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
37: store volatile float 0x419DE28000000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39: store volatile float 0xC19DE28040000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
40: store volatile i32 0, ptr addrspace(1) %1, align 4, !tbaa !9
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.
.
.
>>>>>>
Start 2: constant_fold_lgamma_r__gfx1030
2/21 Test #2: constant_fold_lgamma_r__gfx1030 ...***Failed 0.16 sec
CMake Error at /root/rocm-device-libs-17/rocm-device-libs-17/test/compile/RunConstantFoldTest.cmake:33 (message):
Error in test output:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/lgamma_r.cl:69:24:
error: CONSTANTFOLD-NEXT: expected string not found in input
// CONSTANTFOLD-NEXT: store volatile float 0x419DE28040000000,
^
output.lgamma_r.gfx1030.s:34:23: note: scanning from here
store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
^
output.lgamma_r.gfx1030.s:35:2: note: possible intended match here
store volatile float 0x419DE28060000000, ptr addrspace(1) %0, align 4, !tbaa !13
^
Input file: output.lgamma_r.gfx1030.s
Check file:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/lgamma_r.cl
-dump-input=help explains the following input dump.
Input was:
<<<<<<
.
.
.
29: store volatile float 0x3FE62E4300000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:57 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:60 ^~~~~~~~~~~~~~~~~~~~~
31: store volatile float 0x3FE250D040000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:61 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:64 ^~~~~~~~~~~~~~~~~~~~~
33: store volatile float 0x405601E680000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:65 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:68 ^~~~~~~~~~~~~~~~~~~~~
next:69'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
35: store volatile float 0x419DE28060000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:69'1 ? possible intended match
36: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
37: store volatile float 0x419DE28000000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38: store volatile i32 1, ptr addrspace(1) %1, align 4, !tbaa !9
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39: store volatile float 0xC19DE28040000000, ptr addrspace(1) %0, align 4, !tbaa !13
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
40: store volatile i32 0, ptr addrspace(1) %1, align 4, !tbaa !9
next:69'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.
.
.
>>>>>>
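
In both lgamma_r failures above, the expected constant (0x419DE28040000000) and the folded constant (0x419DE28060000000) differ in a single hex digit. LLVM IR prints a `float` constant as the bit pattern of the equivalent double, so the gap can be measured by converting each one back to its 32-bit encoding. The following is a rough sketch in Python; the helper name `llvm_hex_to_float32_bits` is mine and not part of any LLVM tool.

```python
import struct

def llvm_hex_to_float32_bits(hex_str):
    """Convert an LLVM IR hex float constant (a double bit pattern)
    to the bit pattern of the corresponding 32-bit float."""
    # Reinterpret the 16 hex digits as a big-endian IEEE 754 double.
    as_double = struct.unpack(">d", bytes.fromhex(hex_str[2:]))[0]
    # Repack as a 32-bit float and read back its raw bits.
    return struct.unpack(">I", struct.pack(">f", as_double))[0]

expected_bits = llvm_hex_to_float32_bits("0x419DE28040000000")  # from lgamma_r.cl:69
actual_bits = llvm_hex_to_float32_bits("0x419DE28060000000")    # from the compiler output

# For finite floats of the same sign, the difference between the raw
# bit patterns is the distance in units in the last place (ULP).
ulp_diff = actual_bits - expected_bits
print(hex(expected_bits), hex(actual_bits), ulp_diff)
```

Under that reading, the folded value is one float32 ULP above the value the test expects, which suggests a rounding difference in the constant folding rather than a grossly wrong result. It does not by itself say which side is correct.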
Start 3: compile_asin__gfx803
3/21 Test #3: compile_asin__gfx803 .............. Passed 0.16 sec
Start 4: compile_atan2__gfx803
4/21 Test #4: compile_atan2__gfx803 ............. Passed 0.16 sec
Start 5: compile_atan2pi__gfx803
5/21 Test #5: compile_atan2pi__gfx803 ........... Passed 0.16 sec
Start 6: compile_frexp__gfx600
6/21 Test #6: compile_frexp__gfx600 .............***Failed 0.18 sec
CMake Error at /root/rocm-device-libs-17/rocm-device-libs-17/test/compile/RunCompileTest.cmake:36 (message):
Error in test output:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/frexp.cl:8:16:
error: GCN-DAG: expected string not found in input
// GFX600-DAG: s_movk_i32 [[CLASS_MASK:s[0-9]+]], 0x1f8
^
output.frexp.gfx600.s:7:16: note: scanning from here
test_frexp_f32: ; @test_frexp_f32
^
output.frexp.gfx600.s:11:2: note: possible intended match here
s_mov_b32 s2, 0
^
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/frexp.cl:34:16:
error: CHECK-DAG: expected string not found in input
// GFX600-DAG: s_movk_i32 [[CLASS_MASK:s[0-9]+]], 0x1f8
^
output.frexp.gfx600.s:98:16: note: scanning from here
test_frexp_f64: ; @test_frexp_f64
^
output.frexp.gfx600.s:102:2: note: possible intended match here
s_mov_b32 s2, 0
^
Input file: output.frexp.gfx600.s
Check file:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/frexp.cl
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: .text
2: .amdgcn_target "amdgcn-amd-amdhsa--gfx600"
3: .protected test_frexp_f32 ; -- Begin function test_frexp_f32
4: .globl test_frexp_f32
5: .p2align 8
6: .type test_frexp_f32,@function
7: test_frexp_f32: ; @test_frexp_f32
label:7'0 ^~~~~~~~~~~~~~~
label:7'1 ^~~~~~~~~~~~~~~
dag:8'0 X~~~~~~~~~~~~~~~~~~ error: no match found
8: ; %bb.0:
dag:8'0 ~~~~~~~~~
9: s_load_dwordx2 s[0:1], s[4:5], 0x4
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10: s_mov_b32 s3, 0x100f000
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
11: s_mov_b32 s2, 0
dag:8'0 ~~~~~~~~~~~~~~~~~
dag:8'1 ? possible intended match
12: v_lshlrev_b32_e32 v0, 2, v0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
13: v_mov_b32_e32 v1, 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~
14: s_waitcnt lgkmcnt(0)
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
15: buffer_load_dword v2, v[0:1], s[0:3], 0 addr64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
16: s_load_dwordx4 s[4:7], s[4:5], 0x0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17: s_mov_b32 s8, 0x7f800000
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
18: s_waitcnt lgkmcnt(0)
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
19: s_mov_b64 s[0:1], s[4:5]
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
20: s_waitcnt vmcnt(0)
dag:8'0 ~~~~~~~~~~~~~~~~~~~~
21: v_frexp_mant_f32_e32 v3, v2
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
22: v_cmp_lt_f32_e64 vcc, |v2|, s8
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
23: v_frexp_exp_i32_f32_e32 v4, v2
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
24: v_cndmask_b32_e32 v2, v2, v3, vcc
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25: v_cndmask_b32_e32 v3, 0, v4, vcc
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
26: buffer_store_dword v2, v[0:1], s[0:3], 0 addr64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
27: s_mov_b64 s[0:1], s[6:7]
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
28: buffer_store_dword v3, v[0:1], s[0:3], 0 addr64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29: s_endpgm
dag:8'0 ~~~~~~~~~~
30: .section .rodata,#alloc
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
31: .p2align 6, 0x0
dag:8'0 ~~~~~~~~~~~~~~~~~
32: .amdhsa_kernel test_frexp_f32
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
33: .amdhsa_group_segment_fixed_size 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34: .amdhsa_private_segment_fixed_size 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
35: .amdhsa_kernarg_size 24
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
36: .amdhsa_user_sgpr_count 6
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
37: .amdhsa_user_sgpr_private_segment_buffer 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38: .amdhsa_user_sgpr_dispatch_ptr 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39: .amdhsa_user_sgpr_queue_ptr 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
40: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
41: .amdhsa_user_sgpr_dispatch_id 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
42: .amdhsa_user_sgpr_flat_scratch_init 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
43: .amdhsa_user_sgpr_private_segment_size 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
44: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45: .amdhsa_system_sgpr_workgroup_id_x 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
46: .amdhsa_system_sgpr_workgroup_id_y 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
47: .amdhsa_system_sgpr_workgroup_id_z 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
48: .amdhsa_system_sgpr_workgroup_info 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
49: .amdhsa_system_vgpr_workitem_id 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50: .amdhsa_next_free_vgpr 5
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
51: .amdhsa_next_free_sgpr 9
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
52: .amdhsa_float_round_mode_32 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53: .amdhsa_float_round_mode_16_64 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
54: .amdhsa_float_denorm_mode_32 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
55: .amdhsa_float_denorm_mode_16_64 3
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
56: .amdhsa_dx10_clamp 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
57: .amdhsa_ieee_mode 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~
58: .amdhsa_exception_fp_ieee_invalid_op 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
59: .amdhsa_exception_fp_denorm_src 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60: .amdhsa_exception_fp_ieee_div_zero 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
61: .amdhsa_exception_fp_ieee_overflow 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
62: .amdhsa_exception_fp_ieee_underflow 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
63: .amdhsa_exception_fp_ieee_inexact 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
64: .amdhsa_exception_int_div_zero 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
65: .end_amdhsa_kernel
dag:8'0 ~~~~~~~~~~~~~~~~~~~~
66: .text
dag:8'0 ~~~~~~~
67: .Lfunc_end0:
dag:8'0 ~~~~~~~~~~~~~
68: .size test_frexp_f32, .Lfunc_end0-test_frexp_f32
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
69: ; -- End function
dag:8'0 ~~~~~~~~~~~~~~~~~~~
70: .section .AMDGPU.csdata
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
71: ; Kernel info:
dag:8'0 ~~~~~~~~~~~~~~~
72: ; codeLenInByte = 112
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
73: ; NumSgprs: 11
dag:8'0 ~~~~~~~~~~~~~~~
74: ; NumVgprs: 5
dag:8'0 ~~~~~~~~~~~~~~
75: ; ScratchSize: 0
dag:8'0 ~~~~~~~~~~~~~~~~~
76: ; MemoryBound: 0
dag:8'0 ~~~~~~~~~~~~~~~~~
77: ; FloatMode: 192
dag:8'0 ~~~~~~~~~~~~~~~~~
78: ; IeeeMode: 1
dag:8'0 ~~~~~~~~~~~~~~
79: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80: ; SGPRBlocks: 1
dag:8'0 ~~~~~~~~~~~~~~~~
81: ; VGPRBlocks: 1
dag:8'0 ~~~~~~~~~~~~~~~~
82: ; NumSGPRsForWavesPerEU: 11
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
83: ; NumVGPRsForWavesPerEU: 5
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
84: ; Occupancy: 10
dag:8'0 ~~~~~~~~~~~~~~~~
85: ; WaveLimiterHint : 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~
86: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
87: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
88: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
89: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
90: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
91: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
92: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
93: .text
dag:8'0 ~~~~~~~
94: .protected test_frexp_f64 ; -- Begin function test_frexp_f64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
95: .globl test_frexp_f64
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~
96: .p2align 8
dag:8'0 ~~~~~~~~~~~~
97: .type test_frexp_f64,@function
dag:8'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
98: test_frexp_f64: ; @test_frexp_f64
label:32 ^~~~~~~~~~~~~~~
dag:8'0 ~~~~~~~~~~~~~~~
dag:34'0 X~~~~~~~~~~~~~~~~~~ error: no match found
99: ; %bb.0:
dag:34'0 ~~~~~~~~~
100: s_load_dwordx2 s[0:1], s[4:5], 0x4
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101: s_mov_b32 s3, 0x100f000
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
102: s_mov_b32 s2, 0
dag:34'0 ~~~~~~~~~~~~~~~~~
dag:34'1 ? possible intended match
103: v_lshlrev_b32_e32 v1, 3, v0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
104: v_mov_b32_e32 v2, 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
105: s_waitcnt lgkmcnt(0)
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
106: buffer_load_dwordx2 v[3:4], v[1:2], s[0:3], 0 addr64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
107: s_load_dwordx4 s[4:7], s[4:5], 0x0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
108: s_mov_b32 s9, 0x7ff00000
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
109: s_mov_b32 s8, s2
dag:34'0 ~~~~~~~~~~~~~~~~~~
110: s_waitcnt lgkmcnt(0)
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
111: s_mov_b64 s[0:1], s[4:5]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
112: s_waitcnt vmcnt(0)
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
113: v_frexp_mant_f64_e32 v[5:6], v[3:4]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
114: v_cmp_lt_f64_e64 vcc, |v[3:4]|, s[8:9]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
115: v_frexp_exp_i32_f64_e32 v7, v[3:4]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
116: v_cndmask_b32_e32 v4, v4, v6, vcc
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
117: v_cndmask_b32_e32 v3, v3, v5, vcc
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
118: v_cndmask_b32_e32 v5, 0, v7, vcc
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
119: buffer_store_dwordx2 v[3:4], v[1:2], s[0:3], 0 addr64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
120: s_mov_b64 s[0:1], s[6:7]
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
121: v_lshlrev_b32_e32 v1, 2, v0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
122: buffer_store_dword v5, v[1:2], s[0:3], 0 addr64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
123: s_endpgm
dag:34'0 ~~~~~~~~~~
124: .section .rodata,#alloc
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
125: .p2align 6, 0x0
dag:34'0 ~~~~~~~~~~~~~~~~~
126: .amdhsa_kernel test_frexp_f64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
127: .amdhsa_group_segment_fixed_size 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
128: .amdhsa_private_segment_fixed_size 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
129: .amdhsa_kernarg_size 24
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
130: .amdhsa_user_sgpr_count 6
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
131: .amdhsa_user_sgpr_private_segment_buffer 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
132: .amdhsa_user_sgpr_dispatch_ptr 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
133: .amdhsa_user_sgpr_queue_ptr 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
134: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
135: .amdhsa_user_sgpr_dispatch_id 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
136: .amdhsa_user_sgpr_flat_scratch_init 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
137: .amdhsa_user_sgpr_private_segment_size 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
138: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
139: .amdhsa_system_sgpr_workgroup_id_x 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
140: .amdhsa_system_sgpr_workgroup_id_y 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
141: .amdhsa_system_sgpr_workgroup_id_z 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
142: .amdhsa_system_sgpr_workgroup_info 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
143: .amdhsa_system_vgpr_workitem_id 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
144: .amdhsa_next_free_vgpr 8
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
145: .amdhsa_next_free_sgpr 10
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
146: .amdhsa_float_round_mode_32 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147: .amdhsa_float_round_mode_16_64 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
148: .amdhsa_float_denorm_mode_32 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
149: .amdhsa_float_denorm_mode_16_64 3
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
150: .amdhsa_dx10_clamp 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
151: .amdhsa_ieee_mode 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
152: .amdhsa_exception_fp_ieee_invalid_op 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
153: .amdhsa_exception_fp_denorm_src 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
154: .amdhsa_exception_fp_ieee_div_zero 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
155: .amdhsa_exception_fp_ieee_overflow 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
156: .amdhsa_exception_fp_ieee_underflow 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
157: .amdhsa_exception_fp_ieee_inexact 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
158: .amdhsa_exception_int_div_zero 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
159: .end_amdhsa_kernel
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
160: .text
dag:34'0 ~~~~~~~
161: .Lfunc_end1:
dag:34'0 ~~~~~~~~~~~~~
162: .size test_frexp_f64, .Lfunc_end1-test_frexp_f64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
163: ; -- End function
dag:34'0 ~~~~~~~~~~~~~~~~~~~
164: .section .AMDGPU.csdata
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
165: ; Kernel info:
dag:34'0 ~~~~~~~~~~~~~~~
166: ; codeLenInByte = 124
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
167: ; NumSgprs: 12
dag:34'0 ~~~~~~~~~~~~~~~
168: ; NumVgprs: 8
dag:34'0 ~~~~~~~~~~~~~~
169: ; ScratchSize: 0
dag:34'0 ~~~~~~~~~~~~~~~~~
170: ; MemoryBound: 0
dag:34'0 ~~~~~~~~~~~~~~~~~
171: ; FloatMode: 192
dag:34'0 ~~~~~~~~~~~~~~~~~
172: ; IeeeMode: 1
dag:34'0 ~~~~~~~~~~~~~~
173: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
174: ; SGPRBlocks: 1
dag:34'0 ~~~~~~~~~~~~~~~~
175: ; VGPRBlocks: 1
dag:34'0 ~~~~~~~~~~~~~~~~
176: ; NumSGPRsForWavesPerEU: 12
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
177: ; NumVGPRsForWavesPerEU: 8
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
178: ; Occupancy: 10
dag:34'0 ~~~~~~~~~~~~~~~~
179: ; WaveLimiterHint : 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
180: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
181: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
182: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
183: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
184: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
185: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
186: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
187: .ident "Debian clang version 17.0.6 (2)"
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
188: .section ".note.GNU-stack"
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
189: .addrsig
dag:34'0 ~~~~~~~~~~
190: .amdgpu_metadata
dag:34'0 ~~~~~~~~~~~~~~~~~~
191: ---
dag:34'0 ~~~~
192: amdhsa.kernels:
dag:34'0 ~~~~~~~~~~~~~~~~
193: - .args:
dag:34'0 ~~~~~~~~~~
194: - .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
195: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
196: .offset: 0
dag:34'0 ~~~~~~~~~~~~
197: .size: 8
dag:34'0 ~~~~~~~~~~
198: .type_name: 'float*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
199: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
200: - .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
201: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
202: .offset: 8
dag:34'0 ~~~~~~~~~~~~
203: .size: 8
dag:34'0 ~~~~~~~~~~
204: .type_name: 'int*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
205: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
206: - .access: read_only
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
207: .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~
208: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
209: .offset: 16
dag:34'0 ~~~~~~~~~~~~~
210: .size: 8
dag:34'0 ~~~~~~~~~~
211: .type_name: 'float*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
212: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
213: .group_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
214: .kernarg_segment_align: 8
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
215: .kernarg_segment_size: 24
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
216: .language: OpenCL C
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
217: .language_version:
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
218: - 2
dag:34'0 ~~~~~
219: - 0
dag:34'0 ~~~~~
220: .max_flat_workgroup_size: 256
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
221: .name: test_frexp_f32
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
222: .private_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
223: .sgpr_count: 11
dag:34'0 ~~~~~~~~~~~~~~~~~
224: .sgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
225: .symbol: test_frexp_f32.kd
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
226: .vgpr_count: 5
dag:34'0 ~~~~~~~~~~~~~~~~
227: .vgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
228: .wavefront_size: 64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
229: - .args:
dag:34'0 ~~~~~~~~~~
230: - .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
231: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
232: .offset: 0
dag:34'0 ~~~~~~~~~~~~
233: .size: 8
dag:34'0 ~~~~~~~~~~
234: .type_name: 'double*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
235: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
236: - .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
237: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
238: .offset: 8
dag:34'0 ~~~~~~~~~~~~
239: .size: 8
dag:34'0 ~~~~~~~~~~
240: .type_name: 'int*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
241: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
242: - .access: read_only
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
243: .address_space: global
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~
244: .is_restrict: true
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
245: .offset: 16
dag:34'0 ~~~~~~~~~~~~~
246: .size: 8
dag:34'0 ~~~~~~~~~~
247: .type_name: 'double*'
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
248: .value_kind: global_buffer
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
249: .group_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
250: .kernarg_segment_align: 8
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
251: .kernarg_segment_size: 24
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
252: .language: OpenCL C
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
253: .language_version:
dag:34'0 ~~~~~~~~~~~~~~~~~~~~
254: - 2
dag:34'0 ~~~~~
255: - 0
dag:34'0 ~~~~~
256: .max_flat_workgroup_size: 256
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
257: .name: test_frexp_f64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~
258: .private_segment_fixed_size: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
259: .sgpr_count: 12
dag:34'0 ~~~~~~~~~~~~~~~~~
260: .sgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
261: .symbol: test_frexp_f64.kd
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
262: .vgpr_count: 8
dag:34'0 ~~~~~~~~~~~~~~~~
263: .vgpr_spill_count: 0
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
264: .wavefront_size: 64
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~
265: amdhsa.target: amdgcn-amd-amdhsa--gfx600
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
266: amdhsa.version:
dag:34'0 ~~~~~~~~~~~~~~~~
267: - 1
dag:34'0 ~~~~~
268: - 1
dag:34'0 ~~~~~
269: ...
dag:34'0 ~~~~
270:
dag:34'0 ~
271: .end_amdgpu_metadata
dag:34'0 ~~~~~~~~~~~~~~~~~~~~~~
>>>>>>
Start 7: compile_fract__gfx600
7/21 Test #7: compile_fract__gfx600 .............***Failed 0.18 sec
CMake Error at /root/rocm-device-libs-17/rocm-device-libs-17/test/compile/RunCompileTest.cmake:36 (message):
Error in test output:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl:9:16:
error: CHECK-DAG: expected string not found in input
// GFX600-DAG: v_cmp_neq_f32
^
output.fract.gfx600.s:19:15: note: scanning from here
v_cvt_f32_f16_e32 v3, v2
^
output.fract.gfx600.s:26:2: note: possible intended match here
v_cmp_ne_u32_e32 vcc, s0, v2
^
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl:49:16:
error: CHECK-DAG: expected string not found in input
// GFX600-DAG: v_cmp_neq_f32
^
output.fract.gfx600.s:100:27: note: scanning from here
.protected test_fract_f32 ; -- Begin function test_fract_f32
^
output.fract.gfx600.s:124:2: note: possible intended match here
v_cmp_u_f32_e32 vcc, v2, v2
^
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl:80:12:
error: GFX600: expected string not found in input
// GFX600: v_cmp_neq_f64
^
output.fract.gfx600.s:227:11: note: scanning from here
v_min_f64 v[4:5], v[6:7], s[8:9]
^
output.fract.gfx600.s:230:2: note: possible intended match here
v_cmp_class_f64_e32 vcc, v[2:3], v8
^
Input file: output.fract.gfx600.s
Check file:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: .text
2: .amdgcn_target "amdgcn-amd-amdhsa--gfx600"
3: .protected test_fract_f16 ; -- Begin function test_fract_f16
label:3'0 ^~~~~~~~~~~~~~
label:3'1 ^~~~~~~~~~~~~~
4: .globl test_fract_f16
5: .p2align 8
6: .type test_fract_f16,@function
7: test_fract_f16: ; @test_fract_f16
8: ; %bb.0:
9: s_load_dwordx2 s[0:1], s[4:5], 0x4
10: s_mov_b32 s3, 0x100f000
11: s_mov_b32 s2, 0
12: v_lshlrev_b32_e32 v0, 1, v0
13: v_mov_b32_e32 v1, 0
14: s_waitcnt lgkmcnt(0)
15: buffer_load_ushort v2, v[0:1], s[0:3], 0 addr64
16: s_movk_i32 s0, 0x7c00
17: s_load_dwordx4 s[4:7], s[4:5], 0x0
18: s_waitcnt vmcnt(0)
19: v_cvt_f32_f16_e32 v3, v2
check:4 ^~~~~~~~~~~~~
dag:9'0 X~~~~~~~~~~~ error: no match found
20: v_and_b32_e32 v2, 0x7fff, v2
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21: v_floor_f32_e32 v4, v3
dag:5 ^~~~~~~~~~~
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~
22: v_sub_f32_e32 v5, v3, v4
dag:6 ^~~~~~~~~
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
23: v_min_f32_e32 v5, 0x3f7fe000, v5
dag:7 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
24: v_cmp_u_f32_e32 vcc, v3, v3
dag:8 ^~~~~~~~~~~
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25: v_cndmask_b32_e32 v3, v5, v3, vcc
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
26: v_cmp_ne_u32_e32 vcc, s0, v2
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:9'1 ? possible intended match
27: v_cndmask_b32_e32 v2, 0, v3, vcc
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
28: v_cvt_f16_f32_e32 v4, v4
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
29: v_cvt_f16_f32_e32 v2, v2
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
30: s_waitcnt lgkmcnt(0)
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~
31: s_mov_b64 s[0:1], s[6:7]
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
32: s_mov_b64 s[6:7], s[2:3]
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
33: buffer_store_short v4, v[0:1], s[0:3], 0 addr64
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34: buffer_store_short v2, v[0:1], s[4:7], 0 addr64
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
35: s_endpgm
dag:9'0 ~~~~~~~~~~
36: .section .rodata,#alloc
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
37: .p2align 6, 0x0
dag:9'0 ~~~~~~~~~~~~~~~~~
38: .amdhsa_kernel test_fract_f16
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39: .amdhsa_group_segment_fixed_size 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
40: .amdhsa_private_segment_fixed_size 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
41: .amdhsa_kernarg_size 24
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
42: .amdhsa_user_sgpr_count 6
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
43: .amdhsa_user_sgpr_private_segment_buffer 1
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
44: .amdhsa_user_sgpr_dispatch_ptr 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45: .amdhsa_user_sgpr_queue_ptr 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
46: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
47: .amdhsa_user_sgpr_dispatch_id 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
48: .amdhsa_user_sgpr_flat_scratch_init 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
49: .amdhsa_user_sgpr_private_segment_size 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
51: .amdhsa_system_sgpr_workgroup_id_x 1
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
52: .amdhsa_system_sgpr_workgroup_id_y 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53: .amdhsa_system_sgpr_workgroup_id_z 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
54: .amdhsa_system_sgpr_workgroup_info 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
55: .amdhsa_system_vgpr_workitem_id 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
56: .amdhsa_next_free_vgpr 6
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
57: .amdhsa_next_free_sgpr 8
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
58: .amdhsa_float_round_mode_32 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
59: .amdhsa_float_round_mode_16_64 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60: .amdhsa_float_denorm_mode_32 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
61: .amdhsa_float_denorm_mode_16_64 3
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
62: .amdhsa_dx10_clamp 1
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~
63: .amdhsa_ieee_mode 1
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~
64: .amdhsa_exception_fp_ieee_invalid_op 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
65: .amdhsa_exception_fp_denorm_src 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
66: .amdhsa_exception_fp_ieee_div_zero 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
67: .amdhsa_exception_fp_ieee_overflow 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
68: .amdhsa_exception_fp_ieee_underflow 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
69: .amdhsa_exception_fp_ieee_inexact 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
70: .amdhsa_exception_int_div_zero 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
71: .end_amdhsa_kernel
dag:9'0 ~~~~~~~~~~~~~~~~~~~~
72: .text
dag:9'0 ~~~~~~~
73: .Lfunc_end0:
dag:9'0 ~~~~~~~~~~~~~
74: .size test_fract_f16, .Lfunc_end0-test_fract_f16
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
75: ; -- End function
dag:9'0 ~~~~~~~~~~~~~~~~~~~
76: .section .AMDGPU.csdata
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
77: ; Kernel info:
dag:9'0 ~~~~~~~~~~~~~~~
78: ; codeLenInByte = 132
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~
79: ; NumSgprs: 10
dag:9'0 ~~~~~~~~~~~~~~~
80: ; NumVgprs: 6
dag:9'0 ~~~~~~~~~~~~~~
81: ; ScratchSize: 0
dag:9'0 ~~~~~~~~~~~~~~~~~
82: ; MemoryBound: 0
dag:9'0 ~~~~~~~~~~~~~~~~~
83: ; FloatMode: 192
dag:9'0 ~~~~~~~~~~~~~~~~~
84: ; IeeeMode: 1
dag:9'0 ~~~~~~~~~~~~~~
85: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
86: ; SGPRBlocks: 1
dag:9'0 ~~~~~~~~~~~~~~~~
87: ; VGPRBlocks: 1
dag:9'0 ~~~~~~~~~~~~~~~~
88: ; NumSGPRsForWavesPerEU: 10
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
89: ; NumVGPRsForWavesPerEU: 6
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
90: ; Occupancy: 10
dag:9'0 ~~~~~~~~~~~~~~~~
91: ; WaveLimiterHint : 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~
92: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
93: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
94: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
95: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
96: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
97: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
98: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
99: .text
dag:9'0 ~~~~~~~
100: .protected test_fract_f32 ; -- Begin function test_fract_f32
label:43 ^~~~~~~~~~~~~~
dag:9'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:49'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
101: .globl test_fract_f32
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~
102: .p2align 8
dag:49'0 ~~~~~~~~~~~~
103: .type test_fract_f32,@function
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
104: test_fract_f32: ; @test_fract_f32
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
105: ; %bb.0:
dag:49'0 ~~~~~~~~~
106: s_load_dwordx2 s[0:1], s[4:5], 0x4
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
107: s_mov_b32 s3, 0x100f000
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
108: s_mov_b32 s2, 0
dag:49'0 ~~~~~~~~~~~~~~~~~
109: v_lshlrev_b32_e32 v0, 2, v0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
110: v_mov_b32_e32 v1, 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~
111: s_waitcnt lgkmcnt(0)
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~
112: buffer_load_dword v2, v[0:1], s[0:3], 0 addr64
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
113: s_load_dwordx4 s[4:7], s[4:5], 0x0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
114: v_mov_b32_e32 v3, 0x204
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
115: s_waitcnt lgkmcnt(0)
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~
116: s_mov_b64 s[0:1], s[6:7]
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
117: s_mov_b64 s[6:7], s[2:3]
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
118: s_waitcnt vmcnt(0)
dag:49'0 ~~~~~~~~~~~~~~~~~~~~
119: v_floor_f32_e32 v4, v2
dag:44 ^~~~~~~~~~~
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~
120: v_sub_f32_e32 v5, v2, v4
dag:45 ^~~~~~~~~
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
121: buffer_store_dword v4, v[0:1], s[0:3], 0 addr64
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
122: s_waitcnt expcnt(0)
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~
123: v_min_f32_e32 v4, 0x3f7fffff, v5
dag:46 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
124: v_cmp_u_f32_e32 vcc, v2, v2
dag:47 ^~~~~~~~~~~
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:49'1 ? possible intended match
125: v_cndmask_b32_e32 v4, v4, v2, vcc
dag:48 ^~~~~~~~~~~~~
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
126: v_cmp_class_f32_e32 vcc, v2, v3
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
127: v_cndmask_b32_e64 v2, v4, 0, vcc
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
128: buffer_store_dword v2, v[0:1], s[4:7], 0 addr64
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
129: s_endpgm
dag:49'0 ~~~~~~~~~~
130: .section .rodata,#alloc
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
131: .p2align 6, 0x0
dag:49'0 ~~~~~~~~~~~~~~~~~
132: .amdhsa_kernel test_fract_f32
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
133: .amdhsa_group_segment_fixed_size 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
134: .amdhsa_private_segment_fixed_size 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
135: .amdhsa_kernarg_size 24
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
136: .amdhsa_user_sgpr_count 6
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
137: .amdhsa_user_sgpr_private_segment_buffer 1
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
138: .amdhsa_user_sgpr_dispatch_ptr 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
139: .amdhsa_user_sgpr_queue_ptr 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
140: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
141: .amdhsa_user_sgpr_dispatch_id 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
142: .amdhsa_user_sgpr_flat_scratch_init 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
143: .amdhsa_user_sgpr_private_segment_size 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
144: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
145: .amdhsa_system_sgpr_workgroup_id_x 1
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
146: .amdhsa_system_sgpr_workgroup_id_y 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147: .amdhsa_system_sgpr_workgroup_id_z 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
148: .amdhsa_system_sgpr_workgroup_info 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
149: .amdhsa_system_vgpr_workitem_id 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
150: .amdhsa_next_free_vgpr 6
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
151: .amdhsa_next_free_sgpr 8
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
152: .amdhsa_float_round_mode_32 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
153: .amdhsa_float_round_mode_16_64 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
154: .amdhsa_float_denorm_mode_32 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
155: .amdhsa_float_denorm_mode_16_64 3
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
156: .amdhsa_dx10_clamp 1
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~
157: .amdhsa_ieee_mode 1
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~
158: .amdhsa_exception_fp_ieee_invalid_op 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
159: .amdhsa_exception_fp_denorm_src 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
160: .amdhsa_exception_fp_ieee_div_zero 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
161: .amdhsa_exception_fp_ieee_overflow 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
162: .amdhsa_exception_fp_ieee_underflow 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
163: .amdhsa_exception_fp_ieee_inexact 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
164: .amdhsa_exception_int_div_zero 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
165: .end_amdhsa_kernel
dag:49'0 ~~~~~~~~~~~~~~~~~~~~
166: .text
dag:49'0 ~~~~~~~
167: .Lfunc_end1:
dag:49'0 ~~~~~~~~~~~~~
168: .size test_fract_f32, .Lfunc_end1-test_fract_f32
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
169: ; -- End function
dag:49'0 ~~~~~~~~~~~~~~~~~~~
170: .section .AMDGPU.csdata
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
171: ; Kernel info:
dag:49'0 ~~~~~~~~~~~~~~~
172: ; codeLenInByte = 128
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~
173: ; NumSgprs: 10
dag:49'0 ~~~~~~~~~~~~~~~
174: ; NumVgprs: 6
dag:49'0 ~~~~~~~~~~~~~~
175: ; ScratchSize: 0
dag:49'0 ~~~~~~~~~~~~~~~~~
176: ; MemoryBound: 0
dag:49'0 ~~~~~~~~~~~~~~~~~
177: ; FloatMode: 192
dag:49'0 ~~~~~~~~~~~~~~~~~
178: ; IeeeMode: 1
dag:49'0 ~~~~~~~~~~~~~~
179: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
180: ; SGPRBlocks: 1
dag:49'0 ~~~~~~~~~~~~~~~~
181: ; VGPRBlocks: 1
dag:49'0 ~~~~~~~~~~~~~~~~
182: ; NumSGPRsForWavesPerEU: 10
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
183: ; NumVGPRsForWavesPerEU: 6
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
184: ; Occupancy: 10
dag:49'0 ~~~~~~~~~~~~~~~~
185: ; WaveLimiterHint : 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~
186: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
187: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
188: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
189: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
190: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
191: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
192: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
193: .text
dag:49'0 ~~~~~~~
194: .protected test_fract_f64 ; -- Begin function test_fract_f64
label:68 ^~~~~~~~~~~~~~
dag:49'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
195: .globl test_fract_f64
196: .p2align 8
197: .type test_fract_f64,@function
198: test_fract_f64: ; @test_fract_f64
199: ; %bb.0:
200: s_load_dwordx2 s[0:1], s[4:5], 0x4
201: s_mov_b32 s3, 0x100f000
202: s_mov_b32 s2, 0
203: v_lshlrev_b32_e32 v0, 3, v0
204: v_mov_b32_e32 v1, 0
205: s_waitcnt lgkmcnt(0)
206: buffer_load_dwordx2 v[2:3], v[0:1], s[0:3], 0 addr64
207: v_mov_b32_e32 v4, -1
208: v_mov_b32_e32 v5, 0x3fefffff
209: s_load_dwordx4 s[4:7], s[4:5], 0x0
210: s_mov_b32 s8, -1
211: s_mov_b32 s9, 0x3fefffff
212: v_mov_b32_e32 v8, 0x204
213: s_waitcnt lgkmcnt(0)
214: s_mov_b64 s[0:1], s[6:7]
215: s_mov_b64 s[6:7], s[2:3]
216: s_waitcnt vmcnt(0)
217: v_fract_f64_e32 v[6:7], v[2:3]
check:71 ^~~~~~~~~~~~~~~
218: v_cmp_class_f64_e64 vcc, v[2:3], 3
check:72 ^~~~~~~~~~~~~~~~~~~
219: v_min_f64 v[4:5], v[6:7], v[4:5]
check:73 ^~~~~~~~~
220: v_cndmask_b32_e32 v4, v4, v2, vcc
check:74 ^~~~~~~~~~~~~
221: v_cndmask_b32_e32 v5, v5, v3, vcc
check:75 ^~~~~~~~~~~~~
222: v_add_f64 v[4:5], v[2:3], -v[4:5]
check:76 ^~~~~~~~~
223: v_cmp_u_f64_e32 vcc, v[2:3], v[2:3]
check:77 ^~~~~~~~~~~
224: v_add_f64 v[6:7], v[2:3], -v[4:5]
check:78 ^~~~~~~~~
225: buffer_store_dwordx2 v[4:5], v[0:1], s[0:3], 0 addr64
226: s_waitcnt expcnt(0)
227: v_min_f64 v[4:5], v[6:7], s[8:9]
check:79 ^~~~~~~~~
check:80'0 X~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
228: v_cndmask_b32_e32 v4, v4, v2, vcc
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
229: v_cndmask_b32_e32 v5, v5, v3, vcc
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
230: v_cmp_class_f64_e32 vcc, v[2:3], v8
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:80'1 ? possible intended match
231: v_cndmask_b32_e64 v3, v5, 0, vcc
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
232: v_cndmask_b32_e64 v2, v4, 0, vcc
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
233: buffer_store_dwordx2 v[2:3], v[0:1], s[4:7], 0 addr64
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
234: s_endpgm
check:80'0 ~~~~~~~~~~
235: .section .rodata,#alloc
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
236: .p2align 6, 0x0
check:80'0 ~~~~~~~~~~~~~~~~~
237: .amdhsa_kernel test_fract_f64
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
238: .amdhsa_group_segment_fixed_size 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
239: .amdhsa_private_segment_fixed_size 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
240: .amdhsa_kernarg_size 24
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
241: .amdhsa_user_sgpr_count 6
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
242: .amdhsa_user_sgpr_private_segment_buffer 1
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
243: .amdhsa_user_sgpr_dispatch_ptr 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
244: .amdhsa_user_sgpr_queue_ptr 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
245: .amdhsa_user_sgpr_kernarg_segment_ptr 1
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
246: .amdhsa_user_sgpr_dispatch_id 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
247: .amdhsa_user_sgpr_flat_scratch_init 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
248: .amdhsa_user_sgpr_private_segment_size 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
249: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
250: .amdhsa_system_sgpr_workgroup_id_x 1
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
251: .amdhsa_system_sgpr_workgroup_id_y 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
252: .amdhsa_system_sgpr_workgroup_id_z 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
253: .amdhsa_system_sgpr_workgroup_info 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
254: .amdhsa_system_vgpr_workitem_id 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
255: .amdhsa_next_free_vgpr 9
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
256: .amdhsa_next_free_sgpr 10
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
257: .amdhsa_float_round_mode_32 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
258: .amdhsa_float_round_mode_16_64 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
259: .amdhsa_float_denorm_mode_32 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
260: .amdhsa_float_denorm_mode_16_64 3
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
261: .amdhsa_dx10_clamp 1
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
262: .amdhsa_ieee_mode 1
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
263: .amdhsa_exception_fp_ieee_invalid_op 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
264: .amdhsa_exception_fp_denorm_src 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
265: .amdhsa_exception_fp_ieee_div_zero 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
266: .amdhsa_exception_fp_ieee_overflow 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
267: .amdhsa_exception_fp_ieee_underflow 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
268: .amdhsa_exception_fp_ieee_inexact 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
269: .amdhsa_exception_int_div_zero 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
270: .end_amdhsa_kernel
check:80'0 ~~~~~~~~~~~~~~~~~~~~
271: .text
check:80'0 ~~~~~~~
272: .Lfunc_end2:
check:80'0 ~~~~~~~~~~~~~
273: .size test_fract_f64, .Lfunc_end2-test_fract_f64
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
274: ; -- End function
check:80'0 ~~~~~~~~~~~~~~~~~~~
275: .section .AMDGPU.csdata
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
276: ; Kernel info:
check:80'0 ~~~~~~~~~~~~~~~
277: ; codeLenInByte = 224
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
278: ; NumSgprs: 12
check:80'0 ~~~~~~~~~~~~~~~
279: ; NumVgprs: 9
check:80'0 ~~~~~~~~~~~~~~
280: ; ScratchSize: 0
check:80'0 ~~~~~~~~~~~~~~~~~
281: ; MemoryBound: 0
check:80'0 ~~~~~~~~~~~~~~~~~
282: ; FloatMode: 192
check:80'0 ~~~~~~~~~~~~~~~~~
283: ; IeeeMode: 1
check:80'0 ~~~~~~~~~~~~~~
284: ; LDSByteSize: 0 bytes/workgroup (compile time only)
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
285: ; SGPRBlocks: 1
check:80'0 ~~~~~~~~~~~~~~~~
286: ; VGPRBlocks: 2
check:80'0 ~~~~~~~~~~~~~~~~
287: ; NumSGPRsForWavesPerEU: 12
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
288: ; NumVGPRsForWavesPerEU: 9
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
289: ; Occupancy: 10
check:80'0 ~~~~~~~~~~~~~~~~
290: ; WaveLimiterHint : 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
291: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
292: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
293: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
294: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
295: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
296: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
297: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
298: .ident "Debian clang version 17.0.6 (2)"
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
299: .section ".note.GNU-stack"
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
300: .addrsig
check:80'0 ~~~~~~~~~~
301: .amdgpu_metadata
check:80'0 ~~~~~~~~~~~~~~~~~~
302: ---
check:80'0 ~~~~
303: amdhsa.kernels:
check:80'0 ~~~~~~~~~~~~~~~~
304: - .args:
check:80'0 ~~~~~~~~~~
305: - .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
306: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
307: .offset: 0
check:80'0 ~~~~~~~~~~~~
308: .size: 8
check:80'0 ~~~~~~~~~~
309: .type_name: 'half*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
310: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
311: - .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
312: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
313: .offset: 8
check:80'0 ~~~~~~~~~~~~
314: .size: 8
check:80'0 ~~~~~~~~~~
315: .type_name: 'half*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
316: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
317: - .access: read_only
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
318: .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~
319: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
320: .offset: 16
check:80'0 ~~~~~~~~~~~~~
321: .size: 8
check:80'0 ~~~~~~~~~~
322: .type_name: 'half*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
323: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
324: .group_segment_fixed_size: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
325: .kernarg_segment_align: 8
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
326: .kernarg_segment_size: 24
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
327: .language: OpenCL C
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
328: .language_version:
check:80'0 ~~~~~~~~~~~~~~~~~~~~
329: - 2
check:80'0 ~~~~~
330: - 0
check:80'0 ~~~~~
331: .max_flat_workgroup_size: 256
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
332: .name: test_fract_f16
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~
333: .private_segment_fixed_size: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
334: .sgpr_count: 10
check:80'0 ~~~~~~~~~~~~~~~~~
335: .sgpr_spill_count: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
336: .symbol: test_fract_f16.kd
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
337: .vgpr_count: 6
check:80'0 ~~~~~~~~~~~~~~~~
338: .vgpr_spill_count: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
339: .wavefront_size: 64
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
340: - .args:
check:80'0 ~~~~~~~~~~
341: - .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
342: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
343: .offset: 0
check:80'0 ~~~~~~~~~~~~
344: .size: 8
check:80'0 ~~~~~~~~~~
345: .type_name: 'float*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
346: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
347: - .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
348: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
349: .offset: 8
check:80'0 ~~~~~~~~~~~~
350: .size: 8
check:80'0 ~~~~~~~~~~
351: .type_name: 'float*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
352: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
353: - .access: read_only
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
354: .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~
355: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
356: .offset: 16
check:80'0 ~~~~~~~~~~~~~
357: .size: 8
check:80'0 ~~~~~~~~~~
358: .type_name: 'float*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
359: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
360: .group_segment_fixed_size: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
361: .kernarg_segment_align: 8
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
362: .kernarg_segment_size: 24
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
363: .language: OpenCL C
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
364: .language_version:
check:80'0 ~~~~~~~~~~~~~~~~~~~~
365: - 2
check:80'0 ~~~~~
366: - 0
check:80'0 ~~~~~
367: .max_flat_workgroup_size: 256
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
368: .name: test_fract_f32
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~
369: .private_segment_fixed_size: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
370: .sgpr_count: 10
check:80'0 ~~~~~~~~~~~~~~~~~
371: .sgpr_spill_count: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
372: .symbol: test_fract_f32.kd
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
373: .vgpr_count: 6
check:80'0 ~~~~~~~~~~~~~~~~
374: .vgpr_spill_count: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
375: .wavefront_size: 64
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
376: - .args:
check:80'0 ~~~~~~~~~~
377: - .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
378: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
379: .offset: 0
check:80'0 ~~~~~~~~~~~~
380: .size: 8
check:80'0 ~~~~~~~~~~
381: .type_name: 'double*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~
382: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
383: - .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
384: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
385: .offset: 8
check:80'0 ~~~~~~~~~~~~
386: .size: 8
check:80'0 ~~~~~~~~~~
387: .type_name: 'double*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~
388: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
389: - .access: read_only
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
390: .address_space: global
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~
391: .is_restrict: true
check:80'0 ~~~~~~~~~~~~~~~~~~~~
392: .offset: 16
check:80'0 ~~~~~~~~~~~~~
393: .size: 8
check:80'0 ~~~~~~~~~~
394: .type_name: 'double*'
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~
395: .value_kind: global_buffer
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
396: .group_segment_fixed_size: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
397: .kernarg_segment_align: 8
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
398: .kernarg_segment_size: 24
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
399: .language: OpenCL C
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
400: .language_version:
check:80'0 ~~~~~~~~~~~~~~~~~~~~
401: - 2
check:80'0 ~~~~~
402: - 0
check:80'0 ~~~~~
403: .max_flat_workgroup_size: 256
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
404: .name: test_fract_f64
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~
405: .private_segment_fixed_size: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
406: .sgpr_count: 12
check:80'0 ~~~~~~~~~~~~~~~~~
407: .sgpr_spill_count: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
408: .symbol: test_fract_f64.kd
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
409: .vgpr_count: 9
check:80'0 ~~~~~~~~~~~~~~~~
410: .vgpr_spill_count: 0
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
411: .wavefront_size: 64
check:80'0 ~~~~~~~~~~~~~~~~~~~~~
412: amdhsa.target: amdgcn-amd-amdhsa--gfx600
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
413: amdhsa.version:
check:80'0 ~~~~~~~~~~~~~~~~
414: - 1
check:80'0 ~~~~~
415: - 1
check:80'0 ~~~~~
416: ...
check:80'0 ~~~~
417:
check:80'0 ~
418: .end_amdgpu_metadata
check:80'0 ~~~~~~~~~~~~~~~~~~~~~~
>>>>>>
Start 8: compile_native_rcp__gfx600
8/21 Test #8: compile_native_rcp__gfx600 ........ Passed 0.17 sec
Start 9: compile_native_rsqrt__gfx600
9/21 Test #9: compile_native_rsqrt__gfx600 ...... Passed 0.17 sec
Start 10: compile_native_log__gfx600
10/21 Test #10: compile_native_log__gfx600 ........ Passed 0.16 sec
Start 11: compile_native_exp__gfx600
11/21 Test #11: compile_native_exp__gfx600 ........ Passed 0.16 sec
Start 12: compile_fract__gfx700
12/21 Test #12: compile_fract__gfx700 .............***Failed 0.19 sec
CMake Error at /root/rocm-device-libs-17/rocm-device-libs-17/test/compile/RunCompileTest.cmake:36 (message):
Error in test output:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl:23:16:
error: CHECK-DAG: expected string not found in input
// GFX700-DAG: v_cmp_neq_f32
^
output.fract.gfx700.s:27:26: note: scanning from here
v_sub_f32_e32 v7, v5, v6
^
output.fract.gfx700.s:32:2: note: possible intended match here
v_cmp_ne_u32_e32 vcc, s4, v3
^
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl:88:16:
error: GFX700-DAG: expected string not found in input
// GFX700-DAG: s_mov_b32 s[[INF_HI:[0-9]+]], 0x7ff00000
^
output.fract.gfx700.s:214:26: note: scanning from here
flat_load_dwordx2 v[0:1], v[0:1]
^
output.fract.gfx700.s:215:2: note: possible intended match here
v_mov_b32_e32 v5, s7
^
Input file: output.fract.gfx700.s
Check file:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: .text
2: .amdgcn_target "amdgcn-amd-amdhsa--gfx700"
3: .protected test_fract_f16 ; -- Begin function test_fract_f16
label:3'0 ^~~~~~~~~~~~~~
label:3'1 ^~~~~~~~~~~~~~
4: .globl test_fract_f16
5: .p2align 8
6: .type test_fract_f16,@function
7: test_fract_f16: ; @test_fract_f16
8: ; %bb.0:
9: s_load_dwordx2 s[0:1], s[4:5], 0x4
10: v_lshlrev_b32_e32 v2, 1, v0
11: s_waitcnt lgkmcnt(0)
12: v_mov_b32_e32 v1, s1
13: v_add_i32_e32 v0, vcc, s0, v2
14: v_addc_u32_e32 v1, vcc, 0, v1, vcc
15: flat_load_ushort v3, v[0:1]
check:16'0 ^~~~~~~~~~~~~~~~~~~
check:16'1 ^~ captured var "VAL"
16: s_load_dwordx4 s[0:3], s[4:5], 0x0
17: s_movk_i32 s4, 0x7c00
18: s_waitcnt lgkmcnt(0)
19: v_mov_b32_e32 v1, s3
20: v_add_i32_e32 v0, vcc, s2, v2
21: v_addc_u32_e32 v1, vcc, 0, v1, vcc
22: v_mov_b32_e32 v4, s1
23: s_waitcnt vmcnt(0)
24: v_cvt_f32_f16_e32 v5, v3
check:17'0 ^~~~~~~~~~~~~~~~~~~~
check:17'1 ^~ captured var "VAL_F32"
25: v_and_b32_e32 v3, 0x7fff, v3
26: v_floor_f32_e32 v6, v5
dag:18'0 ^~~~~~~~~~~~~~~~~~~~~~
dag:18'1 with "VAL_F32" equal to "v5"
dag:18'2 ^~ captured var "FLOOR"
27: v_sub_f32_e32 v7, v5, v6
check:19'0 ^~~~~~~~~~~~~~~~~~~~~~~~
check:19'1 with "VAL_F32" equal to "v5"
check:19'2 with "FLOOR" equal to "v6"
check:19'3 ^~ captured var "SUB"
dag:23'0 X error: no match found
28: v_min_f32_e32 v7, 0x3f7fe000, v7
dag:21'0 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:21'1 with "SUB" equal to "v7"
dag:21'2 ^~ captured var "CLAMP"
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29: v_cmp_u_f32_e32 vcc, v5, v5
dag:22 ^~~~~~~~~~~
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30: v_cvt_f16_f32_e32 v6, v6
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
31: v_cndmask_b32_e32 v5, v7, v5, vcc
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32: v_cmp_ne_u32_e32 vcc, s4, v3
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:23'1 ? possible intended match
33: v_cndmask_b32_e32 v3, 0, v5, vcc
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34: v_cvt_f16_f32_e32 v3, v3
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
35: flat_store_short v[0:1], v6
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
36: v_add_i32_e32 v0, vcc, s0, v2
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
37: v_addc_u32_e32 v1, vcc, 0, v4, vcc
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38: flat_store_short v[0:1], v3
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39: s_endpgm
dag:23'0 ~~~~~~~~~~
40: .section .rodata,#alloc
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
41: .p2align 6, 0x0
dag:23'0 ~~~~~~~~~~~~~~~~~
42: .amdhsa_kernel test_fract_f16
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
43: .amdhsa_group_segment_fixed_size 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
44: .amdhsa_private_segment_fixed_size 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45: .amdhsa_kernarg_size 24
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
46: .amdhsa_user_sgpr_count 6
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
47: .amdhsa_user_sgpr_private_segment_buffer 1
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
48: .amdhsa_user_sgpr_dispatch_ptr 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
49: .amdhsa_user_sgpr_queue_ptr 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
51: .amdhsa_user_sgpr_dispatch_id 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
52: .amdhsa_user_sgpr_flat_scratch_init 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53: .amdhsa_user_sgpr_private_segment_size 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
54: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
55: .amdhsa_system_sgpr_workgroup_id_x 1
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
56: .amdhsa_system_sgpr_workgroup_id_y 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
57: .amdhsa_system_sgpr_workgroup_id_z 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
58: .amdhsa_system_sgpr_workgroup_info 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
59: .amdhsa_system_vgpr_workitem_id 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60: .amdhsa_next_free_vgpr 8
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
61: .amdhsa_next_free_sgpr 6
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
62: .amdhsa_reserve_flat_scratch 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
63: .amdhsa_float_round_mode_32 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
64: .amdhsa_float_round_mode_16_64 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
65: .amdhsa_float_denorm_mode_32 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
66: .amdhsa_float_denorm_mode_16_64 3
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
67: .amdhsa_dx10_clamp 1
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~
68: .amdhsa_ieee_mode 1
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~
69: .amdhsa_exception_fp_ieee_invalid_op 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
70: .amdhsa_exception_fp_denorm_src 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
71: .amdhsa_exception_fp_ieee_div_zero 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
72: .amdhsa_exception_fp_ieee_overflow 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
73: .amdhsa_exception_fp_ieee_underflow 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
74: .amdhsa_exception_fp_ieee_inexact 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
75: .amdhsa_exception_int_div_zero 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
76: .end_amdhsa_kernel
dag:23'0 ~~~~~~~~~~~~~~~~~~~~
77: .text
dag:23'0 ~~~~~~~
78: .Lfunc_end0:
dag:23'0 ~~~~~~~~~~~~~
79: .size test_fract_f16, .Lfunc_end0-test_fract_f16
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80: ; -- End function
dag:23'0 ~~~~~~~~~~~~~~~~~~~
81: .section .AMDGPU.csdata
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
82: ; Kernel info:
dag:23'0 ~~~~~~~~~~~~~~~
83: ; codeLenInByte = 144
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~
84: ; NumSgprs: 8
dag:23'0 ~~~~~~~~~~~~~~
85: ; NumVgprs: 8
dag:23'0 ~~~~~~~~~~~~~~
86: ; ScratchSize: 0
dag:23'0 ~~~~~~~~~~~~~~~~~
87: ; MemoryBound: 0
dag:23'0 ~~~~~~~~~~~~~~~~~
88: ; FloatMode: 192
dag:23'0 ~~~~~~~~~~~~~~~~~
89: ; IeeeMode: 1
dag:23'0 ~~~~~~~~~~~~~~
90: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
91: ; SGPRBlocks: 0
dag:23'0 ~~~~~~~~~~~~~~~~
92: ; VGPRBlocks: 1
dag:23'0 ~~~~~~~~~~~~~~~~
93: ; NumSGPRsForWavesPerEU: 8
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
94: ; NumVGPRsForWavesPerEU: 8
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
95: ; Occupancy: 10
dag:23'0 ~~~~~~~~~~~~~~~~
96: ; WaveLimiterHint : 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~
97: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
98: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
99: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
100: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
102: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
103: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
104: .text
dag:23'0 ~~~~~~~
105: .protected test_fract_f32 ; -- Begin function test_fract_f32
label:43 ^~~~~~~~~~~~~~
dag:23'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
106: .globl test_fract_f32
107: .p2align 8
108: .type test_fract_f32,@function
109: test_fract_f32: ; @test_fract_f32
110: ; %bb.0:
111: s_load_dwordx2 s[0:1], s[4:5], 0x4
112: v_lshlrev_b32_e32 v2, 2, v0
113: v_mov_b32_e32 v4, 0x204
114: s_waitcnt lgkmcnt(0)
115: v_mov_b32_e32 v1, s1
116: v_add_i32_e32 v0, vcc, s0, v2
117: v_addc_u32_e32 v1, vcc, 0, v1, vcc
118: flat_load_dword v3, v[0:1]
119: s_load_dwordx4 s[0:3], s[4:5], 0x0
120: s_waitcnt lgkmcnt(0)
121: v_mov_b32_e32 v1, s3
122: v_add_i32_e32 v0, vcc, s2, v2
123: v_addc_u32_e32 v1, vcc, 0, v1, vcc
124: v_mov_b32_e32 v5, s1
125: s_waitcnt vmcnt(0)
126: v_floor_f32_e32 v6, v3
127: v_fract_f32_e32 v7, v3
128: v_cmp_class_f32_e32 vcc, v3, v4
129: v_cndmask_b32_e64 v3, v7, 0, vcc
130: flat_store_dword v[0:1], v6
131: v_add_i32_e32 v0, vcc, s0, v2
132: v_addc_u32_e32 v1, vcc, 0, v5, vcc
133: flat_store_dword v[0:1], v3
134: s_endpgm
135: .section .rodata,#alloc
136: .p2align 6, 0x0
137: .amdhsa_kernel test_fract_f32
138: .amdhsa_group_segment_fixed_size 0
139: .amdhsa_private_segment_fixed_size 0
140: .amdhsa_kernarg_size 24
141: .amdhsa_user_sgpr_count 6
142: .amdhsa_user_sgpr_private_segment_buffer 1
143: .amdhsa_user_sgpr_dispatch_ptr 0
144: .amdhsa_user_sgpr_queue_ptr 0
145: .amdhsa_user_sgpr_kernarg_segment_ptr 1
146: .amdhsa_user_sgpr_dispatch_id 0
147: .amdhsa_user_sgpr_flat_scratch_init 0
148: .amdhsa_user_sgpr_private_segment_size 0
149: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
150: .amdhsa_system_sgpr_workgroup_id_x 1
151: .amdhsa_system_sgpr_workgroup_id_y 0
152: .amdhsa_system_sgpr_workgroup_id_z 0
153: .amdhsa_system_sgpr_workgroup_info 0
154: .amdhsa_system_vgpr_workitem_id 0
155: .amdhsa_next_free_vgpr 8
156: .amdhsa_next_free_sgpr 6
157: .amdhsa_reserve_flat_scratch 0
158: .amdhsa_float_round_mode_32 0
159: .amdhsa_float_round_mode_16_64 0
160: .amdhsa_float_denorm_mode_32 0
161: .amdhsa_float_denorm_mode_16_64 3
162: .amdhsa_dx10_clamp 1
163: .amdhsa_ieee_mode 1
164: .amdhsa_exception_fp_ieee_invalid_op 0
165: .amdhsa_exception_fp_denorm_src 0
166: .amdhsa_exception_fp_ieee_div_zero 0
167: .amdhsa_exception_fp_ieee_overflow 0
168: .amdhsa_exception_fp_ieee_underflow 0
169: .amdhsa_exception_fp_ieee_inexact 0
170: .amdhsa_exception_int_div_zero 0
171: .end_amdhsa_kernel
172: .text
173: .Lfunc_end1:
174: .size test_fract_f32, .Lfunc_end1-test_fract_f32
175: ; -- End function
176: .section .AMDGPU.csdata
177: ; Kernel info:
178: ; codeLenInByte = 120
179: ; NumSgprs: 8
180: ; NumVgprs: 8
181: ; ScratchSize: 0
182: ; MemoryBound: 0
183: ; FloatMode: 192
184: ; IeeeMode: 1
185: ; LDSByteSize: 0 bytes/workgroup (compile time only)
186: ; SGPRBlocks: 0
187: ; VGPRBlocks: 1
188: ; NumSGPRsForWavesPerEU: 8
189: ; NumVGPRsForWavesPerEU: 8
190: ; Occupancy: 10
191: ; WaveLimiterHint : 0
192: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
193: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
194: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
195: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
196: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
197: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
198: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
199: .text
200: .protected test_fract_f64 ; -- Begin function test_fract_f64
label:68'0 ^~~~~~~~~~~~~~
label:68'1 ^~~~~~~~~~~~~~
201: .globl test_fract_f64
202: .p2align 8
203: .type test_fract_f64,@function
204: test_fract_f64: ; @test_fract_f64
205: ; %bb.0:
206: s_load_dwordx2 s[0:1], s[4:5], 0x4
207: v_lshlrev_b32_e32 v6, 3, v0
208: v_mov_b32_e32 v4, 0x204
209: s_load_dwordx4 s[4:7], s[4:5], 0x0
210: s_waitcnt lgkmcnt(0)
211: v_mov_b32_e32 v1, s1
212: v_add_i32_e32 v0, vcc, s0, v6
213: v_addc_u32_e32 v1, vcc, 0, v1, vcc
214: flat_load_dwordx2 v[0:1], v[0:1]
check:83'0 ^~~~~~~~~~~~~~~~~~~~~~~~
check:83'1 ^~~~~~ captured var "VAL"
dag:88'0 X~~~~~~~~ error: no match found
215: v_mov_b32_e32 v5, s7
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
dag:88'1 ? possible intended match
216: v_mov_b32_e32 v7, s5
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
217: s_waitcnt vmcnt(0)
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
218: v_fract_f64_e32 v[2:3], v[0:1]
dag:86'0 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:86'1 with "VAL" equal to "v\\[0:1\\]"
dag:86'2 ^ captured var "FRACT_LO"
dag:86'3 ^ captured var "FRACT_HI"
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
219: v_cmp_class_f64_e32 vcc, v[0:1], v4
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
220: v_floor_f64_e32 v[0:1], v[0:1]
dag:84'0 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:84'1 with "VAL" equal to "v\\[0:1\\]"
dag:84'2 ^~~~~~ captured var "FLOOR"
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
221: v_add_i32_e64 v4, s[0:1], s6, v6
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
222: v_addc_u32_e64 v5, s[0:1], 0, v5, s[0:1]
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
223: v_cndmask_b32_e64 v3, v3, 0, vcc
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
224: v_cndmask_b32_e64 v2, v2, 0, vcc
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
225: flat_store_dwordx2 v[4:5], v[0:1]
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
226: v_add_i32_e32 v0, vcc, s4, v6
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
227: v_addc_u32_e32 v1, vcc, 0, v7, vcc
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
228: flat_store_dwordx2 v[0:1], v[2:3]
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
229: s_endpgm
dag:88'0 ~~~~~~~~~~
230: .section .rodata,#alloc
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
231: .p2align 6, 0x0
dag:88'0 ~~~~~~~~~~~~~~~~~
232: .amdhsa_kernel test_fract_f64
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
233: .amdhsa_group_segment_fixed_size 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
234: .amdhsa_private_segment_fixed_size 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
235: .amdhsa_kernarg_size 24
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
236: .amdhsa_user_sgpr_count 6
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
237: .amdhsa_user_sgpr_private_segment_buffer 1
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
238: .amdhsa_user_sgpr_dispatch_ptr 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
239: .amdhsa_user_sgpr_queue_ptr 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
240: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
241: .amdhsa_user_sgpr_dispatch_id 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
242: .amdhsa_user_sgpr_flat_scratch_init 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
243: .amdhsa_user_sgpr_private_segment_size 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
244: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
245: .amdhsa_system_sgpr_workgroup_id_x 1
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
246: .amdhsa_system_sgpr_workgroup_id_y 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
247: .amdhsa_system_sgpr_workgroup_id_z 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
248: .amdhsa_system_sgpr_workgroup_info 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
249: .amdhsa_system_vgpr_workitem_id 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
250: .amdhsa_next_free_vgpr 8
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
251: .amdhsa_next_free_sgpr 8
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
252: .amdhsa_reserve_flat_scratch 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
253: .amdhsa_float_round_mode_32 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
254: .amdhsa_float_round_mode_16_64 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
255: .amdhsa_float_denorm_mode_32 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
256: .amdhsa_float_denorm_mode_16_64 3
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
257: .amdhsa_dx10_clamp 1
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
258: .amdhsa_ieee_mode 1
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
259: .amdhsa_exception_fp_ieee_invalid_op 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
260: .amdhsa_exception_fp_denorm_src 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
261: .amdhsa_exception_fp_ieee_div_zero 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
262: .amdhsa_exception_fp_ieee_overflow 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
263: .amdhsa_exception_fp_ieee_underflow 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
264: .amdhsa_exception_fp_ieee_inexact 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
265: .amdhsa_exception_int_div_zero 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
266: .end_amdhsa_kernel
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
267: .text
dag:88'0 ~~~~~~~
268: .Lfunc_end2:
dag:88'0 ~~~~~~~~~~~~~
269: .size test_fract_f64, .Lfunc_end2-test_fract_f64
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
270: ; -- End function
dag:88'0 ~~~~~~~~~~~~~~~~~~~
271: .section .AMDGPU.csdata
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
272: ; Kernel info:
dag:88'0 ~~~~~~~~~~~~~~~
273: ; codeLenInByte = 144
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
274: ; NumSgprs: 10
dag:88'0 ~~~~~~~~~~~~~~~
275: ; NumVgprs: 8
dag:88'0 ~~~~~~~~~~~~~~
276: ; ScratchSize: 0
dag:88'0 ~~~~~~~~~~~~~~~~~
277: ; MemoryBound: 0
dag:88'0 ~~~~~~~~~~~~~~~~~
278: ; FloatMode: 192
dag:88'0 ~~~~~~~~~~~~~~~~~
279: ; IeeeMode: 1
dag:88'0 ~~~~~~~~~~~~~~
280: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
281: ; SGPRBlocks: 1
dag:88'0 ~~~~~~~~~~~~~~~~
282: ; VGPRBlocks: 1
dag:88'0 ~~~~~~~~~~~~~~~~
283: ; NumSGPRsForWavesPerEU: 10
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
284: ; NumVGPRsForWavesPerEU: 8
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
285: ; Occupancy: 10
dag:88'0 ~~~~~~~~~~~~~~~~
286: ; WaveLimiterHint : 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
287: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
288: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
289: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
290: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
291: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
292: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
293: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
294: .ident "Debian clang version 17.0.6 (2)"
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
295: .section ".note.GNU-stack"
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
296: .addrsig
dag:88'0 ~~~~~~~~~~
297: .amdgpu_metadata
dag:88'0 ~~~~~~~~~~~~~~~~~~
298: ---
dag:88'0 ~~~~
299: amdhsa.kernels:
dag:88'0 ~~~~~~~~~~~~~~~~
300: - .args:
dag:88'0 ~~~~~~~~~~
301: - .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
302: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
303: .offset: 0
dag:88'0 ~~~~~~~~~~~~
304: .size: 8
dag:88'0 ~~~~~~~~~~
305: .type_name: 'half*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
306: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
307: - .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
308: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
309: .offset: 8
dag:88'0 ~~~~~~~~~~~~
310: .size: 8
dag:88'0 ~~~~~~~~~~
311: .type_name: 'half*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
312: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
313: - .access: read_only
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
314: .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~
315: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
316: .offset: 16
dag:88'0 ~~~~~~~~~~~~~
317: .size: 8
dag:88'0 ~~~~~~~~~~
318: .type_name: 'half*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
319: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
320: .group_segment_fixed_size: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
321: .kernarg_segment_align: 8
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
322: .kernarg_segment_size: 24
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
323: .language: OpenCL C
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
324: .language_version:
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
325: - 2
dag:88'0 ~~~~~
326: - 0
dag:88'0 ~~~~~
327: .max_flat_workgroup_size: 256
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
328: .name: test_fract_f16
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~
329: .private_segment_fixed_size: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
330: .sgpr_count: 8
dag:88'0 ~~~~~~~~~~~~~~~~
331: .sgpr_spill_count: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
332: .symbol: test_fract_f16.kd
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
333: .vgpr_count: 8
dag:88'0 ~~~~~~~~~~~~~~~~
334: .vgpr_spill_count: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
335: .wavefront_size: 64
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
336: - .args:
dag:88'0 ~~~~~~~~~~
337: - .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
338: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
339: .offset: 0
dag:88'0 ~~~~~~~~~~~~
340: .size: 8
dag:88'0 ~~~~~~~~~~
341: .type_name: 'float*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
342: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
343: - .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
344: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
345: .offset: 8
dag:88'0 ~~~~~~~~~~~~
346: .size: 8
dag:88'0 ~~~~~~~~~~
347: .type_name: 'float*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
348: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
349: - .access: read_only
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
350: .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~
351: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
352: .offset: 16
dag:88'0 ~~~~~~~~~~~~~
353: .size: 8
dag:88'0 ~~~~~~~~~~
354: .type_name: 'float*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
355: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
356: .group_segment_fixed_size: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
357: .kernarg_segment_align: 8
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
358: .kernarg_segment_size: 24
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
359: .language: OpenCL C
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
360: .language_version:
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
361: - 2
dag:88'0 ~~~~~
362: - 0
dag:88'0 ~~~~~
363: .max_flat_workgroup_size: 256
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
364: .name: test_fract_f32
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~
365: .private_segment_fixed_size: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
366: .sgpr_count: 8
dag:88'0 ~~~~~~~~~~~~~~~~
367: .sgpr_spill_count: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
368: .symbol: test_fract_f32.kd
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
369: .vgpr_count: 8
dag:88'0 ~~~~~~~~~~~~~~~~
370: .vgpr_spill_count: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
371: .wavefront_size: 64
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
372: - .args:
dag:88'0 ~~~~~~~~~~
373: - .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
374: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
375: .offset: 0
dag:88'0 ~~~~~~~~~~~~
376: .size: 8
dag:88'0 ~~~~~~~~~~
377: .type_name: 'double*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~
378: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
379: - .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
380: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
381: .offset: 8
dag:88'0 ~~~~~~~~~~~~
382: .size: 8
dag:88'0 ~~~~~~~~~~
383: .type_name: 'double*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~
384: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
385: - .access: read_only
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
386: .address_space: global
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~
387: .is_restrict: true
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
388: .offset: 16
dag:88'0 ~~~~~~~~~~~~~
389: .size: 8
dag:88'0 ~~~~~~~~~~
390: .type_name: 'double*'
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~
391: .value_kind: global_buffer
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
392: .group_segment_fixed_size: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
393: .kernarg_segment_align: 8
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
394: .kernarg_segment_size: 24
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
395: .language: OpenCL C
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
396: .language_version:
dag:88'0 ~~~~~~~~~~~~~~~~~~~~
397: - 2
dag:88'0 ~~~~~
398: - 0
dag:88'0 ~~~~~
399: .max_flat_workgroup_size: 256
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
400: .name: test_fract_f64
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~
401: .private_segment_fixed_size: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
402: .sgpr_count: 10
dag:88'0 ~~~~~~~~~~~~~~~~~
403: .sgpr_spill_count: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
404: .symbol: test_fract_f64.kd
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
405: .vgpr_count: 8
dag:88'0 ~~~~~~~~~~~~~~~~
406: .vgpr_spill_count: 0
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
407: .wavefront_size: 64
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~
408: amdhsa.target: amdgcn-amd-amdhsa--gfx700
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
409: amdhsa.version:
dag:88'0 ~~~~~~~~~~~~~~~~
410: - 1
dag:88'0 ~~~~~
411: - 1
dag:88'0 ~~~~~
412: ...
dag:88'0 ~~~~
413:
dag:88'0 ~
414: .end_amdgpu_metadata
dag:88'0 ~~~~~~~~~~~~~~~~~~~~~~
>>>>>>
Start 13: compile_native_rcp__gfx700
13/21 Test #13: compile_native_rcp__gfx700 ........ Passed 0.16 sec
Start 14: compile_native_rsqrt__gfx700
14/21 Test #14: compile_native_rsqrt__gfx700 ...... Passed 0.16 sec
Start 15: compile_native_log__gfx700
15/21 Test #15: compile_native_log__gfx700 ........ Passed 0.16 sec
Start 16: compile_native_exp__gfx700
16/21 Test #16: compile_native_exp__gfx700 ........ Passed 0.16 sec
Start 17: compile_fract__gfx803
17/21 Test #17: compile_fract__gfx803 .............***Failed 0.19 sec
CMake Error at /root/rocm-device-libs-17/rocm-device-libs-17/test/compile/RunCompileTest.cmake:36 (message):
Error in test output:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl:31:16:
error: GFX803-DAG: expected string not found in input
// GFX803-DAG: s_movk_i32 [[INF:s[0-9]+]], 0x7c00
^
output.fract.gfx803.s:16:21: note: scanning from here
flat_load_ushort v3, v[0:1]
^
output.fract.gfx803.s:17:17: note: possible intended match here
s_load_dwordx4 s[0:3], s[4:5], 0x0
^
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl:56:16:
error: GFX803-DAG: expected string not found in input
// GFX803-DAG: s_mov_b32 [[INF:s[0-9]+]], 0x7f800000
^
output.fract.gfx803.s:111:20: note: scanning from here
flat_load_dword v3, v[0:1]
^
output.fract.gfx803.s:112:17: note: possible intended match here
s_load_dwordx4 s[0:3], s[4:5], 0x0
^
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl:102:16:
error: GFX803-DAG: expected string not found in input
// GFX803-DAG: s_mov_b32 s[[INF_HI:[0-9]+]], 0x7ff00000
^
output.fract.gfx803.s:207:26: note: scanning from here
flat_load_dwordx2 v[0:1], v[0:1]
^
output.fract.gfx803.s:208:2: note: possible intended match here
v_mov_b32_e32 v5, s7
^
Input file: output.fract.gfx803.s
Check file:
/root/rocm-device-libs-17/rocm-device-libs-17/test/compile/fract.cl
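A note on the three GFX803-DAG failures above. Each failing check looks for a scalar move of an infinity bit pattern (0x7c00, 0x7f800000, 0x7ff00000), while the emitted code loads 0x204 into a VGPR and feeds it to v_cmp_class. The sketch below is only a rough way to see that mismatch outside of FileCheck. It is not how FileCheck itself works: the [[INF:...]] capture is translated by hand into a plain regex group, the helper name has_match is made up for illustration, and the class-mask bit positions (bit 2 for -inf, bit 9 for +inf) are my reading of the AMDGPU ISA and should be treated as an assumption.

```python
import re

# Check pattern from fract.cl:31. The FileCheck capture [[INF:s[0-9]+]]
# is rewritten by hand as an ordinary regex group (an approximation).
expected = re.compile(r"s_movk_i32 (s[0-9]+), 0x7c00")

# Instructions copied from the test_fract_f16 body in output.fract.gfx803.s.
emitted = [
    "v_mov_b32_e32 v4, 0x204",
    "v_floor_f16_e32 v6, v3",
    "v_fract_f16_e32 v7, v3",
    "v_cmp_class_f16_e32 vcc, v3, v4",
    "v_cndmask_b32_e64 v3, v7, 0, vcc",
]


def has_match(pattern, lines):
    """Hypothetical helper: True if any line contains a match for pattern."""
    return any(pattern.search(line) for line in lines)


# Assumed v_cmp_class mask layout: bit 2 = -inf, bit 9 = +inf.
NEG_INF = 1 << 2
POS_INF = 1 << 9
mask = 0x204

print(has_match(expected, emitted))   # the expected s_movk_i32 is absent
print(mask == (NEG_INF | POS_INF))    # 0x204 selects exactly the two inf classes
```

If that reading of the mask is right, the compiler is still testing the input for infinity, just with a class compare rather than a comparison against an infinity constant, which would make these check-line mismatches rather than miscompiles. I have not confirmed this against the upstream test history.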
-dump-input=help explains the following input dump.
Input was:
<<<<<<
1: .text
2: .amdgcn_target "amdgcn-amd-amdhsa--gfx803"
3: .protected test_fract_f16 ; -- Begin function test_fract_f16
label:3'0 ^~~~~~~~~~~~~~
label:3'1 ^~~~~~~~~~~~~~
4: .globl test_fract_f16
5: .p2align 8
6: .type test_fract_f16,@function
7: test_fract_f16: ; @test_fract_f16
8: ; %bb.0:
9: s_load_dwordx2 s[0:1], s[4:5], 0x10
10: v_lshlrev_b32_e32 v2, 1, v0
11: v_mov_b32_e32 v4, 0x204
12: s_waitcnt lgkmcnt(0)
13: v_mov_b32_e32 v1, s1
14: v_add_u32_e32 v0, vcc, s0, v2
15: v_addc_u32_e32 v1, vcc, 0, v1, vcc
16: flat_load_ushort v3, v[0:1]
check:28'0 ^~~~~~~~~~~~~~~~~~~
check:28'1 ^~ captured var "VAL"
dag:31'0 X~~~~~~~~ error: no match found
17: s_load_dwordx4 s[0:3], s[4:5], 0x0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:31'1 ? possible intended match
18: s_waitcnt lgkmcnt(0)
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~
19: v_mov_b32_e32 v1, s3
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~
20: v_add_u32_e32 v0, vcc, s2, v2
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21: v_addc_u32_e32 v1, vcc, 0, v1, vcc
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
22: v_mov_b32_e32 v5, s1
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~
23: s_waitcnt vmcnt(0)
dag:31'0 ~~~~~~~~~~~~~~~~~~~~
24: v_floor_f16_e32 v6, v3
dag:29'0 ^~~~~~~~~~~~~~~~~~~~~~
dag:29'1 with "VAL" equal to "v3"
dag:29'2 ^~ captured var "FLOOR"
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~
25: v_fract_f16_e32 v7, v3
dag:30'0 ^~~~~~~~~~~~~~~~~~~~~~
dag:30'1 with "VAL" equal to "v3"
dag:30'2 ^~ captured var "FRACT"
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~
26: v_cmp_class_f16_e32 vcc, v3, v4
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
27: v_cndmask_b32_e64 v3, v7, 0, vcc
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
28: flat_store_short v[0:1], v6
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29: v_add_u32_e32 v0, vcc, s0, v2
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30: v_addc_u32_e32 v1, vcc, 0, v5, vcc
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
31: flat_store_short v[0:1], v3
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32: s_endpgm
dag:31'0 ~~~~~~~~~~
33: .section .rodata,#alloc
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
34: .p2align 6, 0x0
dag:31'0 ~~~~~~~~~~~~~~~~~
35: .amdhsa_kernel test_fract_f16
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
36: .amdhsa_group_segment_fixed_size 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
37: .amdhsa_private_segment_fixed_size 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38: .amdhsa_kernarg_size 24
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
39: .amdhsa_user_sgpr_count 6
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
40: .amdhsa_user_sgpr_private_segment_buffer 1
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
41: .amdhsa_user_sgpr_dispatch_ptr 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
42: .amdhsa_user_sgpr_queue_ptr 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
43: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
44: .amdhsa_user_sgpr_dispatch_id 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
45: .amdhsa_user_sgpr_flat_scratch_init 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
46: .amdhsa_user_sgpr_private_segment_size 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
47: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
48: .amdhsa_system_sgpr_workgroup_id_x 1
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
49: .amdhsa_system_sgpr_workgroup_id_y 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50: .amdhsa_system_sgpr_workgroup_id_z 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
51: .amdhsa_system_sgpr_workgroup_info 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
52: .amdhsa_system_vgpr_workitem_id 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53: .amdhsa_next_free_vgpr 8
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
54: .amdhsa_next_free_sgpr 6
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
55: .amdhsa_reserve_flat_scratch 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
56: .amdhsa_float_round_mode_32 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
57: .amdhsa_float_round_mode_16_64 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
58: .amdhsa_float_denorm_mode_32 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
59: .amdhsa_float_denorm_mode_16_64 3
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60: .amdhsa_dx10_clamp 1
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~
61: .amdhsa_ieee_mode 1
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~
62: .amdhsa_exception_fp_ieee_invalid_op 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
63: .amdhsa_exception_fp_denorm_src 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
64: .amdhsa_exception_fp_ieee_div_zero 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
65: .amdhsa_exception_fp_ieee_overflow 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
66: .amdhsa_exception_fp_ieee_underflow 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
67: .amdhsa_exception_fp_ieee_inexact 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
68: .amdhsa_exception_int_div_zero 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
69: .end_amdhsa_kernel
dag:31'0 ~~~~~~~~~~~~~~~~~~~~
70: .text
dag:31'0 ~~~~~~~
71: .Lfunc_end0:
dag:31'0 ~~~~~~~~~~~~~
72: .size test_fract_f16, .Lfunc_end0-test_fract_f16
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
73: ; -- End function
dag:31'0 ~~~~~~~~~~~~~~~~~~~
74: .section .AMDGPU.csdata
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
75: ; Kernel info:
dag:31'0 ~~~~~~~~~~~~~~~
76: ; codeLenInByte = 128
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~
77: ; NumSgprs: 8
dag:31'0 ~~~~~~~~~~~~~~
78: ; NumVgprs: 8
dag:31'0 ~~~~~~~~~~~~~~
79: ; ScratchSize: 0
dag:31'0 ~~~~~~~~~~~~~~~~~
80: ; MemoryBound: 0
dag:31'0 ~~~~~~~~~~~~~~~~~
81: ; FloatMode: 192
dag:31'0 ~~~~~~~~~~~~~~~~~
82: ; IeeeMode: 1
dag:31'0 ~~~~~~~~~~~~~~
83: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
84: ; SGPRBlocks: 0
dag:31'0 ~~~~~~~~~~~~~~~~
85: ; VGPRBlocks: 1
dag:31'0 ~~~~~~~~~~~~~~~~
86: ; NumSGPRsForWavesPerEU: 8
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
87: ; NumVGPRsForWavesPerEU: 8
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
88: ; Occupancy: 10
dag:31'0 ~~~~~~~~~~~~~~~~
89: ; WaveLimiterHint : 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~
90: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
91: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
92: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
93: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
94: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
95: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
96: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
97: .text
dag:31'0 ~~~~~~~
98: .protected test_fract_f32 ; -- Begin function test_fract_f32
label:43 ^~~~~~~~~~~~~~
dag:31'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
99: .globl test_fract_f32
100: .p2align 8
101: .type test_fract_f32,@function
102: test_fract_f32: ; @test_fract_f32
103: ; %bb.0:
104: s_load_dwordx2 s[0:1], s[4:5], 0x10
105: v_lshlrev_b32_e32 v2, 2, v0
106: v_mov_b32_e32 v4, 0x204
107: s_waitcnt lgkmcnt(0)
108: v_mov_b32_e32 v1, s1
109: v_add_u32_e32 v0, vcc, s0, v2
110: v_addc_u32_e32 v1, vcc, 0, v1, vcc
111: flat_load_dword v3, v[0:1]
check:53'0 ^~~~~~~~~~~~~~~~~~
check:53'1 ^~ captured var "VAL"
dag:56'0 X~~~~~~~~ error: no match found
112: s_load_dwordx4 s[0:3], s[4:5], 0x0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:56'1 ? possible intended match
113: s_waitcnt lgkmcnt(0)
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~
114: v_mov_b32_e32 v1, s3
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~
115: v_add_u32_e32 v0, vcc, s2, v2
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
116: v_addc_u32_e32 v1, vcc, 0, v1, vcc
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
117: v_mov_b32_e32 v5, s1
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~
118: s_waitcnt vmcnt(0)
dag:56'0 ~~~~~~~~~~~~~~~~~~~~
119: v_floor_f32_e32 v6, v3
dag:54'0 ^~~~~~~~~~~~~~~~~~~~~~
dag:54'1 with "VAL" equal to "v3"
dag:54'2 ^~ captured var "FLOOR"
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~
120: v_fract_f32_e32 v7, v3
dag:55'0 ^~~~~~~~~~~~~~~~~~~~~~
dag:55'1 with "VAL" equal to "v3"
dag:55'2 ^~ captured var "FRACT"
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~
121: v_cmp_class_f32_e32 vcc, v3, v4
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
122: v_cndmask_b32_e64 v3, v7, 0, vcc
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
123: flat_store_dword v[0:1], v6
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
124: v_add_u32_e32 v0, vcc, s0, v2
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
125: v_addc_u32_e32 v1, vcc, 0, v5, vcc
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
126: flat_store_dword v[0:1], v3
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
127: s_endpgm
dag:56'0 ~~~~~~~~~~
128: .section .rodata,#alloc
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
129: .p2align 6, 0x0
dag:56'0 ~~~~~~~~~~~~~~~~~
130: .amdhsa_kernel test_fract_f32
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
131: .amdhsa_group_segment_fixed_size 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
132: .amdhsa_private_segment_fixed_size 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
133: .amdhsa_kernarg_size 24
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
134: .amdhsa_user_sgpr_count 6
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
135: .amdhsa_user_sgpr_private_segment_buffer 1
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
136: .amdhsa_user_sgpr_dispatch_ptr 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
137: .amdhsa_user_sgpr_queue_ptr 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
138: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
139: .amdhsa_user_sgpr_dispatch_id 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
140: .amdhsa_user_sgpr_flat_scratch_init 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
141: .amdhsa_user_sgpr_private_segment_size 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
142: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
143: .amdhsa_system_sgpr_workgroup_id_x 1
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
144: .amdhsa_system_sgpr_workgroup_id_y 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
145: .amdhsa_system_sgpr_workgroup_id_z 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
146: .amdhsa_system_sgpr_workgroup_info 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147: .amdhsa_system_vgpr_workitem_id 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
148: .amdhsa_next_free_vgpr 8
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
149: .amdhsa_next_free_sgpr 6
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
150: .amdhsa_reserve_flat_scratch 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
151: .amdhsa_float_round_mode_32 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
152: .amdhsa_float_round_mode_16_64 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
153: .amdhsa_float_denorm_mode_32 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
154: .amdhsa_float_denorm_mode_16_64 3
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
155: .amdhsa_dx10_clamp 1
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~
156: .amdhsa_ieee_mode 1
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~
157: .amdhsa_exception_fp_ieee_invalid_op 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
158: .amdhsa_exception_fp_denorm_src 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
159: .amdhsa_exception_fp_ieee_div_zero 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
160: .amdhsa_exception_fp_ieee_overflow 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
161: .amdhsa_exception_fp_ieee_underflow 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
162: .amdhsa_exception_fp_ieee_inexact 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
163: .amdhsa_exception_int_div_zero 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
164: .end_amdhsa_kernel
dag:56'0 ~~~~~~~~~~~~~~~~~~~~
165: .text
dag:56'0 ~~~~~~~
166: .Lfunc_end1:
dag:56'0 ~~~~~~~~~~~~~
167: .size test_fract_f32, .Lfunc_end1-test_fract_f32
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
168: ; -- End function
dag:56'0 ~~~~~~~~~~~~~~~~~~~
169: .section .AMDGPU.csdata
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
170: ; Kernel info:
dag:56'0 ~~~~~~~~~~~~~~~
171: ; codeLenInByte = 128
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~
172: ; NumSgprs: 8
dag:56'0 ~~~~~~~~~~~~~~
173: ; NumVgprs: 8
dag:56'0 ~~~~~~~~~~~~~~
174: ; ScratchSize: 0
dag:56'0 ~~~~~~~~~~~~~~~~~
175: ; MemoryBound: 0
dag:56'0 ~~~~~~~~~~~~~~~~~
176: ; FloatMode: 192
dag:56'0 ~~~~~~~~~~~~~~~~~
177: ; IeeeMode: 1
dag:56'0 ~~~~~~~~~~~~~~
178: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
179: ; SGPRBlocks: 0
dag:56'0 ~~~~~~~~~~~~~~~~
180: ; VGPRBlocks: 1
dag:56'0 ~~~~~~~~~~~~~~~~
181: ; NumSGPRsForWavesPerEU: 8
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
182: ; NumVGPRsForWavesPerEU: 8
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
183: ; Occupancy: 10
dag:56'0 ~~~~~~~~~~~~~~~~
184: ; WaveLimiterHint : 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~
185: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
186: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
187: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
188: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
189: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
190: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
191: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
192: .text
dag:56'0 ~~~~~~~
193: .protected test_fract_f64 ; -- Begin function test_fract_f64
label:68 ^~~~~~~~~~~~~~
dag:56'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
194: .globl test_fract_f64
195: .p2align 8
196: .type test_fract_f64,@function
197: test_fract_f64: ; @test_fract_f64
198: ; %bb.0:
199: s_load_dwordx2 s[0:1], s[4:5], 0x10
200: v_lshlrev_b32_e32 v6, 3, v0
201: v_mov_b32_e32 v4, 0x204
202: s_load_dwordx4 s[4:7], s[4:5], 0x0
203: s_waitcnt lgkmcnt(0)
204: v_mov_b32_e32 v1, s1
205: v_add_u32_e32 v0, vcc, s0, v6
206: v_addc_u32_e32 v1, vcc, 0, v1, vcc
207: flat_load_dwordx2 v[0:1], v[0:1]
check:98'0 ^~~~~~~~~~~~~~~~~~~~~~~~
check:98'1 ^~~~~~ captured var "VAL"
dag:102'0 X~~~~~~~~ error: no match found
208: v_mov_b32_e32 v5, s7
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
dag:102'1 ? possible intended match
209: v_mov_b32_e32 v7, s5
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
210: s_waitcnt vmcnt(0)
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
211: v_fract_f64_e32 v[2:3], v[0:1]
dag:100'0 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:100'1 with "VAL" equal to "v\\[0:1\\]"
dag:100'2 ^ captured var "FRACT_LO"
dag:100'3 ^ captured var "FRACT_HI"
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
212: v_cmp_class_f64_e32 vcc, v[0:1], v4
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
213: v_floor_f64_e32 v[0:1], v[0:1]
dag:99'0 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dag:99'1 with "VAL" equal to "v\\[0:1\\]"
dag:99'2 ^~~~~~ captured var "FLOOR"
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
214: v_add_u32_e64 v4, s[0:1], s6, v6
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
215: v_addc_u32_e64 v5, s[0:1], 0, v5, s[0:1]
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
216: v_cndmask_b32_e64 v3, v3, 0, vcc
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
217: v_cndmask_b32_e64 v2, v2, 0, vcc
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
218: flat_store_dwordx2 v[4:5], v[0:1]
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
219: v_add_u32_e32 v0, vcc, s4, v6
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
220: v_addc_u32_e32 v1, vcc, 0, v7, vcc
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
221: flat_store_dwordx2 v[0:1], v[2:3]
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
222: s_endpgm
dag:102'0 ~~~~~~~~~~
223: .section .rodata,#alloc
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
224: .p2align 6, 0x0
dag:102'0 ~~~~~~~~~~~~~~~~~
225: .amdhsa_kernel test_fract_f64
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
226: .amdhsa_group_segment_fixed_size 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
227: .amdhsa_private_segment_fixed_size 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
228: .amdhsa_kernarg_size 24
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
229: .amdhsa_user_sgpr_count 6
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
230: .amdhsa_user_sgpr_private_segment_buffer 1
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
231: .amdhsa_user_sgpr_dispatch_ptr 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
232: .amdhsa_user_sgpr_queue_ptr 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
233: .amdhsa_user_sgpr_kernarg_segment_ptr 1
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
234: .amdhsa_user_sgpr_dispatch_id 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
235: .amdhsa_user_sgpr_flat_scratch_init 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
236: .amdhsa_user_sgpr_private_segment_size 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
237: .amdhsa_system_sgpr_private_segment_wavefront_offset 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
238: .amdhsa_system_sgpr_workgroup_id_x 1
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
239: .amdhsa_system_sgpr_workgroup_id_y 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
240: .amdhsa_system_sgpr_workgroup_id_z 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
241: .amdhsa_system_sgpr_workgroup_info 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
242: .amdhsa_system_vgpr_workitem_id 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
243: .amdhsa_next_free_vgpr 8
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
244: .amdhsa_next_free_sgpr 8
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
245: .amdhsa_reserve_flat_scratch 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
246: .amdhsa_float_round_mode_32 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
247: .amdhsa_float_round_mode_16_64 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
248: .amdhsa_float_denorm_mode_32 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
249: .amdhsa_float_denorm_mode_16_64 3
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
250: .amdhsa_dx10_clamp 1
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
251: .amdhsa_ieee_mode 1
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
252: .amdhsa_exception_fp_ieee_invalid_op 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
253: .amdhsa_exception_fp_denorm_src 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
254: .amdhsa_exception_fp_ieee_div_zero 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
255: .amdhsa_exception_fp_ieee_overflow 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
256: .amdhsa_exception_fp_ieee_underflow 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
257: .amdhsa_exception_fp_ieee_inexact 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
258: .amdhsa_exception_int_div_zero 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
259: .end_amdhsa_kernel
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
260: .text
dag:102'0 ~~~~~~~
261: .Lfunc_end2:
dag:102'0 ~~~~~~~~~~~~~
262: .size test_fract_f64, .Lfunc_end2-test_fract_f64
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
263: ; -- End function
dag:102'0 ~~~~~~~~~~~~~~~~~~~
264: .section .AMDGPU.csdata
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~
265: ; Kernel info:
dag:102'0 ~~~~~~~~~~~~~~~
266: ; codeLenInByte = 152
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
267: ; NumSgprs: 10
dag:102'0 ~~~~~~~~~~~~~~~
268: ; NumVgprs: 8
dag:102'0 ~~~~~~~~~~~~~~
269: ; ScratchSize: 0
dag:102'0 ~~~~~~~~~~~~~~~~~
270: ; MemoryBound: 0
dag:102'0 ~~~~~~~~~~~~~~~~~
271: ; FloatMode: 192
dag:102'0 ~~~~~~~~~~~~~~~~~
272: ; IeeeMode: 1
dag:102'0 ~~~~~~~~~~~~~~
273: ; LDSByteSize: 0 bytes/workgroup (compile time only)
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
274: ; SGPRBlocks: 1
dag:102'0 ~~~~~~~~~~~~~~~~
275: ; VGPRBlocks: 1
dag:102'0 ~~~~~~~~~~~~~~~~
276: ; NumSGPRsForWavesPerEU: 10
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
277: ; NumVGPRsForWavesPerEU: 8
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
278: ; Occupancy: 10
dag:102'0 ~~~~~~~~~~~~~~~~
279: ; WaveLimiterHint : 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
280: ; COMPUTE_PGM_RSRC2:SCRATCH_EN: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
281: ; COMPUTE_PGM_RSRC2:USER_SGPR: 6
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
282: ; COMPUTE_PGM_RSRC2:TRAP_HANDLER: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
283: ; COMPUTE_PGM_RSRC2:TGID_X_EN: 1
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
284: ; COMPUTE_PGM_RSRC2:TGID_Y_EN: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
285: ; COMPUTE_PGM_RSRC2:TGID_Z_EN: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
286: ; COMPUTE_PGM_RSRC2:TIDIG_COMP_CNT: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
287: .ident "Debian clang version 17.0.6 (2)"
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
288: .section ".note.GNU-stack"
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
289: .addrsig
dag:102'0 ~~~~~~~~~~
290: .amdgpu_metadata
dag:102'0 ~~~~~~~~~~~~~~~~~~
291: ---
dag:102'0 ~~~~
292: amdhsa.kernels:
dag:102'0 ~~~~~~~~~~~~~~~~
293: - .args:
dag:102'0 ~~~~~~~~~~
294: - .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
295: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
296: .offset: 0
dag:102'0 ~~~~~~~~~~~~
297: .size: 8
dag:102'0 ~~~~~~~~~~
298: .type_name: 'half*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
299: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
300: - .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
301: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
302: .offset: 8
dag:102'0 ~~~~~~~~~~~~
303: .size: 8
dag:102'0 ~~~~~~~~~~
304: .type_name: 'half*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
305: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
306: - .access: read_only
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
307: .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~
308: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
309: .offset: 16
dag:102'0 ~~~~~~~~~~~~~
310: .size: 8
dag:102'0 ~~~~~~~~~~
311: .type_name: 'half*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
312: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
313: .group_segment_fixed_size: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
314: .kernarg_segment_align: 8
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
315: .kernarg_segment_size: 24
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
316: .language: OpenCL C
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
317: .language_version:
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
318: - 2
dag:102'0 ~~~~~
319: - 0
dag:102'0 ~~~~~
320: .max_flat_workgroup_size: 256
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
321: .name: test_fract_f16
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~
322: .private_segment_fixed_size: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
323: .sgpr_count: 8
dag:102'0 ~~~~~~~~~~~~~~~~
324: .sgpr_spill_count: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
325: .symbol: test_fract_f16.kd
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
326: .vgpr_count: 8
dag:102'0 ~~~~~~~~~~~~~~~~
327: .vgpr_spill_count: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
328: .wavefront_size: 64
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
329: - .args:
dag:102'0 ~~~~~~~~~~
330: - .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
331: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
332: .offset: 0
dag:102'0 ~~~~~~~~~~~~
333: .size: 8
dag:102'0 ~~~~~~~~~~
334: .type_name: 'float*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
335: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
336: - .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
337: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
338: .offset: 8
dag:102'0 ~~~~~~~~~~~~
339: .size: 8
dag:102'0 ~~~~~~~~~~
340: .type_name: 'float*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
341: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
342: - .access: read_only
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
343: .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~
344: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
345: .offset: 16
dag:102'0 ~~~~~~~~~~~~~
346: .size: 8
dag:102'0 ~~~~~~~~~~
347: .type_name: 'float*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
348: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
349: .group_segment_fixed_size: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
350: .kernarg_segment_align: 8
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
351: .kernarg_segment_size: 24
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
352: .language: OpenCL C
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
353: .language_version:
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
354: - 2
dag:102'0 ~~~~~
355: - 0
dag:102'0 ~~~~~
356: .max_flat_workgroup_size: 256
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
357: .name: test_fract_f32
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~
358: .private_segment_fixed_size: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
359: .sgpr_count: 8
dag:102'0 ~~~~~~~~~~~~~~~~
360: .sgpr_spill_count: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
361: .symbol: test_fract_f32.kd
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
362: .vgpr_count: 8
dag:102'0 ~~~~~~~~~~~~~~~~
363: .vgpr_spill_count: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
364: .wavefront_size: 64
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
365: - .args:
dag:102'0 ~~~~~~~~~~
366: - .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
367: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
368: .offset: 0
dag:102'0 ~~~~~~~~~~~~
369: .size: 8
dag:102'0 ~~~~~~~~~~
370: .type_name: 'double*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~
371: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
372: - .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~
373: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
374: .offset: 8
dag:102'0 ~~~~~~~~~~~~
375: .size: 8
dag:102'0 ~~~~~~~~~~
376: .type_name: 'double*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~
377: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
378: - .access: read_only
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
379: .address_space: global
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~
380: .is_restrict: true
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
381: .offset: 16
dag:102'0 ~~~~~~~~~~~~~
382: .size: 8
dag:102'0 ~~~~~~~~~~
383: .type_name: 'double*'
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~
384: .value_kind: global_buffer
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
385: .group_segment_fixed_size: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
386: .kernarg_segment_align: 8
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
387: .kernarg_segment_size: 24
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
388: .language: OpenCL C
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
389: .language_version:
dag:102'0 ~~~~~~~~~~~~~~~~~~~~
390: - 2
dag:102'0 ~~~~~
391: - 0
dag:102'0 ~~~~~
392: .max_flat_workgroup_size: 256
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
393: .name: test_fract_f64
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~
394: .private_segment_fixed_size: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
395: .sgpr_count: 10
dag:102'0 ~~~~~~~~~~~~~~~~~
396: .sgpr_spill_count: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
397: .symbol: test_fract_f64.kd
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
398: .vgpr_count: 8
dag:102'0 ~~~~~~~~~~~~~~~~
399: .vgpr_spill_count: 0
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
400: .wavefront_size: 64
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~
401: amdhsa.target: amdgcn-amd-amdhsa--gfx803
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
402: amdhsa.version:
dag:102'0 ~~~~~~~~~~~~~~~~
403: - 1
dag:102'0 ~~~~~
404: - 1
dag:102'0 ~~~~~
405: ...
dag:102'0 ~~~~
406:
dag:102'0 ~
407: .end_amdgpu_metadata
dag:102'0 ~~~~~~~~~~~~~~~~~~~~~~
>>>>>>
Start 18: compile_native_rcp__gfx803
18/21 Test #18: compile_native_rcp__gfx803 ........ Passed 0.17 sec
Start 19: compile_native_rsqrt__gfx803
19/21 Test #19: compile_native_rsqrt__gfx803 ...... Passed 0.17 sec
Start 20: compile_native_log__gfx803
20/21 Test #20: compile_native_log__gfx803 ........ Passed 0.17 sec
Start 21: compile_native_exp__gfx803
21/21 Test #21: compile_native_exp__gfx803 ........ Passed 0.17 sec
71% tests passed, 6 tests failed out of 21
Total Test time (real) = 3.53 sec
The following tests FAILED:
1 - constant_fold_lgamma_r__gfx900 (Failed)
2 - constant_fold_lgamma_r__gfx1030 (Failed)
6 - compile_frexp__gfx600 (Failed)
7 - compile_fract__gfx600 (Failed)
12 - compile_fract__gfx700 (Failed)
17 - compile_fract__gfx803 (Failed)
Errors while running CTest