
Re: Blender crash with packaged ROCm 5.2.3 drivers



Hi Jakub,

Jakub Jaszewski, on 2022-11-04:
> Please forgive me if this is not the right place to discuss such issues. My
> name is Jakub and for my work I use FOSS 3D software - Blender [1] for which
> recently AMD contributed HIP compute backend [2] as part of the official
> support.

I think you're in the right place to discuss ROCm-related topics
in a Debian context.  :)

> After ROCm 5.2.3 landed in Debian unstable I gave it a try with Blender, and
> after some initial hurdles with binaries path I encountered an LLVM error
> which resulted in a crash. Blender developer said that this is not something
> they can fix. The entire issue is documented on Blender bugtracker [3] where
> you can find all the details.
> 
> The most relevant part of debug log:
> 
> I1102 11:33:38.655006 91168 device.cpp:32] HIPEW initialization succeeded
> I1102 11:33:38.655035 91168 device.cpp:34] Found precompiled kernels
> mesa: CommandLine Error: Option 'h' registered more than once!
> LLVM ERROR: inconsistency in registered CommandLine options
> Aborted
> 
> I don't know if this can qualify as a bug that should be reported here on
> debian bugtracker or somewhere else. Any help would be greatly appreciated.
> 
> [1] https://builder.blender.org/download/daily/
> [2] https://developer.blender.org/D12578
> [3] https://developer.blender.org/T102018

I have been scratching my head over what changes would be
necessary to properly hide symbols in order to prevent them from
colliding with Mesa; as far as I could see, except for the
rocm-smi-lib, library symbols are already filtered.  But it is
quite possible I missed the point and didn't look at the right
things (I looked at .map, .def and package symbols lists).
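
For what it's worth, here is a minimal sketch of how one could
cross-check exported symbols between two shared libraries.  It
assumes GNU nm is installed; the function names are made up for
illustration, and the libraries to compare are left to the
reader.  It only lists dynamic symbols defined in both files,
which is a hint at possible collisions, not proof of one.

```python
import subprocess


def parse_nm_defined(nm_output):
    """Collect symbol names from `nm -D --defined-only` style output."""
    symbols = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Expected shape: "<address> <type> <name>[@version]".
        # Shorter lines (e.g. undefined symbols) are skipped.
        if len(parts) < 3:
            continue
        # Drop any symbol version suffix such as "@@VERS_1".
        symbols.add(parts[2].split("@", 1)[0])
    return symbols


def exported_symbols(path):
    """Run GNU nm on a shared object and return its defined dynamic symbols."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_nm_defined(result.stdout)


def colliding_symbols(symbols_a, symbols_b):
    """Return the sorted list of names defined in both symbol sets."""
    return sorted(symbols_a & symbols_b)
```

Calling colliding_symbols(exported_symbols(a), exported_symbols(b))
on two library paths a and b prints nothing by itself; the
returned list can then be inspected for LLVM-looking names.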

Other than that, I tried to build a custom blender 3.3.1 version
with HIP Cycles support following the packaging changes
suggested by Cordell Bloor in #1021646[4], to see if I could
reproduce the issue.

[4]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1021646

Apparently I managed to reproduce a crash at about the same step
as you observed.  I got slightly different output when running
blender through the debugger: the Mesa CommandLine Error does
not appear on my end.  Below is the tracing information from the
debugger:

I1105 00:57:15.862457 879154 device.cpp:32] HIPEW initialization succeeded
I1105 00:57:15.862509 879154 device.cpp:34] Found precompiled kernels
[New Thread 0x7fff325ff6c0 (LWP 879325)]

Thread 1 "blender" received signal SIGSEGV, Segmentation fault.
0x00007ffff7c15d95 in ?? () from /lib/x86_64-linux-gnu/libjemalloc.so.2
(gdb) bt
#0  0x00007ffff7c15d95 in  () at /lib/x86_64-linux-gnu/libjemalloc.so.2
#1  0x00007fff32959154 in  () at /lib/x86_64-linux-gnu/libamdhip64.so
#2  0x00007fff32960fa8 in  () at /lib/x86_64-linux-gnu/libamdhip64.so
#3  0x00007fff3290f19e in  () at /lib/x86_64-linux-gnu/libamdhip64.so
#4  0x00007fff32952dfe in  () at /lib/x86_64-linux-gnu/libamdhip64.so
#5  0x00007fff326c676c in  () at /lib/x86_64-linux-gnu/libamdhip64.so
#6  0x00007fff326c75ad in hipInit () at /lib/x86_64-linux-gnu/libamdhip64.so
#7  0x0000555557e38824 in ccl::device_hip_safe_init () at ./intern/cycles/device/hip/device.cpp:96
#8  ccl::device_hip_info(ccl::vector<ccl::DeviceInfo, ccl::GuardedAllocator<ccl::DeviceInfo> >&) (devices=...) at ./intern/cycles/device/hip/device.cpp:104
#9  0x0000555557e20b7a in ccl::Device::available_devices(unsigned int) (mask=34) at ./intern/cycles/device/device.cpp:228
#10 0x0000555557bbbc3d in ccl::available_devices_func(PyObject*, PyObject*) (args=<optimized out>) at ./intern/cycles/blender/python.cpp:416
#11 0x00007fffeff28413 in  () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#12 0x00007fffefedebce in _PyObject_MakeTpCall () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#13 0x00007fffefe79cb4 in _PyEval_EvalFrameDefault () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#14 0x00007fffeffc70c6 in  () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#15 0x00007fffefee31b8 in  () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#16 0x00007fffefe79c63 in _PyEval_EvalFrameDefault () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#17 0x00007fffeffc70c6 in  () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#18 0x00007fffefee31b8 in  () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#19 0x00007fffefe79c63 in _PyEval_EvalFrameDefault () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#20 0x00007fffeffc70c6 in  () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#21 0x00007fffefee31b8 in  () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#22 0x00007fffefe79c63 in _PyEval_EvalFrameDefault () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#23 0x00007fffeffc70c6 in  () at /lib/x86_64-linux-gnu/libpython3.10.so.1.0
#24 0x0000555556ac015f in bpy_class_call (C=0x7fffd967e2b8, ptr=<optimized out>, func=0x55555ac15da0 <rna_Panel_draw_func>, parms=0x7fffffffdca0) at ./source/blender/python/intern/bpy_rna.c:8690
#25 0x0000555556a5da5c in panel_draw (C=<optimized out>, panel=0x7fff439304b8) at ./source/blender/makesrna/intern/rna_ui.c:129
#26 0x0000555556adafab in ed_panel_draw (C=C@entry=0x7fffd967e2b8, region=region@entry=0x7fff4bc55038, lb=lb@entry=0x7fff4bc55130, pt=pt@entry=0x7fff4b8ca938, panel=0x7fff439304b8, panel@entry=0x0, w=484, em=20, unique_panel_str=0x0, search_filter=0x0) at ./source/blender/editors/screen/area.c:2791
#27 0x0000555556adca43 in ED_region_panels_layout_ex (C=C@entry=0x7fffd967e2b8, region=region@entry=0x7fff4bc55038, paneltypes=<optimized out>, contexts=contexts@entry=0x7fffffffdf60, category_override=category_override@entry=0x0) at ./source/blender/editors/screen/area.c:2989
#28 0x00005555584a3be5 in userpref_main_region_layout (C=0x7fffd967e2b8, region=0x7fff4bc55038) at ./source/blender/editors/space_userpref/space_userpref.c:128
#29 0x0000555556adbb9e in ED_region_do_layout (C=C@entry=0x7fffd967e2b8, region=region@entry=0x7fff4bc55038) at ./source/blender/editors/screen/area.c:511
#30 0x00005555565543f5 in wm_draw_window_offscreen (stereo=false, win=0x7fff43dd7a78, C=0x7fffd967e2b8) at ./source/blender/windowmanager/intern/wm_draw.c:889
#31 wm_draw_window (win=0x7fff43dd7a78, C=0x7fffd967e2b8) at ./source/blender/windowmanager/intern/wm_draw.c:1111
#32 wm_draw_update (C=0x7fffd967e2b8) at ./source/blender/windowmanager/intern/wm_draw.c:1338
#33 0x0000555556550f40 in WM_main (C=C@entry=0x7fffd967e2b8) at ./source/blender/windowmanager/intern/wm.c:640
#34 0x0000555555efa1ca in main (argc=2, argv=0x7fffffffe248) at ./source/creator/creator.c:547
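
Several frames above have no function name, which suggests that
debug symbols were missing for those libraries.  As a rough
sketch (the regular expression and function name are my own
assumptions, tailored to the gdb output format shown here), one
could count the unresolved frames per library to see where
debug symbols would help most:

```python
import re
from collections import Counter

# Matches frames with an address but no function name, for example:
#   "#1  0x00007fff32959154 in  () at /lib/x86_64-linux-gnu/libamdhip64.so"
# An optional "??" placeholder before "()" is accepted as well.
UNRESOLVED_FRAME = re.compile(
    r"^#(\d+)\s+0x[0-9a-f]+ in\s+(?:\?\?\s+)?\(\) (?:at|from) (\S+)"
)


def unresolved_frames_by_library(backtrace):
    """Count backtrace frames lacking a function name, keyed by library path."""
    counts = Counter()
    for line in backtrace.splitlines():
        match = UNRESOLVED_FRAME.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts
```

Feeding the text of the backtrace to this function returns a
Counter mapping each library path to its number of nameless
frames; frames such as hipInit, which do carry a name, are not
counted.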

In the hope that this helps pinpoint what's wrong,
-- 
Étienne Mollier <emollier@emlwks999.eu>
Fingerprint:  8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
Sent from /dev/pts/1, please excuse my verbosity.
