Slow Xorg performance on dual Opteron + Radeon, Jessie 64-bit



Hi all,

I've recently installed Debian Jessie 64-bit on my (admittedly rather old) dual Opteron workstation, and I'm seeing pretty bad performance in X11. Certain redrawing operations are extremely slow, with delays of half a second or more, and Xorg consumes a lot of CPU time while they happen. I've wondered whether it mainly affects older X clients using bitmapped fonts, as it's very noticeable when running xosview and when dragging tabs and invoking menus in Notion (window manager), but menu drawing in GIMP is also very slow. The machine has 8 GiB of RAM, of which only a couple of GiB are in use. It was formerly running Ubuntu 12 LTS and didn't have this problem.

`Xorg -version` reports the following:

X.Org X Server 1.16.4
Release Date: 2014-12-20
X Protocol Version 11, Revision 0
Build Operating System: Linux 3.16.0-4-amd64 x86_64 Debian
Current Operating System: Linux zaphod 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-3.16.0-4-amd64 root=UUID=2cd952af-5971-442e-9254-bb15730dec1a ro quiet
Build Date: 11 February 2015  12:32:02AM
xorg-server 2:1.16.4-1 (http://www.debian.org/support)
Current version of pixman: 0.32.6


My video card is a Radeon X800 XT Platinum Edition AGP, and Xorg is using the RADEON driver. It's running in dual-head mode.


`perf top` seems to indicate that the C library's memcpy implementation is the main culprit:

  44.29%  libc-2.19.so                   [.] __memcpy_sse2_unaligned


Here's the disassembly of the same, also courtesy of `perf`:
<<
__memcpy_sse2_unaligned /lib/x86_64-linux-gnu/libc-2.19.so
...
       │       lea    0x30(%r10),%rax
       │       movdqu (%rcx,%r10,1),%xmm8
 10.42 │       movdqa %xmm8,(%rcx)
  5.03 │       movdqu (%rcx,%r9,1),%xmm8
 10.35 │       movdqa %xmm8,0x10(%rcx)
 10.60 │       movdqu (%rcx,%r8,1),%xmm8
 14.00 │       movdqa %xmm8,0x20(%rcx)
  7.87 │       movdqu (%rcx,%rax,1),%xmm8
 12.53 │       movdqa %xmm8,0x30(%rcx)
  7.57 │       add    $0x40,%rcx
       │       cmp    %rcx,%rdx
 10.08 │     ↓ jne    780
       │     ↓ jmpq   6de
       │       cmp    %rsi,%rdi
>>

Could this be a performance regression caused by recent glibc optimizations aimed at newer CPUs? Any thoughts on how I can test this further?

Thanks,
Chris

