
Bug#206907: xserver-xfree86: hangs on screensaver



Sven Luther <sven.luther@wanadoo.fr> writes:

> > Well, that certainly looks like a good way to fuck yourself.
> > Whoever wrote this code has a lot of confidence in the underlying
> > hardware.

> > Since I happen to know that Sven Luther does a lot of glint driver
> > work, I'm CCing him on this message.

> Yep, I did write part of this code, although I mostly worked on the
> permedia3, but it is the same code. Could you try building the glint
> module with the DEBUG #define at the start of pm2_accel.c set to 1,
> in order to see the call trace in /var/log/XFree86.0.log?

I don't have the X sources or bandwidth to sit around compiling this
stuff, but I will test out glint.o's that you guys send me.

> What is happening is that the Permedia2Sync function is sending the
> sync command to the chip, to synchronize the pipeline, probably
> between accel drawing and software drawing, since the permedia2
> cannot accelerate some functions (notably lines). The sync tag
> should be read back once the sync command has reached the bottom of
> the graphics pipeline, and every accel drawing has been committed to
> the framebuffer. Since no sync tag is read back, this can be for
> various reasons:

>   o the sync command never reached the chip, because of bus
>   hogging or something such.

>   o the graphics pipe failed to synchronize, or died for whatever
>   reason. This means the problem is not in the sync call, but with
>   whatever was done previously.

>   o naturally, the random crashing could hint at a hardware problem
>   or bus problems. A more complete description of your exact card
>   (with lspci -v output maybe and/or lspci -vn too) would be welcome,
>   as well as what arch it is, and what kind of bus it is connected
>   to. Information on the motherboard would be welcome too, although
>   I doubt it is relevant here.
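
To make the failure modes above concrete, here is a minimal sketch in C
of a sync wait that gives up after a bounded number of polls instead of
spinning forever. It does not touch real hardware and is not the glint
driver's code: struct fake_chip, fake_write_sync, fake_read_output,
sync_with_timeout and the SYNC_TAG value are all made up for
illustration. A chip whose pipeline is "dead" swallows the sync
command, so the tag never comes back and the caller sees a timeout
rather than a hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative value only, not taken from the real driver. */
#define SYNC_TAG 0x188u

enum sync_result { SYNC_OK, SYNC_TIMEOUT };

/* Hypothetical software model of the chip; a real driver would read
 * and write memory-mapped registers instead. */
struct fake_chip {
    bool pipe_alive;      /* false models a dead or wedged pipeline   */
    int  polls_until_tag; /* polls before the tag reaches the output  */
    bool sync_pending;    /* a sync command was accepted by the pipe  */
};

/* Send the sync command. A dead pipeline silently drops it. */
static void fake_write_sync(struct fake_chip *chip)
{
    chip->sync_pending = chip->pipe_alive;
}

/* Poll the output side once. Returns true and stores a value only
 * when the sync tag has reached the bottom of the pipeline. */
static bool fake_read_output(struct fake_chip *chip, uint32_t *value)
{
    if (!chip->sync_pending)
        return false;
    if (chip->polls_until_tag > 0) {
        chip->polls_until_tag--;
        return false;
    }
    chip->sync_pending = false;
    *value = SYNC_TAG;
    return true;
}

/* Send a sync and wait for the tag, but give up after max_polls
 * attempts so a lost command cannot hang the caller forever. */
static enum sync_result sync_with_timeout(struct fake_chip *chip,
                                          int max_polls)
{
    uint32_t value = 0;

    assert(chip != NULL);
    fake_write_sync(chip);
    for (int i = 0; i < max_polls; i++) {
        if (fake_read_output(chip, &value) && value == SYNC_TAG)
            return SYNC_OK;
    }
    return SYNC_TIMEOUT;
}
```

With a healthy model chip the call returns SYNC_OK after a few polls;
with pipe_alive set to false it returns SYNC_TIMEOUT, which is the
point where a driver could log the problem instead of looping.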

Something I failed to mention in my original bug report is that this
machine has always worked just fine in the past.  I don't think it's a
hardware problem unless it just started for some weird reason.

eugene:~# lspci -nv
00:00.0 Class 0600: 8086:7190 (rev 03)
        Subsystem: 1028:0080
        Flags: bus master, medium devsel, latency 64
        Memory at f0000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [a0] AGP version 1.0

00:01.0 Class 0604: 8086:7191 (rev 03)
        Flags: bus master, 66Mhz, medium devsel, latency 64
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
        Memory behind bridge: fb000000-fdffffff
        Prefetchable memory behind bridge: f6000000-f6ffffff

00:07.0 Class 0601: 8086:7110 (rev 02)
        Flags: bus master, medium devsel, latency 0

00:07.1 Class 0101: 8086:7111 (rev 01) (prog-if 80 [Master])
        Flags: bus master, medium devsel, latency 64
        I/O ports at ffa0 [size=16]

00:07.2 Class 0c03: 8086:7112 (rev 01)
        Flags: bus master, medium devsel, latency 64, IRQ 14
        I/O ports at dce0 [size=32]

00:07.3 Class 0680: 8086:7113 (rev 02)
        Flags: medium devsel, IRQ 9

00:11.0 Class 0200: 10b7:9055
        Subsystem: 1028:0080
        Flags: bus master, medium devsel, latency 64, IRQ 14
        I/O ports at dc00 [size=128]
        Memory at fe000000 (32-bit, non-prefetchable) [size=128]
        Expansion ROM at f8000000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 1

00:13.0 Class 0604: 1011:0024 (rev 03)
        Flags: bus master, medium devsel, latency 64
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=64
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: f9000000-faffffff
        Prefetchable memory behind bridge: 00000000f5000000-00000000f5f00000
        Capabilities: [dc] Power Management version 1

01:00.0 Class 0380: 104c:3d07 (rev 01)
        Subsystem: 1092:0149
        Flags: 66Mhz, medium devsel, IRQ 11
        Memory at fcfe0000 (32-bit, non-prefetchable) [size=128K]
        Memory at fc000000 (32-bit, non-prefetchable) [size=8M]
        Memory at fb800000 (32-bit, non-prefetchable) [size=8M]
        Expansion ROM at 80000000 [disabled] [size=64K]
        Capabilities: [40] AGP version 1.0

02:0a.0 Class 0100: 9005:001f
        Subsystem: 1028:0080
        Flags: bus master, medium devsel, latency 64, IRQ 10
        BIST result: 00
        I/O ports at ec00 [disabled] [size=256]
        Memory at f9fff000 (64-bit, non-prefetchable) [size=4K]
        Expansion ROM at fa000000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 1

02:0e.0 Class 0100: 9004:8078 (rev 01)
        Subsystem: 9004:7880
        Flags: bus master, medium devsel, latency 64, IRQ 10
        I/O ports at e800 [disabled] [size=256]
        Memory at f9ffe000 (32-bit, non-prefetchable) [size=4K]
        Expansion ROM at fa000000 [disabled] [size=64K]
        Capabilities: [dc] Power Management version 1

Here is lspci -v for this device:

01:00.0 Display controller: Texas Instruments TVP4020 [Permedia 2] (rev 01)
        Subsystem: Diamond Multimedia Systems FIRE GL 1000 PRO
        Flags: 66Mhz, medium devsel, IRQ 11
        Memory at fcfe0000 (32-bit, non-prefetchable) [size=128K]
        Memory at fc000000 (32-bit, non-prefetchable) [size=8M]
        Memory at fb800000 (32-bit, non-prefetchable) [size=8M]
        Expansion ROM at 80000000 [disabled] [size=64K]
        Capabilities: [40] AGP version 1.0

> My first guess would be that either a previous call made the
> graphics pipeline die, or maybe even hogged the bus. Does the machine
> stay accessible under ssh or something? I guess yes, since you were
> able to do gdb work on it. What happens if you kill and restart the X
> server, which should reinitialize the pipeline?

Killing and restarting the X server doesn't seem to do anything.  I
still have a blank screen, and I haven't been able to get the hardware
'back to normal' without rebooting.  You are correct in assuming that
the machine itself is not hanging: the load goes to one and X is dead,
but I am able to ssh in just fine.

> > Sven, any ideas?

> I hope this gives some ideas, although the randomness of this
> happening might well point to the input FIFO overflowing and thus
> losing the sync command or something such. Tricky thing to hunt down
> if this is the case.

Well, I'm willing to try stuff out for you guys, but I would like to
avoid downloading and compiling X.  I don't really have the time for
that.

As for the machine itself:

Linux eugene 2.4.21 #2 Fri Jun 27 15:58:56 CEST 2003 i686 GNU/Linux

Although I thought that stuff was in the bug report.

Thank you,
-- 
David N. Welton
   Consulting: http://www.dedasys.com/
     Personal: http://www.dedasys.com/davidw/
Free Software: http://www.dedasys.com/freesoftware/
   Apache Tcl: http://tcl.apache.org/



