Re: iBook and playing DVDs



>Hey, wait a minute... why guarded?

Well, you are right about this: guarded isn't needed, though
the kernel tends to set guarded along with cache-inhibit
automatically (it does so in ioremap, for example).

The main reason I currently set it is that I'm still trying to
figure out what is causing both the r128 and radeon drivers to
lock up when using DRI + AGP with the Apple chipset. Among
the things I suspected was a speculative access issue, but
I no longer think it's related.

>Tell me where I'm wrong:
>
>AGP memory is regular RAM on the motherboard.
>(at least it isn't device registers)

Yes.

>Typically an app puts images (bumpmaps, textures, etc.)
>in AGP memory. Triangles for 3d rendering also
>get written to AGP memory.

Yes.

>This app is X, or an authorized local client.


Yes.

>It is not common to have the video card writing
>to AGP memory.

By default, the r128 and radeon DRI drivers do have the card
write the ring read pointer to AGP memory, but doing so seems
to be broken on some hardware (UniNorth 1.0.x and some ia64
bridges don't deal with it properly).

>If the video card does write to memory, X can
>ensure that this doesn't happen to memory that
>the user is busy writing to.

Currently, I have tweaked r128 to write, using normal PCI cycles,
to a separate page of memory holding only that ring pointer,
and I hacked radeon to not write at all, but instead have the
driver read that pointer from the card's MMIO registers. This
didn't help fix the lockup, though.

>It is not common for the user to read AGP memory.

You can't know that. If the memory is cacheable, writing a single
byte will make the CPU load the entire cache line, for example.
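
To make that concrete, here is a rough sketch in C (not taken from
any driver) of which bytes get involved when a store hits cacheable
memory. The 32-byte line size is an assumption that matches
G3/G4-class CPUs, and lines_touched() is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 32 bytes is typical of G3/G4-class CPUs. */
#define CACHE_LINE_SIZE 32u

/* Half-open address range [start, end) covered by whole cache lines. */
struct line_span {
    uintptr_t start;
    uintptr_t end;
};

/*
 * Return the span of whole cache lines involved in a store of `len`
 * bytes at `addr`. With a copyback cache, each of these lines is
 * normally read from memory before the store can complete.
 */
static struct line_span lines_touched(uintptr_t addr, size_t len)
{
    const uintptr_t mask = ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    struct line_span s;

    assert(len > 0);
    s.start = addr & mask;                              /* first line  */
    s.end = ((addr + len - 1) & mask) + CACHE_LINE_SIZE; /* past last   */
    return s;
}
```

So a one-byte store at 0x1005 involves the whole line from 0x1000
up to (but not including) 0x1020, which is a read of AGP memory
the application never asked for.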

>If the user does read from AGP memory, the X server
>could flush some cache lines before telling the user
>that the memory has been updated. (PowerPC uses a
>physical cache, not a virtual cache)

Well, more or less. On radeon + DRI, we could do a flush pass on
the indirect buffers when they get passed to the kernel driver.
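
As a rough idea of what such a flush pass looks like, here is a
minimal sketch in C. It is not the DRM code: flush_buffer() and
writeback_line() are made-up names, the 32-byte line size is an
assumption, and on non-PowerPC builds the write-back is a no-op so
that the sketch still compiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 32 bytes is typical of G3/G4-class CPUs. */
#define CACHE_LINE_SIZE 32u

static_assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1)) == 0,
              "cache line size must be a power of two");

/* Write one data cache line back to memory (PowerPC dcbst). */
static inline void writeback_line(const void *p)
{
#if defined(__powerpc__) || defined(__ppc__)
    __asm__ volatile ("dcbst 0,%0" : : "r" (p) : "memory");
#else
    (void)p; /* placeholder on other architectures */
#endif
}

/*
 * Write back every cache line overlapping [buf, buf + len).
 * Returns the number of lines visited.
 */
static size_t flush_buffer(const void *buf, size_t len)
{
    const uintptr_t mask = ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t p, last;
    size_t n = 0;

    if (len == 0)
        return 0;

    p = (uintptr_t)buf & mask;
    last = ((uintptr_t)buf + len - 1) & mask;
    for (;; p += CACHE_LINE_SIZE) {
        writeback_line((const void *)p);
        n++;
        if (p == last)
            break;
    }
#if defined(__powerpc__) || defined(__ppc__)
    /* Wait for the write-backs to complete before the card is told
     * about the buffer. */
    __asm__ volatile ("sync" : : : "memory");
#endif
    return n;
}
```

A real kernel driver would rely on the kernel's own cache flushing
helpers rather than open-coding the loop; the point is only that the
cost is one write-back per cache line of the buffer.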

>The motherboard chipset will walk some sort of page table
>when the video card tries to access AGP memory. This is
>kept coherent by a Linux kernel DRI/DRM/AGP driver.

The uninorth driver does an explicit flush of this page table
after modifying it; I don't map it uncacheable.

>Aside from X itself, ordering isn't going to matter.
>User apps won't be trying to atomically update data
>structures as viewed from the video card. X might
>do this.
>
>It wouldn't be insane to update X to include all
>the necessary cache-related instructions.

Actually, not X, but the DRM kernel driver.

>User apps need caching off by default, since trying to
>update all the apps would be insane.
>
>Unless user code will write to AGP memory on one
>processor and read or write on another processor,
>the M bit (Memory Coherency Attribute) can be
>cleared. It's pointless for the CPU to waste bus
>cycles trying to be coherent, since the video card
>will not cooperate. All non-SMP systems should
>map the AGP memory with coherency disabled.

I'm not too sure about that. What about one CPU writing
half a cache line of the ring buffer in AGP memory, and
another CPU writing the other half?
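
To illustrate the case I have in mind, here is a small sketch in C.
It assumes a ring of 32-bit dwords starting on a cache line boundary
and a 32-byte line; ring_slots_share_line() is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 32 bytes is typical of G3/G4-class CPUs. */
#define CACHE_LINE_SIZE 32u
/* Ring entries are assumed to be 32-bit dwords. */
#define DWORDS_PER_LINE (CACHE_LINE_SIZE / sizeof(uint32_t))

static_assert(CACHE_LINE_SIZE % sizeof(uint32_t) == 0,
              "line size must be a multiple of the entry size");

/*
 * True if ring entries i and j live in the same cache line, assuming
 * the ring base is cache-line aligned. If two CPUs write such entries
 * with coherency (the M bit) turned off, each may hold its own copy of
 * the line, and whichever copy is written back last wins.
 */
static bool ring_slots_share_line(size_t i, size_t j)
{
    return (i / DWORDS_PER_LINE) == (j / DWORDS_PER_LINE);
}
```

With these numbers, entries 3 and 4 share a line (the "two halves"
case), while entries 7 and 8 do not.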

>No existing PowerPC will do unrequested prefetching
>across page boundaries, or this is easily avoided
>by not using memory adjacent to the boundary
>between AGP memory and non-AGP memory.

That isn't a problem, though I'm not sure about your statement
that they won't do unrequested prefetching. Do you have any
pointers to the docs?

>If apps would at least avoid reading stuff written
>by the video card, write-through cached would be OK.
>Apps that read AGP memory are uncommon enough that
>fixing all of them would be feasible.

I think we can use full caching (copyback) without too many
problems. In the r128 case, we'll have to flush from the X server,
as it writes directly to the ring (and maybe from the Mesa driver
as well). On radeon, it's all done via indirect buffers, and those
get passed to the kernel driver before being inserted in the ring.

So we can definitely improve the throughput by letting it be
cacheable. The main reason I haven't worked on this yet is that I
want the driver to be stable first, to avoid mixing up problems.
So far, I haven't managed to figure out what is causing the
card lockups when AGP is used.

Ben.



