
Notes on netwinder frame buffer performance



I've been looking into the xserver-fbdev performance on the potato
netwinder release.  I have a collection of data and facts but no great
news yet.

THIS DOES NOT APPLY TO THE REBEL.COM ACCELERATED XSERVER.

First, for those who haven't used it, the xserver-fbdev works and
feels fine for xterms, browsers, and other things that draw to the
screen and then leave it alone.  It is uncomfortably slow when
dragging opaque windows and when scrolling.

After failing to find an X performance benchmark in Debian I dusted
off an ancient copy of `xbench' to put some numbers on it.

I ran it on my Intel machine (name rhth, K2-350, Riva TNT 128 AGP) and
one of my Netwinders (name elbow, SA-110, CyberPro 2000).  The other
rows in the table are the 1992 era machines whose results were
included in the xbench package.  (Note, I left rhth at 24bpp and the
netwinder at 8bpp.)

+-----------+----+---------+---------+---------+----------+---------+---------+
|machine    |bpp |  line   |  fill   |   blt   |  text    |  cmplx  | xstones |
+-----------+----+---------+---------+---------+----------+---------+---------+
|rhth       | 24 |  649991 |  272336 |  625215 | 13001324 |  650980 |  694951 |
+-----------+----+---------+---------+---------+----------+---------+---------+
|sparcII gx |  8 |  244821 |   44250 |   50912 |   435875 |   85816 |   97803 |
+-----------+----+---------+---------+---------+----------+---------+---------+
|elbow      |  8 |  311700 |   24732 |    7770 |   577500 |   24248 |   28269 |
+-----------+----+---------+---------+---------+----------+---------+---------+
|Sun3/50 (R3|  1 |   10000 |   10000 |   10000 |    10000 |   10000 |   10000 |
+-----------+----+---------+---------+---------+----------+---------+---------+
|DEC gpx (R2|  8 |    4835 |    7892 |    5710 |    30937 |    5490 |    8250 |
+-----------+----+---------+---------+---------+----------+---------+---------+

The `blt' column is the one that really hurts.  Those are the screen
to screen copies and scrolls.
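
To put a number on how far behind the blt score is, here is a small
sketch that divides the elbow row by the sparcII gx row.  The scores
are copied from the table; the names `scores' and `relative' are just
made up for this illustration.

```python
# xbench scores copied from the table above (Sun3/50 = 10000 baseline).
COLUMNS = ["line", "fill", "blt", "text", "cmplx", "xstones"]
scores = {
    "rhth":       [649991, 272336, 625215, 13001324, 650980, 694951],
    "sparcII gx": [244821, 44250, 50912, 435875, 85816, 97803],
    "elbow":      [311700, 24732, 7770, 577500, 24248, 28269],
}


def relative(machine, baseline):
    """Return each column of `machine` divided by the same column of `baseline`."""
    return {
        col: m / b
        for col, m, b in zip(COLUMNS, scores[machine], scores[baseline])
    }


rel = relative("elbow", "sparcII gx")
for col in COLUMNS:
    print(f"{col:8s} {rel[col]:5.2f}")

# The column where elbow does worst against the sparcII gx.
worst = min(rel, key=rel.get)
print("worst column:", worst)
```

On these numbers elbow is ahead on line and text, and blt is the
column where it falls furthest behind.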

I poked around and found the source of the slowness.  The PCI memory
for the frame buffer has caching and write-buffering disabled in the
page tables.  Enabling caching and write-buffering seems to give a
nice speed up, at the expense of having anything usable. :-) (Even if
it worked, it would have to be coordinated with the acceleration in
the fbdev so that the cache gets flushed before the accelerator is
used.)

When a cacheline is written to the CyberPro, the first 32 bit word
gets written correctly but the next 3 do not make it into the frame
buffer.  Consider these tests with the cache and write buffer
enabled...

  Writing the frame buffer horizontally yields a striped pattern of 4
  pixels of data then 12 pixels not written.

  Writing the frame buffer vertically works just fine.
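
For anyone who wants to picture the two results, here is a toy model
of them.  It is only a sketch built on assumptions of mine: 8 bits per
pixel, 16 byte bursts, back-to-back writes to the same 16 byte block
merged into one burst, and only the first 32 bit word of each burst
landing.  It is not a description of what the hardware actually does,
and every name in it is made up.

```python
# Toy model only.  Assumes 1 byte per pixel, 16-byte bursts, and that
# back-to-back writes to the same 16-byte block are merged into one burst.
WIDTH, HEIGHT = 32, 4   # hypothetical tiny frame buffer
BURST, WORD = 16, 4     # bytes per burst and per 32-bit word


def simulate(addresses):
    """Return the frame buffer after writing 1 to each address in order."""
    fb = [0] * (WIDTH * HEIGHT)
    i = 0
    while i < len(addresses):
        block = addresses[i] // BURST
        first_word = addresses[i] // WORD
        # Walk the run of consecutive writes that fall in this block.
        while i < len(addresses) and addresses[i] // BURST == block:
            # Assumed failure: only the first word of the burst lands.
            if addresses[i] // WORD == first_word:
                fb[addresses[i]] = 1
            i += 1
    return fb


def rows(fb):
    """Render the frame buffer as text, '#' written and '.' not written."""
    return [
        "".join("#" if p else "." for p in fb[y * WIDTH:(y + 1) * WIDTH])
        for y in range(HEIGHT)
    ]


# Row by row (horizontal) versus column by column (vertical) fill order.
horizontal = [y * WIDTH + x for y in range(HEIGHT) for x in range(WIDTH)]
vertical = [y * WIDTH + x for x in range(WIDTH) for y in range(HEIGHT)]

h_rows = rows(simulate(horizontal))
v_rows = rows(simulate(vertical))
print("\n".join(h_rows))
print()
print("\n".join(v_rows))
```

In this model the horizontal fill comes out as 4 pixels written then
12 missing, while the vertical fill is solid because no two
consecutive writes ever share a burst.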

In this state caching is not possible for the frame buffer.  If the
DEC21285 can be convinced not to do burst writes (that's a guess) or
the CyberPro 2000 can be convinced to accept them, then we could turn
on caching and have much better performance.  I'm prodding the 21285,
but I can't make any progress on the CyberPro until I get some docs.

Does anyone here have a contact at iGS Technologies?  I have been
unable to get in contact using their telephones, voice mail, web
pages, or any of their published e-mail addresses.

Another option is to slightly alter the kernel fb acceleration for the
CyberPro 2000 and create a new XFree86 fb accelerator type for it.  This
would be higher performance, but much more complicated than just
getting the chips to communicate correctly (if that's even possible).

There doesn't seem to be any CyberPro support in XFree86 4.0 (at least
going by the release notes; I haven't unpacked and grepped the source).

-- 
                                     Jim Studt, President
                                     The Federated Software Group, Inc.

