
Re: Analysis On Getting Rid Of xresprobe



On Thu, Dec 06, 2007 at 06:48:19PM -0800, Bryce Harrington wrote:
> On Tue, Dec 04, 2007 at 11:47:27PM -0500, David Nusinow wrote:
> >    I've been spewing my random findings into the IRC channel for a few
> > days now but I think it's time I posted a summary of where things stand on
> > getting rid of xresprobe. Please do provide corrections, commentary, and
> > questions.
> 
> Thanks for this write up and explaining the strategy and details about
> how the system works.  It appears consistent with my own thinking on
> this (https://wiki.ubuntu.com/DesktopTeam/Specs/HardyHardwareDetection)
> although you've dug in a lot deeper than I did.  :-)

I hope it was useful. I didn't really understand how the modesetting worked
before this, so I hope the writeup saved people the several days' journey
through the code that I had to take to get this far.

> > Motivation:
> >    xresprobe and the parts that call it in the postinst script require a 
> > knowledge of what driver we're working with, which in turn requires
> > discover. So in order to get rid of discover we either need to get rid of
> > xresprobe or slim it down substantially.  Additionally, it's a bunch of
> > code that really papers over actual deficiencies in the various drivers,
> > and our goal should be to get the drivers fixed rather than work around
> > them.
> 
> xresprobe has also suffered from a lack of maintenance attention.
> Fortunately I was able to get a bunch of the issues we knew about in
> Ubuntu fixed up for Gutsy.  The basic stuff in it duplicates code in
> X; the non-basic stuff papers over real issues, as you say.

Right, xresprobe is obviously neglected on our side as well. As it is, I
think the most popular drivers will work better without it in the randr 1.2
world, and a few of the older drivers will work more poorly. We'll have to
see what the fallout is.

> > How modesetting works in the pre-randr1.2 world:
> >    The server provides the necessary services to do a DDC probe. It doesn't
> > actually run the DDC normally though. Instead, it relies on the driver to
> > tell it to run a DDC probe, and it passes the information back to the
> > driver as a monitor structure. Initially, the server will pass the driver
> > all the standard VGA modes and settings so the driver will always have
> > some sort of modes to deal with. Most drivers will tell the server to do a
> > DDC probe normally, so they don't have to rely only on the standard modes.
> 
> There's one additional twist with EDID: in certain situations
> xresprobe experiences EDID read failures.  In these cases,
> read-edid seems to fail as well, and I think the problem is also
> affecting X's EDID reading.  Some simply chalk it up to lying hardware,
> but as this seems to be a relatively common issue, I think it'd be worth
> getting a deeper understanding.  In reading the EDID spec, I see it's
> gone through a number of revisions (1.0, 1.1, 1.2, 1.3, and 2.0), and
> has an extension mechanism, and I suspect the failures may be due
> to either the code lacking support for one of the versions, or getting
> messed up by extensions or something.  Or perhaps the monitor is failing
> to implement the EDID protocol correctly.  Or maybe the graphics card is
> corrupting the read somehow.  Monitor cable adapters, KVMs, and the like
> have also been known to louse up the EDID info.
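
For illustration, the failure causes listed above (unsupported revision,
extension blocks, corrupted reads) can be told apart with a few checks on
the raw 128-byte EDID 1.x base block. This is only a rough Python sketch,
not what xresprobe, read-edid, or the X server actually do. The function
names are made up for the example; the byte offsets are those of the
EDID 1.x base block layout.

```python
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
EDID_BLOCK_SIZE = 128


def check_edid_block(block):
    """Return a list of problems found in an EDID 1.x base block.

    Hypothetical helper for this example only.
    """
    if len(block) != EDID_BLOCK_SIZE:
        return ["expected 128 bytes, got %d" % len(block)]
    problems = []
    # The first 8 bytes are a fixed pattern; a mismatch suggests a
    # truncated or shifted read.
    if block[:8] != EDID_HEADER:
        problems.append("bad header")
    # All 128 bytes must sum to 0 modulo 256; a mismatch suggests the
    # read was corrupted somewhere between monitor and card.
    if sum(block) % 256 != 0:
        problems.append("bad checksum")
    return problems


def describe_edid_block(block):
    """Report the EDID version.revision and the number of extension blocks."""
    return {
        "version": "%d.%d" % (block[18], block[19]),  # bytes 18 and 19
        "extensions": block[126],                     # extension block count
    }
```

Logging the output of both functions for a failing monitor would at least
show whether the read itself is bad or whether the block is valid but uses
a revision or extension the parsing code does not expect.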

Yeah, there are a few problems that I'm aware of. The EDID revision thing is
a new one to me. I know about the KVM issue, but xresprobe would fail that
one too. One issue is that sometimes the monitor will fail to do a proper
DDC on one occasion, but succeed on another. What I'm planning to do to
address this if it's really an issue is to have the server cache the EDID
block somewhere on disk, and use the cache when it can't get a DDC. As it
is, I don't know if this is really an issue because having those
resolutions written to xorg.conf acts as a sort of non-dynamic cache.
Hopefully we can bug upstream about other issues.
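
The caching idea above could look roughly like the following. This is a
Python sketch for illustration only, not server code: `probe` is a
hypothetical callable standing in for the DDC read (returning raw EDID
bytes, or None on failure), and the cache path is whatever the caller
chooses.

```python
import os


def get_edid(probe, cache_path):
    """Return EDID bytes from a live probe, falling back to a disk cache.

    `probe` is a hypothetical stand-in for the DDC read: it returns the
    raw EDID bytes, or None when the read fails.
    """
    block = probe()
    if block is not None:
        # Refresh the cache on every successful read. Write to a
        # temporary file first so a crash cannot leave a half-written
        # cache behind.
        tmp_path = cache_path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(block)
        os.replace(tmp_path, cache_path)
        return block
    # The probe failed: use the last known-good block, if there is one.
    try:
        with open(cache_path, "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None
```

A real version would also have to decide when a cached block is stale, for
example after the monitor has been swapped, which this sketch ignores.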

> Whatever the case is, these are failures that xresprobe and postinst would
> catch and try to handle (not always with good results, but better than
> nothing).  Xorg may be able to handle some of these cases better, but
> it's an area we'll need to keep a strong eye on.

Yes, this makes sense, although I think that the server should recover
fairly well. If there's no EDID then the driver will just get the standard
VGA modes, which should at least get the display up and running. I'm not
sure what the postinst or xresprobe would do that's any better than that.
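
As a toy model of that fallback (Python, purely illustrative): the driver
always starts from the standard modes, and a successful DDC probe only
adds to them. The mode list and the `ddc_probe` callable are assumptions
made for the example, not the server's real data or API.

```python
# Illustrative stand-in for the "standard VGA modes"; the real list
# lives in the server and is not reproduced here.
STANDARD_MODES = [(640, 480), (800, 600), (1024, 768)]


def available_modes(driver_requests_ddc, ddc_probe):
    """Return the modes a pre-randr1.2 driver would have to work with.

    `ddc_probe` is a hypothetical callable returning a list of
    (width, height) tuples, or None when no EDID could be read.
    """
    # The server always hands over the standard modes first.
    modes = list(STANDARD_MODES)
    if driver_requests_ddc:
        monitor_modes = ddc_probe()
        # No EDID: the driver is left with the standard modes only.
        for mode in monitor_modes or []:
            if mode not in modes:
                modes.append(mode)
    return modes
```

Either a driver that never asks for a probe or a probe that fails ends in
the same place: the standard list, which is enough to bring the display up.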

> >    Another potential issue is that some monitors will lie. For drivers that
> > are randr1.2, we can use the quirks to work around it, but for those that
> > aren't we're still screwed. However, given that xresprobe uses the exact
> > same information as the driver itself I fail to see how xresprobe would be
> > of any help here.
> 
> Right; if we need workarounds for these cases, there's no reason those
> workarounds shouldn't be pushed into xorg itself.

That's the goal. I've put large chunks of the postinst and dexconf into
the server already, but I think most of that work is finished. I need to
get my kernel-style driver loading patch in upstream, but that's the last
major thing I see.

> >    Finally, there's the issue of unmaintained buggy drivers. I don't have a
> > good solution to this problem aside from simply fixing the bugs and
> > exhorting our users to help us out.
> 
> With how fast core X has moved with major architectural changes, this
> seems to be becoming a common issue with the less well maintained
> drivers.  This particularly hurts embedded and educational users, who
> seem to be where these drivers show up most.  They also tend to be
> among groups that lack the technical resources to participate in the
> driver work, so the issues end up
> falling on the packager's desk.  If the issues are just minor bugs,
> those are probably fixable, however stuff like xrandr, pci rework, and
> the like seem like non-trivial efforts, and hard to do without access to
> the hardware (and sometimes hard even when you do have it).
> 
> I'm concerned about this because the way things are scaling, this is
> going to become a larger and larger sore spot for all of us.  Maybe we
> need to start (threatening to?) deprecate extremely out of date drivers,
> or become more proactive at recruiting driver maintainers, or even just
> publish task lists for them?

Yeah, pci-rework will be the first real breaking point for the drivers.
Previously they all worked fine with the server updates. I could see SuSE
or Red Hat porting a bunch of the legacy drivers so as to not piss off their
customers, but we may not want to wait around that long. Porting over to
pci-rework isn't too terribly difficult[0], but it does need to be done if
these drivers are going to survive. Otherwise we'll have to simply EOL the
drivers and tell people to use old versions or buy new hardware :-\

> > I'll be ripping out the code from the postinst and running it through local
> > tests over the next few days. Please let me know what you guys think, if
> > I've missed anything, and if you have concerns or questions that we should
> > address before doing this. This is a very complicated problem, so please do
> > think about this a little and speak up if you think of anything at all.
> 
> Great, I'll be happy to help test on the hardware I have, and send
> patches where I can.

Cool, thank you!

 - David Nusinow

[0] http://wiki.x.org/wiki/PciReworkHowto


