
Bug#424068: cupsys: implicit classes, server failure, and remote.cache



Package: cupsys
Version: 1.2.7-4

An innovation of CUPS 1.2 is Remote Printer Caching: cupsd saves the
list of remote printers to /var/cache/cups/remote.cache on exit, and
reloads it on start-up.

I became keenly aware of this feature after an incident in which
remote.cache perpetuated an undesirable queue state across cupsd
restarts. I ended up having to run

  invoke-rc.d cupsys stop; rm /var/cache/cups/remote.cache; invoke-rc.d cupsys start

on all my etch hosts to restore sanity.
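
For anyone who needs to repeat this on several hosts, the same three
steps can be wrapped in a small function. This is only a sketch: the
cache path and the init script name are the ones given above, while the
CACHE, SERVICE and DRY_RUN variables and the function names are
hypothetical conveniences, not anything CUPS provides.

```shell
#!/bin/sh
# Sketch: clear the CUPS remote printer cache around a cupsd restart.
# CACHE and SERVICE default to the path and init script named in this report;
# DRY_RUN is a hypothetical switch added for this sketch only.
CACHE="${CACHE:-/var/cache/cups/remote.cache}"
SERVICE="${SERVICE:-cupsys}"

run() {
    # Print the command; execute it only when DRY_RUN is not 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

clear_remote_cache() {
    # Stop cupsd first so it cannot rewrite the cache on exit after we remove it.
    run invoke-rc.d "$SERVICE" stop || return 1
    run rm -f "$CACHE" || return 1
    run invoke-rc.d "$SERVICE" start
}
```

Calling "DRY_RUN=1 clear_remote_cache" only prints the three commands;
calling "clear_remote_cache" as root runs them.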

I have two IPP servers on my network (let's call them ipp1 and ipp2), 
advertising exactly the same queues to clients. My users and I rely
on the implicit class mechanism, so "lp -d queue" will send the job
to either queue@ipp1 or queue@ipp2 without the user having to be aware
of the details. These servers happen to still be running Debian sarge
(CUPS 1.1), but I don't think that's too important for the issue at
hand. The clients have a couple of BrowsePoll lines in their cupsd.conf
files, pointing to ipp1 and ipp2.

A few days ago, ipp2 crashed (hardware failure, unrelated to CUPS).
The Debian "etch" clients reacted by dropping the implicit classes and
showing only the queue@ipp1 queues. This annoyed my users, who found
that the usual queue names no longer worked, so the loss of the
implicit classes should itself be regarded as a bug. (I'll probably add
a third IPP server to keep this from happening again.)

After ipp2 was restarted, those clients started showing *both* the
unqualified queue names and the queue@ipp1 names. Unfortunately for
redundancy, the unqualified queue names were no longer implicit classes,
but pointed to ipp2 only.

Restarting cupsd on these clients didn't cure the problem, which is how
I found out (somewhat laboriously, by reading the source code) about 
the remote.cache feature. 

There is probably room for argument about what exactly needs to be fixed
and how. Tentatively, I'll suggest that:

1) something went wrong when ipp2 was restarted and its queues were
rediscovered by the clients: they should have received names with an
explicit @ipp2 suffix, keeping the unqualified names for the implicit
classes;
2) implicit classes shouldn't disappear when all but one of the
instances go offline, as they now do;
3) the existence of the remote.cache file needs to be documented
more prominently, so that administrators will know to check it (and
perhaps remove it, as I've done) when troubleshooting.
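
As an aid to point 3, here is a rough sketch of how an administrator
might list what the cache currently holds before deciding to remove it.
It assumes that entries in remote.cache open with lines of the form
"<Printer name>" or "<Class name>", in the style of printers.conf; I
have not checked this against every CUPS version, so verify it against
your own file. The function name is made up for this example.

```shell
#!/bin/sh
# Sketch: list the queue entries recorded in a remote.cache file.
# Assumption: each entry opens with "<Printer name>", "<Class name>",
# "<DefaultPrinter name>" or "<DefaultClass name>".
list_cached_queues() {
    file="${1:-/var/cache/cups/remote.cache}"
    # Print "Type name" for each opening tag; closing tags begin with "</"
    # and therefore never match.
    sed -nE 's/^<(Default)?(Printer|Class) (.*)>$/\2 \3/p' "$file"
}
```

Run as "list_cached_queues" (the file is normally readable by root
only), this prints one line per cached entry, which makes it easy to
see whether an unqualified queue name has been cached alongside the
queue@host names.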



