[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#382390: lpq -Pall fails for many printers when :filter is used



Package: lprng
Version: 3.8.28-2

(I suppose this is an upstream bug, but I havent't found an upstream
place to report this. That's why I use Debian's Bug Tracking System to
document it.)

I tried to trace a problem where "lpq -Pall" stops output before it
it has listed all printers. I made a trimmed-down way to reproduce the
problem:

Create the printcap - run mkprintcap >/etc/printcap

---mkprintcap---
#!/bin/sh

cat <<EOF
.default
# spool directory
        :sd=/var/spool/lpd/%P
# (no idea what needs translate_format)
        :translate_format=
# maximum job size: unlimited
        :mx=0
# suppress headers and/or banner page
        :sh
# maximum copies allowed
        :mc=1
# preserving already printed jobs: no more than ...
        :done_jobs=5
# ... and no longer than ...
#       :done_jobs_max_age=7200
        :done_jobs_max_age=43200
# stop when less than 10 GB disk space are free
        :minfree=10000000
# time between jobs to allow printer recovery
        :connect_grace=10
# time after which a job is reported as stalled
        :stalled_time=600

EOF

for i in $(seq 1 10000)
do
        echo p$i
        echo "  :tc=.default"
        echo "  :lp=h$i%9100"
        echo "  :filter=-$ /usr/local/bin/printer_filter gs hp2k3d"
        echo
done
---end-of-mkprintcap

This creates a printcap with a default template (see above script) and
10000 numbered printers:


p1
        :tc=.default
        :lp=h1%9100
	:filter=$ /usr/local/bin/printer_filter gs hp2k3d

p2
...


Then make an executable /usr/local/bin/printer_filter:

  ln -s /bin/true /usr/local/bin/printer_filter

This is only needed to make checkpc -f happy.


Now run checkpc -f, this creates the queue directories.


Stop any existing lpd and run a new lpd in foreground:

  lpd -F 

In a second shell run some lpq, like:

  lpq -v -Pall |grep "^Printer: "| wc -l 

These outputs result:

  # lpd -F 
2006-08-10-17:49:13.824 printserv p1019: Make_temp_fd_in_dir: cannot create tempfile '/var/spool/lpd/p1019/temp00zS8ve8'
  #


  $ lpq -v -Pall |grep "^Printer: "| wc -l 
  1018


I tried some variations in order to learn more about the problem. It
looks like it doesn't matter whether you use -v, -s or no option for
lpq. The problem completely vanishes when you omit :filter in
above printcap entries. 

lpd can give you some more debug info: 

  lpd -F -D 3 >lpd.log

The supposedly relevant output:

2006-08-10-18:07:29.110 printserv [9985] Job_status  p1019: Checkread: file 'lpq.1'
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: Checkread: cannot open 'lpq.1', Too many open files
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: Checkread: 'lpq.1' fd -1, size 0
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: Init_tempfile: temp file '/var/spool/lpd/p1019'
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: setstatus: msg 'p1019: Make_temp_fd_in_dir: cannot create tempfile '/var/spool/lpd/p1019/temp0
0sLselj''
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: Trim_status_file: file 'status.pr' max 10, min 0
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: Trim_status_file: 'status.pr' max 10, min 0, size 0
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: Checkwrite: file 'status.pr', rw 0, create 0, nodelay 0
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: Checkwrite: cannot open 'status.pr', Too many open files
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: Checkwrite: file 'status.pr' fd -1, inode 0x0, perms 00
2006-08-10-18:07:29.111 printserv [9985] Job_status  (INFO) p1019: Make_temp_fd_in_dir: cannot create tempfile '/var/spool/lpd/p1019/temp00sLselj'
2006-08-10-18:07:29.111 printserv [9985] Job_status  p1019: cleanup: signal No signal, Errorcode 32
2006-08-10-18:07:29.115 printserv [9985] Job_status  p1019: Killchildren: pid 9985, signal Interrupt, count 0


Oh, up to now I didn't see the "Too many open files"... When I raise
the limit with "ulimit -n 10000" it works up to printer p9995.

Looking in lpd's /proc/<pid>/fd reveals the files that eat the fd budget:

...
lrwx------ 1 root root 64 Aug 10 18:10 994 -> /var/spool/lpd/p990/lpq
lrwx------ 1 root root 64 Aug 10 18:10 995 -> /var/spool/lpd/p991/lpq
lrwx------ 1 root root 64 Aug 10 18:10 996 -> /var/spool/lpd/p992/lpq
lrwx------ 1 root root 64 Aug 10 18:10 997 -> /var/spool/lpd/p993/lpq
lrwx------ 1 root root 64 Aug 10 18:10 998 -> /var/spool/lpd/p994/lpq
lrwx------ 1 root root 64 Aug 10 18:10 999 -> /var/spool/lpd/p995/lpq

(In order to make lpd run slow enough use -D 3.)

So the workaround is to increase the fd budget with ulimit.

On the other hand this is the place where I don't know how to make a
real fix: I guess the lpq file should be closed before the one for the
next printer is opened, but I couldn't find these parts in the LPRng
source. And it is still magic why this does not happen when the
:filter statements are omitted. And sometimes after the first
invocation of a fresh lpd the problem does not appear. No idea why.

Hope this helps when someone runs into the same problem.

	Sven







Reply to: