
`mod_rewrite', zcat-html, and /usr/doc/pkg/file.htmlgz



 OK, here's what I've come up with so far; some assembly is required.  I
would like some feedback and advice.  Do you want me to try to
package this?  It is working very well here.

 I can access gzipped html files transparently now.  Links inside the
documents do not need to be changed by a script for this to happen
because Apache does the work.  The documents are unzipped by a handler
when requested, so the browser doesn't need to handle any special
content encoding.

 I've opted to use an .htaccess file in "/usr/doc", rather than
putting it all in a <Directory ...> section in "access.conf".  It
seemed more flexible this way, since I thought that it might prove to
be useful for some packages to install .htaccess files of their own in
"/usr/doc/pkg", to invoke special handling of certain files.  This
will work well for SPI documents that might contain links to things
that aren't installed locally.  A rewrite rule can redirect the
viewer's browser to the right place.
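
 As a rough illustration of the per-package idea, here is a sketch of
how a package's install script might drop such a file in place.  The
helper name write_pkg_htaccess, the "remote/" link prefix, and the
example.org URL are all made up for illustration; only RewriteEngine,
RewriteBase, and RewriteRule with the [R,L] flags are real mod_rewrite
directives.

```shell
#!/bin/bash
# Hypothetical helper: write a per-package .htaccess that redirects links
# under "remote/" to an external site.  Names and URL are placeholders.
write_pkg_htaccess() {
    local docdir=$1 pkg=$2 target=$3
    # Unquoted here-document: $pkg and $target expand, \$ stays a literal $.
    cat > "$docdir/$pkg/.htaccess" <<EOF
# Redirect links to documents that are not installed locally.
RewriteEngine on
RewriteBase /doc/$pkg/
RewriteRule ^remote/(.*)\$ $target/\$1 [R,L]
EOF
}

# Example call (placeholder package name and URL):
# write_pkg_htaccess /usr/doc somepkg http://www.example.org/docs
```

 One thing to check: as far as I know, rewrite rules in a
subdirectory's .htaccess replace the parent directory's rules rather
than adding to them, so a package file like this may also need the
gzip rules repeated, or a "RewriteOptions inherit" line.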

 Any other ideas, suggestions, or feedback?  Please post them to the
list.

 I can write a `script` that will recurse down the "/usr/doc" tree, to
do the following:

bash-2.0# cd /usr/doc
bash-2.0# du -chk $(find -name '*.html')
 [...lots of output...]
29756	total
bash-2.0# find -name '*.html' | xargs gzip -9
bash-2.0# for f in $(find -name '*.html.gz');do mv $f ${f%%html.gz}htmlgz; done
bash-2.0# du -chk $(find -name '*.htmlgz')
 [...lots of output...]
7782	total
!! Wow...  29756 / 7782 = 3.82, so that's almost 4x smaller.  I saved
21974 kilobytes of disk space.

Oops... it turns out that only a handful of files need to be put back, but:

bash-2.0# for f in $(find -name 'README.htmlgz'); \
  do mv $f ${f%%htmlgz}html.gz; \
  gunzip ${f%%htmlgz}html.gz; \
 done
bash-2.0# for f in $(find -name 'HEADER.htmlgz'); \
  do mv $f ${f%%htmlgz}html.gz; \
  gunzip ${f%%htmlgz}html.gz; \
 done
 Those files don't get rewritten properly.  Darn.  They should.
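
 A script to do the whole job in one pass might look like the
following sketch.  It assumes bash and GNU find, the function name
compress_html_tree is made up, and it skips README.html and
HEADER.html up front instead of undoing them afterwards.

```shell
#!/bin/bash
# Sketch: compress every *.html under a tree and rename it to *.htmlgz,
# leaving README.html and HEADER.html alone.
# compress_html_tree is a hypothetical name, not part of any package.
compress_html_tree() {
    local root=${1:-/usr/doc}
    # -print0 with read -d '' keeps file names containing spaces intact.
    find "$root" -type f -name '*.html' \
         ! -name 'README.html' ! -name 'HEADER.html' -print0 |
    while IFS= read -r -d '' f; do
        # gzip writes "$f.gz" and removes "$f"; rename only if that worked.
        gzip -9 -- "$f" && mv -- "$f.gz" "${f%.html}.htmlgz"
    done
}

# Example call:
# compress_html_tree /usr/doc
```

 Unlike the "for f in $(find ...)" loops above, this does not split
file names on whitespace.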

 I think there should be a hook in the package installer to do
compression automagically, at local option.  I believe this was
discussed.  Will it happen in Deity?


 I have Apache 1.2.1 configured to load `mod_rewrite' and `mod_actions',
with the following settings:

# ---- "/etc/apache/httpd.conf" ----
# ReWriteLog
RewriteLog "/var/log/apache/rewrite.log"
# Running...
RewriteLogLevel 0
# Debugging...
#RewriteLogLevel 9

# ---- "/etc/apache/access.conf" ----
<Directory /usr/doc>
## FollowSymLinks is required to make mod_rewrite work.
Options FollowSymLinks All
AllowOverride All
XBitHack full
order allow,deny
# This can be tweaked for systems that do not allow outside access.
allow from all
</Directory>

# ---- "/etc/apache/srm.conf" ----
# A new MIME type and a handler
AddType text/x-html-gzip htmlgz
AddHandler x-html-gzip htmlgz
Action x-html-gzip /cgi-bin/zcat-html

... and add index.htmlgz to the DirectoryIndex line.

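#!/bin/bash
# "/usr/lib/cgi-bin/zcat-html"
# This is a very simple-minded handler for text/x-html-gzip.
# The Action directive passes the requested file in PATH_TRANSLATED.
echo "Content-type: text/html"
echo
# Quote the path so that file names containing spaces still work.
zcat "$PATH_TRANSLATED"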
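
 To try a handler like this by hand, outside Apache, a smoke test
along these lines may help.  It is only a sketch: check_handler is a
made-up helper name, and the test file name deliberately contains a
space, since that is a common way for a shell handler to break.

```shell
#!/bin/bash
# Smoke test for a zcat-html style handler, run by hand outside Apache.
# check_handler is a hypothetical helper; pass it the handler's path.
check_handler() {
    local handler=$1 tmp status=0
    tmp=$(mktemp -d) || return 1
    printf '<html><body>test page</body></html>\n' |
        gzip -9 > "$tmp/test page.htmlgz"
    # The handler expects the file to serve in PATH_TRANSLATED.
    PATH_TRANSLATED="$tmp/test page.htmlgz" "$handler" > "$tmp/out" || status=1
    # The first line must be the CGI header; the body must be the page.
    head -n 1 "$tmp/out" | grep -qi '^Content-type: text/html' || status=1
    grep -q 'test page' "$tmp/out" || status=1
    rm -rf "$tmp"
    return "$status"
}

# Example call:
# check_handler /usr/lib/cgi-bin/zcat-html && echo "handler looks OK"
```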

# "/usr/doc/.htaccess"
#----------------------------------------------------------------------------
# Rewrite rules for handling .gz'd html files.
RewriteEngine on
RewriteBase /doc/

# If the file is here, then pass it on through unchanged.
RewriteCond 	%{REQUEST_FILENAME} 	-s
  RewriteRule	^.*	-	[L]

# Does it exist in gzipped form?
RewriteCond 	%{REQUEST_FILENAME}gz 	-s
  RewriteRule	^(.*)\.html$	$1.htmlgz
#----------------------------------------------------------------------------
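
 The decision those two rules make can be sketched as a shell
function.  This only imitates the logic for illustration and is not
how mod_rewrite works internally; resolve_doc is a made-up name, and
the pattern is approximated as "the name ends in .html".

```shell
#!/bin/bash
# Sketch of the rewrite decision: given the file a request maps to,
# print the file that would end up being served.
# resolve_doc is a hypothetical helper, for illustration only.
resolve_doc() {
    local file=$1
    if [ -s "$file" ]; then
        # Rule 1: the file exists and is not empty; pass it through.
        printf '%s\n' "$file"
    elif [ -s "${file}gz" ] && [[ $file == *.html ]]; then
        # Rule 2: foo.htmlgz exists, so foo.html becomes foo.htmlgz.
        printf '%s\n' "${file%.html}.htmlgz"
    else
        # Neither rule applies; the request goes through unchanged.
        printf '%s\n' "$file"
    fi
}

# Example call:
# resolve_doc /usr/doc/pkg/index.html
```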

mailto:karlheg+sig@inetarena.com (Karl M. Hegbloom)
http://www.inetarena.com/~karlheg
Portland, OR  USA
Debian GNU 1.3  Linux 2.1.36 AMD K5 PR-133
