`mod_rewrite', zcat-html, and /usr/doc/pkg/file.htmlgz
OK, here's what I've come up with so far; some assembly required. I
would like some feedback and advice. Do you want me to try to
package this? It is working very well here.
I can access gzipped html files transparently now. Links inside the
documents do not need to be changed by a script for this to happen
because Apache does the work. The documents are unzipped by a handler
when requested, so the browser doesn't need to handle any special
content encoding.
I've opted to use an .htaccess file in "/usr/doc", rather than
putting it all in a <Directory ...> section in "access.conf". It
seemed more flexible this way, since I thought that it might prove to
be useful for some packages to install .htaccess files of their own in
"/usr/doc/pkg", to invoke special handling of certain files. This
will work well for SPI documents that might contain links to things
that aren't installed locally. A rewrite rule can redirect the
viewer's browser to the right place.
Any other ideas, suggestions, or feedback? Please send them to the
list.
I can write a `script` that will recurse down the "/usr/doc" tree, to
do the following:
bash-2.0# cd /usr/doc
bash-2.0# du -chk $(find -name '*.html')
[...lots of output...]
29756 total
bash-2.0# find -name '*.html' | xargs gzip -9
bash-2.0# for f in $(find -name '*.html.gz');do mv $f ${f%%html.gz}htmlgz; done
bash-2.0# du -chk $(find -name '*.htmlgz')
[...lots of output...]
7782 total
Wow... 29756 / 7782 = 3.82, so that's almost 4x smaller. I saved
21974 kilobytes of disk space.
Oops... it turns out to affect only a handful of files, but the
README and HEADER files have to be uncompressed again:
bash-2.0# for f in $(find -name 'README.htmlgz'); \
do mv $f ${f%%htmlgz}html.gz; \
gunzip ${f%%htmlgz}html.gz; \
done
bash-2.0# for f in $(find -name 'HEADER.htmlgz'); \
do mv $f ${f%%htmlgz}html.gz; \
gunzip ${f%%htmlgz}html.gz; \
done
Those files don't get rewritten properly. Darn. They should.
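The recursive script proposed above might look roughly like the sketch below. It is not a finished tool: `compress_html_tree` is a made-up name, and it assumes that README.html and HEADER.html should simply be left uncompressed, as in the transcript above.

```shell
#!/bin/sh
# Sketch: compress every *.html under a documentation tree to *.htmlgz,
# skipping README.html and HEADER.html.
# compress_html_tree is a hypothetical name, not part of any package.
compress_html_tree() {
    # $1 is the tree to walk; it defaults to /usr/doc.
    find "${1:-/usr/doc}" -type f -name '*.html' \
        ! -name 'README.html' ! -name 'HEADER.html' \
        -exec sh -c '
            for f in "$@"; do
                # gzip -9 writes "$f.gz" and removes "$f";
                # the mv then gives it the .htmlgz suffix.
                gzip -9 "$f" && mv "$f.gz" "${f%.html}.htmlgz"
            done
        ' sh {} +
}

# Example (run against a copy first):
#   compress_html_tree /usr/doc
```

Passing the file names through `-exec ... {} +` instead of `$(find ...)` keeps names containing spaces in one piece, which the one-liners above would split.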
I think there should be a hook in the package installer to do
compression automagically, at local option. I believe this was
discussed. Will it happen in Deity?
I have Apache 1.2.1 configured to load `mod_rewrite', `mod_actions',
and:
# ---- "/etc/apache/httpd.conf" ----
# ReWriteLog
RewriteLog "/var/log/apache/rewrite.log"
# Running...
RewriteLogLevel 0
# Debugging...
#RewriteLogLevel 9
# ---- "/etc/apache/access.conf" ----
<Directory /usr/doc>
## FollowSymLinks is required to make mod_rewrite work.
Options FollowSymLinks All
AllowOverride All
XBitHack full
order allow,deny
# This can be tweaked for systems that do not allow outside access.
allow from all
</Directory>
# ---- "/etc/apache/srm.conf" ----
# A new MIME type and a handler
AddType text/x-html-gzip htmlgz
AddHandler x-html-gzip htmlgz
Action x-html-gzip /cgi-bin/zcat-html
... and add index.htmlgz to the DirectoryIndex line.
#!/bin/bash
# "/usr/lib/cgi-bin/zcat-html"
# This is a very simple minded handler for text/x-html-gzip
echo "Content-type: text/html"
echo
# Quote the path so file names containing spaces or globs survive.
zcat "$PATH_TRANSLATED"
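A slightly more defensive variant of the handler could look like the sketch below. It is only an illustration: `serve_htmlgz` is a hypothetical helper name, and the sketch assumes the server sets the standard CGI variables `GATEWAY_INTERFACE` and `PATH_TRANSLATED` when it runs the script.

```shell
#!/bin/sh
# Sketch of a handler for text/x-html-gzip that reports a missing file
# instead of sending an empty page.
# serve_htmlgz is a hypothetical helper name.
serve_htmlgz() {
    if [ ! -r "$1" ]; then
        # "Status:" is the CGI header for a non-200 response.
        printf 'Status: 404 Not Found\n'
        printf 'Content-type: text/plain\n\n'
        printf 'No such document.\n'
        return 0
    fi
    printf 'Content-type: text/html\n\n'
    # gzip -dc is what zcat does; "--" protects odd file names.
    gzip -dc -- "$1"
}

# Only act when the web server invoked us as a CGI script.
if [ -n "${GATEWAY_INTERFACE:-}" ]; then
    serve_htmlgz "$PATH_TRANSLATED"
fi
```

Keeping the work in a function makes it possible to try the handler from a shell prompt, for example `serve_htmlgz /usr/doc/pkg/file.htmlgz`, without going through Apache.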
# "/usr/doc/.htaccess"
#----------------------------------------------------------------------------
# Rewrite rules for handling .gz'd html files.
RewriteEngine on
RewriteBase /doc/
# If the file is here, then pass it on through unchanged.
RewriteCond %{REQUEST_FILENAME} -s
RewriteRule ^.* - [L]
# Does it exist in gzipped form?
RewriteCond %{REQUEST_FILENAME}gz -s
RewriteRule ^(.*)\.html$ $1.htmlgz
#----------------------------------------------------------------------------
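For anyone who wants to check the logic without turning on RewriteLogLevel 9, the two rules can be paraphrased in shell. This is only an approximation under stated assumptions: `resolve_doc` is a hypothetical name, it works on file system paths rather than URLs, it only considers requests ending in ".html", and the shell's `-s` test is slightly looser than mod_rewrite's (it is also true for directories).

```shell
#!/bin/sh
# Sketch: mimic the two rewrite rules for a path on disk.
# resolve_doc is a hypothetical name used only for illustration.
resolve_doc() {
    # Rule 1: the requested file exists and is non-empty; pass it through.
    if [ -s "$1" ]; then
        printf '%s\n' "$1"
        return 0
    fi
    # Rule 2: a request for foo.html is served from foo.htmlgz if present.
    case $1 in
        *.html)
            if [ -s "${1}gz" ]; then
                printf '%s\n' "${1}gz"
                return 0
            fi
            ;;
    esac
    # Neither form exists; Apache would answer with its usual 404.
    return 1
}
```

Running `resolve_doc /usr/doc/pkg/file.html` prints the path that would end up being served, or returns non-zero if neither form exists.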
mailto:karlheg+sig@inetarena.com (Karl M. Hegbloom)
http://www.inetarena.com/~karlheg
Portland, OR USA
Debian GNU 1.3 Linux 2.1.36 AMD K5 PR-133