Re: Virtual Hosting and the FHS
On Thu, Jul 12, 2001 at 10:00:57AM -0500, Haim Dimermanas wrote:
> > any script i need to write can just open the virtual-hosts.conf file
> > and parse it (it's a single line, colon-delimited format) to find
> > out everything it needs to know about every virtual host.
> I used to do it that way and then I discovered something called a
i've considered using postgres for this but am resisting it until the
advantages greatly outweigh the disadvantages.
why complicate a simple job with a database? plain text configuration is
perfect for a task of this size.
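
as a rough sketch only (the real virtual-hosts.conf layout and scripts aren't shown in this thread), parsing a one-line-per-host, colon-delimited file could look like this in python. the field names, the '#' comment convention, the sample data and both function names are assumptions made up for illustration. it also shows that spotting duplicate entries in a text file takes only a few lines.

```python
from collections import Counter

# hypothetical field order; the actual file format is not given in the thread
FIELDS = ("hostname", "owner", "docroot")


def parse_vhosts(text):
    """Return one dict per line, skipping blank lines and '#' comments."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        hosts.append(dict(zip(FIELDS, line.split(":"))))
    return hosts


def duplicate_hostnames(hosts):
    """Return the sorted hostnames that appear on more than one line."""
    counts = Counter(h["hostname"] for h in hosts)
    return sorted(name for name, n in counts.items() if n > 1)


# made-up sample data, including a commented-out host and a duplicate
SAMPLE = """\
# hostname:owner:docroot
www.example.com:alice:/var/www/example.com
www.example.org:bob:/var/www/example.org
#www.example.net:carol:/var/www/example.net
www.example.com:dave:/var/www/dupe
"""

hosts = parse_vhosts(SAMPLE)
print(len(hosts))
print(duplicate_hostnames(hosts))
```

the commented-out line is simply skipped, so disabling a host stays a one-keystroke edit in vi.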
it takes a lot longer to edit a database entry than it does to edit a
text file with vi.
i'd lose the ability to check in all changes to RCS if i used a database
instead of a text file.
to get these features, i'd have to write a wrapper script to dump the
config database to a text file, run vi, and then import the database
from the edited file. that still wouldn't get around the fact that you
can put comments in text files - you can't in databases.
in short: databases are appropriate for some tasks, but not all.
> It makes it a lot easier to delete an entry and prevent duplicates.
huh? it takes no time at all to run "vi virtual-hosts.conf" and comment
out or delete a line.
> > i need to split up the log files so that each virtual domain can
> > download their raw access logs at any time. having separate error
> > log files is necessary for debugging scripts too (and preserving
> > privacy - don't want user A having access to user B's error logs).
> I strongly suggest you invest some time looking into a
> way to put the access log into a database. Something like
i wrote my own code a year ago to store logs in postgres (mysql is a
toy). it had its uses but i decided it was a waste of disk space and
it made archiving old logs a pain. it greatly complicated the task of
allowing users to download their log files.
i went back to log files.
i'm a strong believer in the KISS principle, and see no need to add
unnecessary complication, especially for such little benefit.
> My research showed that web hosting customers don't look at their
> stats every day. Even if they did, your stats are generated
> daily. Having the logs in a database allows you to generate the stats
> on the fly. Now with a simple caching system that keeps the stats
> until midnight, you can save yourself a lot of machine power.
1. my customers want raw log files. the fact that i run webalizer
for them is a nice bonus, but what they insist on having is the raw
logs downloadable by ftp whenever they want (within a time limit -
we don't keep old logs forever). that's fine by me - stats are their
2. cpu usage is basically irrelevant on a machine which is I/O bound.
3. caching the stats pages defeats the purpose of generating them on the
4. generating stats on the fly is more expensive CPU and I/O wise than
running webalizer once/night and generating static html stats pages.
5. adding more boxes to the web farm is pretty easy with a properly
designed load-balancer system.
> > the only trouble is that means at least 2 log files open per vhost
> > per apache process...on one of my machines, that means 344 log files
> > open per process, * 50 processes (average) = 17,200 log files open.
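
the arithmetic behind that figure, as a quick sketch. the 172 vhosts is inferred from 344 / 2 and assumes exactly two log files per vhost (the text only says "at least 2"); the function name is made up for illustration.

```python
def open_log_files(vhosts, logs_per_vhost, processes):
    """Return (log files open per process, log files open across all processes)."""
    per_process = vhosts * logs_per_vhost
    return per_process, per_process * processes


# 172 vhosts is an inference from the quoted 344 files per process, not a stated number
per_process, total = open_log_files(vhosts=172, logs_per_vhost=2, processes=50)
print(per_process, total)
```

the total grows with vhosts times processes, which is why this works today but gets worse as either number climbs.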
> Read http://httpd.apache.org/docs/vhosts/fd-limits.html
i read it years ago. i'm fully aware of the issues regarding
> > that obviously is not very scalable.
> That's a nice way to put it. Another way to put it would be "it's not
> gonna work".
no. it does work. it's working right now, with that many log files open.
it's not scalable. looking at current growth patterns, i reckon i've got
a few months to come up with a long-term solution before it becomes a
craig sanders <firstname.lastname@example.org>
Fabricati Diem, PVNC.
-- motto of the Ankh-Morpork City Watch