
Re: Virtual Hosting and the FHS



> > Why not use vhost_alias_module in Apache and something like the
> > following:  VirtualDocumentRoot /home/www/%-1/%-2/%-3/%-4+
> 
> because that's not as flexible as my system. it's fine if you want
> all your vhosts exactly the same, but it doesn't allow for individual
> variation.

 Absolutely true. Nevertheless, you can get some interesting results by
playing with .htaccess and AllowOverride.
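
 For example, something along these lines (the paths and the exact override
classes are just for illustration) keeps the mass-hosting layout but still
lets each vhost adjust its own behaviour from an .htaccess file in its
document root:

    # httpd.conf fragment -- mass vhosting with per-vhost .htaccess variation
    UseCanonicalName Off
    VirtualDocumentRoot /home/www/%-1/%-2/%-3/%-4+

    <Directory /home/www>
        Options FollowSymLinks
        # which directives a vhost's .htaccess may override
        AllowOverride FileInfo Indexes AuthConfig Limit
    </Directory>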

> also because my virtual-hosts.conf file is a central configuration file
> for everything to do with virtual hosts, not just apache - generating
> apache config fragments, htdig configuration, nightly log processing,
> weekly linbot runs, etc etc.

 That's always good :-)

> any script i need to write can just open the virtual-hosts.conf file
> and parse it (it's a single line, colon-delimited format) to find out
> everything it needs to know about every virtual host.

 I used to do it that way and then I discovered something called a database.
It makes it a lot easier to delete an entry and prevent duplicates.
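
 Just as an illustration of what I mean (the table layout and names here are
made up, and SQLite is only a stand-in for whatever database you prefer): a
UNIQUE constraint is what stops duplicates, and deleting an entry is one
statement instead of rewriting a flat file.

    import sqlite3

    db = sqlite3.connect("vhosts.db")
    db.execute("""CREATE TABLE IF NOT EXISTS vhosts (
                      name    TEXT UNIQUE,    -- duplicate vhosts are rejected outright
                      owner   TEXT,
                      docroot TEXT)""")

    def add_vhost(name, owner, docroot):
        # raises sqlite3.IntegrityError if the vhost already exists
        db.execute("INSERT INTO vhosts VALUES (?, ?, ?)", (name, owner, docroot))
        db.commit()

    def delete_vhost(name):
        db.execute("DELETE FROM vhosts WHERE name = ?", (name,))
        db.commit()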

[...]

> > Then for the logging you can have the following at the start of the
> > Apache config:
> >
> > LogFormat "%V %h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\" %T"

[...]

> i'll look into that.

 Sooner than you think, you will have to.

> i need to split up the log files so that each virtual domain can
> download their raw access logs at any time. having separate error log
> files is necessary for debugging scripts too (and preserving privacy -
> don't want user A having access to user B's error logs).

 I strongly suggest you invest some time looking into a way to put the
access log into a database. Something like
http://freshmeat.net/projects/apachedb/.
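
 I don't know how apachedb does it internally, so here is just a rough sketch
of the idea in Python (the database layout and file names are made up). It
reads lines in the LogFormat quoted above, with %V first, and inserts them
into a table you can then query on the fly:

    import sys, sqlite3

    db = sqlite3.connect("access.db")
    db.execute("""CREATE TABLE IF NOT EXISTS access
                  (vhost TEXT, client TEXT, stamp TEXT,
                   request TEXT, status INTEGER, bytes INTEGER)""")

    # expects lines like:  %V %h %l %u %t "%r" %s %b "..." "..." %T
    for line in sys.stdin:
        try:
            head, request, tail = line.split('"', 2)
            vhost, client, ident, user, stamp = head.split(None, 4)
            status, nbytes = tail.split()[:2]
            if nbytes == "-":            # %b logs "-" when no body was sent
                nbytes = 0
            db.execute("INSERT INTO access VALUES (?, ?, ?, ?, ?, ?)",
                       (vhost, client, stamp.strip(" []\n"), request,
                        int(status), int(nbytes)))
        except ValueError:
            continue                     # malformed line, skip it
    db.commit()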

 My research showed that web hosting customers don't look at their stats
every day. Even if they did, your stats are only generated once a day anyway.
Having the logs in a database lets you generate the stats on the fly, and
with a simple cache that keeps the generated stats until midnight, you can
save yourself a lot of machine power.
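
 That cache can be as dumb as a file whose modification date you check;
something like this (generate_stats_from_db is a made-up name standing in
for the on-the-fly query against the log database):

    import os, time

    def stats_page(vhost, cache_dir="/var/cache/vhost-stats"):
        cache = os.path.join(cache_dir, vhost + ".html")
        today = time.strftime("%Y-%m-%d")
        if os.path.exists(cache) and \
           time.strftime("%Y-%m-%d",
                         time.localtime(os.path.getmtime(cache))) == today:
            return open(cache).read()            # already built today, reuse it
        html = generate_stats_from_db(vhost)     # hypothetical on-the-fly query
        open(cache, "w").write(html)             # valid until midnight rolls the date
        return html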

 I used to have one log file for all vhosts. Not many, just over 15,000. It
took my stats server 12 hours every day to split that log file and generate
the stats for all the vhosts. That is not acceptable in most cases. If you
host your customers on a distributed architecture (one server with 3,000
domains or so, another server with the next 3,000, and so on) and generate
the stats on the same server where the customer is hosted (the way Verio or
ValueWeb do it), it's another story.

> the only trouble is that means at least 2 log files open per vhost per
> apache process...on one of my machines, that means 344 log files open
> per process, * 50 processes (average) = 17,200 log files open.

 Read http://httpd.apache.org/docs/vhosts/fd-limits.html

> that obviously is not very scalable.

 That's a nice way to put it. Another way to put it would be "it's not gonna
work".
 
> i have figured out how to have just one log file open per httpd - a
> named pipe to a splitter script, which writes to the real log files.

 Sounds good.
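
 For what it's worth, the splitter itself can be very small. This sketch
(assuming the %V-prefixed LogFormat above and a made-up log directory) reads
from stdin, so you can feed it either through your named pipe or through
Apache's own piped-log syntax (a CustomLog line starting with "|"):

    import sys

    logs = {}        # one open file per vhost, but only in this one process
    for line in sys.stdin:
        parts = line.split(" ", 1)
        if len(parts) != 2:
            continue
        vhost, rest = parts
        vhost = vhost.replace("/", "_")          # keep hostnames inside the log dir
        if vhost not in logs:
            logs[vhost] = open("/var/log/apache/vhosts/%s-access.log" % vhost, "a")
        logs[vhost].write(rest)
        logs[vhost].flush()

 The splitter still ends up holding one descriptor per vhost, but only in
that single process instead of in every httpd child.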
 
	Haim.


