Mass Webhosting and Webalizer
This is a tutorial and a request for comments. Questions are welcome. Recently I was presented with the issue of providing webalizer log analysis for virtual hosts. It needed to be capable of being automated, and because webalizer's reports are highly configurable, I also wanted customers to have limited control over their own configuration.
And I'm a huge fan of simplicity. So here are the issues that need to be addressed:

1) It needs to be scripted and automated.
2) The customer needs a means to access the configuration file.
3) Critical parameters that affect how the analysis is run must not be configurable by the customer.

I'm not going to walk through a lengthy explanation of why I chose this solution. Instead I'll present a simple solution to the above issues, which I would appreciate comments on.

File system organization

Log file: /home/<user>/log/<site>.log
Output directory for analysis: /var/www/webalizer/<user>/<site>/
Default configuration: /etc/webalizer.conf
Site-specific configuration (accessible by customer): /home/<user>/webalizer/<site>.conf
Master configuration overrides: /var/www/webalizer/webalizer.conf

Webserver configuration

You should output all access logging to either a single logfile or a pipe. This output should be virtual host aware. Implementations of this vary and are beyond the scope of this discussion. Ultimately you need a separate logfile for each virtual host in combined format. In addition, each webalizer-enabled virtual host needs an alias in the following format. Be sure that the "/var/www/webalizer" directory has "allow" access in httpd.conf.

    Alias /webalizer/ /var/www/webalizer/<user>/<site>/

Webalizer configuration

When webalizer starts up it loads the default configuration file mentioned above, then begins processing the commandline. The commandline may load additional configuration files that override the behavior of the default configuration. Once the configuration file(s) and commandline have been processed, analysis begins.

Webalizer commandline

After the default configuration file has been loaded, we specify on the commandline to load the user's configuration file. This way customers may change all the report parameters as they see fit. Immediately after loading their configuration file, we load the master configuration file.
This file sets only the parameters you don't want the user to set. An example master file has been attached. This strategy makes for a simple, powerful, and secure configuration system that doesn't involve the complication of a restrictive web-based webalizer control panel. Of course, one may still be provided for convenience.

The other commandline parameters are preferably for dynamic information only. They also effectively override any related parameters specified by the default, user, and master configuration files. As seen below, I set the hostname, output directory, and logfile. For debugging purposes you may redirect stdout and stderr to separate files in the current directory (which should be the output directory). Whether or not this information is written in the first place should be set in the master configuration file.

    webalizer -c <user> -c <master> -n <hostname> -o <outputdir> <logfile> 1>>webalizer.out 2>>webalizer.err

Automation

Cron is the obvious choice for automation. With webalizer's incremental processing, you can have it run several times a day with little drawback. The question is, will cron execute webalizer directly or through an intermediate script? Personally, I don't trust cron and am not very familiar with it. There are a lot of considerations in how webalizer is executed. First of all, webalizer should never be executed as root. It should be executed as either apache or the user responsible for the site whose log needs to be analyzed. Second of all, I prefer that webalizer be executed with the current directory set to the output directory. Last of all, for performance reasons I do not want all webalizer tasks to be executed asynchronously, and I don't know whether cron forks off each task asynchronously or not. So barring a lot of research into cron mechanics and configuration, I took the easy way out: writing an intermediate program in perl.
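As a rough illustration of the commandline described above, here is how an intermediate program might assemble the argument list for one site. This is a sketch in Python, not the attached perl script. The function names and the dictionary keys are made up for the example; the paths and the -c, -n, and -o options are the ones listed in this post.

```python
def site_paths(user, site):
    """Return the per-site paths from the file system layout in this post."""
    return {
        "logfile": f"/home/{user}/log/{site}.log",
        "output_dir": f"/var/www/webalizer/{user}/{site}/",
        "user_conf": f"/home/{user}/webalizer/{site}.conf",
        "master_conf": "/var/www/webalizer/webalizer.conf",
    }


def build_webalizer_argv(user_conf, master_conf, hostname, output_dir, logfile):
    """Build the webalizer argument list in the order the post describes."""
    return [
        "webalizer",
        "-c", user_conf,    # customer's file is loaded first
        "-c", master_conf,  # master file is loaded second, so its settings win
        "-n", hostname,     # dynamic values go last on the commandline
        "-o", output_dir,
        logfile,
    ]
```

The only thing that matters here is the order: the customer's file comes before the master file, and the per-site values come after both. For a hypothetical user "alice" with site "example.com", you would pass the values from site_paths("alice", "example.com") plus the site's hostname.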
Scripting

I'm not even going to discuss how to script the addition and removal of site-specific webalizer tasks with cron. You're on your own if you take that route. Instead, we'll talk about one possible way that the intermediary script can be written. I've attached such a script written in perl. Of course, it does not have to be written in perl. It's just a simple implementation of the demands I listed above. It is executed as root. When it forks a child, the child process setuid's to the username specified in the configuration and chdir's to the output directory. These properties, including the umask, are inherited by webalizer when it is executed. Meanwhile the parent waits for the child to exit before executing webalizer against another logfile.
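For anyone who wants to see the shape of that fork/setuid/chdir/wait sequence without opening the attachment, here is a minimal sketch. It is written in Python rather than perl and is not the attached script. The function names, the job tuple layout, and the umask value are assumptions for the example. It takes a numeric uid and gid, so looking up the configured username is left to the caller.

```python
import os


def run_in_output_dir(uid, gid, output_dir, argv):
    """Fork; the child drops to uid/gid, chdir's, and execs argv.

    Returns the child's exit code, or a negative signal number.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: any failure below ends in os._exit(127).
        try:
            if os.getuid() != uid or os.getgid() != gid:
                # Drop groups and gid before uid. After the uid is
                # dropped the process can no longer change its groups.
                os.setgroups([gid])
                os.setgid(gid)
                os.setuid(uid)
            os.umask(0o022)  # assumed value; the post does not give one
            os.chdir(output_dir)
            # Same effect as "1>>webalizer.out 2>>webalizer.err".
            flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
            os.dup2(os.open("webalizer.out", flags, 0o644), 1)
            os.dup2(os.open("webalizer.err", flags, 0o644), 2)
            os.execvp(argv[0], argv)  # replaces the child on success
        except OSError as exc:
            os.write(2, f"child failed: {exc}\n".encode())
        os._exit(127)
    # Parent: block until this child has exited.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def run_all(jobs):
    """Run jobs one at a time; each job is (uid, gid, output_dir, argv)."""
    return [run_in_output_dir(*job) for job in jobs]
```

Because the parent calls waitpid before starting the next job, only one webalizer process runs at a time. The uid, gid, working directory, umask, and redirected output are all set in the child before the exec, so webalizer inherits them.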
Last edited by SilentRage : February 7th, 2005 at 02:51 PM.