Development Software
 
February 7th, 2005, 02:48 PM
SilentRage
Mass Webhosting and Webalizer

This is a tutorial and a request for comments. Questions are welcome. Recently I was presented with the issue of providing webalizer log analysis for virtual hosts. It needed to be capable of being automated, and because webalizer's reports are highly configurable, I also wanted customers to be able to customize their own config within limits.

And I'm a huge fan of simplicity. So here are the issues that need to be addressed:

1) It needs to be scripted and automated.
2) The customer needs the means to access the configuration file.
3) Critical parameters that affect how the analysis is run must not be configurable by the customer.

I'm not going to go into a lengthy logical process of why this is the solution I chose, but rather present a simple solution to the above issues, which I would appreciate comments on.

File system organization:

Log file
/home/<user>/log/<site>.log

Output directory for analysis
/var/www/webalizer/<user>/<site>/

Default configuration
/etc/webalizer.conf

Site specific configuration (accessible by customer)
/home/<user>/webalizer/<site>.conf

Master configuration overrides
/var/www/webalizer/webalizer.conf
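
To make the layout above concrete, here is a minimal Python sketch that derives the per-site paths from a user and a site name. The function name `site_paths` and the name check are my own assumptions for illustration; only the paths themselves come from the layout above.

```python
from pathlib import PurePosixPath

MASTER_CONF = PurePosixPath("/var/www/webalizer/webalizer.conf")


def site_paths(user, site):
    """Return the per-site paths from the layout above (hypothetical helper)."""
    for name in (user, site):
        # Assumption: refuse names that could escape the intended directories.
        if "/" in name or name in ("", ".", ".."):
            raise ValueError(f"unsafe name: {name!r}")
    home = PurePosixPath("/home") / user
    return {
        "logfile": home / "log" / f"{site}.log",
        "outputdir": PurePosixPath("/var/www/webalizer") / user / site,
        "user_conf": home / "webalizer" / f"{site}.conf",
        "master_conf": MASTER_CONF,
    }
```

Keeping the derivation in one place means the webserver alias, the config paths, and the script that runs webalizer all agree on where things live.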

Webserver configuration

You should output all access logging to either a single logfile or a pipe. This output should be virtual host aware. Implementations of this vary and are beyond the scope of this discussion. Ultimately you need a separate logfile, in combined format, for each virtual host.

In addition, each webalizer enabled virtual host needs an alias in the following format. Be sure that the "/var/www/webalizer" directory has "allow" access in httpd.conf.
Alias /webalizer/ /var/www/webalizer/<user>/<site>/

Webalizer configuration

When webalizer starts up, it loads the default configuration file mentioned above and then begins processing the commandline. The commandline may load additional configuration files that override the behavior of the default configuration. Once the configuration file(s) and commandline have been processed, analysis begins.

Webalizer commandline

After the default configuration file has been loaded, we specify on the commandline to load the user's configuration file. This way customers may change any of the report parameters as they like. Immediately after loading their configuration file, we load the master configuration file. This file sets only the parameters you don't want the user to control, and because it is loaded last, its values win. An example master file has been attached. This strategy makes for a simple, powerful, and secure configuration system that doesn't involve the complication of a restrictive web-based webalizer control panel. Of course, one may still be provided for convenience.
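
The "last file loaded wins" idea can be sketched in a few lines of Python. This is only a simplified model under my own assumptions, not how webalizer parses its files: it treats every keyword as a single value, whereas some real webalizer keywords may be given more than once. The values below are made up for illustration.

```python
def effective_config(*layers):
    """Merge config layers in load order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical values, in the order webalizer would load the files.
default_conf = {"Incremental": "no", "TopSites": "30"}
user_conf = {"Incremental": "no", "TopSites": "50"}  # customer's choices
master_conf = {"Incremental": "yes"}                 # admin's locked setting

result = effective_config(default_conf, user_conf, master_conf)
```

Here the customer's `TopSites` survives, while their attempt to turn off `Incremental` is overridden by the master file.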

The other commandline parameters are preferably for dynamic information only. They also override any related parameters specified by the default, user, and master configuration files. As seen below, I set the hostname, output directory, and logfile. For debugging purposes you may redirect stdout and stderr to separate files in the current directory (which should be made to be the output directory). Whether or not this information is written in the first place should be set in the master configuration file.

webalizer -c <user> -c <master> -n <hostname> -o <outputdir> <logfile> 1>>webalizer.out 2>>webalizer.err
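
If the command is launched from a script rather than typed by hand, building it as an argument list avoids shell quoting problems with odd site names. A minimal Python sketch follows; the function name `webalizer_argv` is my own, and the flags are only the ones used in the command above.

```python
def webalizer_argv(user_conf, master_conf, hostname, outputdir, logfile):
    """Build the argument list for the command above (hypothetical helper)."""
    return [
        "webalizer",
        "-c", str(user_conf),    # customer's config is loaded first
        "-c", str(master_conf),  # master overrides are loaded after it
        "-n", hostname,
        "-o", str(outputdir),
        str(logfile),
    ]
```

The redirections to webalizer.out and webalizer.err are shell syntax, so they are not part of the list; whatever runs the command has to set them up itself.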

Automation

Cron is the obvious choice for automation. With webalizer's incremental processing, you can have it run several times a day with little drawback. The question is, will Cron execute webalizer directly or through an intermediate script?

Personally I don't trust cron and am not very familiar with it. There are a lot of considerations in how webalizer is executed. First of all, webalizer should never be executed as root. It should be executed as either apache or the user responsible for the site whose log needs to be analyzed. Second of all, I prefer that webalizer be executed with the current directory set to the output directory. Last of all, for performance reasons I do not want all webalizer tasks to be executed asynchronously, and I don't know whether cron forks off each task asynchronously or not. So barring a lot of research into cron mechanics and configuration, I took the easy way out - writing an intermediate program in perl.

Scripting

I'm not even going to discuss how to script the addition and removal of site-specific webalizer tasks with cron. You're on your own if you take that route.

Instead, we'll talk about one possible way that the intermediary script can be written. I've attached such a script written in perl. Of course, it does not have to be written in perl. It's just a simple implementation of the demands I listed above. It is executed as root. When it forks a child, the child process setuids to the username specified in the configuration and chdirs to the output directory. These properties, including the umask, are inherited by webalizer when it is executed. Meanwhile the parent waits for the child to exit before executing webalizer against another logfile.
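
For readers who don't use perl, the same fork, setuid, chdir, exec, and wait sequence can be sketched in Python. This is not the attached script, just a rough equivalent under my own assumptions: the function name `run_as_user`, the umask value, and the choice to skip the identity switch when not running as root are all mine.

```python
import os
import pwd


def run_as_user(username, workdir, argv):
    """Run argv as `username` inside `workdir` and wait for it to finish."""
    pid = os.fork()
    if pid == 0:
        # Child process: everything set here is inherited across exec.
        try:
            account = pwd.getpwnam(username)
            if os.getuid() == 0:
                # Only root may switch identity. Groups first, uid last.
                os.initgroups(username, account.pw_gid)
                os.setgid(account.pw_gid)
                os.setuid(account.pw_uid)
            os.umask(0o022)  # assumed value; pick what suits your setup
            os.chdir(workdir)
            flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
            os.dup2(os.open("webalizer.out", flags, 0o644), 1)
            os.dup2(os.open("webalizer.err", flags, 0o644), 2)
            os.execvp(argv[0], argv)
        except Exception:
            pass
        os._exit(127)  # reached only if setup or exec failed
    # Parent: block until the child exits, so sites run one at a time.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Calling this in a plain loop over the configured sites gives the sequential behaviour described above, since each call returns only after its child has exited.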

Attached Files
webalizer.conf.txt (376 Bytes)
webalize.conf.txt (102 Bytes)
webalize.txt (1.3 KB)
Last edited by SilentRage : February 7th, 2005 at 02:51 PM.

© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 6 hosted by Hostway