Quick and dirty Apache logfile analysis

Dilemma: I had a bunch of rotated Apache log files and wanted to check traffic patterns to see whether some link changes I had made to a site were having an effect. Specifically, for the domain in question, I had tried to route requests for certain URLs over to another domain by changing links in the HTML, and I wanted to see if traffic was actually shifting. So I wrote the following little bash script to iterate over a set of log files and print the date and the matching line count for a specified search string.


#!/bin/bash
# Count requests matching a search string across rotated Apache logs.

if [ -z "$1" ]; then
    echo "Usage: $0 search_string [log_file_prefix] [log_directory]"
    echo "log_file_prefix defaults to 'access_log.'"
    echo "log_directory defaults to '.'"
    exit 1
fi

if [ -z "$2" ]; then
    PREFIX="access_log."
else
    PREFIX="$2"
fi

if [ -z "$3" ]; then
    DIR="."
else
    DIR="$3"
fi

for FILE in "$DIR/$PREFIX"*; do
    # Rotated logs are suffixed with an epoch timestamp; strip the prefix
    # to recover it, then format it with GNU awk's strftime().
    EPOCH=${FILE#"$DIR/$PREFIX"}
    DATE=$(echo "$EPOCH" | awk '{print strftime("%c",$1)}')
    COUNT=$(grep -c "GET $1" "$FILE")
    echo "Looking for $1 in log for $DATE: $COUNT"
    echo ""
done
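The counting step is just a grep over the request lines. With a couple of hypothetical entries in Apache's common log format (the IPs and paths below are made up for illustration), it works like this:

```shell
# Two hypothetical lines in Apache common log format; only the first
# has a request path starting with /about, so grep -c counts 1.
printf '%s\n' \
  '203.0.113.7 - - [09/Jun/2006:14:03:12 -0500] "GET /about/team.html HTTP/1.1" 200 5120' \
  '203.0.113.9 - - [09/Jun/2006:14:05:01 -0500] "GET /contact HTTP/1.1" 200 1024' \
  | grep -c 'GET /about'
# -> 1
```

Note that the match is a plain substring prefix, so "/about" also catches "/about/team.html", "/about-us", and so on.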

So say I execute the command as follows: "./log_parse /about access_log_www. www". It scans for requests beginning with "/about" in all log files whose names begin with "access_log_www." in the directory "www". The script assumes that rotated log files are suffixed with an epoch timestamp, and it derives the printed date from that suffix. Output looks something like this:

[root@www logs]# ./log_parse /about access_log_www. www
Looking for /about in log for Thu 08 Jun 2006 07:00:00 PM CDT: 726

Looking for /about in log for Fri 09 Jun 2006 07:00:00 PM CDT: 681

Looking for /about in log for Sat 10 Jun 2006 07:00:00 PM CDT: 28

It’s certainly not a full-service log analysis solution, but it makes quick one-off checks of traffic patterns over time pretty easy.
