Saturday, March 24, 2012

Heroku, nginx, EC2, logs and awstats

That's a lot of crap in the title. I searched the internet far and wide for why awstats doesn't just "work" on Heroku's nginx logs, but no dice. So here it is, on the internet. The answer mostly lies in the LogFormat string in the awstats config.

I'm using Heroku, and I managed to get everything set up so my logs get shipped off to an EC2 instance. Nice.

Now the problem I was having was getting something like awstats to understand those logs.

Step 1: grep the log file and dump all the nginx output to a file.

grep nginx file.log > nginx_only.file.log
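
A slightly stricter variant, as a sketch only: plain "grep nginx" also keeps any other line that happens to mention nginx (say, in a URL). Assuming the nginx lines always carry the heroku[nginx] tag shown in the sample further down, matching that tag as a fixed string should be safer. The function name filter_nginx is made up for illustration.

```shell
# Keep only lines that contain the literal tag heroku[nginx].
# -F makes grep treat the pattern as a fixed string, so the square
# brackets are not interpreted as a character class.
filter_nginx() {
    grep -F 'heroku[nginx]' "$1"
}

# Usage, with the same file names as above:
# filter_nginx file.log > nginx_only.file.log
```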

I was trying to run something like this:

awstats.pl -config=model -output -staticlinks > awstats.model.html


And of course I was running into issues. First tip: take off all the arguments to awstats.pl except -config (which is mandatory). You'll get better error output, although it's still not totally useful. To generate the config file named above, I just ran the config script provided with the awstats tarball.

So, clearly the output was telling me that my LogFormat was not correct. The standard Apache format won't work? NO! Not with logs from Heroku it won't.

After a bunch of fiddling, this seems to be working for me. Some of the fields may be used incorrectly, but it's a start for anyone looking:

LogFormat="%time3 %host_r %other %other %host %other %other %time1 %methodurl %code %bytesd %refererquot %uaquot %virtualname"


That will match a line like this:

Jan 24 12:07:53 d.b194-8831e521ea77 heroku[nginx] - 192.10.192.10 - - [24/Jan/2012:12:07:53 +0000] "GET /javascripts/application.js?1327406636 HTTP/1.1" 200 1001 "http://www.domain.com/" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" www.domain.com

One last little note: the log has a bunch of dashes ("-"). I tried to capture those by putting a literal dash in the LogFormat string, but that is incorrect. Capture each one with %other instead.
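
A quick way to see where those dashes sit is to split the sample line on whitespace and number the columns. This is only an illustration of how I read the line, not how awstats parses it internally, and show_fields is a made-up helper name.

```shell
# The sample log line from this post.
line='Jan 24 12:07:53 d.b194-8831e521ea77 heroku[nginx] - 192.10.192.10 - - [24/Jan/2012:12:07:53 +0000] "GET /javascripts/application.js?1327406636 HTTP/1.1" 200 1001 "http://www.domain.com/" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" www.domain.com'

# Print the first nine whitespace-separated fields, numbered, one per line.
show_fields() {
    printf '%s\n' "$1" | awk '{ for (i = 1; i <= 9; i++) print i ": " $i }'
}

show_fields "$line"
```

Fields 1 to 3 are the timestamp (%time3), field 4 is the dyno name (%host_r), field 5 is the heroku[nginx] tag (%other), field 6 is a dash (%other), field 7 is the client IP (%host), and fields 8 and 9 are two more dashes (%other %other). That lines up with the first seven tokens of the LogFormat string above.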

The End.
