Thread: How to create Apache log analyzer

  1. #1
    Join Date
    Jan 2010
    Posts
    74

    How to create Apache log analyzer

    I want to create an Apache log analyzer (although such tools already exist) just for fun. I am thinking about how to proceed, knowing that I would like to compute statistics. Consider an example:

    90.90.90.90 - - [11/Feb/2009:08:45:42 +0100] "GET /index.php" 200 907 "http://www.techarena.in" "Mozilla/4.0 (compatible; MSIE 7.0; Windows)"
    The idea is to split each line into its fields and check whether the same values show up again on the following lines. This would let me determine which users visit the site most often, the most visited URLs, and so on. Do you have tips on how to proceed? Regular expressions? split?

  2. #2
    Join Date
    Nov 2008
    Posts
    1,221

    Re: How to create Apache log analyzer

    Here's a basic example to give you ideas.
    Code:
    #!/usr/bin/env perl
    use strict; use warnings;
     
    my %ip;
    while (<>) {
    	$ip{ (split /\s+/)[0] }++;
    }
     
    print map { $_, "\t", $ip{$_}, "\n" } 
    	sort { $ip{$b} <=> $ip{$a} } keys %ip;
    As you have probably realized, it counts the IP addresses by incrementing the values of an associative array (a hash whose keys are the IPs), then displays the keys in descending order of occurrence.

    You can refine it by testing whether the first element of the line really is an IP address (by extracting it with a regular expression rather than with split).
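
    To make the regex route concrete, here is a minimal sketch that pulls every field out of a line in one match. It assumes each line looks like the sample in the first post (real logs may carry extra fields); the name parse_line and the hash keys are made up for illustration.

```perl
#!/usr/bin/env perl
use strict; use warnings;

# Hypothetical helper: split one log line into named fields.
# Returns a hash reference, or nothing if the line does not match.
sub parse_line {
    my ($line) = @_;
    my ($ip, $date, $request, $status, $bytes, $referer, $agent) = $line =~ m{
        ^(\d{1,3}(?:\.\d{1,3}){3}) \s+   # client IPv4 address
        \S+ \s+ \S+ \s+                  # identity and user, unused here
        \[([^\]]+)\] \s+                 # date, time and zone
        "([^"]*)" \s+                    # request line
        (\d{3}) \s+                      # HTTP status
        (\d+|-) \s+                      # response size in bytes, or a dash
        "([^"]*)" \s+                    # referer
        "([^"]*)"                        # user agent
    }x or return;

    # The request line is "METHOD /path ..."; keep only the path.
    my (undef, $page) = split ' ', $request;

    return {
        ip      => $ip,
        date    => $date,
        page    => $page // '-',
        status  => $status,
        bytes   => $bytes eq '-' ? 0 : $bytes,
        referer => $referer,
        agent   => $agent,
    };
}

# Usage with the sample line from the first post.
my $sample = '90.90.90.90 - - [11/Feb/2009:08:45:42 +0100] "GET /index.php" 200 907 '
           . '"http://www.techarena.in" "Mozilla/4.0 (compatible; MSIE 7.0; Windows)"';
my $hit = parse_line($sample);
print "$hit->{ip} requested $hit->{page} (status $hit->{status})\n" if $hit;
```

    Once each line is a hash like this, counting pages or referers is the same `$count{ $hit->{page} }++` idiom as for the IPs.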

    As posted, my example reads standard input or the files named on the command line. You can also read compressed files (as rotated logs generally are) if you want.
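
    For the compressed case, one possible approach is the core module IO::Uncompress::Gunzip. The sketch below assumes gzip-compressed logs are recognizable by a .gz suffix; open_log and count_ips_in_file are hypothetical names, not part of any library.

```perl
#!/usr/bin/env perl
use strict; use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper: return a read handle for a plain or gzip-compressed log.
sub open_log {
    my ($path) = @_;
    if ($path =~ /\.gz$/) {
        my $z = IO::Uncompress::Gunzip->new($path)
            or die "Cannot gunzip $path: $GunzipError\n";
        return $z;
    }
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    return $fh;
}

# Count the first whitespace-delimited field of every line, as in the example.
sub count_ips_in_file {
    my ($path) = @_;
    my $fh = open_log($path);
    my %ip;
    while (my $line = <$fh>) {
        $ip{$1}++ if $line =~ /^(\S+)/;
    }
    close $fh;
    return \%ip;
}

# Usage: perl count.pl access.log access.log.1.gz
for my $file (@ARGV) {
    my $count = count_ips_in_file($file);
    print "$_\t$count->{$_}\n"
        for sort { $count->{$b} <=> $count->{$a} } keys %$count;
}
```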

  3. #3
    Join Date
    Jan 2010
    Posts
    74

    Re: How to create Apache log analyzer

    First of all, thank you for your reply. I will need to study hashes, because I am not used to them yet! If I want to print "IP: number of times it appears" in the output, which variable should I use? In the output I now get a tab, the number of times the IP appears, and a line break.

  4. #4
    Join Date
    Nov 2008
    Posts
    1,221

    Re: How to create Apache log analyzer

    Quote, originally posted by Endowed:
    In the output I now get a tab, the number of times the IP appears, and a line break.
    Have you changed the example? Unchanged, the script correctly parses an Apache log like the one you gave in your first post. Otherwise, change the split or use a capture with a regex. As for the hash, the value stored under each IP key is incremented every time that IP is seen.

    Here is an example, not very clean but functional, that captures the IP at the beginning of the line with a regex and increments the hash entry for the captured value.
    Code:
    $ip{ $1 }++ if /^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/;
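
    To get the "IP: number of times it appears" wording, the key of the hash is the IP and the value stored under it is the count. A minimal sketch of both steps follows; count_ips and format_report are hypothetical names, and the second sample line is made up so that one IP appears twice.

```perl
#!/usr/bin/env perl
use strict; use warnings;

# Count the lines that start with something shaped like an IPv4 address.
sub count_ips {
    my (@lines) = @_;
    my %ip;
    for (@lines) {
        $ip{$1}++ if /^(\d{1,3}(?:\.\d{1,3}){3})\s/;
    }
    return %ip;
}

# One labelled line per IP, most frequent first (ties broken alphabetically).
sub format_report {
    my (%ip) = @_;
    return map  { "IP: $_ has been found $ip{$_} times\n" }
           sort { $ip{$b} <=> $ip{$a} || $a cmp $b } keys %ip;
}

my @lines = (
    '90.90.90.90 - - [11/Feb/2009:08:45:42 +0100] "GET /index.php" 200 907',
    '90.90.90.90 - - [11/Feb/2009:08:45:50 +0100] "GET /index.php" 200 907',  # made-up line
);
print format_report(count_ips(@lines));
```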

  5. #5
    Join Date
    Jan 2010
    Posts
    74

    Re: How to create Apache log analyzer

    Thank you so much, it works very well. Here is my current code (the field indices for IIS are not right yet):
    Code:
    #!/usr/bin/perl
    use strict;
    use warnings;
    system("cls");
     
    print "Welcome to the program of log analyzer\n";
    print " What type of log would you analyze ?\n\n";
    print "    1.Apache\n";
    print "    2.IIS\n\n";
    my $num = (<STDIN>);
    system("cls");
     
    print " Please enter the number corresponding to desired action :\n\n";
    print "    1.Hits IP addresses\n";
    print "    2.Hits of the most visited pages\n";
    print "    3.Hits of the referer\n";
    print "    4.Hits in KB\n";
    print "    5.Hits of the browsers\n";
    print "    6.Quit the program\n\n";
     
    open(File, "log") || die "Problem opening : $!";
    my($line,@ips,$ip,%total,@pages,$page,%total1,@ref,$ref,%total2,@kb,$kb,%total3,@nav,$nav,%total4);
     
    if ($num == "1") {
        while (<File>) {
            @ips = (split /\s+/)[0];
            foreach $ip (@ips) {
                $total{$ip}++;
            }
            @pages = (split /\s+/)[6];
            foreach $page (@pages) {
                $total1{$page}++;
            }
            @ref = (split /\s+/)[9];
            foreach $ref (@ref) {
                $total2{$ref}++;
            }
            @kb = (split /\s+/)[8];
            foreach $kb (@kb) {
                $total3{$kb}++;
            }
            @nav = (split /\s+/)[10];
            foreach $nav (@nav) {
                $total4{$nav}++;
            }
        }
    }
    elsif ($num == "2") {
        while (<File>) {
            @ips = (split /\s+/)[0];
            foreach $ip (@ips) {
                $total{$ip}++;
            }
            @pages = (split /\s+/)[3];
            foreach $page (@pages) {
                $total1{$page}++;
            }
            @ref = (split /\s+/)[6];
            foreach $ref (@ref) {
                $total2{$ref}++;
            }
            @kb = (split /\s+/)[8];
            foreach $kb (@kb) {
                $total3{$kb}++;
            }
            @nav = (split /\s+/)[12];
            foreach $nav (@nav) {
                $total4{$nav}++;
            }
        }
    }
    else {
       print "\nInvalid number - Good bye\n";
       exit;
    }
    close(File);
    my $num1 = (<STDIN>);
    if ($num1 == "1") {
        &subip();
    }
    elsif ($num1 == "2") {
        &subpage();
    }
    elsif ($num1 == "3") {
        &subref();
    }
    elsif ($num1 == "4") {
        &subkb();
    }
    elsif ($num1 == "5") {
        &subnav();
    }
    else {
        print "\nSee you soon\n";
        exit;
    }
    sub subip {
        foreach $ip (sort keys %total) {
            print "IP : $ip has been found $total{$ip} times\n";
        }
        exit;
    }
    sub subpage {
        foreach $page (sort keys %total1) {
            print "The page \"$page\" has been visited $total1{$page} times.\n";
        }
        exit;
    }
    sub subref {
        foreach $ref (sort keys %total2) {
            print "The referer $ref has been seen $total2{$ref} times.\n";
        }
        exit;
    }
    sub subkb {
        # The field counted here is the response size in bytes.
        foreach $kb (sort keys %total3) {
            print "The response size $kb bytes has been seen $total3{$kb} times.\n";
        }
        exit;
    }
    sub subnav {
        foreach $nav (sort keys %total4) {
            print "The browser $nav has been found $total4{$nav} times.\n";
        }
        exit;
    }
    The script is not very optimized. Now I'd like to restrict the analysis by date. The user enters, for example, 01/Apr/2009 10/Apr/2009, and the script runs its analysis only on that interval.

    Do you have an idea how? The problem will be working out the date range. I would have liked to do it without a module, but I am not sure I have a choice.
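
    One possible way to handle the interval without a module is to turn each date into a sortable YYYYMMDD string and compare strings. This is only a sketch: it assumes English three-letter month names as in the sample log, ignores the time of day and the time zone, and does not check that the user's two dates are valid; date_key and in_range are hypothetical names.

```perl
#!/usr/bin/env perl
use strict; use warnings;

my %month = (Jan => 1, Feb => 2, Mar => 3, Apr => 4,  May => 5,  Jun => 6,
             Jul => 7, Aug => 8, Sep => 9, Oct => 10, Nov => 11, Dec => 12);

# Turn "11/Feb/2009" into the sortable string "20090211"; undef if malformed.
sub date_key {
    my ($date) = @_;
    my ($d, $mon, $y) = $date =~ m{^(\d{1,2})/([A-Za-z]{3})/(\d{4})} or return;
    my $m = $month{ ucfirst lc $mon } or return;
    return sprintf '%04d%02d%02d', $y, $m, $d;
}

# True if the date inside the [...] field lies between $from and $to, inclusive.
sub in_range {
    my ($line, $from, $to) = @_;
    my ($stamp) = $line =~ m{\[(\d{1,2}/[A-Za-z]{3}/\d{4}):} or return 0;
    my $key = date_key($stamp);
    return defined $key && $key ge date_key($from) && $key le date_key($to);
}

# Usage: keep only the lines inside the interval typed by the user.
my @log = (
    '90.90.90.90 - - [11/Feb/2009:08:45:42 +0100] "GET /index.php" 200 907',
    '90.90.90.90 - - [05/Apr/2009:10:00:00 +0100] "GET /index.php" 200 907',  # made-up line
);
my @kept = grep { in_range($_, '01/Apr/2009', '10/Apr/2009') } @log;
print scalar(@kept), " line(s) in range\n";
```

    In the script above, the same test could sit at the top of the while loop as `next unless in_range($_, $from, $to);`.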
