#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2013
    Posts
    10
    Rep Power
    0

    Need Regexp log pattern


    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my $log_pattern =q{(.*) \- (.*) \- \[(.*)\] \"(.*) (/[^/]+))};#(.*)\?(.*)} HTTP\/(.*)\" ([0-9]*) ([0-9]*) \"(.*)\" \"(.*)\" \"(.*)\"};
    my $entry ='13.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:24] "GET /V/0/11573/Granturismo1.mp4?start=0 HTTP/1.1 RefURL" 200 11125709 111257
    09 10 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)"';


    $entry =~ /$log_pattern/;
    print $1, "|";
    print $2, "|";
    print $3, "|";
    print $4, "|";
    print $5, "|";
    print $6, "|";
    print $7, "|";
    print $8, "|";
    print $9, "|";
    print $10, "|";
    print $11, "|";
    print $12, "|";
    print $13, "|";
    print $14, "|";
    print $15, "|";
    print $16, "|";
    print $17, "|";
    print $18, "\n";

    Can you please check this and give me the exact log pattern for the log record above?

    The output should be in the format below:
    13.4.28.244|95.123.101.114|21/May/2013:15:58:24|GET|V|0|11573|Granturismo1.mp4?start=0|Granturismo1.mp4|1.1|200|11125709|111257|09|10| 0 |Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)

    I cannot find the right pattern for this log, and I need to process the whole log file in one pass.
    If a field has no value, I need to replace it with a null value and still process that record.

    Can anyone help me with this?
    I have left out the referrer URL (shown as RefURL) because posting it would be against forum rules.

    Thanks in advance.
  2. #2
  3. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Location
    spaceBAR Central
    Posts
    225
    Rep Power
    41
    This is one way. You can then print the capture groups in whatever format you like:
    Code:
    #!/usr/bin/perl -w
    # Script: testit.pl
    use strict;
    use warnings;
    
    my $log_pattern =  q{(\d+\.\d+\.\d+\.\d+) \- (\d+\.\d+\.\d+\.\d+) \- \- \[(\d{1,2}/[a-zA-Z]{3,3}/\d{4,4}:\d{1,2}:\d{1,2}:\d{1,2})\] \"(GET) /([a-zA-Z0-9]+)/([a-zA-Z0-9]+)/([a-zA-Z0-9]+)/([a-zA-Z0-9\.]+)(\?start=0) HTTP/([0-9.]+) [a-zA-Z]+\" ([0-9]+) ([0-9]+) ([0-9]{6,6})([0-9]{2,2}) ([0-9]{2,2})( [0-9] )\"(.+)\"};
    my $entry       =  '13.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:24] "GET /V/0/11573/Granturismo1.mp4?start=0 HTTP/1.1 RefURL" 200 11125709 11125709 10 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)"';
    
    # A list-context match returns every capture group, or an empty list on failure.
    if (my @fields = $entry =~ m"$log_pattern") {
        print "$_\n" for @fields;
    }
    else {
        warn "Entry did not match the log pattern\n";
    }
    
    
    $ testit.pl
    13.4.28.244
    95.123.101.114
    21/May/2013:15:58:24
    GET
    V
    0
    11573
    Granturismo1.mp4
    ?start=0
    1.1
    200
    11125709
    111257
    09
    10
     0
    Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)
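
The pattern above is tied to this one record (it hardcodes GET, ?start=0 and three directory levels). For a whole log file, a looser variant may be easier to maintain. The following is only a minimal sketch, assuming every line has the same layout as the sample entry in this thread: the names parse_entry, $log_re and $NULL are made up for illustration, treating "-" as an empty field is an assumption, and the second byte count is kept as one field (11125709) rather than split into 111257 and 09. The loop reads from an in-memory copy of the sample so that the script runs as is.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: a missing field is logged as "-" and should become an empty string.
my $NULL = '';

# Layout inferred from the single sample line in this thread (an assumption).
my $log_re = qr{
    ^(\S+) \s+ \S+ \s+ (\S+) \s+ \S+ \s+ \S+ \s+      # two addresses, placeholders skipped
    \[ ([^\]]+) \] \s+                                # timestamp
    " (\S+) \s+ (\S+) \s+ HTTP/ ([\d.]+) [^"]* " \s+  # method, path, version, rest of request skipped
    (\S+) \s+ (\S+) \s+ (\S+) \s+ (\S+) \s+ (\S+) \s+ # status and four counters
    " ([^"]*) "                                       # user agent
}x;

# Returns the list of fields for one log line, or an empty list if it does not match.
sub parse_entry {
    my ($line) = @_;
    my @captures = $line =~ $log_re or return;
    my ($ip1, $ip2, $time, $method, $path, $version, @rest) = @captures;

    # Separate the query string, then split the path into its segments.
    my ($file_path, $query) = split /\?/, $path, 2;
    my @segments = grep { length } split m{/}, $file_path;
    my $file = @segments ? pop @segments : '';
    my $file_with_query = defined $query ? "$file?$query" : $file;

    return map { defined $_ && $_ ne '-' ? $_ : $NULL }
        ($ip1, $ip2, $time, $method, @segments,
         $file_with_query, $file, $version, @rest);
}

# In-memory copy of the sample entry, standing in for a real log file.
my $sample = <<'END';
13.4.28.244 - 95.123.101.114 - - [21/May/2013:15:58:24] "GET /V/0/11573/Granturismo1.mp4?start=0 HTTP/1.1 RefURL" 200 11125709 11125709 10 0 "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)"
END

open my $fh, '<', \$sample or die "Cannot open in-memory log: $!";
while (my $line = <$fh>) {
    chomp $line;
    my @fields = parse_entry($line);
    if (@fields) {
        print join('|', @fields), "\n";
    }
    else {
        warn "Line $.: did not match, skipped\n";
    }
}
close $fh;
```

For the sample entry this prints one pipe-separated line starting with 13.4.28.244|95.123.101.114|21/May/2013:15:58:24|GET|V|0|11573|Granturismo1.mp4?start=0|Granturismo1.mp4|1.1|200. To process a real file, the in-memory handle could be replaced with a handle opened on the log file, or with the <> operator so that file names are taken from the command line.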
