Extracting lines

April 8th, 2009, 02:34 PM
Contributing User
Join Date: Mar 2008
Posts: 42
Time spent in forums: 8 h 2 m 16 sec
Reputation Power: 6
Extracting lines
Hello all
I have this code
use File::Tail;
$|++;
$name="C:\\users\\Mizo\\desktop\\log.t...
$file=File::Tail->new(name=>$name, maxinterval=>1,interval=>1, adjustafter=>1);
while (defined($line=$file->read)) {
if ($line=~/rules.txt/i){
print "$line";
}
}
log.txt contains this:
Host:xXMizoXx
Request:blabla
Method:GET
Http:HTTP/1.1
I want the code above to match the rules in rules.txt against only the "Request" line in the log file and ignore the other lines.
Do you know how to make that happen?
Also, is it possible to take the Request line each time and write it to a text file for processing? If so, how?

April 8th, 2009, 06:56 PM
'fie' on me, allege-dly
Join Date: Mar 2003
Location: in da kitchen ...
Code:
open FH, "<rules.txt";
@rules=<FH>; # read the rules in to an array
close FH;
use File::Tail;
$|++;
$name="C:/users/Mizo/desktop/log.txt";
$file=File::Tail->new(name=>$name, maxinterval=>1,interval=>1, adjustafter=>1);
while (defined($line=$file->read)) {
for (@rules) {
$rule=$_;
if ($line=~ m/($rule)/i){
print "$line";
}
}
}
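That is one way, but with a lot of rules it will be quite inefficient. It might be an idea to think about how you could match the atom by splitting the string on ":"
Code:
$rules=join("", @rules); # assumption: $rules is all the rules joined into one string for index()
$item=(split(":", $line))[0]; # the list slice needs its own parentheses
if (index($rules, $item) != -1) {
print $line;
}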

April 10th, 2009, 04:41 AM
Registered User
Join Date: Apr 2009
Posts: 3
Time spent in forums: 51 m 53 sec
Reputation Power: 0
If you only need lines which start with "Request", then do this:
print "$line" if $line =~ m/^Request:/;
However, it is better not to use a regex for such an easy task. Use index instead.
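As a minimal sketch of the index approach, assuming the log lines look like the sample posted at the top of the thread (the `@log` and `@requests` arrays are hypothetical stand-ins for lines coming from File::Tail):

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lines read from log.txt.
my @log = ("Host:xXMizoXx\n", "Request:blabla\n", "Method:GET\n", "Http:HTTP/1.1\n");

my @requests;
for my $line (@log) {
    # index returns 0 only when $line begins with the literal string "Request:"
    push @requests, $line if index($line, 'Request:') == 0;
}

print @requests;
```

Comparing against 0 rather than -1 is what restricts the match to the start of the line; `!= -1` would also accept "Request:" appearing anywhere else in the line.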
If you need lines that match at least one of the rules from rules.txt, then change Axweildr's code like this:
Code:
open FH, "<rules.txt";
@rules=<FH>; # read the rules in to an array
close FH;
use File::Tail;
$|++;
$name="C:/users/Mizo/desktop/log.txt";
$file=File::Tail->new(name=>$name, maxinterval=>1,interval=>1, adjustafter=>1);
while (defined($line=$file->read)) {
for (@rules) {
$rule=$_;
if ($line=~ m/($rule)/i){
print "$line";
last;
}
}
}
The last statement prevents a line from being printed several times if it matches several rules.

April 10th, 2009, 04:47 AM
Contributing User
Thank you, it's clear now.
But one more thing >,<
The problem is that the rules file is on Linux
and the log file is on Windows,
so when I send the agent to Windows it won't do any matching,
because rules.txt is still on Linux.
How can I bring the rules file along with the agent, from within the agent itself?

April 10th, 2009, 05:09 AM
'fie' on me, allege-dly
It should still split on '\r\n' as well as '\n', so it shouldn't be an issue. If you believe it is, you can run unix2dos over the file in question.

April 10th, 2009, 05:13 AM
Contributing User
Quote: | Originally Posted by Axweildr it should still split on '\r\n' as well as '\n' so it shouldn't be an issue, if you believe it is then you can run unix2dos over the file in question |
No, no, it's not about splitting.
What I wanted to ask is how to send the rules file along with the code to the Windows machine, because the rules file is on Linux
and the log file is on Windows.
I want it to travel with the agent from Linux to Windows every time I run the code.

April 10th, 2009, 05:31 AM
Registered User
Quote: | Originally Posted by -=Mizo=- No no it's not about splitting
what i wanted to say is how to send the rules file with the code ...to windows machine because the rules file is in Linux..
and the logfile is in windows
I want it to go with the agent from Linux to windows every time i run the code |
I don't think it is a good idea to send the code and the rules file from Linux to Windows. You should use some mechanism to access the log files instead. For example, you can use SSH, FTP, or something else to access these files on Windows from Linux. Another way is to send them periodically from Windows to Linux (by FTP, SSH, or something else).
You can also share the Windows folder containing the log files and mount it on Linux. So it's up to you which way to choose.

April 10th, 2009, 08:54 AM
Quote: | Originally Posted by Axweildr
Code:
open FH, "<rules.txt";
@rules=<FH>; # read the rules in to an array
close FH;
use File::Tail;
$|++;
$name="C:/users/Mizo/desktop/log.txt";
$file=File::Tail->new(name=>$name, maxinterval=>1,interval=>1, adjustafter=>1);
while (defined($line=$file->read)) {
for (@rules) {
$rule=$_;
if ($line=~ m/($rule)/i){
print "$line";
}
}
}
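That is one way, but with a lot of rules it will be quite inefficient. It might be an idea to think about how you could match the atom by splitting the string on ":"
Code:
$rules=join("", @rules); # assumption: $rules is all the rules joined into one string for index()
$item=(split(":", $line))[0]; # the list slice needs its own parentheses
if (index($rules, $item) != -1) {
print $line;
}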
|
I realize that most of that code is from the OP, but let's take a look at it with Perl Best Practices in mind.
The script is missing 1, if not 2, very important pragmas which should be in every Perl script.
Code:
use warnings;
use strict;
It's missing proper error handling on the open call.
It's better to use the 3 arg form of open and a lexical var for the filehandle.
Code:
my $rules_file = 'rules.txt';
open my $FH, '<', $rules_file or die "failed to open '$rules_file' $!";
The use statements are executed at compile time, so place them at the beginning of the script instead of intermixed with runtime code.
The script overall is lacking proper horizontal whitespace.
Please read: perldoc -q quoting
Code:
for (@rules) {
$rule=$_;
if ($line=~ m/($rule)/i){
print "$line";
}
}
Is better written as:
Code:
for my $rule ( @rules ) {
print $line and last if $line =~ /$rule/i;
}
The gain in efficiency from using index instead of a regex is offset by the split, and an if block with only 1 line in it is better written as 1 line, as shown above. Also, the first arg to split is a pattern (regex), not a string. However, the single-space string " " is an exception.
perldoc -f split
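A small sketch of the point about split's first argument, using made-up input strings. It shows a string separator being treated as a pattern, and the special handling of the single-space string:

```perl
use strict;
use warnings;

# A string first argument is still compiled as a pattern:
# "." means "any character", so every character acts as a separator.
my @dot_string  = split ".",  "a.b.c";   # every field is empty, so nothing is returned
my @dot_escaped = split /\./, "a.b.c";   # ("a", "b", "c")

# The single-space string is the special case: it splits on runs of
# whitespace and skips leading whitespace.
my @awk_mode  = split " ", "  GET  /index.html ";   # ("GET", "/index.html")
my @one_space = split / /, "  GET  /index.html ";   # ("", "", "GET", "", "/index.html")

# Field counts for the four calls above: prints "0 3 2 5"
print scalar(@dot_string), " ", scalar(@dot_escaped), " ",
      scalar(@awk_mode), " ", scalar(@one_space), "\n";
```

For a literal ":" the string and regex forms happen to behave the same, because ":" has no special meaning in a pattern. The difference only shows up with characters such as "." or "|".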
Here's the complete script with a couple of adjustments that I didn't mention.
Code:
use strict;
use warnings;
use File::Tail;
$|++;
my $rules = 'rules.txt';
open my $FH, '<', $rules or die "failed to open '$rules' $!";
my %rules = map { chomp; lc($_), 1 } <$FH>;
close $FH;
my $file = File::Tail->new( name => 'C:/users/Mizo/desktop/log.txt',
maxinterval => 1,
interval => 1,
adjustafter => 1
);
while (defined( my $line = $file->read) ) {
my $rule = (split /:/, $line)[0];
print $line if exists $rules{lc($rule)};
}
Finally, on the question about accessing the file remotely:
File::Remote - Read/write/edit remote files transparently
http://search.cpan.org/~nwiger/File-Remote-1.17/Remote.pm

April 10th, 2009, 10:07 AM
'fie' on me, allege-dly
Thank you for your input.

April 11th, 2009, 01:12 AM
Contributing User
Thank you all, your answers were very useful.
But it isn't working for me.
Code:
use strict;
use warnings;
use File::Tail;
$|++;
my $dir="C:\\users\\Mizo\\desktop\\rules.txt";
open (FH,$dir) or die $!;
my @rules=<FH>; # read the rules in to an array
close FH or die $!;
my $name="C:\\users\\Mizo\\desktop\\log.txt";
my $file=File::Tail->new(name=>$name, maxinterval=>1,interval=>1, adjustafter=>1);
while (defined(my $line=$file->read)) {
foreach my $rule(@rules) {
if ($line=~/$rule/i){
print "$line";
last;
}
}
}
It's neither matching nor printing.
But this code, for example, prints when there is a match:
Code:
use File::Tail;
$|++;
$name="C:\\users\\Mizo\\desktop\\log.txt";
$file=File::Tail->new(name=>$name, maxinterval=>1,interval=>1, adjustafter=>1);
while (defined($line=$file->read)) {
if ($line=~/blabla/i){
print "$line";
}
}

April 11th, 2009, 01:37 AM
Contributing User
Code:
use strict;
use warnings;
use File::Tail;
$|++;
my $rules = 'rules.txt';
open my $FH, '<', $rules or die "failed to open '$rules' $!";
my %rules = map { chomp; lc($_), 1 } <$FH>;
close $FH;
my $file = File::Tail->new( name => 'C:/users/Mizo/desktop/log.txt',
maxinterval => 1,
interval => 1,
adjustafter => 1
);
while (defined( my $line = $file->read) ) {
my $rule = (split /:/, $line)[0];
print $line if exists $rules{lc($rule)};
}
This also prints nothing.

April 11th, 2009, 07:07 AM
Please post a few sample lines from rules.txt.
Is log.txt continuously being updated? If not, then File::Tail is the wrong tool to use in this script.
The module states:
Quote: | File::Tail - Perl extension for reading from continously updated files |

April 11th, 2009, 07:12 AM
Contributing User
Quote: | Originally Posted by FishMonger Please post a few sample lines from rules.txt.
Is log.txt continuously being updated? If not, then File::Tail is the wrong tool to use in this script.
The module states: |
For example:
Code:
\.txt\?$
Or even if I put a plain word in rules.txt, request it, and the log file records it, the script isn't printing anything.
So when I make a request with .txt? in it, it gets saved in the log file, and when a rule from the rules file matches, I should be alerted.
And yes, log.txt is continuously updated.

April 11th, 2009, 07:40 AM
Based on your sample lines from each file, I don't see why you should expect it to print anything.
Please post a few sample lines in rules.txt and the corresponding lines in log.txt that should be extracted.
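To illustrate why the hash-based script prints nothing with these inputs, here is a sketch that runs the same lookup over the sample lines posted so far. The `%field_rules` hash is a hypothetical rules file that lists field names instead of patterns.

```perl
use strict;
use warnings;

# The one rule posted in the thread, stored the way the hash-based script stores it.
my %rules = ( '\.txt\?$' => 1 );

# The sample log lines posted at the top of the thread.
my @log = ("Host:xXMizoXx\n", "Request:blabla\n", "Method:GET\n", "Http:HTTP/1.1\n");

# The lookup key is the field name before the first colon ("Host", "Request", ...),
# lowercased. None of those equals the rule text, so nothing is selected.
my @hits = grep { exists $rules{ lc( (split /:/, $_)[0] ) } } @log;
print scalar(@hits), "\n";

# Hypothetical rules file that lists field names instead of patterns.
my %field_rules = ( request => 1 );
my @field_hits = grep { exists $field_rules{ lc( (split /:/, $_)[0] ) } } @log;
print @field_hits;
```

The hash lookup is an exact comparison on the field name, so it suits rules such as "request" but cannot apply a regex rule to the value after the colon.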

April 11th, 2009, 07:45 AM
Contributing User
Quote: | Originally Posted by FishMonger Based on your sample lines from each file, I don't see why you should expect it to print anything.
Please post a few sample lines in rules.txt and the corresponding lines in log.txt that should be extracted. |
Code:
Host:xXMizoXx
Request:/.txt?
Method:GET
Http:HTTP/1.1
This is the log file when I make a request to the HTTP web server,
and I have only 1 rule in rules.txt, the one I showed you.
But when I use the old code, which is
if($line=~/http/i){
print "$line";
}
it works:
once the log file is updated and http is there, I get a message.
But the new code isn't working.
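One possible cause, offered as an assumption and not confirmed anywhere in this thread: the rule read from rules.txt still carries its trailing newline, and the log line may still end in a Windows "\r\n" by the time it is matched. The sketch below uses hypothetical stand-in strings for one rule and one log line. It shows how an end-anchored rule fails under those conditions and succeeds once both sides are stripped.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for one line from rules.txt and one line from log.txt.
my $raw_rule = '\.txt\?$' . "\n";      # rule as read from the file, newline included
my $log_line = "Request:/.txt?\r\n";   # assumption: the line still ends in CRLF

# Unstripped: after ".txt?" the line has "\r", so the "$" anchor cannot match there.
my $raw_match = $log_line =~ /$raw_rule/i ? 1 : 0;

# Strip trailing CR/LF from both the rule and the line before matching.
(my $rule = $raw_rule) =~ s/[\r\n]+\z//;
(my $line = $log_line) =~ s/[\r\n]+\z//;
my $clean_match = $line =~ /$rule/i ? 1 : 0;

print "raw: $raw_match, cleaned: $clean_match\n";
```

A pattern like /http/i is unaffected by line endings because it is not anchored, which would be consistent with the old code working while the rule-based version stays silent.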