Trying to extract 3 matches from the string

November 16th, 2012, 07:27 PM
Contributing User
Join Date: Jun 2012
Posts: 65
Time spent in forums: 14 h 35 m 42 sec
Reputation Power: 1
Hi,
I have error logs and I'm trying to extract the alarm code, the time the error started, and the time the error was reset.
All three values are in the "AlarmCode" lines of the log file, which look like this:
Code:
AlarmCode:105 {105 ALARM1601 1001 SC1_Sampler:TransferDHD 0 User Assist Other {} {scd1 UpperShuttle M-11} {} 0 0 1 0 1348117130 1348117987 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012} 857 0 0 0 -5.0 Alarm Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_M CDA2}
I need to extract:
ALARM1601 - the word "ALARM" followed by a 4-digit number; that is my error code
21:58:50 19-Sep-2012 - the time the alarm started
22:13:07 19-Sep-2012 - the time the alarm was cleared
I need to store all three matches in variables so I can manipulate them later. My script prints the time the alarm started and the time it was cleared, but I cannot get them into variables, and I'm failing to extract the alarm code.
Could you help me with this please?
Thanks, tester
Code:
#!/usr/bin/perl
use strict;
use warnings;

my $log2 = 'C:\\Alarms\\sptFG_M\\JBS02-A1CD.log';
open(my $log2_fh, '<', $log2) or die "Can't open $log2: $!";

while (my $line = <$log2_fh>) {
    chomp $line;
    next unless $line =~ /^AlarmCode:/;    # only look at the alarm records
    print "$line\n";
    # Print every "HH:MM:SS DD-Mon-YYYY" timestamp found on the line.
    while ($line =~ m/(\d{2}:\d{2}:\d{2}\s\d{2}-\w{3}-\d{4})/g) {
        my $list = $1;
        print "$list\n";
    }
}
close $log2_fh;

November 16th, 2012, 07:53 PM
Once you remove the outer braces, it looks like you can just split it into brace-quoted/space-delimited fields. (Incidentally, the two numbers preceding the date/times appear to be Unix timestamps, which may be easier to manipulate later.)
Code:
use strict; use warnings;
my $t = 'AlarmCode:105 {105 ALARM1601 1001 SC1_Sampler:TransferDHD 0 User Assist Other {} {scd1 UpperShuttle M-11} {} 0 0 1 0 1348117130 1348117987 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012} 857 0 0 0 -5.0 Alarm Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_M CDA2}';
# Capture everything between the outermost braces.
my ($inner) = $t =~ /\{(.*)\}/;
# Each field is either a {...} group or a run of non-whitespace characters.
my @vals = $inner =~ /(\{[^}]*\}|\S+)/g;
print "$vals[1], $vals[17], $vals[18]\n";
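As a follow-up to the remark about Unix timestamps, here is a minimal sketch of working with the two numeric fields instead of the formatted dates. It assumes the field positions in the sample record (index 15 and 16) hold for every record, which may not be true for all log lines; the helper name alarm_fields is only illustrative.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Sample record copied from the first post.
my $record = 'AlarmCode:105 {105 ALARM1601 1001 SC1_Sampler:TransferDHD 0 User Assist Other {} {scd1 UpperShuttle M-11} {} 0 0 1 0 1348117130 1348117987 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012} 857 0 0 0 -5.0 Alarm Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_M CDA2}';

# Hypothetical helper: split the text between the outer braces into fields,
# where a {...} group counts as a single field.
sub alarm_fields {
    my ($text) = @_;
    my ($inner) = $text =~ /\{(.*)\}/ or return;
    return $inner =~ /(\{[^}]*\}|\S+)/g;
}

my @vals = alarm_fields($record);

# Assumption: the alarm code is field 1 and the epoch seconds are fields 15 and 16.
my ($code, $start, $end) = @vals[1, 15, 16];
my $duration = $end - $start;

print "$code lasted $duration seconds\n";
print "started (UTC): ", strftime('%Y-%m-%d %H:%M:%S', gmtime $start), "\n";
```

For the sample record the difference comes out as 857 seconds, the same number that appears in the field right after the two dates, so that field looks like the alarm duration. Note that gmtime gives UTC; the formatted times in the log appear to be local time.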

November 18th, 2012, 12:05 AM
Omega, it does the trick; I could never have done that by myself.
As I've said, I could not use "split": not all the strings are exactly the same, and there are too many variations for "split" to handle.
I tried to split $1 but had no luck. I'm just wondering, is there any way I could split $1?
Omega, thank you very much for your help!
testerV
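On the question of splitting $1: a capture variable is an ordinary string until the next successful match overwrites it, so it can be handed to split, ideally after copying it into a named variable. The sketch below uses the sample record from the first post and shows why a plain whitespace split is fragile for this data: the brace groups get broken apart, so the field positions depend on how many words each group contains.

```perl
use strict;
use warnings;

# Sample record copied from the first post.
my $record = 'AlarmCode:105 {105 ALARM1601 1001 SC1_Sampler:TransferDHD 0 User Assist Other {} {scd1 UpperShuttle M-11} {} 0 0 1 0 1348117130 1348117987 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012} 857 0 0 0 -5.0 Alarm Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_M CDA2}';

my @naive;
if ($record =~ /\{(.*)\}/) {
    my $inner = $1;                # copy $1 before any other match runs
    @naive = split ' ', $inner;    # split on runs of whitespace
}

print "alarm code:   $naive[1]\n";     # still in a fixed position
print "field 9:      $naive[9]\n";     # only the first word of a brace group
print "field 19, 20: $naive[19] $naive[20]\n";   # the start date, cut in two
```

Here the three-word group {scd1 UpperShuttle M-11} pushes the start date to positions 19 and 20; a record with an empty {} in that place would put it somewhere else, which matches the "too many variations" problem described above.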

November 18th, 2012, 04:52 AM
Hi,
if you have variations in the input, you have to let us know what changes from one record to the next and what is invariant, i.e. what is always there. We can't guess it. A general principle in data munging is that the better you know the data, the better you can use it and extract relevant information from it. Here, we have only one example, so we can only say how to extract information from the sample you provided.
With this input:
Quote:
AlarmCode:105 {105 ALARM1601 1001 SC1_Sampler:TransferDHD 0 User Assist Other {} {scd1 UpperShuttle M-11} {} 0 0 1 0 1348117130 1348117987 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012} 857 0 0 0 -5.0 Alarm Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_M CDA2}
and this request:
Quote:
I need to extract:
ALARM1601 - the word "ALARM" followed by a 4-digit number; that is my error code
21:58:50 19-Sep-2012 - the time the alarm started
22:13:07 19-Sep-2012 - the time the alarm was cleared
I can suggest the following approach (assuming your input record is in the $line variable):
Perl Code:
my $date = qr/[ 0-9]\d:\d\d:\d\d [ 0-9]\d-\w{3}-\d{4}/;
# A match in list context returns the three captures (or an empty list on failure).
my ($alarm, $date1, $date2) = $line =~ /^AlarmCode.*(ALARM\d{4}).*\{($date)\} \{($date)\}/;
print "$alarm, $date1, $date2";
Which will print:
Code:
ALARM1601, 21:58:50 19-Sep-2012, 22:13:07 19-Sep-2012
Note that for the hour and for the day of the month, I used "[ 0-9]\d" just in case the hour or the day is written as one space and one digit (rather than 2 digits), as sometimes happens.
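To keep the three values around for later processing, the same regex can be wrapped in a small function and applied to each line. This is only a sketch: the name parse_alarm and the hash keys are illustrative, the @lines array stands in for reading the log file, and it assumes each record sits on a single line.

```perl
use strict;
use warnings;

my $date = qr/[ 0-9]\d:\d\d:\d\d [ 0-9]\d-\w{3}-\d{4}/;

# Hypothetical helper: returns (code, start, cleared), or an empty list
# when the line is not an alarm record.
sub parse_alarm {
    my ($line) = @_;
    return $line =~ /^AlarmCode.*(ALARM\d{4}).*\{($date)\} \{($date)\}/;
}

# Stand-in for the log file: two records from this thread and one unrelated line.
my @lines = (
    'AlarmCode:105 {105 ALARM1601 1001 SC1_Sampler:TransferDHD 0 User Assist Other {} {scd1 UpperShuttle M-11} {} 0 0 1 0 1348117130 1348117987 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012} 857 0 0 0 -5.0 Alarm Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_M CDA2}',
    'some unrelated log line',
    'AlarmCode:86 {86 ALARM2769 0 SC1_Cuter:Testsite 0 User Assist Other {} {Cannot reach TplSetPt too high by 17.38C,CurCtrlSe} {} 0 0 1 0 1348117937 1348117983 {22:12:17 19-Sep-2012} {22:13:03 19-Sep-2012} 46 106 0 0 -5.0 Unknown Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_U CDA3}',
);

my @alarms;
for my $line (@lines) {
    # Skip any line that yields no captures.
    my ($code, $start, $cleared) = parse_alarm($line) or next;
    push @alarms, { code => $code, start => $start, cleared => $cleared };
}

printf "%s: %s -> %s\n", @{$_}{qw(code start cleared)} for @alarms;
```

Each element of @alarms is a hash reference, so a value can be read later as, for example, $alarms[0]{start}.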
Last edited by Laurent_R : November 18th, 2012 at 04:57 AM.

November 18th, 2012, 05:51 PM
Hi Laurent,
Here are some more example strings, and there are many more.
Code:
AlarmCode:12 {12 ALARM1552 200 SC2_Cuter:TrayCoord 0 User Assist Other {} {} {} 0 0 1 0 1347967427 1347969367 {04:23:47 18-Sep-2012} {04:56:07 18-Sep-2012} 1970 0 0 0 95.0 Alarm Running 1 1 0 741 741 {} {No User Logged In} sptFG_U CDA3}
AlarmCode:85 {85 ALARM1502 1001 SC5_Cuter:TransferPnp 0 User Assist Other {} {B1 UpperTransferShuttle D1} {} 0 0 1 0 1348117130 1348117987 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012} 857 0 0 0 -5.0 Alarm Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_U CDA3}
AlarmCode:86 {86 ALARM2769 0 SC1_Cuter:Testsite 0 User Assist Other {} {Cannot reach TplSetPt too high by 17.38C,CurCtrlSe} {} 0 0 1 0 1348117937 1348117983 {22:12:17 19-Sep-2012} {22:13:03 19-Sep-2012} 46 106 0 0 -5.0 Unknown Running 1 1 0 3838 3838 {} {No User Logged In} sptFG_U CDA3}
I have found more than 40 variations already. I could use "split" with many "if" statements, but I could not be sure that the machines would not later print something I did not anticipate.
Thanks for your code, man!

November 19th, 2012, 01:15 AM
Hi,
looking at your additional data, the regular expression I suggested should work perfectly. Actually, what I have proposed is quite picky about what it matches and, unless your data is really very different from what you originally posted, you should not get false matches.
The regex suggested looks for:
- a string starting with AlarmCode
- the word ALARM followed by four digits
- a date (time and date) enclosed within { }
- another date enclosed within { }
In addition, the date regex is pretty finely defined (two digits, or one space and one digit, followed by : followed by 2 digits, followed by : followed by 2 digits, etc.). The chances of something other than a date matching this are close to 0.
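The four parts listed above can be written out with the /x modifier, which lets the pattern carry one comment per part. This is a sketch rather than a replacement for the one-line version; the sample lines are shortened, made-up records used only to show what is accepted and what is rejected.

```perl
use strict;
use warnings;

# Under /x, whitespace outside a character class is ignored,
# so the literal space is written as [ ].
my $date = qr/
    [ 0-9]\d : \d\d : \d\d      # hour (two digits, or a space and one digit), minutes, seconds
    [ ]                         # one literal space between time and date
    [ 0-9]\d - \w{3} - \d{4}    # day, three-letter month, four-digit year
/x;

my $alarm_re = qr/
    ^AlarmCode                  # a string starting with AlarmCode
    .* (ALARM\d{4})             # the word ALARM followed by four digits
    .* \{ ($date) \}            # a time and date enclosed in braces
    [ ] \{ ($date) \}           # another time and date enclosed in braces
/x;

# Hypothetical, shortened records (not taken from a real log).
my @samples = (
    'AlarmCode:105 {105 ALARM1601 1001 0 1348117130 1348117987 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012} 857}',
    'AlarmCode:12 {12 ALARM1552 200 { 4:23:47 18-Sep-2012} { 4:56:07 18-Sep-2012} 1970}',
    'AlarmCode:13 {13 ALARM1552 200 {04:23:47 18-Sep-2012} 1970}',                # only one date
    'Status:105 {105 ALARM1601 {21:58:50 19-Sep-2012} {22:13:07 19-Sep-2012}}',   # wrong prefix
);

# One array reference of captures per sample; empty when the line is rejected.
my @results = map { [ /$alarm_re/ ] } @samples;

for my $got (@results) {
    print @$got ? "match: @$got\n" : "no match\n";
}
```

The first two samples match (the second one with a space-padded hour), while the record with a single date and the line that does not start with AlarmCode are rejected.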

November 19th, 2012, 04:37 PM
Hi Laurent!
Yes, your regex works perfectly. I just wanted to say that I could not use "split" because there are too many variations in the strings. I could not write my own regex; you and Omega helped me with the code. Thank you very much (both of you) for the code!
testerV