Negative Lookahead problems

July 30th, 2009, 03:29 PM
Registered User
|
|
Join Date: Jun 2009
Posts: 11
Time spent in forums: 6 h 29 m 52 sec
Reputation Power: 0
I am currently writing a program to remove certain sections from a very large log file. I have experimented with lookaheads, lookbehinds, and a bunch of other things, but none of them seem to work for this.
Here is an example of what I need.
1 'some text here' FieldID: 'some text here'
2 'Thousands of lines of text here'
3 'some text here' TxSuccess: True
4 'Another couple thousand lines'
5 'some text' FieldID: 'some text'
6 'many lines'
7 'some text' TxSuccess: False
8 'many lines'
9 'some text' FieldID: 'some text'
10 'many lines'
11 'some text' TxSuccess: True
12 'many lines'
13 'some text' FieldID: 'some text'
14 'many lines'
15 'some text' TxSuccess: False
I need it to match everything from "FieldID" to "TxSuccess: False", so in this example it should match lines 5 to 7 and lines 13 to 15 without matching any other lines.
The problem with most of the regexes I've tried is that they start the match at the first "FieldID" encountered, like this extremely obvious one:
FieldID:\s.*(\r|\n|.*?)*?TxSuccess:\sFALSE
I also tried using lookahead and lookbehind, but none of my attempts matched what I needed without also matching other lines.
I'm fairly new to regex, so there might be a few concepts that I haven't even tried yet.
Thanks in advance,
Christopher Wilson

July 30th, 2009, 03:38 PM
Try this:
Code:
(?s)FieldID(?:(?!FieldID).)*TxSuccess:\sFalse
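For anyone following along, here is a minimal sketch of why a plain lazy pattern overshoots and the pattern above does not. It uses Python's `re` module purely as an illustration (an assumption; the thread does not say which language the original program is written in), and the shortened sample text and variable names are made up for the example.

```python
import re

# Shortened, made-up sample in the shape of the original post.
text = """1 'a' FieldID: 'x'
2 'lines'
3 'a' TxSuccess: True
4 'lines'
5 'a' FieldID: 'y'
6 'lines'
7 'a' TxSuccess: False
"""

# Naive pattern: lazy matching still starts at the FIRST "FieldID" and
# expands until it reaches "TxSuccess: False", swallowing lines 1-4 too.
naive = re.compile(r"FieldID.*?TxSuccess:\sFalse", re.DOTALL)

# Tempered pattern from the reply: each "." is consumed only if "FieldID"
# does not start at that position, so a match cannot span a second FieldID.
tempered = re.compile(r"(?s)FieldID(?:(?!FieldID).)*TxSuccess:\sFalse")

naive_match = naive.search(text).group(0)
tempered_match = tempered.search(text).group(0)

print(naive_match.count("FieldID"))     # 2: the match spans two blocks
print(tempered_match.count("FieldID"))  # 1: only the failing block
```

The lazy quantifier only decides where a match ends; it has no say in where the match starts, which is why the lookahead is needed.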

July 30th, 2009, 03:41 PM
Quote: Originally Posted by prometheuzz
Try this:
Code:
(?s)FieldID(?:(?!FieldID).)*TxSuccess:\sFalse

According to RegexBuddy, that doesn't match any text in the log. However, the (?s) is a useful little trick that I didn't know about, thanks.

July 30th, 2009, 03:52 PM
Quote: Originally Posted by cwilson
According to RegexBuddy, that doesn't match any text in the log. However, the (?s) is a useful little trick that I didn't know about, thanks.

Then you're probably not using RegexBuddy correctly because the regex I posted is PCRE-all-the-way!
Let me demonstrate by posting an example. If you execute this PHP script:
PHP Code:
<?php
$text = "1 'some text here' FieldID: 'some text here'
2 'Thousands of lines of text here'
3 'some text here' TxSuccess: True
4 'Another couple thousand lines'
5 'some text' FieldID: 'some text'
6 'many lines'
7 'some text' TxSuccess: False
8 'many lines'
9 'some text' FieldID: 'some text'
10 'many lines'
11 'some text' TxSuccess: True
12 'many lines'
13 'some text' FieldID: 'some text'
14 'many lines'
15 'some text' TxSuccess: False";
// (?s) lets "." match newlines; (?!FieldID) keeps a match from running past a later "FieldID".
preg_match_all('/(?s)FieldID(?:(?!FieldID).)*TxSuccess:\sFalse/', $text, $matches);
print_r($matches);
it will produce the following output:
Code:
Array
(
    [0] => Array
        (
            [0] => FieldID: 'some text'
6 'many lines'
7 'some text' TxSuccess: False
            [1] => FieldID: 'some text'
14 'many lines'
15 'some text' TxSuccess: False
        )

)
which is exactly what you said you want to match.

July 30th, 2009, 04:05 PM
OK, I tried the actual example I put into this thread and it worked perfectly. However, the log is much, much more complex; I will put a few lines of it in here for you to try.
2009/06/18 10:40:44:421 ThreadID = 1836 INFO PORTALIMAGING 60 BeginBeamDelivery() - Data : FieldID: 3-1
2009/06/18 10:47:20:546 ThreadID = 1836 INFO PORTALIMAGING 60 EndBeamDelivery() - Data : TxSuccess: TRUE
2009/06/18 10:40:44:421 ThreadID = 1836 INFO PORTALIMAGING 60 BeginBeamDelivery() - Data : FieldID: 3-1
2009/06/18 10:47:20:546 ThreadID = 1836 INFO PORTALIMAGING 60 EndBeamDelivery() - Data : TxSuccess: FALSE
2009/06/18 10:40:44:421 ThreadID = 1836 INFO PORTALIMAGING 60 BeginBeamDelivery() - Data : FieldID: 3-1
2009/06/18 10:47:20:546 ThreadID = 1836 INFO PORTALIMAGING 60 EndBeamDelivery() - Data : TxSuccess: TRUE
2009/06/18 10:40:44:421 ThreadID = 1836 INFO PORTALIMAGING 60 BeginBeamDelivery() - Data : FieldID: 3-1
2009/06/18 10:47:20:546 ThreadID = 1836 INFO PORTALIMAGING 60 EndBeamDelivery() - Data : TxSuccess: FALSE
This is a slightly more accurate example, even if it is missing many thousands of lines.

July 30th, 2009, 04:08 PM
Quote: Originally Posted by cwilson
Ok, I tried the actual example I put into this thread and it worked perfectly,

Yes, I knew that.
Quote: Originally Posted by cwilson
however the log is much, much more complex; I will put a few lines of it in here for you to try.
...

In the example from your original post, you mentioned "False" but now you wrote "FALSE".
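The mismatch matters because regex matching is case-sensitive unless told otherwise. Below is a small sketch, again in Python as an assumed stand-in, using two lines from the log excerpt above with the timestamp and thread prefix trimmed off. It shows two possible fixes: spell the literal the way the log does, or add the inline (?i) flag.

```python
import re

# Two lines from the log excerpt, trimmed to the relevant tail for brevity.
log = (
    "BeginBeamDelivery() - Data : FieldID: 3-1\n"
    "EndBeamDelivery() - Data : TxSuccess: FALSE\n"
)

# Case-sensitive pattern with "False": finds nothing in this log.
as_posted = re.findall(r"(?s)FieldID(?:(?!FieldID).)*TxSuccess:\sFalse", log)

# Same pattern with the literal spelled as the log spells it.
upper = re.findall(r"(?s)FieldID(?:(?!FieldID).)*TxSuccess:\sFALSE", log)

# Or keep "False" and make the whole pattern case-insensitive with (?i).
# Note that this also relaxes the case of "FieldID" and "TxSuccess".
insensitive = re.findall(r"(?si)FieldID(?:(?!FieldID).)*TxSuccess:\sFalse", log)

print(len(as_posted), len(upper), len(insensitive))  # 0 1 1
```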

July 30th, 2009, 04:42 PM
Hahaha, if that's all it was, I feel like an idiot. Thank you so much.
And by the way, how efficiently will this handle a 20+ megabyte file?

July 30th, 2009, 04:46 PM
Quote: Originally Posted by cwilson
Hahaha if thats all it was I feel like an idiot, thank you so much.
And by the way, how efficiently will this handle a 20+ megabyte file?

It all depends on how much text there is between "FieldID" and "TxSuccess: FALSE". But there are (of course) more efficient ways to find the text you're interested in without loading 20 MB or more of text into memory.
Last edited by prometheuzz : July 30th, 2009 at 04:51 PM.
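One such approach, sketched here as an assumption and not necessarily what was meant above: read the log line by line, start a fresh buffer at every line containing FieldID, and keep the buffer only when the block ends in TxSuccess: FALSE. Memory use is then bounded by the longest block, not by the file size. Unlike the regex, this sketch also drops a block as soon as TxSuccess: TRUE appears, which assumes every FieldID line is closed by exactly one TxSuccess line. The function name `failed_blocks` is hypothetical, and Python is used only for illustration.

```python
from typing import Iterable, Iterator, List


def failed_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each run of lines from a FieldID line to a TxSuccess: FALSE line."""
    block: List[str] = []
    inside = False
    for line in lines:
        if "FieldID:" in line:
            # A new FieldID restarts the block, like (?!FieldID) in the regex.
            block = [line]
            inside = True
        elif inside:
            block.append(line)
            if "TxSuccess: FALSE" in line:
                yield block
                inside = False
            elif "TxSuccess: TRUE" in line:
                # Successful delivery: discard the buffered lines.
                inside = False
                block = []


# Made-up sample in the shape of the log excerpt (prefixes trimmed).
sample = [
    "BeginBeamDelivery() - Data : FieldID: 3-1\n",
    "EndBeamDelivery() - Data : TxSuccess: TRUE\n",
    "BeginBeamDelivery() - Data : FieldID: 3-1\n",
    "some other line\n",
    "EndBeamDelivery() - Data : TxSuccess: FALSE\n",
]
blocks = list(failed_blocks(sample))
print(len(blocks), len(blocks[0]))  # 1 3
```

Because an open file object yields lines lazily, passing one to `failed_blocks` in place of `sample` would process a large log without reading it all into memory at once.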

July 30th, 2009, 05:00 PM
There are usually only about 100 lines between FieldID and TxSuccess: FALSE, since TxSuccess: FALSE means that the accelerator beam failed to start correctly and must start over.
My program currently extracts data from the log with a run time of about 2 minutes. I am no longer on my work computer, so I have no means to test it, but either way I know how to proceed from here.
If it's fairly efficient, an extra 30 seconds or less of run time shouldn't be a problem, since the program will be run while performing another task.
However, if it is like most lookarounds I've tried, it could take significantly longer than that, in which case I will have to run the logs through PowerGREP to make them a more manageable size before starting to gather data.
Thank you very much for your help!
Christopher Wilson
Radiation Oncology - Physics
Helen F. Graham Cancer Center
4701 Ogletown-Stanton Road
Newark, DE 19713
302-623-4500

August 1st, 2009, 01:15 AM
Quote: Originally Posted by cwilson
...
Thank you very much for your help!

You're welcome Christopher!