Regex Programming
 
#1
April 23rd, 2012, 11:11 PM
jusserfinn
PHP - Getting too many matches with my regex?

I'm fairly new to PHP and regexes. I am trying to match the pattern word-word (two or more words joined by hyphens) in each line of a file. My code:
Code:
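<?php

// Open the input file; stop if it cannot be read.
$fIn = fopen( 'matchtext.dat', 'r' );
if( ! $fIn )
     die( "Couldn't open the source file for reading<br/>" );

// Check each line for a hyphenated word (word-word, word-word-word, ...).
while( ( $line = fgets( $fIn ) ) !== false ) {
     if( preg_match( '|\b([a-zA-Z]+-+)+[a-zA-Z]+|', $line, $matches ) ) {
          echo "<pre>"; print_r( $matches ); echo "</pre>";
     }
}

fclose( $fIn );

?>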


It returns an additional array entry, which is the next-to-last word of the match together with its trailing hyphen:
Array
(
[0] => learn-english-today
[1] => english-
)
Array
(
[0] => that-than-which-nothing-greater-can-be-though
[1] => be-
)
Array
(
[0] => that-than-which-nothing-greater-can-be-thought
[1] => be-
)

Where did I go wrong? I only want the first!

#2
April 23rd, 2012, 11:50 PM
E-Oreo
Quote:
... $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

Source: PHP manual, preg_match

([a-zA-Z]+-+) is a parenthesized subpattern. However, the parentheses are an essential part of your regex, so you cannot simply remove them. If you only want the first element in the array, use $matches[0] and ignore the second.
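As a side note, if the extra array entry itself is unwanted, the group can be kept for the repetition but made non-capturing with `(?:...)`. A minimal sketch, assuming the same pattern as in the question; the sample string is taken from the output posted there:

```php
<?php
// Same pattern as in the question, but the group is non-capturing: (?: ... ).
// It still groups "word-" for the + quantifier, yet adds no entry to $matches.
$pattern = '|\b(?:[a-zA-Z]+-+)+[a-zA-Z]+|';

$found = preg_match($pattern, 'learn-english-today', $matches);

print_r($matches);
// Array
// (
//     [0] => learn-english-today
// )
```

With no capturing groups in the pattern, `$matches` holds only the full match at index 0.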

#3
April 24th, 2012, 12:09 AM
jusserfinn
Quote:
Originally Posted by E-Oreo
If you only want the first element in the array then only use the first and just ignore the second.


Thanks.
Do you mean the "[a-zA-Z]+|"? If so, that won't work, because the match would then end with a trailing hyphen (that-than-which-nothing-greater-can-be-).

The second part is needed in this case to find the final word after the last hyphen.

#4
April 25th, 2012, 02:21 AM
abareplace
Hi, jusserfinn,

The subexpression ([a-zA-Z]+-+) is matched several times, because the group is followed by +. For example, in "learn-english-today" it first captures "learn-", then "english-". Only the last captured string ("english-") is saved in $matches[1].
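To see this behaviour in isolation, here is a small sketch (the variable names are illustrative only). It runs the original pattern on one of the sample strings and then, as one possible way to recover every word, splits the full match on hyphens with `explode`:

```php
<?php
// Original pattern from the question: the capturing group is repeated by "+".
$pattern = '|\b([a-zA-Z]+-+)+[a-zA-Z]+|';

preg_match($pattern, 'learn-english-today', $matches);

// The group matched "learn-" and then "english-", but only the last
// iteration is kept in $matches[1].
echo $matches[0], "\n"; // learn-english-today
echo $matches[1], "\n"; // english-

// One way to get every word: split the full match on the hyphens.
// Assumes single hyphens between words; "-+" in the pattern also allows
// runs of hyphens, which would produce empty strings here.
$words = explode('-', $matches[0]);
print_r($words); // learn, english, today
```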

If you want to capture only the first word, use:

Code:
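<?php

// ([a-zA-Z]+) captures the first word; (?:...) groups the rest without capturing it.
if( preg_match( '|\b([a-zA-Z]+)(?:-[a-zA-Z]+)+|', 'learn-english-today', $matches ) ) {
     // $matches[0] is the full match, $matches[1] is the first word ("learn").
     echo "<pre>"; print_r( $matches ); echo "</pre>";
}

?>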

#5
April 25th, 2012, 04:40 PM
jusserfinn
Ok, I know there is something to be said about looking up your question in the manual (which I did), but loads more should be said about checking it twice!

PHP.net says:

Code:
int preg_match ( string $pattern , string $subject [, array &$matches [, int $flags = 0 [, int $offset = 0 ]]] )

"If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on."
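To make the indexing concrete, here is a small sketch with two capturing groups. The pattern and the sample string are made up for illustration and do not come from the thread:

```php
<?php
// Two capturing groups: the word before the hyphen and the word after it.
$pattern = '/([a-zA-Z]+)-([a-zA-Z]+)/';

if (preg_match($pattern, 'learn-english', $matches)) {
    echo $matches[0], "\n"; // full match:   learn-english
    echo $matches[1], "\n"; // first group:  learn
    echo $matches[2], "\n"; // second group: english
}
```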

So simple, I must have skimmed over it.

I fixed my code by adding [0] to $matches in the print_r call:

Code:
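<?php

// Open the input file; stop if it cannot be read.
$fIn = fopen( 'matchtext.dat', 'r' );
if( ! $fIn )
     die( "Couldn't open the source file for reading<br/>" );

while( ( $line = fgets( $fIn ) ) !== false ) {
     if( preg_match( '|\b([a-zA-Z]+-+)+[a-zA-Z]+|', $line, $matches ) ) {
          // $matches[0] is the full match; the captured group in [1] is ignored.
          echo "<pre>"; print_r( $matches[0] ); echo "</pre>";
     }
}

fclose( $fIn );

?>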
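One caveat with the loop above: `preg_match` stops at the first match, so a line containing several hyphenated words reports only the first one. If every occurrence per line is wanted, `preg_match_all` is one option. A sketch under that assumption; the function name `findHyphenatedWords` and the sample line are hypothetical:

```php
<?php
/**
 * Return every hyphenated word (word-word, word-word-word, ...) in a line.
 * Hypothetical helper for illustration; not part of the original code.
 */
function findHyphenatedWords(string $line): array
{
    // Non-capturing group, so each result is just the full match.
    preg_match_all('|\b(?:[a-zA-Z]+-+)+[a-zA-Z]+|', $line, $matches);

    // With the default PREG_PATTERN_ORDER flag, $matches[0] lists all full matches.
    return $matches[0];
}

print_r(findHyphenatedWords('learn-english-today is a well-known site'));
```

In the file-reading loop, the call would take `$line` in place of the sample string.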


Thanks for trying though!

#6
April 25th, 2012, 05:34 PM
E-Oreo
Quote:
So simple, I must have skimmed over it.

Twice actually. Since I quoted that exact line and linked you to the manual page in my first post.

#7
April 25th, 2012, 05:53 PM
jusserfinn
Quote:
Originally Posted by E-Oreo
Twice actually. Since I quoted that exact line and linked you to the manual page in my first post.


Lol, I didn't even look at the quoted part; I thought you were quoting me. I apologize.
