Regex Programming
 
#1
March 7th, 2013, 06:55 AM
badger_fruit
REGEX to find specific HTML

Hello all
I have a small problem which I've been trying to figure out all day, so far to no avail.

I have a string of HTML from which I need to pick out all the A tags (and replace them with some text, but that's something for later on!).

I have tried

PHP Code:
<[A-Za-z0-9_\-='":/\.].*></a> 
but it selects everything up to the LAST </a> instead of the "next" one (I hope that makes sense!).

Please, can someone help me find the right regex to use?
Thanks a million in advance!!

Last edited by badger_fruit : March 7th, 2013 at 07:05 AM.

#2
March 7th, 2013, 10:35 AM
acray
I'm assuming you have multi-line/"dot matches new line" turned on...

Looks to be a problem with greedy vs. lazy.

And I'm not sure what's the point of
Code:
[A-Za-z0-9_\-='":/\.].*
because, apart from excluding a few special chars you wouldn't expect to be in an <a> tag anyway, it's basically just
Code:
..*


Try
Code:
<a[^>]*?></a>


Of course, that would only match an <a> tag with nothing between the <a> and </a>, so the following might be what you are shooting for
Code:
<a ([^>]*?)>([^<]*?)</a>


Back reference \1 would collect all the attributes of the tag and \2 would have the display text.
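
A minimal PHP sketch of the greedy vs. lazy difference described above. The sample string and variable names are made up for illustration, and `~` is used as the pattern delimiter purely so that the `/` in `</a>` does not need escaping (with `/` as delimiter it would have to be written `<\/a>`).

```php
<?php
// Hypothetical sample input: two links on one line.
$html = '<a href="/one">first</a> and <a href="/two">second</a>';

// Greedy: .* runs to the end of the string, then backtracks to the LAST </a>.
preg_match('~<a .*</a>~', $html, $greedy);

// Lazy: .*? grows one character at a time and stops at the NEXT </a>.
preg_match_all('~<a .*?</a>~', $html, $lazy);

// Two capture groups: [1] = attributes, [2] = link text (no nested tags).
preg_match_all('~<a ([^>]*?)>([^<]*?)</a>~', $html, $parts);

echo $greedy[0], "\n";
print_r($lazy[0]);
print_r($parts[1]);
print_r($parts[2]);
```

In PHP the captured groups end up in the match array (`$parts[1]`, `$parts[2]`) rather than in `\1` and `\2`. Note that `[^<]*?` stops at the first `<`, so the last pattern only fits links whose text contains no nested tags.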

#3
March 7th, 2013, 12:17 PM
badger_fruit
Quote:
Originally Posted by acray
Back reference \1 would collect all the attributes of the tag and \2 would have the display text.


Do what now?
Okay, sorry, I honestly have no idea about regex; all I know is that the code ...


PHP Code:
$text = "<p style=\"text-align: center;\">
    <span style=\"line-height: 1.538em;\">
        <img src=\"http://www.example.com/pics/ln/20130210/100213_2013_grammy_arrivals_9/katy-perry-55th-annual-grammy-awards_3495470.jpg\" alt=\"Katy Perry, Grammys Dress 2013\" title=\"Katy Perry, Grammys Dress 2013\" width=\"500\" height=\"749\" style=\"vertical-align: top; display: block; margin-left: auto; margin-right: auto;\" />
        <a href=\"http://www.example.com/pictures/katy_perry/1-1\">
            <span style=\"font-size: x-small;\">Katy Perry's Grammys Dress Caused Quite A Stir</span>
        </a>
    </span>
</p>
<p>It's safe to say Katy Perry stole the entire show at the Grammy awards on Sunday evening (February 10, 2013) in a mint green Gucci dress that featured a rather revealing keyhole cut-out. It quickly caused excitement on Twitter, with fans of the star lauding her daring choice of dress. It wasn't just her followers who noticed either, with various amusing photographs of celebrities in awe of Perry's, ugh, assets, circulating online today.</p><p>Elton John, of all people, was caught out cheekily eyeing up Perry's dress, while television star Ellen DeGeneres made a joke of the elephant in the room, staring intently at Perry's chest as her girlfriend Portia de Rossi looked on. \"I was inspired by Priscilla Presley in the Seventies... Married to Elvis Presley, of course,\" Perry told Ryan Seacrest of her Grammys dress. Seacrest himself joked that he had luckily had plenty of practice at staying focused when at eye level. One man who did manage to keep his eyes on the ball was Perry's boyfriend John Mayer, who was photographed staring straight into Katy's eyes when the pair were snapped taking their seats. The 35-year-old blues guitarist admitted that he's been thinking about marrying Perry, 28, sometime in the future. When asked whether a wedding would be a possibility, Mayer said, \"Of course. I mean, I'm still the kid from Connecticut. That's what you do,\" according to the Daily Mail.</p>
<p>
    <img src=\"http://www.example.com/pics/mn/20130210/100213_2013_grammy_arrivals_9/katy-perry-55th-annual-grammy-awards_3495445.jpg\" alt=\"Katy Perry, Grammys Dress 2013\" title=\"Katy Perry, Grammys Dress 2013\" width=\"300\" height=\"560\" style=\"vertical-align: top; margin-left: 12px; margin-right: 12px;\" />
    <img src=\"http://www.example.com/pics/mn/20130210/100213_2013_grammy_arrivals_9/katy-perry-55th-annual-grammy-awards_3495475.jpg\" alt=\"Katy Perry, Grammys Dress, 2013\" title=\"Katy Perry, Grammys Dress, 2013\" width=\"297\" height=\"560\" style=\"vertical-align: top;\" /></p><p style=\"text-align: center;\">
    <a href=\"http://www.example.com/pictures/katy_perry/1-1\">
        <span style=\"font-size: x-small;\">Katy Perry Ignored The Grammys' Dress Code Memo,&nbsp;Though She Looked Smoldering In Her Gucci Dress</span>
    </a>
</p>
"
;

preg_match_all('/<a [^<>]+>(.*?)/i', $text, $matches_a);
echo "A: ";
print_r($matches_a);

preg_match_all('/<a ([^>]*?)>([^<]*?)</a>/i', $text, $matches_b);
echo "B: ";
print_r($matches_b);


Gives me ...

Quote:
A: Array
(
[0] => Array
(
[0] => <a href="http://www.example.com/pictures/katy_perry/1-1">
[1] => <a href="http://www.example.com/pictures/katy_perry/1-1">
)

[1] => Array
(
[0] =>
[1] =>
)

)
B:


I think I expected to see something like this :-
EDIT: What I mean is 'What I WANT to see is this ... '

Quote:
A: Array
(
[0] => Array
(
[0] => <a href="http://www.example.com/pictures/katy_perry/1-1">
[1] => <a href="http://www.example.com/pictures/katy_perry/1-1">
)

[1] => Array
(
[0] => <span style=\"font-size: x-small;\">Katy Perry's Grammys Dress Caused Quite A Stir</span>
[1] => <span style=\"font-size: x-small;\">Katy Perry Ignored The Grammys' Dress Code Memo,&nbsp;Though She Looked Smoldering In Her Gucci Dress</span>
)

)
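
A possible explanation for results A and B above, sketched against a shortened, hypothetical stand-in for `$text`. In A, nothing follows the lazy group, so it is satisfied with zero characters. In B, the unescaped `/` in `</a>` ends the pattern early, so the call fails outright; even with the slash escaped, `[^<]*?` cannot get past the nested `<span>`.

```php
<?php
// Hypothetical, shortened stand-in for the $text in the post above.
$text = "<a href=\"http://www.example.com/pictures/katy_perry/1-1\">\n"
      . "    <span style=\"font-size: x-small;\">Caption</span>\n"
      . "</a>";

// A: nothing follows the lazy group, so it is satisfied with zero characters.
preg_match_all('/<a [^<>]+>(.*?)/i', $text, $a);

// B as posted: the "/" in "</a>" closes the pattern early, so PHP reads
// "a>/i" as the modifier list and the call fails (@ hides the warning here).
$b_posted = @preg_match_all('/<a ([^>]*?)>([^<]*?)</a>/i', $text, $unused);

// B with the slash escaped compiles, but [^<]*? cannot cross the nested <span>.
$b_escaped = preg_match_all('/<a ([^>]*?)>([^<]*?)<\/a>/i', $text, $b);

var_dump($a[1][0] === '', $b_posted, $b_escaped);
```

So the empty "B:" is most likely a failed call rather than "no links found", which is worth checking for with `=== false` when testing patterns.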

Last edited by badger_fruit : March 7th, 2013 at 12:33 PM. Reason: clarification on requirements

#4
March 7th, 2013, 02:39 PM
badger_fruit
Well, after a full day of searching, testing and banging my head off of walls/floors/tables/chairs/cats/dogs, I *think* I have found a working solution ...

PHP Code:
preg_match_all('/<a\s[^>]*href=\"([^\"]*)\"[^>]*>(.*)<\/a>/siU', $text, $matches_a, PREG_PATTERN_ORDER);


Gives me ...

Quote:
A: Array
(
[0] => Array
(
[0] => <a href="http://www.example.com/pictures/katy_perry/1-1"><span style="font-size: x-small;">Katy Perry's Grammys Dress Caused Quite A Stir</span></a>
[1] => <a href="http://www.example.com/pictures/katy_perry/1-1"><span style="font-size: x-small;">Katy Perry Ignored The Grammys' Dress Code Memo,&nbsp;Though She Looked Smoldering In Her Gucci Dress</span></a>
)

[1] => Array
(
[0] => http://www.example.com/pictures/katy_perry/1-1
[1] => http://www.example.com/pictures/katy_perry/1-1
)

[2] => Array
(
[0] => <span style="font-size: x-small;">Katy Perry's Grammys Dress Caused Quite A Stir</span>
[1] => <span style="font-size: x-small;">Katy Perry Ignored The Grammys' Dress Code Memo,&nbsp;Though She Looked Smoldering In Her Gucci Dress</span>
)

)


Wooop woop
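
For the "replace with some text" step mentioned in the first post, one possible sketch uses `preg_replace_callback` with the same pattern. The sample `$text` is a shortened stand-in, the `[link: ...]` output format is only an illustration, and `~` is swapped in as the delimiter so the slash needs no escaping.

```php
<?php
// Hypothetical sample; the real $text from the earlier post works the same way.
$text = "<p><a href=\"http://www.example.com/pictures/katy_perry/1-1\">\n"
      . "  <span>Caption</span>\n"
      . "</a></p>";

// s: "." also matches newlines; i: case-insensitive; U: quantifiers are lazy
// by default. "~" as delimiter avoids escaping the "/" in "</a>".
$pattern = '~<a\s[^>]*href="([^"]*)"[^>]*>(.*)</a>~siU';

// Replace each link with placeholder text built from its URL and inner text.
// The "[link: ...]" format is only an illustration.
$replaced = preg_replace_callback($pattern, function (array $m): string {
    return '[link: ' . trim(strip_tags($m[2])) . ' -> ' . $m[1] . ']';
}, $text);

echo $replaced, "\n";
```

The callback receives the same groups as the match array above: `$m[1]` is the href and `$m[2]` is everything between the opening and closing tag.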

#5
March 7th, 2013, 04:06 PM
acray
Trying to read a regexp still makes my eyes bleed--not to mention all the extra control stuff when doing it in something like PHP...

So since you think you got it working, I only glanced at what you came up with.

One thing that jumped out at me was the lazy vs greedy thing I mentioned earlier. Unless you used something to change the default operation of * from greedy to lazy, you could run into problems with different data sources.

But take that with a grain of salt; doing anything significant with regexp involves lots of banged heads for me too. If it seems to be working...

In any case, I hope I was able to provide some direction.
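
On the greedy vs. lazy point: in PHP's PCRE functions the `U` modifier (as used in the `/siU` pattern earlier in the thread) is the thing that changes the default, making quantifiers lazy unless they are followed by `?`. A small sketch with a made-up two-link string:

```php
<?php
// Hypothetical two-link input.
$html = '<a href="/one">first</a><a href="/two">second</a>';

// With U, a plain .* is lazy ...
preg_match_all('~<a\s[^>]*>(.*)</a>~sU', $html, $with_u);

// ... which behaves like writing .*? without U.
preg_match_all('~<a\s[^>]*>(.*?)</a>~s', $html, $with_q);

// Caution: under U, adding "?" flips the quantifier BACK to greedy.
preg_match_all('~<a\s[^>]*>(.*?)</a>~sU', $html, $flipped);

print_r($with_u[1]);
print_r($with_q[1]);
print_r($flipped[1]);
```

The first two calls each capture the two link texts separately; the third swallows everything from the first link's text to the last `</a>`, which is the original problem all over again. Mixing `U` with explicit `?` is therefore easy to get wrong, and spelling out `.*?` without `U` may be the less surprising choice.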

© 2003-2013 by Developer Shed. All rights reserved.