Regex Programming
 
#1
PigeonMarine, January 15th, 2009, 02:30 AM
Grab HTML from Source code pasted into Text Field

I posted this question under the PHP forum, and to reduce the risk of spamming the boards I am going to just link this one to that post.

Link to PHP Board Post

Can anyone point me in the right direction on this?

Thanks

#2
requinix, January 15th, 2009, 03:16 AM
Hey, you look familiar. Have we met? Sorry to be a pain but I'm trying to get people to post their regex questions in the right forum.

PHP Code:
$before = "The status of the <B>[^<]+</B> of <B>[^<]+</B>\.<P>";
$after = '<TR><TD>Net Income</TD><TD ALIGN=RIGHT>\$[\d,]+</TD></TR>';
preg_match("!$before(.*?)$after!is", $text, $matches);
// $matches[1] is the text in between

$before = "<TABLE>
<TR><TH COLSPAN=2 BGCOLOR=#000040>Land Distribution</TH></TR>";
$after = "<TR><TD>SDI</TD><TD align=right>[\d,]+</TD></TR></TABLE></TD></TR>
</TABLE>";
preg_match("!$before(.*?)$after!is", $text, $matches);
// again, $matches[1] is the text in between

I think that should suffice.

#3
PigeonMarine, January 15th, 2009, 03:33 AM
A little confused here: how does this code work? I would like to understand it instead of just asking for someone to do it for me.

I don't understand how this will print or echo back to the user.

I tried using the code snippet you provided and I get nothing back.

I would really like to understand how this is supposed to work.

Thanks for the reply.

#4
prometheuzz, January 15th, 2009, 09:02 AM
Quote:
Originally Posted by requinix
...
I think that should suffice.


Since that input is coming from a text field filled by a user, I highly doubt that it will always be the same. So you won't have a fixed $before and $after string.

#5
prometheuzz, January 15th, 2009, 09:05 AM
Quote:
Originally Posted by PigeonMarine
I posted this question under the PHP forum, and to reduce the risk of spamming the boards I am going to just link this one to that post.

Link to PHP Board Post

Can anyone point me in the right direction on this?

Thanks


Before I can answer your question (with an explanation), could you explain the rules for the strings you want to preserve or, the other way around, the rules for the strings that should be removed? You gave just a single source and said "I want this and that to be preserved", but what about other forms of input?

Before you can tell the regex engine what should and should not be removed, you need to spell it out in great detail here.

Good luck.

#6
requinix, January 15th, 2009, 01:59 PM
Quote:
Originally Posted by prometheuzz
Since that input is coming from a text field filled by a user, I highly doubt that it will always be the same. So you won't have a fixed $before and $after string.

Right. Which is why I looked at the two strings and guessed what parts of them would change and what would not.
If I missed something he would likely mention it.

Quote:
Originally Posted by PigeonMarine
A little confused here: how does this code work? I would like to understand it instead of just asking for someone to do it for me.

I don't understand how this will print or echo back to the user.

I tried using the code snippet you provided and I get nothing back.

I would really like to understand how this is supposed to work.

Thanks for the reply.

It stuffs that "everything from... to..." into two variables. You're supposed to... I don't know, all you said was that you wanted what was in between them.
I assumed you were going to do something else, like use an HTML parser, or maybe just print out the stuff literally. You didn't really say what you were going to do next and I didn't ask.

Literally, all that code does is search for the $before string (which is generalized a bit) and the $after string (also generalized) and get everything in between them. That's it.
In both cases, $matches[1] contains the text. If you want to do something then you use that. Keep in mind that it contains HTML so if you simply echo/print it out then you'll get (invalid) HTML-formatted text.
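
To make the point about $matches[1] concrete, here is a minimal sketch of running the first pattern and printing the result. The $text value is a made-up sample (the real page source is not shown in this thread), and $before is written with single quotes here purely for readability; treat it as an illustration under those assumptions rather than a finished solution.

```php
<?php
// Hypothetical sample input; in practice $text would hold the pasted page source.
$text = 'The status of the <B>Republic</B> of <B>Canyon Land (#750)</B>.<P>'
      . '<TABLE><TR><TD>Rank</TD><TD ALIGN=RIGHT>38</TD></TR>'
      . '<TR><TD>Net Income</TD><TD ALIGN=RIGHT>$5,956,807</TD></TR></TABLE>';

// Same patterns as the first snippet in post #2.
$before = 'The status of the <B>[^<]+</B> of <B>[^<]+</B>\.<P>';
$after  = '<TR><TD>Net Income</TD><TD ALIGN=RIGHT>\$[\d,]+</TD></TR>';

// preg_match() returns 1 on a match, 0 on no match and false on error.
$found = preg_match("!$before(.*?)$after!is", $text, $matches);
$between = ($found === 1) ? $matches[1] : null;

if ($between === null) {
    // Nothing is printed by preg_match() itself, so a silent page usually means no match.
    echo "No match: check that \$text really contains the page source.\n";
} else {
    // Escaped so the markup is shown as text; echo $between directly to render it as HTML.
    echo htmlspecialchars($between), "\n";
}
```

If this prints "No match", the pasted source probably differs from the sample the patterns were generalized from, which is where the questions about the input rules come in.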

If you need an explanation or tutorial on regular expressions then check the sticky here: it has a bunch of links you should look at.


If it's not working then
Quote:
Originally Posted by prometheuzz
could you explain the rules for the strings you want to preserve or, the other way around, the rules for the strings that should be removed? You gave just a single source and said "I want this and that to be preserved", but what about other forms of input?

Before you can tell the regex engine what should and should not be removed, you need to spell it out in great detail here.

#7
prometheuzz, January 15th, 2009, 02:07 PM
Quote:
Originally Posted by requinix
Right. Which is why I looked at the two strings and guessed what parts of them would change and what would not.
...


Aha, I should have read your post with more attention: I missed that completely! Sorry.

#8
PigeonMarine, January 19th, 2009, 12:17 AM
The data I posted in the other thread is a page called advisor in an online game.

This page is the same for all users, except the data in the table and the main Title.

Everything else is the same.

An example of what I am trying to do is at this link.
Code:
http://evolution2025.com/qzStatusTidy.php


Copy the code (source code) I posted before, paste it into that page, check the preview table box, and click the button.

This will show you what I am trying to learn how to do.

I wish I could explain more, but like I said, I am trying to learn how to do this, but not sure where to start.

Thanks

#9
PigeonMarine, January 19th, 2009, 10:57 AM
I've been messing around with this and got it to return the $before lines of each block of code, but it will not return the other data or the $after lines.

Also it is not stripping the javascript or other tags out, just clearing all white space from the code.

#10
printf, January 19th, 2009, 06:39 PM
Why not just put it into an array? Then you can format it any way you want. This is an old script that still works...


PHP Code:
<?php

// $html would be the Status Report you want to process!

// sub string from where we need our data to where our data ends...

// start
$html = substr ( $html, stripos ( $html, 'the status' ) );

// end
$html = substr ( $html, 0, strripos ( $html, '<br><br>' ) );

// setup the html... (get it ready to convert);
$regex = array ( '#<th.*>#Uis', '#<tr.*>#Uis',
                 '#<td.*>#Uis', '#<\/?table.*>#Uis',
                 '#<\/th.*><\/tr.*>#Uis', '#&nbsp;#Uis',
                 '#<\/tr.*>#Uis', '#<\/td.*>#Uis' );

$replace = array ( '<th>', '<tr>',
                   '<td>', '',
                   '', '',
                   '</tr>', '</td>' );

$html = preg_replace ( $regex, $replace, $html );

// split the data up starting at each title element (header)
$parts = explode ( '<tr><th>', $html );

// set our output container
$out = array ();

// set our main header text
$out['header'] = strip_tags ( $parts[0] );

// remove $parts[0] = (our header text) and reset the $parts array!
array_shift ( $parts );

// now build our data array
foreach ( $parts AS $data )
{
    // split the data fields (IE: <td>name</td>||<td>value</td>)
    $data = str_replace ( '</td><td>', '</td>||<td>', $data );

    // create a new data array for each new (<tr><td>)
    $data = explode ( '<tr><td>', $data );

    // the first element $data[0] is always the header for this data block
    $header = trim ( array_shift ( $data ) );

    // go through each <TR> tag set (IE: <tr><td>? = name</td><td>? = value</td></tr>)
    foreach ( $data AS $item )
    {
        // get the name value pairs
        list ( $name, $value ) = array_map ( 'trim', explode ( '</td>||<td>', substr ( $item, 0, strpos ( $item, '</td></tr>' ) ) ) );

        // there is one case where the structure may cause a false positive, so we catch it here...
        if ( ! empty ( $name ) && ! empty ( $value ) )
        {
            $out[$header][$name] = $value;
        }
    }
}

// print out the result array...
print_r ( $out );

?>


It will output this...

Code:
Array
(
    [header] => The status of the Republic of Canyon Land (#750).



    [The Basics] => Array
        (
            [Turns Left] => 10
            [Turns Taken] => 2938
            [Rank] => 38
            [Networth] => $18,143,148
        )

    [Current Status] => Array
        (
            [Money] => $238,681,220
            [Population] => 501,921
            [Land] => 18865 Acres
            [Food] => 2,165,120 bushels
            [Production] => 7 bushels
            [Consumption] => 23,525 bushels
            [Net Change] => -23,518 bushels
            [Oil] => 1,140,920 barrels
        )

    [Economics] => Array
        (
            [Tax Revenues] => $10,375,953
            [Tax Rate] => 35%
            [Per Capita Income] => $59.06
            [Expenses] => $4,419,146
            [Military] => $4,112,567
            [Alliance/GDI] => $117,929
            [Land] => $188,650
            [Net Income] => $5,956,807
        )

    [Land Distribution] => Array
        (
            [Enterprise Zones] => 8663
            [Residences] => 8663
            [Industrial Complexes] => 260
            [Military Bases] => 960
            [Construction Sites] => 300
            [Unused Lands] => 19
        )

    [Military Forces] => Array
        (
            [Spies] => 218,990
            [Troops] => 4,807,197
            [Jets] => 9,142,002
            [Turrets] => 4,348,609
            [Tanks] => 1,328,015
            [Nuclear Missiles] => 5
            [Chemical Missiles] => 12
            [Cruise Missiles] => 9
        )

    [Technology] => Array
        (
            [Military] => 288,889
            [Medical] => 18,716
            [Business] => 508,812
            [Residential] => 509,326
            [Agricultural] => 2316
            [Warfare] => 2931
            [Military Strategy] => 9334
            [Weapons] => 827
            [Industrial] => 8651
            [Spy] => 3587
            [SDI] => 190,345
        )

)
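
As a rough illustration of "format it any way you want", the sketch below turns an array shaped like the output above back into a plain HTML table. The $out value is a shortened, hand-written excerpt of that output, and render_status_table() is a hypothetical helper name that is not part of the script in this post.

```php
<?php
// Hand-written excerpt of the $out array printed above (assumed shape).
$out = array(
    'header'     => 'The status of the Republic of Canyon Land (#750).',
    'The Basics' => array('Turns Left' => '10', 'Rank' => '38'),
    'Economics'  => array('Net Income' => '$5,956,807'),
);

// Illustrative helper: one header row per section, one row per name/value pair.
function render_status_table(array $out): string
{
    $html = '<h3>' . htmlspecialchars(trim($out['header'])) . "</h3>\n<table>\n";
    foreach ($out as $section => $rows) {
        if (!is_array($rows)) {
            continue; // skip the plain-text 'header' entry
        }
        $html .= '<tr><th colspan="2">' . htmlspecialchars((string) $section) . "</th></tr>\n";
        foreach ($rows as $name => $value) {
            $html .= '<tr><td>' . htmlspecialchars((string) $name) . '</td><td align="right">'
                   . htmlspecialchars((string) $value) . "</td></tr>\n";
        }
    }
    return $html . "</table>\n";
}

echo render_status_table($out);
```

Because the values are escaped on the way out, extra rows or columns can be added inside the loops without worrying about stray markup from the pasted source.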

#11
PigeonMarine, January 20th, 2009, 11:02 PM
Thanks, but I only want to pull the table out of the code. Putting it into an array is not useful: I want the output to look just like the original table, and I am going to add some things in after it strips the useless data off.
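
One possible way to keep the table markup intact is sketched below. It assumes the pasted source contains one outer table (possibly with tables nested inside it) and that dropping script blocks is all the cleanup needed; $source is made-up sample input and extract_outer_table() is a hypothetical helper name. Regular expressions are fragile on arbitrary HTML, so this is a starting point under those assumptions, not a general parser.

```php
<?php
// Made-up sample of pasted page source with a nested table and script blocks.
$source = '<html><head><script>track();</script></head><body>junk'
        . '<TABLE><TR><TD><TABLE><TR><TD>Rank</TD><TD><script>ad();</script>38</TD></TR>'
        . '</TABLE></TD></TR></TABLE>footer</body></html>';

// Illustrative helper: returns the outer table markup, or null when no table is found.
function extract_outer_table(string $source): ?string
{
    // The greedy .* runs from the first <table to the last </table>,
    // so nested tables stay inside the match.
    if (preg_match('#<table\b.*</table>#is', $source, $m) !== 1) {
        return null;
    }
    // Drop any script blocks that sit inside the table.
    return preg_replace('#<script\b.*?</script>#is', '', $m[0]);
}

$table = extract_outer_table($source);
echo $table === null ? "No table found\n" : $table . "\n";
```

The returned string is still HTML, so extra rows or columns can be spliced in before it is echoed back to the page.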
