Perl Programming
 
#1  Intaglio, December 20th, 2000, 12:14 PM
Hi,

I'm writing a Palm Pilot portal that lets users view a Palm Pilot-friendly version of any website. Basically, they enter a URL and ALL HTML tags except for <br>, <p>, and <a ...></a> get stripped. What I'm doing now is this:
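use LWP::Simple;    # get() is presumably LWP::Simple's
$content = get $url;
$content =~ s/<style.*?\/style>//gi;              # drop <style> blocks
$content =~ s/<!.*?>//gi;                         # drop comments and declarations
$content =~ s/<br>/--break--/gi;                  # protect <br>
$content =~ s/<p>/--break-- --break--/gi;         # turn <p> into two breaks
$content =~ s/<a/--startlink--/gi;                # protect <a ...>
$content =~ s/<\/a>/--endlink--/gi;               # protect </a>
$content =~ s/<.*?>//g;                           # strip every remaining tag
$content =~ s/--startlink--/<a/gi;                # restore the protected tags
$content =~ s/--endlink--/<\/a>/gi;
$content =~ s/--break--/<br>/gi;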

The problem I run into is that the content between <style> and <script> tags isn't being removed, so the result doesn't look good.

Any suggestions?

-Intaglio

#2  luciddream (Ft. Lauderdale, FL, US), December 20th, 2000, 01:48 PM
Remember that . matches any character except a newline. A <style> or <script> block that spans several lines will never match your pattern, which may be why it's not getting stripped.
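
To illustrate the point about newlines, here is a minimal sketch. The sample $html string is made up for the demonstration and is not taken from the original poster's site.

```perl
use strict;
use warnings;

# Hypothetical sample input: a <style> block that spans several lines.
my $html = "<style>\nbody { color: red; }\n</style><p>Hello</p>";

# Without /s, . cannot cross the newline after <style>,
# so the pattern never reaches </style> and nothing is removed.
(my $without_s = $html) =~ s/<style.*?<\/style>//gi;

print $without_s eq $html ? "unchanged\n" : "stripped\n";
# prints "unchanged"
```

The substitution fails silently here: Perl reports no error, the string simply comes back as it went in.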

#3  Anonym0us, December 20th, 2000, 02:27 PM
Instead of /gi, use /gis. The /s modifier makes . match newlines too, so patterns can span multiple lines.
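
Putting the /s suggestion to work, a stripper could look roughly like the sketch below. This is only one possible approach, not the original poster's code: the strip_tags name and the sample input are made up, and a negative lookahead is swapped in for the --break-- / --startlink-- placeholders so that <br>, <p>, and <a> tags are kept in a single pass.

```perl
use strict;
use warnings;

# Hypothetical helper: keeps <br>, <p>, and <a ...>...</a>, strips all other tags.
sub strip_tags {
    my ($content) = @_;

    # Remove whole <style> and <script> blocks, contents included.
    # /s lets . cross newlines; \1 closes whichever tag was opened.
    $content =~ s/<(style|script)\b.*?<\/\1\s*>//gis;

    # Remove comments, then declarations such as <!DOCTYPE ...>.
    $content =~ s/<!--.*?-->//gs;
    $content =~ s/<![^>]*>//g;

    # Strip every tag whose name is not exactly br, p, or a.
    # [^>] already matches newlines, so /s is not needed here.
    $content =~ s/<(?!\/?(?:br|p|a)\b)[^>]*>//gi;

    return $content;
}

# Made-up sample page with multi-line style and script blocks.
my $html = "<head><style>\nbody { color: red; }\n</style>"
         . "<script>\nalert('hi');\n</script></head>"
         . "<h1>Title</h1><p>Hello <b>world</b><br>"
         . "<a href='x.html'>link</a></p>";

print strip_tags($html), "\n";
# prints: Title<p>Hello world<br><a href='x.html'>link</a></p>
```

The \b after the tag name keeps <abbr> or <pre> from being mistaken for <a> or <p>, which the plain s/<a/.../ placeholder approach would not catch.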


Thread: Dev Shed Forums > Programming Languages > Perl Programming > Stripping HTML tags of an external site

