SunQuest
           Perl Programming
 
  #1  
November 13th, 2000, 08:18 PM
sbeacher
Join Date: Nov 2000
Location: Atlanta, GA
So I need a regular expression to strip out all HTML tags EXCEPT the ones I've allowed.

I think I almost have it, but I can't get the negation right.

"/</?(^IMG|A|FONT|B|I|U|STRONG|EM|CODE|PRE|H1|H2|H3|H4|H5|H6)(.*)>?/i"

Now the ^ isn't negating because it's not in a character class. So how would I negate all those tags (meaning, match any tag EXCEPT those)?

Also, what's a better alternative to the .* match so that they can't just throw a newline in there and **** things up?

  #2  
November 15th, 2000, 05:08 PM
yuccatan
Join Date: Sep 2000
Location: Boston, MA
As you note, the negation isn't working because '^' only negates inside a character class (the [] construct). Outside of one, it is a start-of-string anchor.
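
To make the two meanings of ^ concrete, here is a small sketch. The pattern is a shortened version of the one in the question, and the sample strings are made up for illustration:

```perl
use strict;
use warnings;

# Outside a character class, ^ is a start-of-string anchor, not a negation.
my $re = qr{</?(^IMG|A|B)\b}i;

# The "^IMG" branch can never match here: a "<" has already been consumed,
# so the current position is no longer the start of the string.
my $img_matches = ('<IMG src="x.gif">' =~ $re) ? 1 : 0;

# The other branches still match positively, i.e. they match the allowed tags.
my $a_matches = ('<A href="x">' =~ $re) ? 1 : 0;

# Inside a character class, ^ negates the set: [^>] is "any character but >".
my $class_matches = ('>' =~ /[^>]/) ? 1 : 0;

print "IMG matched: $img_matches\n";
print "A matched: $a_matches\n";
print "negated class matched a literal gt: $class_matches\n";
```

So the original pattern ends up matching the allowed tags (minus IMG) rather than everything else.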

Possibly you could negate the match itself? (Change the =~ to !~?)
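
Another option that might be closer to what you want is a negative lookahead, (?!...), which keeps the negation inside the pattern so it can be used in a substitution. This is only a rough sketch: the strip_tags name and the sample input are my own, and the allow-list is just the tag names from your pattern.

```perl
use strict;
use warnings;

# Hypothetical allow-list, taken from the tag names in the original pattern.
my $allowed = join '|', qw(IMG A FONT B I U STRONG EM CODE PRE H[1-6]);

# (?!...) is a negative lookahead: the tag is matched only if its name
# (with an optional leading slash) is NOT one of the allowed names.
# \b stops "B" from also excusing "BODY" or "BR".
my $strip = qr{<(?!/?(?:$allowed)\b)[^>]*>}i;

# Remove every tag that is not on the allow-list; text content is left alone.
sub strip_tags {
    my ($html) = @_;
    $html =~ s/$strip//g;
    return $html;
}

print strip_tags('<p>Hi <b>there</b><script>x()</script></p>'), "\n";
# prints: Hi <b>there</b>x()
```

Note that this only removes the tags themselves. The text between stripped tags stays, attributes on allowed tags are not checked, and a > inside an attribute value will confuse it, so a real HTML parser is probably safer for untrusted input.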

Quote (sbeacher):
Also, what's a better alternative to the .* match so that they can't just throw a newline in there and f*** things up?

I've read that [^>]* will match everything (including \n) up to the closing > of the tag; will this help?
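
Here is a quick comparison of the two on a tag with a newline in it. The input string is made up for the example:

```perl
use strict;
use warnings;

my $html = "<blink\nclass='x'>text</blink>";

# By default "." does not match a newline, so ".*" cannot cross the line
# break and the opening tag is left in place; only </blink> is removed.
(my $dot = $html) =~ s/<.*>//;

# A negated character class matches any character except ">", newline
# included, and it stops at the first ">" instead of running on greedily.
(my $class = $html) =~ s/<[^>]*>//g;

print "with .*    : $dot\n";
print "with [^>]* : $class\n";
```

Adding the /s modifier would let . match newlines too, but a greedy .* would then run to the last > in the input, so the negated class seems like the safer choice.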

Viewing: Dev Shed Forums > Programming Languages > Perl Programming > Stripping HTML tags with regular expressions


© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 4 hosted by Hostway