Regex Programming
 
Dev Shed Forums > Programming Languages - More > Regex Programming
  #1  
July 4th, 2009, 06:54 AM
scanf
"Push" single document root URL rewrite also drop file ext.

We have reached our limits here, so hopefully someone can explain this. We are putting Windows .chm help-file source, which has unhelpful URLs, under a single online navigator. Here is an example of a .chm source URL:

Code:
<A HREF="mk:@MSITStore:ITC_IDR.chm::/IDR_LISP_B118.htm">


I developed this to sanitize the links; it works most of the time:
PHP Code:
$body = ereg_replace("mk:@MSITStore:[^<>[:space:]]+[[:alnum:]/]+.chm::/", $docroot, $body);
$body = str_replace(".GIF", ".gif", $body);
$body = ereg_replace(".htm|.hhk", "", $body);


...where $docroot is a string created by another function that returns the base URL of this "content manager". As a result, any of the hundreds of HTML source files in the directory can be called by name from a single base URL, xxx?help=NAME. My problem is twofold:

I have another set of HTML source documents, parsed from the hhc indexes, which actually use "sane" HTML links. These need to be "rewritten" upon display to get the same behavior as above. I would also like to drop all file extensions in that same line (more efficient). Here is the conversion:

(match a)|(match b)+(FILENAME)+(extension)=
($docroot)(FILENAME)

where a = "mk:@MSITStore:[^<>[:space:]]+[[:alnum:]/]+.chm::/"
but b = nothing (these elements look like <a href="filename.htm">)

Since we can't search for nothing, I probably need to search for all <a> tags and push/replace WHATEVER MIGHT precede the href's filename with $docroot in the PHP buffer:

(match href=?)+(FILENAME)+(extension)=
($docroot)(FILENAME)

The resulting URLs are always xxx?help=FILENAME
(minus extension, and where "xxx?help=" represents what is actually in $docroot)

Sorry I don't know how to describe the problem more simply, but I know there is an answer. Can anyone help?

  #2  
July 4th, 2009, 07:04 AM
prometheuzz
Quote:
Originally Posted by scanf
...
Can anyone help?


Probably.
First, why are you using the older ereg functions instead of the preferred preg ones?
Second, forget about regex for the moment. Could you post a couple of example strings and for each string post the desired transformation? Also, describe what exactly is changed for each example.

  #3  
July 4th, 2009, 07:18 AM
scanf
Quote:
Originally Posted by prometheuzz
Probably.
First, why are you using the older ereg functions instead of the preferred preg ones?


Legacy code from a previous project. I can change it, no problem; I wasn't aware of a preference for preg_replace. Thanks.
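
For reference, the three sanitizing lines from the first post might be ported to the preg functions roughly as follows. This is only a sketch: the `#` delimiter, the escaped dots, and the sample `$body` and `$docroot` values are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical sample values, for illustration only.
$docroot = 'http://mysite?help=';
$body = '<A HREF="mk:@MSITStore:ITC_IDR.chm::/IDR_LISP_B118.htm"><IMG SRC="logo.GIF">';

// preg patterns need delimiters ('#' here); POSIX classes such as
// [:space:] and [:alnum:] remain valid inside a bracket expression.
// The dots are escaped so they match a literal "." (an assumed fix).
$body = preg_replace('#mk:@MSITStore:[^<>[:space:]]+[[:alnum:]/]+\.chm::/#', $docroot, $body);
$body = str_replace('.GIF', '.gif', $body);
$body = preg_replace('#\.(?:htm|hhk)#', '', $body);

echo $body, "\n";
// prints: <A HREF="http://mysite?help=IDR_LISP_B118"><IMG SRC="logo.gif">
```

Note that preg_replace treats `$1`-style references and backslashes in the replacement string specially, so this direct port assumes $docroot contains neither.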

Quote:
Originally Posted by prometheuzz
Second, forget about regex for the moment. Could you post a couple of example strings and for each string post the desired transformation? Also, describe what exactly is changed for each example.


Sure, first examples from chm source:
<A HREF="mk:@MSITStore:ITC_IDR.chm::/B118.htm">
to
<A HREF="http://mysite?help=B118">
::
<A HREF="mk:@MSITStore:ITC_IDR.chm::/B156.htm">
to
<A HREF="http://mysite?help=B156">
:: second example set from "index" file type
<a href='1062.htm'>
to
<A HREF="http://mysite?help=1062">
::
<a href='LOCAL-B.htm'>
to
<A HREF="http://mysite?help=LOCAL-B">


Where "http://mysite?help=" is a generic URL to represent what is actually being stored (correctly) in $docroot variable as explained above. Thanks a lot!

  #4  
July 4th, 2009, 07:34 AM
prometheuzz
Try something like this:
PHP Code:
$text = 'abc
<A HREF="mk:@MSITStore:ITC_IDR.chm::/B118.htm">
def
<A HREF="mk:@MSITStore:ITC_IDR.chm::/B156.htm">
ghi
<a href=\'1062.htm\'>
jkl
<a href=\'LOCAL-B.htm\'>';

$docroot = 'http://mysite?help=';

echo $text . "\n-------------------------------\n";
echo preg_replace(
  '#<a\s+href=[\'"](?:mk:@MSITStore:[^/]*/)?(.*?)\.htm[\'"]>#i',
  "<a href=\"$docroot$1\">",
  $text
);

/* output:

abc
<A HREF="mk:@MSITStore:ITC_IDR.chm::/B118.htm">
def
<A HREF="mk:@MSITStore:ITC_IDR.chm::/B156.htm">
ghi
<a href='1062.htm'>
jkl
<a href='LOCAL-B.htm'>
-------------------------------
abc
<a href="http://mysite?help=B118">
def
<a href="http://mysite?help=B156">
ghi
<a href="http://mysite?help=1062">
jkl
<a href="http://mysite?help=LOCAL-B">

*/ 
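
The pattern above has three working parts: an optional non-capturing group that swallows the `mk:@MSITStore:...chm::/` prefix, a capture group for the file name, and a literal `.htm` that is matched but not kept. A possible variant is sketched below, not as the code posted above but as an illustration under stated assumptions: the function name `rewrite_help_links` is hypothetical, the `.hhk` extension is added because the first post strips it too, the capture is tightened to exclude quotes, and a callback is used so that $docroot is inserted literally. The type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper; the name and the tightened pattern are assumptions.
function rewrite_help_links(string $html, string $docroot): string
{
    $pattern = '#<a\s+href=([\'"])'        // opening tag and quote (group 1)
             . '(?:mk:@MSITStore:[^/]*/)?' // optional .chm prefix, discarded
             . '([^\'"]*?)'                // file name without extension (group 2)
             . '\.(?:htm|hhk)'             // extension to drop
             . '\1>#i';                    // same closing quote; case-insensitive

    // A callback inserts $docroot literally, so "$" or "\" in it is harmless.
    return preg_replace_callback(
        $pattern,
        function (array $m) use ($docroot): string {
            return '<a href="' . $docroot . $m[2] . '">';
        },
        $html
    );
}

echo rewrite_help_links("<a href='LOCAL-B.htm'>", 'http://mysite?help='), "\n";
// prints: <a href="http://mysite?help=LOCAL-B">
```

The backreference `\1` requires the closing quote to match the opening one, which the original `[\'"]` pair does not enforce; whether that strictness is wanted depends on how clean the source HTML is.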

Last edited by prometheuzz : July 4th, 2009 at 07:43 AM.

  #5  
July 4th, 2009, 08:14 AM
scanf
Works good

I don't know how to thank you.

  #6  
July 4th, 2009, 08:28 AM
prometheuzz
Quote:
Originally Posted by scanf
I don't know how to thank you.


Saying that you don't know how to thank me is more than enough gratitude!
You're most welcome.
