We have reached our limits here, so hopefully someone can explain this. We are consolidating Windows .chm help-file source, which has unhelpful URLs, under a single online navigator. Here is an example of a .chm source URL:
I developed the following to sanitize them, and it works most of the time:
$body = ereg_replace("mk:@MSITStore:[^<>[:space:]]+[[:alnum:]/]+.chm::/", $docroot, $body);
$body = str_replace(".GIF", ".gif", $body);
$body = ereg_replace(".htm|.hhk", "", $body);
...where $docroot is a string built by another function that returns the base URL of this "content manager". As a result, any of the hundreds of HTML source files in the directory can be called by name from a single base URL, xxx?help=NAME. My problem is twofold:
I have another set of HTML source documents, parsed from the .hhc indexes, which use "sane" HTML links. These need to be rewritten on display to get the same behavior as above. I would also like to drop all file extensions in that same line (more efficient). Here is the conversion:
(match a)|(match b)+(FILENAME)+(extension)=
where a = "mk:@MSITStore:[^<>[:space:]]+[[:alnum:]/]+.chm::/"
but b = nothing (these elements look like <a href="filename.htm">)
Since we can't search for nothing, I probably need to search for all <a> tags and, in the PHP buffer, replace whatever might precede the href's filename with $docroot:
The resulting URLs are always xxx?help=FILENAME
(minus the extension, and where "xxx?help=" stands for what is actually in $docroot)
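To make the conversion concrete, here is a rough sketch of the one-pass rewrite described above. It is not tested against the real help files. It uses preg_replace_callback instead of ereg_replace (the ereg functions were removed in PHP 7), and the function name rewrite_help_links is made up for illustration. It assumes every link sits in a double-quoted href, that the file name has no directory part, and that the only extensions to drop are .htm and .hhk.

```php
<?php
// Hypothetical helper (not part of the original code): rewrites each href so
// that it becomes $docroot followed by the bare file name, without extension.
function rewrite_help_links(string $body, string $docroot): string
{
    // "match a" is the optional mk:@MSITStore:...chm::/ prefix; "match b"
    // is simply the absence of that prefix. Group 1 captures FILENAME.
    // Links containing '/', ':', '#' or '?' in the name part (for example
    // external http:// links) do not match and are left untouched.
    $pattern = '~href="(?:mk:@MSITStore:[^"<>\s]*?\.chm::/)?'
             . '([^"<>\s/.#:?]+)(?:\.(?:htm|hhk))?"~i';

    $result = preg_replace_callback(
        $pattern,
        // A callback avoids "$" or "\" in $docroot being read as backreferences.
        function (array $m) use ($docroot): string {
            return 'href="' . $docroot . $m[1] . '"';
        },
        $body
    );

    // preg_replace_callback returns null on a regex error; fall back to input.
    return $result ?? $body;
}
```

Called as $body = rewrite_help_links($body, $docroot);, this would turn both <a href="filename.htm"> and the mk:@MSITStore form into xxx?help=filename. The .GIF to .gif str_replace would still be a separate step, and links such as filename.htm#section are deliberately left alone in this sketch.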
Sorry I don't know how to describe the problem more simply, but I know there is an answer. Can anyone help?