July 20th, 2013, 10:35 AM
How to find a specific HTML div and its matching closing tag
I am trying to use a regex in a VBScript to find a specific HTML div by ID, such as '<div id="page_content_top_stage">', and capture everything inside that div without nested div tags throwing off the match.
Here is one example of this type of HTML code:
<div id="page_content_top_stage"> <br>
<!-- InstanceBeginEditable name="content up" -->
<table border="0" cellpadding="0" cellspacing="0">
<td width="1" height="100%"><img src="../spacer.gif" width="1" height="627"></td>
<td width="740" valign="top"><img src="../gui_assets/gui_images/spacer.gif" width="740" height="1"><br>
<!-- InstanceEndEditable --> </div>
My regex attempt worked somewhat, but if there was a nested div tag it would only match up to that nested '</div>' and miss the content after that point. In other words, it was catching the wrong closing tag.
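One way around the wrong-closing-tag problem is to stop asking a single pattern to find the end of the div and instead count nesting depth while walking the div tags one at a time. The following is only a minimal sketch of that idea, written in Python for illustration rather than VBScript. The function name `extract_div` is hypothetical, and it assumes the markup has no div-like text inside comments or scripts, which would throw off the count.

```python
import re

def extract_div(html, div_id):
    """Return the whole <div id=...>...</div> span, honoring nested divs, or None."""
    # Find the opening tag of the target div by its id attribute.
    open_re = re.compile(
        r'<div\b[^>]*\sid\s*=\s*["\']' + re.escape(div_id) + r'["\'][^>]*>',
        re.IGNORECASE,
    )
    start = open_re.search(html)
    if start is None:
        return None

    # Visit every later <div ...> or </div> tag and track the nesting depth.
    tag_re = re.compile(r'<(/?)div\b[^>]*>', re.IGNORECASE)
    depth = 1  # we are already inside the target div
    for tag in tag_re.finditer(html, start.end()):
        depth += -1 if tag.group(1) else 1
        if depth == 0:
            # This closing tag balances the target's opening tag.
            return html[start.start():tag.end()]
    return None  # unbalanced markup: no matching close was found


sample = ('<div id="page_content_top_stage"><div class="inner">nested</div>'
          'content after the nested div</div><div>unrelated</div>')
print(extract_div(sample, "page_content_top_stage"))
```

The regex here only recognizes individual tags; the loop does the matching of open and close tags, so a nested '</div>' lowers the depth to 1 instead of ending the match early.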
July 20th, 2013, 05:48 PM
Regular expressions are not really suitable for parsing complex structures like HTML. They work OK if you're just trying to grab really small snippets out of it, but if you need to actually match related tags it's a bit beyond their scope. Consider looking for a library that's actually able to parse the HTML as HTML.
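As a rough illustration of what "parsing the HTML as HTML" can look like, here is a minimal sketch using Python's standard `html.parser` module. The class and function names (`DivGrabber`, `inner_html`) are made up for this example, and Python is used only to show the approach; it is not a claim about which library is available from a .vbs file. Note that with the parser's default settings, character references in text come back decoded.

```python
from html.parser import HTMLParser

class DivGrabber(HTMLParser):
    """Collect the inner markup of the first div with a given id."""

    def __init__(self, div_id):
        super().__init__()
        self.div_id = div_id
        self.depth = 0      # 0 means we are outside the target div
        self.parts = []     # pieces of markup seen inside the target div
        self.done = False   # set once the matching </div> is reached

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            if tag == "div":
                self.depth += 1  # a nested div opens
            self.parts.append(self.get_starttag_text())
        elif tag == "div" and dict(attrs).get("id") == self.div_id:
            self.depth = 1       # entered the target div

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:  # this is the target's own closing tag
                self.done = True
                return
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def handle_comment(self, data):
        if self.depth:
            self.parts.append(f"<!--{data}-->")


def inner_html(html, div_id):
    """Return the markup inside the div, or None if it was not found or not closed."""
    grabber = DivGrabber(div_id)
    grabber.feed(html)
    grabber.close()
    return "".join(grabber.parts) if grabber.done else None


page = ('<div id="page_content_top_stage"> <br><div class="inner">nested</div>'
        'tail</div><div>unrelated</div>')
print(inner_html(page, "page_content_top_stage"))
```

Because the parser reports each start and end tag as a separate event, the nesting bookkeeping is a simple counter and there is no single pattern that has to guess which '</div>' is the right one.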
July 20th, 2013, 06:03 PM
I have never used something like that. Would it work in a vbs script file?
Originally Posted by E-Oreo