#1

    Matching a <div> without another div inside it


    I'm working with HTML that I don't control, so I can't parse it as XML because it won't always be valid.

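    I want to match, for example, any div that does not have another div inside it. In this sample, that is the innermost <div>Foo</div>, the self-closing <div />, and the final div, which contains only a span: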

    <div>
    <div>
    <div>Foo</div>
    </div>
    </div>

    <div />




    <div>
    <span>Bar</span>
    </div>



    This works:
    Code:
    /<(div)([^>]*)(\/>|>((?!<[\/]*\1>).?)*?)<\/\1>/s
    on strings up to around 5000 chars, but beyond that it appears to crash PHP. It also seems overly complicated for what I'm trying to do.

    Is there another easier way?

    Thanks
#2
    User 165270

    This may perform better:

    PHP Code:
    '/<div(?:[^>]*\/>|>(?:(?!<\/?div>).)*<\/div>)/s'
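
For anyone wanting to try it, here is a minimal sketch that runs the pattern above against the sample from the first post. The variable names ($pattern, $html, $matches, $count) are only illustrative. The check on the return value is there because preg_match_all() returns false when PCRE hits an internal limit, which may be worth ruling out on long inputs.

```php
<?php
// Pattern from the reply above, unchanged.
$pattern = '/<div(?:[^>]*\/>|>(?:(?!<\/?div>).)*<\/div>)/s';

// Sample input from the first post, joined with newlines.
$html = implode("\n", [
    '<div>',
    '<div>',
    '<div>Foo</div>',
    '</div>',
    '</div>',
    '',
    '<div />',
    '',
    '<div>',
    '<span>Bar</span>',
    '</div>',
]);

// preg_match_all() returns the number of full matches, or false on error.
$count = preg_match_all($pattern, $html, $matches);

if ($count === false) {
    // preg_last_error() tells you which limit or error PCRE ran into,
    // e.g. PREG_BACKTRACK_LIMIT_ERROR.
    echo 'PCRE error code: ', preg_last_error(), "\n";
} else {
    // $matches[0] holds the full text of every match.
    foreach ($matches[0] as $match) {
        echo $match, "\n---\n";
    }
}
```

On this sample it should pick out the three innermost divs. Note that, as written, the lookahead only recognises a bare <div> or </div>, and the non-self-closing branch expects <div> with no attributes, so something like <div class="x"> would need the pattern adjusted.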
