#1
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    3
    Rep Power
    0

    Search on one thing and replace another


    In the HTML below, I want to search for "24pt" and, on every line where it appears, replace that line's <p> tags with <h1> tags.

    <p style="font-size:24pt">Heading</p>
    <p style="font-size:12pt">Some text</p>
    <p style="font-size:12pt">Some more text</p>
    <p style="font-size:24pt">Another Heading</p>

    Would become:

    <h1 style="font-size:24pt">Heading</h1>
    <p style="font-size:12pt">Some text</p>
    <p style="font-size:12pt">Some more text</p>
    <h1 style="font-size:24pt">Another Heading</h1>


    I'm using Expresso but just cannot get there... Any help would be MUCH appreciated.

    Ron
#2
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Location
    spaceBAR Central
    Posts
    225
    Rep Power
    41
    This is an example of using sed to do the replacement for you:
    Code:
    $ echo '<p style="font-size:24pt">Heading</p>' | sed  's/^<p\(.*\)<\/p>$/<h1\1<\/h1>/'
    <h1 style="font-size:24pt">Heading</h1>


    If that is what you want, you can run it over your file:
    Code:
    sed  's/^<p\(.*\)<\/p>$/<h1\1<\/h1>/' file_in.txt > file_out.txt
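    Note that the substitution above rewrites every line that is a complete <p>...</p> element, including the 12pt ones. A minimal sketch of limiting it to the 24pt lines follows, assuming each paragraph sits on its own line as in the sample. The function name promote_24pt is made up for illustration; the /font-size:24pt/ prefix is a standard sed address that restricts the s command to matching lines.

```shell
#!/bin/sh
# Hypothetical helper: reads HTML lines on stdin, writes the result to stdout.
# The /font-size:24pt/ address applies the substitution only to lines
# containing that text; all other lines pass through unchanged.
promote_24pt() {
    sed '/font-size:24pt/ s/^<p\(.*\)<\/p>$/<h1\1<\/h1>/'
}

# Sample input taken from the first post.
sample='<p style="font-size:24pt">Heading</p>
<p style="font-size:12pt">Some text</p>
<p style="font-size:12pt">Some more text</p>
<p style="font-size:24pt">Another Heading</p>'

printf '%s\n' "$sample" | promote_24pt
```

    Run on the sample, this should turn the two 24pt paragraphs into <h1> elements and leave the 12pt paragraphs alone.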
#3
    Registered User
    I should have been more explicit: I need a plain regex find and replace, as in Expresso. I'm using a program that converts MS Word files to CSV with HTML, and it takes regex patterns directly.
#4
    Contributing User
    I just looked at the info for Expresso (ultrapico.com/Expresso.htm), and the site says it uses the .NET flavor of regular expressions. If that is true, you can use this as the regular expression to find the lines (if the whole file is searched as one string, you will probably need the Multiline option so that ^ and $ match at each line):
    Code:
    ^<p(.*)</p>$

    And this as the substitution or replacement (.NET writes the captured group as $1 rather than sed's \1, and the slash needs no escaping):
    Code:
    <h1$1</h1>
#5
    Registered User
    Thanks for the clarification. How would that look for only the lines with "font-size: 24pt"?
#6
    Contributing User
    Just hard-code the text that should be found into the regular expression. For example:

    Code:
    ^<p style="font-size:24pt">(.*)</p>$
    
    ^ : assert position at the beginning of the string
    <p style="font-size:24pt"> : match these characters literally
    (.*) : match the regular expression below and capture its match into backreference number 1
       . : match any single character that is not a line break character
          * : between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    </p> : match these characters literally
    $ : assert position at the end of the string (or before the line break at the end of the string, if any)
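
    The pattern above only finds the lines; the replacement has to put the style attribute back. Here is a sketch of the full find-and-replace in sed, to match the earlier examples in this thread. It assumes one paragraph per line. The function name promote_headings is made up, and the optional space after the colon is an assumption added because the follow-up question writes "font-size: 24pt" with a space. In a .NET tool such as Expresso, the same replacement would refer to the groups as $1 and $2 instead of \1 and \2.

```shell
#!/bin/sh
# Hypothetical helper: reads HTML lines on stdin, writes the result to stdout.
# "|" is the s-command delimiter, so the slashes in </p> and </h1> need no escaping.
# \1 = the style attribute (a space after the colon is optional),
# \2 = the paragraph text.
promote_headings() {
    sed 's|^<p\( style="font-size: *24pt"\)>\(.*\)</p>$|<h1\1>\2</h1>|'
}

printf '%s\n' \
    '<p style="font-size:24pt">Heading</p>' \
    '<p style="font-size: 24pt">Spaced heading</p>' \
    '<p style="font-size:12pt">Some text</p>' | promote_headings
```

    Capturing the attribute instead of retyping it in the replacement keeps whichever spacing the input used.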
