#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2010
    Posts
    14
    Rep Power
    0

    Regex Rookie needs help with text replacement


    So I tasked myself to learn enough basic regex to do what I thought was some simple text replacement. Hmm,
    learning curve + 59 yrs of age =
    I want to convert any BB-code tags that look like these:
    [code=js]
    [code=css]
    [code=html]

    and replace it with wiki syntax that should look like this:
    <code js>
    <code css>
    <code html>

    I'm using RegexBuddy to help me

    Thanks for any suggestions you can throw my way
  2. #2
  3. Turn left at the third duck
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2011
    Location
    Nelson, NZ
    Posts
    112
    Rep Power
    93
    Hey there,

    It's a pleasure to help you with this.
    In RB, at the very top, click Replace (to the right of Match).

    In the expression window, paste:
    Code:
    \[code=([^]]*?)]
    In the subject window, paste your test strings:
    Code:
    [code=js]
    [code=css]
    [code=html]
    In the Replace box, paste:
    Code:
    <code \1>
    Click the pull-down menu that says "Replace", select "List All Replacements". Look in the replace window:
    Code:
    <code js>
    <code css>
    <code html>
    Next, you can click the Use tab to get a pre-packaged preg_replace call if you like.
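    For anyone following along outside RegexBuddy, here is a minimal sketch of the same replacement in Python rather than PHP. The pattern and replacement string are the ones given above; the names BBCODE_TAG, WIKI_TAG and bbcode_to_wiki are made up for this illustration.

```python
import re

# Pattern and replacement copied from the post above.
# BBCODE_TAG, WIKI_TAG and bbcode_to_wiki are hypothetical names.
BBCODE_TAG = re.compile(r"\[code=([^]]*?)]")
WIKI_TAG = r"<code \1>"


def bbcode_to_wiki(text: str) -> str:
    """Replace every [code=LANG] tag with <code LANG>."""
    return BBCODE_TAG.sub(WIKI_TAG, text)


sample = "[code=js]\n[code=css]\n[code=html]"
print(bbcode_to_wiki(sample))
```

    Run against the three test strings, this should print the same three lines that "List All Replacements" shows in RB.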

    Does this help?
    Let me know if you'd like any more details.

    Wishing you a fun day,

    -Andy
  4. #3
  5. No Profile Picture
    Registered User
    Originally Posted by ragax
    Hey there,

    It's a pleasure to help you with this.
    In RB, at the very top, click Replace (to the right of Match).
    Thank you so much Andy - you made my DAY (plus the Tylenol helped a bit)
    To impose upon you a bit more, let's see if I can grasp the concepts you've presented:

    Originally Posted by ragax
    In the expression window, paste:
    Code:
    \[code=([^]]*?)]
    So
    Code:
    \[code=
    this part is easy - it matches the literal text [code= (the opening "[" is escaped so it isn't read as the start of a character class)
    Code:
    ([^]]*?)]
    so at this point the regex is sitting just after the "=", and everything from here up to the closing "]" is whatever GeSHi language is in the CODE boxes, e.g. css, php, etc.
    so you've created a group containing this character class
    Code:
    [^]]
    which accepts any character except the closing "]" (it's the parentheses around it that create the back reference)
    then last we have the closing "]"
    Originally Posted by ragax
    In the Replace box, paste:
    Code:
    <code \1>
    so we LITERALLY replace with
    Code:
    <code
    and then using the back reference
    Code:
    \1
    to give us the unaltered GeSHi language and finally the closing ">"
    Code:
    >
    Am I close?
    Thanks again so much
    Dan
  6. #4
  7. No Profile Picture
    Registered User
    So now your pupil wants to show off
    So this:
    [attachment=0]1_0_a-ViewPort-WebPage.jpg[/attachment]

    has to become this:
    {{wiki:1_0_a-ViewPort-WebPage.jpg}}

    So MATCH with:
    Code:
    \[attachment\=\d\]([^[]*?)\[\/attachment\]
    and then REPLACE with:
    Code:
    {{wiki:\1}}
    and VOILA!
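    A quick way to check this outside RB is a small Python sketch. It assumes Python's re module instead of PHP, and attachment_to_wiki is a made-up helper name. The "=" and "/" are left unescaped since neither is special in a regex, and the "[" inside the character class is escaped because Python may otherwise warn about a possible nested set.

```python
import re

# Same pattern as above, lightly adapted for Python (see the note before
# this block). ATTACHMENT and attachment_to_wiki are hypothetical names.
ATTACHMENT = re.compile(r"\[attachment=\d\]([^\[]*?)\[/attachment\]")


def attachment_to_wiki(text: str) -> str:
    """Turn [attachment=N]file[/attachment] into {{wiki:file}}."""
    return ATTACHMENT.sub(r"{{wiki:\1}}", text)


print(attachment_to_wiki("[attachment=0]1_0_a-ViewPort-WebPage.jpg[/attachment]"))
```

    One thing to keep in mind: \d matches exactly one digit, so a tag like [attachment=10] would be left untouched. Using \d+ would cover larger numbers.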

    Again thanks a lot ragax
  8. #5
  9. Turn left at the third duck
    Hi Dan,

    Glad it worked for you, and even gladder you're getting the hang of it.

    Quote:
    Am I close?
    I think you're there!
    One thing that helps me understand and explain an expression is to try to put it in plain English. Here is your expression in comment mode (also known as free-spacing mode), which means that you can still dump it in RB and it will work!

    Code:
    (?x)       # comment mode
    \[code=    # match literal characters [code=
    (          # in Group 1, capture... 
    [^]]*?     # any number of characters that are NOT a right bracket, lazily, expanding as needed
    )          # end of Group 1 capture
    ]          # match literal character right bracket
    Look at the comments. Do you agree that the "plain English" makes the expression easy to understand?
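    The same trick carries over to other regex flavors. As a rough sketch in Python (an assumption, since the thread itself is about RB and PHP), the re.VERBOSE flag plays the role of (?x). The names COMMENTED and COMPACT are invented for the example, which checks that the commented and one-line versions give identical results.

```python
import re

# Commented version: re.VERBOSE ignores whitespace and "#" comments
# outside character classes, like (?x) in the post above.
COMMENTED = re.compile(
    r"""
    \[code=    # match literal characters [code=
    (          # in Group 1, capture...
    [^]]*?     # any characters that are NOT a right bracket, lazily
    )          # end of Group 1 capture
    ]          # match literal character right bracket
    """,
    re.VERBOSE,
)

# One-line version of the same expression.
COMPACT = re.compile(r"\[code=([^]]*?)]")

for subject in ["[code=js]", "[code=css]", "[code=html]"]:
    # Both versions should produce the same replacement.
    assert COMMENTED.sub(r"<code \1>", subject) == COMPACT.sub(r"<code \1>", subject)
    print(COMMENTED.sub(r"<code \1>", subject))
```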
  10. #6
  11. Turn left at the third duck
    Hi again, Dan!

    Now replying to your second message.
    Code:
    \[attachment\=\d\]([^[]*?)\[\/attachment\]
    Nice.
    For the purpose of the exercise, let's look at ways to make it a little tighter. A couple of things can be tightened: characters that don't need to be escaped, and the character class.

    Here's another option.
    Code:
    \[attachment.*?](.*?)\[/at
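    After you're satisfied you've started an attachment tag, lazily capture everything until you hit the next [/at
    Literally match [/at
    At this stage, you know it's the closing tag, so for matching purposes we can stop. (If you use it in a Replace, spell out the whole closing tag, \[/attachment], otherwise the unmatched tachment] is left behind in the output.)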
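    To see what the short pattern does in a replacement, here is a small Python sketch (Python's re module is an assumption, the thread uses RB and PHP). SHORT is the pattern above. FULL is a variant written for this example that spells out the closing tag.

```python
import re

SUBJECT = "[attachment=0]1_0_a-ViewPort-WebPage.jpg[/attachment]"

# SHORT is the pattern from the post; FULL is a hypothetical variant
# that matches the entire closing tag.
SHORT = re.compile(r"\[attachment.*?](.*?)\[/at")
FULL = re.compile(r"\[attachment.*?](.*?)\[/attachment]")

# Both capture the same file name in Group 1.
print(SHORT.search(SUBJECT).group(1))
print(FULL.search(SUBJECT).group(1))

# In a replacement, only the matched text is swapped out, so the short
# pattern leaves the rest of the closing tag in place.
print(SHORT.sub(r"{{wiki:\1}}", SUBJECT))  # {{wiki:1_0_a-ViewPort-WebPage.jpg}}tachment]
print(FULL.sub(r"{{wiki:\1}}", SUBJECT))   # {{wiki:1_0_a-ViewPort-WebPage.jpg}}
```

    So the short form is fine for capturing the file name, while the full closing tag is the safer choice when the goal is a clean replace.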

    Just one option.

    Hey I think your "headaches" are going to disappear and you'll have a lot of fun with regex.

    Warmest wishes,

    -A
  12. #7
  13. No Profile Picture
    Registered User
    Originally Posted by ragax
    Just one option.

    Hey I think your "headaches" are going to disappear and you'll have a lot of fun with regex.

    Warmest wishes,

    -A
    Thanks again Andy, a thing of beauty but to me it looks like chess players would make the best Regex coders with all this look-ahead stuff
  14. #8
  15. Turn left at the third duck
    You're very welcome. Please ask again anytime. There's a lot more cool stuff in PHP's PCRE flavor of regex, for instance conditional and recursive expressions.
