#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    3
    Rep Power
    0

    Remove T-SQL comment


    Hello everybody,
    I have a problem with searching for T-SQL comments. I've searched a lot, but no solution I found solves my problem.

    I want to use a regex replace to strip out SQL comments. But I don't want to search and replace inside strings, so I look for the following things:
    1.) Strings: they begin with ', followed by 0-n occurrences of '' or any other character, and end with '
    2.) Line comments: look for --, everything after it up to the end of the line is a comment
    3.) Block comments: look for /*, followed by 0-n characters, ending with */
    So I created this regex pattern:
    (?<string>'(?:''|[^'])*')+|--[^\r\n]*|/\*[^*]*\*+(?:[^/*][^*]*\*+)*/
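For anyone who wants to try the pattern, here is a minimal sketch in Python (the thread doesn't name a language, so that choice is an assumption). Python spells named groups `(?P<string>...)` rather than `(?<string>...)`; the rest of the pattern is unchanged. The function name `strip_comments` is made up for illustration. The idea is to match strings and comments in one pass and put the strings back untouched.

```python
import re

# Same three alternatives as the pattern above, in Python syntax.
PATTERN = re.compile(
    r"(?P<string>'(?:''|[^'])*')+"       # 1.) string literal, '' is an escaped quote
    r"|--[^\r\n]*"                        # 2.) line comment up to the end of the line
    r"|/\*[^*]*\*+(?:[^/*][^*]*\*+)*/"    # 3.) block comment (no nesting support)
)


def strip_comments(sql: str) -> str:
    """Remove comments but leave string literals as they are."""
    def repl(match: re.Match) -> str:
        # If the 'string' group took part in the match, keep the text.
        if match.group("string") is not None:
            return match.group(0)
        return ""  # otherwise it was a comment: drop it
    return PATTERN.sub(repl, sql)


cleaned = strip_comments("SELECT '--x' AS a -- note\nFROM t /* c */ WHERE b = 'it''s'")
```

The `--x` inside the string literal survives because the string alternative matches first at that position; only the real comments are dropped.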



    But for point 3) the following case is possible:
    /* blabla
    /* comment within comment */
    blabla
    */

    With the pattern above it will only find:
    /* blabla
    /* comment within comment */
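The behaviour can be reproduced with a short sketch, again assuming Python; only the block-comment part of the pattern is used here. The match stops at the first `*/`, so the outer comment's tail is left behind.

```python
import re

# Block-comment alternative only, copied from the pattern above.
BLOCK = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")

sql = "/* blabla\n/* comment within comment */\nblabla\n*/"

m = BLOCK.search(sql)
matched = m.group(0)      # what the regex treats as the comment
leftover = sql[m.end():]  # what stays behind after removal
```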

    Does anybody know a solution for this problem? Is it possible in regex to search forward and, after a match, resume from the very first opening /*?


    Thanks a lot for your reply
  2. #2
  3. No Profile Picture
    Lost in code
    Devshed Supreme Being (6500+ posts)

    Join Date
    Dec 2004
    Posts
    8,316
    Rep Power
    7170
    A regular expression isn't a suitable solution for something like this; you need an actual T-SQL lexer in order to guarantee that you handle the query correctly.

    In addition to the problem you've already pointed out, consider how your definition of a string (1) would handle a query like this:
    Code:
    /* This is a comment that ends in a single quote ' */
    some actual sql code
    /* This is another comment that ends in a single quote ' */
    So you would have to amend your definition of a string in order to not include quotes that are inside comments. However, since your definition of a comment is based on your definition of a string, that becomes a bit of a problem...

    In the end, you will continue to run into problems like that as long as you're trying to do this with a regular expression.

    Comments on this post

    • Laurent_R agrees : This is correct. Regexes are not suitable for such a problem.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    3
    Rep Power
    0
    Hello E-Oreo,
    But the string isn't a problem in this scenario; it is handled OK. The regex first finds the /* and therefore runs to the end of the comment, including the '
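A quick way to check this claim, sketched in Python (an assumption, with the named group written as `(?P<string>...)`): run the pattern over the example query and look at what each match starts with. The engine tries the alternatives at each position from left to right, so at the `/*` only the block-comment alternative can match, and it consumes the quote along with the rest of the comment.

```python
import re

PATTERN = re.compile(
    r"(?P<string>'(?:''|[^'])*')+|--[^\r\n]*|/\*[^*]*\*+(?:[^/*][^*]*\*+)*/"
)

sql = (
    "/* This is a comment that ends in a single quote ' */\n"
    "some actual sql code\n"
    "/* This is another comment that ends in a single quote ' */"
)

# Collect every match; each one should be a whole comment, not a string.
matches = [m.group(0) for m in PATTERN.finditer(sql)]
remaining = PATTERN.sub("", sql)
```

This only covers this particular input; it says nothing about nested comments.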



    I'm looking for a pattern something like:
    Search for /* followed by any characters, including /* any characters */, then any characters, then */
    Every /* has to be paired with a */.

    But they also could be nested like this:
    /* begin comment1
    /* begin comment2
    /* begin comment3 end */
    End comment 1*/

    I found so many regex patterns, but ....
    Maybe there is a "nearly perfect" solution, and a better one than mine.

    End 3*/
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    837
    Rep Power
    496
    I agree with E-Oreo, a regular expression is not suitable for this type of problem. There are too many special cases. Trying to handle them with a regex would quickly become a nightmare.

    You need a real parser.
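To give an idea of what that means in practice, here is a rough sketch of a hand-written scanner in Python (language and the function name `strip_tsql_comments` are assumptions for illustration). It walks the text once and keeps a depth counter for nested `/* ... */`. It is not a full T-SQL lexer: it only knows single-quoted strings, `--` comments and block comments, and ignores things like bracketed or double-quoted identifiers.

```python
def strip_tsql_comments(sql: str) -> str:
    """Drop -- and (possibly nested) /* */ comments, keep '...' strings."""
    out = []
    i, n = 0, len(sql)
    depth = 0  # current nesting level of /* ... */
    while i < n:
        two = sql[i:i + 2]
        if depth > 0:
            # Inside a block comment: only track opens and closes.
            if two == "/*":
                depth += 1
                i += 2
            elif two == "*/":
                depth -= 1
                i += 2
            else:
                i += 1
        elif two == "/*":
            depth = 1
            i += 2
        elif two == "--":
            # Skip to the end of the line; the newline itself is kept.
            while i < n and sql[i] not in "\r\n":
                i += 1
        elif sql[i] == "'":
            # Copy the string literal verbatim; '' is an escaped quote.
            j = i + 1
            while j < n:
                if sql[j] == "'":
                    if sql[j + 1:j + 2] == "'":
                        j += 2
                        continue
                    break
                j += 1
            out.append(sql[i:j + 1])
            i = j + 1
        else:
            out.append(sql[i])
            i += 1
    return "".join(out)


nested = "/* blabla\n/* comment within comment */\nblabla\n*/SELECT 1"
result = strip_tsql_comments(nested)
```

The depth counter is the part a single classic regex cannot express, which is why the nested case needs code like this or a proper lexer.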
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    3
    Rep Power
    0
    Shi*, but thanks a lot for the quick answer
