#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2012
    Posts
    4
    Rep Power
    0

    XML regular expression


    One of the tools we use reads an XML file to count the lines of code in source files of given types (Java, SQL, etc.). Below is the part of the XML file that makes it ignore lines containing only the word "begin" or "end".

    <codeArea name="Begin/End tags" isCode="false" >
    <expression>^\s*begin\s*$</expression>
    <expression>^\s*end\s*$</expression>
    </codeArea>

    In the same way, how can I write an expression to ignore the "case" statements used with "switch" in programming languages? For example, in the code below I want to ignore the "case" lines:


    switch ((char)(e.KeyChar))

    {
    case '\b':
    case "Thr":
    case '1':
    }

    I tried the following, but it didn't work. Could someone please help?


    <expression>^\s*case[*]:\s*$</expression>
  2. #2
  3. Jealous Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,302
    Rep Power
    9400
    Leading whitespace, "case", whitespace, something, maybe more whitespace (which you can roll into the "something"), a colon, maybe even more whitespace, and the end of the line.
    Code:
    ^\s*case\s.*:\s*$
    [*] means literally an asterisk while .* means pretty much anything.
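A quick way to check the expression above against the sample lines from the question is sketched below. It uses Python's re module purely for illustration; the regex engine inside the line-counting tool is not stated in the thread and may behave differently. The names CASE_LINE, ORIGINAL_ATTEMPT and is_case_line are made up for this sketch.

```python
import re

# The suggested expression: whitespace, "case", one whitespace character,
# anything, a colon, optional trailing whitespace, end of line.
CASE_LINE = re.compile(r"^\s*case\s.*:\s*$")

# The original attempt: [*] is a character class holding only a literal asterisk.
ORIGINAL_ATTEMPT = re.compile(r"^\s*case[*]:\s*$")


def is_case_line(line: str) -> bool:
    """Return True if the line would be ignored by the suggested expression."""
    return CASE_LINE.search(line) is not None


samples = [
    "case '\\b':",
    '    case "Thr":',
    "case '1':   ",
    "switch ((char)(e.KeyChar))",
    "{",
]

for line in samples:
    print(repr(line), is_case_line(line))

# The original attempt only matches a line that literally reads "case*:".
print(ORIGINAL_ATTEMPT.search("case*:") is not None)
print(ORIGINAL_ATTEMPT.search("case '1':") is not None)
```

The three "case" samples are reported as matches, while the "switch" line and the brace are not.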
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2011
    Posts
    29
    Rep Power
    0
    Originally Posted by requinix
    [*] means literally an asterisk while .* means pretty much anything.
    It's a good idea to use a non-greedy quantifier here:
    Code:
    ^\s*case\s.*?:\s*$
    When the case statements are on the same line, there will be several colons, and we want to match the first one, not the last one.
  6. #4
  7. Jealous Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,302
    Rep Power
    9400
    Originally Posted by abareplace
    When the case statements are on the same line, there will be several colons, and we want to match the first one, not the last one.
    To do that you'd have to replace the last \s* with something else. As it stands both expressions accomplish the same thing, but yours creeps along the line while mine goes to the end and backtracks.
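The point that both expressions accept the same text can be checked with a small sketch. Python's re module is used here only as a stand-in, since the tool's actual regex engine is not named in the thread; GREEDY and LAZY are names invented for this example.

```python
import re

GREEDY = re.compile(r"^\s*case\s.*:\s*$")
LAZY = re.compile(r"^\s*case\s.*?:\s*$")

# Two case labels on one line, with trailing whitespace.
line = "case 'a': case 'b':  "

greedy_match = GREEDY.search(line)
lazy_match = LAZY.search(line)

# Because of the trailing \s*$ the lazy version cannot stop at the first colon:
# what follows it is not just whitespace, so it keeps extending to the last colon.
print(greedy_match.group(0) == lazy_match.group(0))

# A line with code after the colon is rejected by both expressions.
other = "case 1: foo();"
print(GREEDY.search(other) is None, LAZY.search(other) is None)
```

Both matches cover the whole line, so the only difference between the two quantifiers here is how the engine arrives at the match.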
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2011
    Posts
    29
    Rep Power
    0
    Originally Posted by requinix
    To do that you'd have to replace the last \s* with something else.
    \s* can match an empty string; if there is no newline, it still works.
  10. #6
  11. Jealous Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,302
    Rep Power
    9400
    Originally Posted by abareplace
    \s* can match an empty string; if there is no newline, it still works.
    Yeah... But that doesn't change anything. If there were two cases on one line then both our expressions would match the entire line (as well as trailing whitespace). No functional difference.

    So really there should be a change: all the \s should be merely [ \t]. (Unless they're supposed to eat up empty lines.) And that should be done for every expression.
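The difference between \s and [ \t] only shows up if the expression is applied to text containing newlines. Whether the tool feeds its expressions one line at a time or the whole file is not stated in the thread, so the sketch below assumes whole-file matching in multiline mode, using Python's re module for illustration. WITH_S and WITH_BLANKS are hypothetical names.

```python
import re

# Assumption: the whole file is matched at once, with ^ and $ anchoring at line breaks.
WITH_S = re.compile(r"^\s*case\s.*:\s*$", re.MULTILINE)
WITH_BLANKS = re.compile(r"^[ \t]*case[ \t].*:[ \t]*$", re.MULTILINE)

text = "case 1:\n\n\nx = 1\n"

s_match = WITH_S.search(text)
blank_match = WITH_BLANKS.search(text)

# \s also matches newlines, so the trailing \s* swallows the empty lines that follow.
print(repr(s_match.group(0)))

# [ \t] stops at the end of the line, so only the case line itself is matched.
print(repr(blank_match.group(0)))
```

Under that assumption the \s version spans more than one line, which could skew a per-line count, while the [ \t] version stays on the case line.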
