#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Posts
    1
    Rep Power
    0

    Three python regex expressions around underscores


    in python regex,

    a) in the first expression i have to find all characters before the first underscore in patterns like this:

    cannon_mac_23567_prsln_333
    jones_james_343342_prsln_333
    smith_john_223462_prsln_333

    so, i have to get cannon, jones, and smith

    b) in a separate expression i have to find all characters between the first and second underscore. so, i need to find mac, james, and john in the examples above.

    c) in the last expression i have to find the first underscore
  2. #2
  3. Come play with me!
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    13,744
    Rep Power
    9397
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    776
    Rep Power
    495
    This looks like a school assignment. Please do not expect us to do your homework for you. It would probably take me less than 5 minutes, but I wouldn't be doing you a favor.

    Please show us what you have tried so far and tell us if something is not working properly; then we can help you solve the problems you encounter.
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Posts
    4
    Rep Power
    0
    Originally Posted by requinix
    What have you tried so far?

    i know very little about regular expressions and am trying to help my wife with some file renaming for her work, using a renaming application that supports regular expressions. the job has to be done in three passes, which is why i listed a), b), and c).

    to try and get the cannon, jones, and smith from part a) i tried ^[^_]+(?=_) and a few others lifted from forum posts but they didn't work so i figured i better reach out to people that actually know what they're doing.
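
    For reference, the lookahead pattern quoted above does match the leading field under Python's standard `re` module, so the trouble may lie in how the renaming application applies it rather than in the expression itself. A minimal check, assuming plain Python instead of the app (the list `names` is just the sample data from the first post):

```python
import re

names = [
    "cannon_mac_23567_prsln_333",
    "jones_james_343342_prsln_333",
    "smith_john_223462_prsln_333",
]

# Pattern from the post: one or more non-underscore characters at the start,
# followed by (but not including) an underscore.
pattern = re.compile(r"^[^_]+(?=_)")

# group(0) is the whole match, since this pattern has no capture group.
first_parts = [pattern.match(name).group(0) for name in names]
print(first_parts)
```

    Because the lookahead `(?=_)` consumes nothing, the match itself is the name without the underscore. A tool that replaces the whole match would therefore leave the underscore in place.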

    p.s. i'm posting under a different user name because my login didn't work and i did not get the password reset email
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Posts
    4
    Rep Power
    0
    Originally Posted by Laurent_R
    This looks like a school assignment. Please do not expect us to do your homework for you. It would probably take me less than 5 minutes, but I wouldn't be doing you a favor.

    Please show us what you have tried so far and tell us if something is not working properly; then we can help you solve the problems you encounter.
    hi there, i'm actually trying to help my wife with something for her work - she's using a renaming application that supports regular expressions. the job has to be done in three passes, which is why i listed a), b), and c).

    if you can help that would be great. i tried a few things lifted from other forums but they didn't work.

    thanks...
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    776
    Rep Power
    495
    OK, for the first one.

    There are various possibilities, but here is one: you need to match start of string, a number of characters other than _, followed by _, with a capture of the characters other than _.

    This could be:

    Code:
    /^([^_]+)_/
    The second one could be something like this:

    Code:
    /^[^_]+_([^_]+)_/
    (I haven't tested them, as I am working in Perl rather than PHP, but I think they should work.)

    I leave the third one to your tries for the time being. Try those and tell us if they do what you want.
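
    The pieces described above (start of string, a run of non-underscore characters, the underscore, and the capture) can be spelled out one per line with Python's `re.VERBOSE` flag. This is only an annotated sketch; the names `FIRST_FIELD`, `SECOND_FIELD`, and `sample` are made up for the illustration:

```python
import re

# Same two patterns, written with re.VERBOSE so each piece can carry a comment.
FIRST_FIELD = re.compile(r"""
    ^          # start of string
    ([^_]+)    # capture: one or more characters that are not an underscore
    _          # the first underscore
""", re.VERBOSE)

SECOND_FIELD = re.compile(r"""
    ^          # start of string
    [^_]+      # first field (matched but not captured)
    _          # first underscore
    ([^_]+)    # capture: the second field
    _          # second underscore
""", re.VERBOSE)

sample = "smith_john_223462_prsln_333"
first_field = FIRST_FIELD.match(sample).group(1)
second_field = SECOND_FIELD.match(sample).group(1)
print(first_field, second_field)
```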
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Posts
    4
    Rep Power
    0
    hi, no result. the renaming app uses Python syntax for regular expressions. i don't know how that compares to Perl syntax.

    screen capture here: http://www.bigidearesults.com/files/regex/screen-1.jpg

    i was getting a result for the first one using this but it was not working right. see this screen capture for test showing problem with first line

    http://www.bigidearesults.com/files/regex/screen-2.jpg

    the objective with part one and two is to find and swap the first and last name using this app
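
    If the goal is only to swap the first two fields, one substitution with two capture groups may be enough. The sketch below is plain Python; whether the renaming app accepts a replacement string with backreferences such as `\2_\1_` is an assumption not confirmed in this thread, and `swap_first_two_fields` is a hypothetical helper name:

```python
import re

def swap_first_two_fields(name: str) -> str:
    """Swap the text before the first underscore with the text between
    the first and second underscores; the rest of the name is untouched."""
    # Group 1 = first field, group 2 = second field.
    return re.sub(r"^([^_]+)_([^_]+)_", r"\2_\1_", name, count=1)

swapped = swap_first_two_fields("cannon_mac_23567_prsln_333")
print(swapped)
```

    A name that does not contain at least two underscores does not match the pattern and is returned unchanged.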
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    776
    Rep Power
    495
    I do not know what the application you are using is, but it seems that if you are using this, you need to remove the beginning and ending "/" (which mark beginning and end of regex in Perl) from the regex I gave you, i.e., try with:

    Code:
    ^([^_]+)_
    and

    Code:
    ^[^_]+_([^_]+)_
    The last time I used Python was, I think, in 2002, so I certainly don't remember the syntax details (and they may have changed since). But most dynamic languages of today, including I think Python, use a regex syntax directly derived from Perl, so what I have given you (at least the pure regex) should work once you adapt whatever Python needs around the regex. After all, one of the most widely used regex libraries is called PCRE, for Perl Compatible Regular Expressions.

    Just to show how the regex I provided works, the following is a test of it under the Perl debugger:

    Code:
      DB<1> $c = "cannon_mac_23567_prsln_333";
    
      DB<2> print $1 if $c =~ /^([^_]+)_/;
    cannon
      DB<3> $c = "jones_james_343342_prsln_333";
    
      DB<4> print $1 if $c =~ /^([^_]+)_/;
    jones
    
      DB<5> print $1 if $c =~ /^[^_]+_([^_]+)_/;
    james
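
    A rough Python equivalent of the debugger session above, assuming the standard `re` module (the variable names are illustrative only). Python takes the pattern as a string, so there are no surrounding slashes, and the captured text is read from the match object instead of `$1`:

```python
import re

samples = [
    "cannon_mac_23567_prsln_333",
    "jones_james_343342_prsln_333",
    "smith_john_223462_prsln_333",
]

results = []
for c in samples:
    m1 = re.match(r"^([^_]+)_", c)          # text before the first underscore
    m2 = re.match(r"^[^_]+_([^_]+)_", c)    # text between first and second
    if m1 and m2:
        # group(1) plays the role of Perl's $1
        results.append((m1.group(1), m2.group(1)))

print(results)
```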
    Last edited by Laurent_R; March 6th, 2013 at 03:26 PM.
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Posts
    4
    Rep Power
    0
    it must be a bug in the renaming application i'm using. thanks for your help
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    776
    Rep Power
    495
    Hi,

    You could try to do some prints at various points in the program, to see what is done correctly and what is not right.
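
    If the app itself offers no way to print intermediate values, one option is to run the three passes outside it and compare. The following is a sketch under that assumption, in plain Python; `debug_passes` is a hypothetical helper, and pass c) is treated here as simply locating the first underscore:

```python
import re

def debug_passes(name: str) -> dict:
    """Run each of the three passes on one file name and print the result."""
    first = re.match(r"^([^_]+)_", name)           # pass a)
    second = re.match(r"^[^_]+_([^_]+)_", name)    # pass b)
    underscore = re.search(r"_", name)             # pass c): first underscore
    report = {
        "first_field": first.group(1) if first else None,
        "second_field": second.group(1) if second else None,
        "first_underscore_index": underscore.start() if underscore else None,
    }
    for key, value in report.items():
        print(f"{key}: {value!r}")
    return report

report = debug_passes("jones_james_343342_prsln_333")
```

    If these values look right here but not in the renaming app, that points at the app's handling of the expression rather than the expression itself.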
