#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2009
    Posts
    37
    Rep Power
    16

    JavaScript: Trying to Capture a Backreference


    I'm trying to capture a value within some strings in JavaScript.

    These are the possible strings:

    http://www.url.com/sw/standard_b.jpg
    http://www.url.com/sw/2_1238144910_3655223_ih_ee2.jpg
    http://www.url.com/sw/2_1238144910_3655223_c_yr8.jpg
    http://www.url.com/sw/56_1238193910_320_kh.jpg

    I'm trying to write a regex to capture the following values from each string:

    b
    ih
    c
    kh

    Every regex I write captures only these one-character backreferences:

    b
    c

    I can't figure out the regex that will capture the values above from the strings above.

    The closest I got was this:
    Code:
    /(.+)([a|b|c|ih|kh])(.*)?(\.jpg)/i
    It captures a value such as "b" correctly, but the moment I try to capture "ih" or "kh" in the possible strings, it cuts off the second character and returns only "i" (from ih) or "k" (from kh).

    I'm really stumped. The major problem occurs when the value to capture is longer than one character (as in ih|kh, as opposed to a|b|c).

    All insight or solutions will be appreciated.
  2. #2
  3. Did you steal it?
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,068
    Rep Power
    9398
    How did you decide that it should capture b/ih/c/kh? It looked like you were getting the first letters after a _ but your expression doesn't reflect this at all.

    Oh, and post all your JavaScript code. I'm thinking the problem may be there instead.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2009
    Posts
    37
    Rep Power
    16
    Those one- and two-character values are file ids in the name. My code is minimal as I'm just trying to alert() the actual value and nothing more, so there is nothing else interfering. I haven't decided to continue with the script, until I know how to do this. All I need is the value (file id).

    Because refreshing the page every time just to see an alert became time-consuming, I tested the strings and the regex at http://www.regular-expressions.info/javascriptexample.html using the "show match" button. It was much quicker, it did the same job as refreshing over and over again, and it returned the same results. The problem is the pattern.

    You should note that the first string's file name is a word followed by only one underscore while the others are just sequences of numbers separated by up to four underscores, and the file id is not always directly preceding the file extension. The pattern I wrote reflects this lack of a uniform naming convention, and it does return good results so long as the file id is only one character long.

    But the moment I try to return ih or kh, it cuts off the second character in the backreference.

    Comments on this post

    • requinix agrees : for insisting on what you knew was right: the problem was in the pattern
  6. #4
  7. Did you steal it?
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,068
    Rep Power
    9398
    Originally Posted by threequestions
    You should note that the first string's file name is a word followed by only one underscore while the others are just sequences of numbers separated by up to four underscores, and the file id is not always directly preceding the file extension.
    So how do you know what the ID is? What if you had a file named abc_123_b_c_789.jpg?


    ...
    Oh god, I can't believe I missed that. My fault for trying to be helpful after 2am.
    Code:
    ([a|b|c|ih|kh])
    [] marks a character set, and a set matches exactly one character. So that set will match any single one of the characters a, b, c, h, i, k, or | (because | is a regular character in there).

    Don't want that? Don't use it.
    Code:
    (a|b|c|ih|kh)
    Last edited by requinix; March 28th, 2009 at 04:21 PM.
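
As a rough check of the fix above, here is a minimal sketch that runs the corrected pattern against the four URLs from the first post. The helper name `extractFileId` and the constant `FILE_ID_PATTERN` are made up for illustration; the pattern itself is the original one with the square brackets swapped for a plain group.

```javascript
// Corrected pattern from the thread: parentheses group the alternatives,
// so "ih" and "kh" are matched as whole two-character tokens.
const FILE_ID_PATTERN = /(.+)(a|b|c|ih|kh)(.*)?(\.jpg)/i;

// Hypothetical helper: returns the captured file id (group 2),
// or null when the pattern does not match at all.
function extractFileId(url) {
  const match = FILE_ID_PATTERN.exec(url);
  return match ? match[2] : null;
}

// The four sample URLs from the original question.
const urls = [
  "http://www.url.com/sw/standard_b.jpg",
  "http://www.url.com/sw/2_1238144910_3655223_ih_ee2.jpg",
  "http://www.url.com/sw/2_1238144910_3655223_c_yr8.jpg",
  "http://www.url.com/sw/56_1238193910_320_kh.jpg",
];

const ids = urls.map(extractFileId);
console.log(ids); // [ 'b', 'ih', 'c', 'kh' ]
```

This works here because the greedy `(.+)` backtracks from the end of the string, so the last possible id before `.jpg` wins. As post #4 points out, a name like abc_123_b_c_789.jpg is still ambiguous, and this sketch does not try to settle that.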
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2009
    Posts
    37
    Rep Power
    16
    Thank you very much, requinix. I didn't realize the square brackets did that. I thought placing the pipe | in between the brackets would denote OR.
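
To make the bracket-versus-pipe difference concrete, here is a small sketch comparing the two forms in isolation. The variable names are illustrative only, and both patterns are anchored with ^ and $ so each test has to match the whole input string.

```javascript
// Inside [], every character (including |) is a literal member of the set,
// and the set matches exactly one character.
const charClass = /^[a|b|c|ih|kh]$/;

// Inside a group, | separates whole alternatives.
const alternation = /^(?:a|b|c|ih|kh)$/;

const results = {
  classMatchesPipe: charClass.test("|"),   // the pipe is just another character
  classMatchesIh: charClass.test("ih"),    // two characters, so no match
  classMatchesI: charClass.test("i"),      // a lone "i" is in the set
  altMatchesIh: alternation.test("ih"),    // "ih" is one of the alternatives
  altMatchesPipe: alternation.test("|"),   // the pipe is now an operator
  altMatchesI: alternation.test("i"),      // "i" alone is not an alternative
};

console.log(results);
```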
