Thread: fuzzy match?

    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2016
    Posts
    8
    Rep Power
    0

    fuzzy match?


    Hi regExp gurus,

    Here's a question. I have the following list:
    list = "ID,OPTION9,OPTION10,OPTION11,OPTION12,OPTION13,OPTION14,OPTION15,ORDERNO,OPTION10TXT,OPTION11TXT,OPTION16"
    I also have a term, "optoin16", which does not exactly match any item in the list. In this case the last item, OPTION16, is the closest match, and that is the one I want. By the way, the match needs to be case-insensitive.
    How do we construct a regular expression that finds "OPTION16" in this list based on "optoin16"?

    Thanks.
  2. #2
  3. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    16
    Rep Power
    0
    In JavaScript you write a regexp as "/pattern/modifiers". You need the "i" modifier, meaning ignore case. You should also consider the modifiers "m" (multiline) and "g" (global match, returning all results instead of only the first one).
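    A minimal sketch of what the "i" and "g" modifiers do with the list from the first post. The variable names are illustrative only. Note that the modifiers make the match case-insensitive but do not make it fuzzy: the misspelled term still finds nothing.

```javascript
// The list from the first post, kept on a single line.
const list = "ID,OPTION9,OPTION10,OPTION11,OPTION12,OPTION13,OPTION14,OPTION15,ORDERNO,OPTION10TXT,OPTION11TXT,OPTION16";

// Without "i" the lower-case pattern does not match the upper-case list.
const caseSensitive = /option16/.test(list);

// With "i" the same pattern matches regardless of case.
const caseInsensitive = /option16/i.test(list);

// The misspelled term still fails: "i" ignores case, not spelling.
const typoFound = /optoin16/i.test(list);

// Without "g", match() reports only the first hit.
const firstOnly = list.match(/option1\d/i);

// With "g", match() returns every hit as an array of strings.
const allMatches = list.match(/option1\d/gi);

console.log(caseSensitive, caseInsensitive, typoFound);
console.log(firstOnly[0], allMatches.length);
```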
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2016
    Posts
    8
    Rep Power
    0
    Ok, a couple of things. First, please ignore case; the list values may well be in lower case in the first place. Second, assume the list is on a single line.
    And ignore JavaScript for now. How would you do a regular-expression "fuzzy" match for the case I presented? Thanks.
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    16
    Rep Power
    0
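    I can only answer in the context of JavaScript, if that interests you:
    Code:
    const list = "ID,OPTION9,OPTION10,OPTION11,OPTION12,OPTION13,OPTION14,OPTION15,ORDERNO,OPTION10TXT,OPTION11TXT,OPTION16";
    const patt = /option16/i;
    // exec() returns null when nothing matches; otherwise match.index is the start offset.
    const match = patt.exec(list);
    if (match) alert(match.index);
    where match.index is the position of the match in the string. (patt.lastIndex is only updated when the "g" or "y" flag is set, so without those flags it stays 0.)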
  8. #5
  9. Forgotten Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,052
    Rep Power
    9616
    Regular expressions can't really do fuzzy matches. It's kinda there in the term: regular expression, meaning it has to fit a regular pattern.

    If your options are actually "OPTIONN" then I'd search based on the number and ignore the rest of the text. If they have normal names then I think I would (1) break the string into separate options, (2) calculate the Levenshtein distance from the input ("optoin") to each option, and (3) pick the closest one, provided it is within an acceptable distance.
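
    The three steps above could be sketched roughly as follows. This is only one way to do it, not a definitive implementation: the function names levenshtein and closestMatch and the maxDistance cutoff of 2 are assumptions made for illustration, and the cutoff would need tuning for real data.

```javascript
// Levenshtein distance via the standard dynamic-programming recurrence:
// each insertion, deletion, or substitution costs 1.
function levenshtein(a, b) {
  // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a[i-1]
        curr[j - 1] + 1,    // insert b[j-1]
        prev[j - 1] + cost  // substitute, or keep when the characters are equal
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: returns the option closest to term (ignoring case),
// or null when even the best option is further away than maxDistance.
function closestMatch(term, list, maxDistance = 2) {
  const needle = term.toLowerCase();
  let best = null;
  let bestDistance = Infinity;
  for (const raw of list.split(",")) {        // step 1: split into options
    const option = raw.trim();
    const d = levenshtein(needle, option.toLowerCase()); // step 2: distance
    if (d < bestDistance) {
      best = option;
      bestDistance = d;
    }
  }
  return bestDistance <= maxDistance ? best : null;      // step 3: cutoff
}

const list = "ID,OPTION9,OPTION10,OPTION11,OPTION12,OPTION13,OPTION14,OPTION15,ORDERNO,OPTION10TXT,OPTION11TXT,OPTION16";
console.log(closestMatch("optoin16", list));
```

    A swapped pair of letters such as "oi" for "io" counts as two substitutions under plain Levenshtein distance, which is why the cutoff in this sketch is 2 rather than 1.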

    I don't know where "optoin" is coming from, but this problem may also be considered a UI problem: rather than prompt someone to enter a name and try to find the best match, give them a list to choose from so there's no question about what they meant.
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2016
    Posts
    8
    Rep Power
    0
    I appreciate your input. Well, "optoin16" is a typo made by a programmer, and there could be other typos by the same programmer or by other programmers. So instead of "option16", the typo goes into the source code. Initially, the block of code where the typo resides may be used in, say, 90% of the use cases; however, when the other 10% of users access the application, the typo causes a malfunction...
  12. #7
  13. Forgotten Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    16,052
    Rep Power
    9616
    1. If the problem is a developer doing something wrong then you need to institute code reviews, because typos are one of the least significant mistakes someone can make in a codebase.
    2. If typos can be such a problem then you need to do something in code so that typos like this are irrelevant. I personally write code to maximize the amount of checking an IDE can do for me.
    3. Are you not testing your application before releasing it to users?
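    As a rough illustration of point 2, one possible approach (an assumption, not something described in this thread) is to route every field name through a single lookup that rejects unknown names, so a typo fails loudly the first time that line runs instead of silently misbehaving. KNOWN_FIELDS and field are hypothetical names.

```javascript
// Hypothetical whitelist of valid field names, taken from the list in the first post.
const KNOWN_FIELDS = new Set([
  "ID", "OPTION9", "OPTION10", "OPTION11", "OPTION12", "OPTION13",
  "OPTION14", "OPTION15", "ORDERNO", "OPTION10TXT", "OPTION11TXT", "OPTION16",
]);

// Returns the canonical (upper-case) field name, or throws on an unknown one.
function field(name) {
  const canonical = String(name).toUpperCase();
  if (!KNOWN_FIELDS.has(canonical)) {
    throw new RangeError("Unknown field name: " + name);
  }
  return canonical;
}

console.log(field("option16")); // a correctly spelled name passes through

try {
  field("optoin16"); // the typo from this thread is rejected
} catch (err) {
  console.log(err.message);
}
```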
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2016
    Posts
    8
    Rep Power
    0
    @ your #2: I'm using a simple yet quite powerful text editor for writing code, and it can't detect typos. What IDE do you use that is able to detect them?
    @ your #3: with hundreds of pieces of business logic, extremely thorough testing is very time-consuming, though it is necessary.

    In addition, I'm only using a typo as an example; there might be other cases that require such a "fuzzy" match.
