#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    7
    Rep Power
    0

    Mysql regex alphanumerical with some special chars


    Alright, I've had a some nightmares with it already so i bow my stupid head to the almighty hive-mind.

    I run a small DB with the codes for spare parts. The codes are alphanumerical and i need some proper way to search for them. They can include special chars like parentheses (), and/or @, #, &, $, <, >.

    So the codes could look like:

    "A", "Bb", "2C8", "A7(BO)19", "B29H$", "29H(6JI)0#", "29H(6JI)0#<O>", etc

    The problem is that all these special chars in codes are optional. And the user should be able to find the code by at least its alphabetical part. So the query is something like:


    SELECT *
    FROM table
    WHERE column REGEXP '^[a-z0-9\@\#\$\&\<>()]*
    AND column LIKE CONCAT('%', '$user_value', '%')


    The DB doesnt return me "S(LJ)9" or "09S(LJ)3$" if i seek for "SLJ" or "S(LJ)"

    Would appreciate any help.
  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,118
    Rep Power
    4274
    the regexp value is missing the closing quote
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    7
    Rep Power
    0
    Originally Posted by r937
    the regexp value is missing the closing quote
    Thanks, you are right, its missing over here and it should be:

    SELECT *
    FROM table
    WHERE column REGEXP '^[a-z0-9\@\#\$\&\<>()]*'
    AND column LIKE CONCAT('%', '$user_value', '%');


    However its not missing in the code. The query is generated by the script is displayed for test purposes and then i check it in MySQL Workbench. For some things it works, for some it doesn'n.
  6. #4
  7. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,118
    Rep Power
    4274
    um, just guessing, but try this --

    REGEXP '^[a-zA-Z0-9\@\#\$\&\<>()]*'
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    7
    Rep Power
    0
    Originally Posted by r937
    um, just guessing, but try this --

    REGEXP '^[a-zA-Z0-9\@\#\$\&\<>()]*'
    Thanks, r937, I really missed this part, but i need also to be able to search for "S(LJ)9" with "SLJ". Any ideas?
  10. #6
  11. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,118
    Rep Power
    4274
    on the page in da mysql manual for regexp, one of the users posting comments says you have to put square brackets around any parentheses you want

    REGEXP '^[a-zA-Z0-9\@\#\$\&\<>[(][)]]*'

    not sure if that's even valid, but give it a try
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  12. #7
  13. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,957
    Rep Power
    1045
    Hi,

    Originally Posted by r937
    not sure if that's even valid
    no, it's neither correct nor necessary.

    The square brackets denote character classes (just like the one that's already there). And since parentheses in character classes are always interpreted as literal characters rather than delimiters, you can use this as a kind of poor man's escaping.

    But the parentheses in the example already are in a character class, so no escaping necessary. If you still do it, you add literal brackets to the character class, so now it also matches "[" and "]" (the second occurence of each bracket has no effect).

    When you actually need to escape characters within regexes, use a double backslash.

    As to the original question:

    Your problem is not the regex but the LIKE. The string "S(LJ)9" simply isn't LIKE "SLJ", regardless of how many regexes it has passed earlier.

    If you wanna ignore all special characters in the column, remove the special characters from the input string and then make a a regex by placing a [@#$&<>()] character class between each character. Note: Do not try to escape "$" or "&" with single backslashes. That again is neither necessary (depending on the context) nor valid. A regex escape is a double backslash. A single backslashes is an escape character within the MySQL string.

    Your example "SLJ" would become the regex

    Code:
    'S[@#$&<>()]*L[@#$&<>()]*J'
    This matches "S(LJ)9", "09S(LJ)3$" and any string consisting of "SLJ" with possible "special characters" before, after or between the characters.

    Not pretty, not efficient, but you asked for it.
    Last edited by Jacques1; April 16th, 2013 at 06:15 AM.
    The 6 worst sins of security ē How to (properly) access a MySQL database with PHP

    Why canít I use certain words like "drop" as part of my Security Question answers?
    There are certain words used by hackers to try to gain access to systems and manipulate data; therefore, the following words are restricted: "select," "delete," "update," "insert," "drop" and "null".
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    7
    Rep Power
    0
    Thanks, guys. I've altered the query so the single slashes are not withing the character class brackets "[]" anymore. Seems to work the same as it did with slashes.

    Thanks for the advice, Jacques1, I will try to seek the way you told me. I will seek for the records WHERE REGEXP '^[a-zA-Z0-9@#$&<>()]*' or '*regexp after each char as you told me*'. I think that will do the trick. That Its not efficient, i agree, but the table itself is only 30K of rows or less, so it will do for now.
  16. #9
  17. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,957
    Rep Power
    1045
    You don't need the old regex, just use the one I gave you.

    I'm not even sure what you're trying to do with your regex, since it's not relevant for the search. Do you wanna check if the codes are valid? That's something for an INSERT and UPDATE trigger (or CHECK constraints in other database systems). You certainly don't validate data everytime somebody does a SELECT.
    The 6 worst sins of security ē How to (properly) access a MySQL database with PHP

    Why canít I use certain words like "drop" as part of my Security Question answers?
    There are certain words used by hackers to try to gain access to systems and manipulate data; therefore, the following words are restricted: "select," "delete," "update," "insert," "drop" and "null".
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    7
    Rep Power
    0
    I need to do something that would allow me to find the code from as minimal of the user input as possible. The idea is to create a pattern combined with the user-input to help the search. Am i really missing something?
  20. #11
  21. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,957
    Rep Power
    1045
    Originally Posted by Rhiozan
    I need to do something that would allow me to find the code from as minimal of the user input as possible.
    And that's exactly what the regex I gave you does. I don't see why you keep your own regex. What is that supposed to do?

    Actually, your regex does absolutely nothing except keeping the CPU busy. It matches any string, because you ask for zero or more of certain characters at the beginning of the string. Obviously, this is a tautology: Either a string has those characters (> 0 occurences), or it doesn't have those characters (0 occurences).
    The 6 worst sins of security ē How to (properly) access a MySQL database with PHP

    Why canít I use certain words like "drop" as part of my Security Question answers?
    There are certain words used by hackers to try to gain access to systems and manipulate data; therefore, the following words are restricted: "select," "delete," "update," "insert," "drop" and "null".
  22. #12
  23. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    7
    Rep Power
    0
    Thank you, Jacques1. The regex you provided worked.

    But how do i restrict the search to the amount of alphabetical chars the user submitted? For instance i have the value "S9" that is stored in the db and that want to find with input of "S". But the problem is, the search returns irrelevant values like S(P19)12, SOIU and so on...
  24. #13
  25. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,957
    Rep Power
    1045
    Not sure if I understand you correctly. So before and after the search string, there mustn't be any other alphabetic characters?

    So "SLJ" should match "4SL5J" or "(SL)J10" but not "SLJC" or "ASLJ"?

    That's
    Code:
    '^[@#$&<>()]*S[@#$&<>()]*L[@#$&<>()]*J[@#$&<>()]*$'
    The 6 worst sins of security ē How to (properly) access a MySQL database with PHP

    Why canít I use certain words like "drop" as part of my Security Question answers?
    There are certain words used by hackers to try to gain access to systems and manipulate data; therefore, the following words are restricted: "select," "delete," "update," "insert," "drop" and "null".
  26. #14
  27. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2013
    Posts
    7
    Rep Power
    0
    Seems to be working fine! Thank you!

IMN logo majestic logo threadwatch logo seochat tools logo