#1
  1. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    6
    Rep Power
    0

    Question MySQL REGEXP [[.anything.]] won't work ?


    In the MySQL docs you can read the following:
    [[.characters.]]
    The sequence of characters of that collating element. The sequence is a single element of the bracket expression's list. A bracket expression containing a multi-character collating element can thus match more than one character, for example, if the collating sequence includes a ch collating element, then the regular expression [[.ch.]]*c matches the first five characters of chchcc.
    Link to MySQL REGEXP documentation
    I tried this query for example:

    SELECT "Dois helicópteros foram dados como desaparecidos devido às tempestades" REGEXP "[[.devido.]]";

    It won't work Am I doing anything wrong?

    I'm trying to substitute «LIKE» on a search query. Fulltext search won't do the trick since I require exact control over the results obtained and the MySQL database version in use is prior to 4.xx

    After I finally figure out how to REGEXP the hell out of it, the next problem will be how to escape special chars from the variable that will be passed into the REGEXP ...

    example:
    $query="select somecolumn from sometable where somecolumn REGEXP \"[[:<:]][[.$searchstring.]][[:>:]]\";" // obviously, the $searchstring has to be escaped for regexp special chars before being used in the REGEXP query ...

    So now, being the problems exposed I turn to you for help

    MAYDAY MAYDAY ?

    Best regards and thanks in advance for your most welcome help
    Kore
  2. #2
  3. /(bb|[^b]{2})/

    Join Date
    Nov 2001
    Location
    Somewhere in the great unknown
    Posts
    5,163
    Rep Power
    796
    well, first off, this belongs in the mysql forum, not the php one.

    On your first sql statement try:
    SELECT "Dois helicópteros foram dados como desaparecidos devido às tempestades" REGEXP "(devido)";

    If you are wanting an exact match, that is the easiest way to do it.
  4. #3
  5. mod_dev_shed
    Devshed Supreme Being (6500+ posts)

    Join Date
    Sep 2002
    Location
    Atlanta, GA
    Posts
    14,800
    Rep Power
    1103
    This isn't PHP related -- you want to use REs in MySQL, not REs in PHP. You'd get better responses if you have a moderator move this to the MySQL forum, or delete this post and repost it there (but only after you delete this one as to not cross-post).
  6. #4
  7. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    6
    Rep Power
    0
    Hi,

    This is indeed related to PHP, I suppose you haven't read the second part of my post...

    Onslaught: SELECT "Dois helicópteros foram dados como desaparecidos devido às tempestades" REGEXP "(devido)";
    this will return results for any combination of the caracters d, e, v, i, d, o ... and not for the word «devido». right?

    As for the second problem exposed on my post (related entirely to PHP), no one suggests any solution for escaping the characters mentioned?

    Thanks in advance
  8. #5
  9. /(bb|[^b]{2})/

    Join Date
    Nov 2001
    Location
    Somewhere in the great unknown
    Posts
    5,163
    Rep Power
    796
    no, () is a sequence listing
    [] is a class listing

    so in my above example it should only return 1 if it matches in order.
    so this:
    Code:
    SELECT "Dois helicópteros foram dados como desaparecidos devido às tempestades" REGEXP "(devido)"
    is not the same as
    Code:
    SELECT "Dois helicópteros foram dados como desaparecidos devido às tempestades" REGEXP "(odived)";
    The first statement would return 1, whereas the second would return 0.

    Now, if you exchange () for [] then both statement would return 1.
    As far as the collating example listed on the mysql page. I could not get it to work even with thier example.
    Anyway, you can follow the basic rules of regular expressions and it should be acceptable.

    Now, on to your second question.
    First, what characters need to be escaped?
    Second, you are mixing class types so that will probably crap out. The class type [[:<:]] [[:>:]] is for searching for a complete word, so just take out the collating class syntax the put the word instide of the first class.
  10. #6
  11. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    6
    Rep Power
    0
    Originally posted by Onslaught
    no, () is a sequence listing
    [] is a class listing

    so in my above example it should only return 1 if it matches in order.
    so this:
    Code:
    SELECT "Dois helicópteros foram dados como desaparecidos devido às tempestades" REGEXP "(devido)"
    is not the same as
    Code:
    SELECT "Dois helicópteros foram dados como desaparecidos devido às tempestades" REGEXP "(odived)";
    The first statement would return 1, whereas the second would return 0.

    Now, if you exchange () for [] then both statement would return 1.
    As far as the collating example listed on the mysql page. I could not get it to work even with thier example.
    Anyway, you can follow the basic rules of regular expressions and it should be acceptable.



    Following your tip I tried this:
    SELECT "Dois helicópteros foram dados como desaparecidos devido às tempestades" REGEXP "[[:<:]](devido)[[:>:]]";

    So far it seems to be working right Thanks for your tip


    Originally posted by Onslaught
    Now, on to your second question.
    First, what characters need to be escaped?
    Second, you are mixing class types so that will probably crap out. The class type [[:<:]] [[:>:]] is for searching for a complete word, so just take out the collating class syntax the put the word instide of the first class.
    The characters that need to be escaped are the special regexp characters, like $, ., ^, *, etc ...
    Imagine this:
    $searchstring='zbr ^ xpto * $ lots of money';
    $query="SELECT somecolumn from sometable where somecolumn REGEXP \"[[:<:]]($searchstring)[[:>:]]\"";

    Won't the $, ^ and * blow the regexp? They probably will need to be escaped so that they act as regular characters in the searchstring (having no parseing value)...
    Did I explain this well? Perhaps the correct term isn't "escape", but I'm not really used to discuss these problems in english, so I'm taking guesses on what terminology to use...
    How can I protect those special chars so that they are used as search chars and not as parsing units... ? Any clue?

    Thanks in advance for your help
  12. #7
  13. /(bb|[^b]{2})/

    Join Date
    Nov 2001
    Location
    Somewhere in the great unknown
    Posts
    5,163
    Rep Power
    796
    You will have to create a little function that will go through the search string and escape all of the special characters.
  14. #8
  15. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    6
    Rep Power
    0
    Originally posted by Onslaught
    You will have to create a little function that will go through the search string and escape all of the special characters.
    That's what I was trying to run away from ;D lol Perhaps there was any function made that already did that.

    Anyclue how I escape the characters ? I tried this: "\^devido" to search for «^devido» and it didn't work... any clue what's the escape character?

    Thanks for your help
  16. #9
  17. /(bb|[^b]{2})/

    Join Date
    Nov 2001
    Location
    Somewhere in the great unknown
    Posts
    5,163
    Rep Power
    796
    The only thing I know to do would be to exchange them for wildcards.
    i.e. .

    I don't know of a legal way to escape posix style regular expressions, which is what mysql uses. You might can post in the mysql forum and see if anyone knows there a legal method for this.
  18. #10
  19. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    6
    Rep Power
    0
    Originally posted by Onslaught
    The only thing I know to do would be to exchange them for wildcards.
    i.e. .

    I don't know of a legal way to escape posix style regular expressions, which is what mysql uses. You might can post in the mysql forum and see if anyone knows there a legal method for this.
    I will try that, thanks for your help

IMN logo majestic logo threadwatch logo seochat tools logo