#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2000
    Location
    London, England
    Posts
    251
    Rep Power
    15
    I've got a table with a load of plain text news articles in it, and I want to allow users to search for stories that contain particlar word(s).

    So I'm doing something like:

    SELECT *
    FROM stories
    WHERE text LIKE '%searchword%';

    The problem is with the % wildcards. This will happily find the word I want, but not just a whole word but part of a word too, eg. it will find "park" even in "parked" and "parking".

    And obviously this is what I'd expect because that's what the % wildcard means.

    But is there a simple/easy/elegant way of finding a search term *only* when it's a whole word?

    The obvious thing to do is to add spaces either side of the term:

    LIKE '% searchterm %'

    but of course some words are going to be followed by punctuation, some preceded by quotes, and of course there's always the first word in the field that doesn't have a preceding character at all.

    Anyone ever done this?

    And for an extra point....

    Searching the full text of every story like this seems to work well enough at the moment when there's a small number of users and a small number of stories. However, I can see it getting very inefficient and slow with a large number of users and/or stories. Any pointers on the best way to index the stories allowing for a more efficient search, or has anyone successfully used this brute force method on a large scale?
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 1999
    Location
    Annapolis, Maryland US
    Posts
    113
    Rep Power
    15
    I would suggest familiarizing yourself with and using MySQL's regexp()function to further refine your select criteria.

    Example - to find all occurrences of "searchword" in your stories table, and taking into account that "searchword" may be preceded by a space or perhaps quotes AND followed by possibly either another space, quote or punctuation, something like this might work....

    > SELECT *
    > FROM stories
    > WHERE text
    > REGEXP '[ "]{1,}searchword[.!?"]?';

    There's so many things you can do with regular expressions, it's impossible to do them justice in this simple example. I think DevShed or PHPBuilder has some good regex tutorials.

    Kyuzo
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2000
    Location
    London, England
    Posts
    251
    Rep Power
    15
    Thanks for pointing that out.

    I'm familiar with regexs from Perl so I'm sure I can work out a good solution from here.

    :-)

Similar Threads

  1. float question
    By yeah-yeah in forum HTML Programming
    Replies: 10
    Last Post: January 15th, 2004, 03:59 PM
  2. Full text search problems
    By rzporter in forum MySQL Help
    Replies: 2
    Last Post: July 30th, 2002, 02:31 PM
  3. RegEx Problem!
    By mmeyer in forum PHP Development
    Replies: 7
    Last Post: November 5th, 2001, 03:29 AM
  4. full text search engine
    By pieter guis in forum PHP Development
    Replies: 0
    Last Post: July 27th, 2000, 08:20 AM
  5. php full text search
    By pieter guis in forum PHP Development
    Replies: 1
    Last Post: July 25th, 2000, 06:00 AM

IMN logo majestic logo threadwatch logo seochat tools logo