#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2009
    Posts
    11
    Rep Power
    0

    Web site caching policy and outsourcing


    Hi group,
    I want to ask about web site caching policies.

    I have built a website in C from scratch and have reached the point where I need a fuzzy search facility. I want to post a job for this on Rent-a-coder, Find-a-guru or Scriptlance, but it occurs to me that I really need to settle on a caching policy first. There would be little point in doing fuzzy searching through built-in MySQL facilities, for example (if that's even possible), only to end up with the whole database contents cached in main memory, where it would be far faster to search. (My data will be fairly small for quite some time.)

    I believe that caching web pages, semi or fully formed, rather than caching the raw relational data is more usual.
    I also believe a simple expiry policy is often used rather than directly tracking dependencies between raw data and web pages, but that doesn't sound like an acceptable solution to me because the generated pages can be inconsistent with the data until they expire.

    Does anyone want to discuss caching policies, or the pros and cons of the three contractor websites above?

    Very best wishes,
    David
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2010
    Location
    Australia
    Posts
    35
    Rep Power
    6
    Sounds like you need a search engine.
    I believe that caching web pages, semi or fully formed, rather than caching the raw relational data is more usual.
    Yep, if there is no direct (definable and consistent) relationship between the database and the web pages, then you need to index the fully formed web pages. But then you have WordPress, which does search the database and provides web links, because there is a direct relationship between the DB and the page. Hence, WordPress's search is live and un-cached.
    ...doesn't sound like an acceptable solution due to inconsistencies in the generated web pages
    Yep, how many times have you searched Google and received a result, only to go to the page in question and find it hasn't got what you want? This happens a lot on blog/news sites where old content is pushed onto another page, but Google still lists the page originally indexed.

    If your "raw relational data" can be searched and URLs can be derived from the results, then I'd do that.
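    As a rough sketch of what I mean (Python, with the standard-library sqlite3 module standing in for MySQL; the `products` table, its columns and the `/product/<id>` URL pattern are all made up for illustration):

```python
import sqlite3

def search_products(conn, term):
    """Return (name, url) pairs for products whose name or description contains term."""
    # Note: % and _ inside term act as LIKE wildcards; escape them if that matters.
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT id, name FROM products "
        "WHERE name LIKE ? OR description LIKE ? ORDER BY id",
        (pattern, pattern),
    ).fetchall()
    # The URL is derived from the primary key (hypothetical URL scheme),
    # so no separate index of rendered pages is needed.
    return [(name, f"/product/{pid}") for pid, name in rows]

# Tiny in-memory database with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO products (name, description) VALUES (?, ?)",
    [("Sunset Print", "Oil painting of a harbour"), ("Blue Vase", "Hand-thrown ceramic")],
)
results = search_products(conn, "vase")
```

    This is only plain substring matching, not fuzzy matching, but it shows the search running live against the data with the links worked out from the rows.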

    If not, then a spider is required to build a cached index of your website content. I've coded a number of spiders in the past, but you can also use Google's results to get a list of URLs on your site, e.g. Google "site:www.microsoft.com".

    The reason a simple expiry policy is used is that by the time you have head checked (HTTP HEAD) a page and compared the result with some cached values, you might as well have fetched the entire HTML and re-indexed it.
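    For what it's worth, a simple expiry policy can be as small as the sketch below (Python; the `ExpiringCache` class, its `ttl` parameter and the `rebuild` callback are names I've invented for illustration, not from any particular library):

```python
import time

class ExpiringCache:
    """Cache whose entries are treated as stale once they are ttl seconds old."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock   # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (stored_at, value)

    def get(self, key, rebuild):
        """Return the cached value, calling rebuild(key) if it is missing or expired."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        # Missing or too old: regenerate the page and restart its timer.
        value = rebuild(key)
        self._entries[key] = (now, value)
        return value
```

    The trade-off is the one already mentioned: a page can be out of date for up to `ttl` seconds after the underlying data changes.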

    So the question is whether you can provide a live spider search. That depends on the speed of your server and the amount of traffic you are planning to have, I guess.

    Those are just my views. Other people will have different views based on different experiences.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2009
    Posts
    11
    Rep Power
    0
    Many thanks for your thoughts.

    The search I need does not have to search the text of existing web pages; it simply has to search the product names / descriptions and artist names in my database.

    I guess there are a few commercial / open source apps available that might implement this directly on my database if I decided not to use caching.
    Any suggestions here would be much appreciated.

    I could build a cached page for each category / sub-category page, each artist page and each product page, but I would then need to build cache coherency logic so that when the relational data changes, the corresponding cached page(s) are invalidated or rebuilt.
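    To make the idea concrete, here is a minimal sketch of the kind of coherency logic I have in mind (written in Python for brevity rather than C; the `PageCache` class and the `("product", 42)` style keys are just illustrative names, not an existing design):

```python
from collections import defaultdict

class PageCache:
    """Cached pages plus a reverse map from data items to the pages built from them."""

    def __init__(self):
        self._pages = {}                      # url -> rendered HTML
        self._dependents = defaultdict(set)   # data key -> urls that used it

    def store(self, url, html, depends_on):
        """Cache a rendered page and record which data items it was built from."""
        self._pages[url] = html
        for key in depends_on:
            self._dependents[key].add(url)

    def get(self, url):
        return self._pages.get(url)  # None means "not cached, rebuild it"

    def invalidate(self, key):
        """Drop every cached page built from key; return the affected URLs."""
        urls = self._dependents.pop(key, set())
        for url in urls:
            self._pages.pop(url, None)
        return sorted(urls)

cache = PageCache()
cache.store("/product/42", "<html>...</html>", depends_on=[("product", 42)])
cache.store("/category/prints", "<html>...</html>",
            depends_on=[("product", 42), ("product", 43)])
cache.store("/artist/9", "<html>...</html>", depends_on=[("artist", 9)])

# Product 42 was edited: only the pages that used it are dropped.
affected = cache.invalidate(("product", 42))
```

    Whatever code updates the database would call `invalidate` for each changed item, and a dropped page would be rebuilt (and its dependencies re-registered) the next time it is requested.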

    Will MySQL's string matching function do an acceptable job of searching the database if I decided not to use caching?
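    For comparison, if the names were simply held in memory, I imagine a fuzzy match could be sketched roughly like this (Python's standard-library difflib; the `fuzzy_search` helper and the sample names are my own invention for illustration):

```python
import difflib

def fuzzy_search(query, names, limit=5, cutoff=0.6):
    """Return up to limit names that approximately match query, best match first."""
    # Compare in lower case, but hand back the names as originally spelled.
    by_lower = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(
        query.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    return [by_lower[hit] for hit in hits]

artists = ["Claude Monet", "Van Gogh", "Frida Kahlo"]
matches = fuzzy_search("van gough", artists)
```

    This compares whole strings, so it would probably suit short product and artist names better than long descriptions.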

    If I do use caching, and assume that at some point not all the data will fit in main memory, then each entry in the database would need an indication of which cached web page(s) must be invalidated when that data item (product or artist name) is altered or removed.

    Any thoughts would be much appreciated.
    Very best wishes,
    David
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Loyal (3000 - 3499 posts)

    Join Date
    May 2004
    Posts
    3,417
    Rep Power
    887
    There isn't a C or derivative language question here so I have requested this thread be moved to the Software Design forum.

    Originally Posted by davidmellor
    I want to put a job for this on Rent-a-coder, Find-a-guru or script lance but it occurs to me I really need to sort out a policy on caching first.
    It's always easier for the bidders if they know what they are bidding on, and it will increase your odds of finding a good match for the task.
    I no longer wish to be associated with this site.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2009
    Posts
    11
    Rep Power
    0
    Sorry. I had actually not intended to post the message in this section. It may be better to delete it so I can repost a revised version elsewhere.... if possible.
    David
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2010
    Location
    Australia
    Posts
    35
    Rep Power
    6
    Oh, it's OK, David. jwdonahue runs around telling everyone off.

    Basically, what jwdonahue is saying is that if you are planning on paying someone, then you need to provide a job spec or pay someone to help you write one.

    Otherwise, you are asking people to contribute help to a private (possibly commercial) code base which you aren't willing to post, in which case you are just here to save money.

    Hey, no one can blame you for wanting to save money, but yes, if you can't do it yourself, then cough up the money like everyone else has to.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2010
    Posts
    2
    Rep Power
    0
    Hi, this is Nicole from Rent a Coder.

    As others have suggested, providing a thorough spec is highly recommended. Fortunately, Rentacoder makes creating a proper spec rather easy with its Bid Request Wizard -- an easy, online question and answer type program.

    If you have any questions, please let me know. You can also call in to talk to a facilitator 7 days a week, or email us.

    Nicole
