February 9th, 2010, 03:18 PM
Web site caching policy and outsourcing
I want to ask about web site caching policies.
I have built a website in C from scratch and have got to the point where I need a fuzzy search facility. I want to put a job for this on Rent-a-coder, Find-a-guru or Scriptlance, but it occurs to me that I really need to sort out a policy on caching first. There would be little point in trying to do fuzzy searching through built-in MySQL facilities, for example (if that's possible), only to end up with the whole database contents cached in main memory, where it would be far faster to search. (My data will be fairly small for quite some time.)
I believe that caching web pages, semi or fully formed, rather than caching the raw relational data is more usual.
Also, I believe a simple expiry policy is often used rather than directly tracking dependencies between raw data and web pages, but that doesn't sound like an acceptable solution because of the inconsistencies it would leave in the generated web pages.
Does anyone want a discussion about caching policies or the pros/cons of the above three contractor websites?
Very best wishes,
February 11th, 2010, 07:55 AM
Sounds like you need a search engine.
Yep, if there is no direct (definable and consistent) relationship between the database and the web pages, then you need to index the fully formed web pages. But then you have WordPress, which does search the database and provides web links, because there is a direct relationship between the DB and the page. Hence, WordPress's search is live and un-cached.
Yep, how many times have you searched Google and received a result, only to go to the page in question and find it hasn't got what you want? This happens a lot on blog/news sites where old content is pushed onto another page, but Google still lists the page originally indexed.
If your "raw relational data" can be searched and URLs can be derived from the result, then I'd do that.
If not, then a spider is required to build a cached index of your website content. I've coded a number of spiders in the past, but you can use Google's results to get a list of URLs on your site, e.g. Google "site:www.microsoft.com"
The reason why a simple expiry policy is used is because by the time you head check (HTTP HEAD) a page and try to compare that with some cached values, you might as well get the entire HTML and re-index it.
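For what it's worth, the expiry check itself is only a few lines of C. This is a rough sketch, not anyone's actual code: the struct, its field names and the function name are all made up for illustration, and it assumes each cached page records when it was built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical cache entry; the field names are illustrative only. */
struct cached_page {
    const char *html;     /* rendered page body */
    time_t      built_at; /* when the page was generated */
    double      ttl_secs; /* lifetime before the page counts as stale */
};

/* Returns true once the entry has reached or passed its time-to-live. */
bool page_expired(const struct cached_page *p, time_t now)
{
    assert(p != NULL);
    return difftime(now, p->built_at) >= p->ttl_secs;
}
```

A request handler would call page_expired() with time(NULL) and re-fetch or rebuild the page when it returns true. Nothing here knows which data changed, which is exactly why the pages can be stale until the timer runs out.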
So, the question is, can you provide a live spider search? Um, depends on the speed of your server and the amount of traffic you are planning to have I guess.
Those are just my views. Other people would have different views based on different experiences.
February 11th, 2010, 11:01 AM
Many thanks for your thoughts.
The search I need is not to search the text of existing web pages but simply to search the product names / descriptions and artists names from my database.
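To show the sort of thing I mean by fuzzy searching when the names are held in memory, here is a minimal sketch using Levenshtein edit distance. It is only one possible approach, not a decision: the function names, the MAX_NAME limit and the max_edits threshold are assumptions for illustration, and the comparison is case-sensitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME 128 /* assumed maximum length of a name being compared */

static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Levenshtein edit distance between a and b, using a single DP row. */
size_t edit_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t row[MAX_NAME + 1];

    assert(lb <= MAX_NAME);
    for (size_t j = 0; j <= lb; j++)
        row[j] = j;                 /* distance from "" to b[0..j) */

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];       /* D[i-1][j-1] for j == 1 */
        row[0] = i;                 /* distance from a[0..i) to "" */
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j];     /* D[i-1][j] before it is overwritten */
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            row[j] = min3(up + 1, row[j - 1] + 1, diag + cost);
            diag = up;
        }
    }
    return row[lb];
}

/* True when query is within max_edits single-character edits of name. */
bool fuzzy_match(const char *name, const char *query, size_t max_edits)
{
    return edit_distance(name, query) <= max_edits;
}
```

Looping fuzzy_match() over every product and artist name is a linear scan, which should be acceptable while the data is small; it compares whole names only, so searching within descriptions would need something more.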
I guess there are a few commercial / open source apps available that might implement this directly on my database if I decided not to use caching.
Any suggestions here would be much appreciated.
I could build a cached page for each category / sub-category page, each artist page and each product page, but I would then need to build cache coherency logic so that when the relational data changes, the corresponding cached page(s) are invalidated or rebuilt.
Will MySQL's string matching function do an acceptable job of searching the database if I decided not to use caching?
If I do use caching, and assume that at some point not all the data will fit in main memory, then each entry in the database would carry an indication of which cached web page(s) must be invalidated when that data item (product or artist name) is altered or removed.
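Roughly, the bookkeeping I have in mind looks like the sketch below. All the names and the fixed limits (MAX_PAGES, MAX_DEPS) are placeholders I have invented for illustration; in practice the page list would presumably live alongside each row in the database rather than in a C array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAGES 64 /* assumed upper bound on the number of cached pages */
#define MAX_DEPS  8  /* assumed max number of pages one data item appears on */

/* One slot per cached page; only the validity flag matters here. */
struct page_slot {
    bool valid;
};

/* A product or artist record plus the IDs of the pages built from it. */
struct data_item {
    int    page_ids[MAX_DEPS];
    size_t n_pages;
};

struct page_slot page_cache[MAX_PAGES];

/* Record that page_id was generated using this data item. */
void item_add_dependency(struct data_item *it, int page_id)
{
    assert(it != NULL && it->n_pages < MAX_DEPS);
    assert(page_id >= 0 && page_id < MAX_PAGES);
    it->page_ids[it->n_pages++] = page_id;
}

/* Call when the item is altered or removed: marks every dependent page
 * invalid and returns how many pages were actually invalidated. */
size_t item_changed(const struct data_item *it)
{
    size_t dropped = 0;

    assert(it != NULL);
    for (size_t i = 0; i < it->n_pages; i++) {
        struct page_slot *slot = &page_cache[it->page_ids[i]];
        if (slot->valid) {
            slot->valid = false;
            dropped++;
        }
    }
    return dropped;
}
```

An invalidated page would then be rebuilt on the next request for it, so unaffected pages stay cached.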
Any thoughts would be much appreciated.
Very best wishes,
February 11th, 2010, 04:25 PM
There isn't a C or derivative language question here so I have requested this thread be moved to the Software Design forum.
It's always easier for the bidders if they know what it is they are a bidding on and it will increase your odds of finding a good match for the task.
Originally Posted by davidmellor
I no longer wish to be associated with this site.
February 12th, 2010, 07:23 AM
Sorry. I had actually not intended to post the message in this section. It may be better to delete it so I can repost a revised version elsewhere.... if possible.
February 12th, 2010, 07:47 AM
Oh, it's OK, david. jwdonahue runs around telling everyone off.
Basically what jwdonahue is saying is: if you are planning on paying someone, then you need to provide a job spec, or pay someone to help you write one.
Otherwise, you are asking people to contribute help to a private (possibly commercial) code base which you aren't willing to post, in which case you are just here to save money.
Hey, no one can blame you for wanting to save money, but yes, if you can't do it yourself, then cough up the money like everyone else has to.
March 30th, 2010, 08:13 AM
Hi, this is Nicole from Rent a Coder.
As others have suggested, providing a thorough spec is highly recommended. Fortunately, Rentacoder makes creating a proper spec rather easy with its Bid Request Wizard -- an easy, online question and answer type program.
If you have any questions, please let me know. You can also call in to talk to a facilitator 7 days a week, or email us.