Calculating query cache hitrate without Qcache_inserts?
I wonder why the query cache hitrate is calculated this way:
Qcache_hits / (Qcache_hits + Com_select) * 100
What about Qcache_insert?
The first time the cache can not be hit und Com_select is increased. The next time it hits and Qcache_hits is increased. In this example you get
1 / ( 1 + 1 ) * 100 = 50%
and nothing more.
Why don't we bring in Qcache_inserts to check only on queries that have a chance to exist in the cache:
Qcache_hits / (Qcache_hits + Com_select - Qcache_inserts) * 100
Because you want to see the hitrate of how many of the total amount of queries that where served from the cache.
Having to perform a qcache_insert doesn't change that ratio, it only means that the query didn't exist in the cache in the first place.
In your example you had two queries:
Result was not in the Qcache
Com_select is incremented
A Qcache_insert is performed
Same SELECT second time
Result is in the Qcache
Qcache_hits is incremented
That was two queries and only one of them used the Query_cache to return the data, so that gives a hitrate of 50%.
If you had run the query one more time the hirate would have grown to 66.6666%, etc.
thanks for the reply.
Because of frequently table updates or highly dynamic queries the cache is often not able to serve the query. Some SQL statements forbid caching. That's not a sign for a problem with the cache. So of what use is this information? I can only adjust my monitoring thresholds and see if there are significant changes in the future, can't I?
I want to know how usefull caching is at all. That's why I prefer to monitor a hitrate like "Hits / Inserts". How often is an inserted query reused? If that value is lower than e.g. 100% then I can think about turning off the caching or optimizing queries to specify if caching should be used.
One database server I'm working on right now shows about 450%. That's the point where I know that caching does a good job for that purpose.
Actually it kind of is.
Originally Posted by mpiet
Because the overall figure of how many of the queries that are served from the cache shows how useful the cache actually is.
If you have a lot of statements that forbid caching and then you have very few static queries that are served from the cache you could get a very high cache hit ratio with hits/insert but really the cache isn't that effective which you would better see if you look at hits/overall_queries.
But as always it's hard to find just one way to look at statistics to get the full picture.
I understand that performance loss is the drawback of caching. SQL_NO_CACHE is used to prevent a detour through the cache, isn't it? Do these queries really compromise the work of the cache as a low hit rate implies?
You're right about statistics.
Still I find a hits/inserts rate of about 500% a useful sign that the cache helps quite a lot since this is the average value. Some queries might not hit at all but others a lot more often.
Hits/Inserts might not be enough. To dig deeper we'd need the distribution of often and not so often hitting queries. That would make clear if cache_type=2 and SQL_CACHE would make more sense than the over way round.
July 12th, 2013, 04:17 PM
It prevents the result from these "one time" queries to be stored in the cache and force out other cached results that are better to keep in the cache since those queries will turn up again.
Originally Posted by mpiet
And yes the distribution is as always interesting since all performance tuning with databases are dependent on three things:
1. The database design
2. The distribution of data
3. The distribution and the design of the queries
July 12th, 2013, 05:36 PM
So since SQL_NO_CACHE queries don't bother the cache and don't have an impact on performance... Of what use are they in calculating how useful the cache is?
So why is hits/selects prefered instead of hits/inserts?
July 13th, 2013, 05:34 PM
Because of this potential scenario (extreme yes but I'm trying to prove a point ).
99% of the queries against a DB server are SQL_NO_CACHE which means that you are only using the cache for 1% of the queries.
For these 1% you might have a hit/insert of 500% and you might think the cache is really useful on this machine.
But if you calculate hits/selects instead where 1% with a 500% hit rate would be for example 6 queries, 1 insert and 5 served from cache. Your total amount of queries would amass to 100% times 6 = 600 queries.
So the hits/select would equal 5 queries out of 600 = 0,833% where you would probably consider shutting off the cache.
Hits/selects is just better at showing the complete picture instead of only the subset of queries that are sent to the cache.
The SQL_NO_CACHE queries themselves will probably not have any direct performance impact, but the query cache and the amount of RAM it consumes usually does. Since that RAM could be much better used as index/data cache instead of speeding up 0.833% of the queries.
July 17th, 2013, 03:37 AM
It is indeed useful für those 0,833%. A cache suitable for that amount of queries would not consume that much RAM or in consequence not make us poor.
But you're right: In this case something has gone totally wrong. I'd not call it extreme, because it might be an authentic so-called historical development: Oversized cache because it was maybe even increased at the beginning and instead of thinking about the whole picture more and more SQL_NO_CACHE statements were added.
In this case I'd still keep caching enabled if it gives a good hits/inserts rate. But one could have analysed the structure before and used query_cache_type=2 and SQL_CACHE.