|
|
|||||||||
|
|||||||||
| |||||||||
|
|
|
| |||||||||
![]() |
|
|
«
Previous Thread
|
Next Thread
»
|
Thread Tools | Search this Thread | Rate Thread | Display Modes |
|
#1
|
|||
|
|||
|
Here is my source query:
Code:
select
"book_id"
from
"bookgenres",
"genre_children"
WHERE
"bookgenres"."genre_id" = "genre_children"."genre_child_id" AND
"genre_children"."genre_id" = 1345
The following used the indexes on genre_id for both tables and returns me 16 book_id in a few miliseconds. However, I need the actual books with these IDs so I do: Code:
SELECT * from "books" where "books"."book_id" IN
(select
"book_id"
from
"bookgenres",
"genre_children"
WHERE
"bookgenres"."genre_id" = "genre_children"."genre_child_id" AND
"genre_children"."genre_id" = 1345)
So I expected it to use the books.book_id index against each of the 16 results. Strangely Firebird insists on doing here a fulltable scan on books. It is even weirder since if I do it like that: Code:
SELECT * from "books" where "books"."book_id" IN (66,36,87,27,76......) Thus manually write these 16 ids in, it does it quickly it uses the index as it should. Below are the PLANs for all these queries:\ First query - Fetches the 16 book ids: Code:
Query Time ------------------------------------------------ Prepare : 0 Execute : 16 Plan: ------------------------------------------------ PLAN JOIN (genre_children INDEX (IDX_genre_children),bookgenres INDEX (IDX_bookgenres1)) Enchanced Info: +--------------------------+-------+-----------+---------+---------+----------+ | Table Name | Index | Non-Index | Updated | Deleted | Inserted | | | reads | reads | | | | +--------------------------+-------+-----------+---------+---------+----------+ | bookgenres| 16 | 0 | 0 | 0 | 0 | | genre_children| 1 | 0 | 0 | 0 | 0 | +--------------------------+-------+-----------+---------+---------+----------+ Second query - Fetches the book themselves using the first query as a subquery: Code:
Query Time ------------------------------------------------ Prepare : 0 Execute : 1,703 Plan: ------------------------------------------------ PLAN JOIN (bookgenres INDEX (IDX_bookgenres),genre_children INDEX (IDX_genre_children1,IDX_genre_children)) PLAN (books NATURAL) Enchanced Info: +--------------------------+-------+-----------+---------+---------+----------+ | Table Name | Index | Non-Index | Updated | Deleted | Inserted | | | reads | reads | | | | +--------------------------+-------+-----------+---------+---------+----------+ | RDB$FIELDS| 4 | 0 | 0 | 0 | 0 | | RDB$INDEX_SEGMENTS| 1 | 0 | 0 | 0 | 0 | | RDB$RELATION_FIELDS| 4 | 0 | 0 | 0 | 0 | | RDB$RELATIONS| 2 | 0 | 0 | 0 | 0 | | RDB$RELATION_CONSTRAINTS| 1 | 0 | 0 | 0 | 0 | | bookgenres| 45,04 | 0 | 0 | 0 | 0 | | genre_children| 16 | 0 | 0 | 0 | 0 | | books| 0 | 18,054 | 0 | 0 | 0 | +--------------------------+-------+-----------+---------+---------+----------+ Finally, this is the query that fetches from books by writing the ids numbers manually: Code:
Query Time ------------------------------------------------ Prepare : 31 Execute : 32 Plan: ------------------------------------------------ PLAN (books INDEX (PK_books,PK_books,PK_books,PK_books,PK_books,PK_books,PK_books,PK_books,PK_books,PK_books,PK_books, PK_books,PK_books,PK_books,PK_books,PK_books)) Enchanced Info: +--------------------------+-------+-----------+---------+---------+----------+ | Table Name | Index | Non-Index | Updated | Deleted | Inserted | | | reads | reads | | | | +--------------------------+-------+-----------+---------+---------+----------+ | RDB$FIELDS| 4 | 0 | 0 | 0 | 0 | | RDB$INDEX_SEGMENTS| 1 | 0 | 0 | 0 | 0 | | RDB$RELATION_FIELDS| 4 | 0 | 0 | 0 | 0 | | RDB$RELATIONS| 2 | 0 | 0 | 0 | 0 | | RDB$RELATION_CONSTRAINTS| 1 | 0 | 0 | 0 | 0 | | books| 16 | 0 | 0 | 0 | 0 | +--------------------------+-------+-----------+---------+---------+----------+
__________________
VitalyB ![]()
|
|
#2
|
||||
|
||||
|
Does a query like the following give you the results you want? Does it give you better performance? I had no way of testing this.
SELECT books.* FROM books INNER JOIN bookgenres ON books.book_id = bookgenres.book_id INNER JOIN genre_children ON bookgenres.genre_id = genre_children.genre_child_id WHERE genre_children.genre_id = 1345 |
|
#3
|
|||
|
|||
|
Not really
, it took 27s, I need it to take under a second.Here's the deal. There are 3 tables I need to join. books (2000 rows) bookgenres (20000 rows) genre_children (6500 rows) I need to join them all (all joined fields are indexed), order the result on a field appearing in books (vote_avg) and then show the book_title and the genre. Problem is that it takes WAY too long whatever way I try (e.g my first post). I tried to fight with the Firebird plan/performance-graph but just couldn't find something that gives reasonable results . |
|
#4
|
||||
|
||||
|
If you would like to post your table definitions, I can try to reproduce your problem. I created some test tables based on your queries and used Database Workbench to load them with more records than you currently have. I gave the books a random vote average between 0 and 100 and sorted by this value.
Whenever I would run the query I posted for you earlier, I would always get sub-second response times. Depending on whether I selected for an individual book_id or for a range of book_id's, the query would come back in 0.070~0.340 seconds. This is running on a laptop. When I ran your original query with the subselect, I also got sub-second response times. What version of Firebird are you running? Is this the only query that is causing you problems? 27 seconds seems a little long for a three table join of less than 30,000 records. But I can't reproduce your problem. |
|
#5
|
|||
|
|||
|
Sorry it took me so long to reply, I never got the notification for some strange reason
.I ran again the subquery version, it takes on my computer about 0.4 seconds, which is quick, but not satisfactory as I don't know why it takes so long (thus why it doesn't use the books table index). This is annoying because my next step in the query is to sort it like this: Code:
SELECT FIRST 10 * from "bv_books" where "bv_books"."book_id" IN
(select distinct
"book_id"
from
"bv_bookgenres",
"bv_genre_children"
WHERE
"bv_bookgenres"."genre_id" = "bv_genre_children"."genre_child_id" AND
"bv_genre_children"."genre_id" = 1345)
order by "bv_books"."vote_avg" desc
This now takes 0.8 seconds which is REALLY weird for the 16 book_ids that are returned in the subquery. Last thing before I write the structure dump. If you want to, I can just send you the (some what censored) data dump of the following tables. I'll just need your ICQ/MSN/Y!/AIM/EMAIL for that, whatever is comfortable to you. Here's the table structure: Code:
/* Table: bv_bookgenres */
CREATE TABLE "bv_bookgenres" (
"book_id" INTEGER NOT NULL,
"genre_id" INTEGER NOT NULL);
/* Indices definition */
CREATE INDEX "IDX_bv_bookgenres" ON "bv_bookgenres" ("book_id");
CREATE INDEX "IDX_bv_bookgenres1" ON "bv_bookgenres" ("genre_id");
CREATE INDEX "IDX_bv_bookgenres2" ON "bv_bookgenres" ("book_id", "genre_id");
/* Table: bv_books */
CREATE TABLE "bv_books" (
"book_id" INTEGER NOT NULL,
"book_title" VARCHAR (500) CHARACTER SET ASCII COLLATE ASCII,
"vote_avg" INTEGER,
"vote_count" INTEGER);
/* Primary keys definition */
ALTER TABLE "bv_books" ADD CONSTRAINT "PK_bv_books" PRIMARY KEY ("book_id");
/* Indices definition */
CREATE INDEX "IDX_bv_books" ON "bv_books" ("vote_avg");
CREATE UNIQUE INDEX "PK_bv_books" ON "bv_books" ("book_id");
/* Table: bv_genre_children */
CREATE TABLE "bv_genre_children" (
"genre_id" INTEGER NOT NULL,
"genre_child_id" INTEGER NOT NULL);
/* Indices definition */
CREATE INDEX "IDX_bv_genre_children" ON "bv_genre_children" ("genre_id");
CREATE INDEX "IDX_bv_genre_children1" ON "bv_genre_children" ("genre_child_id");
|
|
#6
|
||||
|
||||
|
I did three things that helped the speed of your query:
First, I made the field vote_avg not null. On some databases, the optimizer will pick a less than optimal plan if it thinks it might be dealing with NULL data. Unless a field has to contain nulls, I always declare it not null and give it a default value. Second, I changed an index to be descending and added the book_id so the data in the index would be in the exact order the query wanted it to be in: CREATE DESC INDEX "IDX_bv_books" ON "bv_books" ("book_id", "vote_avg"); Third, I retuned the indexes by rebuilding them and recalculating their selectivity. You can read about this in the Operations Guide under Database and Server Performance -> Database Tuning Tasks -> Tuning Indexes. Basically you do something like this: ALTER INDEX name INACTIVE; ALTER INDEX name ACTIVE; SET STATISTICS INDEX name; I don't know if you need to do all these steps. I was trying several things to see what would work. The query ran about 4X faster for me after doing this. Does it help you too? |
|
#7
|
|||
|
|||
|
Just a note --
Don't create additional indices on PK or FK columns. Creating a PK or FK automatically creates indices for those columns.
__________________
Martijn Tonies Database Workbench: developer IDE for Firebird, MySQL, InterBase, MSSQL Server and Oracle Upscene Productions http://www.upscene.com |
|
#8
|
|||
|
|||
|
How do you get the "Enchanced ( Enhanced ) Info" for queries
How do you get the "Enchanced Info" ( Enhanced? ) for queries that displays indexed vs. non-indexed reads? Also, are there any good references that you would suggest that provide a methodology for creating new indices? We're trying to optimize our queries without creating too many indices, but are having a hard time knowing what indices should be created.
Thanks. |
![]() |
| Viewing: Dev Shed Forums > Databases > Firebird SQL Development > Firebird misuses indexes? |
| Thread Tools | Search this Thread |
| Display Modes | Rate This Thread |
|
|
|
|