Page 1 of 2 12 Last
  • Jump to page:
    #1
  1. A Change of Season
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2004
    Location
    Next Door
    Posts
    2,689
    Rep Power
    171

    Is it possiblt to avoid full table scan?


    Hello;

    This scenario comes up a lot. In this simple example , how can I avoid full table scan (get rid of the type All) on Country table so it doesn't examine all rows?
    Code:
     Explain
    
    SELECT *
    FROM   city
           INNER JOIN country
                   ON city.countrycode = country.code
    Same with this query, still does the full table scan on all rows in city table
    Code:
    Explain
    
    SELECT city.name
    FROM   city
           INNER JOIN country
                   ON city.countrycode = country.code
    WHERE  country.code = 'usa'
           AND city.countrycode = 'usa'
    Code:
    CREATE TABLE `City` (
     `ID` int(11) NOT NULL AUTO_INCREMENT,
     `Name` char(35) NOT NULL DEFAULT '',
     `CountryCode` char(3) NOT NULL DEFAULT '',
     `District` char(20) NOT NULL DEFAULT '',
     `Population` int(11) NOT NULL DEFAULT '0',
     PRIMARY KEY (`ID`),
     KEY `CountryCode` (`CountryCode`),
     KEY `CountryCode_index` (`CountryCode`),
     CONSTRAINT `city_ibfk_1` FOREIGN KEY (`CountryCode`) REFERENCES `country` (`Code`)
    ) ENGINE=InnoDB AUTO_INCREMENT=4080 DEFAULT CHARSET=latin1
    Code:
    CREATE TABLE `country` (
     `Code` char(3) NOT NULL DEFAULT '',
     `Name` char(52) NOT NULL DEFAULT '',
     `Continent` enum('Asia','Europe','North America','Africa','Oceania','Antarctica','South America') NOT NULL DEFAULT 'Asia',
     `Region` char(26) NOT NULL DEFAULT '',
     `SurfaceArea` float(10,2) NOT NULL DEFAULT '0.00',
     `IndepYear` smallint(6) DEFAULT NULL,
     `Population` int(11) NOT NULL DEFAULT '0',
     `LifeExpectancy` float(3,1) DEFAULT NULL,
     `GNP` float(10,2) DEFAULT NULL,
     `GNPOld` float(10,2) DEFAULT NULL,
     `LocalName` char(45) NOT NULL DEFAULT '',
     `GovernmentForm` char(45) NOT NULL DEFAULT '',
     `HeadOfState` char(60) DEFAULT NULL,
     `Capital` int(11) DEFAULT NULL,
     `Code2` char(2) NOT NULL DEFAULT '',
     PRIMARY KEY (`Code`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    Thanks
    Last edited by zxcvbnm; April 29th, 2013 at 08:56 PM.
  2. #2
  3. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,280
    Rep Power
    4279
    just to be clear -- a table scan is usually a good thing if you intend to return all the rows of the table

    as for your query for cities in usa, how many rows are in the table, and how many of them are usa?

    by the way, these --
    Code:
    KEY `CountryCode` (`CountryCode`),
     KEY `CountryCode_index` (`CountryCode`)
    are redundant with each other, drop one of them
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  4. #3
  5. A Change of Season
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2004
    Location
    Next Door
    Posts
    2,689
    Rep Power
    171
    Hi;
    Originally Posted by r937
    just to be clear -- a table scan is usually a good thing if you intend to return all the rows of the table
    I can't understand why type is ALL for country table and there are 232 rows ! Is it really not possible to use an index for country and see 1 row (usa) instead of 232?
    Originally Posted by r937
    As for your query for cities in usa, how many rows are in the table, and how many of them are usa?
    ~ 240 countries and ~ 4000 cities (out of them 274 are in usa).
    I dont want to tire you but if you have time please see post #3 here
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Oct 2003
    Location
    Germany
    Posts
    2,788
    Rep Power
    349
    Originally Posted by zxcvbnm
    I can't understand why type is ALL for country table and there are 232 rows
    232 rows is absolutely nothing. Given the size of the table that will be stored in 2 maybe 3 database blocks (assuming an average row length of 50 and a block size of 8192)

    So a "full table scan" means reading 3 blocks from the disk, which is 3 IO operations.

    An index lookup followed by the retrieval of the row will be approx. 3 IO operations as well (2 for the index lookup, one to get the row from the table). So you gain nothing through the index lookup.

    You might be interested in this website which really does a good job of explaining how indexing works in a DBMS:
    http://use-the-index-luke.com/sql/anatomy
    I will not read nor answer questions where the SQL code is messy and not formatted properly using [code] tags.
    http://forums.devshed.com/misc.php?do=bbcode#code

    Tips on how to ask better questions:
    http://tkyte.blogspot.de/2005/06/how-to-ask-questions.html
    http://wiki.postgresql.org/wiki/SlowQueryQuestions
    http://catb.org/esr/faqs/smart-questions.html
  8. #5
  9. A Change of Season
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2004
    Location
    Next Door
    Posts
    2,689
    Rep Power
    171
    Originally Posted by shammat
    232 rows is absolutely nothing.
    Yes I understad it's a small table

    The point is it goes through all rows. Now I don't want that to happen when I am dealing with a table with 100 million rows.
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,485
    Rep Power
    1752
    The simple answer is to write well-formed queries that run against a well-designed database.
    Ideally you'd get the database to only return as many rows of data, containing as many columns of information as you actually want or need; avoiding the dread "select *" as much as you can.
    Sometimes a table scan is the 'only way to go' - either with a small table where any cost benefits of an index are out-weighed by the size of the data, or the index data is 'inbalanced' with very few different values, or you just need to return or examine every row of data!

    To reduce the amount of data returned you would filter the data, or rather you'd get the database engine to do so. You do that by supplying effective ON clauses for JOINs and appropriate WHERE and HAVING clauses for the (sub-)queries.
    Last edited by SimonJM; April 30th, 2013 at 07:24 AM.
    The moon on the one hand, the dawn on the other:
    The moon is my sister, the dawn is my brother.
    The moon on my left and the dawn on my right.
    My brother, good morning: my sister, good night.
    -- Hilaire Belloc
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Oct 2003
    Location
    Germany
    Posts
    2,788
    Rep Power
    349
    Originally Posted by zxcvbnm
    The point is it goes through all rows. Now I don't want that to happen when I am dealing with a table with 100 million rows.
    If there are 100 million rows, MySQL will (most probably) use the index. The execution plan depends heavily on the actual data (this is called a "cost based optimizer"). It is not based on simply examining the SQL statement (which would be a "rule based optimizer")
    I will not read nor answer questions where the SQL code is messy and not formatted properly using [code] tags.
    http://forums.devshed.com/misc.php?do=bbcode#code

    Tips on how to ask better questions:
    http://tkyte.blogspot.de/2005/06/how-to-ask-questions.html
    http://wiki.postgresql.org/wiki/SlowQueryQuestions
    http://catb.org/esr/faqs/smart-questions.html
  14. #8
  15. A Change of Season
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2004
    Location
    Next Door
    Posts
    2,689
    Rep Power
    171
    Thanks fellows. Maybe an example will make the discussion more clear. For example for either of these queries below, they are not "poorly written" or lacking index. But still slow!

    This takes 5 seconds, type index rows 5522015:
    Code:
    SELECT Count(*)
    FROM   producst_table;
    This takes 17 seconds:
    Code:
    SELECT 1
    FROM   producst_table;
    Thank you
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    120
    Rep Power
    10
    For big table look on partitioning if you set partition when create a table it will speed up your performance
  18. #10
  19. A Change of Season
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2004
    Location
    Next Door
    Posts
    2,689
    Rep Power
    171
    Originally Posted by gk53
    For big table look on partitioning if you set partition when create a table it will speed up your performance
    So this is nothing to be improved with indexes?

    For the record:
    Code:
    CREATE TABLE `products_table` (
     `id` int(11) NOT NULL AUTO_INCREMENT,
     `code` varchar(4) NOT NULL,
     `class` varchar(11) NOT NULL,
     `category` varchar(5) NOT NULL,
     `price` decimal(6,2) NOT NULL,
     `production_date` date NOT NULL,
     PRIMARY KEY (`id`),
     KEY `production_date` (`production_date`),
     KEY `index_price` (`price`)
    ) ENGINE=InnoDB AUTO_INCREMENT=5521508 DEFAULT CHARSET=latin1
  20. #11
  21. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    120
    Rep Power
    10
    in your examples

    SELECT Count(*)
    FROM producst_table;
    does not use any indexes it is just getting number of records from system table...

    and

    SELECT 1
    FROM producst_table;

    not using too...

    try to change to

    SELECT 1
    FROM producst_table
    order by id;

    and it shold work faster (when you add order by id you actually tell server use id index...)
  22. #12
  23. A Change of Season
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2004
    Location
    Next Door
    Posts
    2,689
    Rep Power
    171
    Originally Posted by gk53
    try to change to

    SELECT 1
    FROM producst_table
    order by id;

    and it shold work faster (when you add order by id you actually tell server use id index...)
    Didnt make any difference
  24. #13
  25. No Profile Picture
    Contributing User
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Oct 2003
    Location
    Germany
    Posts
    2,788
    Rep Power
    349
    Originally Posted by zxcvbnm
    Thanks fellows. Maybe an example will make the discussion more clear. For example for either of these queries below, they are not "poorly written" or lacking index. But still slow!

    This takes 5 seconds, type index rows 5522015:
    Code:
    SELECT Count(*)
    FROM   producst_table;
    This takes 17 seconds:
    Code:
    SELECT 1
    FROM   producst_table;
    Thank you
    Neither of those queries can use an index. You are requesting the database to look at every single row because there is no where clause.
    I will not read nor answer questions where the SQL code is messy and not formatted properly using [code] tags.
    http://forums.devshed.com/misc.php?do=bbcode#code

    Tips on how to ask better questions:
    http://tkyte.blogspot.de/2005/06/how-to-ask-questions.html
    http://wiki.postgresql.org/wiki/SlowQueryQuestions
    http://catb.org/esr/faqs/smart-questions.html
  26. #14
  27. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,280
    Rep Power
    4279
    Originally Posted by gk53
    in your examples

    SELECT Count(*)
    FROM producst_table;

    does not use any indexes it is just getting number of records from system table...
    actually, no

    it is the other way around, because this is an innodb table

    only mysam tables maintain a row count with the tab le

    he said it read 5+ million index rows, and took 5 seconds to do so -- this is typical behaviour for a COUNT(*) query on an innodb table
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
  28. #15
  29. SQL Consultant
    Devshed Supreme Being (6500+ posts)

    Join Date
    Feb 2003
    Location
    Toronto Canada
    Posts
    27,280
    Rep Power
    4279
    Originally Posted by shammat
    Neither of those queries can use an index.
    the first one most definitely can

    in fact he said it did !!

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
Page 1 of 2 12 Last
  • Jump to page:

IMN logo majestic logo threadwatch logo seochat tools logo