Home » Performance » MySQL » Mysql not using index for ORDER BY ?
Mysql not using index for ORDER BY ? [message #1240] Sat, 12 May 2007 05:47 Go to next message
arnoooooo  is currently offline arnoooooo
Messages: 3
Registered: May 2007
Junior Member
Hi all,

I am having trouble with a query on a large (39m rows) table : MySQL won't use the index, for the ORDER BY clause, and creates a temporary table, which takes a lot of disk space and time.

EXPLAIN SELECT * FROM histos_backtest ORDER BY id;

+----+-------------+-----------------+------+---------------+------+---------+------+----------+----------------+
| id | select_type | table           | type | possible_keys | key  | key_len | ref  | rows     | Extra          |
+----+-------------+-----------------+------+---------------+------+---------+------+----------+----------------+
|  1 | SIMPLE      | histos_backtest | ALL  | NULL          | NULL | NULL    | NULL | 39252003 | Using filesort |
+----+-------------+-----------------+------+---------------+------+---------+------+----------+----------------+


SHOW INDEX FROM histos_backtest;
+-----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table           | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| histos_backtest |          0 | PRIMARY  |            1 | id          | A         |    39252003 |     NULL | NULL   |      | BTREE      | NULL    |
| histos_backtest |          1 | id       |            1 | id          | A         |    39252003 |     NULL | NULL   |      | BTREE      | NULL    |
+-----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+


I created an index on the same column as the primary key just to make sure... I use Sybase at work and sometimes, the primary key is not enough. Apparently it is not the case with MySQL.

Any help would be greatly appreciated Smile
Re: Mysql not using index for ORDER BY ? [message #1247 is a reply to message #1240 ] Sun, 13 May 2007 13:41 Go to previous messageGo to next message
carpii  is currently offline carpii
Messages: 46
Registered: November 2006
Member
Whats the schema for this table?

Youre right about mysql by the way, a PK in mysql is a unique constraint and also an index.

In MSSQL (and presumably sybase), a PK is only a constraint, which is usually accompanied by an independant clustered index

Re: Mysql not using index for ORDER BY ? [message #1250 is a reply to message #1247 ] Sun, 13 May 2007 17:15 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
carpii wrote on Sun, 13 May 2007 19:41

In MSSQL (and presumably sybase), a PK is only a constraint, which is usually accompanied by an independant clustered index


Is this something with older versions?

I'm not a mssql or sybase expert but as far as I know (or have read), when you define a column as primary key both is implicitly creating a unique index on it.

And all databases that I mostly have worked with does it (mysql, postgresql, oracle).

This due to:
1. You basically always need a index to get decent performance for the unique constraint.
2. You very often use the primary key for selecting and then you benefit from an index anyway.

Ref:
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybas e.dc34982_1500/html/mig_gde/mig_gde73.htm
Re: Mysql not using index for ORDER BY ? [message #1251 is a reply to message #1250 ] Sun, 13 May 2007 17:29 Go to previous messageGo to next message
carpii  is currently offline carpii
Messages: 46
Registered: November 2006
Member
sterin wrote on Sun, 13 May 2007 17:15


Is this something with older versions?



Not that I know of. My experience is with MSSQL, but since they based it on the Sybase code, and arnoooooo came to the same concllusion, I assumed Sybase was the same.

In MSSQL, if you create a primary key on a table, it will, as you say, automatically add a clustered index on the same fields. That is, if a clustered index does not already exist - you can obviously only have one clustered index.

But this clustered index can be dropped, and you can add a clustered index to something else if you want. This does not affect the enforcement of the Primary Key constraint, although most likely it would affect performance of it.


Re: Mysql not using index for ORDER BY ? [message #1252 is a reply to message #1251 ] Sun, 13 May 2007 17:59 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
OK

As for the initial problem with that mysql is not using an index.

You are using a MyISAM table, right?

Because if you used a InnoDB table then it would use the primary key exclusively since the data is stored as leaves on the primary key in a InnoDB table.

While playing around on my mysql version 5.1 I noticed that it wants to use the filesort as you said.

The only way I got it to use the index is by using a the hint FORCE INDEX():
mysql> explain select * from a order by id;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | a     | ALL  | NULL          | NULL | NULL    | NULL |    4 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

mysql> explain select * from a use index (primary) order by id;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | a     | ALL  | NULL          | NULL | NULL    | NULL |    4 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

mysql> explain select * from a force index (primary) order by id;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
|  1 | SIMPLE      | a     | index | NULL          | PRIMARY | 4       | NULL |    4 |       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
1 row in set (0.00 sec)

Which tells mysql that a table scan is _very_ expensive.
Hope it helps.
Re: Mysql not using index for ORDER BY ? [message #1255 is a reply to message #1252 ] Tue, 15 May 2007 05:24 Go to previous messageGo to next message
arnoooooo  is currently offline arnoooooo
Messages: 3
Registered: May 2007
Junior Member
Thank you for your suggestions.

carpii, here is the schema for the table :

DESCRIBE histos_backtest                                                ;

                                                                        
+-----------+------------+------+-----+-------------------+----------------+    
| Field     | Type       | Null | Key | Default           | Extra          |    
+-----------+------------+------+-----+-------------------+----------------+    
| id        | bigint(20) | NO   | PRI | NULL              | auto_increment |    
| isin      | char(12)   | NO   |     |                   |                |    
| exchange  | char(4)    | NO   |     |                   |                |    
| time      | timestamp  | NO   |     | CURRENT_TIMESTAMP |                |    
| field     | varchar(5) | NO   |     |                   |                |    
| value     | float      | NO   |     |                   |                |    
| validated | tinyint(1) | NO   |     |                   |                |    
+-----------+------------+------+-----+-------------------+----------------+    
7 rows in set (0.15 sec)                                                     


sterin : Yes, I am using a MyISAM table. I tried FORCE INDEX, and apparently, MySQL no longer uses a filesort. It seems to be preparing the query for hours however, until it apparently runs out of ram and starts swapping so much that it renders the whole system unusable ! I let it run for a few hours then decided to shut it down.

Is MySQL trying to cache all the rows before it returns them ? If I try to do a select on a few rows with arbitrary ids, it is very fast. I think I might resort to using a few millions of one row selects ?

[Updated on: Tue, 15 May 2007 05:25]

Re: Mysql not using index for ORDER BY ? [message #1256 is a reply to message #1255 ] Tue, 15 May 2007 05:35 Go to previous messageGo to next message
carpii  is currently offline carpii
Messages: 46
Registered: November 2006
Member
What version of mysql are you using?

Does it perform any better if you add a LIMIT clause?

Also on a completely unrelated note..
Given the size of your table, and the fact its the only variable length field there, I wonder if that varchar might be better off being a char[] field. Then all your rows would be fixed length and makes it a little easier for mysql to locate rows

This wont solve your problem however, just an observation Smile

Re: Mysql not using index for ORDER BY ? [message #1257 is a reply to message #1255 ] Tue, 15 May 2007 06:16 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
1.
Just to make clear: I assume that you actually want all rows returned right?

2.
Yes the default behaviour for the client library is to cache all rows.
But if you for example run the mysql frontend you can use it like this:
mysql --quick [user and database entries here]

the --quick means that it should not cache up the result on the client side.
But instead start by printing it out directly instead.
Re: Mysql not using index for ORDER BY ? [message #1259 is a reply to message #1257 ] Tue, 15 May 2007 08:06 Go to previous messageGo to next message
arnoooooo  is currently offline arnoooooo
Messages: 3
Registered: May 2007
Junior Member
carpii :

- I am using MySQL 5.0.27

- It does perform well with a LIMIT clause, although I have not tried a huge LIMIT clause, in which case I imagine the results would be the same ?

- Using a char field might be a good idea indeed.


sterin :

- Yes, I want all the rows returned. Or at least I will use all the rows in my program, even if I split the query into several smaller queries.

Considering that I will use the rows one by one, is it a mistake to use only one SELECT statement ?

I currently am doing the following to temporarily solve the problem :

I loop on "SELECT * FROM table WHERE id>XX ORDER BY id LIMIT 1"
where XX is initially set to -1 then set to the latest value of id fetched...

- Do you know how to get the equivalent of --quick within a Perl program using DBD::MySQL ?

[EDIT] apparently I can do that with the RowCacheSize hint... assuming DBD::MySQL supports it

[Updated on: Tue, 15 May 2007 08:08]

Re: Mysql not using index for ORDER BY ? [message #1260 is a reply to message #1259 ] Tue, 15 May 2007 08:49 Go to previous message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
When working with large databases learn to not think about using LIMIT think about using WHERE instead to limit the nr of rows returned from a query.
The reason is that LIMIT is evaluate last in the query while WHERE is evaluated first and the sooner that the DBMS can rule out unnecessary rows the better.

In your case if you are running a perl program that retrieves all rows for you and then process them.
I would suggest that you retrieve maybe 10,000 rows (or try some good figure) at a time and if your primary key is normal auto_increment then just use that in a WHERE to limit the nr of rows:
-- first select
SELECT * FROM yourTable WHERE id > 0 and id < 10000;
-- second select
SELECT * FROM yourTable WHERE id > 10000 and id < 20000;
-- etc

But naturally you implement this by using prepared statements to speed it up even further.

And even if you have deleted a lot of rows from the DB then it still doesn't really matter that you get exactly 10,000 rows each time.
Since you are looping thru the table until you reach the COUNT(*) rows in the table you will get all rows in the end anyway.

These individual select will perform a range scan on the index and the time to get them is not that different from selecting them all at once.
You will get a small overhead with each query but at the same time that will be pretty negligable if you select about 10,000 rows at a time.
Previous Topic:Bad Performance or Bad Query ?
Next Topic:MySQL Extremely Slow Starting Up
Goto Forum:
  



Current Time: Wed Jan 7 08:59:36 EST 2009

Total time taken to generate the page: 0.18434 seconds
.:: Contact :: Home :: MySQL Support by Percona.com ::.

Powered by: FUDforum 2.7.5.
Copyright ©2001-2006 FUD Forum Bulletin Board Software

MySQL Performance | Forum authority Badge