Home » Performance » MySQL » 13GB DB - Performance Question
icon2.gif  13GB DB - Performance Question [message #1295] Wed, 23 May 2007 16:22 Go to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

I have a 13 GB MYSQL table with the following definition:

CREATE TABLE `WikiParagraphs` (
  `ID` int(10) unsigned NOT NULL auto_increment,
  `Paragraph` text NOT NULL,
  PRIMARY KEY  (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=73035281 ;


I query it like this:

select Paragraph from WikiParagraphs where ID in (1,2,3,4)


the 1,2,3,4 bit comes from the Sphinx FullText engine that gives me the IDs I need that match my query within about 500 milliseconds.

But retrieving the data itself takes approximately 3-6 seconds.

Obviously, I'd like to speed this query up.

Any ideas?


G-Man
Re: 13GB DB - Performance Question [message #1296 is a reply to message #1295 ] Wed, 23 May 2007 16:54 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
How much RAM memory do you have on the server and how much have you reserved for the InnoDB cache (innodb_buffer_size)?

Because this is very important to get speed with innodb.


Apart from that I have a suggestion:
select Paragraph from WikiParagraphs where id = 1
UNION
select Paragraph from WikiParagraphs where id = 4
UNION
select Paragraph from WikiParagraphs where id = 6



It will give const searches for each select instead of a range scan which might go faster if your smallest and biggest value in the IN() part is far apart and you don't want to retrieve that many rows.

mysql> explain select Paragraph from WikiParagraphs where id = 3
    -> UNION
    -> select Paragraph from WikiParagraphs where id = 4
    -> UNION
    -> select Paragraph from WikiParagraphs where id = 6
    -> ;
+----+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type  | table          | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+
|  1 | PRIMARY      | WikiParagraphs | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
|  2 | UNION        | WikiParagraphs | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
|  3 | UNION        | WikiParagraphs | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |
|NULL | UNION RESULT | <union1,2,3>   | ALL   | NULL          | NULL    | NULL    | NULL  | NULL |       |
+----+--------------+----------------+-------+---------------+---------+---------+-------+------+-------+
4 rows in set (0.00 sec)

Re: 13GB DB - Performance Question [message #1297 is a reply to message #1295 ] Wed, 23 May 2007 16:57 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

Well, I'm getting 1000 results back from Sphinx and I'm using those to get about 55k of data that I then massage.

I probably don't need all 1000 results but I need at least 55k of data returned.

I tried doing single queries and I also tried OR = stuff but neither of those sped it up (slowed it down a bit).

How do I tell how much memory INNODB is using for this?

I should note that I also have MYISAM tables on the same server.


G-Man
Re: 13GB DB - Performance Question [message #1298 is a reply to message #1295 ] Wed, 23 May 2007 16:58 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

Ooh. wait. Just noticed that UNION there.

That's rather interesting. I would have never thought of doing that.

I'll give that a shot and see how it performs.


G-Man
Re: 13GB DB - Performance Question [message #1299 is a reply to message #1295 ] Wed, 23 May 2007 17:03 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
1000 rows was a bit more than I thought but I hope it works for you. Smile

To get the info about server settings you can use this query:
show variables like 'innodb%';

Re: 13GB DB - Performance Question [message #1300 is a reply to message #1295 ] Wed, 23 May 2007 17:06 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

Well, if it's faster, I could break it down into chunks of 50 or 100 until I have enough data I guess...

+---------------------------------+-------------------------------+
| Variable_name                   | Value                         |
+---------------------------------+-------------------------------+
| have_innodb                     | YES                           |
| innodb_additional_mem_pool_size | 12582912                      |
| innodb_autoextend_increment     | 8                             |
| innodb_buffer_pool_awe_mem_mb   | 0                             |
| innodb_buffer_pool_size         | 134217728                     |
| innodb_data_file_path           | innodb/ibdata1:10M:autoextend |
| innodb_data_home_dir            |                               |
| innodb_fast_shutdown            | ON                            |
| innodb_file_io_threads          | 4                             |
| innodb_file_per_table           | OFF                           |
| innodb_flush_log_at_trx_commit  | 1                             |
| innodb_flush_method             |                               |
| innodb_force_recovery           | 0                             |
| innodb_lock_wait_timeout        | 30                            |
| innodb_locks_unsafe_for_binlog  | OFF                           |
| innodb_log_arch_dir             |                               |
| innodb_log_archive              | OFF                           |
| innodb_log_buffer_size          | 8388608                       |
| innodb_log_file_size            | 33554432                      |
| innodb_log_files_in_group       | 2                             |
| innodb_log_group_home_dir       | ./                            |
| innodb_max_dirty_pages_pct      | 90                            |
| innodb_max_purge_lag            | 0                             |
| innodb_mirrored_log_groups      | 1                             |
| innodb_open_files               | 300                           |
| innodb_table_locks              | ON                            |
| innodb_thread_concurrency       | 8                             |
+---------------------------------+-------------------------------+


The server has 4GB of memory.

[Updated on: Wed, 23 May 2007 17:10]


G-Man
Re: 13GB DB - Performance Question [message #1301 is a reply to message #1295 ] Wed, 23 May 2007 17:38 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
It's not faster if you break it into 50 or 100.
The only advantage is that you could break when you have got all data instead of reading in all 1000 rows.

But you are only using 134MB for InnoDB_buffer_size.

How much memory do you have available on the server?

And are you using the server for something else like webserver or something?

Because I would suggest that you increase the innodb_buffer_pool_size to 80% of available memory.
Re: 13GB DB - Performance Question [message #1302 is a reply to message #1301 ] Wed, 23 May 2007 17:40 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

Hmmm...

This server has 4GB of memory. I'm using as the MYISAM, INNODB and web server all in one - LOL. Too much for one server to do?

G-Man


G-Man
Re: 13GB DB - Performance Question [message #1303 is a reply to message #1295 ] Wed, 23 May 2007 17:46 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
Do you have a reason for using both MyISAM and InnoDB?

That is, are you using both FULLTEXT and transactions?
Or have you run into the update/select performance problem with MyISAM?

Otherwise I would suggest that you choose to use only one storage engine and tune the server for that one.
Re: 13GB DB - Performance Question [message #1304 is a reply to message #1303 ] Wed, 23 May 2007 17:50 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

I'm actually not using transactions at all right now nor do I expect to be with this data.

The only reason I added innodb was because figured it would help to not lock the entire table when creating these 13-25gb tables. Now that they're created the point is kinda moot but converting it back to myisam (reimporting the data) is likely to take a week or more unless you know a quick way to do it?

Fulltext is done via the sphinx engine so I'm not using that part of myisam either.


G-Man
Re: 13GB DB - Performance Question [message #1306 is a reply to message #1295 ] Thu, 24 May 2007 03:11 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
Are you performing a lot of updates against the table?
Or is it basically only selects?
I'm guessing selects since it looks like a wiki but you never know. Smile

This is the quickest way (but you don't have any control over it):
ALTER TABLE yourTable ENGINE=MyISAM;

The only question is how long your server will take to convert it.

So that's why I suggest that you create a smaller table with maybe 500MB data and test on that one instead so you can get a feel for how long it takes.

And depending on the time estimate from your test it could be better to perform the move yourself:
1.
Create a myisam copy of your table. _Without_ indexes. Those will be added later.
2.
INSERT INTO yourMyISAMCopy SELECT * FROM yourInnoDBoriginal
3.
Create the indexes. And make sure that you give mysql a _lot_ of memory for this step. Because create index can work in two ways and one is _very_ slow when it as to work against disk all he time.

Step 2 could be split into portions by
... WHERE id BETWEEN 1001 AND 2000.
... WHERE id BETWEEN 2001 AND 3000.

Or some appropriate value for how many you want to move at a time if you can't take all in one go.

Re: 13GB DB - Performance Question [message #1307 is a reply to message #1295 ] Thu, 24 May 2007 06:25 Go to previous messageGo to next message
bluesaga  is currently offline bluesaga
Messages: 20
Registered: December 2006
Junior Member
I really wouldnt recommend the alter table, talking from experiance (50 gig table) it will be slow, and crummy!

I do queries similar to:
select Paragraph from WikiParagraphs where ID in (1,2,3,4, *1000)

and im looking at subsecond queries!

I think you should improve the buffers your applying to these tables sufficiently!

One thing that will help you, is do all the matching with sphinx, do any phpside ordering of the data and then only pull the data you actually need! This should help greatly, in terms of speed at least.
Re: 13GB DB - Performance Question [message #1308 is a reply to message #1295 ] Thu, 24 May 2007 13:00 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

Well, since it's a static table and my other tables are in myisam I've decided to move over to myisam.

The problem that I'm having tho is when I try this query:

insert into WikiParagraphs2 select * from WikiParagraphs


It drops about 40 million rows...

I tried a mysql dump but there are odd chars that make it barf.

Any ideas?


G-Man
Re: 13GB DB - Performance Question [message #1309 is a reply to message #1295 ] Thu, 24 May 2007 13:25 Go to previous messageGo to next message
bluesaga  is currently offline bluesaga
Messages: 20
Registered: December 2006
Junior Member
You could try forcing the charsets?

Are both tables in the same character set anyway?

Oh wait, are you SURE it dropped the rows? innoDB does not keep a record of the actual amount of rows in the table, so it may be that?

The only way (i can think of) is to do something similar to:

SELECT count(*) FROM innoDBtable

But that may take a long time, you could try two seperate count()'s from a boolean field (do a search for true and one for false) may be quicker.

[Updated on: Thu, 24 May 2007 13:28]

Re: 13GB DB - Performance Question [message #1310 is a reply to message #1295 ] Thu, 24 May 2007 13:31 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

When I saw the number of rows not matching up, I did a count(*).

It took about 30 minutes to do that.

I've tried keeping the charsets the same as well as changing charsets.

Neither option worked.

I suppose I could write a php script to do this by hand but good lord that'll be slowwwwwww Razz


G-Man
Re: 13GB DB - Performance Question [message #1311 is a reply to message #1295 ] Thu, 24 May 2007 13:54 Go to previous messageGo to next message
bluesaga  is currently offline bluesaga
Messages: 20
Registered: December 2006
Junior Member
Honestly i cant see why it didnt work. It should of from what i can tell its a pretty simple query.

Some of the more proficient mysql dba's here may be able to assist you more thouhg Smile
Re: 13GB DB - Performance Question [message #1312 is a reply to message #1295 ] Thu, 24 May 2007 13:56 Go to previous messageGo to next message
bluesaga  is currently offline bluesaga
Messages: 20
Registered: December 2006
Junior Member
ALSO, i might add i tested with MYISAM and INNODB seperately, and saw pretty much no difference between loadtimes selecting 1000 rows using SELECT ... FROM ... WHERE ... IN (1,2 ...1000)

Just giving you the heads up, id say its HIGHLY related to your buffer pool.
Re: 13GB DB - Performance Question [message #1313 is a reply to message #1312 ] Thu, 24 May 2007 14:09 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

bluesaga wrote on Thu, 24 May 2007 13:56

ALSO, i might add i tested with MYISAM and INNODB seperately, and saw pretty much no difference between loadtimes selecting 1000 rows using SELECT ... FROM ... WHERE ... IN (1,2 ...1000)

Just giving you the heads up, id say its HIGHLY related to your buffer pool.


Yeah, I figure you're right but having the memory eaten by myisam as well as innodb is kind of a waste. So I'm gonna switch to myisam across the board to make sure that I can optimize for one engine instead of two! Smile


G-Man
Re: 13GB DB - Performance Question [message #1314 is a reply to message #1296 ] Thu, 24 May 2007 15:22 Go to previous messageGo to next message
carpii  is currently offline carpii
Messages: 46
Registered: November 2006
Member
sterin wrote on Wed, 23 May 2007 16:54



Apart from that I have a suggestion:
select Paragraph from WikiParagraphs where id = 1
UNION
select Paragraph from WikiParagraphs where id = 4
UNION
select Paragraph from WikiParagraphs where id = 6




Its better to use UNION ALL when you know the resultset will not contain any duplicates.
By default UNION implies DISTINCT (ie, it attempts to remove all duplicate rows from the resultset)

I doubt this would make much of a difference in this case, but I thought it was worth mentioning Smile
Re: 13GB DB - Performance Question [message #1315 is a reply to message #1308 ] Thu, 24 May 2007 18:31 Go to previous messageGo to next message
sterin  is currently offline sterin
Messages: 323
Registered: March 2007
Location: Sweden
Senior Member
What did you actually mean by "It drops about 40 million rows..."?

The INSERT INTO ... SELECT * FROM ....;

Is very straight forward and I have never experienced it to fail.
I might take a long time to perform it sometimes and that can be a problem sometimes. But that is why I suggested that you move it in portions where you select a range of primary keys that you transfer each time you perform the query.
Re: 13GB DB - Performance Question [message #1316 is a reply to message #1295 ] Thu, 24 May 2007 18:35 Go to previous messageGo to next message
GeoffreyF67  is currently offline GeoffreyF67
Messages: 21
Registered: May 2007
Junior Member

When I say it dropped rows I mean that about 40 million rows never made it to the myisam db.

count(*) doesn't match on both db's.

I'm doing it through php right now by selecting one row at a time and inserting it. Should only take me 24 hours or so assuming it works.

With regards to the same charset, I tried it with the same charset and without to no avail.

I'm hopeful that going through php and doing it one select/insert at a time will work even if it is a bit slower Razz


G-Man
Re: 13GB DB - Performance Question [message #1338 is a reply to message #1295 ] Tue, 29 May 2007 10:52 Go to previous messageGo to next message
tanj  is currently offline tanj
Messages: 16
Registered: March 2007
Location: France
Junior Member
Hello,

I'm in a habit to play with 50GB tables so I will give you my own advice for what it's worth Wink

1. There's a better way to move large data into MySQL. Don't do "INSERT INTO ... SELECT * FROM" it's just plain inefficient.
Dump your data into a textfile. "SELECT * FROM t1 INTO OUTFILE" will do the trick.
There you can do external editing (removing bogus chars, converting charsets...) as you wish.
Then insert your data in your new table using "LOAD DATA INFILE"
This should be much faster.

2. About your performance problem. As someone said, 128MB is EXTREMELY INEFFECTIVE for InnoDB. InnoDB likes memory, and for that you need at last 70% of available memory allocated to the buffer pool (or if you're on a 32-bit system, 2GB will do).

3. I see that you will be using MyISAM. Can you post your key_buffer_size variable? MyISAM buffer only stores indexes so the size will be inferior to InnoDB's.

4. Partitioning could be a key response to your performance problems. Can you use MySQL 5.1? If then, I recommend you have a look at the Partitioning stuff. You could easily partition your data by date and improve access times. Otherwise, if you stick with 5.0 you can try to divide your data in smaller tables, then use MERGE tables for the largest span queries.

Best Regards
Re: 13GB DB - Performance Question [message #1344 is a reply to message #1311 ] Thu, 31 May 2007 15:01 Go to previous message
JGilbert  is currently offline JGilbert
Messages: 31
Registered: May 2007
Location: Chicago, IL
Member

If the query exceeds the maximum amount of memory available, it will fail. I've had this happen before, in fact, i have it happen nightly at least once for a query I have yet to optimize as MySQL is running its backups processes Wink
Previous Topic:InnoDB innodb_log_file_size
Next Topic:Innodb Index cardinality keep change
Goto Forum:
  



Current Time: Tue Jan 6 22:06:57 EST 2009

Total time taken to generate the page: 0.03783 seconds
.:: Contact :: Home :: MySQL Support by Percona.com ::.

Powered by: FUDforum 2.7.5.
Copyright ©2001-2006 FUD Forum Bulletin Board Software

MySQL Performance | Forum authority Badge