Gentoo Forums
Gentoo Forums
Gentoo Forums
Quick Search: in
VERY strange MySQL problem, definitely worth a look
View unanswered posts
View posts from last 24 hours

 
Reply to topic    Gentoo Forums Forum Index Other Things Gentoo
View previous topic :: View next topic  
Author Message
planet-admin
Apprentice
Apprentice


Joined: 27 Mar 2004
Posts: 213
Location: Boise, ID

PostPosted: Sun Dec 31, 2006 6:42 pm    Post subject: VERY strange MySQL problem, definitely worth a look Reply with quote

I have a mysql 4.1 server running on a quad opteron machine running redhat E/S 4u4.

This mysql runs at about 30-40% cpu usage. It has tables that are no larger than 2 gb. They are all MyISAM tables.

The cpus in this box are 2.2ghz opterons, and there are 20gb of RAM.

This box also serves as a web server, so we wanted my migrate the mysql off of the box, onto a 16core opteron 32gb RAM box, (running gentoo, with a vanilla 2.6.19, and it runs very well) with fiber channel attached raid array. Alreay on this machine we have a MySQL 4.1 with several databases, and a MySQL 5.0.26 with tables upwards of 6gb (a vbulletin site). (those databases have NO problem at ALL)

With both of these running (and that vbulletin sees a LOT of traffic), the load avg is 1-2, and the cpu usage is no more than 10%. (It is 16 cores after all).

I created a new mysql instance/config file/init script, and moved our binary data files over (created using mysqlhotcopy).

I started the server (used even the same config file, mysql variable-wise), and the cpu usage slams the server, upwards of 1100%. The load avg gets as high as 90.

I have tried:
clearing the linux disk cache, and cat'ing the myi/myd files to /dev/null, just to get them back in the cache.
MySQL 5
switching the tables to innodb
fine-tuning the mysql variables
several other things


Now, quite obviously, the 16 core machine is a much beefier server, and the nic wasn't being overutilized, yet the mysql process was taking at the very least 400-500% cpu, with actual numbers around 900%. It was causing the load average to spike, making the other mysql instances....slow.

I moved the EXACT SAME data files back (so, that rules out something strictly relating to them) to the other quad opteron box, and it runs perfectly.

I don't even know where to BEGIN to try to figure out why it would perform so differently. Anyone have any ideas?

Thanks,
Michael
_________________
Michael S. Moody
Sr. Systems Engineer
Global Systems Consulting
Web: http://www.GlobalSystemsConsulting.com
Back to top
View user's profile Send private message
Dan
Veteran
Veteran


Joined: 25 Oct 2005
Posts: 1302

PostPosted: Sun Dec 31, 2006 8:52 pm    Post subject: Reply with quote

nothing fishy in /var/log/mysql ?
Back to top
View user's profile Send private message
planet-admin
Apprentice
Apprentice


Joined: 27 Mar 2004
Posts: 213
Location: Boise, ID

PostPosted: Sun Dec 31, 2006 9:04 pm    Post subject: Reply with quote

Nada, zip. I can't figure it out for the life of me. It's completely incomprehensible.

Thanks,
Michael
_________________
Michael S. Moody
Sr. Systems Engineer
Global Systems Consulting
Web: http://www.GlobalSystemsConsulting.com
Back to top
View user's profile Send private message
BitJam
Advocate
Advocate


Joined: 12 Aug 2003
Posts: 2508
Location: Silver City, NM

PostPosted: Sun Dec 31, 2006 9:05 pm    Post subject: Re: VERY strange MySQL problem, definitely worth a look Reply with quote

planet-admin wrote:
It was causing the load average to spike, making the other mysql instances....slow.

Perhaps the performance problem has to do with using multiple instances. I would also double check and make sure your indices are actually being used.
Back to top
View user's profile Send private message
m.wales
n00b
n00b


Joined: 06 Sep 2006
Posts: 18

PostPosted: Sun Dec 31, 2006 10:19 pm    Post subject: Reply with quote

So the only thing that stood out to me was that you were moving from MySQL 4 to 5. Perhaps the problem lies somewhere in that switch? Maybe the configuration you used for 4 doesn't work too well with 5? Also are all the tables in the new 5 instance indexed and analyzed properly as I've certainly had migration problems due to that in the past.
_________________
- Mark W -
Back to top
View user's profile Send private message
planet-admin
Apprentice
Apprentice


Joined: 27 Mar 2004
Posts: 213
Location: Boise, ID

PostPosted: Mon Jan 01, 2007 3:43 am    Post subject: Reply with quote

I'm already using multiple instances, and they run without problem. In regards to the above message, that's not the case either, as I tried it as MySQL 4.1, and then moved it to 5 to see if that would solve the problem. I currently have 2 MySQL instances running on that machine, with cpu usage of 10%, and a load average of about 1. If I add that other database, it freaks out, whether I make it 4.1, ol 5.0.

Thanks,
Michael
_________________
Michael S. Moody
Sr. Systems Engineer
Global Systems Consulting
Web: http://www.GlobalSystemsConsulting.com
Back to top
View user's profile Send private message
planet-admin
Apprentice
Apprentice


Joined: 27 Mar 2004
Posts: 213
Location: Boise, ID

PostPosted: Mon Jan 01, 2007 3:43 am    Post subject: Re: VERY strange MySQL problem, definitely worth a look Reply with quote

BitJam wrote:
planet-admin wrote:
It was causing the load average to spike, making the other mysql instances....slow.

Perhaps the performance problem has to do with using multiple instances. I would also double check and make sure your indices are actually being used.


They are, I did "explain" on several of the queries.

Thanks,
Michael
_________________
Michael S. Moody
Sr. Systems Engineer
Global Systems Consulting
Web: http://www.GlobalSystemsConsulting.com
Back to top
View user's profile Send private message
BitJam
Advocate
Advocate


Joined: 12 Aug 2003
Posts: 2508
Location: Silver City, NM

PostPosted: Mon Jan 01, 2007 5:12 am    Post subject: Reply with quote

I can think of three possible reasons for the slowdown:
  1. The new database is running unoptimized
  2. Misconfiguration
  3. Resource conflict

1. To test the indices, I would suggest actually timing select statements under a no-load condition. Compare timing on the new machine and on the old machine. I wouldn't trust explain. It would also be interesting to see how the new database works on the new machine when the vB database is off but I realize this might not be possible.

2. Make sure each instance has it's own:
  • Socket
  • Port
  • PID file
  • Log file
  • Data directory
Consider using mysql_multi.

3. Since the other instances slow down after you start the new one, (if there is no misconfiguration) it is clear that there is resource contention. What is not clear is if this is a symptom of some other problem or if it is the actual cause of the problem. I realize that you don't think this could be the cause because your current two instances play nice together, but resource contention among multiple instances is only an issue if the databases have a high load. Your new database could be the straw that breaks the camel's back.

Ideally, it would be nice to be able to see if the problem exists without the vB database running (or with no load). Alternatively you could add the new database but don't give it any load and see what happens. This might help isolate a misconfiguration. See if simple queries affect resource use. Then write a Perl (or some other language) script to start hitting the new database with multiple queries. See if this starts to cause a problem.

If such tests would be difficult or painful to run then as an alternative, you might want to closely monitor resource use when the contention starts. Perhaps you could set up GKrellm or some other monitoring tool to monitor memory use, processes, CPU, network traffic, and disk IO. Be prepared to take some screen shots.

If the new database works well by itself (tested either with timing selects or actually running it with the other dbs off or throttled back) then my next guess would be a misconfiguration causing two instances to vie for the same resource. If there is no configuration problem then you might have to start tracking down resource use. For example, depending on how your raid is configured it could be extremely unoptimized for multiple instances and you are getting creamed by seeks back and forth between two different data directories. Raids can often slow things down due to the seek times. In the past, I've tended to try to design my disk layouts to minimize seeks by playing tricks such as putting log file directories on separate drives. Generally, raids are often not well optimized for database performance. They are good for reliability but the can really suck performance-wise when compared with having multiple spindles available to cut down on seek times.

Finally, you might want to take a day off away from this and come back to it with a fresh set of eyes. Happy New Year!
Back to top
View user's profile Send private message
planet-admin
Apprentice
Apprentice


Joined: 27 Mar 2004
Posts: 213
Location: Boise, ID

PostPosted: Mon Jan 01, 2007 6:14 am    Post subject: Reply with quote

I used mysqlhotcopy to get the datafiles initially, so they're the same ones. Also, when I moved them back, I moved back the exact same files, where they are now running fine. This probably rules out some "not using indexes" problem.
I used the same config file (all the various buffer_size, etc variables from the "lesser" machine), and that didn't work. I tuned it specifically for my hardware, that didn't work. I used one of the config files from the existing mysql instances, that didn't work.

Each instance is bound to a different ip address/nic. Each has it's own pid, socket, log file, and data directory, and port.

I have turned on "log queries not using indexes", and it IS using the indexes.

I have a VERY high amount of context switching, upwards of 250,000. (using vmstat)

The database does NOT do this when there are no open connections to it, but once I start the web server (hosted on another machine), it gets SLAMMED. Again, the vbulletin, and the other database instances are far far far more utilized, but they run at a nice load average of.....1.

My RAID is actually a 16-disk 12TB raid 6 array, connected via 4gb fiberchannel, with a 1gb RAM buffer on the raid controller. Plus, I flushed the linux disk buffer cache, and then cat'ed the files to dev/null to get them loaded back into the linux disk buffer cache. This raid array is using sata2 ncq drives. It's so blazing fast it's not even funny.

The key_read vs key_read_request ratio is .002.

The tmp_tables ratio is also very low.

The query cache isn't full.

NOTHING that I can think of is causing this.

Thanks,
Michael

BitJam wrote:
I can think of three possible reasons for the slowdown:
  1. The new database is running unoptimized
  2. Misconfiguration
  3. Resource conflict

1. To test the indices, I would suggest actually timing select statements under a no-load condition. Compare timing on the new machine and on the old machine. I wouldn't trust explain. It would also be interesting to see how the new database works on the new machine when the vB database is off but I realize this might not be possible.

2. Make sure each instance has it's own:
  • Socket
  • Port
  • PID file
  • Log file
  • Data directory
Consider using mysql_multi.

3. Since the other instances slow down after you start the new one, (if there is no misconfiguration) it is clear that there is resource contention. What is not clear is if this is a symptom of some other problem or if it is the actual cause of the problem. I realize that you don't think this could be the cause because your current two instances play nice together, but resource contention among multiple instances is only an issue if the databases have a high load. Your new database could be the straw that breaks the camel's back.

Ideally, it would be nice to be able to see if the problem exists without the vB database running (or with no load). Alternatively you could add the new database but don't give it any load and see what happens. This might help isolate a misconfiguration. See if simple queries affect resource use. Then write a Perl (or some other language) script to start hitting the new database with multiple queries. See if this starts to cause a problem.

If such tests would be difficult or painful to run then as an alternative, you might want to closely monitor resource use when the contention starts. Perhaps you could set up GKrellm or some other monitoring tool to monitor memory use, processes, CPU, network traffic, and disk IO. Be prepared to take some screen shots.

If the new database works well by itself (tested either with timing selects or actually running it with the other dbs off or throttled back) then my next guess would be a misconfiguration causing two instances to vie for the same resource. If there is no configuration problem then you might have to start tracking down resource use. For example, depending on how your raid is configured it could be extremely unoptimized for multiple instances and you are getting creamed by seeks back and forth between two different data directories. Raids can often slow things down due to the seek times. In the past, I've tended to try to design my disk layouts to minimize seeks by playing tricks such as putting log file directories on separate drives. Generally, raids are often not well optimized for database performance. They are good for reliability but the can really suck performance-wise when compared with having multiple spindles available to cut down on seek times.

Finally, you might want to take a day off away from this and come back to it with a fresh set of eyes. Happy New Year!

_________________
Michael S. Moody
Sr. Systems Engineer
Global Systems Consulting
Web: http://www.GlobalSystemsConsulting.com
Back to top
View user's profile Send private message
jonnevers
Veteran
Veteran


Joined: 02 Jan 2003
Posts: 1594
Location: Gentoo64 land

PostPosted: Mon Jan 01, 2007 4:37 pm    Post subject: Reply with quote

m.wales wrote:
So the only thing that stood out to me was that you were moving from MySQL 4 to 5. Perhaps the problem lies somewhere in that switch? Maybe the configuration you used for 4 doesn't work too well with 5? Also are all the tables in the new 5 instance indexed and analyzed properly as I've certainly had migration problems due to that in the past.


if i remember correctly, to properly go from mysql4 to mysql5 you should dump the mysql4 database and import that dump back into mysql5. allowing the mysql5 databse to create all the data structures itself. I've had mysql4 datafiles completely wigg out a mysql5 database instance.Yes. it is a lot of work to dump a large database but it does have the side effect of providing you a proper backup of the database also.

This has to do with the way mysql5 structures its data.

planet-admin wrote:
I have a mysql 4.1 server running on a quad opteron machine running redhat E/S 4u4.

The cpus in this box are 2.2ghz opterons, and there are 20gb of RAM. My RAID is actually a 16-disk 12TB raid 6 array, connected via 4gb fiberchannel, with a 1gb RAM buffer on the raid controller.... (It is 16 cores after all).....

I moved the EXACT SAME data files back (so, that rules out something strictly relating to them) to the other quad opteron box, and it runs perfectly.

sounds a little like you are approaching Oracle territory w/ hardware like that.
Back to top
View user's profile Send private message
planet-admin
Apprentice
Apprentice


Joined: 27 Mar 2004
Posts: 213
Location: Boise, ID

PostPosted: Tue Jan 02, 2007 12:53 am    Post subject: Reply with quote

Well, I first tried it on MySQL 4.1. I'm aware of the migration path, I just wanted to "quickly" try a solution, when 4.1 didn't work, I started the 5 binary I had installed. Sadly, it didn't work either.

And, yes, this hardware is a serious piece of hardware. It boots off of the fiberchannel. It'll compile the COMPLETE 2.6.19 kernel, with a make -j in about 19 seconds. 19 SECONDS!

(Sadly, I also have the "joy" of working on an sgi altix, itanium 2, supercomputer pain, but I digress)


Thanks,
Michael
_________________
Michael S. Moody
Sr. Systems Engineer
Global Systems Consulting
Web: http://www.GlobalSystemsConsulting.com
Back to top
View user's profile Send private message
Display posts from previous:   
Reply to topic    Gentoo Forums Forum Index Other Things Gentoo All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum