If you’re problem is your SQL, chances are you’re going to go about fixing it the wrong way. What is the “wrong way”? Trying to optimize Mysql settings is often a sign that you’re heading in the wrong direction. What should you do? Look at your SQL. Yes, just look at it.
That’s assuming you know what SQL is the problem. If you aren’t quite sure, try looking into Mysql’s Slow Query Log. Also use “mysql -e” or “mysqladmin processlist”. Maybe even make a cron to monitor this so you can go back and look at the logs. Bottom line: figure it out yourself, that’s not for this article.
Once you figured out your problem SQL, ask yourself a few questions:
- Do I have a lot of JOIN’s?
- Am I working with one or more large tables?
- Does my SQL have any basic flaws?
These are just covering the basics. Once you get to the point where you are sure you’re JOIN’s are good, your table size/# of records isn’t the cause of the problem, and your SQL doesn’t have any basic flaws, move on to the more advanced questions:
- Is my problem with my SQL or with my PHP?
Even though it’s your SQL causing the hang, your problem might be your PHP. Are you pulling records in an efficient order? Are you pulling only the information you need? Are you doing lots of queries repeatedly because of a loop? - Is my SQL doing anything repeatedly that I can do more efficiently, such as a JOIN that is retrieving the same information over and over?
Am I retrieving duplicate information over and over due to JOIN’s?
Specifically, we’re hinting at removing a join or two, and replacing them with a parent SQL query and loop in PHP or your language of choice. This isn’t always an option, or the correct solution, but you would be amazed at how many problems this fixes. If you join several tables and simultaneously pull information from those tables, you might be able to organize your query more efficiently. Sometimes even reducing the number of unnecessary fields you are pulling can speed up your application. This doesn’t mean using “table.*” is bad, but you’ll have to keep an eye out for problems once your applications begins to grow.A great example of this is “Categories”. Don’t join to your Categories table, pull all of the categories ahead of time into a variable
$allCategories = dbQueryAllRows(""); //change function to correspond to your DB API. db_query() etc if necessary.
That way you never have to join to your category table, and when you pull records later you just pull their Category ID and reference the Category’s Name within your program. - Is my SQL trying to do too much at once?
If the problem is that your tables are massive, or your JOIN’s are necessary, you might need a different solution.Remember that even though you can read/write to multiple different tables at once, writing to individual tables is limited to one person at a time. If a large SELECT query with a half dozen JOINs is causing the system to hang, chances are everything trying to use the table will hang. This isn’t necessarily an SQL problem, and you might find that by addressing logic in PHP and adding crons, queue’s, and summary tables you can solve the problem in other ways.
One option might be a summary table. You can create a “Cached Table” that contains information you would otherwise have to query real-time. The idea is to pre-process as much as possible ahead of time, and do the rest in real-time when a user triggers a certain action. If you can’t do the rest real-time, you might be able to add a queue. So you combine the “summary table” ahead of time with the queue when a user triggers an action.
An example might be a search: Instead of constantly updating a “search index” (a table), you can generate the index from a cron every 15-60 minutes. Users will do whatever they do, add classifies or forum posts or what not, and they will appear in the search index later. There is no queue here.
Lets say you are “generating a download” similar to a search, but the end-result is a downloadable file. You can use the same search index method above, but add the “download” request into a queue, and the action will do whatever your queue is designed to make it do. The search will be performed, a file generated, and a notice will appear in the user’s account telling them a request has been completed and their file is available for download. Even if it only takes a few seconds to process, it makes the whole process look fancy and can help avoid server hangs when a bunch of users hit the site at the same time.
- Is this my only problem?
Just because you identified the #1 problem doesn’t mean there aren’t 50 other problems. In fact, it might be worthwhile to forget about #1 if it’s too difficult to solve, and go around solving the other problems you can identify. You might be able to make up for the #1 problem with other improvements throughout your application. By solving other problems you can familiarize yourself with the code more and maybe open up new fixes or identify new bugs.
Tags: join, mysql, performance, sql