I’ve been tuning my Apache/MySQL/PHP configuration to improve performance. Recently, I noticed some crawlers (always the same IPs) ignoring my robots.txt file and hitting my server hard with several page requests every second.
So I decided to run some tests to see where I could speed up the website’s code.
One test stood out. I commented out a single UPDATE statement in my code, and the page became about three times faster: display time dropped from 0.15 seconds to 0.045 seconds! It seems my VMware server has slow write times compared to reads, which is at least part of the cause. I decided to use this to my advantage. The idea is to create an array of the blacklisted crawler IPs (I can’t block them entirely, since I need certain crawlers to index the site for searches):
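// $current_ip is assumed to hold the visitor's IP, e.g. $_SERVER['REMOTE_ADDR']
$is_crawler = 0;
$crawler_array = array("[CRAWLER IP]", "[CRAWLER IP2]");
foreach ($crawler_array as $crawler_item)
{
    if ($crawler_item === $current_ip)
    {
        // Known crawler: flag it and stop searching
        $is_crawler = 1;
        break;
    }
}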
With this in place, run the update query only if the source IP isn’t a listed crawler, so something like this:
if ($is_crawler == 0)
{
    // Run MySQL Update command
}
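As a variation on the check above, the same idea can be sketched as a small function that keys the blacklist by IP, so the lookup is a hash lookup rather than a scan of the whole list. This is only a minimal sketch assuming PHP 7 or later: the function names `is_listed_crawler` and `update_unless_crawler`, the placeholder IPs (taken from the reserved documentation ranges), and the `$run_update` callback standing in for the MySQL update are all made up for illustration.

```php
<?php
// Hypothetical helper: returns true if $ip is in the crawler blacklist.
// $blacklist is keyed by IP so isset() does a hash lookup instead of a scan.
function is_listed_crawler(string $ip, array $blacklist): bool
{
    return isset($blacklist[$ip]);
}

// Hypothetical wrapper: runs the costly write only for non-crawler visitors.
// $run_update stands in for whatever executes the MySQL UPDATE.
function update_unless_crawler(string $ip, array $blacklist, callable $run_update): bool
{
    if (is_listed_crawler($ip, $blacklist)) {
        return false; // crawler: skip the write, the page is still served
    }
    $run_update();
    return true;
}

// Placeholder IPs from the reserved documentation ranges, not real crawlers.
$crawler_ips = array_flip(array('192.0.2.10', '198.51.100.20'));

// In a real request the IP would typically come from $_SERVER['REMOTE_ADDR'].
$updates = 0;
$count_update = function () use (&$updates) {
    $updates++; // stand-in for the real UPDATE query
};

update_unless_crawler('192.0.2.10', $crawler_ips, $count_update);  // listed crawler
update_unless_crawler('203.0.113.5', $crawler_ips, $count_update); // normal visitor
echo $updates, "\n";
```

Only the second call reaches the callback, so the counter ends at 1. For a list of two IPs the difference from the loop is negligible; the keyed array mainly pays off if the blacklist grows.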
Other ways to improve your MySQL performance:
- Only use ORDER BY when necessary. Ordering is costly with large datasets.
- Joins are costly as well. If performance is a concern, design a cache table that stores the pre-joined result set.
- Memcache is also a viable option for pages that are generated from many tables.
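
The caching idea in the last bullet can be sketched as a cache-aside lookup: check the cache first, and only run the expensive multi-table query on a miss. To keep the example runnable without a Memcache server, a plain PHP array stands in for the cache here. The function name `cached_fetch`, the key `page:42`, and the `$load_page` callback (a placeholder for the real query) are assumptions for illustration, not code from a real site.

```php
<?php
// Hypothetical cache-aside helper. $cache is a plain array used as a stand-in
// for a Memcache-style store; $load is only called on a cache miss.
function cached_fetch(array &$cache, string $key, callable $load)
{
    if (array_key_exists($key, $cache)) {
        return $cache[$key]; // hit: no database work
    }
    $value = $load();        // miss: run the expensive query once
    $cache[$key] = $value;
    return $value;
}

$cache = array();
$queries = 0;

// Placeholder for a page built from several joined tables.
$load_page = function () use (&$queries) {
    $queries++;
    return array('title' => 'Example page', 'comments' => 3);
};

$first  = cached_fetch($cache, 'page:42', $load_page);
$second = cached_fetch($cache, 'page:42', $load_page);
echo $queries, "\n"; // the loader ran only once for two lookups
```

A PHP array only lives for a single request, so the real benefit across page views comes from a shared store. With the Memcache or Memcached extension, the array read and write would be replaced by its `get` and `set` calls, usually with an expiry time so stale pages eventually refresh.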