Take This Cron Job And Shove It

Earlier this week, I began migrating all of the Synfibers-hosted sites running Drupal, the PHP-based CMS, from version 4.2 (plus one site on a modified 4.0 installation) to the latest and greatest edition, 4.3.2.

For the most part, the process was pretty painless. Unlike some previous upgrades, there were very few times I had to manually alter a site's database to get the new code tree to work, and most of the time these were mentioned explicitly by the upgrade script. The first day I did one site, and let it sit overnight. When everything seemed to be fine, I did five more sites the next day.

That's when everything seemed to slow down to a crawl, about every fifteen minutes or so. I thought it was the scheduled cron job that updates the database with XML newsfeeds and so on; I figured some of the newsfeed URLs were bad or had malformed XML, so I started weeding out the feeds, removing sites that had invalid XML or sites that didn't seem to have active newsfeeds anymore.
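For what it's worth, the feed weeding can be roughed out in a few lines of Python rather than done by eye. This is only a sketch, not what I actually ran: the function names are made up for illustration, the feed bodies are assumed to have been downloaded already, and it only checks that the XML is well-formed, not that it is a valid newsfeed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for unclosed tags, stray characters, empty documents, etc.
        return False


def split_feeds(feeds):
    """Partition a {url: body} dict into (good_urls, bad_urls).

    A feed counts as "bad" here only if its body is not well-formed XML.
    """
    good, bad = [], []
    for url, body in feeds.items():
        (good if is_well_formed(body) else bad).append(url)
    return good, bad
```

Anything that lands in the bad list is a candidate for removal; a feed that parses fine but hasn't been updated in months still has to be spotted by hand.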

Then Verio tech support emailed me, warning of dangerously high CPU usage in some Apache webserver threads and suggesting that it might be due to Googlebot. At first I thought it was the cron jobs again, and that he just wasn't familiar with my usage of Drupal and the need for these cron scripts to run frequently. As it turned out, he was familiar with it, and he had already dismissed the cron jobs as a potential cause because the times when they ran didn't match up with the times when these high-CPU-usage Apache processes were spawning.

So that meant that either the search bots were the culprit, or somebody was DDoSing Synfibers. A DDoS seemed unlikely, so I started changing some things. I used robots.txt to hide cron.php from the bots, figuring that perhaps they were hitting that page and kicking off multiple instances of the cron script, which might bog things down. I took a look at top on the server and noticed some Apache processes going as high as 95% CPU utilization, in effect bringing the server to its knees.
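The robots.txt change amounts to a two-line rule. Here is a minimal sketch of it, along with a quick check using Python's standard urllib.robotparser module. The rule text is my reconstruction and assumes cron.php sits at the site root; the is_allowed helper is hypothetical. Keep in mind that robots.txt is only advisory: well-behaved crawlers honor it, but nothing forces a bot to.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: keep every crawler away from cron.php at the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cron.php
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

Calling is_allowed(ROBOTS_TXT, "Googlebot", "/cron.php") should come back False, while ordinary pages remain crawlable.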