Diagnosing performance degradation under adverse circumstances

[This post is a few years old and was never published. Recently, I was reminded about memcached slab imbalance, which in turn reminded me of this post.]

At work, we encountered a sudden and precipitous performance regression on one particular page of a legacy application. It’s a Perl web application, running under mod_perl, using ModPerl::RegistryLoader to compile scripts at server startup, and Apache::DBI to provide persistent database connections.

Our users suddenly began complaining about one particular page being “three times slower than normal.” Later examination of the Apache logs showed a 20x(!!) slowdown.

Investigating this performance problem was interesting because we didn’t have good access to required data, and our technology choices slowed us down or completely prevented us from collecting it. Although we solved the mystery, the experience had several important lessons.