April 5th, 2011 by John Reeve
In my experience as a web developer at Pelago I’ve helped develop several web-based applications that required more than just an out-of-the box database configuration. While 99% of the online apps we’ve built can get by on a MySQL or PostgreSQL database with little to no configuration, the few that do require optimizations will certainly squawk about when the time comes.
I’ll never forget the first time this happened to us. We had launched an e-commerce web site that generated a fair amount of traffic, but not enough to affect the server in any way. Then there was a big blowout sale one day. Sale transactions, for the first two hours of the sale, were coming in at the rate of 8 per minute. That combined with the sheer number of people on the site brought the server to it’s knees. Most web applications suffer the same bottleneck, the database. This experience was no different. The database server was holding up the site.
There are plenty of contingencies that can be implemented when a database comes under load, but it’s best to have prepared for them ahead of time. Still, that can’t always prepare you for a spike in traffic. Take, for example, this blog post from Pinboard:
“The bad news was that it had never occurred to me to test the database under write load.”
— Anatomy of a Crushing
Read the blog post for some great tips on optimizing a database to handle spikes in traffic. Meanwhile, here are some tips on basic database optimization that will get you 90% of the way there.
- Use a dedicated database server.
- Make sure all of the columns used in JOIN and WHERE clauses are indexed, especially if they have foreign key constraints on them.
- For PostgreSQL, designate more shared memory to the server, especially if it is on a dedicated server. Also, modify the shared_buffers in postgresql.conf to give it more memory. There are some great tutorials online for how to do this right.
- Increase the number of simultaneous connections allowed by the database.
- Use a connection pooler, such as pgBouncer.
- Optimize your queries, then optimize them again.
- Log your ten slowest queries and make it a point to speed them up. Then repeat this exercise as many times as you can.
- Test your queries under load and against larger data sets. A query might seem fast in development but can fail miserably in production.
- Allow for exceptions to the above rules. For example, if you are logging usage data behind the scenes, don’t index it. It will slow down writes.
We were able to get the e-commerce site running smoothly again, to the point where it wasn’t slowed down by any more sales. It took a combination of server optimizations, Apache configuration modifications, and most importantly, database optimizations. Many of our clients, including the aforementioned e-commerce site, have benefited from the lessons we’ve learned doing battle with databases under heavy load. The best part? Every bit of what we’ve learned about database design and infrastructure has been applied to our successful online time, task and project management software, Intervals.
When developing an online application, don’t neglect the database. It’s one of the most important components and also one of the most overlooked. Many web developers just assume the database will run fine. There are a lot of good articles online that will teach you about writing better SQL queries and tuning databases for speed. The more prepared you are, the more successful your app will be.