The Importance of "noatime"

At Shutterstock we’ve been putting a lot of effort into rolling out an infrastructure-wide configuration management and provisioning system so that all our servers are built correctly every time. The system uses cobbler and puppet to enforce the right packages and configuration on each server set we run (thumb, web, DB, memcache, etc.), and we use other cool tools like fabric and mcollective to run bulk jobs across pools of servers. It’s been a lot of work, and it’s always nice to see some validation that it was worth it. I’ll write a larger post on some of these tools later on.

Below is a good example: a server that is not yet “puppetized” and was largely built by hand to be a replica of the other three servers in the pool. This server was missing a simple “noatime” mount option. By simple I mean the fix itself was completely trivial, though the cause of the problem was something we discussed for quite some time. Not a ton of ops time was lost, but I think we spent a few hours scratching our heads before one of our engineers decided to get to the bottom of it. Check out the difference it made on load.

Before:

After:

There are a few wins here:

Performance – Major decrease in load on thumb02

Sexy – A graph that looks like the server set is scaling horizontally as we would expect

Validation – The warm, fuzzy feeling of knowing that with puppet on every host, a misconfiguration like this should never happen again (and if it does, we can always run a diff to find out what’s awry).
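
For anyone who wants to check their own hosts for the same problem, here is a minimal sketch of how a missing “noatime” could be spotted. The noatime option tells the kernel not to update a file's access time on reads. The sketch parses text in the format of Linux's /proc/mounts (device, mount point, filesystem type, comma-separated options). The function name, the sample data, and the list of filesystem types worth checking are assumptions made for illustration, not part of our actual tooling.

```python
# Hypothetical helper: list mount points that lack the "noatime" option.
# Input is text in /proc/mounts format:
#   device mount_point fs_type options dump pass

def mounts_missing_noatime(mounts_text, fs_types=("ext3", "ext4", "xfs")):
    """Return mount points of the given filesystem types without noatime.

    The default fs_types tuple is an assumption; adjust it to whatever
    on-disk filesystems your servers actually use.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Split on commas so "noatime" only matches as a whole option.
        if fs_type in fs_types and "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing


# Made-up sample data for illustration only.
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /data ext4 rw,noatime 0 0\n"
    "proc /proc proc rw,nosuid 0 0\n"
)

flagged = mounts_missing_noatime(SAMPLE_MOUNTS)
print(flagged)
```

On a real Linux host you could pass in the contents of /proc/mounts instead of the sample string, for example `mounts_missing_noatime(open("/proc/mounts").read())`, and run it across a pool with whatever bulk tool you already use.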