This may seem a bit ‘back to basics’, but it’s a topic that has come up in discussions I’ve been having recently.
When presenting performance metrics, folks sometimes use the mean, sometimes the median, sometimes both, and sometimes they include standard deviation, percentiles, etc. I’ve been looking for some concrete guidance on which metrics to use, and in what contexts.
In my search, I came across a fantastic book (ironically, written by an old college professor of mine), called The Art of Computer Systems Performance Analysis.
This book covers a wide range of topics, and I highly recommend it. It’s rather expensive at $75, but Dr. Jain has provided the material in presentation form here.
I specifically want to call out chapter 12, on Summarizing Measured Data (slides can be downloaded at the bottom of the page). Don’t get too caught up in all the math – you can still get a lot out of it without understanding every equation.
Here’s my interpretation/summarization of the chapter.
When attempting to summarize data with a single number, we want to show both indices of central tendency and indices of dispersion. In other words, when measuring Web page performance: what is the most common experience, and how variable is the data?
Web performance data is not normally distributed, but is positively skewed (i.e. has a long tail), as is common with computer response times. Because the Mean is so impacted by the tail, it can be far from the central tendency. The Median, while not perfect, is closer to the central tendency for skewed distributions, so it is the preferred metric. This is demonstrated on slides 15 and 16, with more details in the surrounding slides.
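To make the effect of the tail concrete, here’s a small Python sketch; the log-normal parameters are invented purely for illustration, not taken from the book or from any real measurements.

```python
import numpy as np

# Simulated page-load times in ms, drawn from a log-normal (long right tail).
# These parameters are made up purely for illustration.
rng = np.random.default_rng(42)
load_times = rng.lognormal(mean=6.0, sigma=0.8, size=10_000)

print(f"mean:   {np.mean(load_times):7.1f} ms")    # pulled upward by the tail
print(f"median: {np.median(load_times):7.1f} ms")  # closer to the typical experience
```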
Correspondingly, the suggestion for indicating data variability (dispersion) is either percentiles or SIQR (Semi-Interquartile Range, defined on slide 41). Again, this is due to response time data having a skewed distribution. Refer to slide 45, with more details in surrounding slides.
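And a similar sketch for the dispersion side, computing percentiles and SIQR on the same kind of illustrative sample (again, the numbers are made up):

```python
import numpy as np

# Dispersion for skewed data: percentiles and SIQR instead of standard deviation.
rng = np.random.default_rng(42)
load_times = rng.lognormal(mean=6.0, sigma=0.8, size=10_000)

q1, q3 = np.percentile(load_times, [25, 75])
siqr = (q3 - q1) / 2            # Semi-Interquartile Range
p95 = np.percentile(load_times, 95)

print(f"25th / 75th percentile: {q1:.1f} / {q3:.1f} ms")
print(f"SIQR:                   {siqr:.1f} ms")
print(f"95th percentile:        {p95:.1f} ms")
```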
Thoughts?
Web performance data actually follows a log-normal distribution, i.e., the log (the base is unimportant) of response times is normally distributed. This property of the data means that the geometric mean is a good measure of central tendency. You can get the geometric mean as exp(avg(log x)). Similarly, you can get the geometric standard error.
There’s a paper that describes this:
http://home.pacbell.net/ciemo/statistics/WhatDoYouMean.pdf
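For what it’s worth, here’s a quick Python sketch of the commenter’s exp(avg(log x)) formula on illustrative log-normal data. The geometric standard deviation shown is one common way to express the multiplicative spread; a geometric standard error would divide the log standard deviation by sqrt(n) first.

```python
import numpy as np

# Geometric mean as exp(avg(log x)); the log base cancels out, so natural log works.
# The sample below is illustrative only.
rng = np.random.default_rng(42)
load_times = rng.lognormal(mean=6.0, sigma=0.8, size=10_000)

log_times = np.log(load_times)
geo_mean = np.exp(np.mean(log_times))
# Multiplicative spread: the geometric standard deviation.
geo_sd = np.exp(np.std(log_times, ddof=1))

print(f"geometric mean: {geo_mean:.1f} ms")
print(f"geometric SD:   x{geo_sd:.2f}")
```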
There’s also a question about whether you should run IQR filtering on the data before analysing it. IQR filtering tends to get rid of outliers, which in the case of web performance data are mainly excessively high numbers caused by DNS misses (i.e., an ISP’s first DNS server failed and the second responded). That isn’t something you can control, though you may want to know about it anyway.
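A rough sketch of what that filtering might look like, using the conventional Tukey fences (the 1.5 multiplier is a standard choice, not something the comment specifies):

```python
import numpy as np

# Tukey-style IQR filter: keep points inside [Q1 - k*IQR, Q3 + k*IQR].
def iqr_filter(samples, k=1.5):
    q1, q3 = np.percentile(samples, [25, 75])
    iqr = q3 - q1
    keep = (samples >= q1 - k * iqr) & (samples <= q3 + k * iqr)
    return samples[keep]

# Illustrative data only.
rng = np.random.default_rng(42)
load_times = rng.lognormal(mean=6.0, sigma=0.8, size=10_000)
filtered = iqr_filter(load_times)
print(f"kept {filtered.size} of {load_times.size} samples")
```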
I’ll be speaking about this at confoo.ca in Montreal in March if you’re interested.