Let’s do something here on our Analytics & Site Intelligence blog that quite honestly we don’t do enough of: talk about pure statistics! Can you feel the excitement running through your veins? Oh wait, that’s only me.
As the Web Analytics industry matures, the need to understand basic statistical concepts keeps growing. Awesome new features, like Google Analytics’ Intelligence report section and predictive modeling features from Google Insights for Search, beg the user to dive deep into their web data, segment it, grab insights, draw conclusions, and take meaningful actions.
Sure, you can do all that without knowing a lick about statistics, but chances are very high that you’ll get confused, lost, and overwhelmed along the way. Think of statistics the way contractors think about the foundation of a home: we all know what happens without a strong foundation!
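Enter “Standard Deviation”, which is quite possibly (next to mean) the most important element in the field of statistics. Standard Deviation measures how far the values in a set of data typically fall from the mean (average). Technically, it is the square root of the variance (another stat term!), and the variance is the average squared distance from the mean.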
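To make the definition concrete, here is a minimal sketch using Python's built-in statistics module. The viewing hours are made-up numbers for eight imaginary fans, not real data.

```python
import math
import statistics

# Hypothetical weekly viewing hours for eight fans (made-up numbers).
hours = [3.0, 3.5, 4.0, 3.5, 3.0, 4.0, 3.5, 3.5]

mean = statistics.mean(hours)            # the average
variance = statistics.pvariance(hours)   # average squared distance from the mean
std_dev = statistics.pstdev(hours)       # square root of the variance

print("mean:", mean)
print("variance:", variance)
print("standard deviation:", round(std_dev, 3))

# The standard deviation really is just the square root of the variance.
print(math.isclose(std_dev, math.sqrt(variance)))
```

The sketch uses the population versions (pvariance and pstdev), which treat the list as the whole group. If your numbers are only a sample of a larger group, statistics.variance and statistics.stdev are the usual choice.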
Let’s say that the average football fan watches 3.5 hours of football a week, with a standard deviation of .5 hours (a half-hour). This means that – assuming a normal distribution (a third stats term!!) – most football fans (about 68% of them) will watch anywhere from 3 to 4 hours of football a week. Since the average is 3.5, and the standard deviation is .5, watching 4 hours of football a week is said to be “one standard deviation above the mean”. Conversely, watching 3 hours of football is said to be “one standard deviation below the mean”.
However, almost all football fans (about 95% of them, assuming a normal distribution) will watch anywhere between 2.5 and 4.5 hours of football a week, which is said to be “within two standard deviations of the mean”. It’s two standard deviations because 2.5 hours and 4.5 hours are each two “.5’s” below or above our mean of 3.5.
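If you would like to check the 68% and 95% figures yourself, the sketch below does it with Python's statistics.NormalDist, assuming (as the example does) that viewing hours follow a normal distribution with a mean of 3.5 and a standard deviation of 0.5. The helper name share_within is just something made up for this illustration.

```python
from statistics import NormalDist

# Assumed distribution from the football example: mean 3.5 hours, SD 0.5 hours.
fans = NormalDist(mu=3.5, sigma=0.5)

def share_within(k):
    """Share of fans within k standard deviations of the mean."""
    low = fans.mean - k * fans.stdev
    high = fans.mean + k * fans.stdev
    return fans.cdf(high) - fans.cdf(low)

for k in (1, 2):
    low = fans.mean - k * fans.stdev
    high = fans.mean + k * fans.stdev
    print(f"{low} to {high} hours: {share_within(k):.1%} of fans")
```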
In statistics, a data point (like watching 9 hours of football) is generally considered unusual if it falls more than two standard deviations above or below the mean. Watching 9 hours of football a week is way…WAY above 2s (two standard deviations), so this would be considered highly unusual for a football fan.
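Just how far out is 9 hours? A quick way to find out is to count how many standard deviations a value sits from the mean (statisticians call this a z-score). Here is a small sketch using the numbers from the football example; the function name is only for illustration.

```python
def standard_deviations_from_mean(value, mean, std_dev):
    """How many standard deviations a value sits above (+) or below (-) the mean."""
    return (value - mean) / std_dev

MEAN_HOURS = 3.5
STD_DEV_HOURS = 0.5

for hours in (3, 4, 4.5, 9):
    z = standard_deviations_from_mean(hours, MEAN_HOURS, STD_DEV_HOURS)
    verdict = "unusual" if abs(z) > 2 else "not unusual"
    print(f"{hours} hours: {z:+.1f} standard deviations ({verdict})")
```

For 9 hours the calculation is (9 - 3.5) / 0.5 = 11, so our super-fan is eleven standard deviations above the mean.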
What does it mean for you (the Web Analyst)?
Knowing what Standard Deviation is and how it’s used in Web Analytics will help you judge just how significant the events that happen on your website are. For example, in the new Intelligence Section in Google Analytics, you may see some alerts for an increase in Revenue from different regions:
If you look at the left-hand side of the image, revenue from North Carolina for this particular time period came in 111% above the expected revenue. This is definitely significant (check out the significance bar on the right), as it’s about 3 or even 4 standard deviations above the mean! Perhaps your new PPC campaigns that were targeted to North Carolina were successful, and you can now try to duplicate that success everywhere else! Or maybe your email marketing strategy worked, and North Carolina residents responded so well that you can re-market to them in 1-2 months.
In that same image, the Revenue from the United Kingdom increased by 46%, which is about one or possibly two standard deviations above the mean. It’s not as significant an increase as North Carolina’s, but it is still worthy of your attention. Apply the same negative keywords or the same match types to your other international campaigns as well!
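You can run the same kind of check on your own exported numbers. The sketch below is not how Google Analytics calculates its alerts (that method isn't described here); it is simply the two-standard-deviation rule of thumb from earlier applied to a made-up week of daily revenue. All figures and function names are hypothetical.

```python
import statistics

def deviations_from_mean(history, new_value):
    """How many standard deviations new_value sits from the mean of history."""
    mean = statistics.mean(history)
    std_dev = statistics.stdev(history)  # sample SD: history is only a sample of days
    return (new_value - mean) / std_dev

def is_unusual(history, new_value, threshold=2.0):
    """Rule of thumb: more than `threshold` standard deviations away is unusual."""
    return abs(deviations_from_mean(history, new_value)) > threshold

# Hypothetical daily revenue (in dollars) from one region over the past week.
past_week = [1000, 1100, 900, 1050, 950, 1000, 1000]

for today in (1050, 1200):
    z = deviations_from_mean(past_week, today)
    label = "worth a closer look" if is_unusual(past_week, today) else "business as usual"
    print(f"${today}: {z:+.2f} standard deviations ({label})")
```

A week of history is a very small sample, so treat a result like this as a prompt to dig deeper rather than as proof that something changed.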
So now that you know what standard deviation is all about, use reports like Google Analytics’ Intelligence section to get a truer, deeper sense of just how significant the trends on your website are, which will help you decide which changes deserve action and which are just normal variation. You’ll be a better analyst for it!