Whenever you find yourself writing "the average worker" this or "the average household" that, you don't want to use the mean to describe those situations. You want a statistic that tells you something about the worker or the household in the middle. That's the median.
Again, this statistic is easy to determine because the median literally is the value in the middle. Just line up the values in your set of data, from largest to smallest. The one in the dead center is your median.
For the World Wide Widget Co., here are the workers' salaries:
That's nine employees. So the one halfway down the list, the fifth value, is $15,000. That's the median. (If you have an even number of values lined up, split the difference between the two in the middle: add them together and divide by two.)
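If you'd rather let a computer do the lining up, the procedure above can be sketched in a few lines of Python. The salary figures here are made up for illustration (they are not the World Wide Widget Co. numbers), and `middle_value` is a hypothetical helper name, not a standard function.

```python
def middle_value(values):
    """Return the median: the middle value, or the midpoint of the two middle values."""
    ordered = sorted(values, reverse=True)  # largest to smallest, as in the text
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: one value sits in the dead center.
        return ordered[mid]
    # Even count: split the difference between the two in the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical salaries, for illustration only.
odd_set = [90_000, 40_000, 30_000, 25_000, 20_000]
even_set = [90_000, 40_000, 30_000, 20_000]

print(middle_value(odd_set))   # 30000
print(middle_value(even_set))  # 35000.0
```

Python's standard library also ships `statistics.median`, which should give the same answers; the hand-written version is here only to show the steps.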
Comparing the mean to the median for a set of data can give you an idea of how widely the values in your dataset are spread apart. In this case, there's a sizable gap between the CEO at WWW Co. and the rank and file. Of course, in the real world, a set of just nine numbers won't be enough to tell you very much about anything. But we're using a small dataset here to help keep these concepts clear.
Here's another illustration of this concept. Ten people are riding on a bus in Redmond, Washington. The mean income of those riders is $50,000 a year. The median income of those riders is also $50,000 a year.
Joe Bleaux gets off the bus. Bill Gates gets on.
The median income of those riders remains $50,000 a year. But the mean income is now in the neighborhood of several million dollars. A clueless or dishonest reporter could jump in now to say that the average income of those bus riders is several million bucks. But those other nine riders didn't become millionaires just because Bill Gates got on their bus. A reporter who writes that the "average rider" on that bus earns $50,000 a year, using the median instead of the mean, provides a far more accurate picture of those bus riders' place in the economy.
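Here is a rough sketch of the bus example in Python, using the standard library's `statistics` module. Two things in it are assumptions, not facts: every original rider is given exactly $50,000 (the simplest way to get a mean and a median of $50,000), and the new rider's income is a made-up placeholder, not anyone's real earnings.

```python
from statistics import mean, median

# Assumption: every rider earns exactly $50,000, which is one simple way
# to get a mean and a median of $50,000. Real riders would vary.
riders_before = [50_000] * 10

# Assumption: a made-up annual income for the new rider, chosen only to
# push the mean into the millions. It is not a real figure.
NEW_RIDER_INCOME = 30_000_000

# One $50,000 rider gets off; the new rider gets on.
riders_after = riders_before[:-1] + [NEW_RIDER_INCOME]

print("before:", mean(riders_before), median(riders_before))
print("after: ", mean(riders_after), median(riders_after))
```

With these assumed numbers, the mean works out to $3,045,000 while the median stays at $50,000. Nine of the ten incomes never changed, and only the median shows it.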
Statisticians have a value, called the standard deviation (SD), that tells them how widely the values in a set are spread apart. A large SD tells you that the data are widely scattered, while a small SD tells you the data are pretty tightly bunched together. If you'll be doing a lot of work with numbers or scientific research, it will be worth your time to learn a bit about the standard deviation. We'll get to that definition in a bit.
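Without getting into the definition yet, you can see the idea by letting Python compute an SD for two sets of numbers. Both datasets below are invented for illustration. The sketch uses `statistics.pstdev`, the standard library's population standard deviation; a sample standard deviation (`statistics.stdev`) would give slightly different numbers but tell the same story.

```python
from statistics import mean, pstdev

# Two made-up datasets with the same mean (50) but very different spread.
tight = [48, 49, 50, 51, 52]
spread = [10, 30, 50, 70, 90]

print("tight: ", mean(tight), round(pstdev(tight), 2))
print("spread:", mean(spread), round(pstdev(spread), 2))
```

The means match, so the mean alone can't tell these two sets apart. The SD can: it comes out much larger for the second set, whose values sit far from the middle.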
If you are interested, I'll tell you the definition of the mode, too.
Read the rest of Robert's statistics lessons for people who don't know math.
© Robert Niles.