
One of my few pet peeves in life is when I hear someone say “the numbers don’t lie!” While I understand and appreciate the use of fact-based information to support an argument, its application has become overly generalized.

People often use numbers as a crutch to support weak arguments, presuming that any stat is a good one, capable of automatically validating their position. The truth is that numbers can and do lie to us every day. This is especially important to keep in mind as the hype around Big Data and Analytics reaches a fever pitch.

As a reminder to all of us who use data in work and life to make decisions, I’ve put together some examples of how numbers can often lie or mislead. Feel free to add more that I’ve missed!

1. Small sample size

  • Description: These are conclusions based on a small number of data points, yet portrayed as an accurate reflection of the truth. Whenever I see data, sample size is the first question I ask about.
  • Example from daily life: Baseball statistics. Volumes have been written about the use and misuse of baseball stats, but one of the most common mistakes is to judge a player based on a few weeks or months of performance. In reality, even the worst baseball players can look like All-Stars for short periods of time. It takes multiple years of data to validate the true talent level of a player. To illustrate, here are some recent players who have had great 3-4 week stretches but are no longer in the Major Leagues:


Player        | Recent Award                         | Current Playing Status
Dee Gordon    | MLB Rookie of the Month – Sept 2011  | AAA – Minor Leagues
Jemile Weeks  | MLB Rookie of the Month – June 2011  | AAA – Minor Leagues
Jair Jurrjens | MLB Pitcher of the Month – May 2011  | AAA – Minor Leagues
Bryan LaHair  | National League All-Star – 2012      | Playing in Japan
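
To put a rough number on the small-sample problem, here is a minimal Python sketch. It assumes a hypothetical hitter whose true batting average is .250 and treats every at-bat as an independent trial with that probability, which is a simplification. The function name `chance_of_hot_streak` and the figures of 60 at-bats (roughly three weeks) and 1,500 at-bats (roughly three seasons) are my own illustrative assumptions, not real player data.

```python
from fractions import Fraction
from math import ceil, comb


def chance_of_hot_streak(at_bats, true_avg="0.250", streak_avg="0.350"):
    """Exact binomial probability that a hitter with the given true average
    hits at or above streak_avg over the given number of at-bats.

    Assumes independent at-bats with a fixed hit probability (a simplification).
    """
    p = Fraction(true_avg)
    min_hits = ceil(Fraction(streak_avg) * at_bats)
    # Sum P(exactly k hits) for every k that reaches the streak average.
    total = sum(
        comb(at_bats, k) * p**k * (1 - p) ** (at_bats - k)
        for k in range(min_hits, at_bats + 1)
    )
    return float(total)


three_weeks = chance_of_hot_streak(60)      # hypothetical: about three weeks
three_seasons = chance_of_hot_streak(1500)  # hypothetical: about three seasons

print(f"Chance of hitting .350+ over 60 at-bats:   {three_weeks:.4f}")
print(f"Chance of hitting .350+ over 1500 at-bats: {three_seasons:.2e}")
```

Under these assumptions, a mediocre hitter looks like a star over 60 at-bats on the order of a few percent of the time, which is often enough to happen to somebody every month. Over 1,500 at-bats the same feat is practically impossible, which is why the longer record is the one to trust.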

2. Big meaningless numbers

  • Description: These large numbers are meant to imply a significant trend, but do not provide any context. Their meaning is therefore of limited or no use.
  • Example from daily life: Social media stats. Having a lot of Twitter followers or Facebook fans doesn’t mean much on its own, yet those counts are often used as a proxy for someone’s level of “influence.” There are easy ways to get a ton of followers in social media. What matters is whether those people actually care about what you’re saying, whether you’re engaging them, and whether it results in a real business benefit. I cringe when I see big social media “ego metrics” now.

3. Correlation, not causation

  • Description: Such figures state that Variable A causes Variable B, when in fact they are merely correlated.
  • Example from daily life: Taken from SAP CMO Jonathan Becher’s recent blog on this topic: When male college students wake up with a headache, a large percentage of the time they are still wearing their shoes. Does sleeping with your shoes on really cause headaches? Of course not; the two are only correlated. You could play this game all day long.
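
The shoes-and-headache example can be sketched with a hidden third variable. The Python snippet below is only an illustration: the confounder (a "late night out"), the group sizes, and every probability are numbers I made up, and shoes and headaches are assumed to be independent once you know which kind of night a student had.

```python
# Hypothetical population: group -> (size, P(shoes on), P(headache)).
# Within each group, wearing shoes has NO effect on headaches.
groups = {
    "late night out": (200, 0.80, 0.70),
    "quiet night": (800, 0.02, 0.10),
}


def headache_rate(shoes_on, only_group=None):
    """Share of students with a headache among those who did (or did not)
    sleep with their shoes on, optionally restricted to one group."""
    students = 0.0
    with_headache = 0.0
    for name, (size, p_shoes, p_headache) in groups.items():
        if only_group is not None and name != only_group:
            continue
        share = p_shoes if shoes_on else 1 - p_shoes
        students += size * share
        with_headache += size * share * p_headache
    return with_headache / students


print(f"Overall, shoes on:  {headache_rate(True):.2f}")
print(f"Overall, shoes off: {headache_rate(False):.2f}")
for name in groups:
    on = headache_rate(True, name)
    off = headache_rate(False, name)
    print(f"{name}: shoes on {on:.2f}, shoes off {off:.2f}")
```

Across the whole made-up population, students with shoes on have a far higher headache rate than those without. Within each group the rates are identical, so the gap comes entirely from the kind of night they had, not from the shoes.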

4. Selection bias

  • Description: These numbers imply that the data came from a random sample when it actually came from a systematically non-random one.
  • Example from daily life: Online voting polls. These are easy to discredit because by definition, all participants have access to the Internet, which automatically distorts the sample. Furthermore, the results will skew towards the readership profile of the host Web site. This is not a big deal for trivial topics like sports or entertainment, but political views extrapolated from online results can lead to truly misinformed decisions.
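
Here is a small Python sketch of how an online poll can drift away from the population it claims to describe. The three segments, their sizes, their support levels, and their response rates are all hypothetical numbers chosen for illustration; the point is only the mechanism, in which people who answer are weighted by how likely they were to see and take the poll.

```python
# Hypothetical segments:
# (name, share of population, share supporting a measure, chance of answering the poll)
segments = [
    ("readers of the host site", 0.10, 0.80, 0.30),
    ("other internet users", 0.70, 0.45, 0.01),
    ("no internet access", 0.20, 0.30, 0.00),
]

# True support: every segment counts in proportion to its size.
true_support = sum(share * support for _, share, support, _ in segments)

# Poll support: each segment counts in proportion to size * response rate.
respondents = sum(share * answers for _, share, _, answers in segments)
poll_support = (
    sum(share * answers * support for _, share, support, answers in segments)
    / respondents
)

print(f"True support in the population: {true_support:.1%}")
print(f"Support reported by the poll:   {poll_support:.1%}")
```

With these invented figures the poll reports a comfortable majority while true support sits below half, simply because the host site's own readers dominate the responses and people without Internet access are never counted.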

5. Visual trickery

  • Description: Some graphics deceive or mislead based on how the information is presented.
  • Example from daily life: Changing the Y-axis of a graph to magnify the difference in data points (see example below). You see visual trickery all the time on cable news channels. Keep an eye on how graphs are manipulated the next time you watch a news show.

     [Image: misleading graph.PNG]
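
The Y-axis trick comes down to simple arithmetic, sketched below in Python. The two values (50 and 52) and the helper name `drawn_height_ratio` are hypothetical; the function just compares how tall two bars are drawn when the axis starts somewhere other than zero.

```python
def drawn_height_ratio(smaller, larger, axis_start=0.0):
    """How many times taller the larger bar is drawn than the smaller one
    when the Y-axis begins at axis_start instead of zero."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical data points: 50 versus 52, a 4% difference.
honest = drawn_height_ratio(50, 52)                    # axis starts at 0
truncated = drawn_height_ratio(50, 52, axis_start=49)  # axis starts at 49

print(f"Axis from 0:  second bar drawn {honest:.2f}x as tall")
print(f"Axis from 49: second bar drawn {truncated:.2f}x as tall")
```

The data is the same in both cases. Starting the axis at 49 makes a 4% difference look like one bar is three times the size of the other.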

6. Arbitrary cutoffs

  • Description: This is another form of selection bias: setting arbitrary start and end points that change the meaning of the data.
  • Example from daily life: Any “Top 10” list. Why is it 10 and not 11? Why does this blog have 6 bullet points instead of 10? Again, it’s not a big deal for trivial topics, but if it’s a list of Top Hospitals or Colleges, some people will make significant decisions based on that information. In addition to lists, any data that is time-bound could have arbitrary cutoff dates, so we should always keep that in mind.

So that’s my list. What am I missing? Do you have other examples of “numbers that lie”?


9 Comments


  1. Tammy Powlas

    There have been studies that have shown the 3-D graphs you are showing in section 5 make the “numbers lie” – so that is visual trickery 🙂

    I’ll have to call my brothers re: baseball stats though.

    1. Christopher Kim Post author

      Gail, I need to reminisce with you about those pre-PowerPoint days! My first job out of college was at AT Kearney in the mid-90s, and I have memories of writing out entire decks by hand, then giving it to a production staff who would work on it all night on Aldus PageMaker. And then scribbling my edits on each page and going through the same process over and over. And somehow it always got completed 2 minutes before the meeting started 🙂 Ahh the good old days!

  2. Lesli Arbuthnot

    “Statistics are like a bikini. What they reveal is suggestive, but what they conceal is vital.”

    – Aaron Levenstein

    That one has stuck with me ever since I was a kid, when I first heard it on a segment of CBS’ 60 Minutes. It wasn’t until the advent of the Internet that I was finally able to track down the attribution.

  3. Carl Harris

    Following on with appropriate quotes, my old maths teacher always used to say to me that “figures never lie, but liars sometimes figure” – go figure 😉

    Great post though Christopher, thanks for sharing.

