Polls, Margins of Error, and Six Sigma Data-Driven Decision Making

One of the most critical skills that a technology manager can have – or any manager, really – is the ability to interpret data and assess whether or not it reflects reality. Why is this important? Because good managers base their decisions at least in part on data, so the quality of the decision is often related to the quality of the data on which the decision is based. (One of the tenets of Six Sigma, for example, is “data-driven decision making”.)

So what if you were basing a decision on the election polls currently being conducted? Six Sigma experts and practitioners, take note: today’s election polls offer a really effective lesson on threats to validity. If human subjects are ever a part of your quality improvement efforts, you need to be aware of these sorts of issues:

[Poll results and margins of error work] pretty well if you’re interested in hypothetical colored balls in hypothetical giant urns, or survival rates of plants in a controlled experiment, or defects in a batch of factory products. It may even work well if you’re interested in blind cola taste tests. But what if the thing you are studying doesn’t quite fit the balls & urns template?

  • What if 40% of the balls have personally chosen to live in an urn that you legally can’t stick your hand into?
  • What if 50% of the balls who live in the legal urn explicitly refuse to let you select them?
  • What if the balls inside the urn are constantly interacting and talking and arguing with each other, and can decide to change their color on a whim?
  • What if you have to rely on the balls to report their own color, and some unknown number are probably lying to you?
  • What if you’ve been hired to count balls by a company who has endorsed blue as their favorite color?
  • What if you have outsourced the urn-ball counting to part-time temp balls, most of whom happen to be blue?
  • What if the balls inside the urn are listening to you counting out there, and it affects whether they want to be counted, and/or which color they want to be?

If one or more of the above statements are true, then the formula for margin of error simplifies to:

Margin of Error = Who the hell knows?

Because, in this case, so-called scientific “sampling error” is completely meaningless, because it is utterly overwhelmed by unmeasurable non-sampling error. Under these circumstances “margin of error” is a fantasy, a numeric fiction masquerading as a pseudo-scientific fact.
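To make the point concrete, here is a rough sketch (mine, not from the quoted article) of just one of those issues: balls of one color being more willing to be counted. It uses the textbook 95% margin of error for a proportion, z * sqrt(p(1 - p) / n), which assumes a simple random sample. The function names, the 50/50 true split, and the 60%/40% response rates are all made-up assumptions for illustration, not real polling data.

```python
import math
import random


def margin_of_error(p_hat, n, z=1.96):
    """Textbook margin of error for a proportion (z=1.96 gives ~95%).

    Only covers sampling error, and only for a simple random sample.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def simulate_poll(true_blue=0.50, n_contacted=20000,
                  respond_blue=0.60, respond_red=0.40, seed=0):
    """Hypothetical poll where blue balls answer more often than red ones.

    All parameter values are illustrative assumptions.
    Returns (estimated blue share, number of respondents, margin of error).
    """
    rng = random.Random(seed)
    blue = 0
    red = 0
    for _ in range(n_contacted):
        is_blue = rng.random() < true_blue
        # Non-sampling error: willingness to respond depends on color.
        responds = rng.random() < (respond_blue if is_blue else respond_red)
        if responds:
            if is_blue:
                blue += 1
            else:
                red += 1
    n = blue + red
    p_hat = blue / n
    return p_hat, n, margin_of_error(p_hat, n)


if __name__ == "__main__":
    p_hat, n, moe = simulate_poll()
    print(f"respondents: {n}")
    print(f"estimated blue share: {p_hat:.3f} +/- {moe:.3f}")
    print(f"actual error vs. true share of 0.500: {abs(p_hat - 0.50):.3f}")
```

Under these assumed response rates, the estimate lands near 60% blue with a reported margin of error of about one point, even though the true share is 50%. The formula is doing its job; it just has no way of seeing the error that matters.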

Read the whole article at http://iowahawk.typepad.com/iowahawk/2008/10/balls-and-urns.html. It’s a winner. (And thanks, Mary Pat, for posting this on Facebook.)


  • Nicole, I’m not the quality guru you are. That’s for sure! But, I actually had similar thoughts about all the polling going on for the presidency. How the heck do they calculate the margins of error reliably when they are talking to people, who we know to be a fickle species? My gut was telling me that they simply couldn’t. Your argument is much more scholarly. 😀

    P.S. And, yes, I do read your blog. Very nice and prolific as well!

  • Hi Amy – not my argument, but I appreciated it and saw that it could be applied to lots of other situations where you’re sampling humans. Too often people use or interpret statistics without thinking about all the threats to validity. Thanks for the note!!
