Dear Modern Day Statistics Student

I’m sure most of you guys have heard about the whole BART strike fiasco up in the Bay Area over the last couple of months. While the issue itself is immensely interesting, it also led me to things like this:

http://enjalot.github.io/bart/

Now, at first glance, the hidden statistician and truth seeker in all of us will rejoice at the data and information at our fingertips. We will play around and hover over each bubble, trying to figure out what story each visualization has to share.

And herein lies the problem.

These visualizations tell us no stories. Well, maybe that’s not fair. These visualizations have stories to tell us – it’s just that the authors and creators have made them mute. I get it. These visualizations look nice. These visualizations are not made from simple Excel bar charts and pie graphs but rather from fancy JavaScript where bubbles get bigger when you hover over them. These visualizations are what I would have put in a 5th grade science fair project to wow all the Asian parents into comparing their sons and daughters to me. But honestly, that is exactly what I think they are – shiny, 5th grade art.

Now, to clarify, I am not against creative data visualization. I think the very act of visualizing data not only helps emphasize our insights into the data, but can also instill those very insights into our minds. To be able to creatively illustrate a point is like the milk to your statistical cookies (sorry, lactose intolerant people, you’re going to have to imagine this). But this? This is all wrong. This is like pouring milk because you think you have cookies, only to realize that you actually have no cookies. Then you just stand there wondering why in the world you poured the milk.

Let’s take the first graphic, for example: “How Much Do BART Employees Earn?” When I look at these dots, I have no idea what kind of conclusion I am supposed to draw. Okay, predominantly the people who make the most money, the most pension contributions, the most of any kind of benefit are those not in unions. So what? This tells me nothing about the BART organization. This tells me nothing about whether or not the union is justified in making demands. This just gives me a comparison with no context for the issue. But man, look at how those circles move when you change that drop-down.

Okay, so let’s go a bit further down to “are the demands reasonable?” Here, we have a fairly standard graph with 4 lines. At first glance, what hops out at you is that MAN THOSE UNION PEOPLE BE RIDICULOUS. Then you start to ask yourself: What is “Index”? What does that measure? Do I generally want to stay above the index or below the index? Is it only the slope that matters, or do the actual values matter? Nope, no information. Luckily there is a link below about the “fairness in transportation” that uses the same graph. I clicked in and found a blog with some fairly large words and complex ideas (which, by the way, are fairly interesting), yet still no explanation of what the graph is trying to show us.

Maybe the problem lies with how most people view statistics. Most people view statistics as a way of aggregating knowledge. It takes millions and millions of numbers and letters and otherwise seemingly unrelated things and ties them together for us to see. I disagree. Statistics is a way of parsing out all the noise in the world for us to see truth. It allows us to absorb and quantify millions and millions of occurrences of events so that we can begin to formulate an opinion about what is truth. And that is what is different. Statistics is not a reporting tool; statistics is a tool allowing us to dig and claw and reassess our knowledge of truth. And, honestly, the only subtle difference between the two is the story we glean from the statistics.

So. Statistics students of the modern age. I challenge you to be proactive with your statistics. Use your analysis and data visualization to challenge (or reaffirm) your view of the world. Use it to elucidate, to influence, to persuade others with what you see as truth. The numbers do not speak – you will have to be the orator for the single most powerful pool of knowledge in the information age.

Just remember: Statistics is a contact sport – do not be a passive onlooker.
