Damned Lies and Statistics?
Recently, I’ve started to analyse data gathered via the online questionnaire which is central to my thesis. This means having to get acquainted with a little bit more statistical analysis than I am comfortable with – i.e. pushing beyond the basics of mean, median, mode, and standard deviation, although naturally I first had to refresh my memory of those as well. Armed with a copy of IBM’s excellent SPSS program (essentially, software for undertaking both basic and complex statistical analyses) and a copy of Julie Pallant’s SPSS Survival Manual, I have begun the complicated cycle: deriving numbers from text ➞ recoding existing numbers into other numbers ➞ extrapolating from those numbers other, more illuminating numbers ➞ interpreting these new numbers and turning them back into narrative and prose. Let’s just say that adjusting my research questions into something that will conform to the mystical world of dependent and independent variables is an intriguing process. Initial tests have led me to make a number of observations, some of which I think are worth sharing, especially with other humanities/social science researchers:
- Contrary to popular misconceptions about the coolly objective, operating-manual style of science, there are, if you care to look beyond the basics, almost as many disagreements about method, applicability and interpretation when it comes to statistics as there are about whether or not god exists. Well, okay, maybe not quite as many. But you get my point.
- The reassuring tone of a beginner’s textbook is wonderful but also dangerous. Particular authors will recommend making certain assumptions and using certain techniques that other authors argue just as convincingly against. Using one over the other may appear a trivial decision, if you are even aware (as a novice) of the debate to begin with. In reality, the decision you make about which author to trust can make a huge difference to the output you end up with. An output that cuts (or seems to) right to the heart of your research.
- Debates flagged up in various books are troubling and usually glossed over – can we really charge ahead with parametric tests when data does not look very normal? To what extent is it justifiable to manipulate (i.e. alter) data so that different, more “robust” tests can be used? If I will never in a million years understand the maths behind a given procedure, how confident can I ever really be about using it?
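To make that last worry about transforming data a little more concrete, here is a minimal sketch in Python (not SPSS, and not the analysis from my thesis). The two groups of scores are entirely made up, and the helper names `skewness` and `welch_t` are my own illustrative choices. It only computes a simple skewness estimate and Welch's t statistic, with no p-values, before and after a log transform:

```python
import math
import statistics


def skewness(xs):
    """Simple moment-based skewness estimate; roughly 0 for symmetric data."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def welch_t(a, b):
    """Welch's t statistic for two independent groups (no p-value computed)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    diff = statistics.fmean(a) - statistics.fmean(b)
    return diff / math.sqrt(va / len(a) + vb / len(b))


# Made-up, right-skewed scores for two hypothetical groups of respondents.
group_a = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]
group_b = [2, 3, 4, 4, 5, 6, 7, 9, 12, 15]

# A log transform is one common way of pulling in a long right tail.
log_a = [math.log(x) for x in group_a]
log_b = [math.log(x) for x in group_b]

print("skewness of group A, raw vs log:", skewness(group_a), skewness(log_a))
print("Welch t, raw scores:", welch_t(group_a, group_b))
print("Welch t, log scores:", welch_t(log_a, log_b))
```

With these invented numbers, group A has the higher mean on the raw scale but the lower mean after the log transform, so the t statistic changes sign. That does not show that either choice is "correct"; it only illustrates how much can hang on a decision that a beginner's textbook might present as routine.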
As a result of all of this, statistics are often sloppily applied or deliberately misused: researchers proceed from all the wrong assumptions because they don’t really know what they are dealing with, or because they already know what result they want. They may also count on nobody digging very deeply, since it can be assumed that most readers skip ahead to the conclusions. Naturally, there will be differences according to academic field (very relevant for my work!) in how statistics are perceived, used and justified. Young Min Baek writes of statistics in communication studies:
Like most social scientific terms, statistical terms and their findings are academically and/or socially (re)constructed facts. Statistical methods are not given, but created and (re)constructed for specific reasons in various disciplines before the birth of the communication field. Methodological myths, such as subjectivity or neutrality, are reinforced by learning of statistics as something given, not as something constructed. Learning something established does not demand critical minds that statistics can be changed for more appropriate understanding of communication. Communication students simply learn statistics from a communication methodology course, or an introductory statistics course. Most, if not all, students rarely have an interest in how statistical terms or concepts are born and (re)constructed throughout intellectual history in diverse academics. They just learn the basic logic and its applications to the understanding of social worlds.1
A friend who knows just a little bit more about all this than me suggested:
If you want to get some excitement out of statistics, ignore classical probability theory and use quantum probabilities. Statistics could be more fun than the usual Kolmogorovian bore, if only statisticians would not be so boring themselves…
Hmm. Right. I think maybe what he means by that is that standard statistical methods do not capture the subtlety at the heart of chaotic “reality”. But I can’t be sure. Software helps us but also flatters us, letting us click buttons and tick boxes to pretend that we are in some way mathematicians. For that, I am grateful but also (as a “truth-seeker”) a little concerned. Whether I can do any more than learn the basic logic is unclear, but at least I am aware of some of these issues. I have plenty more analysis ahead of me, and I’m sure it’s going to continue being challenging, infuriating, fun, and informative. Right now though, I feel like Mulder in The X-Files – the truth is out there, but I’m not sure if I will ever be able to prove it, or even if proof is the most relevant concept… watch this space!