Ed's Ink: Questioning Data Collection Methods

For years it was reported that the instigators of bar room brawls in the UK were more likely to be killed in such fights than those defending themselves.

Sociologists, psychologists and substance abuse agencies studying this phenomenon formulated evidence-based theories which attempted to explain this rather unexpected and unusual outcome.

The Royal Statistical Society, however, was somewhat suspicious of these results and set out to check their accuracy and reliability for itself.

The statisticians (a.k.a. data analysts) eventually discovered that the data were indeed questionable, as the collection methods did not encourage accurate results.

What the statisticians discovered was that every time a bar brawl ended in a fatality or serious injury, the police officer dispatched to investigate would invariably ask: "Who started this?"

Witnesses would immediately point to the victim lying on the floor. This response was duly noted and, because the victim was in no position to dispute it, the witnesses' assertions were usually taken to be true.

Our Point?

"Garbage in, garbage out" is a catchphrase known to every IT professional. Bad basic data (the "garbage in") almost always lead to incorrect conclusions, and often to incorrect, if not harmful, policies.

In this day and age, you cannot escape ill-trained commentators quoting analyses based on questionable data collection methods and, in some instances, on data with no stated source.

The quality of the basic data must be questioned with respect to possible flaws and bias (e.g., the groups sampled, the way questions are formulated, the operational definitions used).

If the basic data are faulty, analytical methodologies of all kinds will produce meaningless results.

