4 ways to lie with statistics
Anonim

One of the most effective ways to lie is to misrepresent statistics. Knowing how the numbers are juggled can help you notice when someone is trying to trick you.

Collect data that will make your conclusions even more biased

The first step in collecting statistics is to determine what you want to analyze. Statisticians call the full set of information at this stage the population. Next, you need to define a subset of the data, a sample, that, when analyzed, should represent the entire population. The larger and more representative the sample, the more accurate the research results will be.

Of course, there are several ways to spoil a statistical sample, by accident or on purpose:

  • Selection bias. This error occurs when the people taking part in the study select themselves into a group that does not represent the entire population.
  • Convenience sampling. This occurs when readily available information is analyzed instead of collecting representative data. For example, a news channel might conduct a political survey among its own viewers. Without asking people who watch other channels (or do not watch TV at all), it cannot be said that the results of such a study reflect reality.
  • Non-response bias. This error occurs when some people do not answer the questions asked in a study, which distorts the results. For example, if a study asks, "Have you ever cheated on your spouse?", those who have cheated are less likely to answer honestly, or to answer at all. As a result, it will seem that infidelity is rare.
  • Open-access polls. Anyone can take part in such surveys, and often nobody even checks how many times the same person answered. Various Internet polls are an example. They are fun to take, but they cannot be considered objective.
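
The effect of respondents refusing to answer can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the true rate of "yes" answers (30%) and the probabilities that "yes" and "no" people agree to respond (30% and 90%) are hypothetical numbers chosen for illustration, not figures from any real study, and `simulate_nonresponse` is a name invented here.

```python
import random


def simulate_nonresponse(n=100_000, true_rate=0.30,
                         p_answer_yes=0.30, p_answer_no=0.90, seed=0):
    """Compare the true 'yes' rate with the rate seen among respondents.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # True (hidden) answer for every person in the population.
    population = [rng.random() < true_rate for _ in range(n)]
    # People whose honest answer is "yes" respond less often than the others.
    responses = [
        is_yes for is_yes in population
        if rng.random() < (p_answer_yes if is_yes else p_answer_no)
    ]
    actual = sum(population) / n
    observed = sum(responses) / len(responses)
    return actual, observed


actual, observed = simulate_nonresponse()
print(f"true rate: {actual:.3f}, rate among respondents: {observed:.3f}")
```

Under these assumptions the rate among respondents comes out well below the true rate, even though nobody lied: the distortion comes entirely from who chose to answer.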

The beauty of selection bias is that someone, somewhere, is likely to conduct an unscientific survey that will support whatever theory you have. So just search the web for the poll you want, or create your own.

Choose results that support your ideas

Because statistics are expressed in numbers, they seem to prove any idea convincingly. But statistics rely on complex mathematical calculations that, if mishandled, can lead to completely opposite conclusions.

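To demonstrate the flaws of relying on summary numbers alone, the English statistician Francis Anscombe created what is now known as Anscombe's quartet. It consists of four sets of numerical data that have nearly identical summary statistics but look completely different when plotted.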

In the figure, X1 is an ordinary scatter plot with a roughly linear trend; X2 is a curve that rises and then falls; X3 is a straight line that rises gently, with one outlier high on the Y-axis; X4 has all points at the same X value, except for one outlier located high on both axes.

For each of the graphs, the following statements are true:

  • The mean of x for each dataset is 9.
  • The mean of y for each dataset is 7.5.
  • The variance (spread) of x is 11, and the variance of y is about 4.12.
  • The correlation between variables x and y for each dataset is 0.816.

If we saw this data only as text, we would think the four situations were identical, although the graphs refute this.
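
The four datasets described above are known as Anscombe's quartet, and the claim is easy to check numerically. The sketch below uses the published values of the quartet and only the Python standard library; the helper names `pearson` and `summarize` are introduced here for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# The first three datasets share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "X1": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                  7.24, 4.26, 10.84, 4.82, 5.68]),
    "X2": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "X3": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                  6.08, 5.39, 8.15, 6.42, 5.73]),
    "X4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Summary statistics (sample variance) for one dataset."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),
        "var_y": variance(ys),
        "corr": pearson(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimal places, the four rows are indistinguishable, which is exactly why the summary numbers alone say nothing about the shape of the data.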

Therefore, Anscombe suggested that you first visualize the data, and only then draw conclusions. Of course, if you want to mislead someone, skip this step.

Create graphs that highlight the desired results

Most people don't have time to do their own statistical analysis. They expect you to show them graphs summarizing all of your research. Well-designed charts should reflect ideas that fit reality. But they can also highlight the data you want to show.

Omit some of the axis labels, slightly change the scale of an axis, leave out the context, and you can convince everyone that you are right.
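
To see how much a changed axis scale can do, consider a bar chart whose vertical axis does not start at zero. The following is a minimal sketch with made-up values (50 and 52) and a hypothetical helper, `drawn_height_ratio`, that computes how many times taller the second bar is drawn than the first.

```python
def drawn_height_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn heights of bars b and a when the axis starts at axis_start."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (b - axis_start) / (a - axis_start)


# Hypothetical values: the real difference between them is 4%.
honest = drawn_height_ratio(50, 52)                    # axis starts at 0
truncated = drawn_height_ratio(50, 52, axis_start=49)  # axis starts at 49
print(f"axis from 0: {honest:.2f}x, axis from 49: {truncated:.2f}x")
```

With the axis starting at zero, the second bar is drawn only slightly taller than the first; with the axis starting at 49, the same 4% difference is drawn as a bar three times as tall.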

Hide your sources at all costs

If you openly cite your sources, it is easy for people to verify your findings. Of course, if you are trying to wrap everyone around your finger, never tell anyone how you came to your conclusions.

Articles and studies usually cite their sources, although the original works may not be reproduced in full. What matters is that the source answers the following questions:

  • How was the data collected? Were people interviewed by phone? Stopped on the street? Or was it a Twitter poll? The method of collecting information can point to certain selection biases.
  • When was it collected? Research quickly becomes outdated and trends change, so the timing of data collection influences the conclusions.
  • Who collected it? A tobacco company's research on the safety of smoking deserves little trust.
  • Who was interviewed? This is especially important for public opinion polls. If a politician conducts a survey among those who sympathize with him, the results will not reflect the opinion of the entire population.

Now you know how to manipulate numbers and use statistics to prove almost anything. This will help you recognize lies and disprove fabricated theories.