In science and in life we use the average all the time. In science we make statements like: fat people have a higher chance of diabetes, men are taller than women, that school is better than this school. All of these statements rely on the average, which is one of the most commonly used statistics. In life we often use sentences like: I eat 2 pieces of fruit every day, I like that person, or my past year was great. We use the average to say these things too. If I ask you “how was your week?” and you respond with “good”, you have aggregated all the experiences from that week into one word. In other words, you take the average of all your experiences and label it “good”. Without knowing it, you are using the average quite often! But what is the average? When you calculate the average, you divide the sum of the values in the set by the number of values (link). But what is lost by using the average as an answer or in a statement? What are the downsides? I believe many things are lost when taking the average. In this blog I will describe what is lost, and what possible solutions for aggregation might be.
What are the downsides of the average?
By taking the average, you lose many valuable insights, such as growth and deviation. You are also discriminating when you take the average. And we say NO to discrimination, right? Below I will elaborate on these downsides.
Suppose it is Christmas Eve and your uncle asks you: “how was your year?” In your mind you start thinking: well, from January till March I was very depressed, then I got into therapy, and from April till September I was doing “meh”. I learned from the therapy and became very happy, from October until today. So on average you have to say “meh”, your year wasn’t that great. Nevertheless, you have grown a lot over the past year. The fact that this growth disappears when you take the average is quite sad. Many people, businesses and markets have to deal with this problem. People just want quick statistics that they can understand. The average does not always matter that much; often the growth of something or someone is much more beautiful and important, but it is neglected.
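To make this concrete, here is a small Python sketch. The mood scores are made up by me for illustration (1 = very depressed, 3 = “meh”, 5 = very happy), and comparing the last three months with the first three is just one simple way to express growth, not the only one.

```python
from statistics import mean

# Hypothetical mood score per month, January to December
# (1 = very depressed, 3 = "meh", 5 = very happy).
mood = [1, 1, 1, 3, 3, 3, 3, 3, 3, 5, 5, 5]

# The single number you would give your uncle.
year_average = mean(mood)

# A simple growth measure: last three months minus first three months.
growth = mean(mood[-3:]) - mean(mood[:3])

print("Average mood over the year:", year_average)
print("Change from first to last quarter:", growth)
```

The yearly average lands exactly on “meh”, while the growth figure shows a climb from the bottom of the scale to the top.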
A year later, you meet that uncle again at Christmas with the same question: “how was your year?” On average, your year was okay. Most months you were feeling okay; however, in May you lost your job and you felt awful. Nevertheless, you recovered quite quickly. In July you got a new job and you were feeling awesome, but soon after you were feeling okay again. Thus, on average, you felt okay over the past year. By taking the average, these “special” moments got lost. In science, the standard deviation is used to show how the data points spread around the average. In bar charts, a vertical line with a cap (see next figure) is often used to show the standard deviation visually. However, in pop-science articles this bar is often left out because it is too difficult for many readers. The third situation, “the best situation”, shows you exactly how many of the measured values deviated from the average, and by how much (below and above the average). However, with many data points this view looks kind of messy, so it is not often used in scientific articles.
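A small Python sketch of what the standard deviation adds. The mood scores are again made up for illustration (1 = awful, 3 = okay, 5 = awesome), and I use the population standard deviation (`pstdev`) here as an assumption, because the twelve months are the whole year rather than a sample.

```python
from statistics import mean, pstdev

# Hypothetical mood score per month (1 = awful, 3 = okay, 5 = awesome).
flat_year = [3] * 12                                   # okay every single month
eventful_year = [3, 3, 3, 3, 1, 3, 5, 3, 3, 3, 3, 3]   # awful May, awesome July

for name, year in [("flat", flat_year), ("eventful", eventful_year)]:
    # Same average for both years; only the spread tells them apart.
    print(name, "average:", mean(year), "spread:", round(pstdev(year), 2))
```

Both years have the same average, but only the eventful year has a spread above zero, which is the trace that the lost job and the new job leave behind.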
Taking the average causes more problematic situations in real life. At first sight, saying that Europe is more developed than Asia is not such a bad thing to say. Here you take the average development of Europe and compare it with the average of Asia. However, people from Japan and Singapore might be very pissed off, because their countries are very well developed, maybe even better developed than Turkey or Romania (is this offending someone?). The average does not mean that every country in Europe is better developed than every country in Asia, just that on average the European countries are more developed. Some European countries might be less developed, but those are just a few. This is a large and important topic. I am planning to write a full blog about it, so I won’t elaborate on this problem much further.
How do you take your average?
If I ask myself how many pieces of fruit I eat every day, I would say about two. But on what numbers did I base that value? Did I take my average of the past week, the past month, or even the past year? When you answer this question, it is up to you which period of time you use. You can pick whatever suits you when taking your personal average. Usually, I think, people pick the period that benefits their situation. But of course, the questioner could also be more specific. I track a lot of data, and I have learned that my averages are often very different from what I expected. When someone asks me these kinds of (average) questions, I often feel obligated to take the whole period that I have been tracking, and with fruit this is 3 years of tracking. No one else will ever take the past three years as their average. On intuition, I would say I eat two pieces of fruit a day, because on most days I eat two pieces. However, on some days I eat fewer than 2 pieces, and there are even fewer days on which I eat more than 2 pieces. This means that my average fruit intake is less than two pieces a day. I see this phenomenon in quite a few situations, with sleep duration as well.
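Here is a minimal Python sketch of how much the chosen period matters. The numbers are invented for illustration and are not my real tracking data, and `average_over_last` is a hypothetical helper name I made up for this example.

```python
from statistics import mean

# Hypothetical fruit log, oldest day first (pieces of fruit per day).
older_days = [2] * 16 + [1] * 5 + [0] * 2   # 23 days, several below two
last_week = [2, 2, 2, 2, 2, 3, 2]           # a good recent week
fruit_log = older_days + last_week          # 30 days in total


def average_over_last(log, days):
    """Average of the most recent `days` entries in the log."""
    return mean(log[-days:])


print("Past week: ", round(average_over_last(fruit_log, 7), 2))
print("Past month:", round(average_over_last(fruit_log, 30), 2))
```

With this made-up log, the past week gives an average above two and the past month gives an average below two, so the answer depends on the period you happen to pick.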
Other useful concepts: the mode and median
In science, there are other values that are more useful in these situations. These values are the mode and the median; they are similar to the average, but a little different. The median is the middle number of the whole set of data once it is sorted (which would probably be 2 in my fruit dataset). The mode is the number that occurs most often in the dataset, which would be 2 as well in my dataset (link). Our mind does not really think in datasets, and hardly anyone uses the median and the mode. So we work with averages, which are often a better method of estimation.
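A short Python sketch of the three values side by side. The fruit counts are invented to match the pattern I described (mostly two pieces, some days fewer, only a few days more); they are not my real data.

```python
from statistics import mean, median, mode

# Hypothetical 30 days of fruit intake (pieces per day).
fruit_per_day = [2] * 20 + [1] * 6 + [0] * 2 + [3] * 2

average = mean(fruit_per_day)      # sum of the values divided by their number
middle = median(fruit_per_day)     # middle value of the sorted data
most_common = mode(fruit_per_day)  # value that occurs most often

print("Average:", round(average, 2))
print("Median: ", middle)
print("Mode:   ", most_common)
```

In this made-up dataset the median and the mode both match the intuitive answer of two pieces, while the average ends up below two because the low days pull it down.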
What can we do?
You need to make countless decisions in your life. If you need to choose between two (or more) things, you calculate some kind of average in your mind and make your choice. When you are answering a quick question, you can’t always elaborate on it for hours. You need to make decisions, you need to discriminate. Otherwise, you’ll just sit there, and you won’t do anything at all. Nevertheless, when using an average you can hurt or misinform many people. There are several solutions for this problem. The first one is to stop using the average to make a statement. This is a very harsh decision, because the average is quite useful for making an argument. Therefore, when you make a statement based on an average, you should add sentences that support or mitigate it, and make clear that these additions are important. You can also decide to be less satisfied with the average you know, especially if the thing you are talking about is quite a big thing (like a race, continent, gender, etc.). If you believe that there are many sub-groups within these groups (like Japan within Asia), you might mitigate your statement, or not make a statement at all. Not making a statement is not a bad thing! People always say that using adverbs is bad and makes you look less confident, but I believe we shouldn’t be that confident most of the time. If you use words like “about”, “close to”, or “on average”, you are implicitly including a standard deviation in your sentences, which reduces the risk of making strong discriminating statements that aren’t true.
What can science do?
In science we can use methods that rely less on comparing group averages, which makes them less discriminating. Correlations are less discriminating, and techniques like multi-level analysis and mixed-model analysis are also less discriminating than t-tests, repeated measures, and other statistical tests. Nevertheless, I believe that all scientists know the meaning and the problems of the average. They are aware that it isn’t perfect. The threat lies in how scientific articles are used in popular science magazines. These magazines really take advantage of the average. Therefore, I think scientists should correct these magazines when they draw incorrect conclusions, and scientists should be careful when making statements to journalists.