231 – Unexplained Variability – IDEMS International Community Interest Company (CIC)

The IDEMS Podcast

January 27, 2026

Description

Lily talks with Roger about an agricultural experiment in West Africa that revealed the impact of termite mounds on crop yield data. The discussion focuses on handling unexplained variability and the importance of recognizing outliers. Roger explains the necessity of removing certain plots to reduce data variability and achieve clearer results.

Transcript

[00:00:06] Lily: Hello and welcome to the IDEMS Podcast. I’m Lily Clements, a Data Scientist, and I’m here today with Roger Stern, a statistician for IDEMS. He’s worked previously for ICRISAT to support analysis of data in their research.

Hi, Roger.

[00:00:19] Roger: Hello, Lily.

[00:00:21] Lily: So I believe today you’re going to tell us a story about some of your experience in the West Africa region. And there’s one in particular on termite mounds?

[00:00:31] Roger: There is. This was an interesting experiment with a researcher, he was then doing his PhD. He’s now become a professor of agriculture at the University of Kassel, and I hope he might give a podcast in the future. He’s called Andreas Bürkert.

[00:00:49] Lily: I believe that we did some work with him on the ProRuwa course.

[00:00:53] Roger: Absolutely, he was the person that led the ProRuwa project.

[00:00:57] Lily: Yes, which we’ve previously referenced on the podcast.

[00:01:01] Roger: So, in doing the experiments, we worked together on this and he was enjoying learning a bit of statistics and I was learning a lot about agriculture from him. He’s quite an exceptional scientist and his experiments were usually very sensibly designed, and he measured a lot of important things. And in the particular field that we had, I’m not sure what it was, let’s assume it had something like a dozen treatments, which were factorial, say two factors, one at three levels and one at four levels. And so there were maybe 48 plots.

And he was doing something with me that often statisticians, I think fail to do. We went off quite often to look at the actual plots in the field, and we noticed that there were a few patches, maybe two or three just, that the plants were growing rather differently. They were growing extremely well.

And when we looked more carefully, there was something like six plots that were clearly affected by previously having been termite mounds. And the termite mounds had been cleared away, the plots were cleared, but it was extremely fertile, those sections. And they cut across different plots.

And then when we looked, the yields were really much more determined by the presence or absence of termites, which wasn’t a treatment, than they were by the actual treatment factors. So the question was, with the 48 plots, what to do about it. And we found to our surprise that this was rarely considered. So I think how he reacted actually featured in one of the publications for his PhD. And what he did was relatively simple.

Namely, we were interested in the factors in the experiments. And we weren’t interested in termites, and therefore, we simply noted all the plots that were affected as being termite mounds. We now had, instead of 48, we only had 42 plots left. Of course, the experiment was now very unbalanced because we wanted to omit those six plots.

But we can still analyse them; things don’t have to be neat and balanced now for a good analysis. So we just analysed the 42 and we carefully reported that the results are only applicable to non termite areas. When you’ve got termite mounds, then all bets are off and you may have different results. So if you like, this was a way of looking at extremes and I really like the fact that you don’t look at extremes as a statistician and look for oddities in data. You look for reasons for oddities.

And when we looked at the fields, there was an obvious reason, nothing to do with the experiment. And the results were very clear. Had we left them in, we would just have added a lot of variability. We would’ve had more general results, which applied to termite or non termite, but we wouldn’t have had such clear results because the existence or not of termite was just messing up the data.

So in the interests of minimizing the unexplained variability, we confined our analysis to non termite areas of land. And anybody, any farmers who had termites would have no information on which of the treatments they should plant there. But we thought that was better.
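To make the effect on unexplained variability concrete, here is a minimal Python sketch. All the numbers are invented for illustration and are not the data from the experiment: it assumes 12 treatments with 4 replicates each, a hypothetical fertility boost on six termite plots at made-up positions, and it treats the 12 factorial combinations as a simple one-way layout. It shows the replication becoming unequal once the six plots are omitted, and compares the residual (within-treatment) variance with and without them.

```python
from collections import Counter
from statistics import mean

# Hypothetical layout: 12 treatments x 4 replicates = 48 plots.
N_TREATMENTS, N_REPS = 12, 4
REP_NOISE = [-2.0, -1.0, 1.0, 2.0]   # small, made-up plot-to-plot variation
TERMITE_BOOST = 60.0                 # made-up extra yield on old termite mounds
# Made-up positions (treatment, replicate) of the six affected plots.
TERMITE_PLOTS = {(0, 0), (0, 1), (3, 2), (4, 2), (7, 3), (8, 3)}

plots = []
for t in range(N_TREATMENTS):
    for r in range(N_REPS):
        on_mound = (t, r) in TERMITE_PLOTS
        y = 100.0 + 5.0 * t + REP_NOISE[r] + (TERMITE_BOOST if on_mound else 0.0)
        plots.append({"treatment": t, "yield": y, "termite": on_mound})


def residual_variance(rows):
    """Pooled within-treatment variance: what the treatments do not explain."""
    groups = {}
    for row in rows:
        groups.setdefault(row["treatment"], []).append(row["yield"])
    ss = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    return ss / (len(rows) - len(groups))


# Omit the termite plots; the design is now unbalanced (unequal replication).
kept = [row for row in plots if not row["termite"]]
replication = Counter(row["treatment"] for row in kept)

var_all = residual_variance(plots)
var_kept = residual_variance(kept)

print("plots kept:", len(kept))
print("replicates per treatment:", dict(sorted(replication.items())))
print("residual variance, all 48 plots:", round(var_all, 1))
print("residual variance, 42 plots:", round(var_kept, 1))
```

In this invented example the treatments no longer all have four replicates, but the residual variance is far smaller once the termite plots are set aside, which is the trade-off described above.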

[00:04:55] Lily: Yeah, absolutely. So there’s a few things that I want to pick up on this story here. Firstly, you say that it became unbalanced. So I just wanted to kind of highlight about what that means and why that is important. So could you just explain, what do you mean by the experiment became unbalanced when you removed six plots, and what could have been, why does that matter, why is that important?

[00:05:20] Roger: Well, it used to be very important because when you design experiments, especially on research fields, you’ve got the chance of designing neat experiments and there isn’t any reason why you have unequal replication and you repeat certain treatments three times and other treatments five times.

Whereas, in the real world, you often find that the replication is unequal and as things are in blocks, you might have things unequal, scattered around these blocks. And this means that just getting simple means may be misleading because the data are what we call unbalanced, and therefore you have to adjust the means.

And when you have to adjust the means for this lack of balance, the arithmetic gets a little bit more difficult. And when I started, we were still in the era where people worried about the hard arithmetic, whereas now I hope we’re past that stage and recognising that we’re never going to do the analysis ourselves, we’re going to get analysis done by some software. And therefore the fact that we have to do slightly more complicated calculations to get rid of the bias is no problem at all. So that’s what we did.

[00:06:43] Lily: Nice. Okay. I see. So you wanted it, I mean, in an ideal world, we’d have kind of everything evenly spread, you’d have 50% with your treatment and 50% not with your treatment, or say you’ve got a sloped hill, you don’t want to have some of your treatment that you’re trying to see if it results in a better yield, you don’t want to have that at one end of the hill, which might have more water because then you don’t know if the outcome is because of the location or if it’s because of the treatment itself.

[00:07:17] Roger: Yeah, I mean, taking your example, supposing we thought of the experiment in two blocks, one which is at the bottom of the hill with lots of water around and the other at the top of the hill. And let’s have six replicates of our treatments. And if we had three of them at the top and three at the bottom for each treatment, then everything is nice.

If on the other hand you had four at the top and two at the bottom of one, and four at the bottom and two at the top of the other, then you are now starting to confuse the water with the treatment. You could therefore try and adjust for this. Those adjustments are reasonably automatic and quite sensible.

So instead of taking means you get slightly adjusted means adjusting for the number that were close to the water for each treatment. And it’s so easy to do now, so it isn’t a problem. It wasn’t a problem then, but I was still working in the era when the easy analyses were for balanced experiments, and once you’ve got balanced experiments, usually you just take the arithmetic mean, just add up those and divide by the number. So you are just adjusting it a little if it’s not balanced.
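The hill example can be sketched in a few lines of Python. The yields are made up, the two treatments are assumed to have no real difference, and the adjustment shown (averaging the block means with equal weight) is just one simple form of adjusted mean, not necessarily what any particular software reports.

```python
from statistics import mean

# Hypothetical yields: the bottom of the hill (wetter) adds 10 units,
# and the two treatments have NO real difference.
TOP, BOTTOM = 20.0, 30.0
data = {
    "A": {"top": [TOP] * 4, "bottom": [BOTTOM] * 2},   # 4 at the top, 2 at the bottom
    "B": {"top": [TOP] * 2, "bottom": [BOTTOM] * 4},   # 2 at the top, 4 at the bottom
}


def raw_mean(blocks):
    """Add up all the plots and divide by the number of plots."""
    return mean(y for ys in blocks.values() for y in ys)


def adjusted_mean(blocks):
    """Average the block means, giving top and bottom equal weight."""
    return mean(mean(ys) for ys in blocks.values())


for name, blocks in data.items():
    print(name, "raw:", round(raw_mean(blocks), 2),
          "adjusted:", round(adjusted_mean(blocks), 2))
```

Here the raw means make treatment B look better only because more of its plots were near the water; the adjusted means are identical, as they should be when the treatments do not differ.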

[00:08:41] Lily: Nice. So in your example, instead of say this water, instead of having, say a sloped field and some of it near water and some of it not, you had termite mounds, some of your treatments were on termite mounds, and some of them weren’t, or even some of your control potentially was on termite mounds and some of it wasn’t, which meant you couldn’t then see: is this difference in yield that we’re seeing because of this extra fertility from the termite mounds, or is it from the treatment itself?

[00:09:12] Roger: The slight difference between the two examples is that in your water and not water I’m sort of assuming quite a lot of the treatments were in water and others were not in water. Therefore, adjusting for the water is a good idea, and you keep all your plots, but you adjust for the water.

In my case, it was much more like an outlier, it’s as though there were 48 plots and six plots were a bit more submerged in water than the other 42. Now having the analysis on the water plots doesn’t become very sensible because there weren’t enough of them. The usual case of adjustment means that we bring in that factor, if you like, of near water and away from water, and that fits nicely into our experiment.

My example is the more common one when one replicate was trampled on by elephants and we had a four replicate experiment, and then we say, oh, we can’t use that replicate. But anyway, it’s okay as a three replicate experiment. We don’t have to try and keep all four replicates. So you have that option of leaving out. And then the results don’t apply to elephant-trampled data and you just have to accept that.

[00:10:34] Lily: I see. So let’s say that you didn’t go to the field and you didn’t notice this difference, ’cause you kind of explained it in a way that sounded like you noticed this difference in the yields visually, by saying, okay, well these ones here seem to be doing a lot better, these plots here seem to be doing a lot better than those plots there, and then realising the termite mounds.

[00:10:55] Roger: And it is the fact that because these were very professionally managed fields, we did know the history. So we could now go to the farm manager and learn about the history of those. This is like in mapping that you have the map of the place and those maps showed very clearly the reasons. It was quite obvious, we didn’t actually need the maps in this particular case, but had we needed it, these results had been mapped.

And that’s the reason quite often you would do uniformity trials as a component of your experiment, as a sort of way of mapping to understand the innate fertility of different plots, and then you could adjust for that fertility. In this case, we could see it visually, it was obvious once we started looking for it. Had we not, then we could have gone back to the farm manager and found the history of this, and then that history would’ve showed.

’Cause these termite mounds can become huge and very obvious, so you do know about them when you then prepare the field for experiments.

[00:12:05] Lily: Yeah. So I imagine then, if this data was to stay, if you were to keep these termite mounds in, then you’d have huge variability within your data, which you wouldn’t be able to explain because as you say, it wasn’t from a treatment, it wasn’t from something that you had measured in your experiment. You had measured a few things, or you had, in your data as it were, had a few things that you were looking at, and whether it was a termite mound or not was not one of them.

[00:12:33] Roger: The only trouble we get in the analysis is when you get a rather pedantic supervisor that says when you’ve got 48 plots, unless you’ve got a good reason, you should analyse all 48. Let me give you a sampling example, which I found, which has similar characteristics. I’m obsessed with people measuring things about plants, and I once had the question that you’ve got a hundred plants in each plot and you want to sample 10 of them, but some plants have died. What should you do?

And what they had done was just carried on with the 10 plants at random because that’s what random numbers do, and recorded them. And of course, the plants that had died had a zero value. And I’m obsessed by the fact that zeros seem to give people problems. Instead of doing that, maybe you could learn separately about how many plants had died, and then worry about the weights of the living plants. But let’s do that in a separate podcast.
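Roger leaves the details for another episode, so the following is only one possible reading of the suggestion, written as a small Python sketch with made-up weights. It separates the two questions: what proportion of sampled plants survived, and what the living plants weighed.

```python
from statistics import mean

# Made-up weights for 10 sampled plants; 0.0 marks a plant that had died.
sampled_weights = [12.0, 0.0, 15.0, 11.0, 0.0, 14.0, 13.0, 0.0, 12.0, 14.0]

# Question 1: how many plants survived?
living = [w for w in sampled_weights if w > 0]
survival_rate = len(living) / len(sampled_weights)

# Question 2: what do the living plants weigh?
mean_living = mean(living)

# For comparison: the single mean that mixes deaths and weights together.
mean_with_zeros = mean(sampled_weights)

print("survival rate:", survival_rate)
print("mean weight of living plants:", mean_living)
print("mean weight including zeros:", mean_with_zeros)
```

The mean including zeros describes neither survival nor plant weight on its own, which is why reporting the two quantities separately can be clearer.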

[00:13:41] Lily: Sure, no, very interesting, and interesting to see how it can link in here. I guess I wanted just to steer us to the variability point of, I guess, unexplained and explained variability or accounted for and unaccounted for variability. My understanding is when it comes to statistics, when it comes to these trials, what we’re ultimately trying to do is explain or account for the variability that we see. So we want to be able to say, okay, why do we have a yield that’s different for one treatment than we have for another treatment?

[00:14:17] Roger: That’s correct.

[00:14:18] Lily: And so what then is the issue with unexplained variability or why is variability important?

[00:14:26] Roger: Well, for me it’s important because, I think for you as well, that’s our life, namely, if we don’t have variability, then we don’t need to have replication. We just measure each treatment once and we understand which one is better than another, and it’s always better.

Now, in practice we find that different plots or different people or different animals might be better because of the treatment, but they might not, that we have variability in our data. And we want to distinguish very much between the variability that we can understand because of something we’ve done and the variability we don’t understand because it’s just there. And we want to minimise the variability that we don’t understand.

[00:15:19] Lily: But you can have several plants next to each other, all with the same treatment and they can just very naturally be different. But then in your example, there’s variability that we don’t understand and we want to be able to explain because it’s not from something natural, it’s from these termite mounds.

[00:15:39] Roger: And this comes very much to the study of outliers as well. Everybody worries about outliers and the main way to cope with outliers, or the main way I like to cope with outliers, is to have a reason for why they’re outlying observations and not representative of the ordinary plots in the experiment.

The minute you have a reason you can do something about it, and that’s much better than just saying that seems rather big so it must be an outlier. So having the because is important. So the reason that my termite mounds are more like outliers than your flooded fields at the bottom, or your wetter fields at the bottom, is that sometimes the variability can be taken into account and you can still analyse all your data.

Other times it makes sense either to just omit the outlier or to analyse it separately. So for example, what we didn’t do but could have done is to have analysed the data on the termite mounds to say which treatments did well for the termite mounds.

And quite often that’s quite useful, even if it’s small, to get an indication, particularly if the termite mounds had made things behave very differently in a way that was interesting, even with a small experiment. This is where you might have thought, this is not a good experiment for termite mound data, but it might serve as a pilot. Usually outliers you have them individually and you discuss them.

We had something like six and we could discuss them together as a group. Once you have 20, then it almost becomes better to think of it as two experiments, and this becomes another factor in the experiment, so you do include it. With six, you could include it as a sort of pilot and say, I wonder if there’s anything interesting about the effects of my treatment on termite mound data?

So sometimes, and this is where whenever you get the data, you are looking for things in your objectives that you couldn’t do, and you are also looking for opportunities. For example, it might be that you had an experiment with a lot of bird damage. If you had bird damage on say, six plots, which would be like the termite mounds, then that might be the same, you just leave them out.

But you then might find that the bird damage was much more on one variety than another variety, and that would become very interesting that it’s only on a small case, but you are learning a little bit about varieties and birds and you’re taking advantage because it’s not very many years you have the opportunity of bird damage. So there’s an example where you could turn it round to your advantage.

[00:18:34] Lily: Yes, yeah, that’s a great example. So you want to understand the reason for this kind of unexplained variability, termite mounds, bird damage, or elephant stampede. But you want to be very careful before removing that data or when explaining it, you know, maybe the elephants, or maybe a pest, particularly like the taste of one crop, in which case that’s very useful data to have because that would help inform farmers and people and so forth to make decisions on which variety is better for my situation.

[00:19:09] Roger: Yes, and this answers the question for some people more generally about outliers. I often get people coming to me and say, with an outlier, should I leave it out or keep it in? And I say, if you are going to leave it out, it doesn’t mean you ignore it, it means you treat it separately. So the question is more, should I treat it separately or should I treat it with all the other plots?

And often that is better, I’m not hiding the fact I had outliers. I’m trying to understand why they were outliers and explain them separately, but if I just leave them in as a plot, I’ve got lots of unexplained variability.

And it’s amazing how much a single outlier can contribute to the variability. So often you want to leave it out and then you say, but this is where it’s so much more helpful to be able to say, I’m leaving it out not just because it was large or small, I’m leaving it out because it was large and.
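As a rough illustration of how much one outlier can contribute, here is a short Python sketch with invented yields: nine ordinary plots and one very large value. It computes the share of the total sum of squares that comes from the single outlying plot, and the variance with and without it.

```python
from statistics import mean, variance

# Made-up plot yields; the last value is the single outlying plot.
yields = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 9.0, 11.0, 10.0, 30.0]

grand_mean = mean(yields)
squared_devs = [(y - grand_mean) ** 2 for y in yields]

# Fraction of the total sum of squares due to the outlier alone.
outlier_share = squared_devs[-1] / sum(squared_devs)

var_with = variance(yields)          # sample variance, all ten plots
var_without = variance(yields[:-1])  # sample variance, outlier set aside

print("share of sum of squares from the outlier:", round(outlier_share, 2))
print("variance with the outlier:", round(var_with, 2))
print("variance without the outlier:", round(var_without, 2))
```

The arithmetic only shows the size of the effect; it does not justify leaving the plot out. As in the discussion, that decision still needs a reason beyond the value being large.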

[00:20:11] Lily: Yes, and so there’s that link there between your kind of outliers and your unexplained variability. You want to be able to explain the outliers, explain that unexplained variability so that you can then remove it. Because otherwise it could be that actually this unexplained variability, these outliers, we can’t explain. And that could be problematic because you want to be able to have as much information as you can.

[00:20:42] Roger: Yes.

[00:20:43] Lily: Very interesting and very interesting to hear how outliers link in with unexplained variability because that’s not a direct link I’ve done before in my head, I guess, outliers are in a way their own variability that you’d want to be able to explain.

[00:21:00] Roger: Yes.

[00:21:01] Lily: Is it incorrect or is it correct, but it’s because of this extreme event or so forth?

[00:21:08] Roger: The word because is really important in your analysis, this was an outlier because.

[00:21:14] Lily: Excellent. No, thank you very much Roger. So this has been really insightful. It’s really nice to get a kind of hands-on example of the real world where this happens. I think a lot of the time with experiments, you want them to be in really controlled conditions where everything is perfect. But in reality you do have birds and pests and termite mounds and water.

[00:21:42] Roger: And this is where in the more common experiments now, which are on farmers’ fields, where fields are large and there’s a lot of variability, you do want to be able to take account of the variability in the plots to understand the data and be prepared to omit plots or at least explain them.

Andreas was famous at the same time for doing something which was very difficult at that stage, and that was taking a photograph of your field. And he actually used kites with a camera on them and they were tethered to a camel or something like this, and he managed to take many photographs of the fields from above. Now taking photographs is so much easier, so it’s really very useful and important, not just to take selfies of people, but to take important photographs of plots. Then you are understanding your data so much better.

[00:22:44] Lily: Yeah, the original drone.

[00:22:46] Roger: That’s right. A camel driven drone, yes, which was an expensive camera sitting on a kite, which was flown up and was really very much harder than a drone because of the wind.

[00:22:59] Lily: Yeah. Yeah. Probably much harder to control. Thank you very much.