Description
Social impact scientist and anthropologist Lucie Hazelgrove Planel joins Roger Stern to discuss the intricate process of designing agricultural experiments. Using a real-world example involving 10 maize varieties and a field with 12 plots, they explore the challenges of fitting theoretical models to practical scenarios.
[00:00:07] Lucie: Hi, and welcome to the IDEMS podcast. My name is Lucie Hazelgrove Planel, I’m a Social Impact Scientist and anthropologist, and I’m very pleased to be here again with Roger Stern, continuing our series of conversations about research methods in agriculture.
Hi Roger.
[00:00:23] Roger: Hello, Lucie.
[00:00:25] Lucie: So we’re going to discuss today an example, which really highlights, I think, that designing an experiment isn’t always as easy as what they sort of lay out in a textbook. There’s often questions that come up and decisions to be made. I think you can perhaps introduce this better than me, though.
[00:00:44] Roger: Okay, let me try. This came up as an interview. In fact, it was an interview of statisticians, but it was introducing some very simple design questions and I wanted to know how the statisticians coped with it. So I think it becomes clearer if I give you the example.
The example relates to the idea that before the season you plan your experiment, and this was going to be a very simple experiment with 10 varieties of a crop, let’s say maize. And the experimenter assumed that he wanted four or five replicates. And he then went off to a meeting with the farm manager to discuss where he could do this experiment.
And the farm manager told him that with the usual plot size, the area would fit very well with five replicates, but it was a little wider, and so it would fit nicely with 12 plots. What would the experimenter like to do? So there are 10 varieties, but there are 12 plots going across, and there are five replicates.
And he was also informed that with the slope of the land, the obvious blocking factor was to have the blocks going across. So at the top five blocks, each of 12 plots, would be the standard experiment. And I’m assuming now that the experimenter recognised that he had 10 treatments, 10 varieties, but now he had 12 plots per replicate, if you like, or per block at least.
And one of the things is to not confuse replication and blocking. And because everything’s always in a very simple randomised block where if there are 10 varieties, there’s 10 plots, people often confuse those two topics. So I asked the statistician, and could equally have asked an experimenter, what are the options? What are the possibilities? Now you’ve got 12 plots going across rather than 10.
[00:03:06] Lucie: I think I’ve got completely confused between all of these replicates, plots, blocks and then varieties.
[00:03:13] Roger: Okay, so there are 10 varieties, which are the 10 treatments that the experimenter was thinking about before starting.
[00:03:21] Lucie: Yep.
[00:03:22] Roger: There are blocks of size 12, not 10. And there are five blocks of size 12. Think of them going across, so there are 60 plots in the experiment, so five twelves, and they’re going across, and he has 10 varieties.
It’s not quite so simple because there’s 12 plots and 12 isn’t the same as 10. So that gives him some interesting opportunities both for confusion and for doing better research. So what are those opportunities?
[00:04:00] Lucie: This is a question that I think quite a few researchers, agricultural researchers will come up with, you know, each time that they want to design a trial or set up a trial, the field is not exactly the size that their blocks evenly fit into.
[00:04:15] Roger: I had three candidates who were answering this question, and I had an agenda that I hoped they would fit in with. I want to tell you the person that didn’t do so well, and he said that’s not possible. If you’ve got 10 varieties, then the block should be of size 10, and therefore that’s all you can have.
And I said, but the blocks are of size 12. What do you think? And he said, it’s not possible. You can’t have that. So he wasn’t ready to give much advice to the experimenter if he said that’s not possible, because that’s what he’d been offered. And so I then said to him, well, would you like me to give you one possible solution? And he said, oh, yes, please.
And I said, well, one thing is you could ignore the last two plots in each block and still stick with your 10 plots. You don’t have to use the whole field. Oh, he said, yes, you are right. And I said, can you suggest anything else? And he said, no, that’s what people should do. So he was therefore giving his recommendation, having been pushed into it, and I felt there were a lot more options that people have. So let’s think what else you could do.
I then tried to push him a bit further. I said, well, often breeders, they thought there were 10 varieties, but if you told them there were 12 plots, they say, oh, that’s nice, I can add another two varieties that I didn’t think I was able to. So another possibility is to add two more varieties and therefore do a better experiment because you are looking at 12 varieties rather than 10.
And he then said, oh yes, that’s good. So I said, which one would you do? He was very confused and I was convinced that he was not a good statistician to help experimenters, but also for the fact that he only wanted one solution.
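As an illustrative sketch of the two options mentioned so far (the function name, variety labels and seed are assumptions made here, not anything from the conversation), both can be randomised in the same way. Each block is a simple randomised block: either the 10 varieties with the last two plots left unused, or 12 varieties filling the block.

```python
import random

def randomised_block_layout(treatments, n_blocks=5, plots_per_block=12, seed=1):
    """Return one list per block; plots left unused are marked None."""
    if len(treatments) > plots_per_block:
        raise ValueError("more treatments than plots in a block")
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # a fresh randomisation within each block
        # Pad with None for any plots left empty at the end of the block.
        order += [None] * (plots_per_block - len(order))
        layout.append(order)
    return layout

ten = [f"V{i}" for i in range(1, 11)]      # hypothetical variety labels
twelve = [f"V{i}" for i in range(1, 13)]

option_a = randomised_block_layout(ten)     # ignore the last two plots
option_b = randomised_block_layout(twelve)  # add two more varieties

for block in option_a:
    print(block)
```

In both cases every variety appears once per block, so blocks and replicates coincide: five of each.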
[00:06:01] Lucie: It sounds like he was really sort of restricting himself to only what he was perhaps taught or only what he saw in books. This is the ideal situation, this is the easiest situation.
[00:06:12] Roger: Yes. Even though I’d given him two options, which link with the books. Namely, ignoring two plots means it’s a simple randomised block experiment, which he was familiar with. Adding two more varieties means it’s a simple randomised block experiment. And the fact that he found even these two alternatives confusing meant that he certainly couldn’t have the job. However, I felt there were many other possibilities. And I was keen to see, within a few minutes, what other people would consider.
Let me give you one more. Often when you are doing a trial, there may be a control, which is really rather important to you. And you now think, oh, with 12 plots, I could repeat the control on three of the plots in each block, so I’ll have nine of the treatments, which are, if you like, my test treatments, which is one each in each block, and I’ll have the control three times in the block. That would seem to be worth considering.
Or maybe there are two controls and you have each of those two controls twice in the block, so you stick with your 10 varieties. Now I was interested in that solution because, as I say, many people seem to confuse blocking and replication. Take the case where you have, let’s say, the control repeated three times in each block, and then you ask, how many replicates are there?
Now you have to say, do you mean of the control? In which case with five blocks, there’s 15 replicates. Or do you mean the other nine varieties? In which case there’s five replicates. So the replication isn’t equal. And that is quite acceptable. Another possibility is that you have five replicates of most of the varieties. Another possibility would be to repeat two of the varieties in the first block and two other varieties in the second block.
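The replicate counts for the two control layouts can be checked with a small sketch. The treatment labels ("T1", "Control", "C1", "C2") and the helper names are assumptions made for illustration only.

```python
import random
from collections import Counter

def blocks_with_controls(test_varieties, controls, copies, n_blocks=5, seed=2):
    """Each block holds every test variety once and each control `copies` times."""
    rng = random.Random(seed)
    design = []
    for _ in range(n_blocks):
        plots = list(test_varieties) + [c for c in controls for _ in range(copies)]
        rng.shuffle(plots)  # randomise plot order within the block
        design.append(plots)
    return design

def replicate_counts(design):
    """Count how many plots each treatment occupies over the whole experiment."""
    return Counter(v for block in design for v in block)

# Nine test varieties plus one control three times per block: 9 + 3 = 12 plots.
tests9 = [f"T{i}" for i in range(1, 10)]
one_control = blocks_with_controls(tests9, ["Control"], copies=3)
reps_one = replicate_counts(one_control)

# Eight test varieties plus two controls twice each per block: 8 + 4 = 12 plots.
tests8 = [f"T{i}" for i in range(1, 9)]
two_controls = blocks_with_controls(tests8, ["C1", "C2"], copies=2)
reps_two = replicate_counts(two_controls)

print(reps_one["Control"], reps_one["T1"])  # 15 5
print(reps_two["C1"], reps_two["T1"])       # 10 5
```

There are five blocks in both layouts, yet the number of replicates depends on which treatment you ask about.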
[00:08:10] Lucie: So then you can understand a bit more, because I think you mentioned that the field was a bit on the slope. And so if you are repeating two of the varieties within the same block, then you can understand, well, is it really the slope that’s affecting it, is it the block that’s affecting it?
[00:08:25] Roger: That’s right, and when I say how many replicates do you have, well, actually, you’ve added another replicate. You’ve used those two extra plots to effectively add another replicate. So although you’ve got five blocks, each variety is repeated six times.
[00:08:40] Lucie: Yeah.
[00:08:41] Roger: Now, you have actually done something which people get rather frightened about. A randomised block is very balanced because every variety appears once and only once in each block. Whereas now, two of the varieties are repeated in the first block and two different varieties are repeated in the second block. And therefore you have actually six replicates.
So now that option is totally destroying the myth that blocks equal replicates. You’ve got six replicates and five blocks. I was hoping the statisticians would suggest that’s quite interesting. And the fact it’s not quite so neat and balanced is no longer a problem because you’re going to have a computer to be able to do the analysis and they can cope with this very well indeed. That’s yet another option.
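One way to build such a layout, sketched under the assumption that the 10 varieties are split at random into five distinct pairs and each block gets one pair as its two extra plots (the function name and labels are hypothetical):

```python
import random
from collections import Counter

def six_reps_in_five_blocks(varieties, seed=3):
    """Five blocks of 12 plots: every variety once per block, plus a
    different pair of varieties repeated in each block."""
    if len(varieties) != 10:
        raise ValueError("this sketch assumes exactly 10 varieties")
    rng = random.Random(seed)
    extras = list(varieties)
    rng.shuffle(extras)  # random choice of which pair is repeated in which block
    layout = []
    for b in range(5):
        pair = extras[2 * b: 2 * b + 2]
        plots = list(varieties) + pair  # 10 + 2 = 12 plots
        rng.shuffle(plots)              # randomise positions within the block
        layout.append(plots)
    return layout

varieties = [f"V{i}" for i in range(1, 11)]
layout = six_reps_in_five_blocks(varieties)
replicates = Counter(v for block in layout for v in block)

print(len(layout), set(replicates.values()))  # 5 {6}
```

The printout confirms five blocks and six replicates of every variety, using all 60 plots.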
I also would like them at least to query whether the experimenter would benefit from having the plots a bit bigger than they normally have. So actually, although it’s 12 normal plots, it’s 10 slightly larger plots, or the extra space in these larger plots gives more space for guard rows, and that’s quite useful if some of the varieties are very tall and others are rather short, so that you don’t have shading problems.
So I felt that what I wanted was for the advice to show lots of options, and to show that, instead of the field having 12 plots being a complication, it was actually adding many opportunities. More dramatic would be to question the design itself. With your 12 plots, a lot of studies don’t have just one factor, they have two factors. Would you like instead to consider six varieties at each of two fertilizer levels? So each variety has a rather more fertile plot or something with some fertilizer added or weeding done in a different way, to see whether you add another factor.
And so I find that quite a lot of the time variety trials don’t look at anything else. In fact, they have to look at everything else because if you take weeding, they have to select a certain weeding scenario for all the plots, and so their results, if weeding is very important, then their results only apply strictly to that sort of weeding scenario. Would you like to consider weeding as a treatment and maybe at two levels of not weeded so much and weeded rather more and have fewer varieties?
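The two-factor option can be sketched as follows, assuming six varieties crossed with two levels of a second factor. The level labels "low" and "high" are placeholders; they could equally stand for two weeding regimes.

```python
import itertools
import random

def factorial_layout(varieties, levels, n_blocks=5, seed=4):
    """Randomise every variety-by-level combination within each block."""
    combos = list(itertools.product(varieties, levels))
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = combos[:]   # copy so each block gets its own randomisation
        rng.shuffle(order)
        layout.append(order)
    return layout

six = [f"V{i}" for i in range(1, 7)]  # hypothetical variety labels
levels = ["low", "high"]              # hypothetical levels of the second factor

factorial = factorial_layout(six, levels)
print(len(factorial[0]))  # 12
```

Six varieties at two levels give 12 treatment combinations, which fills each block of 12 plots exactly, so this is again a randomised block design with five replicates of each combination.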
So I just wanted the statistician to say, I could make lots of suggestions and then it’s not up to me to decide, it’s the experimenter to decide given the priorities and the actual research and the objectives of the research. That was the puzzle I was giving, and that was what I was hoping for.
[00:11:40] Lucie: And I think you did have some candidates who were able to see the value in this extra space in the field, who saw it as extra space which they could do something with, I think.
[00:11:50] Roger: The first candidate is the one I described to you, and I thought, what is going on here? But the other two candidates were just perfect. One of them eventually took the post and was very good and very useful, and the other one could have taken the post equally, and I was very relieved. They answered the question perfectly within about two minutes and gave a lot of alternatives, but also said what’s most important to me: the statistician shouldn’t be telling the researcher what to do, the statistician should supply some options as to what the possibilities are, and then let the researcher decide, depending on the objectives and the priorities.
The statistician is there to open the mind, if you like, and also often, I find, to reassure the person doing the experiment that if the experiment doesn’t happen to be quite a neat, simple randomised block, he’s still going to be there, and the analysis is going to be equally easy.
[00:12:59] Lucie: Exactly. And I think this is a really important point, that, just because you know, what you’ve been taught perhaps in your studies is that this is the sort of ideal design, it doesn’t mean that it is the only one possible, and it doesn’t mean it is the best one, because as you’ve been giving examples, there’s many different opportunities, I think you said, it gives and takes, you know, each sort of different possibility comes with some benefits and some disadvantages. And so as a researcher, you always need to be thinking through, well, which decision will I take, but being clear about why too, I think.
[00:13:32] Roger: That’s right. And I think a corollary to this, which is why I enjoyed the question, was that many researchers think that statisticians can help tremendously on design. And often the statisticians feed this almost misconception by, if there’s a problem with data, they say, why didn’t you come to me earlier?
[00:13:54] Lucie: Yes.
[00:13:55] Roger: And you could have come at the design stage. However, very few statisticians are willing to admit that their training is very much more on the analysis of designed experiments rather than emphasizing the concepts of design. So this is a question to them very much on the concepts of how to design an experiment.
[00:14:18] Lucie: Absolutely. And that’s making me think though, that, you know, as a researcher, if someone isn’t that sure, or has this sort of situation, then one of the best things to do is just to discuss it with somebody else, whether that’s the farmer that’s going to be involved in the trial, what do they want to do with that added space, whether that is with a statistician, as in your example, or whether it’s with other researchers who perhaps have other experiences.
[00:14:42] Roger: That’s right. However, I’d add one of the things I feel very strongly about, which is that people spend a long time on the design. There’s a tremendous amount of time spent on the data collection. There’s very much less time spent on the analysis. Could we get a bit more balance between the time spent on the data collection in particular and the time spent on the analysis and reporting? I think that would help research a lot.
[00:15:10] Lucie: That’s an interesting point. Okay. Well thank you very much Roger, for an interesting story and a word of warning for statisticians who are interviewed by you.
[00:15:17] Roger: Exactly.
[00:15:20] Lucie: There’ll be trick questions!
Thank you very much, Roger.

