Description
In this episode, Lily and David discuss the concept of proxy variables and feedback loops. They explore the use of proxies when direct measurement is impractical, using examples from agriculture and education. The discussion includes the pitfalls of educational performance metrics and university rankings, demonstrating the complexities and potential misinterpretations involved in using proxy indicators.
Lily: [00:00:00] Hello and welcome to the IDEMS podcast. I’m Lily Clements, a Data Scientist, and I’m here with David Stern, a founding director of IDEMS. Hi David.
David: Hi, what are we discussing today?
Lily: I thought today we could discuss proxy variables and feedback loops.
David: Sounds good. Where’s this come from?
Lily: Good question. It’s come from two places, which I guess are linked. So firstly, we’ve been writing this MMI course, the Masters in Mathematical Innovation course. And there’s some bits in there on using data responsibly and on AI and whatnot. And then that led me to the book by Cathy O’Neil, Weapons of Math Destruction, which I’ve then been reading and that really digs in quite heavily on proxy variables and then, hand in hand with that, feedback loops.
David: Absolutely. And I think we should dig into what is a proxy variable.
Lily: So I guess my definition of a proxy would be that if you’re trying to measure [00:01:00] something and you can’t measure it directly, you can’t observe it directly, then you might use something else which is correlated with it as your kind of stand in measurement.
David: Yeah, absolutely. I mean, my favorite example of this comes from some work in West Africa where we had researchers working with lots of farmers and they were struggling to get the weight of the millet yield. When you have a small plot of millet and you wanted to harvest it and then know how much yield you had from that plot reliably, they were really struggling to get reliable measures of the grain weight.
And they found that if they measured the weight of the millet heads, the seed heads where the seeds are on, like a cob of maize. If they measured the weight of the heads instead of the seeds, it was much more reliable. Partly ’cause you could [00:02:00] do it at harvest time. So when you harvested, you knew exactly where you were harvesting, what you were harvesting, you could then do the weighing.
And if you were wanting the seeds, you had to wait until you’re taking the seeds off the heads. And so you had to go back at another date, which was not harvest time to get that weight. And between one point and the other point, someone might have eaten some, it might have got lost, it might have got damaged, pests might have got to it. All sorts of things might have happened in the intermediary time between harvest and when you could take the measurement of the grain.
And what we were worried about, of course, was that there could be quite a big variation in head weight, which doesn’t correspond to the same variation in grain weight. But we were able to take some reliable data that we had and check, and we found that the correlation was extremely high, much higher than the variability we were getting in the yield data.
So the correlation meant that the [00:03:00] head data was actually a more reliable estimator of the grain yield than the data that we had from measuring the grain.
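A minimal sketch of David's point, using simulated numbers rather than the West African data. It assumes a hypothetical setup in which head weight is grain weight plus a fairly stable non-grain part, while the delayed grain measurement loses a random fraction between harvest and threshing. The loss range, the chaff proportion, and the function names `pearson` and `simulate_plots` are all illustrative assumptions, not anything taken from the study.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_plots(n_plots=500, seed=1):
    """Simulate plots with invented numbers (assumption, not real data)."""
    rng = random.Random(seed)
    true_grain, head_weight, delayed_grain = [], [], []
    for _ in range(n_plots):
        grain = rng.uniform(2.0, 6.0)              # true grain weight at harvest (kg)
        chaff = 0.25 * grain + rng.gauss(0, 0.05)  # non-grain part of the head
        loss = rng.uniform(0.0, 0.5)               # fraction eaten, lost or damaged later
        true_grain.append(grain)
        head_weight.append(grain + chaff)          # proxy, weighed at harvest
        delayed_grain.append(grain * (1 - loss))   # "direct" measure, weighed later
    return true_grain, head_weight, delayed_grain


true_grain, head_weight, delayed_grain = simulate_plots()
r_proxy = pearson(head_weight, true_grain)
r_direct = pearson(delayed_grain, true_grain)
print(f"head weight vs true grain:   r = {r_proxy:.3f}")
print(f"delayed grain vs true grain: r = {r_direct:.3f}")
```

Under these assumptions the proxy tracks the true grain weight more closely than the delayed direct measurement does. If the non-grain part varied a lot relative to the grain, the comparison could easily go the other way.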
Lily: Wow.
David: And it’s cheaper to collect that data.
Lily: Yeah. And I don’t think I’ve heard a case before where a proxy is a better estimator of the thing you want to actually observe.
David: Yeah. It’s a wonderful example where the practicalities of the real world context do come into play. The fact that if you leave a period of time, some of it might get eaten by humans or pests or whatever, then you’ve got a whole headache to worry about conservation.
Whereas if you can measure it at the point of harvest, that just simplifies the whole process and you get much more reliability. Anyway. It’s a beautiful example of a proxy variable, which is really useful.
Lily: Yeah, proxy done right.
And, I guess, I’m seeing a lot of examples in this book and just a lot of examples in general of proxy variables where it’s done wrong, [00:04:00] ’cause it’s getting me to ask myself the question or say to myself, okay, what else could this variable be a proxy for?
David: Yeah.
Lily: I suppose with these millet heads, what else could that have been a proxy for, if not necessarily the yield itself? I’m not sure.
David: Well, I suppose the worry we had when we first started looking at this is if you think about the head itself it has two components. It has the grains which the humans like to eat, and then what’s left you tend to give to the animals. And so it is genuinely two different pieces, and you could find that the variability of the amount and the weight of what you give to the animals is so great compared to the grain yield that it wouldn’t have been a great proxy for the grain yield. We just happened to find in that context, in that sort of study with that type of millet, it was a very good proxy.
Now whether this would’ve been true if you were [00:05:00] doing a variety trial, that’s a whole different question because maybe different varieties, you need to do different calculations, because the proportion of grain to animal feed is different.
Lily: I see. Yes. Yeah.
David: And this is where if your research focuses on the proxy, you could then create a variety of millet which seems wonderful but has no grains in it. Because if you’re just measuring against the head and increasing head weight, then it might not be that you are getting more grains. So if you use this as your target, so what are you using this for? What are you using the product for is really important.
Lily: Interesting. Because I guess a lot of examples I’m seeing is where what you are using the proxy for, it could be correlated, but it could also be correlated to other things, or is it that, I mean, it’s not a lack of causation because in your instance there, it’s not that a larger head causes more, [00:06:00] it’s something else that’s causing the larger head, and the more grains.
David: Yeah. It’s not necessarily about there being a causal link between your proxy and the thing you want to measure. This again, depends on what you are using that information for and how you’re using the proxy. These things are very delicate.
Lily: Okay. Okay, so then let’s dig into an example then on, I guess in education on one that I quite enjoyed reading about, which is on these performance measures for schools. So a way that you can compare schools, you’re looking at performance measurements and you can’t really get just a general idea of how good an education is or how good a school is, so you might look at grades. But then grades could very much be correlated to many other things.
David: Not just that they could be correlated to many other things, but suppose you are using grades, and grades are a sensible proxy for school quality. Then the key is that [00:07:00] schools who want to get chosen on that basis shouldn’t accept weak students.
Lily: Yes.
David: If all schools were trying to use grades as the way to determine how good they were, then you should get rid of all weak students who are going to get the bad grades and make sure that they don’t go to school.
Lily: Or if you have a student that falls ill don’t worry about them for a bit. Focus on those students who are gonna be at their best, the ones that can concentrate.
David: Or you make sure the students who fall ill are not part of your school.
Lily: Yes. Yes, you are ill, you need to leave. You might have students then who can’t concentrate for other reasons. Maybe they’re hungry and stuff. And so we ignore them. We get them to leave the school.
David: We’re being of course extreme and exaggerating the issues. But if, let’s say you associate funding to quality of the school and the quality of the school is judged by grades. These [00:08:00] are some of the things where, actually, those proxies do have real world implications.
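A small sketch of the exaggerated scenario above, with invented grades for a single hypothetical school. Nothing about the teaching changes. The only thing that changes is which students are counted. The helper names `mean_grade` and `drop_weakest` are illustrative.

```python
import statistics


def mean_grade(grades):
    """Average grade, standing in for the school's quality indicator."""
    return statistics.fmean(grades)


def drop_weakest(grades, fraction):
    """Return the grades left after removing the lowest `fraction` of students."""
    keep_from = int(len(grades) * fraction)
    return sorted(grades)[keep_from:]


# Invented grades for ten students at one hypothetical school.
grades = [35, 42, 48, 55, 58, 61, 64, 70, 77, 90]

before = mean_grade(grades)
after = mean_grade(drop_weakest(grades, 0.3))

print(f"average grade with every student counted: {before:.1f}")
print(f"average grade without the weakest 30%:    {after:.1f}")
```

The indicator goes up although no student has learned anything more, which is the sense in which the proxy has real world implications once funding is attached to it.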
Lily: Yeah. And I guess that’s where it might be just the funding is from something directly, but also from something else, like alumni being more successful. If you’ve got a school that was more successful, then you might get funding from alumni after they leave.
David: But that’s a reality of life. This is something where there’s many schools, universities that are built on that model. By giving a successful education, your alumni value the education they’ve got and they reinvest in the institution that gave them that education.
Lily: I see. So you can’t fault the alumni for that, but I guess if you are a government or if you are an organisation deciding who to give funding to and you’re going to give it to a school that’s performing better, that’s where it can go badly.
David: [00:09:00] Well, I’m not saying it necessarily goes badly. Grades are as good an indicator as there is in many ways for certain education systems. But the question of trying to optimise then to that is a very difficult one. If you’ve got a school which is taking in students who are struggling or who have learning difficulties, that does not mean that they’re an underperforming school or a poorly performing school.
So you’d need to have another indicator to be able to judge their performance because judging them on the metric, or on the performance of their students, the grades of their students, is not going to give a fair reflection of the work done in that school. That proxy is not serving that purpose.
Lily: But then, as we add in more of these variables, so we’re now adding in more proxies.
David: Well, we haven’t yet added a second proxy for quality. So if what you are wanting is the quality of the school, [00:10:00] we’ve got one proxy at the moment, which is the grades of the students, which can give you a measure of the quality of the school. And we can see that for some schools this could be a sensible indicator because it’s correlated with indicators of success in future life. You get good grades, you have more chance of succeeding. You can see why this would be correlated with the quality of the school.
But you can also take very good examples of schools such as those that focus on students with learning difficulties or on underperforming students. Such schools could be performing extremely well and not be seen as doing well on the indicator of student grades.
Lily: So you kind of have your grade and your expected grade and you look at how the school is performing around that kind of expected grade.
David: Now you’re starting to try and create a more complicated [00:11:00] indicator.
Lily: Well, yes, and then I’ve also read other examples of where that’s gone wrong for the schools that are performing better. The second that their students struggle with something else, like maybe a personal thing that happens in their life, which then means that they drop a grade, they are down on their expected grade, and so that teacher looks like they’ve performed worse.
Or the teacher in the year before them has gone, okay, well, I’ll really help my students out to get them to hit these standards. And then the next year they’ve gone to another teacher who’s not basically cheated for them, say, or not fudged the numbers. And they suddenly look like they’re underperforming.
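A rough sketch of the expected-grade idea and the second failure Lily describes. It assumes, purely for illustration, that a student's expected grade is simply the grade reported the year before, and that the indicator is the mean of actual minus expected. Real value-added models are more involved; the numbers and the name `value_added` are hypothetical.

```python
import statistics


def value_added(actual, expected):
    """Mean gap between the grades students got and the grades expected of them."""
    return statistics.fmean([a - e for a, e in zip(actual, expected)])


# Invented numbers: three students, each genuinely improving by 2 points in year 2.
true_year1 = [50, 60, 70]
true_year2 = [g + 2 for g in true_year1]

# Case 1: the year-1 teacher reports honest grades, which become the expectation.
honest_score = value_added(true_year2, expected=true_year1)

# Case 2: the year-1 teacher inflates every reported grade by 10 points.
inflated_year1 = [g + 10 for g in true_year1]
inflated_score = value_added(true_year2, expected=inflated_year1)

print(f"year-2 value added, honest baseline:   {honest_score:+.1f}")
print(f"year-2 value added, inflated baseline: {inflated_score:+.1f}")
```

The year-2 teacher does exactly the same work in both cases, yet the indicator is positive in one and negative in the other, because it depends on a baseline that someone else had an incentive to push up.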
David: Well, I mean, it is not even just about fudging the numbers, but it is about being too indicator related. If you start having indicators that people are pushing towards, then there are all sorts of knock on effects. I must have told you my favorite example of this from Tanzania. One of [00:12:00] the qualities they had, one of the indicators they had for quality of schools there for a long time from a lot of UN projects and so on, was staff student ratios.
Lily: Okay.
David: And so Tanzania did extremely well at bringing down staff student ratios in schools by getting more teachers and by employing more teachers.
Lily: Okay.
David: It built a whole infrastructure where they actually found it was cheaper and there were better ways of developing arts teachers than science teachers. And so by the time I got involved in trying to help with the math education there, there was a six to one imbalance of arts teachers to science teachers. And the whole infrastructure was set up, they’d managed to bring down staff student ratios brilliantly.
So your arts teachers only had four hours teaching a week, whereas your science teachers had 24 hours teaching a week. And of course this creates a system whereby your [00:13:00] arts teachers have lots of time to give to the students and to be generous with their time and encourage them to pursue the arts. And your science teachers are run off their feet. And therefore many students feel why would I want to study science? Arts is obviously, you can see, it’s really enjoyable, it’s much better, you don’t have to work as hard as the sort of science teachers. This seems a much better route.
And so now there’s a whole infrastructure from secondary school on, where this imbalance of arts to sciences is being reinforced. And I’m not saying that you shouldn’t have an imbalance between arts and sciences, and I think there’s a lot of value in the arts and I think there’s all sorts of benefits which have come from this, but there’s a real shortage of scientists in Tanzania because of this imbalance.
And it means that it’s very difficult to address this balance because you don’t have people choosing subjects now to really change that. That was a misinterpretation of an indicator. [00:14:00] Your indicator of school success was staff student ratio. So they pursued that and they created a different imbalance.
Lily: Interesting. Yes. And I suppose then, we talk a lot about transparency, or maybe we don’t talk a lot about it, but I think a lot about transparency in that we want transparency in models. But then when we come to something like this, them having that transparency of, okay, we’re measuring you on, I mean, it’s good that they know that they’re being measured on their student to staff ratio, but then that’s where things could get misinterpreted.
David: Well, it’s not just misinterpreted. There’s all sorts of ways in which this has been shown, this is your feedback loops in different ways, that knowing what you are being measured against means that you can focus on achieving those metrics at the expense of maybe what is best as a whole.
Lily: I see.
David: And so decisions might have been made differently, [00:15:00] you might have made different decisions if there wasn’t that metric or that indicator of success.
Lily: Yes. If you know what you’re being measured on, then your focus goes off that, in this case, off that education or off the actual standard of education and onto hitting those targets.
David: And this comes back to your proxy. If your indicator is a proxy for educational quality, then focusing people’s attention on that indicator might make it a worse proxy, because actually improving with relation to that indicator might mean that it is less of a proxy for educational quality.
Lily: Yes, because now that shows a different educational quality. That’s interesting.
David: Well, because the efforts people are putting into this aren’t necessarily efforts to improve educational quality, they’re [00:16:00] efforts to improve that indicator. And improving that indicator might have been correlated to educational quality, but maybe it isn’t in the future when everybody’s trying to meet that indicator and that indicator now is no longer a differentiator between educational quality.
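A toy simulation of this effect, under loudly stated assumptions: each hypothetical school has an underlying quality, the indicator starts out as quality plus a little noise, and once the indicator becomes a target every school closes a random share of its gap to the target by means unrelated to quality. The ranges, the `target` value, and the names `pearson` and `simulate` are invented for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_schools=1000, target=100.0, seed=7):
    """Indicator before and after schools start chasing it (invented model)."""
    rng = random.Random(seed)
    quality, before, after = [], [], []
    for _ in range(n_schools):
        q = rng.uniform(40.0, 80.0)        # underlying educational quality
        score = q + rng.gauss(0, 3.0)      # indicator while nobody is targeting it
        effort = rng.uniform(0.5, 1.0)     # share of the gap closed by chasing the metric
        quality.append(q)
        before.append(score)
        after.append(score + effort * (target - score))
    return quality, before, after


quality, before, after = simulate()
r_before = pearson(before, quality)
r_after = pearson(after, quality)
spread_before = statistics.stdev(before)
spread_after = statistics.stdev(after)

print(f"correlation with quality before targeting: {r_before:.2f}")
print(f"correlation with quality after targeting:  {r_after:.2f}")
print(f"spread of indicator before / after: {spread_before:.1f} / {spread_after:.1f}")
```

In this sketch the indicator both correlates less with quality and bunches up near the target once everyone is chasing it, so it differentiates less between schools. How strong the effect is depends entirely on the assumed gaming behaviour.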
Lily: And then an example that they went into in this book was with universities. It’s quite American based, but, you know, American university rankings and how that originated, and then it goes into these different, essentially, proxies, I think it was about 18 at the start. I don’t know if it still is. These different proxies that were used to decide if a university is good or not, and there’s a lot to say on that. In a way, one of them is like perhaps some reverse engineering, ’cause they already know which ones are the best universities. So they then already decide which proxies to put in anyway.
And so they don’t put in things like the fees because that would make these top tier universities look worse. And so now the universities could charge whatever they want. And they can keep [00:17:00] getting top tier facilities so that they keep fulfilling these models. But what I found really interesting was that, going further, administration staff say, okay, we know who they want to accept into the university because that will fulfill their model. And so now they’re playing the game.
And then even beyond that, the students applying, students from certain backgrounds, can go and get help with writing their CVs, because the people helping them know what the administration staff want, who in turn know what the models that decide how they are ranked want. And so it becomes this whole system.
David: Yeah.
Lily: It initially just started as, okay, let’s rank some universities. I think that they phrase it really well in the book, like, a second tier news magazine.
David: The thing which I love about this as an example is it is really high stakes, these university rankings. People will literally just choose to go to the highest ranked [00:18:00] university they can get into. That would be the sort of decision criteria that some people will make. So from a university perspective, from all sorts of perspectives, this is really high stakes.
Lily: Yeah.
David: But the nature of the indicators is that there’s no way to do this which is totally unbiased. It is always going to be from a perception of value in one way or another, or what you choose to value. And I do like the example of fees, of whether or not that is value for money.
These are always going to be questions where if you were to now look at different sets of indicators, different rankings that would come out, maybe you could actually get a more interesting set of decision making. This is where multiple rankings may [00:19:00] give you better results than single rankings, but they give you confusion.
And now from an institutional perspective, which ranking do you work towards? Which is the ranking you prioritise? And so on. It’s a wonderful example because a lot of the negative consequences of that ranking are very easy to understand. But it is also easy to understand how it’s really useful for students to be able to make that decision.
A very interesting example I heard just recently was of a very strong African student who is now a professor at a top university in the US. When they were choosing which institution to go to in the US, they didn’t know about the ranking system, they didn’t know how to choose. And so the indicator they used to choose meant that the institution they turned down was a much, much better institution, a much higher [00:20:00] ranked, much more highly regarded institution than the institution they chose to go to.
Now they did that simply ’cause they were ill-informed, but they’ve been extremely successful. And this is interesting ’cause this ties in with another piece of research which has shown that in the American system, you are not necessarily better off by going to the best institution you can get into. There are advantages to being amongst the top students in your institution, and if you go to an institution where you’re more likely to be amongst the top students, you could get benefits.
And in particular for the science subjects, they showed that actually strong students at weaker institutions were more likely to stay in science than weak students at stronger institutions, even though your weak students at the stronger institutions were stronger than your strong students at the weak institutions.
Lily: Yeah, I can believe that. I just have a side note [00:21:00] story of when I was in year 10 and I was put in a maths set too low for whatever reason. And I was very upset, ’cause maths, I loved maths. Oh, but I had a great time in that set and that’s where a lot of my confidence came in.
David: Yeah, it builds your confidence. Whereas if you’re in a different set, you might have found you just wouldn’t have built the same confidence, you wouldn’t have had that same experience. This is the thing. There’s no way of knowing per se what the right environment is going to be for you at any given point in time.
Lily: And I guess there’s a lot to say and we’re out of time. But I guess one of the things is giving that information of what’s going into the model would be useful because then you could decide for yourself what is important, i.e. is it your student teacher ratio? Is it the… well funding amount doesn’t even come into the model.
I don’t know, I’m trying to think of how do we make a system? Because I understand the usefulness of ranking and the [00:22:00] usefulness of having kind of vague ideas. But it’s hard to summarize one whole university or one whole school into a number.
David: This is the thing, this is your sort of proxy for university quality. Any proxy you take, even if it’s a composite indicator of multiple things, is still not recognizing, for example, that university quality, if you are a student wanting to go to that university, every student would have a different, if you want, measure of how good that university would be for them. And there’s no measurement which can give that. That’s totally unmeasurable.
Lily: And I suppose having these kind of single summaries then means that the universities are shooting for this kind of same goal, and then you lose a bit of that diversity. And students need that diversity for where they [00:23:00] study.
David: We’re getting sucked into that example because it’s such an interesting example and such a good one. But in general, the point which is being made is when you have big, multidimensional, complex problems, whatever proxies you are taking, you are by definition reducing the complexity down to the proxy.
Lily: Okay. Yeah.
David: And the point is that actually this is when that can give feedback loops, which mean that then the systems get set up around the proxies rather than around the actual goals that you have. And that’s the same phenomenon that’s been observed with respect to university rankings, with respect to staff student ratios in schools, with respect to all these other things.
They are instances of where the indicator, the proxy for let’s say institutional quality, has led to a [00:24:00] feedback loop where people look to meet the proxy rather than focusing on doing the hard work that it takes to build institutional quality. And that’s a really interesting problem more generally.
Lily: Yes. Okay, we’ve run over time at this stage.
David: This has been fun. It’s been a nice discussion. Thank you for bringing the topic up.
Lily: No, my pleasure. I’ve very much enjoyed thinking and reading about it lately. Thank you.
David: Cheers.

