Description
Michele and David discuss the impact of AI in low resource environments. They discuss the complexities surrounding AI technology, the hype versus the actual value, and the potential for AI to either widen or reduce global inequalities. They consider the need for robust infrastructural and social frameworks, the promise of small language models, and the importance of local ownership in AI development.
Transcript
[00:00:07] Michele: Hello everybody and welcome to the IDEMS podcast. I am Michele Pancera, an Impact Activation Fellow, and I’m here today with David Stern, one of the founding directors of IDEMS. Hi David.
[00:00:20] David: Hi Michele. Great to be here. What are we discussing today?
[00:00:24] Michele: So today we are continuing one conversation that we started a few episodes ago, and this is about the impact of AI in low resource environments. And in that episode I was mainly curious about your opinion, your vision. This is a very complex topic because we may be talking about what AI is today and how it is influencing low resource environments in many different ways. But also, we may discuss infrastructure or the future of AI, I’m very open to this discussion.
[00:01:03] David: Sounds good. And as you know, it’s a topic I care deeply about, and I seem to be more and more engaged in discussions around it, which I enjoy.
[00:01:13] Michele: Yeah, AI is the talk of the town, there’s a lot of hype and interest, but I suppose the hard part here is to distinguish what is real and what is imagination.
[00:01:25] David: I think that’s particularly important now that everybody’s starting, well, starting, everyone’s continuing to talk about the bubble bursting. And this is something where, as with the previous times bubbles have burst related to AI, it’s because the hype has led people to have expectations which aren’t realised, which go beyond the actual value which is added, which is substantial.
Trying to sort of focus in and understand the real value compared to the hyped value, it is, I think, many people’s analysis that the orders of magnitude of money and the orders of magnitude, the scale that things are happening now, doesn’t quite feel right. This feels exaggerated, which I have to say I agree with, mainly because my mind can’t quite cope with quite the orders of magnitude of funding that people are putting in and the belief that that’s going to actually add value.
[00:02:27] Michele: Yeah, I just heard your conversation with Digital Green, and that was very interesting, you also touched on this topic. One of my fears here, and let me know what you think, is that while, I suppose, I’m pretty sure it will be the case, and it is the case today that AI is already bringing value all over the world, my fear is this value will be disproportionately in favour of the rich part of the world, widening the gap that already exists.
[00:02:59] David: There’s very few recent technologies that have been introduced that haven’t done that. This is actually a sometimes counterintuitive fact, especially in education, people often talked about the value technologies would bring to education and how that would reduce inequalities when all the evidence really points in the other direction, and to how actually these technologies do nothing but increase inequalities.
And so that worry is a very valid concern. It is not going to be avoided without very hard work from certain people. And it is something where I have questions about whether it is even possible to sort of build the technologies which do serve the most disadvantaged people and therefore reduce inequalities.
But my instinct from being on the ground in very low resource environments is that, actually, it is possible that we are getting a set of technologies which could, and I tell you what gives me real hope, it’s the fact that all the noise and the hype about the latest and the further advances in AI are not important for that.
Actually, the advances that have already happened, some of them quite recently, are probably sufficient. And so actually we don’t need that latest, really expensive, in my opinion, innovation to be able to serve and to be able to build technologies that could reduce the gap and that could really serve very low resource environments.
Do we know how to do so? I believe not yet. Are there groups working on it? As you said, you listened to the Digital Green episode, I love what they’re doing and they are thinking in the right ways, and they would argue as well that the technologies that are needed for them, they already exist.
I think there is work to be done and, however much I love them, they would agree and I agree, there is work to be done and there’s deep mathematical work in terms of these small language models, the nano models, and so on, all the different ways in which we can get more from less with AI.
But there’s also work to be done in terms of the social structures. It’s the socio tech, which of course we are very passionate about. How do we actually build technology where the ownership related to the technology remains local? And there are groups like Digital Green doing pieces of this. But systematically, elements of this local ownership, enabling community level ownership, understanding what the governance structures could be, what the data sharing policies and structures could be to be able to really serve those communities, that’s work that is just getting started and I really, I want to be part of it.
I believe this is stuff which we are, maybe not uniquely positioned, but there’s not many people who understand the low resource environments, work consistently across different low resource environments, and have the deep mathematical and technical skills to be able to play this sort of interface.
So my hope is over the next few years, this is a space that we are going to actually move into, with others like Digital Green and many others, hopefully, creating the technologies that could be reducing inequality and could be using local ownership in ways which are really innovative and really are looking at changing the way not only technology is developed for low resource environments, but these cutting edge technologies are developed.
And the hypothesis that I have, which is emerging, is that if people are right and the bubble does burst, what’s going to be left afterwards? Well, of course, the really expensive, big things, they can’t be left because they’ll burst. But actually what we’re talking about and what we are feeling is enough in the low resource environments, that might be what gets left over and becomes the technologies that everybody uses, because it is not so resource intensive, it’s not needing trillions of dollars to build. A few million will get you a long way.
It’s something where there’s a real potential to, maybe not re-equilibrate because the equilibrium has never been there, but to actually create a better equilibrium where the AI technologies are better placed and put in their place in terms of the value that they’re bringing to society, which is substantial, but it is not all consuming.
[00:08:03] Michele: Yeah, the big cost at the moment is in training while inferencing is not as costly. And, as you said, I’m very surprised a lot of people talk about the exponential growth of the potential of AI. And that’s, I mean, true, it has been true, it may be true in the future, but that’s not the point.
[00:08:25] David: There was exponential growth in terms of processing power. That was well documented for a long period of time. There is no exponential growth that I’ve ever seen in AI. There’s step growth, there are points, there are innovation points, which lead to step changes. And there have been a number of them, and there was one a few years ago which sort of burst onto the scene because of the, if you want, passing the Turing test as a very simple way of actually articulating what that latest step change was.
But it wasn’t exponential growth. Fundamentally, there’s been exponential growth of the money put into AI, but that’s finite, and that cannot continue. There hasn’t been exponential growth in terms of the real capabilities of AI. You yourself have gone from ChatGPT 4 to ChatGPT 5 relatively recently, and there are improvements, but it’s not exponentially better. It’s maybe logarithmically better, which is a really bad sign. But that’s a whole different discussion.
[00:09:30] Michele: But what surprises me more than the growth, and I think you’re right by the way, is that the potential today of the technology that we have at the moment, the AI of today has so many potential applications that can be so beneficial, even if there was no growth at all, which is what you were saying before.
And I think this is partially missed, this is what is missed in the hype. The hype should be, look at what we have at the moment and look at how much value it could bring if implemented in existing environments and solutions and ecosystems.
[00:10:10] David: Well, and I think even that, this is where in high resource environments, the problem is all of this is caught up in venture capital. And venture capital requires exponential growth. And so a lot of that is about, you know, actually can you capture a market? Can you get a market of some form, capture it and really move forward?
I mean, one of the most recent ones, there was a Guardian article related to this, which is a very high resource environment innovation, which is your AI buddy, your AI friend, there’s a lot of wearable AI technologies. And so there was a Guardian article where one of the journalists wore their AI friend, they ordered one for the research and they wore it for a week.
And the point that they made at the end of this was that that was the most boring friend. It was like being stuck with the boring person in the room who was just feeding back to you, or who never challenged you, who never really engaged in a very human way. And that is one way to perceive these technologies. But the other way is, well, who is it who then is going to get isolated by that?
Your person who actually has a good social network, has a good set of friends, they aren’t going to be valuing being sat with a boring person in the room. But there will be people for whom having something is better than nothing. Is this going to be positive or negative? Nobody knows.
And, if I think of this compared to some of the work that we’ve done relating to the parenting programs, the research partners have always been so careful. We as a society are not being careful at this at all at the moment. This is where whether there’s a big bubble that’s going to be burst or lots of small bubbles, who knows? Because we have no idea. It’s a huge social experiment, which is playing out in high resource environments with this technology.
And it’s been done in very uncontrolled ways, which may have some great benefits and may have some very dangerous consequences. This is where one of the advantages of working in the low resource environments is you can’t afford the frivolous pieces. You need the meaningful, impactful pieces, that’s what can be afforded to be worked on in that context.
And so I believe actually that focus to be able to say, if we are going to work on AI in a really low resource context, how is it going to add value? That’s the most important piece. Isn’t this valuable, not just for the low resource environment, but worldwide? Isn’t this a good investment for anyone to be saying, okay, it’s very noisy in the high resource environment, but in low resource environments, let’s focus on how these technologies can add value, how they can reduce the inequalities, how they can lift people up.
That’s something where if we do pursue that in the right way and the right levels of money are invested into that, because it does require philanthropy and in the world we are in, this is almost certainly individual philanthropy rather than governmental philanthropy, to sort of focus on that problem. How can we make sure that in low resource environments the AI technologies that are being developed are serving and improving the livelihood of the people they’re serving?
That’s incredibly valuable learning worldwide, because almost certainly what we learn from that, if we do that rigorously, if we do that with rigorous learning, what we will learn, will then be valid in the high resource environments where there will have been all this interesting innovation happening, but this huge social experiment, which is totally uncontrolled.
And so I do think, even from a global perspective, the low resource environments provide an opportunity to do better research into how these modern technologies can add value to society and play a purely positive role. It’s hard when there’s so much noise, which is what’s happening in higher resource environments.
So, I do genuinely believe that this work isn’t just important because it’s going to reduce inequalities in the poorest situations. I think that is of course, important. But it’s also important because those learnings may have much wider implications for everyone.
[00:14:52] Michele: What is your vision about, or your supposition about, how AI could bring or is bringing value to humanity, but to low resource environments specifically?
[00:15:03] David: Well, of course, one of the big ones that is recognised and where there is a lot of work is language, communication, but it’s not just communication as language, it is also communication of ideas. This is why I’m loving the work which is happening related to smallholder farmers, and what responsible AI for smallholder farmers looks like, because that is not clear, but there are groups working on it.
And furthermore, I’m really excited by the potential for education. I think that there is a real possibility that, done right, you could raise the baseline of educational skills and teaching, which would therefore reduce inequalities in education, so I think there are possibilities there.
And more generally still, I think skills gaps, I believe that, as we’ve discussed in the past, in other contexts, having AI agents which are very narrowly focused on enabling technical tasks to be done by people who are less technically skilled is something which I believe will open a whole set of opportunities around further innovation and locally built innovation, digital innovation, to serve local communities.
So there’s just a few examples, but I do believe that it can reduce, it can cut through barriers that were otherwise impenetrable. The skills gaps barriers are so well documented in different contexts, and there is a real possibility that it could cut through them by simply lowering the barrier, which nothing else could do.
The ability to do something technical could now be done with an AI agent in ways that we’ve discussed or you’ve looked into very seriously for STACK authoring, but could apply much more widely. And that’s particularly exciting to me because that’s consistent with the fact that actually these don’t need large language models. They almost certainly only need small language models at most.
That means that it’s suddenly realistic without needing extreme resources, you don’t need to be training it on huge masses of data, you actually want to train it on a more controlled set of data, which is locally relevant.
And so that idea of actually having global and local training data sets, which are done, the group I know doing this best, I come back to Digital Green, they’re really thinking about this well. And those ideas fleshed out more generally across different domains, they’re really exciting.
And I think the potential for it is very different from the idea that everything needs to be, you know, general super intelligence. If it happens, that might be another step change, but this is where we come back to the fact that AI is not advancing exponentially, it’s got step changes. And with the step changes that have happened, oh, there’s real potential to break down skills barriers, to engage people in a way they couldn’t be engaged before, with local knowledge, to have them building out their local knowledge, sharing their local knowledge, to be having elements of co-creation, which are very differently conceived from what you would be doing in a large language model.
[00:18:58] Michele: There are so many things I would like to say, but I have to choose. So lowering barriers at the moment is one of the things I’m the most focused on. When we work on the AI STACK assistant, from my point of view, that’s exactly the point. STACK is an amazing technology, but there are barriers there and potential authors just don’t have the time or the expertise to become actual users of STACK.
And now, with the work that we are doing, hopefully the barriers are just way lower than before. And I can just imagine that something similar can be done for so many other things. So from this point of view, AI, it’s not really something new, but it’s enabling people to do what they could have done if they only had more time, energy, and all of the resources that we have in life. So, yeah, that’s exciting to me with the current technology.
[00:20:02] David: And as you say, this is all possible, but it’s not easy with the current technology. And this is where it comes back to the fact that there is work to be done. Making this really possible and making this happen doesn’t just happen. We need to be learning how to do this really well, we’re learning how to bring these things in.
This comes back to, and people have been calling me crazy for ages for saying, you know, we need code as data or all these things. Why do we need code as data? Well, because then it can be better analysed by the AI models where they can now be serving this, this is actually in a better form. It’s not that they can’t read it as code, but if they read it as data, they know what it means. You know, you actually can have the metadata and you can have layers of information associated to it, which have more meaning to the AI processes.
So code as data, as I’ve been discussing it for ages, is just part of this bigger idea of creating these enabling environments for AI agents to work alongside humans, it’s that collaborative process of humans in the loop, being able to understand it, being able to actually check it, even with lower skill levels. That’s what we want to create and enable.
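To make the idea of reading something as data rather than as code a little more concrete, here is a minimal sketch in Python. It is an illustration only, not the format used by STACK or by IDEMS: the `QuestionSpec` structure, its field names, and the `check` function are all hypothetical. The point it tries to show is that once a question is described as data with metadata attached, simple checks can report problems in plain language that a less technical person in the loop can read.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionSpec:
    """Hypothetical structure: a question described as data, not free-form code."""
    prompt: str                # text with placeholders such as {a}
    variables: dict            # variable name -> (low, high) integer range
    answer_expression: str     # arithmetic expression over the variables
    metadata: dict = field(default_factory=dict)


def check(spec):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for name, (low, high) in spec.variables.items():
        if low > high:
            problems.append(f"variable {name}: empty range {low}..{high}")
        if "{" + name + "}" not in spec.prompt:
            problems.append(f"variable {name} never appears in the prompt")
    if "topic" not in spec.metadata:
        problems.append("metadata is missing a topic")
    return problems


good = QuestionSpec(
    prompt="What is {a} + {b}?",
    variables={"a": (1, 9), "b": (1, 9)},
    answer_expression="a + b",
    metadata={"topic": "addition"},
)
bad = QuestionSpec(
    prompt="What is {a} + 2?",
    variables={"a": (5, 1), "b": (1, 9)},
    answer_expression="a + b",
)

print(check(good))
print(check(bad))
```

Because the structure is explicit, the same description could in principle be handed to an AI agent, a validator, or a human reviewer, each working from the same layers of information.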
And this is not easy. The mathematical foundation for this, this is actually research, there’s maths research behind this. This is a big surprise to me that I actually have got back to thinking about broadly what I would consider elements of pure maths research, which were part of my PhD, but not really relevant since, because they are this foundational layer, which is needed to be able to say, well, okay, how can we make the structures that these AI agents are learning from and working with better so that they can better be an interface between the AI agents and humans?
These are hard problems to solve, which still need proper research done. But where that research is directed to now building these tools for the low resource environments, that’s where we’re really uniquely positioned.
I don’t know anyone else who actually has the connections in the low resource context to understand what’s needed and why, and has the mathematical skills to actually engage with the researchers at the Topos Institute, and elsewhere, doing applied category theory, trying to, you know, they’ve got this program, or it’s an opportunity space on safeguarding AI.
And at the heart of that process, they’ve recognised we need category theory to be able to build these safeguarded AI systems. So these needs are not something which is emerging just from low resource environments. They’re emerging from low resource environments, high resource environments. There’s an underlying need for these same mathematical structures to give us systems where we can have good human AI interaction.
And that’s something which is hard. I mean, there’s genuine mathematical challenges to this and it’s not visible because not many people actually can even express the problem. I don’t believe I can express this problem formally. We have members of our team who have dug into it more and who are able to speak the language, the mathematical language better to be able to formalise the problems that we’re identifying and that we then need the systems to be built so that we can actually resolve them properly.
And if you can’t resolve them properly, you end up building these solutions which work, but which don’t have the same level of power as they would. And then what you end up having to do is throw more resources at them. And that’s what’s happening the world over. People are throwing resources at the AI models because the mathematical foundation needs work. And that’s happening more slowly than people are throwing money at the system.
And so the best way to do it is, when I say the best way, if you are a for-profit company building AI systems, then the way to do it is to just throw more resources at it, because you don’t know whether the mathematical foundations will bear fruit anytime soon to make it more efficient, more effective. And so the more reliable way to get better performance is to just throw more resource at it.
[00:24:42] Michele: So you often mention small language models, and not many people talk about small language models. And you are also talking about research. So I wonder, what, in your opinion, what is the state of the art for small language models at the moment?
[00:25:00] David: I know enough to know I’m not the expert, but I do know a bit. I don’t know at this point exactly where we lie in terms of the performance of small language models and what can be done. What I do know is that if you think about your interactions with ChatGPT, pick your favourite large language model, there’s a lot of computation power that goes towards getting your result.
What would happen if you used less computation power? Okay, you’d use less water, less energy, but would the result be as good? Now, of course, the obvious answer is no. The reason you are using all that resource is because you get a better result. How much better?
What if you actually narrowed down the set of questions or the set of discussions you were going to ask that it would actually answer well on, to a sort of narrower set of things that it was trained on. Now, of course, that narrowing down means that you are now using a lot less resource because it’s not got everything in its data banks, it’s only got a fraction of those things.
So of course there’s many questions you can ask it that it won’t answer as well. But if the questions you are asking it are narrowed down themselves, then, I would expect it to answer those questions, maybe at least as well because it’s now trained on exactly the sort of things, it’s not trained on everything and anything where it could be pulling wrong things from different places, it’s trained on a much narrower set of things.
And that’s of course using a lot less energy, a lot less resource, a lot less water, and therefore more efficient. Is it as effective? That’s the question, and that’s the question which is being asked. And there are people working on this, what are the efficacies like?
And of course the main problem with determining efficacy is efficacy for what? You need to be able to have a well-defined set of things you are trying to achieve to be able to determine efficacy. But my claim is that if you have a well-defined set of things you’re trying to achieve, and instead of just throwing it at a large language model, you actually built the appropriate small language model, you could get equivalent performance. That’s where I believe the state of the art is.
However, I don’t believe at this point we can easily define, we can easily bound, what we’re wanting something to be able to do, and we can easily sort of measure how effective it is and actually improve it, to get these systems, to build small language model systems easily in the same way that it’s easy to build large language model systems.
That’s where I think we are moving and that’s part of what I think is needed in terms of the work and the research, how do we actually build the structures around small language models for more specific purposes, you know, better, rather than throwing things into large language models.
[00:28:12] Michele: Yeah. So, I suppose, the communication between small language models so that one can have a multi-model agent, let’s say, that is more tailored to the needs, to the specific needs. At that point, the tailoring can be very specific by choosing both the small language models and the system that makes them interact. Am I making sense?
[00:28:41] David: And of course you’ve gotta remember, before we had generative AI, the previous big step change was really about neural networks and their ability to categorise. So the ability now to have families of small language models interacting with AI, which categorises to say, which family should I send this to, how should I do this? That’s all part of the questions and part of the problem.
So there’s the large language models, there’s then sort of categorisation processes, you know, these are whole sets of systems, to build these systems well, to reimagine how we interact with our generative AI as a sequence of interlinked small language models, I’ve not really seen people engaging that as much as I think we will in the future, because that’s where you can almost certainly outperform the large language models with a lot less resource, with a lot less water, energy, and with better efficacy.
Even maybe on something which is quite wide ranging. You need specialisation to be able to have effective models, which are smaller. But that layering of them, that building of them in different ways into these complicated systems, that’s the exciting piece. Now the real thing, and there is work, I know, I know somebody at Caltech working on this, what if we didn’t use the same underlying algorithms, the same black box algorithms? What if we could actually use kernel methods or something like this where you could actually take through elements of uncertainty and measure the uncertainty with respect to your results? What if that could be baked into our systems?
Now, I don’t think that can be baked into the large language model systems in sensible ways. But I do think it could be baked into certain sort of complex systems related to small language models in ways which could be very powerful. And so now you are getting to cutting edge mathematics, you know, professors researching how to, what the next generation of mathematical algorithms which could actually do similar tasks to the current AI algorithms, but where you can keep a measure of uncertainty, and baking that in then to decision making processes in these multi-model small language models? Oh, these are exciting ideas and I am getting ahead of myself ’cause I don’t think anyone knows how to do this.
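The shape of the system David describes, a categoriser that decides which specialised small model should handle a question while carrying a measure of uncertainty into that decision, can be sketched roughly as follows. This is a toy under loud assumptions: the keyword lists stand in for a trained categoriser, the `SPECIALISTS` functions stand in for narrowly trained small language models, and the confidence score is a simple share of matched keywords rather than the kernel methods mentioned above. None of these names come from the conversation.

```python
# Hypothetical keyword lists standing in for a trained categoriser.
DOMAIN_KEYWORDS = {
    "farming": {"maize", "soil", "planting", "rain", "harvest"},
    "maths": {"fraction", "equation", "algebra", "solve", "integral"},
}

# Hypothetical stand-ins for narrowly trained small language models.
SPECIALISTS = {
    "farming": lambda q: f"[farming model] answer to: {q}",
    "maths": lambda q: f"[maths model] answer to: {q}",
}


def categorise(question):
    """Return (best_domain, confidence).

    Confidence is the share of matched keywords that belong to the best
    domain; (None, 0.0) means no keyword matched at all.
    """
    words = set(question.lower().replace("?", "").split())
    hits = {d: len(words & kws) for d, kws in DOMAIN_KEYWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        return None, 0.0
    best = max(hits, key=hits.get)
    return best, hits[best] / total


def route(question, threshold=0.75):
    """Send the question to one specialist, or abstain when unsure."""
    domain, confidence = categorise(question)
    if domain is None or confidence < threshold:
        return "I don't know"
    return SPECIALISTS[domain](question)


print(route("When is the best planting time for maize?"))
print(route("What is the capital of France?"))
print(route("solve the soil problem"))
```

The design choice worth noticing is that abstaining is an explicit outcome of the routing step: a question outside every specialist's narrow domain, or one that falls ambiguously between two, gets "I don't know" rather than a confident guess.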
[00:31:11] Michele: Everyday use of AI, of large language models, one of the things that I like the least is that they seem to just be completely unable to say, I don’t know when they don’t know.
[00:31:24] David: They don’t know they don’t know. Just like many humans. And the hardest thing, the big difference I find between top class intellectual academics, what makes them amazing is how much they know they don’t know. It’s not how much they know. It’s how much they know they don’t know, because that means they can bound what they do and what they don’t know.
[00:31:46] Michele: Yeah. And, as you were mentioning, how sure or unsure they are about the statements that they are proposing.
[00:31:53] David: Yeah.
[00:31:53] Michele: Okay. I think this is a good point to close the podcast. As always, the issues at hand are way more complicated than they seem initially, and I’m very happy that IDEMS is thinking about them.
[00:32:08] David: Well, thank you. And it does feel slightly strange that for discussing AI in low resource environments, a lot of our discussion was about deep maths, it was about how models are actually built, because that is important in low resource environments, and this is where the opportunity really lies. It isn’t that, because they’re low resource environments, they shouldn’t be at the cutting edge of the technologies which are being developed.
[00:32:36] Michele: Thanks a lot, David. It was a wonderful conversation.
[00:32:39] David: Thank you.

