009 – Demystifying AI – IDEMS International Community Interest Company (CIC)

The IDEMS Podcast

Description

As part of their series of conversations on Responsible AI, Dr Lily Clements and David Stern discuss “artificial intelligence” in general, breaking it down into three fundamental components: source data, algorithms, and learning cycles. Topics touched on include: the Turing Test; AI’s usage in the postal service; and using AI to identify birds through their songs, and the potential dangers of doing so.

Transcript

[00:00:00] Lily: Hello, and welcome to the IDEMS Responsible AI podcast, our special series of the IDEMS podcast. I’m Lily, an Impact Activation Fellow, and I’m here today with David Stern, a founding director of IDEMS.

Hi, David.

[00:00:19] David: Hi Lily, what are we discussing today?

[00:00:22] Lily: Well, we were initially planning to start our series with demystifying AI. However, we kind of accidentally started this series with Bletchley Park, because that was a relevant example, so I feel that this is… Well I feel since kind of 2022, ChatGPT has swept through and made these kind of huge impacts on a variety of aspects, which has kind of led to this re-ignition of conversations.

But maybe today we should step back and discuss demystifying AI.

[00:00:55] David: Absolutely, let’s go back a step. And remember, artificial intelligence is not new. This is something where the first real wave of artificial intelligence is almost 80 years old now. So this was in the 1950s. So artificial intelligence goes back quite a long way. It’s not quite 80, 70.

[00:01:16] Lily: Sure.

[00:01:17] David: It’s hitting 70. I think, actually, it’s only 67. I apologize. It was 1956, if I remember correctly. And in 1956, there was the first funding wave for artificial intelligence. And this is where, you know, one of the big things that came out of that was the Turing test, which became this mythical thing, which is just, very simply, if you have a human observer, observing a human interacting with another human and observing a human interacting with artificial intelligence, can the human observer tell the difference and identify the interaction which is happening with artificial intelligence? That’s it. It’s a very simple test. And, you know, this is a test which was unthinkable of being passed.

[00:02:13] Lily: I was about to ask that. Was it kind of more of a hypothetical thing thinking, this will never…?

[00:02:19] David: No, no, no, no, no. It was in 1956, this wasn’t the test, I don’t know when that was exactly. But very early on, this was the sort of thing that people were thinking, oh, we’ll do this in the next few years.

[00:02:29] Lily: Ok.

[00:02:29] David: No, nowhere near, and gradually over time, you know, the first wave passed, the second wave came and went, the third wave, as I understand it, that’s when Deep Blue happened, that was in the 90s.

[00:02:41] Lily: Ok.

[00:02:41] David: That’s when suddenly it hit consciousness, because, ooh, a computer can play chess and can actually beat the grandmasters at chess. That was a big deal. And it was a big deal. And what people don’t realize is that, actually, that was a big deal in a number of different ways. It was that point at which we started living in a society which depended on AI.

[00:03:08] Lily: Ok.

[00:03:08] David: Because it was in the 90s and it’s the same algorithms which led to the computer winning at chess, that also enabled post offices to automatically read handwritten postcodes, zip codes. And that’s important because that meant that the whole process of sorting the mail could be automated. Now, on one side, you could look at that as all the jobs lost. On the other side, you can look at it at the efficiency gained.

And so as a society, we then have become reliant on the efficiency gains that came out of these sorts of artificial intelligence systems. But they were all behind the scenes.

[00:03:47] Lily: Was it kind of efficient from, from day one?

[00:03:51] David: No.

[00:03:51] Lily: I don’t see how you can build something and for it to be…

[00:03:55] David: It wasn’t, and this is, this is the thing. I mean, you’re a statistician, I know we’re trying to push you towards data science. But, at the heart of it, if you try to do this with statistical algorithms, you just don’t know where to start. Standard statistical methods could not be used on the post office example.

But if you think about it slightly differently, then you can abstract the problem a little bit. And so they just considered it as lots of little dots, you know, and discretize the problem is the mathematical terminology.

[00:04:29] Lily: What’s the regular… ?

[00:04:31] David: Well, you’re right, I should be making this clearer. So, literally, instead of considering the image as it is, you could consider just sort of points of the image. So you could think of this as the pixels if you want.

[00:04:43] Lily: I see.

[00:04:44] David: But, you know, depends on what resolution you have. But the idea is, it’s like pixelating the image, so not considering it as a smooth image, but considering it as sort of lots of little points, which is of course what a printer does anyway. So, this is not a totally unnatural thing to do. So, after you’ve discretized it, after you’ve made it into these pixels, or however you want to consider it, then it’s just a sorting problem. You just need to look at if it’s sort of there or not there, black or white, and then you sort the images you’ve got into the letters they form. And then you just sort of categorize it.

And there’s a lot of things that can go wrong. And my understanding is there was only about 90 percent accuracy when they first did the learning cycles on this.
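The pixelate-then-sort idea David describes can be sketched in a few lines of Python. This is only a toy illustration, not the method any postal service actually used: the 5x3 templates, the 0.5 threshold, the sample scan, and the nearest-template matching rule are all assumptions made for the example.

```python
# Toy sketch: pixelate a grayscale image into black/white dots, then
# match it against known templates. All names and data are hypothetical.

def binarize(image, threshold=0.5):
    """Turn grayscale values (0.0 = white, 1.0 = black) into 0/1 dots."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]

def hamming(a, b):
    """Count the dots where two equally sized grids disagree."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def classify(dots, templates):
    """Return the label of the template that disagrees on the fewest dots."""
    return min(templates, key=lambda label: hamming(dots, templates[label]))

# Hand-drawn 5x3 templates for two digits (illustrative only).
TEMPLATES = {
    "0": [[1, 1, 1], [1, 0, 1], [1, 0, 1], [1, 0, 1], [1, 1, 1]],
    "1": [[0, 1, 0], [1, 1, 0], [0, 1, 0], [0, 1, 0], [1, 1, 1]],
}

# A faint, slightly smudged "1" as grayscale values.
scan = [
    [0.1, 0.9, 0.2],
    [0.7, 0.8, 0.1],
    [0.0, 0.9, 0.0],
    [0.2, 0.6, 0.4],
    [0.8, 0.9, 0.3],
]

dots = binarize(scan)
print(classify(dots, TEMPLATES))
```

Once the image is reduced to "there or not there" dots, the categorising step is just a comparison between grids. Real handwriting varies far more than two templates can capture, which is one way to see why early accuracy was limited.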

[00:05:37] Lily: Sure. Sure. So one in 10 of your post.

[00:05:41] David: Goes…

[00:05:41] Lily: I think of how much post… well, back in the 90s I imagine there was a lot more post as well than there are these days.

[00:05:47] David: Maybe, yes, exactly.

[00:05:49] Lily: One in ten is quite a lot to be missing.

[00:05:51] David: Absolutely. Well, but it doesn’t go missing. What happens if it goes to the wrong place? Somebody sees it and says, oh, that shouldn’t be here, that should be somewhere else. And they send it back and they feed it back into the system. So now when it gets read back into the system, this is a correction. And so now you can override how this should be read. And so now you put these learning cycles on.

Now my guess is that you are probably getting let’s say 5 percent errors with human sorters. Maybe not 5%. That’s 1 in 20, but it might be that sort of order of magnitude. Whereas now with the automated system, maybe the errors were a little bit higher, but over time they became less and less.

So think about it, you’ve now got a sorter who sorted for 30 years, and who’s been sorting consistently for 30 years, continually learning.
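The learning cycle described here, where misdelivered post comes back into the system as a correction, might be sketched as follows. This is a minimal sketch under stated assumptions: the `Reader` class, the nine-dot patterns, and the nearest-neighbour rule are hypothetical stand-ins, not the algorithm a real sorting system used.

```python
# Toy sketch of a learning cycle: misread items come back as corrections
# and become new labelled examples. Names and data are hypothetical.

def distance(a, b):
    """Count the positions where two equally long dot patterns disagree."""
    return sum(x != y for x, y in zip(a, b))

class Reader:
    def __init__(self, examples):
        # Each example is a (dot pattern, label) pair: the starting data.
        self.examples = list(examples)

    def read(self, dots):
        """Label a pattern by its closest known example (nearest neighbour)."""
        _, label = min(self.examples, key=lambda ex: distance(dots, ex[0]))
        return label

    def correct(self, dots, true_label):
        """Feed a human correction back in as a new labelled example."""
        self.examples.append((tuple(dots), true_label))

reader = Reader([
    ((1, 1, 1, 1, 0, 1, 1, 1, 1), "0"),
    ((0, 1, 0, 0, 1, 0, 0, 1, 0), "1"),
])

# A "1" written with a wide top stroke and a base, on a 3x3 grid.
wide_one = (1, 1, 1, 0, 1, 0, 1, 1, 1)

before = reader.read(wide_one)    # first attempt, using only the starting data
reader.correct(wide_one, "1")     # someone spots the mistake and sends it back
after = reader.read(wide_one)     # same handwriting, next time round
print(before, after)
```

The first read is wrong because this style of "1" is closer to the stored "0" than to the stored "1". Once the correction is fed back in, the same handwriting is read correctly, which is the sense in which the errors become less and less over time.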

[00:06:38] Lily: Okay.

[00:06:39] David: That experience is much better. And the computer, you know, actually is a 30 year old algorithm, it doesn’t have the disadvantages of somebody who’s been in the job for 30 years, who’s now maybe a little bit tired, maybe their concentration level’s gone down, and so on.

So, actually you’ve got a system now which is in place, which is incredibly efficient. And so those learning cycles have created a really efficient system. And the basic algorithms that they’re using for this is, is these sort of deep learning algorithms, which are the same algorithms that we use so the computer could win at chess, and you’ve got a solution where you keep feeding back. Just like if you want the chess example, where actually, if you did play the AI system against your grandmasters, every time a grandmaster would win, then the system could learn from that game and actually take it into its database and then use it for the next time to understand, well, how could I have won that game… not how the computer thinks, but it can go through an algorithm where that’s a possible outcome, which you would then be using to avoid that outcome.

[00:07:48] Lily: Okay.

[00:07:49] David: So the point is that these systems which have on failure, learning that’s fed in are incredibly powerful for artificial intelligence. So your post office is perfect for that.

[00:08:02] Lily: And so how does this feed into kind of demystifying data science then?

[00:08:06] David: Thank you. My hope is that understanding data science is really about understanding that it’s about the data you’ve got at the beginning. So when you were first training the algorithm, it’s about the algorithmic methods you use to do the training. And there’s things you can do there. And then it’s about the feedback loops you get to be able to sort of then improve this over time. And what happens, how the learning evolves over time.

So artificial intelligence is not intelligence. It’s a way of learning from data, which can then improve over time. And the more data you feed in, the more it can learn. And so it’s all about thinking about these three things. One, what data is going in, know where you start from. Two, what the algorithm is. That, actually, for non-technical people, is not that important. Generally speaking, you know, there are algorithms that work pretty well, that’s probably all you need to know. So, you would probably, if you talk to the right person, be able to get a sensible algorithm. But the hard part is then, the learning of the data.

And I’d love to use another example for this.

[00:09:17] Lily: Ok.

[00:09:17] David: Birdsong. So my aunt uses an AI powered app. She takes her phone, she sticks it in her garden and she looks at the phone and then she can tell me what birds are in her garden. I’m sure in the past, she’d have looked at the garden and looked at the birds and then told me what birds were in her garden. But right now she tends to look at her phone and she then is able to tell me all the birds that are in the garden because it’s been able to identify them.

[00:09:46] Lily: This is presumably from the bird songs, that’s why it’s called.

[00:09:49] David: Exactly, from bird songs. So from just the audio recording, it’s identified the different bird songs and, and therefore the species of bird which are in her garden. And this is incredibly powerful. Think about the applications of this for all sorts of things, including, of course, conservation, if you’ve got an endangered species.

[00:10:08] Lily: Okay.

[00:10:09] David: But that’s when I get worried. So, I have another friend who used a similar app, and there was a really rare bird that was identified by the app, and so they went looking for it, because they wanted to see it.

[00:10:21] Lily: Yes, yeah.

[00:10:22] David: It’s not just a, ooh, I was near a bird. And they were able to trace it down… To a toad.

[00:10:32] Lily: Ah.

[00:10:33] David: And so it had misidentified the toad as the bird. Now, of course, the particular friend in question was a data scientist, so he could well have contributed this back and understood that this was an algorithm problem, and this could get fixed if maybe that data would be fed back in. But in general, that’s not what’s going to happen.

In general, when that app, or when that algorithm misidentifies a bird as a toad, or as something else, or as another bird, or there’s been a change of something, maybe there’s been, uh, for whatever reason, a change in how a bird behaves. Or you’ve got one bird which copies another. Now that’s something which we know happens in nature.

[00:11:18] Lily: Yeah.

[00:11:19] David: And so, what happens when the algorithm gets it wrong?

[00:11:25] Lily: Okay.

[00:11:25] David: There’s no automated or natural way for that to lead into a feedback cycle.

[00:11:33] Lily: I see. So we have our data, we have our algorithm, but we don’t have that kind of human feedback, or any feedback check.

[00:11:41] David: Well, almost certainly the creators of the algorithm do have learning cycles and feedback cycles. It’ll be built in that way.

[00:11:48] Lily: Okay.

[00:11:48] David: That’s part of almost any AI system now. But the use of it is independent from that feedback system in certain ways. It might even be that all the data which has been collected everywhere in the world gets fed back into their servers so they could use it, but they can’t go and search to see was it a toad or was it a bird, because that information is lost.

[00:12:10] Lily: Yes, ok.

[00:12:11] David: They might be able to have somebody else analyse it and wait a second, I recognise that as a toad call, not a bird call. And so they might be able to have an expert. But, unlike the post office example, this is much, much harder to imagine how that could be happening going forward in a way which will mean that we get better and better algorithms.

The reading of the postcodes if it goes to the wrong place, someone will notice.

[00:12:38] Lily: Yeah.

[00:12:39] David: It might not make its way back into the system, it might get lost, someone might throw it in the bin, but someone will notice. And so a good proportion of the errors will get fed back into the system. And so, if I now think of this birdsong process being used for conservation, I get nervous.

Because there’s all sorts of ethical questions which relate to not the use of the system as it is now, but to how that system will work in the future.

[00:13:09] Lily: Yeah.

[00:13:10] David: And this is part of what I think is really, it is not mysterious. No, the hype around AI is justified. It is going to change our world. People’s jobs are going to change. People will lose jobs. New jobs will be created. There will be shifts in the way work happens across the world because of recent advances in AI. I believe that. But, I mean, you know that because you use it to optimise your code, you use it for all sorts of things. So, you know, it’s already entered into your working life.

[00:13:46] Lily: It’s increased my efficiency a lot, from all these various things, but I’m sure we’ll touch on that.

[00:13:51] David: On other occasions… So there is no doubt that the value, the potential of AI for our society has been transformed very recently due to these recent advances. And if you want, in some sense the recent advances, they’ve built on the advances that came before them, which have built on the advances which came before them. This has been 70 years, give or take, of advances. And while the funding has come and gone, while the hype has come and gone, there has been steady progress, in different ways, which has led us to where we are now, which is almost certainly a tipping point where the use of AI in people’s daily life is going to be much more visible.

It was already integrated over 20 years ago. So our daily lives became dependent on AI maybe 20 years ago, but it wasn’t very visible. I think that’s the tipping point.

[00:14:51] Lily: Yeah. And I guess this tipping point that you’re talking about, is this where the Turing test that you were talking about at the start…? Now the question is, people have done studies: now was it a human that wrote this or is this AI generated? And they can’t tell.

[00:15:06] David: Absolutely, and this is, this is exactly correct, that there are now cases where some people would argue that the Turing test has been passed in specific contexts. You’ll notice how careful I’m being with that.

[00:15:20] Lily: I did notice how very careful you were being at that one.

[00:15:23] David: And I am very confident with exactly how I worded that, and I’m really not confident whether it has been passed and in what context, and so on. I know there are studies where people believe it has been passed in certain contexts, and… in my mind, it’s irrelevant, at this point in time, whether we consider it passed or not.

It’s going to be passed. Even five years ago, I think there was still a doubt in a number of people’s minds whether it would ever be passed, whether we’d ever build those technologies. Whereas now, there’s no doubt. People in the field, we always believed that it was going to be passed and it was a matter of actually getting there with the tools.

[00:16:04] Lily: Yeah.

[00:16:04] David: But I think from a societal perspective this has come pretty fast. It was, you know, five years ago it was reasonable to have doubt as to whether or not it would be passed and now there is no doubt that it will be passed. It’s just a question of when, whether it has already been passed or whether it will be passed soon. You know, the technology is now at the point where that is possible.

[00:16:28] Lily: And so now we’re talking about kind of AI and, and it is going to be passed and it’s becoming more and more part of our lives. That’s why there’s a lot of talk at the moment about being responsible with it.

[00:16:40] David: I think this is the point. In some sense, being responsible with AI before the Turing test was passed is very different to being responsible with AI after the Turing test has been passed. Let’s come back to remembering again, it’s all about demystifying AI.

[00:16:55] Lily: Yeah, ok.

[00:16:56] David: So, you know, the Turing test is can you, as a human observer, observing a human interacting with AI and a human interacting with another human, can you tell the difference? And the Turing test says no. So as a human observer of that interaction, you can’t tell the difference. Now, as a human interacting with AI, you may still be able to ask certain things in certain ways to be able to tell the difference. Now that’s a separate test, if you want.

[00:17:26] Lily: Sure.

[00:17:26] David: So this is sort of an external observer. If that is the situation, where the Turing test has been passed or is passed, then there is now a question about when you observe interactions, what are you observing? Now, if we go back a few years to some of the misinformation wars that have been happening more recently, this was before the Turing test was passed.

[00:17:57] Lily: Yes, okay.

[00:17:58] David: Yeah? Now, of course, if you can actually set up algorithms to just push out your single idea as an individual, then maybe nobody can know how many people actually think that way or not, because you can’t tell the difference between bots and people.

[00:18:16] Lily: Wow, okay.

[00:18:18] David: Yeah, these are things which are possible. So there’s all sorts of things, in education, people are worried about theses, you know, you can’t tell if it’s a human written thesis or an AI written thesis. Okay, well, that means education has to change.

[00:18:32] Lily: And that’s absolutely something that we do touch on, or will touch on in this series.

[00:18:37] David: Absolutely. And so these are going to be things where we’re going to dig into this much more in the series as we get into this. But the implications of this are simple. Let me try and just summarise what I hope people have taken away from this. Because we’ve touched on a lot of different things.

We’ve not really dug into the responsible side. But the point you’ve made is critical. In this world where we may not be able to tell what has been machine generated versus what is human generated. In that world, there are a lot of new issues which are where we need to be careful about what we can do responsibly.

That is clear and we’re going to have to dig into those in other, other cases. But the point which is so critical on this is not to get away from the fact that despite all this hype, despite all this change, actually AI can always be thought of as these three simple things: data in, algorithm, learning cycles.

[00:19:46] Lily: Sure.

[00:19:46] David: Fundamentally, those are its three components.

[00:19:50] Lily: Yeah.

[00:19:50] David: And all three of those components have lots of things that can go wrong. And that’s why the responsibility, responsible AI, is so important. And all three of those components have humans involved in the process. You know, thinking about the data that goes in, deciding what data goes in is a human process. Deciding and building the algorithms, is very much a human process. Thinking about what the learning is, what the human’s role is in that learning, humans in the loop, that’s part of that process. And so, we should not think of AI and artificial intelligence as not involving or including humans in the process.

We should be thinking about it, just like many other things, as a tool. It’s a tool which takes data, large amounts of data, takes them in, puts them through an algorithm, using an algorithm, and then has learning on the other end. And through that it can do amazing things. It can categorise things, post office examples, categorising the different birds. It can generate things, generative AI: whole essays, theses.

[00:21:01] Lily: Essays, yeah.

[00:21:03] David: And so on. Tweets. Oh, sorry, X, what do you call them now?

[00:21:08] Lily: That’s a good question.

[00:21:09] David: Anyway, never mind.

[00:21:11] Lily: Yeah, what do… I don’t know what you call it. But no, thank you very much for the kind of quick demystification, on just breaking it down to those three components which all need that kind of human interaction and responsibility… or human interaction to keep that responsibility.

[00:21:31] David: We’ll dig into this I’m sure in other places, but if you’re not responsible, things go severely wrong. There are suicides that can be associated to machine learning processes which have been done or have been interpreted incorrectly. Because of course it’s not just about what the machine learning does. That specific example, you know, it’s about how humans interpreted the output, which caused the problems, not about the… necessarily the algorithm itself. There’s other cases where it’s the algorithm and what they produce, which is the problem. But it’s also about how you use it. So thinking about using AI responsibly means that we need to understand what it can do and what it can’t do.

And I hope the birdsong example gives you an instance of, well, that particular algorithm, there’s a limitation on the learning, which means that using it, let’s say for conservation of rare bird species, you probably shouldn’t do that at this point in time, because you could get misled and not know you’re being misled.

[00:22:32] Lily: Interesting. Yeah. Thank you very much, David. It’s been a pleasure as always, and I look forward to doing this series with you.