208 – The Significance of the Turing Test – IDEMS International Community Interest Company (CIC)

The IDEMS Podcast

208 – The Significance of the Turing Test

November 7, 2025

Description

Michele and David discuss the Turing test and its relevance today. They explore various philosophical questions about intelligence, the limitations of the Turing test, and the ethical dilemmas posed by AI, particularly in the context of self-driving cars. David emphasises the vital role of human observation in the Turing test and expresses scepticism about society’s ability to make responsible choices regarding AI regulation.

Transcript

[00:00:07] Michele: Hello everybody, welcome to the IDEMS podcast. I am Michele Pancera, an Impact Activation Fellow, and I’m here today with David Stern, one of the founding directors of IDEMS. Hi David.

[00:00:20] David: Hi Michele. Great to be having another episode together. What are we discussing today?

[00:00:25] Michele: Today we’re talking about the Turing test. The Turing test is nothing new, but still there’s a lot to be discussed. So, do you mind if I start with what I think the Turing test is and then we can go from there?

[00:00:39] David: Sounds good.

[00:00:40] Michele: So, my understanding is that the Turing test is a way, not really to determine whether some agent, let’s say, is intelligent or not, but rather to check whether we should put the agent in the same category as, for example, human beings. And then, if the human being is considered intelligent, the claim is that you should also consider the agent intelligent.

The reason why I say this is because what the test is doing is checking whether two separate chats (originally this was thought about in terms of chatting) are distinguishable or not when one is held with a human and one with a computer or other non-human agent.

And so, if the two happen to be statistically indistinguishable, my understanding is that the claim is if you consider one of the two agents to be intelligent, you should also consider the second one to be intelligent. And I have much more to say. But maybe I would like to ask for your feedback here.
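A minimal sketch of what "statistically indistinguishable" could mean in practice, assuming a simplified setup that is not described in the episode: a human judge sees a series of chats, guesses each time whether the partner is the human or the machine, and we ask whether the number of correct guesses is compatible with pure guessing. The function names, the 0.05 threshold and the example counts are all illustrative assumptions, and the check used is an exact two-sided binomial test, which is only one of several reasonable choices.

```python
from math import comb


def binomial_two_sided_p(correct: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed one, assuming the judge guesses correctly with
    probability p on each independent trial.
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[correct]
    # Small relative tolerance so symmetric outcomes are treated as ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def is_distinguishable(correct: int, trials: int, alpha: float = 0.05) -> bool:
    """True if the judge's accuracy differs significantly from guessing.

    alpha is a hypothetical significance threshold chosen for illustration.
    """
    return binomial_two_sided_p(correct, trials) < alpha


# Hypothetical judge results: 52 and 70 correct identifications out of 100.
print(is_distinguishable(52, 100))  # False: compatible with guessing
print(is_distinguishable(70, 100))  # True: the judge can tell them apart
```

Failing to reject chance here does not prove that the two are equivalent. With few trials the test has little power, which is one reason the choice of observer and test conditions matters so much.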

[00:01:58] David: It’s interesting you frame it that way. I don’t like the interpretation of the Turing test as being a test of the intelligence of artificial intelligence. I always considered, personally, that the Turing test was a test of the advancement of artificial intelligence as a discipline. The point is that there’s a step change between AI systems that cannot pass the Turing test and AI systems that can. This is a tipping point, if you want, in what you can do with AI systems that pass the Turing test, which you couldn’t do with AI systems that don’t.

That’s how I have always interpreted it, less about actually the philosophical debate about intelligence of artificial intelligence and more as a measure of advancement of artificial intelligence as a discipline or as a product potentially.

[00:03:06] Michele: Still, I would like to push you on the philosophical side a bit, if you don’t mind. Again, just tell me if you agree. When we interact, let’s say with people, but I mean with animals, with whatever, we are never able to explore their inside world if you want. The only thing that we know is how they behave, how they react, what we observe about them.

And then we can infer some characteristic of the being, for example, intelligence, by noticing that their behaviour is consistent with what the behaviour would be if they were intelligent. We never really know if someone else is intelligent or not, we only know the behaviour.

So, what I’m suggesting here is that the Turing test is doing exactly that: it is skipping the definition of intelligence, and it’s just saying, if the behaviour is indistinguishable, then they are in the same category. What do you think?

[00:04:11] David: I’m gonna say this in the clearest way I can: I don’t know much about the philosophical debate on intelligence. This is something I’ve thought a lot about, and my conclusion is I know too little. I love the discussions around intelligence related to animals. Certain animals display elements of intelligence which are measurable and which are very interesting. There are certain birds, there are octopuses, there are tasks that they can do. There’s the question of whether human intelligence was really defined initially by the use of tools and, if so, whether animals that use tools need to be placed in similar categories.

These are wonderful, wonderful questions on which I believe I remain ignorant, and I’m willing to live with that ignorance. But the key point of the Turing test, which you haven’t emphasised, is who is observing it.

Now, it would’ve been a very different question if the observer was an AI system. Can an AI system identify the difference between human to human and human to AI communication? That would’ve been a very different question. The power of the Turing test is its human observation. So it’s the human observation of these interactions, and the reliability of that human observation in terms of actually identifying the human versus AI element, that’s what’s being tested by the Turing test.

And I don’t believe that humans are capable of measuring intelligence, and therefore I don’t equate the Turing test with a measure of intelligence, because I don’t believe that’s what it’s set up to do. I believe it’s set up as a measure of the advancement of the AI system. And that’s all it’s really doing from a human perspective, because the measurement instrument is very human.

These are interesting questions, and you could have many different tests, many different ways of thinking about tests, each measuring something different. But to me, the role that this test plays is so important. It was conceived a long time ago, and it survived for a long time as a measure of artificial intelligence that could not be beaten.

And there were many, many iterations of people failing miserably to get anywhere close to passing the Turing test for many, many years. So having systems which, in certain constrained environments, now consistently pass the Turing test is hugely significant. But it’s significant not in terms of the intelligence or not of the underlying system; that I don’t know, I have no educated opinion on.

No, sorry, I have an educated opinion, but my educated opinion is that I am ignorant, I don’t know. But what I do believe we know is that, because we now have systems, AI systems, large language models, generative AI systems, that pass the Turing test, we live in a different era. That was a step change in what was possible from these systems before and after. And that I think is really interesting, and that means we are living in exciting times.

[00:07:56] Michele: Yeah, sure. And I agree the Turing test is being passed at the moment, as you said, in controlled environments, and it’s probably going to be more and more like that.

So, should we talk about what we think are the limitations of the Turing test?

[00:08:13] David: I’m happy to, yes. In some sense it is now irrelevant because it’s been passed. Once it’s been passed, it is now almost irrelevant.

[00:08:23] Michele: I agree. I suppose one of the limitations is relying on chats. I’m sorry, I’m still framing the Turing test as I did previously, but the way in which we attribute characteristics to each other, maybe not intelligence, but whatever characteristic, is not only about chats. It’s about how we physically interact with the world, it’s about which choices we make, it’s way more complex. So I suppose one of the aspects that the Turing test is missing at the moment is the physical.

[00:09:10] David: But why? I mean, I don’t understand that in the sense that what is it that we want from our AI system, which relates to the physical? As I say, the Turing test, I believe was really useful as a benchmark under which to evaluate artificial intelligence systems. And for that, I believe it’s been very successful.

So if we take my framing, and I’m trying now to get your idea into my framing, what is the physical attribute or the physical behaviour we want AI systems to achieve that would serve as a benchmark?

[00:09:52] Michele: Well, without having the ability to be too precise here, one could argue that at the moment we are putting symbols one after the other, and the symbols put one after the other in a very sensible way are very valuable. But, I suppose one of the possible guesses is that the way to prove that there is meaning behind those symbols is to show that they can be enacted in reality. You don’t just say that something is dangerous, but you also try to avoid it in sensible ways. You see what I mean?

[00:10:32] David: I can try and interpret it, so let me see how I would interpret it. All the work that’s happening in self-driving cars, what you are saying is that this identification is not just symbolic, that actually there is an understanding of prioritisation. Given a difficult choice, could you make an ethical, difficult choice, which relates to the physical attribute of the car, which people have to do when they’re driving in difficult situations all the time. So this is sort of, what are those choices and how does this work, how are those choices made? That could be part of what could be included in the tests for AI systems of, let’s say, a self-driving car.

But I don’t know how to define that in a very theoretical way, and this is what was so powerful about the Turing test. The Turing test is very simple in its conception. The test which I could imagine, which is an ethical test, is that you have a car which is out of control, and you have to choose between running over one person or three people. These are wonderful ethical dilemmas.

Do we want to train AI systems to make certain choices? Well, I’m not sure I want to enter into that domain at all. I don’t know that I even want to conceive such a test. I don’t know what the answer is, because these are really difficult ethical dilemmas.

So, I don’t see that as a limitation of the Turing test at all, that it didn’t enter into the philosophical debate of some of the other tests that could be conceived around the ethics of AI. Do I think people should be working on this? Absolutely. Am I glad I’m not a pure philosopher who has to deal with these conundrums and actually try to put in place regulation that might relate to them? Absolutely, I’m delighted that that’s not my role. That’s a hard job to have and so important for society, but so thankless because I don’t believe there’s a right and a wrong answer.

And whatever happens, you’re gonna have issues going wrong. These are hard problems. So I don’t see that as a failing of the Turing test, of not covering some of these other philosophical angles.

[00:12:51] Michele: I am very curious to see how the regulators are going to behave in, I mean, so many aspects of society and life. That’s gonna, as you say, that’s gonna be difficult. I’m curious.

[00:13:06] David: Yeah. I love the self-driving car as an example. And the reason is it is pretty obvious that if everybody was using self-driving cars, the roads would be safer on average. But who would be responsible when something goes wrong? The legal implication of that level of responsibility is a whole nightmare for our current legal systems. I don’t envy the people who need to think about that.

It is this crazy, crazy thing where, although on average the world might be a safer place, it is not necessarily desirable for society because if somebody is criminally reckless when they drive a car, then they pay, they pay to society, they’re imprisoned, and so on.

Now this does ruin their life, on top of, of course, the life of the person they’ve already ruined. But if they weren’t responsible, if it was just their car, a self-driving car, who is responsible? These are terribly hard questions.

[00:14:19] Michele: So interesting because as you say, we do want a safer world for everybody, but it’s not sufficient to have the technology. That’s so interesting.

[00:14:30] David: Yeah. And this is why at the heart of everything we think about, it’s always the sociotechnical, it’s about society. Technology should serve society and not the other way round. And there aren’t easy answers, and I don’t believe there are easy tests to set up. I come back to the fact that the Turing test was an amazing test because it was so simple in its conception, and it was so powerful as a measure of the failure of previous AI eras.

Of course, now it’s been passed, it’s irrelevant. Somebody will come up with a sensible next one. I have seen people try, I’ve not seen anything sensible yet. I’m interested in what the next real articulation like a Turing test could be. I’ve heard people debating it, I’ve heard people discussing it, I’ve seen people write papers about it, but none of them actually satisfy what I believe the previous Turing test did. It was beautiful.

And the fact that we’re now living in that era beyond the Turing test, interesting times, exciting in positive and negative ways.

[00:15:46] Michele: Yes. It’s crazy.

To end the podcast on a funny note, there’s a self-driving company that has partially solved the issue of responsibility. The human always has to be in the driver’s position, maybe not driving, but ready to. The system recognises dangerous situations and just signals to the human to take over. And that’s it. It’s your responsibility now.

[00:16:21] David: Well, that’s absolutely perfect for the AI company. And there was actually some claim, I think it was in Germany, where they monitored what happened with a rather large, well-known car manufacturer whose cars have self-driving features, and it was found that the self-driving turned itself off in the fractions of a second before an accident so that they could claim that it was not in self-driving mode when the accident happened.

So the actual driver had no chance to stop it. It was not designed so that the human could take responsibility. It was just redesigned so that the company did not have responsibility. That’s not a good solution to me. This is actually the opposite of what I believe a good solution is. And fundamentally, in an accident-type situation, it happens so fast. Giving control back to the human in that fraction of a second, for them to do what? That’s exactly when you don’t want a human in command. You want the best possible chance to avoid the accident, because that should be something which can be optimised.

Now, avoiding that beforehand is the sort of thing that maybe humans could do, but this is why self-driving cars on average should be safer. They should be better at reducing the risks of these accidents in the first place, rather than making that a human responsibility. So that specific example is everything that I believe is wrong with the current company setups, where they don’t want to have responsibility. They want to blame the humans in the car when really it’s their fault.

Anyway, that is a whole nother debate and I think it is a really serious one. It is so, in my mind, unethical for big companies to be extracting so much money and wealth, and deliberately avoiding responsibility for the consequences of their decisions. Especially, and I don’t want to get into the details on this, when their decisions go against industry standards, which have been shown to be more effective than the decisions they’re making about their self-driving modes, because they rely on a different set of sensors, which are, anyway. Yeah, we could get stuck into that, but I don’t think it’s what we should do.

[00:18:46] Michele: Yeah. I can’t wait to see what happens. The future seems so interesting.

[00:18:51] David: But it is rather worrying to me. I’ve been called an eternal optimist in so many different contexts, but my optimism that we as a society will make good choices is low. I believe that individuals within society will fight for good choices on some of these things. But that we as a society, collectively, will make good choices for society as a whole on things like legal responsibility and what that legal responsibility means? I’m not an optimist at this point in time.

And that doesn’t mean I’m not optimistic about other elements and other things. But I do believe that, seeing what emerges, in a lot of parts of the world a lot of bad things will have to happen before good decisions are made. Unfortunately, that seems to be the route we are choosing as societies. And it’s rather worrying to me.

[00:19:55] Michele: Maybe we could end this episode just like we ended the last one. I’m very happy that IDEMS is thinking about these topics.

[00:20:03] David: So when you mentioned the last episode, that was the episode on AI in low resource environments. And I believe that, while IDEMS is deeply engaged in AI in low resource environments, and actually that is something where I think we could constructively contribute, I don’t see our ability to contribute to these debates, which are so important and happening. I don’t feel that we are the right structure. There are philosophers I know who are engaging in these debates, who are trying to build regulation around this and who are doing work, which I am in awe of.

This is something where I want to give credit where it’s due, to the people who are deeply engaged in trying to think about how we leverage AI for human flourishing while regulating it to ensure responsibility, and how difficult that is at this point in time. We may be thinking about it, but we are not engaged in that actual process. With my level of knowledge, and that of others within IDEMS, we are not as careful in our thinking as some of the philosophers I’ve met who are engaging in this in incredible ways.

[00:21:28] Michele: Thank you, David, for this episode. It was very interesting.

[00:21:32] David: Thanks.