236 – Is STACK Necessary in the Age of AI?

The IDEMS Podcast

Description

Students are increasingly turning to LLMs (Large Language Models) to solve maths exercises and get feedback. In light of this, is there still a place for deterministic online assessment tools like STACK? In this episode, Michele and David argue that this problem is an opportunity for educators and developers to build better alternatives, potentially embedding generative AI features in STACK to provide a more interactive, conversational experience. They consider more generally how LLMs affect exams, curriculum decisions, and student motivation, noting uncertainty about future skills and warning against reactive curriculum changes while encouraging experimentation.

[00:00:07] Michele: Hi and welcome to the IDEMS podcast. I’m Michele Pancera, I am an Impact Activation fellow at IDEMS, and I’m here with David Stern, one of the founding directors of IDEMS.

Hi, David.

[00:00:21] David: Hi, Michele. Really nice to be doing another episode with you. What are we discussing today?

[00:00:26] Michele: So, today I’m going to play devil’s advocate because what we are discussing is “do we still need STACK given all of the advancements in AI that we currently have?”

[00:00:41] David: Oh, that’s a quick episode. Yes, okay, done.

[00:00:45] Michele: Well, the idea for the episode stems from a comment that I got from Andrey Chesnokov – I hope my pronunciation is correct, I’m sure it’s not. We had an event in Sevilla that was very nice and at some point he mentioned that some of his students just copy-paste whatever the STACK question is into an LLM, and they don’t care anymore about the feedback provided by STACK because for some reason they prefer the feedback that they get from the LLM.

Now, one of the main reasons why we like STACK, “one of” the main reasons, is that we have the PRT and students can have personalised, detailed feedback.

[00:01:37] David: This is the Potential Response Tree, PRT.

[00:01:41] Michele: Yes. So, the feedback the students will receive depends on the answer that is given and on the kind of mistakes that the student may have made. But then, if they don’t use it, if they prefer an LLM, do we really need STACK?

[00:01:59] David: That’s the wrong way to look at that question. If you have a question for which posting it in a random LLM can give you equivalent feedback, then maybe it is arguable whether or not you need STACK. The point is, and this is where I think, really, we feel, or I feel very strongly that LLMs should add value, they should enable us to make much better questions.

One of the visions that we have is what if actually you had an LLM embedded in STACK, so you could have that same interaction with an LLM. And actually, of course we’ve talked before about the fact that I’d much rather this eventually get to an SLM, a Small Language Model rather than a Large Language Model, but that’s a whole different story, irrelevant for this.

But I’d love to get that sort of interaction that people value, the backwards and forwards, the discussion part, related to a STACK question where the learning of the model is specific to that question. And so actually your potential response tree is all something that the LLM or the SLM knows about. And so it’s not just that you are able to get predetermined feedback, it’s that then that deterministic layer, which is distinguishing between the different types of potential mistakes that could be made, that is maybe sandwiched within language models.

That’s what I envisage in the long term about this. I don’t believe that the next stage of really good, if you want, artificial intelligence progress is going to come from big generalised intelligence systems. I believe the next piece of really detailed progress is going to be about getting these down to really specialised, a specialised language model, which knows about the specific question that you’ve been working on and is tailored to that. That’s what I believe is the next real frontier, which I hope we’re going to work towards.

So, rather than just your general intelligence, your specific intelligence, really growing up, now that’s where I put my money.

[00:04:21] Michele: Yeah, I guess what STACK is missing from the point of view of the students is being so interactive and so personalised, and the ability to have a detailed conversation. While on the other side, what they may not realise that they are missing when they just copy-paste is all of the context: is the professor using that kind of language? Does the answer that the LLM provides make sense within the course where the question is embedded? And so on and so forth.

So, your idea of having an LLM that is actually embedded in STACK makes a lot of sense to me. I guess at that point we would have a deterministic part that becomes context for the LLM.

[00:05:17] David: Yeah, exactly. And this is to me where I don’t think students are missing out by copying it in because they don’t have an alternative at the moment. This is us as developers missing out on actually creating those alternatives. So if the alternatives we create are not better than them copying and pasting it into a generalised AI system, then it is to me the fault of the developers, and us as educators.

I love listening to lecturers who bemoan the fact that now, you know, students have access to LLMs, what they used to do is no longer working and saying it’s all the fault of the student. No, this is you not being imaginative enough. Don’t get me wrong, there’s real challenges because the technologies are not there. But not to see this as an opportunity, that’s the mistake. We should be able to educate better because of the advances in technology, not to go back and say, “oh, I wish for the old days when we used to be able to recite times tables, and that was how we learned about numbers”.

Actually, recognising that new technologies are not all good, they’re not all bad, technology is neutral. It’s how we use it for education, but also in society. This is what matters.

[00:06:37] Michele: Well, related to this, there’s another story stemming from the same event in Spain, again, from Andrey, he has so many good stories. He noticed that some students answered very complicated questions way too quickly, online, of course, questions that would take days or maybe weeks, they solved in a few minutes, just the time to, again, copy paste.

And so he wrote an email to those students asking if, for them, the exam or the questions were too easy, and how he could make the exam actually challenging for them. And, I mean, that was a fun idea. But at this point, we could actually seriously ask ourselves “should we be asking the same questions?” Before calculators, we had different sets of exercises in school that we fortunately don’t have anymore, because they’re not needed. So given LLMs, should we still be asking the same questions or should we change what we are teaching?

[00:07:48] David: This is a weighted question, in many ways. The answer is always “it depends”. But, I would argue that, even without LLMs, the mathematics we’re teaching is not the right mathematics we should be teaching. And once you add LLMs to the picture – and maybe SLMs in the future – then it’s so clear that there are other skills, which are going to be the determinant skills that we need students to come out with in the future.

Now, that question is a really hard question right now because we don’t know what the skills of the future are going to be. What do we need? Where are the jobs of the future going to be? These are things which are really up for grabs right now, there is just a lot of unknown.

And so, should we suddenly, right now, change the curriculum? No, because that’s not a constructive way of going about this. Should we be trying out imaginative different things to be able to figure out what a future curriculum should look like? Absolutely. Should we be envisaging a future curriculum, which is totally different? Oh, that’s exciting. But should we be reacting now, in the moment? I don’t know, that doesn’t feel right. It doesn’t feel like that’s going to be done thoughtfully or well.

There is very much a case that the first people to do it, are most likely to get the most wrong, and maybe those who follow will do something good, but maybe the first people, if they get certain things right, they’ll have an advantage. Do you want to be the first one to be radical in your change of the curriculum, or do you want to be the third or the 10th?

These are questions which I think, as societies, people should be asking themselves: not whether you should be changing the curriculum, but how soon, and how, and who should be determining that curriculum in the future?

If you want to go where the money is, should you be asking big tech what curriculums they want, who they’re wanting to employ in the future? But they’re known for laying lots of people off, suddenly. So, do you really want people in an industry which is changing so fast? Should we be looking to governments? But is that really the main employer we want to imagine in the future?

How should we be finding out what these jobs of the future and what the skills for the future should be? That’s a whole different and difficult question. I don’t have good answers to that at this point in time.

[00:10:20] Michele: Yeah, yeah. It used to be the case that you could have a long-term project of becoming a lawyer or a doctor. And, if you did well enough, you would certainly have a career. At this point, I’m not sure.

[00:10:34] David: Well, this is a really interesting question. I won’t go into lawyers, but doctors is a really interesting question. Actually, health workers, you know, maybe I hope that we have at least as many health workers in the future as we do now, and maybe more, so that seems like a good set of skills that we’d want to maintain. Maybe the role of the health workers would change, but I certainly hope that the number of health workers doesn’t suddenly diminish.

And I think the same for education. If I think about, in the future, what are the jobs that I want to see at least as many as we’ve got today? I hope there are at least as many people in education as there are today, and maybe more. So, actually, in terms of the share of the job market, health, education, these are things which I hope people care about and work in, in the future.

Of course there’s lots and lots of other things out there, and your question of lawyers, do I want there to be more or less lawyers in the future? That’s a more difficult question, I don’t know. I do want a system which is just, and so on, but whether lawyers are the best way to achieve that in the future, I don’t know. And so this is a really interesting case, maybe I do hope that there are at least as many people working in law enforcement and in the area of justice. But what role they play is an interesting question.

[00:11:57] Michele: I think I get what you’re saying.

There’s another related question here, related to the curriculum question, about motivation. And I have another story, so today is the day where I have anecdotes. I think a bit before starting with IDEMS, so this would be, say, eight months ago, I started a course on Python just because I like it, well, I thought I was going to like it. I always kind of liked coding, and it was a Harvard course. It was very well done, I really enjoyed the lessons, and then they provided very hard problems to solve that, again, would take at least a week or maybe more.

And, not being an expert, I also asked some questions to LLMs, and I quickly realised that you could just provide the question to the LLM and the whole code would appear in a matter of seconds. And of course, I mean, that’s not what I want, I want to learn and I want to have fun. But, the reason why I’m telling this story is that motivation faded away. I’m not sure why, but it felt a bit less important, or special to solve the difficult problem. And I guess that many students are having similar feelings at the moment with the problems that we provide in school, at university. I think this is going to be an issue.

[00:13:42] David: It’s a really big problem, and it is a really interesting question here, and it does relate back to where you started. Is STACK going to be needed in the future? And, if we are thinking about electronic assessment and what we’re assessing people on, should this be changing?

And I believe in finding ways to make the process motivational, and in recognising that the tools we’ve got and the skills that people need to learn are changing. I’ve always said for years that learning to code, this isn’t that important a skill. I like to go back to a guy called Conrad Wolfram, computer-based maths was his thing. He did a TED Talk over a decade ago, and one of the quotes that I like to refer to is that “the mathematics we teach is always going to be interesting”. You know, when he went to school, his school also taught ancient Greek and that was a really interesting subject. Just like the mathematics we currently teach is going to remain an interesting subject, stimulating for the students who want to do it.

But is it the mass subject we want everyone to learn? Is that really what we want this huge human effort to go into? That some people will learn this, absolutely. That some people will want to go through, even though the machines can do it, to learn how to do it themselves, because it’s a stimulating skill, it can be fun and so on. But is it the skill that we want to have everybody learn? Probably not.

Just like we no longer want ancient Greek to be the subject that everyone does. Doesn’t mean some people shouldn’t learn ancient Greek, of course. But it isn’t necessarily the subject for the masses. What is that subject or what are those subjects for the masses and how are they going to be different? Those are the interesting questions that I think we can really ask and try to rethink in our education systems.

We can maybe have much more specialisation coming on much earlier on. But, what is it that we want every member of society to know? Because in almost every society in the world, not every society, but almost every society, schooling is compulsory up to a certain age. What is it that we’re wanting everyone to learn?

Now, of course, where we started with the STACK questions, this is university level mathematics, this is already not everybody, it’s people who are signing up and going into a specialised route. And the fact that we want them to have a certain motivation to engage in that in certain ways, and there to be affordances, that’s different, because they’ve already chosen the direction they’re wanting to go.

But, mass education, compulsory education, these are very different questions. These are very different subjects. There’s some interesting work happening and opportunities there. Again, I think we’re asking the right questions, and we’re not providing answers. But to come back to the original question of, you know, is something like STACK needed? I believe we need something deterministic inside the stochastic thing, which is the LLMs.

And stochastic just means random. So it’s this element of combining things that can be controlled and can be made structured and rigid, that’s the deterministic layer, and then your randomness is what enables you to have a more human-like conversation, to be able to deal with the unknown, and so on.

So having that combination of stochastic and deterministic, to me, that’s the systems that we need to be building if we’re really wanting to engage people in the future in education.
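The idea of a deterministic layer sandwiched within a language model can be sketched in code. What follows is a minimal illustration under stated assumptions, not STACK's actual implementation: a toy response tree classifies the student's answer deterministically, and its outcome is then handed to a language model as context for the conversation. All names here (`Outcome`, `Node`, `evaluate`, `build_prompt`, `respond`, `echo_model`) are hypothetical, the example question is invented, and the language model is a stand-in function rather than a real LLM or SLM.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Union


@dataclass
class Outcome:
    """Leaf of the toy tree: a score, a mistake tag, and fixed feedback."""
    score: float
    tag: str
    feedback: str


@dataclass
class Node:
    """One deterministic test; each branch is an Outcome or another Node."""
    test: Callable[[Fraction], bool]
    if_true: Union[Outcome, "Node"]
    if_false: Union[Outcome, "Node"]


def evaluate(tree: Union[Outcome, Node], answer: Fraction) -> Outcome:
    """Walk the tree until a leaf is reached. No randomness is involved."""
    current = tree
    while isinstance(current, Node):
        current = current.if_true if current.test(answer) else current.if_false
    return current


# Hypothetical question with one anticipated mistake.
QUESTION = "Compute 3/4 + 1/6."
TREE = Node(
    test=lambda a: a == Fraction(11, 12),
    if_true=Outcome(1.0, "correct", "Correct."),
    if_false=Node(
        # Anticipated mistake: adding numerators and denominators separately.
        test=lambda a: a == Fraction(3 + 1, 4 + 6),
        if_true=Outcome(0.0, "added_tops_and_bottoms",
                        "Numerators and denominators were added separately."),
        if_false=Outcome(0.0, "other",
                         "Not correct. Try a common denominator."),
    ),
)


def build_prompt(question: str, answer: Fraction, outcome: Outcome,
                 student_message: str) -> str:
    """Turn the deterministic result into context for a language model."""
    return (
        f"Question: {question}\n"
        f"Student answer: {answer}\n"
        f"Diagnosed outcome: {outcome.tag} (score {outcome.score})\n"
        f"Author feedback: {outcome.feedback}\n"
        f"Student says: {student_message}\n"
        "Reply conversationally, consistent with the diagnosis above."
    )


def respond(question: str, answer: Fraction, student_message: str,
            language_model: Callable[[str], str]) -> str:
    """Deterministic check first, then the stochastic conversational layer."""
    outcome = evaluate(TREE, answer)
    return language_model(build_prompt(question, answer, outcome, student_message))


def echo_model(prompt: str) -> str:
    """Stand-in for a real language model: it just returns the prompt it saw."""
    return "MODEL REPLY (stub) for prompt:\n" + prompt


if __name__ == "__main__":
    print(respond(QUESTION, Fraction(2, 5), "Why is this wrong?", echo_model))
```

In this sketch the marking and the diagnosis of the mistake never depend on the model; the model only shapes how that diagnosis is discussed with the student.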

[00:17:39] Michele: Well, I have to say that I really like this idea of STACK being the deterministic core of a more complex technology that also includes AI. And exactly how that will or could take shape, that’s also an interesting question, maybe for another time.

[00:18:00] David: Absolutely. And I know it’s something that you are keen to think about, you’ve been doing this for authors, STACK authors, people creating the assessment. But I think the new thing here is to think about it from the user’s side, the actual student, the learners themselves, and actually think how does their experience get enhanced?

[00:18:19] Michele: Yes, we are gonna see what we can provide.

[00:18:24] David: Well, we’ll probably get a, you know, let’s hope we can get to a prototype in the next sort of few months or a year or so.

[00:18:31] Michele: I’m pretty sure we can. And for the moment, this has been a very nice and broad conversation. Thank you very much, David.

[00:18:42] David: No, thank you. I know this is a topic you’re very interested in and you are engaging in ways where you are actually, you are in a place to drive this forward, so thank you for engaging me in this conversation.