Description
George Simmons and David Stern discuss the development of educational courses with a focus on tool agnosticism, particularly in their collaboration with the Open University of Kenya for their MSc in Mathematical Innovation course. They explore the challenges and benefits of assessing student work without being tied to specific tools, and highlight the importance of good question design in ensuring conceptual understanding. They consider the evolving nature of technology in education and the increasing role of AI, emphasising the need for students to adapt to multiple tools rather than mastering just one.
[00:00:07] George: Hello and welcome back to the IDEMS Podcast. I’m George Simmons and I’m joined again by David Stern, our founder and director. Hello David.
[00:00:15] David: Hi, George.
[00:00:16] George: Today, so we’re following on from a previous conversation we had about the development work we’re doing with the Open University of Kenya on their MSc in Mathematical Innovation course.
And this is really the first time I’ve been thrown in with the job of helping develop one of the courses, or multiple; I’m currently in the process of developing the first one, entitled Introduction to Systems Modelling. And in our previous conversation we went through a bit about what that entails and how it’s structured.
In this conversation, I guess I want to get a little bit more practical into the delivery of the course. And my question really starts with other courses that IDEMS have been involved in developing, where we really want to instill this idea of the delivery or the content being tool agnostic in some measure.
So this is exemplified in some of our courses, Introduction to Working with Data, and I suppose in some of our AI-related courses I can’t name, maybe you can fill in. But when we have a concept in data, like data frames, spreadsheets, operations on them, we always try to offer content that can be achieved in your choice of data manipulation software, whether that’s R-Instat, R, Excel or Google Sheets.
And the assessment of that when we try and assess is really on the outputs. It’s a: here’s a data frame, analyze it, sort it by whatever, and find out the most common or the best or whatever measure. And then we assess based on that. And the analogy here with the modelling course we want to do is, hopefully try and get to that same sort of tool agnosticism. There are differential equations, but there are many systems in which differential equations can be input, solved, analyzed, graphed, whatever, Mathematica, Python, Julia, name as many as you want.
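[As a hypothetical aside, here is what that tool-agnostic assessment could look like in one of the tools George names. The model, parameter values and variable names below are invented for illustration; the point is that only the solution values, not the code, would be assessed, so the same model could equally be written in R, Julia or Mathematica.]

```python
# Sketch: one differential equation, many possible tools. Logistic growth
# dN/dt = r*N*(1 - N/K), solved here with SciPy; only the outputs matter.
from scipy.integrate import solve_ivp

def logistic(t, N, r, K):
    return [r * N[0] * (1 - N[0] / K)]

sol = solve_ivp(logistic, (0.0, 10.0), [1.0], args=(0.5, 100.0),
                t_eval=[0.0, 5.0, 10.0], rtol=1e-8)
final_population = sol.y[0][-1]  # comparable against a reference solution
```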
I see a real benefit in our course having that same kind of agnosticism in the tools. But the question that occurred to me while putting this together is that the assessment we would do there would again be: implement the model, maybe show us your output, or give us the data frame of the simulation output, and we can somehow check that against a reference that we have.
At no point in that process is there then an assessment of how you implemented the model, how you got to the answer. And I suppose the analogy back to data is if you’re just focused on outputs, how do you assess, or is there ever a time where you need to assess how someone got to the answer? I guess in mathematics exams, there’s always this emphasis on show your work. You get one mark for the answer, but you can get four for the working through it.
So I think that leads into my question in both the data context and the modelling context is, if we place so much importance on being agnostic to how people interact with the course, a) how can we be sure that they’re building in the right way just because their outputs are okay, and b) if we do decide it’s important to assess people on how they’ve built their model or done their analysis, how do you enable that for the facilitators of the course itself who may not know Mathematica or Julia?
[00:04:04] David: It is a really good question. This has not come overnight. This approach of actually trying to be software agnostic has emerged over the last 15 years for me, where at the beginning I guess I fell into the trap that I feel many educators fall into, that instead of teaching the concepts, I was teaching the tools.
Now if you are not focused on being software agnostic it is so easy to fall into the trap of trying to teach people how to use a tool to do what you want to do instead of actually getting them to be thinking, reflecting, to be understanding, to be building the conceptual understanding. And actually I go back 12, 13 years to Conrad Wolfram and these presentations he was giving where he was arguing that there is a unique opportunity right now to not only teach the skills better, but also to teach the concepts better. And normally you have to choose between one or the other.
But his argument was that we are in this unique position where right now, for the mathematical sciences, if we use technology well, we should be able to teach both better skills and a better conceptual understanding. That’s the insight which changed my perceptions over a decade ago, and it’s been hard work.
And I think it’s only really now we are getting to the stage where I think we are starting to understand how to do this well. I want to emphasize the fact that I don’t believe we are there yet. This is work which is ongoing. I do believe it’s work which we are, certainly in the data areas, we are at the forefront of what’s happening internationally on this.
I’ve been part of the statistics education community for quite a long time. I’ve been inspired by the best practices that are happening all over the world, and I know that other people are struggling with these same challenges that we’re struggling with, and we are up to date with what they’re doing. I’m less up to date in the modelling world. In the data world, I want to explicitly focus on this problem: that if we are only looking at the result, we are missing the how. And while that is potentially true, really good question design can change that.
So if you have a good understanding of what the different ways are that people might approach a particular problem, then if there are subtleties as to what you should do, and I’ll give a very simple example working with data. What about if there are missing values? How should you treat your missing values? Should you be treating them as zero or should you be ignoring them?
So we have a question which is beautifully designed where there are actually two types of missing values that haven’t been distinguished, and some of them should be treated as zero and some of them should be ignored. And so now it doesn’t matter what tool you are using: the results you’ll get if you ignore all of them, or if you treat all of them as zero, will be wrong in specific ways.
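[A hypothetical illustration of the kind of question David describes, sketched in Python with pandas. The rainfall scenario, column names and values are invented: a blank can mean the gauge was read and empty (no rain, so treat as zero) or was never read (unknown, so ignore).]

```python
# Two kinds of missing value in one column, distinguished by a flag.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "rain_mm": [5.0, np.nan, np.nan, 3.0],
    "gauge_read": [True, True, False, True],
})

naive_ignore = df["rain_mm"].mean()          # ignores every NaN
naive_zero = df["rain_mm"].fillna(0).mean()  # zeroes every NaN
# Correct: zero only the read-but-empty entry, ignore the unread one
read_but_empty = df["rain_mm"].isna() & df["gauge_read"]
correct = df["rain_mm"].mask(read_but_empty, 0.0).mean()
```

All three answers differ, so each wrong approach is detectable from the output alone, whatever tool produced it.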
And this is where STACK, one of the tools that we use, is fantastic at then giving automated feedback related to that. If it can identify not only that you have got this wrong, but that you’ve got it wrong in this very specific way, which you would only get if you hadn’t thought about missing values correctly, you can give tailored feedback.
Let me just dig into this, because this is critical: being able to put that effort into the question design rather than the feedback means it doesn’t matter if the lecturer doesn’t know the tools that the students used. And this is a design principle that I needed to design for, because I know in the context I was working in, it is unreasonable to ask the lecturer to know all the tools that might be useful for the students to use. But we want the students to gain these skills.
And so putting that work into the question design, rather than into responding to what people have done, is one way to achieve that. And it’s hard work. But it is something which we have some experience of being really effective.
[00:09:22] George: That really resonates with the idea of being able to judge a question or a method based on your experience of what could go wrong or how people could attack it in different ways. And I think maybe a good analogy is things like LeetCode, a big set of programming-based problems where the system is built so that it doesn’t matter what language you use to complete them.
And the way the assessment’s set up is essentially a load of test cases, which test, from the question setter’s perspective, those edge cases which, if you didn’t do it correctly, would come up as failures on the tests.
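[A minimal sketch of this output-only grading idea. Everything here, inputs, expected answers and the `grade` helper, is invented for illustration: the grader never sees the student’s code, only the outputs they submit for reference inputs chosen to expose likely mistakes.]

```python
# Language-agnostic grading: score submitted outputs against reference cases.
REFERENCE_CASES = {
    (1, 2, 3): 6,
    (0, 0, 0): 0,    # edge case: all zeros
    (-1, 1, 0): 0,   # edge case: values that cancel
}

def grade(submitted):
    """Score outputs produced in whatever language or tool the student chose."""
    passed = sum(submitted.get(inp) == out
                 for inp, out in REFERENCE_CASES.items())
    return passed, len(REFERENCE_CASES)
```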
[00:10:09] David: You are exposing the fact that, as with almost everything else, I don’t have an original thought. I was a computer programmer in the past, so I was aware of test-based development. And I was thinking exactly those ideas are the ones we need to bring into maths education. So it’s not that I have an original thought on these things; it’s that there are these wonderful examples where these approaches have been shown to be successful in other contexts and are really powerful.
[00:10:40] George: Yeah. And I suppose one of the answers to my question of how we do this for models, as you indicated, is that we’re probably not there yet on the technology, but we can at least come up with some proxy: if your model’s built correctly, then with these initial inputs or these parameters, you should get a specific answer.
[00:11:02] David: Exactly. And more than that, you can say: if it is built incorrectly in certain ways, with these inputs you will get outputs which differ in this specific way. And therefore we can identify what you’ve done or what you haven’t done. Thinking about this as test-based development, which is broadly what it is, it’s sort of exposing those tests in a way that they become assessment. It requires hard work at the beginning.
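[A hypothetical sketch of that tailored, tool-agnostic feedback, again using invented parameter values and the logistic model dN/dt = r·N·(1 − N/K). If a student drops the carrying-capacity term, their trajectory grows exponentially and overshoots K; that signature alone, visible in the output, identifies the specific mistake.]

```python
# Diagnose a specific misbuild from the final output value alone.
import math

r, K, N0, t_end = 0.5, 100.0, 1.0, 20.0
expected = K / (1 + (K / N0 - 1) * math.exp(-r * t_end))  # reference answer

def feedback(n_final):
    if abs(n_final - expected) < 1.0:
        return "correct"
    if n_final > K:  # starting from N0 < K, only an unbounded model exceeds K
        return "did you forget the carrying-capacity term?"
    return "incorrect, but not in a recognised way"
```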
Let me just take a step further. It is absolutely conceivable that you don’t want people to start building their models from scratch, and then you need to provide them a model. And if you’re providing them a model, surely you need to provide it in a language, in an implementation. But you don’t need to provide it through just one implementation. Again, the work shifts from the end to the beginning.
So now you actually need to go through the work of developing it in multiple contexts. And you don’t need to be, this is one of the deep insights again, you don’t need to be comprehensive from the start. There might be 10 different languages that would be sensible to have. Start with two. Just don’t start with one.
And if you’ve got two, then over time a student can say, but I wanted to do it in this, and I’ve actually built my own model in it because I already knew how to use this tool. And you say, great, this is all open educational content, do you want to help me take what you’ve done and turn it into educational resources for next time? And now the students actually help you get to the full ten over the years. You’re not trying to build everything in your first attempt.
[00:12:46] George: That’s a really nice way of putting it. And we’ll pretend that I haven’t just done one implementation for now, but we’ll get there. But yeah, I really see how that’s a beneficial way of looking at this.
And I think it does answer the part b) I had at the beginning: if you decide that the methodology’s important, it’s not that the facilitator who is assessing needs experience in all of those technologies, they just need to write good questions which are applicable to any of them.
[00:13:26] David: And of course the big advantage is there’s a big difference between the content creation responsibility and the facilitation responsibility. Because in content creation, you don’t need one person to hold everything, you can have collaboration, you can have multiple people coming in. You’ve mentioned the Introduction to Data course, and the reason you mentioned the Introduction to Data course is you’ve been asked to do the Python part of that.
And so this is exactly the point, the people who are authoring that focused on R, that was their comfort zone and so on. And they brought you in to say could we also do this in Python? You don’t need to have one person holding all the knowledge. You can have that knowledge coming in through collaboration, which is the best way to work, and especially if you’re developing open educational resources.
[00:14:16] George: And the idea that it doesn’t just have to be the senior lecturers who are participating in that collaboration. It can be the students who are helping drive things forward as well. I think that’s a really lovely place to, to maybe finish this discussion.
[00:14:30] David: Maybe just before we finish, the thing which is so important about this: it isn’t just that if you teach with a single tool you might mistake teaching the tool for teaching the concepts. And that happens all the time. There are so many courses out there, ‘Analysis in R’, ‘Analyzing in R’, whatever it is, which are really just an R course. It’s not teaching you about analysis.
That’s what I mentioned at the beginning, but there is another really much more important concept, which I believe is essential if we are looking at educating the future generations. Technology is moving really fast. The tools you are going to teach people today are unlikely to be the tools that they will need to use tomorrow.
So we shouldn’t be focusing on the tool, and this is why I say at least two. If you just know how to do things in R, that’s it. You’re done. You’re stuck. But if you see, I do things in R sometimes, I do things in Python sometimes, I actually use Stata because it’s really nice and I’ve got it, and so on, then you’ve got multiple ways of doing the same analysis. You might find there’s certain analysis you like doing in one tool, other analysis you like doing in another tool, and other analysis which is better done in a third tool. And then adding another tool is not a big deal.
Actually, the tool isn’t what’s important because, for the data analysis and for the modelling, actually the things you need to do are broadly the same, independent of the tool. And so actually, we should be encouraging and enabling students to use multiple tools. And very concretely on this, I believe this is an urgent thing right now because of what AI is doing to coding.
A few years ago everybody was obsessed with teaching coding. And what AI has recently shown is that writing code isn’t the important skill. You can actually ask ChatGPT or any of the other large language models to write the code for you, and they’re really pretty good at it. It’s not an efficient thing to become an expert at, because you are not going to be better at it than the AI you can use. It is not the right skill to put that emphasis on.
And therefore, the language, the interpretation between languages, the ability to understand, the ability to read code, that’s extremely important and much more important than the ability to write code. And just like any other reading, the more languages you can read in, the more you have access to information.
So if instead of focusing on people writing code, we focused on them reading code in multiple languages, that would be a much better education we’d be giving. Preparing people for the future, not the past. This is all, this is why this sort of simple thing is so important and it is central to everything we are doing, to everything we are developing.
It’s something which I’ve pushed pretty hard for, and I am now not the one pushing for it. My guess is that this actually came to you through Lily. She pushes for this more than I do now. It’s great.
[00:18:12] George: I can’t name my sources David.
[00:18:18] David: So I cannot overstate the importance I believe of this idea of being software or tool agnostic. I think it’s central to educating a future generation that will have tools that we cannot imagine today. That’s what they need.
[00:18:38] George: Yeah. And I suppose just one final thought on the AI as well, especially in the context of data and modelling, of both those courses. Almost certainly now, and definitely in the next couple of years, the tasks we give will be able to be fed to ChatGPT and it will produce the code and everything.
I suppose, by setting good questions, you still retain that ability to test that the student has understood what they’ve prompted and what the AI has outputted, that they still understand those fringe cases or those different missing values in the data. They can still prompt the AI to start solving, that’s fine, but what matters is that you’re testing the conceptual understanding of the scenario.
[00:19:30] David: And more importantly, the ability to be able to use a tool, even if that tool is just give this to the AI, is the AI doing the right thing based on what you have given it? That’s maybe the question in the future, which is very different from having to write the code in R or do it in Python or to use a front end like SPSS or R-Instat or whatever it may be. If you give it to AI and AI isn’t getting you the right answer, that’s a great education. And if you give it to AI and it is giving you the right answers you don’t need to worry about it.
You have the ability to answer that question. This is where it comes back to really good question design. Because if everything you give to AI, it just gets right, great, you’ve got good AI, you don’t need to do anything else. But my experience is, something like missing values, there is no AI system at the moment which would correctly answer this missing value question.
It’s just impossible to conceive because the reason that you should replace some by zero and some by missing is because of other data which is there, and the interpretation of the whole row of data, there’s no AI system which is built with that sort of complexity at this point in time. And yet, any student interacting with it, understanding the data well, has everything they need to answer it correctly.
[00:21:02] George: To ask the right prompts to interpret it. Everything.
[00:21:05] David: Exactly. But for an AI system, it would be a sequence of prompts where you’d need to first of all actually deal with the data cleaning steps, and then, once the data is in the right shape, you do your analysis. If you do your data analysis directly on the original data, then the AI system can’t get it right, because it can’t know how to clean the data. Even if you say, can you clean this data for me? How would it know? You actually have to understand it.
[00:21:36] George: That’s a really fascinating outlook. And I love that we tied back to good question design being really the keystone in how we solve that initial trade-off I guess we had at the start of my question.
So this has been a really fascinating conversation, David, and I’m really grateful we had it. Now I can go on to design everything.
[00:21:58] David: And in particular, you might want to ask somebody for some Julia code to do what you’re doing in Python.
[00:22:03] George: Yes. Very good.
Thank you very much, David. We’ll speak again soon.
[00:22:11] David: Thank you.

