094 – Is Data Maths?

The IDEMS Podcast

Description

Lily and David consider the complex and often contentious topic of whether data and statistics can be considered a subset of mathematics. The discussion touches on the historical separation of statistics from mathematics and its implications for the field, and the rise of data science and computer science. David proposes the radical idea that the current mathematics curriculum, heavily centred on calculation, should pivot to emphasise data interpretation, advocating for a future where data literacy is foundational in education.

[00:00:00] Lily: Hello, and welcome to the IDEMS podcast. I’m Lily Clements, a Data Scientist, and I’m here with David Stern, a founding director of IDEMS. Hi, David.

[00:00:15] David: Hi, Lily. Looking forward to a discussion. What are we discussing today?

[00:00:19] Lily: Today we’ll discuss, is data maths?

[00:00:22] David: Oh, oh, this is a contentious topic.

[00:00:26] Lily: Yeah, well, I know that we finished the last podcast that me and you did, and I said, you know, this kind of leads me to the question, is data maths? And I don’t think that we can cover that quickly. Even now I don’t know how we can cover it in a whole podcast, but I do know that when I first met, or the first conference that I went to with you in 2017, it was a massive conversation there at the International Statistics Conference.

[00:00:54] David: The World Statistics Congress.

[00:00:55] Lily: That’s the one, the World Statistics Congress. And it was to the point that someone even gave a talk on something really interesting and in their talk said, as a throwaway comment, statistics, a subset of maths, and then that was everything that the conversation was on. So that for me is my first experience I guess of this contentious topic. I assume that’s what you mean by it’s contentious.

[00:01:19] David: Yes, so let me give you a bit of the history behind why it’s so contentious.

[00:01:24] Lily: Great.

[00:01:24] David: Historically, statistics was considered part of maths and you would have it within the mathematics departments. And there was a concerted effort by my father’s generation, around the 70s, for statistics to break free of its mathematical origin.

[00:01:43] Lily: Your father being a statistician.

[00:01:46] David: My father’s a statistician and was part of one of the first applied statistics departments that was actually housed in agriculture rather than being housed in mathematics. And this was a big deal and they were really successful at the time, they were a biometry department. They focused on statistics applied to the biological sciences. And that independence from mathematics was really important because it allowed them as a department to value applied statistics, you know, data from a side of application more than the theoretical foundations of statistics, which are much more mathematical.

[00:02:27] Lily: I see.

[00:02:28] David: This is 50 years ago.

[00:02:29] Lily: Sure, and like with physics, I guess, you have your applied part and your mathematical part more?

[00:02:35] David: Well, physics is very different and that story goes back much further. You can trace that story back to Einstein, actually, where the applied physicists actually won. But anyway, that’s a whole different story. We can have that one as another episode at some point.

[00:02:49] Lily: Sorry.

[00:02:49] David: But let’s get back to the sort of data. And the key point is, of course, that, part of the reason this is such a sensitive topic is that even now, statistics departments being within maths or separate is a contentious issue. The arrival of data science has made it more confusing. Are they part of statistics? Are they under mathematics? Are they under computer science? Are they their own thing as well? So the interplays between statistics, mathematics, data science, computer science, there’s no way to pose an opinion which everyone will agree with because there are people who have fundamentally opposing views.

What I will say, and you’ve got to bear in mind that my background is mathematics, but what I will say is that considering statistics as mathematics has certainly done it a disfavour because, really, if you’re wanting to start with data, then actually the mathematics is just a tool, it’s a language, it’s not what you should be studying in my opinion.

Now I’m a mathematician, I love studying maths, I love doing maths for maths sake, but that’s not what happens when you start with data. When you start with data, you want to use the tools that are there, you don’t want to be constrained to be thinking abstractly and to pursue the abstractions, which is what mathematics does so fantastically.

So I understand, and I suppose it’s borne out of years discussing with my father, who is an applied statistician, and I’m very much now a sort of data scientist, somebody who works with data a lot, I distinguish it from my mathematical foundations. And so I absolutely understand and really support this idea that you should be thinking of data as having a different foundation from just mathematics.

However, I do have a controversial view about mathematics.

[00:05:08] Lily: Okay.

[00:05:09] David: Which is something I’d love to dig into more, which will probably be another episode, but which relates to this, which is, there was a movement about over ten years ago now, where Conrad Wolfram put forward his philosophy for computer based maths, where he proposed that it is possible to teach maths both more conceptually and more practically at the same time by using computers and technology. And I bought into this and I went to all the events and conferences and I really got sucked into trying to say, well, how could we make this happen?

And I have to confess, I was rather disappointed by the outcomes. In that time when I was deeply engaging with this, I reached a conclusion which surprised me and did not gain traction. And the conclusion I reached was that if you want to really pursue computer based maths as something which could be taught in a way which is both more practical and more abstract, then you have to replace the core concept of maths education, which is currently calculation, with something which enables you to have a similar level of progression. That was the conclusion I reached.

[00:06:42] Lily: Okay. And what do you mean by this? Sorry.

[00:06:45] David: So if you think about calculations, from an early age, you learn a bit about calculations, and then as you progress, you learn more and more calculations. You go from being able to understand numbers, maybe count them, to being able to maybe add them or subtract them. Maybe then you can start to multiply them together, and then you can start dividing. And then you can start getting on to, you know, all sorts of other expressions and formulae, and maybe you can start to do calculus and things like this. In many education systems, particularly in the US, calculus is the height of high school education.

And yet, I can explain to my five year old and seven year old the fundamental ideas of calculus, which is how things change. I wouldn’t teach them about derivatives in the same way, but the ideas of how things change is very natural. Discussing speed is very natural with a five year old. And this is where computer based maths would sort of argue you can teach these concepts much earlier and in different ways.

And one of the problems that was there is that observation is all fine and good, but if you want to build a curriculum you need progression. And calculation gives you a very natural progression where actually students do get taken on a journey where their mathematical knowledge and their understanding progresses in a way which is very constructive towards building up, layering up, understanding and concepts.

My hypothesis, which I put forward now over 10 years ago, was that if you want to really pursue computer based maths, then you need to replace calculations with data, and you need to put data at the heart of your curriculum.

[00:08:45] Lily: I need a second to kind of absorb that. Okay. Okay, so you need to replace calculations with data.

[00:08:58] David: So, as a progression, to get a progression of mathematical concepts.

[00:09:03] Lily: And at what level? I mean, I was speaking to Kate, who also does podcasts with you, and she said to me that her problem with maths is always that, you know, you’re taught things in context in other subjects, you’re not taught just a sentence out of context in another subject. Whereas with maths you’re taught adding and things out of context. Is that sort of what you mean of having…?

[00:09:25] David: Well, in some sense, yes. I believe that through starting with data, and it could be data about all sorts of things that people are interested in, you could get to the concept of, well, actually addition is a pretty useful thing to be able to do. Or multiplication could be got as an abstraction of something you need to do when you’re looking at certain types of data and you’re wanting to understand it in certain ways.

In fact, my claim at the time was that actually you could cover the whole current mathematical curriculum and more starting with data and then abstracting out based on things that would be useful to find out and actually having a need.

[00:10:12] Lily: Can we step back a bit and then just define data in this?

[00:10:16] David: Yeah.

[00:10:16] Lily: To me, I guess I’m thinking of a data set, like an Excel spreadsheet.

[00:10:20] David: Yeah.

[00:10:21] Lily: Which I think is something that can just come to mind, but a quite classic one that you’re given as a child is Fred buys four apples and Tim buys seven apples. How many apples do they have?

[00:10:32] David: Yeah, but that’s not interesting data. Those are false problems in some sense.

[00:10:36] Lily: Okay.

[00:10:36] David: No, I’m thinking of data as being much simpler. Let’s try and understand how many children are in this class. Let’s understand how many children are in the other class. How many children do we have as a whole? Now, the children in the class, well that’s data. You know, that’s just the number of observations. You could get information about those children. You could have their heights, you could have other things. And you could then think about the fact that, okay, now, just by having a look and thinking about that as data, well, we’re collecting information about the children, then we’re collecting information about the classes.

You know, you could get to multi-level data by age six. We were talking the other day about the fact that you didn’t get to multi-level data much as an undergraduate.

[00:11:21] Lily: Or postgrad.

[00:11:22] David: Or postgraduate. And this is the thing, let’s teach at age six to have data…

[00:11:28] Lily: Of a statistics degree as well.

[00:11:30] David: Yeah, exactly. Yeah, we should be teaching this to six year olds because it’s so natural a concept to have information about, well, you have information about the children in the class, and you have information about the classes themselves. Who’s teaching the class? That’s information about the class, that’s not information about the children.

But it applies to the children, because if you know who’s teaching the class and you know which class they’re in, then you know who their teacher is. And this is something where thinking about data and the relationships between data, why aren’t we doing this to our kids at a young age? And why don’t we then think about that and how that enables us to rebuild maths education as a whole.
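As an illustrative aside, the two levels of data described here can be sketched in a few lines of Python. This is only a minimal sketch: the class names, teacher names, heights and the helper functions `teacher_of` and `class_sizes` are all hypothetical, invented for the illustration rather than taken from the conversation.

```python
# Hypothetical class-level data: one record per class.
classes = {
    "Class A": {"teacher": "Ms Ada"},
    "Class B": {"teacher": "Mr Ben"},
}

# Hypothetical child-level data: one record per child,
# including which class the child is in.
children = [
    {"name": "Child 1", "class": "Class A", "height_cm": 112},
    {"name": "Child 2", "class": "Class A", "height_cm": 118},
    {"name": "Child 3", "class": "Class B", "height_cm": 121},
]


def teacher_of(child):
    """Find a child's teacher by going through the class they belong to."""
    return classes[child["class"]]["teacher"]


def class_sizes(child_records):
    """Count the children in each class (the number of observations per class)."""
    counts = {}
    for child in child_records:
        counts[child["class"]] = counts.get(child["class"], 0) + 1
    return counts


sizes = class_sizes(children)
total = sum(sizes.values())  # addition arises from asking "how many as a whole?"

for child in children:
    print(child["name"], "is taught by", teacher_of(child))
print("Children per class:", sizes, "Total:", total)
```

The teacher is stored once, at the class level, yet it can be recovered for any child through the link between the two levels, which is the relationship being described.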

Now this is really challenging for me as a mathematician because I didn’t learn maths that way. In fact, I remember taking a statistics class, first year of undergraduate, and thinking, oh no, I don’t like this way of doing it. I’d grown up around statistics. I knew quite a bit of statistics, but I really didn’t engage well with being taught what was a very good practical course. I liked abstraction. So it’s hard for me to actually suggest something which for me as an individual might not have worked as well because I love abstraction. But just because it doesn’t work for me doesn’t mean it’s not better for society.

[00:12:59] Lily: This is a really interesting idea. I don’t think it’s easy for you to convince me of things anyway. But does this build out to other levels? So we talked about addition and stuff, but what about fractions, or those kinds of less natural things?

[00:13:18] David: There’s nothing which is not natural. That’s the power of mathematics. Mathematics is abstraction. The height of the high school curriculum in mathematics almost everywhere is calculus. Look at that in terms of data, and if you’ve got a time series, this relates to rates of change. And so this is something where it’s so natural to think of this in terms of actually calculus, how this comes out.
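As an illustrative aside, the link between a time series and rates of change can be sketched in Python. This is a minimal sketch under stated assumptions: the readings are made-up distances at one-second intervals, and `rates_of_change` is a hypothetical helper that computes the average rate between consecutive observations, which is the idea that a derivative later makes exact.

```python
# Hypothetical time series: distance travelled, recorded once per second.
times_s = [0, 1, 2, 3, 4]
distances_m = [0, 2, 6, 12, 20]


def rates_of_change(times, values):
    """Average rate of change between each pair of consecutive observations."""
    rates = []
    for i in range(1, len(times)):
        change_in_value = values[i] - values[i - 1]
        change_in_time = times[i] - times[i - 1]
        rates.append(change_in_value / change_in_time)
    return rates


speeds_m_per_s = rates_of_change(times_s, distances_m)
print(speeds_m_per_s)  # [2.0, 4.0, 6.0, 8.0]
```

With these made-up numbers the speed rises by the same amount each second, so applying the same function to the speeds would give a constant rate, which is one way "how things change" can be explored directly from data.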

[00:13:42] Lily: But so, from what you’re saying, we’re kind of building a course around data.

[00:13:50] David: It’s not building a course around data, this is the point. Instead of having the mathematics curriculum defined by children’s ability to calculate, if we assume computer access, we could have the whole curriculum and all the concepts we want to teach defined about the ability to work with data and understand and interpret data.

Don’t get me wrong, this is hard. New Zealand, we’ve mentioned this before, is the only country to have introduced data from the first year of primary. And one of the problems they found was that they didn’t have enough of a progression in how the data curriculum evolved. And that meant that actually the learning wasn’t necessarily as deep as it should have been because people thought, oh, I’ve seen this before.

And they didn’t actually progress as much as they should from one year to the next because the progression wasn’t clearly defined. This is the power of the current mathematics curriculum around calculation: your year to year progression is really simple. You keep learning new tools, new things, new tricks. In some sense they’re not tricks, because if it’s maths, it’s the right way to do it. But there are sort of easier ways to think about it.

I’m not saying this is easy, and I’m certainly not saying I know how to do this. But as a thought experiment, once I started engaging deeply with the fact that computer based maths could transform how we learn mathematics, I became convinced that the only way to really do this properly would be to rethink the whole maths curriculum from first year of education all the way up, based on data playing a central role, and the mathematics really emerging as abstractions, a language to help you and to help your data literacy journey.

[00:16:18] Lily: Interesting. I can definitely see where certain groups or certain people would, well, I can definitely see the controversies.

[00:16:27] David: Oh yeah.

[00:16:29] Lily: I suppose you’re saying this was your thought process kind of 10 years ago and when Conrad Wolfram was talking about computer based maths. What are the thoughts today on computer based maths?

[00:16:42] David: Globally, I’ve been following as much as I can, and my conclusion is nobody’s getting anywhere. And that’s not quite fair. There have been little bits of progress here and there, Mosaic is a wonderful model. There have been some wonderful projects. So, don’t get me wrong. But nothing very ambitious, as I would put it.

Computer based maths was supposed to be this big, game changing, ambitious thing, and it wasn’t. And my understanding is that the reason it wasn’t is that it was lacking a deep insight. It’s over 10 years since I first had this crazy idea that we should be putting data at the heart of maths, and I’m afraid I’m more and more convinced.

Everything I’ve seen since has done nothing but convince me. If we could do this, this could be transformative. But also all those interactions have convinced me that I don’t see how this is going to happen. I would love to be in a position to actually try some of this out, but it’s such a big task, and it’s so controversial, as you say.

And what was really surprising to me is, I thought, you know, statisticians and data scientists would love this idea. They won’t touch it with a bargepole, because they’re scared of maths. For many of them, mathematicians are the pinnacle. And mathematicians won’t touch this with a bargepole because, well, like myself, they loved the calculations. They thrived on that. So, you know, it’s an idea which is so difficult. I don’t see how this could really be taken up.

But it’s a really simple and really powerful idea. If we assume from very early on children have access to tools, I still like Dragon Box, which is a game which actually helps teach bits of algebra using just visual play if you want, and actually, really, I think the way they’re doing it is broadly that play is very data focused. You’re getting information, you’re sort of looking at things, and then you’re drawing conclusions based on them.

It’s a powerful approach to be able to say, well, okay, let’s use our ability to collect data and then make observations about that data and interpret things, try to identify when things are the same and when they’re different. Apples and oranges are different, but five apples and five oranges are both five. As data, you’re saying that the quantity is the same. Now, you might find that five apples weigh the same as three oranges. They might be big oranges and small apples, who knows.

But if you did, then the weight of five apples and the weight of three oranges could be the same, even though the number is different. And this is as data, as analysing data, this is absolutely natural. But when do we teach this in terms of maths? These concepts, what sort of literacy would we be able to get at scale if we could actually get this idea of data and the idea of actually collecting data on things central to how we then build up our understanding of equality, equations, what it means when things are equal.

And then you get to the subtleties, maybe they’re equal in one way but different in another. You could get to those concepts which I know PhD mathematicians who struggle with that concept, but it’s so important in real life. This idea, in general, equality is not equity. Well, what does it mean? What does equity mean compared to equality?

Those concepts are societally so important, and they are deeply mathematical. But most mathematicians I know have isolated themselves from those realities of worrying about these sorts of things. You sort of put yourself into a narrow discipline. Whereas I believe there is an opportunity to not only teach mathematics in a way which is more powerful, I believe we could teach mathematical concepts much earlier on. Calculus would no longer be an end of school thing, it would be taught in primary school, in sensible ways.

And yet at the same time, people could gain practical skills which would be useful at scale. What would this mean for society? Oh, this is an exciting idea. I’ve been sitting on it for a long time. So this is where when you suggested discussing, you know, is data maths? Well, I think maths is data.

[00:21:53] David: That’s the key thing. Maths is abstractions, it’s a language of abstraction, which you could gain from really observing data. So yes, I believe there is an element of data is maths and maths is data. But it’s complicated. And I think there is an opportunity for us to rethink this at a societal level for our education systems.

I’m conscious I could go on for hours, so we’d better cut this one short.

[00:22:24] Lily: Yes, yeah, it is an incredibly different point of view, a different conversation as well than I was expecting.

[00:22:35] David: Sorry, I knew that was coming.

[00:22:38] Lily: But it’s very eye opening. It’s always fun to have your way of thinking kind of be challenged, to be out the box. And of course you love being out the box.

[00:22:51] David: Well, I mean, this particular one is one where I have found that my attempts to discuss this in communities who I think are really good at what they do, you know, the stats education communities, the maths education communities, have not gone far. They have not been receptive to this idea, and it’s really interesting to me.

It could be that it is just a crazy idea, which is why I haven’t really pursued it. But the more I’ve reflected on it over the years, nobody’s given me any good reason that this couldn’t be powerful, transformative, exciting, as a new way of thinking about education and the education which is appropriate for the future, where, actually, the ability to calculate is not what defines, but the ability to interpret data and not be misled by data, that’s so important in our day and age.

What if that was the central pillar on which we rebuild the whole of our mathematics curriculum? Oh, we’d better cut this off, because I’d love to actually dig into that and it’s going to take too long.

[00:24:11] Lily: Yeah, absolutely. Well, thank you very much, David. It’s been a really good conversation. I’ve really enjoyed hearing your thoughts.

[00:24:18] David: Thank you. And thank you for bringing such an interesting topic.