Causal Bandits Podcast
Causal Bandits Podcast with Alex Molak is here to help you learn about causality, causal AI and causal machine learning through the genius of others.
The podcast focuses on causality from a number of different perspectives, finding common ground between academia and industry, philosophy, theory and practice, and between different schools of thought and traditions.
Your host, Alex Molak, is a machine learning engineer, best-selling author, and educator who decided to travel the world to record conversations with the most interesting minds in causality and share them with you.
Enjoy and stay causal!
Keywords: Causal AI, Causal Machine Learning, Causality, Causal Inference, Causal Discovery, Machine Learning, AI, Artificial Intelligence
Causal AI & Dynamical Systems || Naftali Weinberger || Causal Bandits Ep. 005 (2023)
Video version available on YouTube
Recorded on Aug 29, 2023 in München, Germany
Can we meaningfully talk about causality in dynamical systems?
Some people are puzzled when it comes to dynamical systems and the idea of causation.
Dynamical systems, well known in physics, social sciences, and biology, are often thought of as a special family of systems where it might be difficult to meaningfully talk about causal direction.
Naftali Weinberger devoted his career to examining the relationships between system dynamics, causality and the phenomena known broadly as "complexity".
We explore what "intervention" means in a dynamical system, and we deconstruct common intuitions about causality and a system's equilibrium.
We discuss the importance of time scales when defining a causal system, analyze what could have inspired Bertrand Russell to say that causality is a "relic of a bygone age" and ponder the phenomenon of emergence.
Finally, Naftali shares his advice for those of us just starting to explore the uncharted territory of causal inference and discovery.
Warning: this conversation might bend your sense of reality.
Use with caution!
Ready to dive in?
About The Guest
Naftali Weinberger, PhD is a Researcher at the Munich Center for Mathematical Philosophy at LMU. His research focuses on causality, dynamical systems, and fairness. He works with scientists, researchers, and philosophers around the globe, helping them address challenges in diverse fields like climate change, psychometrics, fairness, and more.
Connect with Naftali:
About The Host
Aleksander (Alex) Molak is an independent machine learning researcher, educator, entrepreneur and a best-selling author in the area of causality.
Connect with Alex:
Causal AI || Causal Machine Learning || Causal Inference & Discovery
Web: https://causalbanditspodcast.com
Connect on LinkedIn: https://www.linkedin.com/in/aleksandermolak/
Join Causal Python Weekly: https://causalpython.io
The Causal Book: https://amzn.to/3QhsRz4
I'm a limited being, operating at a particular timescale, and I wanna not have to know about everything happening in the universe to predict what will happen at this table. To reconcile what feels to you like your own personal experience with whatever broader system makes up the world.

Hey Causal Bandits, welcome to the Causal Bandits Podcast.
The best podcast on causality and machine learning on the internet. Today we're coming back to Munich to meet our guest. He has a truly unprecedented attention to detail. Growing up in the suburbs of New York, he has lived in eight different cities around the world. He developed a passion for studying the Talmud and Sufi texts, but he has chosen philosophy as his long term partner.
It was his mom who persuaded him to create a Twitter account. Ladies and gentlemen, Dr. Naftali Weinberger. Let me pass it to your host,
Alex Molak. Ladies and gentlemen, please welcome Naftali Weinberger. Great. Thank you
for having me. It's so great to be here. How are you today, Naftali? I'm doing great. Yeah.
Yeah. I'm very excited for our conversation. And really nice
meeting you. Over 100 years ago, an iconic British philosopher, Bertrand Russell, said that causality is a relic of a bygone age. What would be your
comment on his argument? So I have many comments on this. I guess the first thing to say before diving into the details is it very clearly isn't.
So, I'm a philosopher, a philosopher of science, and there's often an undefended assumption that if causation is going to be somehow scientifically legitimate, it must play a role in fundamental physics. And at the time, of course, Russell was thinking of a different type of physics. He was thinking of gravitational astronomy; quantum mechanics hadn't happened yet. And what I want to say first is: should we think about causation? Is it scientifically legitimate, relevant? Look at any science. Look at epidemiology. Look at sociology. Look at industry, and see if they need to use causal knowledge. I would firmly argue that they do. So at least on the question of usefulness, of whether it's a relic of a bygone age, I think we can very clearly answer right off the bat that it's not, and that it is something of extreme usefulness. And then we can have a further conversation about what it is. And if you do in fact think that there's no causation in physics, well then how is that compatible with higher-level causation? There's a lot one could talk about there, but whatever we say, the legitimacy of causation, the scientific legitimacy of causation, should not be in question.
One of the directions you focus on is looking at causation at different timescales. Can you share a little bit more with the audience about this research?
I think in any science you consider, not just in causation, for almost any claim you make or any particular phenomenon you decide to study, timescale, and scale more generally, spatiotemporal scale, is going to be crucial.
So any physicist would know this. It depends whether you're looking at a system at the quantum scale or at higher scales where Newtonian physics is a better approximation; you're going to get different behaviors at different scales. If you're a biologist, whether you're doing cell biology or systems biology, you're going to have different relationships.
So I think this is a very general feature of any scientific model, and it's something that, to my mind, has not been systematically talked about in the causal context. So this is something I've been working on, very much inspired by the work of Herbert Simon, and in particular by a 1994 paper by Yumi Iwasaki and Herbert Simon, "Causality and Model Abstraction," where they really give lots of the fundamentals on this.
But just to give you, you know, a quick idea about what I'm talking about when I say timescale: I remember I was once at the dentist, and we were talking about flossing, which I guess is a normal thing to talk about with your dentist. He said that he asks people whether they floss,
and they say, I don't floss because it makes my gums bleed. And he says, no, your gums bleed because you don't floss. Now, are they contradicting one another? I think very clearly not. If you haven't flossed and then one day you start flossing, that day your gums are going to bleed. If you floss every day, for however long, after a week your gums won't bleed.
So, even with this simple everyday example, when we talk about the relation between flossing and bleeding, you get different relationships at different scales, depending on how you describe the system, and that's the type of thing which I think is all over the place.
That when you're giving a causal representation, a causal model, making a causal claim, it's going to build in assumptions about the timescale at which you're considering the system.
In industry, many people, especially people who are starting with causality, are very concerned about the possibility that there might be hidden confounders in the system.
And there are probably, roughly speaking, two ways of thinking about those confounders. Some people will focus on variables that they know, based on expertise in the field, exist but are hard to measure. And the second way of thinking would be about variables that are operating on large timescales.
What are your thoughts about this?
So one thing I'd say is I'm not sure I accept either of those options. First of all, it is true that when you motivate a particular confounder, you say, well, we're looking at the relationship between, let's say, income and education, and we think parental income is going to be a confounder.
And that's a way to motivate that you need to measure parental income. But I think, methodologically, when we're thinking about a system, and especially in a case where you don't know about it, it's not that a confounder is something you add to your model only if you think it's there. Rather, if you give a causal model, or any model, and you do not include confounders, and you're trying to get causal knowledge, you're in some way presupposing that they're not there.
So it's actually a very strong assumption not to allow for the possibility of hidden confounders. And this is of course one reason why lots of people like experiments: at least with certain types of experiments you can, by design, remove the influence of confounders. As for the relation to timescale, I think that's a very subtle issue that hasn't been fully studied, because it's not always clear that timescale is a problem here.
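The point about hidden confounders can be illustrated with a toy simulation. This is my sketch, not something from the conversation: the variables and coefficients are invented. A hidden common cause Z drives both X and Y, X has no causal effect on Y at all, and yet a naive regression of Y on X finds a strong association.

```python
# Toy illustration of hidden confounding (hypothetical data, invented numbers).
import random

random.seed(0)  # reproducible toy data
n = 100_000
z = [random.gauss(0, 1) for _ in range(n)]       # hidden confounder Z
x = [zi + random.gauss(0, 1) for zi in z]        # X is caused by Z
y = [2.0 * zi + random.gauss(0, 1) for zi in z]  # Y is caused by Z, NOT by X

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var = sum((a - mx) ** 2 for a in xs)
    return cov / var

naive = ols_slope(x, y)
print(naive)  # close to 1.0, even though X has zero causal effect on Y
```

A randomized experiment on X would break the Z-to-X arrow by design and recover the true (zero) effect, which is exactly the reason given above for why people like experiments.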
So philosophers have theorized about causation; an example of this would be Mackie in The Cement of the Universe, but also people like Dan Hausman, who was my grad school advisor. It's a commonly made claim, which I think is right, that whenever you make a causal claim, it's relative to some concrete system.
So if I just start talking about some gas in a container governed by the ideal gas law, which says that pressure times volume is proportional to temperature, I don't know what the causal relations between those variables are. But now I tell you it has a movable piston, it's in a heat bath, it's maintained at a constant temperature, and now I can add the causal story.
And this could be true for lots of variables. So even if you're doing epidemiology and you want to know the effect of watching violent television on behavior, that could vary from society to society. It's going to depend on how much television people watch, and on a lot of things about how society is organized. Often this is left in the background of any causal model: you just give the variables, the ones you want to know the relations between. And sometimes, if you have some variable which is very constant over time, you wouldn't call it a confounder, because confounders cause problems when they vary. If something's just fixed, it gets relegated to the background context.
So, for example, suppose you were doing a study of certain trading behavior in capitalist economies, and over a few hundred years you have this country that's a capitalist country with a free market. Well, you'll have one model for that, and that's not a confounder. I mean, it is a cause in some sense.
It is. It's influencing everything you consider, but it's not a confounder, because it's just in the background, influencing every observation you make. It's a precondition for a lot of what you want to say about how markets will react and how people will engage with one another in this economy, but not a confounder.
So I think sometimes long term factors are not necessarily a problem in terms of confounding. They are important if you want to understand the conditions under which a model applies. Those background factors.
That could be constant over a long period of time. It sounds pretty much like something that we could simply control for with an intercept in the model.
Yes. Just comparing the baseline for different observations or different classes of observations.
Yes.
Yes. And when can those background variables start being problematic?
So you mean background variables related to...
You mentioned a couple of examples where we treat those variables as exogenous because, if I understood correctly, on the shorter timescale we see them as constant. But on the longer timescale maybe there is some dynamics to them that we just don't see in our data, because we are not observing them for long enough.
So, yes, and let me unpack the question a bit, if that's okay, because we were talking to each other beforehand, and I want to make sure it's clear to the audience what's going on. I think a great example of exactly what you're talking about, one that gets the central intuitions on the table, is an example I took from a paper by Simon and Rescher from the '60s.
So a very simple example has only three variables. It's the amount of crops planted in a field, the amount of crops that grow in that field, and the amount of rainfall. Now rainfall is something known as an exogenous variable. Basically that means it does not depend on the values of the other variables in the model.
And any listeners who have done anything in the social sciences know that the process of finding a truly, or even approximately, exogenous variable is a nightmare. You can always find something which seems like it doesn't depend on things, but especially in social systems people might respond to that particular variable.
But rainfall is a very good exogenous variable. It's as good an exogenous variable as you're going to find, because the amount of crops that grow in a field depends on the amount of rain that falls, and not vice versa. That seems pretty good. Now, where timescale enters here is if we don't just think about these as two events, one year's rainfall and one year's crop growth, but as repeatedly measured variables.
So for those of you who know more, this would be a vector of variables, or maybe a time series. Well, over a long enough timescale, over a hundred years, over two hundred years, agriculture can influence climate. So the amount of crops that grow year after year, presumably over a wide number of fields, can influence future rainfall.
And if you were to look at the long-timescale model, you would in fact have a feedback loop: the amount of crops you grow influences rainfall. To summarize: we have some variable which we claimed was exogenous, and everyone was really happy, I assume, and then I said, well, it's not truly exogenous.
In fact, at a longer timescale there's a feedback loop and the amount of rainfall is changing. But when I give you the original model and say rainfall is exogenous, I think most people are thinking at shorter timescales. They're thinking, well, over the next five years, even if there's some epsilon, some minuscule influence of crop growth on rainfall, that's just negligible.
And even if there's some feedback loop, that's negligible. So we have these two representations, one has a cycle, one doesn't. Both of them are appropriate at their timescales. And it's not necessarily a problem, provided that you apply the appropriate model at the appropriate timescale. The issue arises when you extrapolate across timescales.
So, if you were a climate scientist and you wanted to study the long-term effects of global warming, well then you'd better not be using the model that says crop growth has no effect, because you want to consider longer timescales at which it makes a difference.
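The two-timescale point can be sketched as a toy simulation. This is my illustration, not a model from the episode, and all coefficients are invented: rainfall drives crop growth each year, while cumulative crop growth feeds back on rainfall with an epsilon-sized coefficient that is negligible over five years but clearly visible over five hundred.

```python
# Toy yearly model of the crops/rainfall example (all numbers invented).
def simulate(years, feedback=1e-4, base_rain=100.0):
    """Rain -> crops each year, with a tiny crops -> rain feedback."""
    rain = base_rain
    cumulative_crops = 0.0
    history = []
    for _ in range(years):
        crops = 0.5 * rain                 # this year's crops depend on rain
        cumulative_crops += crops
        # epsilon-sized feedback: long-run agriculture shifts the climate
        rain = base_rain - feedback * cumulative_crops
        history.append((rain, crops))
    return history

short_run = simulate(5)
long_run = simulate(500)

# Over 5 years, rainfall barely moves: treating it as exogenous is fine.
print(short_run[0][0] - short_run[-1][0])
# Over 500 years, the feedback loop is clearly visible.
print(long_run[0][0] - long_run[-1][0])
```

The same equations generate both behaviors; which model is "right" (rainfall exogenous, or a feedback loop) depends on the timescale at which you query the system.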
This reminds me of the ideas that are extensively discussed by Ray Dalio.
He's a famous investor and he's keenly interested in history. In his work, he collected data from ancient Chinese empires up to the modern day, for all the major empires in history, and he looked for patterns in them, abstracting maybe a little bit from why he did this and so on.
One interesting thing that he says is that in investment, we are often limited to our lifetime of observations, but this is just a very short sample when we look at history globally. Do you think that this kind of long-term cycle, which you see when you work with climate scientists, is also relevant in other fields?
Oh, yes. It's a great example, so I should look at it. But I think once you start looking at scale and trying to think about it systematically, you see it everywhere. And what you're saying, or what you're reporting Ray Dalio as saying, this issue is partly a causal question and partly a data question. It's whether you have the right amount of data to address the type of causal problem you want, and finding a characteristic timescale. So what's the right characteristic timescale for understanding historical events, or understanding certain types of predictions or investments?
I mean, I really started thinking about this as a very local causal problem: well, isn't it kind of interesting that maybe you can have different behaviors at different timescales? But the notion of timescale, and the idea that you can have different patterns and behaviors at different timescales, is all over the place once you start looking for it.
I think it's a fun thing to look for even if you don't care about causation. But certainly if you do care about causation, I think there's a huge amount of work to do there.
When we discuss those scales, timescales, and maybe spatial scales and so on, it's hard not to think about complex systems and phenomena like emergent phenomena. What are your thoughts about emergence and causality?
So, first of all, I think that's totally right. The broad project... I mean, I'm focusing on timescale because that's the thing I feel I have the most to say about.
But the big question here is: is there causation in complex systems? There are lots of discussions suggesting that maybe there can't be causation in complex systems, as well as questions about emergence and these types of issues. I guess what I would say is that if you accept my thesis, and it's a hypothesis, but I think I can give clear examples, that causal relationships are relative to a timescale,
I think you're going to understand a lot of what people are talking about when they talk about emergence. Part of the picture is that you consider a system at one, say, lower level of description, and it looks like everything depends on everything else, and you go: there can't possibly be causation.
And then you look at it at a different scale, either lower or higher, but let's say it's a higher-level system, and patterns of regularity emerge. They don't come from nowhere. You can look at the dynamics, and dynamical systems theory is, very broadly speaking, all about how order emerges from chaos.
But the first point is that just because you see a certain property at the lower spatiotemporal scale does not mean you won't have it at the higher one. And going back to Russell, I think this is another issue about physics versus other sciences: let's say Russell is right about physics; what he says does not carry over to other sciences, partly for this reason. Now, in terms of how it relates to emergence, this is going to be a very philosopher's answer.
I think it really depends on what you mean by emergence, and lots of philosophers have racked their brains about how this could be possible, where any of this comes from. The more I've looked at it, the less I've worried about that. And the reason is that the picture lots of philosophers still often use is very much in terms of levels.
So often it will be the lower level as parts, or the higher level, or there are some more complicated pictures where you can have orders of properties. So you have an object that at one level of description is described as just a physical object, and at a higher level of description is described as a corkscrew.
So you're giving a functional characterization. And then there's all this discussion about what's the relationship between levels, but still understood in a way where you're always quantifying over the same object. So you can describe something as a corkscrew or just in terms of some physical characterization; presumably it's some sort of lever or something.
And you're sort of still, even though you're changing levels, not changing objects. When you consider a system at different scales, in some sense you're considering different objects. Zooming in and zooming out is a way I like to think about it. To give an example, suppose I have a room that's regulated by a thermostat, and I turn on the oven, and the oven makes it the case that five minutes later the room is a bit hotter, and an hour later the room is not hotter, because the whole point of a thermostat is to equilibrate the room, to make sure the temperature does not change in response to turning on the oven.
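The oven-and-thermostat point can also be sketched as a toy simulation. Again this is my illustration, not a model from the episode, and the dynamics and numbers are invented: the intervention "turn on the oven" has a real short-run effect, but an integral-control thermostat drives the long-run effect back to zero.

```python
# Toy room with an integral-control thermostat (all numbers invented).
def room_temperature(minutes, oven_on, setpoint=20.0):
    """Temperature after `minutes`, with or without the oven running."""
    temp = setpoint
    heater = 0.0  # thermostat's heating/cooling output
    for _ in range(minutes):
        oven = 0.5 if oven_on else 0.0
        # heat balance: oven plus thermostat output, minus leakage toward
        # ambient (ambient is set equal to the setpoint to keep this minimal)
        temp += oven + heater - 0.1 * (temp - setpoint)
        # integral control: keep adjusting until temp sits at the setpoint
        heater -= 0.05 * (temp - setpoint)
    return temp

# The intervention "turn on the oven", compared at two timescales:
short_effect = room_temperature(5, True) - room_temperature(5, False)
long_effect = room_temperature(600, True) - room_temperature(600, False)
print(short_effect)  # positive: the room really is hotter after 5 minutes
print(long_effect)   # essentially zero: the thermostat has equilibrated
```

Both answers come from one set of equations; "does the oven heat the room?" gets different answers depending on the interval at which you measure the effect, with no contradiction.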
Now, there's no contradiction between those models. There's arguably no interesting metaphysical question or puzzle about how turning on the oven can have this effect at one interval and not another. So you don't get these puzzles, and there are lots of puzzles in the metaphysical and philosophical literature. But at the same time, when you go to even more complex systems, systems with various feedback loops and patterns of self-regulation, and systems hooked into one another, you do get types of complexity that you wouldn't see just looking at a part of the system. So, at least to the extent that people talk about emergence as something that's spooky, you know, where did it come from?
My view is that I don't get that effect when I look at these models, because if you understand what's going on, there's no contradiction, or at least no puzzle, in that fact. Maybe we could have lived in a different universe where you didn't have that, but we very clearly live in a universe that does exhibit different patterns of behavior at different spatiotemporal scales.
We started our conversation with Bertrand Russell and his stance on causality. And while replying to the question about your view on his stance, you mentioned the question of what causality actually is. How do you think about causality as a philosopher?
First of all, when I think about causality, and whether there is causality, I don't think: let me find a catalog of the universe and see where the causal stuff is, or something like that.
Rather, I really start methodologically. So here I, and lots of other people, am influenced by Cartwright and her "Causal Laws and Effective Strategies," where the idea is that we need causal knowledge because we want to be able to understand the difference between mere prediction and cases in which we actually want to intervene on a system.
So you don't just want to know that, you know, people that take the drug are more likely to recover. You want to know that if you give people a drug, they're more likely to recover. And that's, well, at least that's a starting point for thinking about causality, where you think about the methodology. And there's a lot of methodology.
So as probably lots of the listeners know, there are now many algorithms for discovering causal models, where the foundation of a lot of this is thinking in terms of interventions. Now, here I want to be a bit clearer, because I know that, at least on the philosophers' side of this, some people will hear the answer I just gave, which sounds kind of pragmatic, related to practice, and say, well, that's not an account of causality; that's not telling you what causality is. I want to push back a bit. I don't think it's the whole story, but when you study the methodology seriously, you're not just studying human reasoning or capacities. To the extent that causal methods work, they work presumably because of features of the world. And this is something I've also argued for in a recent paper with Jim Woodward and Porter Williams: that you can look at the causal methodology and think about what are the worldly assumptions
that make these methods work. So I do think that this is not just an epistemological, or merely epistemological, project, but that by looking at causal methods we can learn a lot about the nature of causation and the nature of the world, and that's certainly a project I think we should be engaging in.
Do you feel that the progress in physics that we've observed over the last century or so has influenced how people think about causality? Did it, in a sense, devalue Russell's stance?
I think in at least one important sense Russell has been vindicated. He was writing in the early 20th century.
And my understanding, and here I'm just reporting what other people say, is that when you look at, you know, 19th century textbooks, the idea that there was causation in physics was not particularly controversial. You know, the idea that forces were like causes or that, you know, acceleration causes velocity causes position.
Those were kind of part of how people would present the theory. And I take it that what someone like Russell, and even earlier Ernst Mach, realized was that you actually didn't need that to get the physical theory to work. And I think whatever else we've learned, that seems right. I mean, you could argue, and people do argue, so Mathias Frisch has a book called Causal Reasoning in Physics, defending the view that causation has a role to play in physics, but it's certainly something you need to argue for now in a way that you didn't have to before Russell.
So in that sense, he's vindicated even if the physics looks very different. Now, as far as I can tell, and of course I'm not a physicist or even a philosopher of physics, it still remains very controversial whether there is causation in physics. Sometimes physicists will talk about a principle of causality, and they mean something like a locality principle, and that to me looks very different from the causal relations in causal graphs, the types of methods I use. So it has the same name, but whether it's telling you something about causality in the sense used in other sciences is, I think, unclear. I think it's a fair question whether there's causation in physics. Often when people describe this, they assume the question concerns whether there's causation in fundamental physics, because in some areas of physics there clearly is causal reasoning: if you're building the Large Hadron Collider, you're going to be doing certain types of interventions on it. But yeah, I think it's still an open question, and I'm fine with that.
When answering the question about the nature of causality and how we think about it, you said that you don't see it as a merely epistemological project. Now that we're talking about physics, you gave these examples of different levels. Do you have a feeling that there is a certain level at which talking about interventions, as we do in the Pearlian framework, stops making sense, in physics or more broadly in science?
I think it's a live possibility.
I think it's unresolved. So certainly in the physics context there's a fairly large literature on Bell's inequalities and EPR experiments: quantum mechanical setups and well-confirmed results where there's really good reason to at least doubt whether you can give these types of scenarios and phenomena a coherent causal interpretation.
So that would provide at least some evidence that maybe, at certain scales, causation breaks down. And I think it's a live possibility. What I would say is, first of all, even if that turns out to be so, it doesn't threaten there being causation at other levels in a meaningful sense. And I think we need to reason more systematically, and think about this scale-relativity and how causal relationships depend on scale, before we can really figure out whether we know the answer to the question.
At some point in your career, you were also interested in cognitive science and the connection between the physical foundations of mental processes and, perhaps, some non-physical properties of those processes, if there are such. What are your thoughts about the possibility that mental states can influence physical reality?
I guess it's clear enough that they can. If I have a mental state, I make a decision, and then I do something and move an object, I've changed it. The question is always: okay, how do we understand the relationship between the mental and the physical?
And this is something I certainly haven't thought about in a lot of detail in a while. It's another one of those areas where I've become a lot more pragmatic, in the following sense. Philosophers sometimes like talking about levels, and they talk as if there are two levels, the mental and the physical.
And then it's like: what's the relationship? Sometimes called Descartes' error. And, first of all, as you can already tell from our conversation, I much prefer to talk in terms of scales. I think that's more precise and more scientifically interesting. And I think lots of times when you talk about different scales, whether it's the scale of the table or of molecules, there might be ten orders of magnitude between those.
And I think that matters. All this armchair reasoning about the mental and the physical, and there's some relation between them, and we're going to talk about levels, but we don't know how many levels. I have less confidence than I used to that you could say very general things without looking a bit at the details.
So, just to fill in something a bit more concrete: I had a fun paper that I wrote with Colin Allen, having to do with certain debates in cognitive science between various ideas of cognition. Roughly: is cognition like a computer, the traditional picture, or is it like a dynamical system?
And one thing we did in that paper was to look at some of the excellent work by Randy Beer's lab in Indiana, where they evolve these multi-neuron artificial organisms over many generations to do certain tasks. Maybe to find food sources, where all of this is, again, on a computer, or to avoid poison sources, or catch an object, or what it may be. And then they do some analysis of what's going on in the neurons when these evolved organisms do the task they've evolved to do. We talked about: what does the representation look like?
Should that better be understood using one model or another? Are these different paradigms even in conflict? In the same way that you can consider a system at two scales, it might be that thinking about a system computationally or dynamically are just two ways of thinking about the same system.
And the point I want to make for this question is that these organisms, these minimal cognitive agents, as they call them, are so simple. There's no debate about whether they're conscious. But we learned a lot just by thinking about them and modeling them.
And there are many stages between that type of task and the type of tasks that humans do, even before you get to the higher tasks. So I've really become less focused on what's the relationship between the mental and the physical, and more on thinking in this engineering way: let's think about different types of cognitive goals, let's think about how the organisms achieve them.
And in general I try to take simple systems that have at least a little bit of complexity and understand those really well before moving on to something sexier that we want to know the answer to, but where maybe we don't even know what questions we're asking. You mentioned
taking those two perspectives, the computational perspective and the dynamical systems perspective, and treating them as two different descriptions of the same system. Do you think about them as equally relevant descriptions as well?
Uh, yeah. I don't see why they need to be in conflict. The backstory here is that there was this long project in cognitive science. It goes back to Turing and theories of computation, where we think of the brain as a computer.
That's the model you have. And of course there's a whole theory of computation about how computers work, how you can divide a task up into subtasks and build modules where each module does a subtask, and so on and so forth. And the idea of the dynamical picture, or at least one picture of it, is something like:
maybe that's not how the brain works. Maybe the brain, or maybe cognition, doesn't work by dividing something up into different tasks. If you think about what it takes for a human being to walk, it's not like you pause, try to balance, and then move your leg. Everything's happening in real time, you need to be doing everything at once, and for everything you're doing you need a system that's monitoring your position, your weight, and your balance. Part of the motivation for the dynamical view was the fact that long after machines got really good at playing chess, they were still really bad at walking.
And the thought is maybe we have the wrong picture of what they're doing. So that's why dynamical models were seen as really important. And for what it's worth, I also like dynamical modeling tools for lots of problems. The reason I don't think they're in conflict is that people sometimes talk about dynamical models as if they're not models.
As if you just describe the system, you just write a bunch of differential equations. But look at any actual set of differential equations, and more importantly, at how they're applied to a concrete system. Not just in a textbook, but how you use them to model a pendulum, or a neuron, or whatever you're modeling.
Those are modeling assumptions like any other. And I think those modeling assumptions are very similar to the type of assumptions that go into a causal model or any other type of model. So it seems to me totally plausible that when you're comparing dynamical models to more computational models, it's not that one is just the pure description of the system and the other is an abstraction where you've broken it into parts. Even the dynamical model works like that. Even it is implicitly choosing an interval at which you consider each step; you're making assumptions about boundary conditions, initial conditions, parameter values. All of these build in a lot of assumptions that are not fundamentally different from any other type of model.
When you talk about assumptions, it reminds me of the conversation we had yesterday when we met, and you told me about this idea that people, on seeing a less complex description of a system, might automatically assume that it involves fewer assumptions.
Yeah. And that's false, as we were talking about.
I think one of the key features of a causal model is that the assumptions in a causal model correspond to the arrows that are missing from the model. So if you ever have two variables in a causal model and you don't have an arrow between them, and you don't have a common cause, that's a very strong assumption.
Compared to if you just have an arrow, and then maybe there's a causal relation or maybe not. And I think that's a very important point: what really makes causal models work or not work are these background assumptions. It's of course hard to establish them, but that's not the model's fault.
That's just how the world is. Another famous Cartwright-ism is "no causes in, no causes out." You need to make causal assumptions. But the hope is that by doing so you're able to take a complex world and render it simpler and more tractable, to make it into something you're able to interact with in a localized way.
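The point about missing arrows being the strong assumptions can be made concrete with a toy simulation. This is a hypothetical linear system invented for illustration: the graph Z → X, Z → Y deliberately omits an X → Y arrow, and that omission is a testable claim, namely that X and Y are independent given Z.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical linear system: Z is a common cause of X and Y, and the
# graph Z -> X, Z -> Y deliberately OMITS any X -> Y arrow.
Z = rng.normal(size=n)
X = 2.0 * Z + rng.normal(size=n)
Y = -1.5 * Z + rng.normal(size=n)

def partial_corr(a, b, c):
    """Correlation of a and b after regressing out c from each."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

# Marginally, X and Y are strongly correlated through Z...
print(np.corrcoef(X, Y)[0, 1])   # roughly -0.74
# ...but the missing arrow is a strong, testable claim: X and Y are
# independent given Z (here, near-zero partial correlation).
print(partial_corr(X, Y, Z))     # roughly 0.0
```

A model that simply drew an arrow between X and Y would be compatible with either outcome; it is the absent arrow that sticks its neck out.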
Many people who are starting with causality have this very fundamental fear that if they build a DAG, a graph that represents the system, they might be wrong. What would be your advice to them?
First I would just say that if they don't build a graph, they still might be wrong. Unfortunately. I want them to be right.
I'm rooting for them. But I think it's a legitimate fear. So, I'm a philosopher, and I work a lot with scientists, and I want my work to be relevant to scientists. But my job is not to, or I'm not qualified to, tell scientists: oh, you absolutely have to use causal models.
I always need to make a conditional claim, which is, Look, maybe you just do prediction. And if you just do prediction, maybe, you don't need these models. But, if you want to make causal claims, if you want to model the effects of interventions, in some cases, even to interpret certain types of data, if you want to know which probabilistic relationships will or will not be invariant, you do need more modeling assumptions, and at least in the first two cases, you need causal assumptions.
So I think what often happens is people reasonably say something like you said: well, I'm just nervous I might be wrong. And it's totally consistent to say that and then just not draw causal conclusions. But I worry that all too often what happens is people don't use causal methods and still draw causal conclusions.
I think one of the prime offenders here is psychology. That's an area where, with some exceptions (I've had some really good interactions with University of Amsterdam people who have become way more sophisticated in causal methods), there's still a reputation, not unearned: they go to school, they're taught that correlation does not imply causation, and they get that drilled into them.
So you see a paper, and in the methods section, not a word about causation; maybe you do hypothesis testing and various other types of statistical methodology. Then you turn to the discussion section and you'll see phrases like "X promotes Y." Well, "promotes" is a causal claim. So what I end up wanting to say is that if you're going to draw causal conclusions, you need causal assumptions. There's nothing magical about causal models, but they do allow you to make those assumptions more systematically and to spell out what assumptions you would need in order to get the conclusion you want.
And I see that as a huge amount of progress.
I also like to emphasize that if you build a model explicitly and then learn that the model was wrong, it's a learning opportunity: you can modify the model and make it better. While if you just push it to your scientific or statistical subconscious,
maybe you just don't learn anything. Maybe you will just retrain the model, right? Again and again.
Yes. Yes. One hundred percent. I think the great thing about models is that if it's a good model, you learn something from being wrong, and that's as much of a conclusion. So yes, I'm totally on board, and I think it's true
both in science and in philosophy, where there's kind of a cottage industry of talking about non-causal types of relationships, which in principle I'm on board with. There could be as many types of relationships in the world as there are; nowhere is it written that there only have to be causal relationships.
But my strategy, at least, is: build a causal model, see if I can model a phenomenon that they claim needs this other type of relationship, whether it's a mechanistic or constitutive or non-causal relationship. And I would claim that if I'm able to model it, that puts some pressure on whether you actually need this other type of relationship.
And the point isn't to be maximally parsimonious, but just to be really clear about what we're doing. If nothing else, causal models are very clear about what their assumptions are. And that means we can figure out what happens when they break down. And the more you use them, the more of a sense you get of which assumptions are just assumptions of convenience and which are more fundamental.
I mean, I think there's a lot more to think through there. But the models really do come with rules and an interpretation. They're developed for certain purposes, and you can see what happens when they succeed or fail at those purposes.
In the Pearlian framework, we have this idea of the ladder of causation, with associations, interventions, and counterfactuals.
In 2018, you published an interesting paper about the different outcomes of interventions in dynamical systems, depending on whether we perform a point intervention, just for a second, or hold the intervention constant for a longer period of time. Can you share a little bit more about this?
Sure.
Yeah. So the paper was called "Intervening and Letting Go." And my thinking about it was inspired by a fairly simple case. Earlier in the interview I mentioned ideal gas systems. So you have a gas in a container, and you have thermodynamic variables: pressure, volume, and temperature.
We can imagine the gas is immersed in a heat bath and governed by the ideal gas law, so pressure times volume is proportional to temperature. And we know with this type of system that you can have different setups. So once again, to go back to another thing from earlier in our conversation, causal models are relative to a setup.
If you have a sealed container, so a fixed volume, you're going to have one causal model: volume and temperature will be causes of pressure. On the other hand, let's say you have a movable piston. Well, now volume can change and pressure is constant.
So pressure and temperature cause volume. Okay, so far we're just talking about the equilibrium relationships in the system. And there's just a little bit of a puzzle having to do with interventions even at this stage. And the puzzle is as follows. Say we start with a movable piston. So again the one in which, pressure and temperature cause volume.
And now you fix volume. Say you put a pin into the side of the piston, so now, instead of the piston moving, it's just fixed. Well, this system will now behave like a sealed container. And what's puzzling about this, without getting into the details, is that the whole point of an intervention, and the whole ability to represent an intervention using the do-operator, entails certain semantics for how the causal relationships in a system should change
if you intervene on a variable. Now, when you stick a pin into the side of the container and fix the volume, that looks like an intervention, but you don't get the result you would expect if it were an intervention captured by the do-operator. So that's the original puzzle. I first discussed this in a paper with Reuben Stern and Dan Hausman, 2013 or something like that.
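The two setup-relative models can be sketched in a few lines of code. This is a toy illustration, with units chosen so that nR = 1 and all numeric values invented, not taken from the conversation:

```python
# Ideal gas law P * V = n * R * T, with units chosen so n * R = 1.
# Two equilibrium causal models of the same law, relative to the setup.
# (All numbers here are illustrative assumptions.)

def sealed_container(V, T):
    """Sealed container (fixed volume): V and T cause P."""
    return T / V  # P = T / V

def movable_piston(P_ext, T):
    """Movable piston (fixed external pressure): P and T cause V."""
    return T / P_ext  # V = T / P_ext

# Same physical law, opposite causal direction, depending on the setup.
V = movable_piston(P_ext=2.0, T=300.0)
print(V)  # 150.0: the piston settles where internal pressure matches 2.0

# "Pinning" the piston fixes V, and the system then behaves like a
# sealed container: a later temperature change acts on pressure instead.
P = sealed_container(V=V, T=330.0)
print(P)  # 2.2: pressure rises, because volume can no longer adjust
```

The puzzle is that pinning the piston switches which of these two models describes the system, which is not how a do-intervention on a single model is supposed to work.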
And there we just said: well, look, there are just these two systems with different causal relations; causal relations are relative to a system, and here's something that illustrates that, and there's no single graphical representation of both. Now, where things get perhaps a bit more interesting is that after doing this, Clark Glymour directed us to the work of Denver Dash, who graduated from Pitt in 2000.
I think he now works for Microsoft, but I'm not sure. Anyway, he wrote a dissertation in which he gave a similar type of case and also focused on the dynamics of the system. And if you give a dynamic representation, something in the style of Iwasaki and Simon, from the 1994 paper I mentioned earlier, well, you see a different, timescale-indexed representation.
So instead of just talking about the relationships at equilibrium, you have a feedback loop. So you can imagine that there would be periods of time in which the piston would be away from equilibrium, and there would be a feedback loop between pressure and volume as the piston expands or contracts, you know, say in response to a change in temperature.
And now the question is, what is the relationship between the dynamic and equilibrium models? And I know there's a lot on the table here, so, thanks for bearing with me. Now what Dash thought, skipping a lot of the details, was that basically, the dynamic model was a good model, and that if you looked at the dynamic model, it actually privileged one of the two equilibrium models.
The basic argument is something like: well, look at the sealed container model, the one in which volume and temperature cause pressure. That model perfectly well predicts the result of all interventions, so it's a good causal model. The movable piston model does not, so get rid of it. And in fact, get rid of both, because you don't know which one is the right one till you look at the dynamic model.
So would that mean that one model is a special case of the other model? Or are they qualitatively different?
So between the equilibrium models there's sort of no translatability. That's the worry, and I think there's something to it, in the sense that it's very worrisome when you have an equilibrium model and there are certain interventions it doesn't predict.
But I think what's very interesting here is exactly why the movable piston model does not predict the result of interventions. It has to do with the fact that we're thinking about interventions in a particular way. So here's where timescale comes in again, and here's where, hopefully, some of the pieces start to come together. When you're considering a system at a longer timescale, or when you're assuming that the system has had time to reach equilibrium, you're not considering the system in time.
And when you're talking about an intervention, so you say intervene on volume, intervene on pressure, you're talking about holding it fixed indefinitely, for all time. You might call it a clamp intervention. Now, when you look at the dynamic model, you'll see a feedback loop, a feedback loop that brings the piston to its equilibrium volume.
Now, when you hold volume fixed, you're basically not allowing that feedback loop to operate. The intervention destroys the system's ability to reach a certain type of equilibrium. And the point of the paper is that this is a general feature of causal models, and that in some way there's no, or I would argue, or I did argue, that there's no basis for preferring one of the equilibrium models to the other.
It is true that only one of them can predict all the interventions, but each model is true for a different equilibrium state of the system. And in the same way that the movable piston model does not predict what happens when you insert the pin, the sealed container model does not predict what happens when you remove the pin.
So letting go is when you remove a constraint that needed to be there for the system to have those properties. Those two actions are separate: putting the pin in, pulling it out. It happens that putting the pin in corresponds to the formal operation of an intervention, while in equilibrium models there's no operation for pulling it out.
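A toy dynamic simulation, with a made-up rate constant and illustrative units, shows how clamping volume disables the feedback loop that would otherwise carry the piston to its equilibrium:

```python
# Toy piston dynamics (hypothetical parameters): volume relaxes toward
# equilibrium at a rate set by the pressure imbalance, under the ideal
# gas law with units chosen so n * R = 1.
def simulate(T=300.0, P_ext=2.0, V0=100.0, clamp=False,
             k=1.0, steps=10_000, dt=0.1):
    V = V0
    for _ in range(steps):
        P = T / V                      # instantaneous gas pressure
        if not clamp:                  # the feedback loop: P drives V
            V += k * (P - P_ext) * dt
    return V, T / V

print(simulate())             # free piston: V -> 150, P -> P_ext = 2.0
print(simulate(clamp=True))   # clamped ("pin in"): V stays 100, P = 3.0
```

With the clamp in place the system never exercises the pressure-to-volume feedback, so the equilibrium the free piston would reach simply never forms; that is the sense in which the intervention destroys the system's ability to equilibrate.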
But this is a blind spot of the formalism.
A blind spot of the structural causal model formalism?
Yes. So the point of that paper is to at least defend the equilibrium models on their own terms. Now, I should say, because again there's a whole lot on the table here, that's really the point of that paper.
So I've been playing around with these dynamic causal models for a long time, and it was really important for me to respond to Dash. And I should say, his dissertation is great; if anyone's reading my dissertation in 20 or 30 years, then, whatever they want to say about it, right or wrong, fine.
Really amazing work. But I felt that in some ways people took this result as a pessimistic one, about not using equilibrium models, while I thought my paper had a more optimistic takeaway. For me it was important to make sure that these models worked, and he was giving some reasons that at least some people take to undermine certain features of the models.
So in some ways that was a paper I needed to write to make sure the details worked. In terms of what to take away from this, that's really what I've been working on now, or one of the things I'm working on now. Right now I have a very exciting collaboration with two McGill professors,
the statistician Russell Steele and the epidemiologist Ian Shrier. Now that this foundational, kicking-the-tires work has been done, we're trying to draw some broader lessons. And the broader lessons for practicing scientists aren't about the epistemic validity of equilibrium models.
It's really about what we learn about these types of systems. And I think what you learn is that when you start to introduce time into causal models, it changes things in several important ways, including how you think of interventions. So think of a variable not as something you measure once, or something that's constant for all time, or something that's at equilibrium, but as something you can repeatedly measure.
So you're measuring a person's temperature for many days, or really anything you measure repeatedly, it doesn't matter what type of variable. Then, when you talk about an intervention, we can distinguish, at the extremes, between a clamp intervention, something that holds the variable in place indefinitely, and a shock intervention, something that influences it for just one time step, with lots of things in between, including repeated shocks at different intervals.
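The clamp/shock distinction can be sketched on a toy first-order process; the relaxation rate, the intervention value, and the horizon are all illustrative assumptions:

```python
# Toy first-order dynamics: x relaxes toward a baseline each time step.
# The rate, intervention value, and horizon are illustrative assumptions.

def run(intervention=None, baseline=0.0, a=0.8, x0=0.0, steps=12):
    xs, x = [], x0
    for t in range(steps):
        if intervention == "clamp":
            x = 5.0                        # do(X_t = 5) held for all t
        elif intervention == "shock" and t == 0:
            x = 5.0                        # do(X_0 = 5) once, then let go
        xs.append(round(x, 2))
        x = baseline + a * (x - baseline)  # relaxation dynamics
    return xs

print(run("clamp"))  # held at 5.0 indefinitely
print(run("shock"))  # 5.0, 4.0, 3.2, ... decays back toward baseline
```

The two trajectories diverge immediately: the clamp overrides the system's dynamics at every step, while the shock perturbs it once and then lets the dynamics pull it back.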
It's not that no one has made those distinctions before, but I think they can be incorporated into causal inference much more systematically. And what we want to argue is that you really need to do this to think about the outcome of interest. People in the potential-outcomes literature are often focused on defining: here's the variable you're targeting.
Here's the outcome. Now, think about what's going on when you're administering chemotherapy to a cancer patient. And this will be one of the main things I talk about, where I'm working at a high level; I don't know the underlying science. But at least my general understanding of what you're trying to do is: before the chemotherapy, the body is in a state where it is not able to eradicate the cancer.
What you want to do is give enough chemotherapy that you've destroyed enough of the cancer for the immune system to take over. What you're doing here is trying to change the whole state of the system, trying to influence the feedback loops of the system.
You're trying to bring the system into a new equilibrium state. So I think that's one example where you're really trying to change the stability properties of the system. And that, I think, is the bigger picture here. In "Intervening and Letting Go" there's this very broad-strokes point that it makes a big difference whether you intervene and hold something fixed or you let go and choose not to intervene.
But in the real world, everything you do is an intervention lasting for some duration. And on both the cause and the effect side, you care about those durations: how long do you need to intervene, how persistently, and when do you expect the effect of interest, or at what time scale will you observe an effect?
One description of those systems that comes to mind uses differential equations. On the other hand, the idea of intervention is very well defined, as you mentioned, in the Pearlian framework. Is there any bridge we could build between those two perspectives that would make it easier to unify them and take what's best from both worlds?
Sure. So first of all, I need to do a shout-out here to the great work coming out of Joris Mooij's lab at the University of Amsterdam, at the Korteweg-de Vries Institute. Because, really, until a few years ago there was almost no work on this, and now there's a ton, and it's
almost entirely due to them. So there's Joris, as well as his recent grad student Tineke Blom; there's a whole bunch of great people there, and a lot of work being done. I'm not going to go into detail about all of it, but before I answer this question, I'd be remiss if I did not highlight that there's really exciting work being done there, and anyone with interests in this should also look at what they're doing.
But very generally, yes, part of what I'm doing is trying to understand the relationship between these. I've mentioned a few times in passing this work by Iwasaki and Simon on dynamic causal models; basically, what those models do is allow one to understand the relationship between causal representations at equilibrium and causal representations away from equilibrium,
so more dynamical representations. And in one of my papers on near-decomposability, in Philosophy of Science, I spell out the details of how this also relates to timescale issues. If you think about systems as having a long-run equilibrium, or tending toward one, and dynamics as perturbations from that equilibrium, you can use this framework, developed for equilibrium and away-from-equilibrium behavior, to also understand the relationship between causal relations at different timescales.
But to get to your question: one of the interesting things about the dynamic models is that they do involve things like time derivatives, as well as the operation of integration, which is just the inverse of differentiation. You have these operations that appear in standard calculus or in dynamical models, and they're now appearing in causal models.
And what I take this to show is that, at least when you're able to write differential equations in the canonical way specified by these theories, then, as far as I can tell, you can perfectly well give a causal interpretation. There's a very clear sense in which this is just a generalization of the standard causal framework.
Now, what it says about causation and dynamical systems in general is, I think, still an open question. The point is not that any arbitrary dynamical system can be represented causally. This at most shows that there's a way, when you have a certain type of dynamical system, to represent it causally.
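One way to sketch the bridge, under simple assumptions (a first-order ODE with invented coefficients): Euler-discretizing an equation in canonical form turns it into time-indexed structural equations, to which the usual causal semantics apply.

```python
# Canonical first-order ODE: dx/dt = -k * (x - u), with k and the
# input u as illustrative assumptions. Euler discretization gives a
# structural equation over time-indexed variables:
#     X_{t+1} := X_t + dt * (-k * (X_t - U_t))
# Each left-hand variable is a function of strictly earlier variables,
# so the unrolled system is an acyclic causal model, and interventions
# (e.g. clamping X at some step) have their usual do-operator meaning.
k, dt = 0.5, 0.1

def step(x_t, u_t):
    """Structural equation for X_{t+1} given its causes X_t and U_t."""
    return x_t + dt * (-k * (x_t - u_t))

x, u = 10.0, 0.0
for _ in range(100):
    x = step(x, u)
print(round(x, 3))  # 0.059: decays toward u = 0, as the ODE predicts
```

Nothing here is specific to this ODE: any system written in that canonical form unrolls into a structural-equation model in the same way, which is the sense in which the dynamic representation generalizes the standard causal framework.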
So that is a bridge, and I think it's an important bridge. And I like it because, lots of people don't know this, but Herbert Simon was a pioneer in the development of causal models going back to the 1950s, as part of the structural equations tradition. So to me it's helpful to have that continuity: the same person who was very influential in understanding what makes an equation structural, and where the causal asymmetry comes from, gives you a framework that also influenced all of the later frameworks people know and love, people like Spirtes, Glymour and Scheines. The fact that there's this continuity makes it easier, as a philosopher, to think: okay, here's the case we understand, here's the generalization.
What do we learn about causation?
Speaking about history: in the early 2010s, we experienced the so-called Big Data Revolution. One of the driving ideas behind it was that if we collect enough data, we can make very useful predictions and actually solve any problem with that data.
I was curious about your perspective. Do you think this concept was related to the ideas Russell presented, and to thinking that causality is something obsolete today?
As a causal person, you're probably not going to be surprised that I am suspicious of this, because I think causality really isn't about taking all the data you have, throwing it into a machine or a grinder, and getting a result.
Causality is about making assumptions in order to take the complexity of the world and be able to interact with it in a local and manageable way. That's what causality is about. So the whole idea of just throwing lots of data into something and seeing what comes out goes against what causality is trying to do.
I want to be careful here, though, because for me it's a bit of an empirical question what the best approach is. When it comes to the best approach for building a large language model, for example, you might say: ah, you can't possibly do that unless the machine knows the grammar, and you come with all these philosophical conceptions of how we used to think.
And then someone gives you a machine that does it without that, and I think you need to take that seriously. So I'm not going to venture any predictions here. Whether you can or can't do something causally is going to be an empirical question. I mean, Dennett has this idea that evolution is smarter than you.
And I think there's a corollary here. The idea behind "evolution is smarter than you," or "natural selection is smarter than you," is just: you thought there were N ways to do something, but no, there's this totally different way that things evolved. And why couldn't that happen with technological evolution?
I'm still not going to grant it to Russell here, because, again, I think that however things turn out, causal modeling is just a different way of thinking about things. It's not about: if I'm given all the knowledge in the universe, what can I do with that? It's: I'm a limited being operating at a particular time scale.
And I want to be able to predict what will happen at this table without having to know about everything happening in the universe. Or take a large language model: you might plausibly think, okay, you have this model that was trained on the entire internet and can predict the next word.
Human beings were not trained on the whole internet. So the fact that, if you were trained on the whole internet, you could answer questions X, Y, and Z does not answer the question: how can someone who's only been exposed to a thousand or ten thousand words learn grammar? And I'm not here advocating some radical Chomskyan thing; again, there's a lot that humans are doing.
There will be many different models put out about how they do it, some more causal, some less. My point is that whatever large language models are doing, causal reasoning is doing something different. And it's doing something that I think is both valuable and, at least intuitively, closer to what humans are doing.
When talking about Dash's paper, you said that you see his perspective as a little more pessimistic and yours as a little more optimistic. At some point in your life, you extensively studied the Talmud. Is there an influence of your knowledge and understanding of the Talmudic tradition of thinking on your work?
I have no idea if this is the right causal story, but this is what seems right to me through introspection. Talmudic literature is very much a legal style of literature, and the way it functions is that in some ways the text is fixed. So there are these different sources, you triangulate them, and you try to get a bigger picture: we have this source from this century, that source from that century, and what do we think is the right one?
And this rabbi says this, that rabbi says that. I think that inspires a certain flexibility of thought that has certainly helped me. I don't know how directly it applies to causal models, but I think it's partly why I think that what matters about causal models is what happens when they're wrong.
Because it's really not about: oh, I'm going to get the final model that's going to solve everything. It's: let me find a systematic way to explore this. Let me build a model. It might be wrong, but I'll still learn something. The time I most felt this influence: one of my projects that we haven't talked about yet is on the causal analysis of racial discrimination, and after a while of approaching that as a causal-modeling and sociology-related project, I eventually had a really fruitful conversation with the legal scholar Issa Kohler-Hausmann.
I learned a tremendous amount from her, but it also made me realize that I really needed to look at the legal literature on discrimination if I was going to continue writing about this. And I spent a good summer doing a deep dive to at least orient myself in how they talk about it.
And certainly it was labor-intensive. But I found I was able to do it in a way I'm not sure I would have been able to had I not had the Talmudic training, because I just had some basic feel for how legal systems work. Especially as I was looking at case law for Supreme Court cases, something in the field brought me back to the earlier days.
You also read Islamic Sufi texts. What did you find interesting about them?
I guess one thing I noticed when I was doing this, and this goes back to a course at Columbia University with Peter Awn, who was one of my favorite professors there: I think there's something that happens with lots of mystical traditions, whether it's the Sufi texts or Kabbalah, where there's something a bit different from the rest of the more traditional religion.
I mean, I haven't thought about this stuff in a very long time, so I don't know the long-term influence on me, but there's a real emphasis on human experience, and a real emphasis on how to reconcile immanence and transcendence. It's given in a theological context, but I think the question of how to reconcile what feels to you like your own personal experience, what feels very individual, with whatever broader system you think makes the world function, that really fascinated me. It's probably why back in the day I used to read lots of Kierkegaard. It was a similar type of dilemma that, for a period of time in college, I was very obsessed with. Who would you like to thank? So I definitely owe a tremendous amount to my graduate school advisor, Dan Hausman. At the point I was working with him, he hadn't really written on causality in a good 10 or 20 years, but he was just an amazing advisor, someone who, when I had half-baked ideas, could sort of finish them. I mean, you should speak to everyone you can when you're working on an idea, but having someone in an advisory role to whom you can bring ideas that are not fully developed and have them be on the same wavelength, that's really important at that stage.
Yeah, so he was tremendously helpful and has been extremely supportive. Earlier, Elliott Sober, another philosopher in the department, on the first paper I wrote in grad school, well, the first paper I wrote for him, which largely due to him ended up being published, he gave me feedback on, I think, six drafts of that.
The first three might not have been particularly good, but that type of feedback was just indispensable. And I worry sometimes that students at all levels don't get that kind of scenario where they just need to revise the same paper, which is where a lot of the action happens.
In recent years, Jim Woodward has been extremely supportive and has been just a great collaborator. Yeah, I'm sure I'm forgetting people, and I don't mean any offense, but those were some of the names.
When people are starting with something more complex, and this is often the case when they are starting with causality, they might feel a little bit overwhelmed, or unsure if they will be able to grasp all the terminology and all the tools that are needed to do causality in practice.
What would be your advice
for them? It really is important to find some aspect that you find interesting and that relates to your other interests. Almost by chance, the first thing that I came across were debates over the causal faithfulness condition, a sort of parsimony principle for causal inference.
And that gave me an in. And I do think, ideally, having something like that is important, because if you're not interested, it's very hard to get inside. And once you have that, you can sort of take whatever knowledge you have and branch out. So I'm not sure what practical advice that means in terms of how to find that particular thing.
I mean, certainly, with a book that's relatively accessible, like The Book of Why, that might be a good place to start to see the landscape. But the truth is that there are many avenues into causal inference. So if you're interested in algorithms, there's a whole literature on algorithms for doing causal inference.
And if you're interested in philosophy, there's a whole lot of work based on that. I think there are many avenues in, and finding one that works for you is important, and then going from there. And certainly, at least the way I do things, and maybe it's not the most efficient, but it at least works for me: I think it's not important that the first thing you do is the most important thing, or that you do something because you think it's going to be the answer to the question you want.
I think if there's some random book about causation that's broadly connected to this literature and you find it interesting, yeah, delve into it. Don't overthink where you start. If it's something that appeals to you, my experience is that it often takes me a while to figure out why I care about something, but I've never regretted just delving into something long before I could explain to anyone why I cared about it or why it should matter.
And what kept you going when you had already started? Well, the good thing about causation, in any field, certainly in philosophy, is that there's really no end to it. I mean, for me as a philosopher, one of the exciting things is that I'm able to engage with lots of different sciences, and do it in a way I think is fruitful, because lots of sciences care about causation already.
So it's not like I need to learn everything about the science. I need to find scientists who already care about it and then engage with them, with the appropriate degree of modesty and understanding of my own limitations. But that's been very fruitful for me. That's, for example, how I've had a wonderful collaboration with the Dutch psychometrics community, exactly that way, and in every project I've started I felt like an imposter. But I think causality is great for really being able to do interdisciplinary work in a substantive and general way.
Now, in terms of what keeps me personally going: compared to when I just got the PhD, which is already quite a while back, today we've only talked about a small part of my research, but I think lots of the issues we've covered, about what causation is, how causal models work, issues of timescale, those are radically important and underappreciated.
And because it's connected to a topic, causation, that's of increasing interest both in academia and outside of academia, it's really a wonderful way to pursue my interests and also, hopefully in the long run, make a contribution to lots of areas. So that's very exciting, and in that sense
I just feel lucky that I found a topic that both interests me and where I feel I can make a contribution. There are many people out there who are way smarter than me, and if they turned to this, then I guess I'd be out of a job, but somehow my training, the various people who have supported me, and the direction I've decided to go have so far been very fruitful and nice.
And so, yeah, I just find it a very exciting area to work in, and I think there's just a tremendous amount to do, because as much as people have talked about causality, I think things have just gotten started. What question would you like to ask me? One thing I've noticed in terms of your orientation is that, of course, you're speaking to me and I'm an academic based at a university, but clearly your interests are focused largely on causation in industry, and I know you've interviewed, or are considering interviewing, people who work for places like Uber or Wolt or things like that.
And I'm curious how you see the interaction between causal researchers in academia and outside of academia. Should they be communicating? Will that be fruitful? That's a great question.
I believe it can be fruitful, and we still have a little bit of work to do in order to build more bridges.
One of the main motivations for this podcast, from the scientific and communal point of view, was to bring different perspectives and different schools of thought to one platform, so we can build a sort of agora for conversations between those different worlds. I think you also referred to this today, that some of the communities, some of the styles of thinking, got ghettoized in a sense. We have those ghettos in causality: potential outcomes, single-world intervention graphs, and DAGs.
And then we have different schools of thought in epidemiology, econometrics, and computer science. And often those different clans are talking about the same thing, just using slightly different language. I think opening this discussion, opening the doors between those enclosed rooms, is something that we really need in order to move the field forward, but also to promote the usage and application of causality in industry, because many of the problems that people in industry are facing today might already be solved in one of the niches.
But they just don't know about it, because they don't read the literature coming from the other side.
Yeah, I couldn't agree more. And even sometimes when the same thing has been covered in two literatures, you don't know how to compare them.
Yeah, or translate them. Yeah. Yeah. I find it challenging as well.
I'm reading Robins, for instance, Hernán and Robins. When I was writing my book and researching the literature, building a bridge from the Pearlian language to their language and back required a significant amount of work, although the ideas are sometimes very, very similar.
But the terminology is completely different. You need a dictionary to actually understand how this translates.
I think something someone could do one of these days is a YouTube channel with different videos on the different intros to causal inference, comparing them.
I mean, of course, they'd have to be a causal scholar already, and I'm not sure I have the time, but certainly this is something where I think there's a great need for projects like this. So, yes, I'm really glad that you're doing it. What
resources would you recommend to people starting with causality?
It can be books, courses.
Some of the things I find myself recommending, and it's always a hodgepodge of things, because I haven't really found one thing that is one-size-fits-all. But some of the things I tend to recommend, depending on my audience: first of all, Felix Elwert, a sociologist at Wisconsin, has a chapter in a handbook for causality. I don't remember the publisher, but
We will put a link in the description. And I find that that's really good for social scientists, and it also covers a lot of good stuff. It covers confounding, it covers d-separation, it covers endogenous selection bias, and also some methods, such as time-varying treatments, that are less covered elsewhere but are still important. So that's one thing that I think is very good bang for your buck. A companion piece to that, it's more specialized, but he and Chris Winship, at Harvard, have a paper on endogenous selection bias, that is, conditioning on a collider. That's not a general introduction to causal inference, but it's one of those topics that is very counterintuitive to people and very unknown, and one of the things that can give you a flavor of graphs.
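The collider effect mentioned here can be seen in a few lines of simulation. The sketch below is purely illustrative (the variable names and thresholds are invented, not taken from the Elwert and Winship paper): two independent causes become correlated once you condition on their common effect.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Two independent causes (hypothetical names for illustration)
talent = rng.normal(size=n)
luck = rng.normal(size=n)
success = talent + luck  # collider: both arrows point into it

# Unconditionally, the causes are (nearly) uncorrelated
r_all = np.corrcoef(talent, luck)[0, 1]

# Condition on the collider: keep only the highly successful cases
sel = success > 1.5
r_sel = np.corrcoef(talent[sel], luck[sel])[0, 1]

print(round(r_all, 3), round(r_sel, 3))  # r_sel is clearly negative
```

Intuitively, among the very successful, those with little talent must have had a lot of luck and vice versa, which is exactly the spurious dependence that endogenous selection induces.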
Something else I recommend, and this is for people who are more interested in the algorithmic side: there's a piece by Frederick Eberhardt, at Caltech, who is, by the way, especially appealing for people with philosophical training, one of the most exciting people doing causal research these days.
He has, I think from 2012, a very short introduction to causal inference algorithms. Maybe 10 or 12 pages, I don't remember exactly, but it's very short and covers the different algorithms quickly, including some of the more traditional ones.
It also covers some of the machine learning algorithms. So one of the big advances has been work from machine learning, where traditional causal algorithms can't solve the two-variable problem: if x causes y or y causes x and you haven't measured any other variables, you can't determine causal direction.
There are some machine learning methods which exploit parametric features of the error term that allow one to determine causal direction. And I think Eberhardt's discussion is nice. The other discussion of that, if you want more detail, is the textbook by Schölkopf, Janzing, and Peters, but that's a more involved deep dive.
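To give some intuition for how the error term can reveal direction in the two-variable case, here is a toy sketch in the spirit of additive-noise methods. Everything in it (the cubic mechanism, the crude dependence score) is an invented illustration, not one of the actual algorithms Eberhardt surveys; real methods use proper independence tests such as HSIC. The idea: in the true causal direction the regression residuals are independent of the input, while in the reverse direction they remain structured.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Ground truth for the toy example: x causes y via a cubic mechanism
x = rng.uniform(-1, 1, n)
y = x**3 + 0.05 * rng.normal(size=n)

def dependence_score(cause, effect):
    """Fit a cubic regression of effect on cause and measure how strongly
    the squared residuals still depend on the (squared) input. This is a
    crude proxy for residual/input dependence, standing in for a real
    independence test."""
    coefs = np.polyfit(cause, effect, 3)
    resid = effect - np.polyval(coefs, cause)
    return abs(np.corrcoef(resid**2, cause**2)[0, 1])

forward = dependence_score(x, y)   # residuals ~ independent noise
reverse = dependence_score(y, x)   # cube root can't be fit by a cubic;
                                   # residuals stay structured

# The direction with (near-)independent residuals is inferred as causal
print("inferred direction:", "x -> y" if forward < reverse else "y -> x")
```

The asymmetry only shows up outside the linear-Gaussian case, which is exactly why these methods need non-Gaussian noise or nonlinear mechanisms.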
And then maybe just a few more. One really random one: Michael Nielsen, on his website, has an introduction to d-separation, which to me is still the most accessible introduction to d-separation. He studies something like quantum computing, but at some point did a deep dive into Pearl's book and wrote this nice piece.
And finally, recently I wrote an article in the Stanford Encyclopedia of Philosophy on Simpson's Paradox. I wouldn't say that would necessarily be the first thing to read, but it's relevant if you're interested in how the causal modeling methods relate to some of the earlier philosophical theories, including probabilistic theories.
I discuss that a bit there. The early sections give the mathematical framing, and that part stands alone; you can go as deep or shallow into it as you want. But I think something that hinders philosophers from understanding the newer methods is that there's a kind of discontinuity between the 20th century, when people would do probabilistic theories of causality, which were more traditional philosophical theories, and then all these axiomatic theories and causal graphs, and how do they relate? So at least there I try to explain what the advances really were: why having causal graphs gives you a proper theory of confounding, and historically why I think the probabilistic theorists didn't have that. It's very much a Pearl-inspired picture, but at least I tried to put the pieces together there.
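For readers who want to see the paradox in numbers, here is the classic kidney-stone dataset (Charig et al., 1986) that is commonly used to illustrate it; the computation below is just an illustration, not drawn from the SEP article itself. Treatment A wins in every stratum, yet loses in the pooled table, because stone size influences both treatment choice and outcome.

```python
# Classic kidney-stone counts: (successes, total) per treatment and stratum
data = {
    ("A", "small"): (81, 87),
    ("A", "large"): (192, 263),
    ("B", "small"): (234, 270),
    ("B", "large"): (55, 80),
}

def rate(successes, total):
    return successes / total

# Within each stratum, treatment A outperforms B ...
a_small, a_large = rate(*data[("A", "small")]), rate(*data[("A", "large")])
b_small, b_large = rate(*data[("B", "small")]), rate(*data[("B", "large")])

# ... yet pooled over strata the ranking reverses
a_total = rate(81 + 192, 87 + 263)   # 273/350 = 0.78
b_total = rate(234 + 55, 270 + 80)   # 289/350 ~ 0.83

print(a_small > b_small, a_large > b_large, a_total < b_total)  # True True True
```

A causal graph makes the resolution mechanical: stone size is a confounder of treatment and recovery, so the stratified comparison, not the pooled one, answers the interventional question.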
Where can
people find out more about you and your work? Probably the best thing is my website, so all my articles are there. I'm also on Twitter, as you mentioned, thanks to my mother. And on my Twitter, as my pinned tweet, I gave thumbnails to a bunch of papers that I thought would be relevant.
So yes, I think those would be the most direct ways to get an overview. What's your message to the causal community? So what I want to say is that I still think there are lots of questions we're not asking. There's been a huge amount of development on the algorithmic side, and that's been massively fruitful and certainly the inspiration for lots of my research.
I think people sometimes think that because of "no causes in, no causes out", because you can't get causal outputs without causal assumptions, that all you can do is give axioms. You could say: here's causal Markov, causal faithfulness, minimality, and that's all there is.
And I think that's just not true. It is true that you can't get a reductive theory, at least I don't think you can: an account of causation that says what causation is without using the word "cause" or some concept like exogeneity, which builds in causes. But I still think there's a different project we could be engaging in.
And it's a project having to do with the conditions of causal representation: to think really about what the types of systems are to which these models apply. One thing we talked about yesterday was that sometimes a model might work and we might not know why it works, and I think that can happen.
Sometimes you say: here's my algorithm, here are the assumptions. We don't think the assumptions are correct, and the algorithm still gives us something that seems sort of okay. And then there's a question of why, and I think that gets obscured by the purely axiomatic, computer science, if-then approach. And one reason why I'm really interested in the timescale thing is that I think it's an antidote to that.
I think you can't say what exogeneity is without using the word "cause", but you can say: here's how facts about exogeneity will be relative to timescale. And that tells you about the conditions under which causal models apply. This is one example, my pet example, the one I like, but I just think there's a lot more work to be done there.
And I guess sometimes people get defensive, where if you say, oh, there's a limitation to the model, somehow you're saying the model is bad. I mean, you are saying it's not the whole picture, but somehow you're criticizing it. And I just think the fun of modeling is to use the models, see where they work, then see where they break down, and then use that to figure out what the fix looks like.
And is it a big fix or a small fix? I think it should all be part of the picture. So I just really think there are a lot of projects that neither philosophers nor computer scientists are doing. And I hope that, at least in a small way, my research points people in other directions, or at least encourages people to look in other directions, where there's a lot of work to be done.
One of the
questions that people in industry are asking themselves, or asking others, is: is causality one of those trends that just come and go, or is it something that will stay here for longer?
Like lots of other parts of this interview, I'm not going to make any predictions about, well, probably anything, but certainly not about industry and what shiny new object they're going to find next.
As a philosopher, I can ask whether it is a valuable thing to care about, and I can give lots of reasons that it's valuable. And for all that people will often say, oh, this problem is purely predictive, I think it's very hard to find a purely predictive problem, because often, especially in industry, people don't just care about making predictions; they care about intervening. And even if you don't, you really don't just care about correlations, you care about the correlations that will remain invariant. Maybe you need causation, maybe you don't, but correlation and stats are not going to do that by themselves.
At least, that's not what they're designed for. Yeah, as we talked about, maybe you add enough data and the machine will figure it out, but that's just not what they're designed for. So I think the easy part, in some ways, is to explain why someone would want this type of tool. I think the hard part is actually teaching people how to use it.
And I don't mean that in a way like, oh, because people won't understand it, or because people don't have the knowledge. I think it's really a question of implementation and incentives. It's not enough for something to be useful; it needs to be useful in a way that can easily be applied to the problems as people see them.
And I think both in science and in industry, I suspect there are still lots of gaps there. It's another one of those things where it's not totally clear who the best person is to jump in and help, but I think it's a non-negligible problem. And something I've seen with scientists, with whom, again, I've had more contact than with industry, is that you could teach them the Pearl stuff, and they get it perfectly fine, but if they don't know how to use it in the papers they're writing, or don't know how to use it to publish the articles they need to, or if the journals they're submitting to won't recognize it, they're not going to use it.
I mean, for the most part, you'll have a few methodologists who will use it, and I'll hang out with those people and we'll have fun. But in terms of actually changing a field, that's where I kind of see the possible stopgap.
So we might need a change management expert now, not another
causality specialist.
Yeah, I'm always nervous about encouraging people to hire more managers. But this is something someone needs to be thinking about.
Naftali, thank you so much. It was a pleasure to talk to you.
Pleasure to meet you.
Yeah. Oh, it's a great pleasure. I really enjoyed it.
Thank you. And see you in the next episode.
Congrats on reaching the end of this episode of the Causal Bandits Podcast. Stay tuned for the next one. If you liked this episode, click the like button to help others find it. And maybe subscribe to this channel as well. You know, stay causal.