Causal Bandits Podcast

Open Source Causal AI & The Generative Revolution | Emre Kıcıman Ep 16 | CausalBanditsPodcast.com

May 20, 2024 Alex Molak Season 1 Episode 16

 What makes two tech giants collaborate on an open source causal AI package?

Emre's adventure with causal inference and causal AI started before it was trendy.

He's one of the original core developers of DoWhy - one of the most popular and powerful Python libraries for causal inference - and a researcher focused on the intersection of causal inference, causal discovery, generative modeling and social impact.

His unique perspective, inspired by his experience with low-level programming combined with his vivid interest in how humans interact with technology, is driven by a deep-seated desire to solve problems that matter to people.

In the episode we discuss:
🔹 What makes Microsoft and Amazon collaborate on an open source Python package?
🔹 Causal AI and the core of science
🔹 Is a language model a world model?
🔹 When is modeling physics useful?

Ready to dive in?

Join the insightful discussions at https://causalbanditspodcast.com/

About The Guest
Emre Kıcıman, PhD is a Senior Principal Research Manager at Microsoft Research. He's one of the core developers of the DoWhy Python package, alongside Amit Sharma. He holds a PhD in computer science from Stanford University. Privately, he loves to climb and spend time with his family.

Connect with Emre:
- Emre on Twitter/X
- Emre on LinkedIn
- Emre's web page

About The Host
Aleksander (Alex) Molak is an independent machine learning researcher, educator, entrepreneur and a best-selling author in the area of causality.

Connect with Alex:
- Alex on the Internet

Links
Libraries
- DoWhy (https://www.pywhy.org/dowhy/v0.11.1/)
- EconML (https://econml.azurewebsites.net/)
- CausalPy (https://causalpy.readthedocs.io/en/latest/)

Books
- Molak, A. - "Causal Inference and Discovery in Python"
- Pearl, J. -

Causal Bandits Podcast
Causal AI || Causal Machine Learning || Causal Inference & Discovery
Web: https://causalbanditspodcast.com

Connect on LinkedIn: https://www.linkedin.com/in/aleksandermolak/
Join Causal Python Weekly: https://causalpython.io
The Causal Book: https://amzn.to/3QhsRz4

 016 - CB016 - Emre Kiciman - Transcript

Emre Kiciman: The thing that struck me was that each of these papers was really well grounded in social science theories, really well kind of backed up by data analysis, coming to these beautiful conclusions, and then the very last slide would inevitably be: but this is observational data. Correlation is not causation.

Anything could be going on. We don't know. Once you start using causal analysis approaches to gather insight from data, you realize how important it is to be driving any decision-making process causally rather than through a hand-wavy guess based on a correlational analysis.

Alex: What's next for DoWhy and EconML?

Emre Kiciman: So with the DoWhy, well, one of the projects I'm excited about is 

Marcus: Hey Causal Bandits, welcome to the Causal Bandits Podcast. The best podcast on causality and machine learning on the internet. 

Jessie: Today we're traveling to Vancouver to meet our guest. Computer science is his long standing love. He moved from low level Computation to Social Computing, and from there to Causality.

He's one of the core developers of the DoWhy library. A passionate climber and a family man. Senior Principal Research Manager at Microsoft Research. Ladies and gentlemen, please welcome Dr. Emre Kıcıman. Let me pass it to your host, Alex Molak.

Alex: Ladies and gentlemen, please welcome Emre Kıcıman.

Emre Kiciman: Thanks very much, Alex.

I'm happy to be here. 

Alex: I'm very happy you found a while to join us for today's episode. Emre, are large language models causal parrots? And if they are, does it matter?

Emre Kiciman: I think the question of whether large language models will be able to reason causally at some point is up in the air. Maybe someday they will.

I don't think that they really are right now. However, the power that they bring with their embedded knowledge about the world, if we treat them as a beginning of a common sense database about how the world works, I'm very excited about how that information can be used to help augment the causal analysis process today.

Not replacing, you know, statistical estimation methods and all the algorithms we've been developing, but really augmenting and opening up an opportunity to provide support for people who need domain expertise at their fingertips. Some kind of help for setting up their causal assumptions. 

Alex: What would be the scenarios where you find LLMs to be the most useful today in the context of causality and causal inference or causal discovery process?

Emre Kiciman: Yes, you know, for a long time we've talked about the importance of the assumptions you bring to a causal process, right? I think it's a refrain that everyone says. You can't get causality just from the data. You need to bring in your own knowledge about the data generating process and what might be plausible, what mechanisms might be plausible, what information might be unobserved.

And until now, we've had to tell people that they have to go and figure that out entirely on their own. Our computers can't help. And what I'm most excited about at the moment is large language models coming in and providing us some way to provide technological support to people at that stage of analysis.

You know, they can now come with an open question, some data set, and say, what are the plausible causal mechanisms here that I should be considering? And the LLM can give them responses. Or they can come in and say, here's what I think might be happening. Is there anything I'm missing? And the large language model can critique their assumptions and tell them where they might be going wrong and maybe where they need more verification or validation.

This doesn't mean, of course, that the domain expert shouldn't have the final say, but it does mean that the domain expert's burden is greatly alleviated. They're not starting from scratch anymore.

Alex: In our conversation with Ishansh Gupta from BMW Group, we discussed using LLMs, uh, for this process of constructing the knowledge graph with domain experts within the organization.

And one thing that I really liked, uh, that he shared with me was that LLMs were not only helping them speed up the process, but were also an element that was providing additional motivation for the domain experts to share their knowledge, because they were presented with something, like the initial graph obtained using large language models. And then they were inspired or motivated to also criticize it and share the knowledge and show that their knowledge is valuable. So it was not only making the process more efficient time-wise, but it was also inspiring people to engage more within the process.

Emre Kiciman: Yes, I was really excited when you told me about that story and I'm looking forward to going back to your last podcast and listening to the full description of it. I find that really exciting and I can understand why. You know, we engage differently when we feel like what we're doing is valued. And so if you see that someone has taken the effort to already build a causal graph and you can start to see the connections about how that information is going to be used for, you know, the task, I can understand how that will be very engaging, right?

Alex: And so that's interesting that the technology is changing, but something about the human nature stays the same, although the environment is different. Yes. What do you think are the most important challenges today when it comes to causality, causal inference, causal discovery, in general, working with causal models?

Emre Kiciman: I mean, I guess it depends on whether we're talking about this from an academic or a practical standpoint. I think from the practical standpoint, it's just getting causal methods deployed more widely in every place where they might be relevant for decision making and understanding of the world. I think that education is still quite critical.

I think just understanding the basics of causal concepts and how they work. I think other fields like, you know, basic statistics, for example, have a leg up on causality there. So I think we need to make some more progress, but it's happening with, you know, books like yours, for example. On the academic front, I think that there's wide open opportunity.

We've gone very deep on algorithms and processes for a set number of tasks. But as computers are becoming more widely deployed all around industries and society, I think there's a lot more data about a much broader set of problems. And so that means that simple effect inference under, like, binary treatment, that's already considered a rather simplistic case.

I think we're seeing that there's now a much broader set of tasks, and so I think that there's a lot of opportunity. For example, I think more complex modeling of more complex physical processes, where you have feedback loops over time, I think that's something that causality should be tackling.

And, you know, again, like wide open opportunity to find the right approaches for modeling those systems.

Alex: You started working with causality relatively early before it became popular recently. 

Emre Kiciman: Not as early as many people, right? I mean, it's been going on for, for decades, centuries. 

Alex: Yeah. Well, of course we have those waves, right?

So I'm referring to the, maybe the most recent wave and taking this as a point of reference. Uh huh. What was your journey to causality? What made you inspired to start asking those questions and dig deeper into this area of research?

Emre Kiciman: Yes, I was at a point where I was working on a research topic related to computational social science, in particular, the analysis of large scale social media data to gain insights about the world around us, about how people behave in the world, how do people make friends, where do people want to go during the day, what drives health decisions, all sorts of really fascinating topics.

But all coming from what people were just happening to talk about on public social media. And what I found was one very specific day when I was at a conference, you know, an academic conference, and people were talking about these really exciting insights they were gaining. And one of the things that struck me was that each of these papers was really well grounded in social science theories, really well kind of backed up by data analysis, coming to these beautiful conclusions, and then the very last slide would inevitably be: but this is observational data, correlation is not causation, anything could be going on, we don't know.

And that was so disappointing. It's like you've got this beautiful balloon and at the very end of your presentation, you just pop it, you say, we don't really know. And, uh, I had heard about this thing called causal inference, and that you could, under certain conditions, actually make causal claims based on observational data. And so that was the day when I decided, you know, I need to go learn that and bring it back to this task here, this problem.

Alex: Mm hmm. Mm hmm How did you start learning about causality? What was your your journey there? 

Emre Kiciman: My first step was to pick up Pearl's "Causality" book.

Alex: That's courageous.

Emre Kiciman: My second step was to put it down. No, I really tried hard, but it was difficult for me to make progress with that book immediately. So I started looking at, uh, papers, just any paper I could get my hands on about causality and how people were applying it to real problems. And in the end, the one that really helped me click with just the intuition behind why observational data might give you any hints about causality was Rosenbaum's paper on the importance of propensity scores.

Alex: It's a classic paper from the 70s. Yeah, the 70s. 

Emre Kiciman: Yeah. And that paper, it had just a very, very clear description of how the propensity score acted as a balancing score to balance your control and treatment groups, such that, at least for the features that you are explicitly accounting for, so under the causal assumptions, you are basically simulating a randomized controlled trial, right? Obviously, you know, not a real randomized controlled trial, which would require fewer assumptions. But, uh, still that intuition suddenly made things click for me. And then from there I was able to better understand graphs and make my way back to Pearl's book.

Yeah. And the rest of the literature.
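
The balancing intuition described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis from Rosenbaum's paper: it simulates a single confounder, fits a propensity model by plain gradient ascent, and then uses inverse propensity weighting (a swapped-in technique chosen for brevity, rather than matching or subclassification) to show that the weighted treated and control groups end up with similar covariate means. All names and numbers are made up for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(xs, ts, lr=0.5, steps=300):
    """Fit P(T=1 | x) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, t in zip(xs, ts):
            err = t - sigmoid(a + b * x)
            grad_a += err
            grad_b += err * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


random.seed(0)
n = 2000
# Hypothetical setup: one confounder x that makes treatment more likely.
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ts = [1 if random.random() < sigmoid(x) else 0 for x in xs]

a, b = fit_propensity(xs, ts)
scores = [sigmoid(a + b * x) for x in xs]

treated = [i for i in range(n) if ts[i] == 1]
control = [i for i in range(n) if ts[i] == 0]

# Covariate gap between the groups before any adjustment.
raw_gap = (sum(xs[i] for i in treated) / len(treated)
           - sum(xs[i] for i in control) / len(control))

# Covariate gap after weighting each unit by the inverse of its propensity.
weighted_gap = (
    weighted_mean([xs[i] for i in treated], [1.0 / scores[i] for i in treated])
    - weighted_mean([xs[i] for i in control], [1.0 / (1.0 - scores[i]) for i in control])
)

print(f"raw gap: {raw_gap:.3f}, weighted gap: {weighted_gap:.3f}")
```

The raw gap reflects that treated units tend to have higher x; after weighting, the two groups look much more alike on x, which is the sense in which the propensity score "balances" the groups for the features it accounts for.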

Alex: Yeah. Pearl's Causality is a book that might be very, very challenging for anyone just starting with the topic because of the way it's constructed. It's a great book. It's an amazing book. I call it the Talmud of causality because it not only contains the content itself, but also commentary on this content and commentary on someone's commentary on this content and so on.

But nevertheless, it can be very challenging for someone who is just starting. Yeah. Today we are in a much better situation in terms of available resources. What resources would you recommend to people who are just starting with causality?

Emre Kiciman: I think depending on where they're coming from, uh, there's different, different resources.

I think there's quite a few new books out there. I haven't read all of them or most of them even, but I have most of them. I think I just haven't read them. Um, and I think that, you know, someone who's coming from like a data scientist or programming background is going to probably want a different path than someone who's coming from a statistics background, for example.

Um, I do see most books seem to be organized by method, right? So here's propensity score based approaches. Here's double ML based approaches. Here's instrumental variable based approaches. And I think that's great. I do wish that there was a resource that approached it more from high level concepts first, and then, you know, the methods happen to implement those concepts, right?

Kind of providing a broader umbrella over the material, over the area. 

Alex: You're one of the, uh, creators of the DoWhy and EconML packages. What was the main driver for you to engage in those projects?

Emre Kiciman: It was really a desire to, uh, you know, broaden the usage of these methods. Once you start using causal analysis approaches to gather insight from data, you realize how important it is to be driving any decision-making process causally rather than through, uh, a hand-wavy guess based on a correlational analysis.

It just makes your assumptions much clearer and it makes it much easier to recognize the limits of your analysis and the places where you can be highly confident about the outcomes. We, uh, with Amit Sharma, the two of us started giving a tutorial about causal approaches with the purpose of, you know, educating more people about these methods.

And we found that we started working on the DoWhy library, almost like a pedagogical example, right? Just so we had some library that could, you know, we could use for coding examples. And that was actually, so that was one of the reasons why we've structured the DoWhy library around these four stages of causal analysis.

Coming up with your models and your assumptions, then analyzing those models to identify a causal estimate and figure out an approach to answering a causal question. A third stage then being actually doing the statistical estimation to calculate the, your values from data. And then at the very end, validating your assumptions or refuting them, trying to refute them.

And I guess this ties into what I was saying earlier about that kind of high level process overview. These are four steps that you have to do regardless of what causal estimation methods you're working with. That pedagogical library, people then started finding useful. And so we started thinking harder about, you know, what we needed to do to make it more robust and more practically useful.
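
The four stages described above (model, identify, estimate, refute) can be walked through on a toy problem. The sketch below deliberately does not use the DoWhy API; it is a standard-library illustration in which every function name, the graph, and the simulated numbers are assumptions made for the example. The identification step is simplified to "adjust for common parents", which happens to be valid for this three-node graph but is not a general backdoor algorithm.

```python
import random

random.seed(1)

# Stage 1: model. Assumed graph: Z -> T, Z -> Y, T -> Y (Z confounds T and Y).
graph = {"Z": ["T", "Y"], "T": ["Y"], "Y": []}


def simulate(n=20000, true_effect=2.0):
    """Generate data consistent with the assumed graph."""
    rows = []
    for _ in range(n):
        z = 1 if random.random() < 0.5 else 0
        t = 1 if random.random() < (0.8 if z else 0.2) else 0
        y = true_effect * t + 3.0 * z + random.gauss(0.0, 1.0)
        rows.append({"Z": z, "T": t, "Y": y})
    return rows


# Stage 2: identify. Simplified rule: adjust for common parents of T and Y.
def backdoor_set(graph, treatment, outcome):
    def parents(node):
        return {p for p, children in graph.items() if node in children}
    return sorted(parents(treatment) & parents(outcome))


def mean(values):
    return sum(values) / len(values)


# Stage 3: estimate. Difference in means within each stratum, weighted by stratum size.
def estimate_effect(rows, treatment, outcome, adjust):
    strata = {}
    for r in rows:
        strata.setdefault(tuple(r[a] for a in adjust), []).append(r)
    total = 0.0
    for group in strata.values():
        y1 = [r[outcome] for r in group if r[treatment] == 1]
        y0 = [r[outcome] for r in group if r[treatment] == 0]
        total += (mean(y1) - mean(y0)) * len(group) / len(rows)
    return total


# Stage 4: refute. A shuffled (placebo) treatment should show roughly zero effect.
def placebo_refute(rows, treatment, outcome, adjust):
    shuffled = [r[treatment] for r in rows]
    random.shuffle(shuffled)
    fake = [dict(r, **{treatment: t}) for r, t in zip(rows, shuffled)]
    return estimate_effect(fake, treatment, outcome, adjust)


rows = simulate()
adjust = backdoor_set(graph, "T", "Y")
naive = estimate_effect(rows, "T", "Y", [])        # ignores the confounder
effect = estimate_effect(rows, "T", "Y", adjust)   # adjusts for Z
placebo = placebo_refute(rows, "T", "Y", adjust)

print(f"adjust for {adjust}: naive={naive:.2f}, adjusted={effect:.2f}, placebo={placebo:.2f}")
```

The naive estimate is inflated by the confounder, the adjusted estimate lands near the simulated effect of 2.0, and the placebo check stays near zero. A failed placebo check would be a sign that something contradicts the assumptions, while a passed one does not prove them.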

And it's grown quite a bit since then. We've broadened the initiative and a lot of people have joined in on the effort to make it a robust library. And we've also become part of a broader ecosystem of libraries that are working together for different aspects of causal analysis.

Alex: DoWhy has been now moved to a new project recently, uh, called PyWhy. Can you tell our audience a little bit more about this project and the motivation about organizing the structure around DoWhy and other libraries in this, in this way? Yeah. 

Emre Kiciman: The purpose of the DoWhy library was always to broaden usage of these methods, right?

That's the pedagogical interest and then the more practical reason for making it a robust data science library. And that was also what drove the creation of the PyWhy organization. So DoWhy was, uh, an open source project under Microsoft's GitHub organization. And when others wanted to join in, you know, it didn't quite feel right for it to be a Microsoft organization still.

So Amazon in particular, uh, wanted to join in with a significant contribution. We said, you know, for the purposes of continuing to grow and foster the community around causal inference, it was the right thing to do to make it an independent organization. And since then, we've had great contributions also from MIT and Columbia.

We have, uh, Carnegie Mellon as a, uh, key partner, having contributed the causal-learn package, the Python version of the Tetrad algorithms and more, as well as contributions from, uh, WISE as well.

Alex: It's, it's really great to see. Not only the growth of the package and the entire ecosystem, but also all those synergies that appear on the way.

And I think for many people, it was really inspiring to see two major market players like Microsoft and Amazon contributing together to one open source tool.

Emre Kiciman: It's one of those cases, I think, where we just see that empowering the community to make better decisions with causal analysis, it's going to make data more valuable, it's going to make computing more valuable, and that helps all of us.

Alex: Beautiful. I'm a huge fan of this four step process in DoWhy, and I think it's really great. Every time I am teaching about causality, I use this. Even if I don't talk about the software, I'm using this as an example of how you can structure your thinking about the causal problem. The last step, the refutation step, is, as I understand, inspired by the scientific method itself in a sense.

So what a famous philosopher, Karl Popper, proposed in the 1950s, if I remember correctly, was that we can try to falsify theories. We can never prove them, but we can try to falsify them. And so this idea of refutation seems to me very closely related to this, uh, idea coming from Popper. Was that your inspiration or?

Emre Kiciman: Not, not Popper, uh, specifically, but, uh, just generally the understanding that we can find in data signs of things that contradict our assumptions, but we can never prove that the assumption is completely correct. If we could prove it, it wouldn't be an assumption.

Alex: It also shows us that causal modeling is so close to the boundaries of human knowledge. We just cannot prove stuff, and that's just where we are as humanity. So in this sense, for me, this is an epitome of the scientific method, of science itself.

Emre Kiciman: Yes, I, I think that causal discovery and effect inference are really critical parts of, of science.

It's really core to what everyone does with experiments and with what we're trying to do with understanding the world around us. People use lots of different approaches and methods. So I think what we tend to call, you know, causal effect inference and the specific approaches that we use aren't the only way that people get at that.

But conceptually, yeah, I agree with you entirely. It's, it's, uh, really at the core of science. Yeah. 

Alex: What's next for, for DoWhy and EconML? 

Emre Kiciman: So with DoWhy, well, one of the projects I'm excited about is PyWhy-LLM. Right now it's an experimental library starting up, but really what we're looking at is how we can incorporate LLMs into the analysis process with DoWhy.

So using PyWhy-LLM to help people use LLMs to generate causal graphs and to refute or critique their assumptions. So really plugging in certainly at the beginning and the end of this four step analysis process. And then experimenting with opportunities to do maybe identification style analyses, for example, using domain knowledge to identify potential instrumental variables, and also maybe even providing support to code up analyses as well.

So those are a little bit tentative, but we're relatively confident that we'll be able to use LLMs in some way to bootstrap assumptions and critique assumptions.
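
To make the idea of bootstrapping assumptions with an LLM a little more concrete, here is a hedged sketch of one possible workflow: ask a yes/no question about every ordered pair of variables and collect the proposed edges for a domain expert to review. This is not the PyWhy-LLM API. The functions `suggest_edges` and `stub_llm`, the prompt wording, and the variables are all hypothetical, and the model call is replaced by a hard-coded stub so the example runs on its own.

```python
from itertools import permutations


def suggest_edges(variables, ask_llm):
    """Ask a yes/no question for every ordered pair and collect proposed edges."""
    edges = []
    for cause, effect in permutations(variables, 2):
        prompt = (f"Does a change in {cause} directly cause "
                  f"a change in {effect}? Answer yes or no.")
        if ask_llm(prompt).strip().lower().startswith("yes"):
            edges.append((cause, effect))
    return edges


def stub_llm(prompt):
    """Placeholder standing in for a real model call; answers from a fixed table."""
    known = {("altitude", "temperature"), ("temperature", "ice cream sales")}
    for cause, effect in known:
        if f"in {cause} directly" in prompt and f"in {effect}?" in prompt:
            return "yes"
    return "no"


variables = ["altitude", "temperature", "ice cream sales"]
proposed = suggest_edges(variables, stub_llm)

# The proposals are only a starting point: the domain expert keeps the final say.
for cause, effect in proposed:
    print(f"{cause} -> {effect}  (needs expert review)")
```

In a real setting, `ask_llm` would wrap whatever model client is available, and the resulting edge list would feed the modeling stage of the analysis rather than replace the expert's judgment.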

Alex: Causality recently is, not always explicitly, sometimes implicitly, at the forefront of the hottest, most engaged discussions about artificial intelligence.

It comes from the fact that the models like GPT and Sora and recently released Latch world model also works, uh, all of this stuff, uh, is going into the direction of modeling the world in some ways. And if you want to model the world in a way that is, uh, sound, causality, in my opinion, is a necessary element of such a model.

What are your thoughts about all those generative and non-generative methods that we have today? And do you think that those models can learn world models or causal world models or approximately causal world models?

Emre Kiciman: I think it's, it's plausible to think that they can. I think with current like large language models, I think there's, you know, you can imagine that the amount of data that they've seen has counterfactual scenarios.

And it's plausible that that could lead them to actually model a true causal model, if that was the most efficient representation, for example. However, I don't think that we've necessarily done that on purpose. And I certainly think that even if this was true, we would probably only have observed counterfactuals for a certain, you know, you wouldn't have population support, right, at some point.

And once you start extrapolating, not clear to me what would happen. So do I believe that it's possible for these to, possible for them to learn causal models? I think it's possible. Do I think that they are? No, not now. Then there's a second kind of meta question, especially around the language models, which is they're not actually modeling the world, they're modeling language.

And so now the question is, if they are learning a causal model, they're learning a causal model of language, which is not the same thing as learning a causal model of the world. And so then I think we have to think about what would it mean for them to learn a model of language, and then at what point would we think that that leads to something more, something deeper?

That's a very squishy question. I think very ill defined is probably the formal way to say it. As we move, uh, foundation models to operate over different kinds of data, not language, but more direct observations of the world, I think that'll give us an opportunity to think more clearly about what the models are actually capturing.

Alex: I think that's very interesting, what you said about the support, the population support for those models. So just for those people in our audience who are less familiar with the term support, this means observing the full scope of possible situations. That rings a bell for me, it's very close to my thinking about these models as well.

Especially when Sora was released and OpenAI suggested that Sora is a physics simulator. I thought this is an overstatement and I think you can see this in the videos. So going back to the Popperian logic, right? I would say that some of the videos that we've seen so far falsify the claim that this is a physics simulator, but they do not necessarily falsify the claim that this model learns to be an approximate physics simulator or a local approximate physics simulator. So it can simulate physics in certain areas, but not necessarily in other areas. And when we think about it from this point of view, we can take, broadly speaking, perhaps two perspectives here. One would be that the model only learns to predict something, so it just learns a certain shortcut that leads to a plausibly looking output, or it really learns a function that is a correct or approximately correct function of how the world works locally. Yeah. What are your thoughts about this, and which of those ways do you see as more plausible, if any?

Emre Kiciman: I think you're right that, that if anything, it's learning an approximate local simulation. And I think that that's quite reasonable. In some ways it ties into questions about whether these models, whether it's okay for these models to be generating ungrounded responses or hallucinations.

If you're doing creative writing, yes, that's perfectly fine. It's part of the task. If you're summarizing a conversation and you want to make sure that summary is accurate, no, it's not okay. Similarly, if you're asking something like Sora to generate a creative video, it's fine if it skips corners around physics and stuff to make the visual look right.

Like, one of their example videos is two pirate ships battling in a coffee cup. Why do you need physics? Like, the scenario doesn't make sense. So why do you care about whether you're violating physics or not? And in that scenario, you know, there's the waves of coffee in the coffee mug that look really, really nice.

But if you think about it for a second, why are there waves in a coffee cup? That doesn't make sense, right? And so you need it to violate physics in order to satisfy the creative, you know, imperative.

Alex: And to make something look realistic, right? Yes. 

Emre Kiciman: That's a paradox. 

Alex: Yeah. 

Emre Kiciman: I think it's fine that they, that, that they aren't following physics.

Now the question is then, when we do want them to correctly model, uh, the physics of the world, are we going to have the right controls to allow us to do that? You know, like if I pick up a ball right here and drop it, you know, in a Sora video, and I say, what's going to happen if I drop the ball, it's probably going to say that it's going to drop. But, you know, it's making an assumption about where we are.

Maybe the ball is actually a helium balloon, and it's going to go up instead. That would be just as physically plausible. And yet, you know, it's going to have to make an assumption about which approach to model, and then how do we then change that assumption if we actually wanted something else?

Alex: That's a very interesting point. In a sense, I'm sometimes thinking about two axes there with those generative models. One is like impressiveness, and the second one is usefulness. And sometimes something very impressive can be very useful, for instance, in the example that you gave, creative writing or creative video generation. But sometimes impressive is negatively correlated with useful, right?

In certain cases. What do you think is the future of this generative revolution that we are experiencing today?

Emre Kiciman: Oh, it's so hard to tell. I feel like we're at the beginning of the internet, uh, the commercial internet in the mid-nineties, right? Where we can start to see what's coming on the horizon.

But it's going to take a while, and there's a lot that's going to happen that we don't know. Like in the mid-nineties, I remember it was very clear that video was going to move over, e-commerce was going to happen, and yet, you know, taking e-commerce as an example, it took at least 10 plus years.

For that to be plausible, we needed to invent secure HTTP. We needed to engineer secure database back ends so people couldn't hack the e-commerce databases. We needed credit card companies to generate, uh, their fraud policies to protect consumers before people felt comfortable.

All of this infrastructure and engineering had to happen, even though we could see that it was obviously gonna happen. And now I feel like we're in a similar spot. We can see that so much is possible, but it's going to take more work and probably more work than we anticipate to make that happen. I think we'll move faster than we did with the internet revolution.

Just the pace of change seems to be faster this time around, but we'll have to see. And analogously, that leaves out surprises like Uber and Lyft style applications that were only plausible once you had smartphones and the internet together. That's something that was harder to see in the mid-90s.

Right. And so what's the equivalent, what are the complementary technologies that are going to come along to make AI, you know, even more impactful than we are imagining today? I think that's a little bit harder for me to figure out, what might happen.

Alex: In your journey with computer science, you started with low level computations.

How do these experiences impact your understanding of causality today?

Emre Kiciman: For me personally, I think that I tend to approach analyses, and understanding of causal analyses, more mechanically. So, you know, when I read the Rosenbaum paper about propensity scoring, I didn't just read the paper.

I also implemented the analysis from scratch and stepped through it, looked at the data as it was being balanced, and measured and double-checked that my intuitions made sense and that they were correct, and experimented with different setups, right?

To push the boundaries of what happened, to really understand its behavior. And I think that continues today. So when I'm trying to figure out how LLMs might behave causally, I need to get my hands dirty to really give myself confidence about any intuition. So I think that continues.

And I also try to think about how we might use these models, not just on their own terms, text in, text out, but also thinking about what we can be doing at different layers of the inference stack to control their behavior. What are all the knobs that we might use to influence whether a foundation model is appropriate for a particular task, or to make it appropriate for a particular task?

And this is not necessarily different, I think, from how a machine learning researcher would think about these models' behavior across different levels of abstraction, but it maybe is a slightly different set of knobs just because of the systems work that I bring, and I probably miss things that the ML folks think are obvious as well.
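As a rough illustration of the kind of from-scratch exercise Emre describes (fit propensity scores, then look at how the data gets balanced), here is a minimal sketch. Everything in it is an assumption made for illustration: the synthetic data, the variable names, and the choice of inverse propensity weighting as the adjustment. It is not the code Emre wrote, and it uses only NumPy rather than DoWhy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): a single confounder x drives
# both treatment assignment and the outcome. The true treatment effect is 2.0.
n = 5000
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-0.8 * x))
t = rng.binomial(1, p_true)
y = 2.0 * t + 1.5 * x + rng.normal(size=n)


def fit_logistic(X, t, n_iter=25):
    """Fit a logistic regression with Newton's method and return coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def weighted_group_diff(v, t, w):
    """Weighted mean of v among treated minus weighted mean among controls."""
    treated = np.average(v[t == 1], weights=w[t == 1])
    control = np.average(v[t == 0], weights=w[t == 0])
    return treated - control


# Estimated propensity scores e(x) = P(t = 1 | x).
X = np.column_stack([np.ones(n), x])
e = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))

# Inverse propensity weights: 1/e for treated units, 1/(1 - e) for controls.
w = np.where(t == 1, 1.0 / e, 1.0 / (1.0 - e))
ones = np.ones(n)

# Balance check: standardized mean difference of the confounder.
smd_before = weighted_group_diff(x, t, ones) / np.std(x)
smd_after = weighted_group_diff(x, t, w) / np.std(x)

# Effect estimates: naive difference in means vs. weighted difference.
naive = weighted_group_diff(y, t, ones)
ipw = weighted_group_diff(y, t, w)

print(f"confounder imbalance before: {smd_before:.3f}, after: {smd_after:.3f}")
print(f"naive estimate: {naive:.3f}, weighted estimate: {ipw:.3f}")
```

Changing the strength of the confounder or the sample size and watching how the balance and the two estimates move is one way to "push the boundaries" in the sense described above.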

Alex: In my recent conversation with Robert Ness, your colleague, I remember Robert mentioning that one of the paths with generative models that he finds inspiring and potentially useful is learning how to causally control the generation, in a sense. Just, you know, creating a set of knobs for ourselves where we can modify just one or two aspects of the image without generating it from scratch.

What do you think about this path, and about thinking of causality in terms of causal impact on the output of a model rather than the model modeling causal reality outside of itself?

Emre Kiciman: I think there are lots of ways to mechanically influence the behavior of a generative model, right? I mean, there are even people going in and artificially reweighting activations,

you know, based experimentally on what works or doesn't work. And I think that's a fine way to approach the problem. In a sense, one way to think about it is that we have the standard input and output to the model, but then we have all this other knowledge that we can bring to bear that describes what the generation should look like.

I mean, the obvious kind of examples are, you know, we're telling the model to generate JSON. Well, we also know exactly what JSON looks like, right? Or Python code, or whatever structured information you might want to get out of a text-based language model. Or, maybe less structured, but a similar kind of knowledge about what we want to get out of an image.

What parts to change or not change. And it's difficult to express those in a text command, but we have other knobs. So trying to figure out how we can use other knobs to impose that other knowledge that we have, that we can't otherwise express, I think is a great way forward.

Alex: What keeps you motivated in your work?

Emre Kiciman: I think what keeps me motivated is really thinking about the potential to impact real problems. That's what brought me from lower-level systems problems to higher-level social science, computational social science tasks that were maybe closer to people in society.

Even today, when I'm using causal analysis methods, even though I'm thinking about the platform we're building horizontally, I'm always looking for end-to-end problems to validate that these approaches work or to push the boundaries of technology. And the ones that excite me the most are the ones that touch on some important societal problem.

So that's what continues, I think, to motivate me: that question of how this is going to make a difference in the world.

Alex: Who would you like to thank?

Emre Kiciman: Oh, so many people. I think in this context, I clearly have to thank my closest collaborators. So Amit Sharma has been an incredible collaborator for so long, and really, I can't say enough good things about him.

My other recent collaborators in the causal realm, of course, would be Robert Ness and Chenhao Tan. We wrote the paper on causality and large language models last April together. And then all of the people who are working together with us in the PyWhy environment. So colleagues at MSR New England who develop EconML, our collaborators at Amazon and in academia who are contributing to the library.

I think it's really great to see. Everyone has the sense of, you know, here's the direction we're all going. And we're working on different pieces of technology, but we see how it all connects together and how everything fits to push forward more causal clarity around important tasks and decision making.

Alex: What would be your message to the causal Python community?

Emre Kiciman: I think it's actually pretty healthy right now. I think that we have quite a few libraries. So beyond PyWhy, we have others as well, CausalPy and others. And I think you have great resources around what that full ecosystem looks like. And I think the advice that I'd have for people who are trying to continue pushing the causal Python community forward is to not lose sight of the end goal.

That is, what are the problems people are trying to solve with these methods, and really looking very broadly at approaches that make it easier to solve those problems with causal analysis. So that might be something very deep in an algorithm. It might be something more, you know, software engineering related, like how easy it is to ingest data into one of these algorithms, or it might even be documentation.

There's just so many ways to push this technology forward in ways that make it more useful and applicable to real world tasks. 

Alex: What question would you like to ask me? 

Emre Kiciman: I know you go around and are working with many different groups of people on causal problems. I'd love to chat with you about what you're learning about people's causal journeys, what trips them up, and what doesn't.

And what can we, you know, developing the causal open source community, do to help? Is it education? Is it better tooling or more easily accessible tooling? Or is it really making sure that these things scale to industry problems? That would be, I think, something I'd love to learn from you.

Alex: That's a great question. And it could probably cover an entire episode, but trying to be more compact with this, I think one thing that could be very, very useful for the community, aside from education, which I think we still need more of, would be to make partial identification slash sensitivity analysis slash proximal learning

stuff more accessible to people. My belief, based on my experiences with people in industry but also in academia, is that those three things, which I feel are very closely related at their core, are probably among the most underutilized and underrepresented concepts in causality that can really move many cases forward.

So many people don't realize that using sensitivity analysis might be just enough for them to make optimal decisions, even if their model cannot be fully specified. The second point here is that some people realize this, or might realize it partially, or have just heard about it, but they don't know how to do it technically.

And so this stops them from moving in this direction. Many of those people are, you know, as we all are, busy, and they don't have too much time to devote to studying something in depth from scratch and implementing something from scratch and so on. So I think that would be one thing that could have a very large impact on applied causal inference today.

And I know that DoWhy uses the method from Carlos Cinelli, so you can use it. It's a great method for sensitivity analysis, but there's much more than this, and many more methods that can go beyond the limitations of the methods proposed by Cinelli, right?

So I think that would be one thing that could really move the needle forward.
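To make the sensitivity analysis idea a bit more concrete, here is a minimal sketch of the robustness value and partial R² from Cinelli and Hazlett's omitted-variable-bias framework, computed from nothing more than a regression t-statistic and its residual degrees of freedom. This is plain Python written for illustration, not DoWhy's API, and the function names and example numbers are hypothetical.

```python
import math


def partial_r2(t_stat, dof):
    """Partial R^2 of the treatment with the outcome, from its t-statistic
    and the residual degrees of freedom of the regression."""
    return t_stat**2 / (t_stat**2 + dof)


def robustness_value(t_stat, dof, q=1.0):
    """Robustness value RV_q: the minimum partial R^2 that an unobserved
    confounder would need with BOTH treatment and outcome to reduce the
    point estimate by a fraction q (q=1.0 means all the way to zero)."""
    f_q = q * abs(t_stat) / math.sqrt(dof)  # scaled partial Cohen's f
    return 0.5 * (math.sqrt(f_q**4 + 4.0 * f_q**2) - f_q**2)


# Hypothetical regression output: treatment t-statistic 2.0, 100 residual dof.
rv = robustness_value(t_stat=2.0, dof=100)
r2 = partial_r2(t_stat=2.0, dof=100)
print(f"robustness value: {rv:.3f}, partial R^2: {r2:.3f}")
```

Read this way, a small robustness value says that even a fairly weak unobserved confounder could explain away the estimate, while a large one says the conclusion holds unless the missing confounder is strong. That is the sense in which sensitivity analysis can support a decision without a fully specified model.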

Emre Kiciman: Yeah. Partial identification. I know that we're starting to design that. So Adam from Columbia, one of our contributors in the context of causal discovery and the PyWhy ecosystem, has been working on graph representations for partial graphs.

And I think that'll lead to a broader set of identification algorithms. Yeah. But it is still pretty early.

Alex: Um, 

Emre Kiciman: And of course, sensitivity analysis then comes after. Right. Yeah. It's very interesting. That's something that we'll have to keep in mind.

Alex: Emre, what's next for you?

Emre Kiciman: Yeah, so for the major effort that I'm excited about in the context of causality, I think there are two directions. One is continuing to push on making it more practical to use large language models to support people in the standard causal analysis process. How can we use large language models to suggest causal graphs, suggest potentially missing data and missing confounders, and critique the analyses as people put them together?

The second direction that I think might be starting up is looking at how foundation models might help us better model more complex, physics-style systems. I think that's very early going, but it's something I'm starting to look at to see if it's plausible as a contribution. But if it is, I think that would open up models to be applied in a really broad set of very exciting applications.

Alex: Mm hmm. Before we conclude, I would like to ask you one more technical slash opinion question. Some time ago, I remember Yann LeCun tweeted that predicting pixels, or training a model to predict pixels, is a wasteful way to learn something about the world. And he proposes architectures that are more predictive than generative at their core.

What are your thoughts about those two approaches to learning something about the world? 

Emre Kiciman: Yeah. I think there's maybe two ways to take that, that statement. One is like, are pixels the right representation? And I think pretty clearly, no, like there's some latent representation that, you know, that we have about the world and that we would expect a model would develop, uh, as well.

Right. And when we think about the physics of the world, we think about the physics in that latent representation, not in, you know, pixels. And so I think that's, that's right, that we don't want to be learning just based on pixels. But, you know, what view do we have of the world? Opportunistically, we have pixels.

And so maybe pixels are the best way to get to that latent representation. Whether you very quickly switch from pixels to latent representations, or whether it's done implicitly as part of a larger process, I'm not sure. And that's where I think the second aspect of his statement, saying that it's wasteful, is more of a value judgment. You know, if you have the GPUs to spare, who knows?

Maybe you're not going to see it as wasteful. But regardless, I think going for a latent representation of the world that's learned from a variety of signals, and then implicitly or explicitly learning physics over that, does make a lot more sense to me.

Alex: Where can people find out more about you and your works?

Emre Kiciman: I have a mostly up-to-date website at kiciman.org, so k-i-c-i-m-a-n dot o-r-g.

Alex: Yeah, 

Emre Kiciman: Most of my papers are there. If they're not there, they're on Google Scholar.

Alex: Where can people best connect with you? 

Emre Kiciman: So probably email is the best way, and that's on my website.

Alex: Great. Cool. Thank you so much.

It was a pleasure.

Emre Kiciman: Thank you, Alex. It was great.

Alex: Thank you, Emre.