Causal Bandits Podcast

Causal AI at Causal Learning & Representation CLeaR 2024 | Part 1 | CausalBanditsPodcast.com

Alex Molak Season 1 Episode 24


Root cause analysis, model explanations, causal discovery.

Are we facing a missing benchmark problem?

Or not anymore?

In this special episode, we travel to Los Angeles to talk with researchers at the forefront of causal research, exploring their projects, key insights, and the challenges they face in their work.

Time codes:
00:15 - 02:40   Kevin Debeire
02:41 - 06:37   Yuchen Zhu
06:37 - 10:09   Konstantin Göbler
10:09 - 17:05   Urja Pawar
17:05 - 23:16   William Orchard

Enjoy!

Support the show

Causal Bandits Podcast
Causal AI || Causal Machine Learning || Causal Inference & Discovery
Web: https://causalbanditspodcast.com

Connect on LinkedIn: https://www.linkedin.com/in/aleksandermolak/
Join Causal Python Weekly: https://causalpython.io
The Causal Book: https://amzn.to/3QhsRz4

Alex: Root cause analysis, explanations, and causal discovery. This week, we traveled to sunny California to hear from researchers who presented their work at this year's edition of the CLeaR Conference on Causal Learning and Representation. Enjoy.

Kevin Debeire (DLR): I'm Kevin Debeire. I work for the German Aerospace Center in Munich.

So my work is about applying causal discovery methods to climate time series. At this conference, I was presenting a paper in which we used bootstrap aggregation to provide confidence measures for the edges output by time series causal discovery methods. We also found empirically that the graph resulting from the bootstrap aggregation has higher precision and recall compared to the baseline method we tested in the empirical experiments.
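A minimal sketch of the bagging idea Kevin describes, under stated assumptions: discover_graph is a hypothetical stand-in for any time series causal discovery method (such as PCMCI), and the block-bootstrap scheme and the 0.5 edge-frequency threshold are illustrative choices, not the exact procedure from the paper.

```python
import numpy as np

def discover_graph(data):
    """Placeholder for any time series causal discovery method
    (e.g. PCMCI). Must return a binary adjacency matrix."""
    raise NotImplementedError

def block_bootstrap(data, block_len, rng):
    """Resample a multivariate time series in contiguous blocks
    to preserve serial dependence."""
    n = len(data)
    starts = rng.integers(0, n - block_len, size=n // block_len + 1)
    blocks = [data[s:s + block_len] for s in starts]
    return np.concatenate(blocks)[:n]

def bagged_graph(data, n_boot=100, block_len=50, threshold=0.5, seed=0):
    """Run causal discovery on bootstrap replicates and aggregate.

    Returns per-edge frequencies (a confidence measure for each edge)
    and the majority-vote aggregated graph."""
    rng = np.random.default_rng(seed)
    d = data.shape[1]
    edge_freq = np.zeros((d, d))
    for _ in range(n_boot):
        sample = block_bootstrap(data, block_len, rng)
        edge_freq += discover_graph(sample)
    edge_freq /= n_boot
    aggregated = (edge_freq >= threshold).astype(int)
    return edge_freq, aggregated
```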

Alex: What impact would you like to see of your work in the real world?

Kevin Debeire (DLR): For me, I'm really focusing on applications to climate time series, and I would like to see applications of my methods there: to better understand how climate processes are linked together, so we can better understand, from climate models, how climate change is impacting those processes.

We would also like to understand whether climate models are able to represent these processes correctly in the historical data.

Alex: What are the main insights, something that you learned while working on this project?

Kevin Debeire (DLR): Good question. For me, it was my first time working with so many people on a paper, so I learned how to write a paper as a group, and also about the whole review process. That's what I learned, I think.

Alex: What should people type in Google in order to find your work?

Kevin Debeire (DLR): I think if you type "bootstrap aggregation for time series causal discovery", you will find the arXiv paper.

Alex: What is the best causal paper you read last quarter?

Kevin Debeire (DLR): Yeah, it's actually a review paper written by Gustau Camps-Valls from the University of Valencia.

It presents all the current methods. I think the paper is called "Discovering Causal Relations and Equations from Data". It explains the current methods to estimate causal relationships and equations from data, and also presents opportunities and challenges in this domain.

It's more focused on the physical sciences, so for me it's quite relevant, and I really enjoyed the paper.

Alex: Right. Thank you, Kevin. I appreciate it.

Yuchen Zhu (UCL): Hi, my name is Yuchen Zhu. I'm a PhD student at UCL.

Alex: What is your work about? The work that you're presenting?

Yuchen Zhu (UCL): Okay, so the work I'm presenting today is called "Meaningful Causal Aggregation and Paradoxical Confounding".

It's work I did while interning at Amazon Research Tübingen; it's joint work with Kailash Budhathoki, Jonas Kübler, and Dominik Janzing. In this work, we show an interesting paradox that we find when we consider aggregated causal models, which is that even the property of confounding, which is a structural property of causal models, is no longer well defined when we consider ambiguous interventions on aggregated variables.

We propose a simple realization, which we call the natural intervention, that can help us mitigate this problem, and we also try to generalize this observation to larger blocks.

Alex: What does it mean to make an ambiguous intervention?

Yuchen Zhu (UCL): Yes. So when the variables we consider are aggregations of some fine-grained, micro-level variables, then when we intervene on an aggregated variable, there are many different ways of realizing that intervention on the micro level, for which we assume that we do have a robust causal model.

Alex: Could you give an example?

Yuchen Zhu (UCL): For example, Amazon has millions of products, each of which might have a different price. When they want to ask a question such as "if I sell more cleaning products, how would that impact my downstream revenue?", the number of cleaning products sold is an aggregated variable, and you might have different cleaning products with slightly different prices, so exactly how you sell them will have different impacts on your downstream revenue.
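A toy numerical illustration of the ambiguity Yuchen describes; the products, prices, and linear revenue model below are invented for illustration and are not from the paper.

```python
import numpy as np

# Per-unit prices of three hypothetical cleaning products (illustrative numbers).
prices = np.array([2.0, 5.0, 9.0])

def revenue(units):
    """Toy micro-level causal model: downstream revenue as a
    function of units sold per product."""
    return float(units @ prices)

# Two micro-level realizations of the same macro intervention
# do(total units sold = 100):
realization_a = np.array([80, 10, 10])   # mostly cheap products
realization_b = np.array([10, 10, 80])   # mostly expensive products

assert realization_a.sum() == realization_b.sum() == 100
print(revenue(realization_a))  # 300.0
print(revenue(realization_b))  # 790.0
# Same aggregated intervention, different downstream effect: the macro-level
# effect is not well defined without specifying how the intervention is
# realized at the micro level.
```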

Alex: What impact of your work would you like to see in the real world?

Yuchen Zhu (UCL): Most variables that we care about are aggregated variables, yet there is only limited academic work analyzing how we deal with the consequences of ambiguous interventions. I would definitely like to see this problem analyzed more for real world applications. It would also be nice if there were work that could learn, from data, a set of macro variables that is complete in some sense in summarizing the micro model. I think that would be really useful for real world applications.

Alex: What should people type in Google in order to find your work?

Yuchen Zhu (UCL): The title works: "Meaningful Causal Aggregation and Paradoxical Confounding". Or you can search my name on Google; it's on my Google Scholar.

Alex: What is the best causal paper you read last quarter?

Yuchen Zhu (UCL): There is this paper that I always go back to, called "Multi-Level Cause-Effect Systems". It's written by Krzysztof Chalupka, I think in 2016 or 2017. Basically, it learns these sufficient macro variables for a specific model.

I always go back to it, and I think it has even given me some inspiration for understanding how to mitigate reward hacking, for example, for large models, because in that case we also want to think about transfer from some kind of lower dimensional model. I think it's a really nice paper.

Alex: Great. Thank you so much.

Konstantin Göbler (TUM, Bosch): I'm Konstantin Göbler, from Germany. I'm a fourth year PhD student at the Technical University of Munich, in Mathias Drton's group, and for two years I've also been affiliated with Robert Bosch GmbH; we do a bunch of things together.

Alex: The work you presented here, what is it about?

Konstantin Göbler (TUM, Bosch): So the work that I presented is called causalAssembly. It's basically aimed at facilitating the benchmarking of causal discovery algorithms. So it's more or less a service for the community, but I also think it's filling a gap in the field.

There are some issues with attaining reasonably complex ground truth data for causal discovery. So we managed to convince the people at Bosch to make data accessible. Not really public in the raw sense, but we managed to build a tool that makes proprietary data publicly accessible through some semi-synthetic data generation steps.

We do so in a somewhat principled manner, with a number of steps in the middle, so that when people benchmark a causal discovery algorithm with this library, which is implemented in Python, they can be assured that the data really follows the ground truth causal graph that is also implemented in there.

Alex: What is the advantage of your approach compared to other approaches that are available in Python today?

Konstantin Göbler (TUM, Bosch): I think the advantage is that we should basically do this for any application where people are somewhat uncertain about the ground truth causal structure that comes with somewhat complex real data. Because as soon as there's some uncertainty and you run your favorite causal discovery algorithm on this data, the worst case scenario is that your algorithm is correct, but you think it's wrong because your ground truth is wrong. So what we basically do is make sure that the data you run your causal discovery algorithm on really follows the ground truth that is at hand.
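A generic sketch of the benchmarking loop Konstantin describes, under stated assumptions: the tiny linear ground-truth graph and simulator below are invented stand-ins, not the causalAssembly library's actual data or API, and my_causal_discovery is a hypothetical placeholder for whatever algorithm is being evaluated.

```python
import numpy as np

# Hypothetical ground-truth DAG over 3 variables: X0 -> X1 -> X2
true_adj = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])

def simulate(n, rng):
    """Sample from a linear SEM that follows the ground-truth graph."""
    x0 = rng.normal(size=n)
    x1 = 0.8 * x0 + rng.normal(size=n)
    x2 = 0.8 * x1 + rng.normal(size=n)
    return np.column_stack([x0, x1, x2])

def precision_recall(estimated, truth):
    """Edge-wise precision and recall of an estimated adjacency matrix."""
    tp = np.sum((estimated == 1) & (truth == 1))
    fp = np.sum((estimated == 1) & (truth == 0))
    fn = np.sum((estimated == 0) & (truth == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

rng = np.random.default_rng(0)
data = simulate(5000, rng)
# estimated_adj = my_causal_discovery(data)   # plug in any algorithm here
estimated_adj = true_adj.copy()               # stand-in so the sketch runs
print(precision_recall(estimated_adj, true_adj))
```

The point Konstantin makes is that this loop is only meaningful when the ground-truth graph really did generate the data; causalAssembly's semi-synthetic generation is designed to guarantee exactly that.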

Alex: What impact of your work would you like to see in the real world?

Konstantin Göbler (TUM, Bosch): That's a hard one, I think. Ideally, the impact of my work is that, like I said, it facilitates progress in the field, such that causal discovery becomes more reliable and is able to scale better, so that real world problems can be solved, to a certain degree.

Alex: What should people type in Google in order to find your work?

Konstantin Göbler (TUM, Bosch): causalAssembly.

Alex: What's the best causal paper you read last quarter?

Konstantin Göbler (TUM, Bosch): It's not necessarily a strictly causal paper, but I dived a little bit into ICA, and there was a paper on nonlinear ICA that I very much enjoyed.

The paper itself is very readable, but the proof is very hard. It's a nonlinear ICA paper by Ignavier and a couple of co-authors, I think.

Alex: Do you remember the author or the title?

Konstantin Göbler (TUM, Bosch): I know that the second author is Ignavier, who was also here.

Alex: Thank you so much.

Urja Pawar (MTU): Hi everyone. I'm Urja Pawar.

I'm from Munster Technological University, Cork, in Ireland, and my PhD is funded by Science Foundation Ireland and McKesson. I'm presenting work on the impact of neighborhoods on explainable AI frameworks: I want to know how much they are able to convey the sufficiency and necessity of features, the fundamental building blocks of any explanation.

We want to know which feature is sufficient for maintaining the classification as it is, and which feature is necessary, meaning that if the feature's values get changed, then the classification will change. So there are these concepts of sufficiency and necessity. Now, we have many frameworks in explainable AI, especially state-of-the-art frameworks like SHAP, LIME, and DiCE; DiCE is a counterfactual library.

We extracted feature importance scores using those libraries, and we wanted to see how much these scores are able to convey the sufficiency and necessity of features. The reason I was doing that is because if you are provided with a feature importance ranking, you can interpret it in very different ways.

I can say, oh, A is very important, so if I change the value of A, the classification might change. If somebody else is reading that feature importance ranking, let's say a clinician, they might look at the ranking and think: this test is not given that much importance by the machine learning model,

so it might not be very important as per the model. Or they might say: go train your model again, because this test is very important. So there can be different interpretations, and we don't know what to interpret from that ranking, right? That's why we developed this concept of the explanandum: we define it properly and try to see what exactly to interpret from the ranking we are provided.

So I focus on the basic fundamental building blocks of the explanandum, which are sufficiency and necessity. I wanted to know: if I'm looking at the top ranked feature by SHAP, is that conveying that this feature is the most sufficient feature, or the most necessary feature, and so on? Now, the problem is that these frameworks are very sensitive to the neighborhoods they use for explaining, and different neighborhoods can produce different explanations.

Sometimes you can innocently use certain samples in the neighborhood, and it can produce very misleading explanations. We tried a range of neighborhoods to see how much each of them can help these frameworks convey sufficiency and necessity, and we wanted to identify the best neighborhood that can help, let's say, SHAP to convey the sufficiency of features.

The outside-based neighborhoods basically performed well, which means that if I'm focusing on samples that are outside the decision boundary, SHAP can help me produce a ranking that tells me: look, this is the most sufficient feature, then less sufficient, less sufficient, and so on. Same with the necessity of features.

But the high level takeaway from my paper, I would say, is that people should draft a specific explanandum and then try to see whether these frameworks are able to convey that explanandum or not. And if not, then experiment with different neighborhoods and try to find which neighborhood gives you the most relevant answer to your own explanandum.

So, yeah, that was pretty much my research.
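A rough sketch of the sufficiency and necessity notions Urja describes, evaluated in a user-chosen neighborhood. The scoring definitions, the scikit-learn style model.predict interface, and the uniform sampling from the neighborhood are simplifying assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

def sufficiency(model, x, feature, neighborhood, n_samples=500, rng=None):
    """How often the prediction is preserved when the chosen feature is
    held fixed while all other features are replaced by neighborhood samples."""
    if rng is None:
        rng = np.random.default_rng(0)
    base = model.predict(x.reshape(1, -1))[0]
    idx = rng.integers(0, len(neighborhood), size=n_samples)
    perturbed = neighborhood[idx].copy()
    perturbed[:, feature] = x[feature]          # hold the feature at its value
    return np.mean(model.predict(perturbed) == base)

def necessity(model, x, feature, neighborhood, n_samples=500, rng=None):
    """How often the prediction changes when only the chosen feature is
    replaced by values drawn from the neighborhood."""
    if rng is None:
        rng = np.random.default_rng(0)
    base = model.predict(x.reshape(1, -1))[0]
    idx = rng.integers(0, len(neighborhood), size=n_samples)
    perturbed = np.tile(x, (n_samples, 1))
    perturbed[:, feature] = neighborhood[idx, feature]   # change only the feature
    return np.mean(model.predict(perturbed) != base)
```

Such scores could then be compared against a SHAP or LIME ranking computed on the same neighborhood, to check whether the top ranked feature is also the most sufficient or most necessary one.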

Alex: Right. Does this conclusion mean that it's very difficult to find one framework that would universally work?

Urja Pawar (MTU): Yeah, so we have loads of papers arguing that there is no free lunch theorem in explainable AI either. The main thing is that you need to be clear about your explanatory requirements.

If I am a clinician, or some stakeholder in a finance company where loads of machine learning models are being used, then without defining specifically what to interpret from the explanation I'm provided, everything is ambiguous and fairly useless, because it becomes a tick-box exercise: oh, okay, you're providing this explanation, so I trust your model, and that's it.

What is that explanation doing? Is it helping you? Is it really representing the model truly, in all ways? So I think using multiple explainable AI frameworks with different neighborhood contexts is important to understand the actual, truer picture of a model.

Alex: What impact of this work would you like to see in the real world?

Urja Pawar (MTU): I actually had a chat with a nephrologist at Galway University Hospital back in Ireland. I developed a more complex explanandum in the medical domain, which basically asked: if I'm looking at the feature importance ranking, can I say, okay, I don't need to do further tests as per the machine learning model?

Or can I also answer another question: as per the machine learning model, which is the next most important diagnostic test that I should do? These are the questions we wanted to answer, using the same methodology that we adopted in this paper.

We discussed this with the clinicians, and they really liked our work. So I really want to see more and more truer forms of explanation, with clearly and well-defined explananda, so that people don't treat all this as a checkbox exercise for ethical permissions, and actually present the true picture of the model to the stakeholders.

Alex: What's the best causal paper you read last quarter?

Urja Pawar (MTU): I have not read a causal paper; I don't remember if I read one, because in my domain I really focus on explainability. I do think, of course, my domain should overlap heavily with causality, because without causality it's kind of ambiguous to have explainability.

And we did work on causal relationships in a way, in that we constructed medical workflows, like which test comes after which test, and so on. I think in the last quarter, the favorite paper I can share is "Dear XAI Community"; you can just search for "Dear XAI Community", and that paper talks about how scattered the research going on in explainable AI is.

People are not standardizing the concepts; people are not converging on single definitions of things. The importance of the explanandum is also highlighted in that paper: you need to be very clear about who your audience is, what exactly you are trying to explain, and what action someone can take after looking at your explanation.

So I think these are the questions that are the whole point of our domain, and people should work more and more on that, rather than just developing novel XAI frameworks that create just another ranking without defining what to get from that ranking. So yeah, thank you so much.

Alex: Thank you so much.

William Orchard (University of Cambridge): Hi, I'm Will Orchard.

I'm at the University of Cambridge, but the work I'm presenting here I did at Amazon as an intern in Tübingen. My work is about root cause analysis from a number of directions: one is how root cause analysis should be formalized as a causal problem, and two is its application to real world problems, such as microservice-based applications at Amazon.

Alex: Can you say a little bit about your approach? The approach to formalizing root cause analysis.

William Orchard (University of Cambridge): Essentially, there is a lot of existing work on root cause analysis, and most of it has typically approached the problem not from a causal perspective, despite the fact that "cause" is in the name.

One thing I've done is simply go through a lot of that work and try to figure out whether there is a unifying causal description you can give to each of these methods. At the moment there isn't very much literature on how root cause analysis methods should be classified, but somewhat unsurprisingly, you can classify them according to the causal hierarchy, by the way in which they think about what a root cause is.

Whether it's an associational question, an interventional question, or a counterfactual question. So my approach has largely been to just try to classify what already exists. And then in particular, because I've worked with Dominik Janzing, who is very interested in the counterfactual approach, my own work takes a counterfactual, contribution-based approach to root cause analysis.

Alex: What were the main challenges in this work?

William Orchard (University of Cambridge): Yes, that's a good question. I think there are a number of things. One is a question which comes up when you're trying to formalize any problem: what is the best way to ask a question which captures your intuition about how the problem should be formalized, and whether or not it truly captures the thing that you want it to capture? I think causality can give very convincing answers to whether or not you're giving a good explanation of something like a root cause. So just trying to get my head around whether we were asking the right question, how other approaches were asking this question, and what the relative merits of each of those were, that was challenging on a conceptual level.

And then, relatedly, the question was whether or not this formalization actually works in real life, for real problems. That involves talking to engineers and people who work with this data and have developed their own heuristics for solving this problem, and asking whether or not this formalization captures their experience as well.

Alex: Yeah. What are the main learnings, the main insights that you got from this work personally?

William Orchard (University of Cambridge): I think the first thing I learned is that the question of what a root cause is, is actually a very deep question. I think it's interrelated with lots of questions in causality about explainability, and it's closely related to questions of sufficiency and necessity.

It's also closely related to the relationship between anomalous or rare values and interventions, and things like this. So I really think I learned a lot about causality from this perspective, and root cause analysis really is connected to lots of these things. I also learned that engineers typically aren't very interested, and it really requires some convincing to tell them that this stuff is important and that it can help them out.

So that was very interesting.

Alex: What impact of your work would you like to see in the real world?

William Orchard (University of Cambridge): I think the first thing, particularly with this work, is that there has been an almost total lack of standardized datasets available for evaluating root cause analysis methods. And I think one of the effects of this has been very many methods being developed, not from a causal perspective, which aren't being benchmarked against many other methods.

Each new paper has its own way of evaluating the approach, typically with simulated data, and then they don't release the data publicly, so it becomes very unclear how the way in which you formalize the problem shows up in benchmarking results. So having publicly accessible datasets is really important, I think, for trying to convince people that causal approaches matter, and also to actually understand which ways of framing this problem work in practice.

I really hope that people are going to use this dataset, learn more about root cause analysis, and that we can come to some kind of consensus about the best way to approach it.

Alex: Thanks a lot. Where can people find out more about you and your work?

William Orchard (University of Cambridge): I think mainly my Twitter; that's where I usually post information about my work.

In this case, the paper is available on arXiv, the dataset is available in a GitHub repo, and there are QR codes in the corner here. And then, of course, I also have a Google Scholar. Those are the places to go.

Alex: What should we type into Google to find the dataset?

William Orchard (University of Cambridge): "PetShop dataset", "Amazon", that kind of thing.

Alex: What is the best causal paper you read last quarter?

William Orchard (University of Cambridge): Yes, this is a very good question. I think one line of work I've found quite inspiring recently is score-based causal discovery. It's somewhat unrelated to root cause analysis, but these methods use the score, in the sense of the derivative of the log of the probability density of the data.

So there's the SCORE algorithm, and there's the NoGAM method. I think the theoretical work that has gone into the identifiability results for this sort of thing, and the way of using score matching to do causal discovery, I found really exciting.

I think it's some of the best causal discovery work I've seen in a long time, so that's been great. And then also the work from Jonas Peters and colleagues on using heavy-tailed distributions for causal discovery, and the identifiability results there; that also comes to mind, and I think it's related to root cause analysis.
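For reference, a brief sketch of the score-matching idea Will refers to; the leaf criterion below is the one used in the SCORE line of work for nonlinear additive Gaussian noise models, stated from memory rather than taken from the episode.

```latex
% Score of the data-generating density
s(x) = \nabla_x \log p(x)

% Leaf criterion (SCORE, nonlinear additive Gaussian noise models):
% X_j is a leaf of the causal DAG iff the j-th diagonal entry of the
% score's Jacobian is constant across the data distribution,
\operatorname{Var}_x\!\left[ \frac{\partial s_j(x)}{\partial x_j} \right] = 0
\quad\Longleftrightarrow\quad X_j \text{ is a leaf.}

% Detected leaves are removed and the step repeated, yielding a
% topological order from which the graph can then be pruned.
```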