Causal Bandits Podcast
Causal Bandits Podcast with Alex Molak is here to help you learn about causality, causal AI and causal machine learning through the genius of others.
The podcast focuses on causality from a number of different perspectives, finding common ground between academia and industry, philosophy, theory and practice, and between different schools of thought and traditions.
Your host, Alex Molak, is a machine learning engineer, best-selling author, and an educator who decided to travel the world to record conversations with the most interesting minds in causality to share them with you.
Enjoy and stay causal!
Keywords: Causal AI, Causal Machine Learning, Causality, Causal Inference, Causal Discovery, Machine Learning, AI, Artificial Intelligence
Autonomous Driving, Causality & Long Tails || Daniel Ebenhöch || Causal Bandits Ep. 004 (2023)
Video version available on YouTube
Recorded on Aug 27, 2023 in München, Germany
Is Causality Necessary For Autonomous Driving?
From a child experimenter to a lead engineer working on a general causal inference engine, Daniel's choices have been marked by intense curiosity and the courage to take risks.
Daniel shares how working with mathematicians differs from working with physicists and how having both on the team makes the team stronger.
We discuss the journey Daniel and his team took to build a system that allows performing the abduction step on a broad class of models in a computationally efficient way - a prerequisite to build a practically valuable counterfactual reasoning system.
Finally, Daniel shares his experiences in communicating with stakeholders and offers advice for those of us who only begin their journey with causality.
Ready?
About The Guest
Daniel Ebenhöch is a Lead Engineer at e:fs Techhub. His research is focused on autonomous driving and automated decision-making. He leads a diverse team of scientists and developers, working on a general SCM-based causal inference engine.
Connect with Daniel:
- Daniel Ebenhöch on LinkedIn
About The Host
Aleksander (Alex) Molak is an independent machine learning researcher, educator, entrepreneur and a best-selling author in the area of causality.
Connect with Alex:
- Alex on the Internet
Links
Packages
- PGMpy (https://pgmpy.org/)
Books
- Molak (2023) - Causal Inference and Discovery in Python
- Pearl (2009) - Causality
- Peters et al. (2017) - Elements of Causal Inference: Foundations and Learning Algorithms
Causal Bandits Team
Project Coordinator: Taiba Malik
Video and Audio Editing: Na
Causal AI || Causal Machine Learning || Causal Inference & Discovery
Web: https://causalbanditspodcast.com
Connect on LinkedIn: https://www.linkedin.com/in/aleksandermolak/
Join Causal Python Weekly: https://causalpython.io
The Causal Book: https://amzn.to/3QhsRz4
You need to know what you want to know, and you need the price for this. And an identification scheme can give you both. I want to completely encourage you to drop this fear. Hey, Causal Bandits. Welcome to the Causal Bandits Podcast, the best podcast on causality and machine learning on the internet. Today we're traveling to Munich to meet our guest.
His parents advised him not to take much risk, but he had other ideas. He was a child experimenter, torn between chemistry, physics, and electrical engineering. During his studies, he dove deeply into information theory to fall in love with causality a few years later. He loves connecting with others and has an unparalleled drive for innovation.
Ladies and gentlemen, Mr. Daniel Ebenhöch. Let me pass it to your host, Alex Molak. Welcome to the podcast, Daniel. Thank you for having me. I'm very happy that you were able to join us today. Daniel, I want to start with a very particular question. In your opinion, what is the problem with the current approach to autonomous driving?
There is actually no approach at all. I mean, there is a theory that you basically need to drive millions of kilometers to get all the possible events in a natural environment. That's simply not feasible. You can break it down with simulation, but simulation has complexity issues and is hard to build up properly.
And yeah, something is missing. There is a gap, to frame it like that. How could we address this gap, in your opinion? I think there is a framework that, as you are listening to this podcast, you probably know, or maybe you are new to this topic. I don't know the English expression, but in Germany we would say it's the best match.
It's glaringly obvious to use causal models: use a model which basically has all the distributions of how natural things appear, and you model and play around with these distributions and answer queries. So my answer is we should look in the direction of causal models. Is it because causal models are more efficient in describing the parameter space?
Exactly. And there is another property. Of course, you can incorporate time into causal models, but time is problematic. If you can keep time out of your equation, then you're much more efficient. Sometimes it is obvious that time has an impact. For example, when you go with your autonomous driving function and a camera into a tunnel, then you have the logical sequence of: okay, it's light, then it's dark, and then it's light again.
And this exact time sequence can make a difference, but for other things, time doesn't play a role. So you can, more often than not, take time out of your equation and just have the distributions of natural events and combine them in a mechanism, to use this word. So you can use functions, you can use complex functions, you can use machine learning and neural networks to model these things.
And you don't need to do it in real time, or even sped-up real time, like in simulation. You can combine the parameter space much more efficiently, I would say, in short. How did your journey into causality start? Basically, I got interested, like so many, through Judea Pearl. He was the first I read about, and he has this, you know, special desire and burning inside, which caught me, basically, and made me say: hey, I have problems there.
I mean, problems throughout the industry which are not solved, and this is a framework which looks so well matched to solve part, or maybe all, of these problems we have. I need to look into it. It has testable implications. So there were so many green flags to go deeper, and when I started to read The Book of Why, I read it in the evenings.
I read for many hours. It was one of those books where, you know, when you are caught by something, you read nonstop. So you are like, oh, I'm happy for the next evening to continue my journey. So that's how I got into this. Which parts of Pearl's framework were the most inspiring to you?
Inspiring was that it's a framework which has testable implications, so you are somehow on safe ground. The other thing was that it allows counterfactuals. You always experience the world in a factual way. You have some recordings of, for example, driving maneuvers when it was sunny, but at the same time you may ask what would have been if it had rained.
So this counterfactual aspect, with the probability of necessity and sufficiency, to go a bit deeper, made it immediately clear that, yeah, this could be a solution to broaden your tool set to tackle these problems. You mentioned counterfactuals. Do you think that counterfactuals are necessary to build autonomous agents, be it driving agents or other kinds of agents, that can act effectively in the world?
That's a tricky question. I need to frame it a bit. I want to come from the causal direction. So we all know, or probably most of you know, the causal ladder, and in the causal ladder you have the associative, the interventional, and the counterfactual rung. And, you know, everything mentioned first is incorporated in the next step.
So if you do counterfactuals, you incorporate the interventional and the associative. So you have everything, basically. To answer your question, I want to ask: what is the difference between interventions and counterfactuals? The difference is that in interventions, you are the manipulator.
You set variables to certain values to make sure it is like this. So you feed your causal model certain parameters and you say: calculate me the outcome exactly in this way. You mean making sure that the mechanism is the way that we expected it to be? You fix the input of the mechanism, and that can be good in some scenarios. But in complex scenarios like autonomous driving, you are not sure about all the parameters to be fixed, and you want to incorporate the knowledge you have from already observed data. When it comes to counterfactuals, you incorporate this knowledge. So you make this abduction step: you propagate your knowledge from the observation up to your background variables.
Background, exogenous: they have many names, these variables. And you ask: how does their distribution shift under this observation? Only then, in the second step, do you run your causal model forward in the state you have set through your observations. I think this is much more powerful and more usable than simply fixing some variables to certain values and searching this parameter space, because you also incorporate what you already know.
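The two steps described here (abduction first, then running the model forward under a changed input) can be sketched on a toy structural causal model. This is only an illustration: the mechanism, its coefficients and the variable names are assumptions made up for the example, not the model discussed in the episode. With additive noise the abduction step recovers a single noise value; in general it yields a posterior distribution over the background variables.

```python
# Toy SCM (all numbers are illustrative assumptions):
#   rain             := u_rain
#   braking_distance := 30 + 20 * rain + u_dist


def braking_distance(rain, u_dist):
    """Structural mechanism for the braking distance in meters."""
    return 30.0 + 20.0 * rain + u_dist


def abduct_u_dist(rain_obs, dist_obs):
    """Abduction: recover the noise term that explains the observation."""
    return dist_obs - braking_distance(rain_obs, 0.0)


def counterfactual_distance(rain_obs, dist_obs, rain_cf):
    """Counterfactual braking distance had rain been rain_cf instead."""
    u_dist = abduct_u_dist(rain_obs, dist_obs)  # step 1: abduction
    # step 2: action (replace rain by rain_cf), step 3: prediction
    return braking_distance(rain_cf, u_dist)


# Observed: dry road (rain = 0) and a braking distance of 33 m.
# Query: what would the distance have been on the same drive with rain = 1?
print(counterfactual_distance(0.0, 33.0, 1.0))  # 53.0
```

The same drive-specific noise (here 3 m) is carried over into the counterfactual world, which is what distinguishes this from simply intervening on rain in a fresh sample.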
And I mean, it's a unique step you need to do in counterfactual queries. Maybe you could give our audience, those people who are not that familiar with counterfactual reasoning, a short definition of what abduction is. Abduction is, somehow, upward propagation. So you have your network. You can imagine a tree network, for example, and you have an observation on a variable which is way down in the tree.
So it's caused by something. It's a child. It's a child or a grandchild. Exactly, it's family ancestry, basically, when you think of the latest descendant in a family tree. And you observe something about him. For example, he has blonde hair. So you want to propagate this knowledge upward into your distributions.
To stay in this picture, there is a distribution over his grandparents and great-grandparents; they can still have any hair color. But some phenotypes are more likely when this grandchild is blonde. And you want to update these distributions of the hair color or phenotypes of the ancestors by this fact.
This doesn't mean that you set it to a single value, so it's not necessary that all his grandparents are blonde. But the likelihood is shifted towards some phenotypes. And this is basically abduction. We know that in order to compute counterfactual queries, we might require a very rich description of the system, a lot of information about the system.
How do you deal with this challenge in practice? You came straight to the greatest problem for the industrial use of causal models. You basically have two ways. Either you spend a lot of engineering time, effort and resources to make a fully specified causal model. We use the structural causal model framework because it incorporates any functional relationship you can think of.
So it's the most general. Of course, it also needs the highest effort to fully specify a structural causal model with all the mechanisms and distributions you need. You can decide, okay, we want to go there, and then we are sure to be able to answer any counterfactual query. So once you have decided that you are not specific, that you need to answer many questions and you're not sure what they are, then you will need to go this way, or maybe you can shorten the way a bit.
So what you're saying, as I understand it, is that if we want something very general in the future, we need to put in a lot of work today in order to have these guarantees for the future.
And then you can go down another branch. You can say: I'm very specific. I have two or three questions I want to get an answer to, and I will sit down and make a DAG structure. And then we have algorithms, the identification algorithms: ID, IDC, ID*, IDC*. These algorithms can give you a simple answer.
You just need to sit down and have the structure, and then they will tell you whether you can answer your question or not. And they tell you a second thing when it is answerable. When the question is identifiable, they will also give you the recipe for how to calculate the query.
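As a hedged illustration of what such a "recipe" looks like, here is the simplest well-known case, the backdoor adjustment. The sketch does not implement the ID algorithms; it assumes a hypothetical three-node DAG (Z -> X, Z -> Y, X -> Y) for which the recipe is P(y | do(x)) = sum over z of P(y | x, z) P(z), and evaluates it on a made-up joint distribution. All probabilities and names are assumptions for the example.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical binary model for the DAG  Z -> X, Z -> Y, X -> Y.
p_z = {0: F(1, 2), 1: F(1, 2)}
p_x1_given_z = {0: F(1, 4), 1: F(3, 4)}
p_y1_given_xz = {(0, 0): F(1, 10), (1, 0): F(3, 10),
                 (0, 1): F(5, 10), (1, 1): F(7, 10)}


def bern(p, value):
    """Probability of a binary value under success probability p."""
    return p if value == 1 else 1 - p


# Joint P(z, x, y); in practice this would be estimated from data.
joint = {(z, x, y): p_z[z] * bern(p_x1_given_z[z], x)
         * bern(p_y1_given_xz[(x, z)], y)
         for z, x, y in product((0, 1), repeat=3)}


def prob(**fixed):
    """Marginal probability of an assignment, e.g. prob(x=1, y=1)."""
    names = ("z", "x", "y")
    return sum(p for key, p in joint.items()
               if all(key[names.index(n)] == v for n, v in fixed.items()))


def p_y1_do_x(x):
    """Backdoor recipe: sum_z P(y=1 | x, z) * P(z)."""
    return sum(prob(z=z, x=x, y=1) / prob(z=z, x=x) * prob(z=z)
               for z in (0, 1))


def p_y1_given_x(x):
    """Plain conditional P(y=1 | x), confounded by Z."""
    return prob(x=x, y=1) / prob(x=x)


print(p_y1_do_x(1), p_y1_given_x(1))  # 1/2 3/5
```

The recipe also shows which variables need to be measured: here X, Y and Z. The gap between 1/2 and 3/5 is the confounding that the adjustment removes.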
And in this case, you can tell your boss: we need to collect data for this and this and that variable, and then we can answer the query. Of course, it may be too cumbersome to collect all this data. Then, more often than not, maybe you asked too difficult a question, and maybe the decision can be made with a much easier question.
And then we can maybe remodel the question so that the effort-to-outcome ratio is much better for you. So you can play with it. You can say, okay, this question may be too hard, let's ask an easier question, maybe as a first step.
If the answer to this question is negative, we don't need to know the answer to the more difficult question, right? Because we are already done. It brings to mind this idea of computing optimal decisions when it comes to the effort or energy spent versus the gain we expect from computing a given quantity. This is something that we are also familiar with in science, especially engineering.
Definitely. I think the capability to ask the right questions is half of reaching the goal. So it helps if you know what you want to know, and if you know the price for each of these questions. A scientist wants to know something, and how much data he needs to collect. I think of Humboldt, for example: he traveled half the world and collected so much data because he was curious to know something.
So, you need to know what you want to know, and you need the price for this, and an identification scheme can give you both: is your question answerable, and how much is the effort to get an answer? With your team you have been working on a very interesting project for some time already. Can you tell us a little bit more about this project?
Yeah, it started in 2020, at the height of the Corona time. So it was lockdown. We had a meeting on the rooftop terrace, keeping distance and everything, and we started a publicly funded project. It's funded half by the Bavarian Ministry of Economy and half by my company, e:fs.
And we were asking the following question. To make an autonomous driving function safe, you don't need to test just, you know, the vanilla basic tests; you really need to get into corner cases. A corner case is a test case where at least two parameters coincide. And they coincide so that not each of these parameters is necessarily an extreme value.
But by coinciding, say a middle value of air pressure and a middle value of temperature, something happens which only happens, you know, when they come together. By coincide, you mean that they interact with each other? They interact with each other, and they make something which is problematic. So each parameter alone is a piece of cake.
But when they interact with each other, for example rain and temperature: there can be a lot of rain, and the street is okay for the wheels. There can be a low temperature and dry, no problem. But when it's minus one degree and slight rain, then you have a problem. Those corner cases: what made you pay attention to them?
It came from my job. At this point, I was test manager for autonomous driving functions, SAE Level 3 and later SAE Level 4. And there is exactly this question: where are the test cases we are missing? And what are the corner cases? So I thought that using the causal framework, using the queries, and using the probability of necessity, the joint probability of necessity and the joint probability of sufficiency could guide the answer.
The team you are currently working with is a very diverse team. How is it to work with people with such different backgrounds on something as fundamental as causal learning or causal reasoning?
It's a pleasure, I can say. It's in some sense very diverse; in another sense it's not so diverse. We have PhD physicists, we have mathematicians, and they all have this logical way of taking on problems, in somewhat different flavors. But we are diverse because we also have UX/UI designers who ask completely different questions.
We luckily have a good mixture of female and male team members who bring in their strengths. And so, when it comes to strengths, we have a very multimodal team here, to say it like this. What are the main strengths that physicists are bringing to the table, and that mathematicians are bringing to the table?
That's a very good question, because they bring exactly opposite characteristics. Physicists are used to thinking critically and open-mindedly. So, for them, nothing is a fixed rule; they try to challenge it. And there is this principle: a theory in physics only holds as long as you can't falsify it.
So, they put everything on the table and say nothing is granted. And this is very good for being explorative, for, you know, not shying away from going into uncharted territory. On the other hand, mathematicians want their solid mathematical framework. They want something which is provable, which you can write down, which is not such a crazy idea as the ones I sometimes confront my colleagues with.
And they ask immediately: what's the proof? Where is the paper, so I can read it? What is the background? And this plays well together, because it's fruitful to have some explorative minds and to have minds which bring you back to the translation into mathematics and into algorithms.
Because in the end, a fluffy idea doesn't bring you any algorithm at all. You mentioned physicists. Physicists are often accustomed to experimental work, and this experimental work often involves iterations. So we plan something, maybe it doesn't work. We want to try again, changing something. We change something.
We want to try again. I had very interesting conversations with some of my guests, and they emphasized that this iterative culture is essential to their work with causal models. What is your view on this? I couldn't agree more. Actually, when you work with causal methods, sooner or later you need a benchmark example.
You need something which is understandable. You know, we all know these backdoor and front-door graphs where you have X, Y, four nodes and some meaning. But we needed a benchmark model which is a bit more complex than this, and at the same time we wanted to use it as a communication tool to show others: hey, that's what the outcome of our work could be.
And we were thinking about it. You know, at that time we had the ice cream example on our slides, but we wanted to show something more difficult, but at the same time comprehensible. And then we started to think: we are working on autonomous driving, what can we do? Let's make a lane changing assistant.
So we interviewed, I think, 30-plus people about what the influencing factors for lane changing are. We came across everything. For example, you know, the lane simply disappears, so you need to change beforehand. So there are many influencing factors, and we worked hours and hours.
I think we worked 150 to 200 cumulative hours. And we had a huge model; you basically need a textbook to explain this model. So it's easier to go boundless with a hundred-plus nodes than to, you know, restrain yourself and look at the very minimal. So what's iterative about this?
You easily end up with very huge models, and you need to ask: what do I really want to model? What is the thing? I mean, the workflow is iterative in many directions. You need to think about what you want to know. Then you need to iterate over the structure. You need to incorporate all the structural information.
Because, as we all know, a missing link is a stronger statement than a link between two variables, because with a link in place you can, in the end, still set its strength to zero, so there's no impact. After the structure, you need to iterate on the mechanisms. And mechanisms don't come easily. I had a conversation with a colleague and he said: oh, it's so depressing.
These mechanisms, it is so difficult to model them, to get the data. So you won't have an easy way to have everything at once. And you just need to think: what is the first goal I can reach? I can make a substructure. I can make a Frankenstein structure, as I call it, where you put pieces together which are not perfectly fitting, but maybe it comes a step closer.
So, causal models imply an iterative mindset, I think. It sounds to me like one of the aspects that is important to you in your work is to build a prototype first, a prototype idea, and then try to find out which elements of this prototype are really necessary to answer the questions that we are interested in answering.
You're right. One thing is clear: you always need the target variable, so you need to be sure which variable it is. For example, the story goes a little bit further. We had this lane changing assistant, but lane changing is something not so easy to grasp, because you have the time domain there. It's not just, you know, the lane is changed; you have different phases. So we rethought our thing and we made an emergency brake assistant.
So we model the assistant with the environmental factors and ask: what is the reason that it brakes in time, so that there is no collision, or that there is a collision? So our target variable was: does a collision happen? If yes, how strong was it? So by how many centimeters or meters was the braking distance too short, so that an impact happened?
So you are very sure about, I think, one or two or three target variables, but for the environmental variables, you're completely right, you can add some things. For example, we figured out that the temperature is one of the main drivers of our model, while the visual sensors seldom have an impact on the target variables.
So they are almost negligible. So you can think: if it's almost negligible, so the causal strength of this path is very low, maybe let's make life easier and neglect it for the first iteration, and think maybe our question is also answerable with a subgraph.
Mm-hmm. So we're coming back, in a sense, again to this place of making an optimal decision, or finding a trade-off between accuracy or precision and making something practical. Exactly. That's a dilemma in the end. You need to be like, you know, somebody balancing on a rope. I would always encourage being more on the practical side, because you can so easily get lost if you want to have high standards of accuracy beforehand. First, make a short try.
And like I said before, maybe you're already in a dead end, and all the accuracy you would have added in the first place is wasted time and effort. So if you're a perfectionist, you can get lost easily, but causal models are a playground. You know, you can make experiments with very low effort.
And when you have the courage to neglect things, you can easily get to questions which direct you. You know, we don't plan the whole journey beforehand. We know the goal, but we zigzag towards the goal. And with every iteration, with every structure, with every added mechanism, we get closer, but we also don't put too much effort into dead ends.
Many people who are starting with causality are intimidated by the fact that a DAG or the graphical model that they might propose or come up with might not be correct. I heard this from many people, and it seems that for many of them this is a blocker to start with causality and start applying it.
What would be your advice to those people? I heard it too, but I want to completely encourage you to drop this fear. A DAG is your assumptions written down. And assumptions are there anyway. You won't do anything with data, with science, or with making decisions without assumptions.
You have them implicitly. We all know our stomach, our gut feeling. We don't make decisions without assumptions. They are always there, but implicitly. And a DAG is the perfect tool to make these assumptions explicit. And once you have made them explicit... some say a thought is not as strong as a spoken word, right?
Once you bring it to light, it already changes a little bit how you think about it. And you don't need to keep all your assumptions in your mind and have a memory overload; you write them down. You can sleep on it for a night and revisit your assumptions. The assumptions are there anyway, but we make them explicit with the DAG, so you can only win.
You won't lose anything. And if even one wrong assumption gets eliminated by your DAG, you have already made a huge step forward that you wouldn't have made without the DAG. So it's the perfect tool. Causal Bandits, this advice is on fire. Beautiful. Daniel, you already have a working prototype of your solution that can be applied in automotive and in other industries as well, wherever decision making is at stake.
I imagine that your path was not an easy one. What were the dead ends, the paths that you thought would lead you somewhere, but then you felt they didn't lead anywhere? Oh, we had plenty of dead ends.
Spontaneously, one comes to my mind. We were starting, like so many, with Bayesian models. You know, at the time we started, this was a framework which was comprehensible, which allowed inference, where you had some theory and you could make your calculations. But we always had the feeling that with Bayesian models you somehow have the restriction of being more prone to be discrete.
You can't always be continuous; otherwise, you have very special cases for your mechanisms. And we always had the feeling that we want to be more general. We don't want to make a tool, or a set of many tools, where for each problem you have another tool. We want to incorporate as much as possible.
So we started to think: okay, there exist SCMs, which have certain assumptions, even strong assumptions, like the independence of the background variables. But this is a framework which allows discrete, continuous, a mix of discrete and continuous, and every mechanism you could think of. Setting this as our gold standard was a first milestone in the project. But you asked for dead ends, and one of the dead ends was that we wanted to bridge the gap, or make the connection, to Bayesian models, which always have these conditional probability tables in the discrete case.
And we wanted to reparameterize. Some of you who are deeper in this topic probably know this reparameterization trick. We tried to translate, you know, the world of Bayesian models into the SCM world, and the idea was to reuse, or basically just to have some stability in connecting these things.
We implemented the reparameterization trick to translate these conditional probability tables into a mix of fixed causal mechanisms and uncertainty. But this was a dead end, because you bring two worlds together which don't match in the end. And we stuck with the general SCM from then on.
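For readers who have not seen the idea, one common way to turn a conditional probability table into a fixed mechanism plus independent noise is the inverse-CDF construction sketched below. This is an assumption about what such a reparameterization can look like, not a description of the team's implementation; the table and the variable names are invented for the example.

```python
import random

# Hypothetical CPT P(grip | road); the numbers are illustrative only.
cpt = {
    "dry": {"ok": 0.9, "slip": 0.1},
    "wet": {"ok": 0.6, "slip": 0.4},
}


def mechanism(road, u):
    """Deterministic mechanism grip := f(road, u) with u uniform on [0, 1).

    For each parent value the unit interval is cut into segments whose
    lengths equal the CPT entries; u selects the segment it falls into.
    """
    cumulative = 0.0
    outcome = None
    for outcome, p in cpt[road].items():
        cumulative += p
        if u < cumulative:
            return outcome
    return outcome  # guards against rounding at the upper edge


def sample_grip(road, rng):
    """Sampling the noise and applying the mechanism reproduces the CPT."""
    return mechanism(road, rng.random())


rng = random.Random(0)
draws = [sample_grip("wet", rng) for _ in range(10000)]
print(draws.count("slip") / len(draws))  # close to 0.4
```

One caveat worth knowing: the ordering of the segments is an arbitrary choice. Different orderings reproduce the same table, and hence the same observational and interventional distributions, but they can imply different counterfactuals, because the same noise value u is shared across parent values.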
What were your biggest discoveries on the way? I would pick out two discoveries. The first discovery is the abduction step: it's close to the heart of everything you want to do with a more or less completely defined SCM. When you have an observation and you want to propagate it, when you want to calculate the posterior of the distributions of your background variables, then you need abduction.
And you need this so often, to incorporate certain observations into your model and see how the distributions of the background variables shift accordingly. So you need to understand that abduction is very much at the heart; you need it. And of course, for many examples you can calculate it with pen and paper.
With small examples you can do it with pen and paper. But as soon as it gets more complex, with more nodes and more complex mechanisms, not linear but nonlinear relationships and so on, then abduction is not easy. These updates don't come naturally. At first we were confident and looked at some methods, but they always had very big shortcomings.
We looked at sampling methods: MCMC sampling and other sampling methods. We looked at numerical methods. We looked at closed-form integral methods. All of them have their perks, but many failed. In the end, we looked at nine different methods over more than one and a half years. And, you know, there was always this feeling of: we have it, it works. But then there was a disappointment at some moment, and it was a hard journey.
And now, I don't want to get bad luck from saying it, but now I think we have found a method that fulfills the requirement that it's time efficient. You don't want to wait hours and days for the abduction step; it needs to be efficient. On the other hand, it needs to be as precise as possible.
And of course it should work for many different mechanisms, and it should work for many kinds of distributions. And on top of that, you want to have different kinds of mechanisms for how the noise is incorporated. Of course you have additive noise, but if you want to make a general SCM, you have general noise.
So you also have multiplicative noise, which is either a heteroscedastic scheme or an even more crazy functional relationship. And the higher the requirements on how you want to incorporate the noise, the more difficult the question of how to do the abduction step in the end. And it's still something you need to research, where you need to get better.
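To make the difficulty concrete, here is a minimal sketch of sampling-based abduction for a mechanism with both multiplicative and additive noise, where a single observation no longer pins down the noise values. It uses plain importance sampling, chosen here only because it is short; it is not the method the team arrived at, which is not described in the conversation. The mechanism y := u1 * x + u2, with u1 ~ N(2, 1) and u2 ~ N(0, 1), and all numbers are assumptions for the example.

```python
import math
import random


def normal_pdf(value, mean, std):
    """Density of a normal distribution."""
    z = (value - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def counterfactual_mean(x_obs, y_obs, x_cf, n=20000, seed=0):
    """Estimate E[Y under x_cf | X = x_obs, Y = y_obs] by importance sampling.

    Assumed mechanism: y := u1 * x + u2, u1 ~ N(2, 1), u2 ~ N(0, 1).
    """
    rng = random.Random(seed)
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n):
        u1 = rng.gauss(2.0, 1.0)           # prior draw of multiplicative noise
        u2 = y_obs - u1 * x_obs            # additive noise implied by the data
        weight = normal_pdf(u2, 0.0, 1.0)  # plausibility of that implied u2
        y_cf = u1 * x_cf + u2              # prediction with the abducted noise
        weighted_sum += weight * y_cf
        weight_total += weight
    return weighted_sum / weight_total


estimate = counterfactual_mean(x_obs=1.0, y_obs=4.0, x_cf=2.0)
print(round(estimate, 1))
```

For this linear Gaussian toy case the posterior is available in closed form (the posterior mean of u1 is 3, so the exact answer is 7), which makes it a convenient check. With nonlinear mechanisms, many nodes or narrow posteriors, the weights degenerate and the estimate becomes slow or imprecise, which is the kind of trade-off between speed and precision mentioned above.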
If anybody has the solution, of course, I'm open to it, but we really explored the field for a long time and very intensely. This is a very interesting point. And I think many people in industry are also facing similar challenges. So your solution, if at some point you decide to make it public or share it with a broader audience, a broader community, will, I'm sure, be very highly appreciated.
To perform the abduction step, we also need data, and your case, autonomous driving, is probably one of the most complex ones that we can think of. How do you deal with this challenge? Looking for data, collecting the data, it's not easy.
You need data for the abduction step. I would say that is the best case, because usually you have some data points, maybe from accidents that happened, or some records, or maybe some synthetic data where you want to show something. But where you need a lot of data is to find the mechanisms, and there we look at publicly available data sets, huge data sets we can incorporate.
Experts can also give you guidance: maybe this is a quadratic relationship between these two variables. Then you already have the form, and you only need to fit the parameters of this form. So you have a better-informed choice. It's not easy to have all the data to fit all the mechanisms, and we try to compensate for this by talking with those who make the test runs, going to the test field, and having these data black boxes, let's call them, in the car.
Of course, the data needs to be coherent enough to be incorporated into the causal mechanism. It's not simply that you get the hard drive from this car and you're done; it needs to be plausible data, connected and causally appropriate. And on the other hand, we have simulation.
Simulation can also be a great data workaround, because you can model something specific and use the data from the simulation to learn the mechanism. Of course you can argue that in some way the snake is biting its own tail, but I think simulation data and synthetic data can also be helpful to fill this gap.
I want to take a step back and go back to what we discussed at the beginning of our conversation. We said that existing autonomous driving frameworks come with a set of challenges or limitations. One thing people in the industry talk about is so-called long tails: long tails in complex distributions that might bring some unexpected events.
What are your thoughts about this? Great topic. Long tails are something where, when you experience them, your alarm bells immediately ring, because it's something you should look at when you're safeguarding autonomous systems, when you're doing testing, because time after time they will produce things where you stumble.
And it's basically something where the people who are responsible can't sleep calmly if something happens every time and still happens again. I first came to this when I was looking at the work of Taleb, a professor at New York University, I think, on The Black Swan, where he discusses what long tails are.
And I think we can very smoothly connect it with the structural causal model framework, because we can reverse the question. We can think: long tails are a problem, long tails are something you should have a look at. And we can ask which structures, which mechanisms, which distributions, plugged together, will produce a long tail in your target variable, like the impact of a car accident.
So you can basically guide your search much more efficiently. You can pre-filter it. I always love to ask experts when it comes to causal models, because they can give you so much structural information, mechanisms, coincidences and so on, which is all great implicit knowledge that you can incorporate into your causal model. And when you guide the expert and say, if this and that mechanism play together, if this distribution comes from here, the expert will think, ah, that's the natural distribution of this thing.
And when this comes together with that mechanism, then we will have a long-tail distribution in this target variable. Then we have a much more efficient process for getting at these long-tail things, which are difficult to handle in practice. It sounds like a great idea to reduce or compress your search space.
If I understand you correctly, you're saying the following: you look at the distribution and say, hey, there's a long tail in this particular place, this particular variable, or this interaction. Then you go back to the model and look for the settings of the model that can produce a long-tail distribution like this, and then you triangulate your findings with the experts
to narrow down the search space further. Is that correct? Exactly. That's an idea. We need to look at what implications this has and how we can do this. But think, in terms of parameter search for corner cases, of a coincidence of two or more incidents, like rain, fog,
or temperature conditions, which individually are not extreme and won't take the car off the road, but together they can. Minus one degree is no problem when it's dry, not much of a problem for the wheels. Slight rain is also no problem. But when minus one degree and slight rain come together, you will suddenly have ice on the street, and that's a problem.
That's basically one easy example of a coincidence where you have a corner case. And if you want to find these corner cases, you always have this huge multidimensional parameter space, and if you can narrow it down, if you can make the search field smaller, you are very lucky, because you have achieved something that is very difficult.
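The ice example can be illustrated with a small simulation. This is a toy sketch under invented assumptions, not the guest's model: the distributions, the factor of 4 for ice, and the factor of 1.2 for wet roads are all hypothetical. It shows how two individually harmless conditions produce the tail of a target variable only when they coincide, and how conditioning on the tail points back to that coincidence.

```python
import numpy as np


def simulate(n=100_000, seed=0):
    """Toy structural model: ice forms only when frost and slight rain coincide."""
    rng = np.random.default_rng(seed)
    temp = rng.normal(5.0, 6.0, n)        # temperature in degrees Celsius (hypothetical)
    rain = rng.random(n) < 0.2            # slight rain in 20% of the runs (hypothetical)
    ice = (temp <= 0.0) & rain            # mechanism: the coincidence creates ice
    noise = rng.lognormal(0.0, 0.1, n)    # run-to-run variation
    wet_factor = np.where(rain & ~ice, 1.2, 1.0)
    ice_factor = np.where(ice, 4.0, 1.0)
    braking = 30.0 * noise * wet_factor * ice_factor  # braking distance in meters
    return temp, rain, ice, braking


temp, rain, ice, braking = simulate()

cold_only = braking[(temp <= 0.0) & ~rain].mean()
rain_only = braking[rain & (temp > 0.0)].mean()
both = braking[ice].mean()

# Reverse the question: which conditions are present in the extreme tail?
tail = braking >= np.quantile(braking, 0.99)
share_ice_in_tail = ice[tail].mean()

print(f"cold only: {cold_only:.1f} m, rain only: {rain_only:.1f} m, both: {both:.1f} m")
print(f"share of tail events with ice: {share_ice_in_tail:.2f}")
```

In a real system the mechanisms are not known this cleanly, but the same reversal, starting from the tail and asking which combinations of parent variables generate it, is what narrows the parameter search.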
I want to ask you one more question related to what you said before. When we talk about those corner cases and other difficult scenarios that we might encounter, and we ask ourselves which basic questions we should ask to make sure that we can deal with those challenges, we often go back to
this very basic idea in causality called identifiability. In your solution, how important is this idea of identifiability, and how did you tackle the challenges related to the fact that sometimes we need to accept that the system will have unobservable variables? I wouldn't even say it's the unlikely case.
I think it's the likely case that you have unobserved variables, that you have hidden confounding, that you don't have a completely defined causal model with all mechanisms and parameters, and that you can't collect data easily. I think this is what happens in most cases. It depends: when you have a mechanical system where you know everything, then you're fine. But this is, I think, the more general case, and you can still work with it. You mentioned identifiability. What is it? You have a specific question, or one, two, three specific questions, and it tells you whether you can answer this question from this causal structure.
At first, you need no data and no fitted mechanisms. You just need to think about the structure and about the questions you want to answer in this domain. Then we can tell you whether we can answer them from here, and we also tell you for which variables you need to collect data, and you can decide.
Okay, I'll make the effort, and I'm capable of collecting data for this variable; then we can give a very good answer to this question. Or you say, oh, that's a huge amount of data, it's not practical or it costs too much to collect data for this. Maybe let's answer an easier question. And if the answer to this easier question is already no, we don't need to make the effort to answer the more difficult question.
So we may be able to replace your more sophisticated, difficult question with one or two easier questions. And if the answer to one of them is already no, you don't need to make the effort of taking the second step. So this is very good guidance in the jungle of decision making. It's a great thing to know what can already be done and what the price behind the answer is.
When you start from a blank page, you don't know what the price is. For one question, the price is one engineer sitting down for one month, okay. For another, it's maybe two years, and I don't want to spend the money, resources, and effort on it. The price for answers, the price for good decisions: we can give them a price tag.
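The idea that identifiability can be checked from structure alone, before any data is collected, can be sketched with the backdoor criterion. This is an illustrative example, not the guest's engine: the graph, the variable names, and the helper functions are hypothetical, and the backdoor criterion is only a sufficient condition (more general identification algorithms can succeed where it fails).

```python
from itertools import combinations


def _ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def d_separated(edges, x, y, given):
    """Test x _||_ y | given using the moralized ancestral graph."""
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    keep = _ancestors(parents, {x, y} | set(given))
    nbrs = {n: set() for n in keep}
    for child in keep:
        ps = sorted(parents.get(child, ()))
        for p in ps:                      # undirected version of each edge
            nbrs[p].add(child)
            nbrs[child].add(p)
        for p, q in combinations(ps, 2):  # "marry" parents of a common child
            nbrs[p].add(q)
            nbrs[q].add(p)
    seen, stack = set(given), [x]         # conditioning nodes block paths
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node not in seen:
            seen.add(node)
            stack.extend(nbrs[node])
    return True


def satisfies_backdoor(edges, treatment, outcome, adjustment):
    """Backdoor criterion: no descendants of treatment, all backdoor paths blocked."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    desc, stack = set(), list(children.get(treatment, ()))
    while stack:
        node = stack.pop()
        if node not in desc:
            desc.add(node)
            stack.extend(children.get(node, ()))
    if set(adjustment) & desc:
        return False
    backdoor_graph = [(a, b) for a, b in edges if a != treatment]
    return d_separated(backdoor_graph, treatment, outcome, adjustment)


def find_adjustment_sets(edges, treatment, outcome, observed):
    """List every subset of observed variables that satisfies the backdoor criterion."""
    cands = sorted(set(observed) - {treatment, outcome})
    return [set(c) for r in range(len(cands) + 1)
            for c in combinations(cands, r)
            if satisfies_backdoor(edges, treatment, outcome, c)]


# Hypothetical structure: weather affects both driving speed and road friction.
edges = [("weather", "friction"), ("weather", "speed"),
         ("friction", "braking"), ("speed", "braking")]
all_observed = find_adjustment_sets(edges, "speed", "braking", {"weather", "friction"})
weather_hidden = find_adjustment_sets(edges, "speed", "braking", {"friction"})

# Add an unobserved driver skill that confounds speed and braking directly.
edges_hidden = edges + [("skill", "speed"), ("skill", "braking")]
skill_hidden = find_adjustment_sets(edges_hidden, "speed", "braking", {"weather", "friction"})
print(all_observed, weather_hidden, skill_hidden)
```

The output of such a check is exactly the kind of price tag discussed above: it says which variables would have to be measured to answer the question, and it shows when a hidden variable can be tolerated (weather, as long as friction is measured) and when it cannot be handled by adjustment alone (driver skill).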
That's an excellent point. I would love everyone to hear that. Many people who are just starting with causality, they have this fear about drawing a wrong DAG. But the second fundamental fear is that they will have in their system some unobserved variables that they cannot recall and add to the system.
From the point of view that you just presented, and from the point of view of more advanced identification algorithms, we know that this is not always and not necessarily a roadblock. Daniel, what would be your advice for people who are just starting with something, maybe machine learning or causal inference, or who want to start working in industry, and who feel a little bit unsure whether they will be able to learn everything
that is needed to be successful in those fields? All right. You asked me for characteristics. My first thought was, honestly, and not because I'm standing in front of you, that your book is a great introduction. We have a working student in artificial intelligence. She is doing her bachelor's at the moment.
She learned from your book, and when presenting some part of it she gave the feedback that it's a great introduction. So if you want to learn the theory and the practical application and have a great overview, your book is, I think, one of the first places to look. When it comes to characteristics, I think resilience is an important thing, because causal inference is not an exercise that is practiced with ease.
Sometimes, when you have done all your homework and your preparation, you might be thriving, but it's nothing that comes easily to you. So you should be resilient. You should have frustration tolerance, keep being sneaky, and be confident that in the end you will have something better than you had before.
Those, I think, are very important characteristics. Of course, they also apply to other fields, but here I would highlight them because it's not a charted field. You need to experiment, you need to play around. So I would mention an experimental mindset, and maybe also playfulness. So, okay, let's start with the first little DAG.
Let's see what it can bring us, and let's iterate. And from the starting point, let's be curious. Let's be playful. Let's be experimental. Who would you like to thank? Oh, I really owe thanks to a lot of people. First of all, I want to thank my team. I hope I don't forget anybody now. It's Clara.
It's Toby. It's Claudio. It's Katerina. It's Sergei Thomas and it's Andrea. I owe them, basically, all the papers they read, all the ideas, all the effort they put into making the crazy ideas I had a reality. They are a great team and I owe them a lot. I want to thank them honestly for their effort over the last years.
When you go to the market with the prototype of your solution and you speak with your potential customers, what are the biggest challenges that you encounter in conveying the value of the solution? It's not easy to talk to customers, because you need to find the right language. We are still probing to find the right thing.
Our last try was basically to ask three questions related to each stage of the causal ladder for their domain, to make them see what kind of questions they can get answers to. When I talked to people who have problems that already match our solution, one common theme was transportability.
For example, they have machines in different environments, and they have downtime. Nobody wants their machine to have downtime. It's the same machine or almost the same machine, but here it comes: the climate conditions are different. Maybe the machines have been in operation for different lengths of time.
So on some parameters they are identical or almost identical, but on others they differ. And this is a transportability scheme, where you want to see whether you can adapt your model to what's the same and what's different in order to answer questions. We heard from some manufacturing-related companies that they have a problem in this domain.
Just to give those in our audience who are less familiar with the idea of transportability a little bit of an intuition: transportability is when we want to take conclusions from one context and apply them to another context, which is in itself a causal problem. A vanilla example: you have a shop in Los Angeles selling surfboards.
How would the surfboard sales be in New York? Some parameters are the same. You have the same surfboards, but you have different clients, they have different beach access, and so on. So some things are similar and others differ, and you want to compare these things. And more often than not, to stress it, if you take a non-causal approach you compare apples with pears, and that's problematic. We basically want to recognize that there's a pear among the apples, but still squeeze out what can be done under these circumstances.
Do you feel that there is a need for more people with causal expertise in industry in general?
Definitely, I would say, and not only because I'm working in this field. Basically, when you don't use causality, at a certain point you are at a dead end. You will make wrong decisions. You will have implicit assumptions, you won't understand your data, and you won't make the right decisions. Time after time, this will be important. And the first step is to accept that many models are just associative. So you basically just copy everything that is there. That works very well for many domains, but at some point you will make decisions that are simply wrong. There is a famous example, cited by Paul Hünermund in his online course on Udemy: Google basically wanted to analyze its gender pay gap and, long story short, they raised the wages of men because of a so-called Simpson's paradox reversal. So even huge companies like Google, where you would think they should be very skilled at handling data, make wrong decisions, because these connections between
causality, storytelling, and data science, and incorporating them into making good decisions, are important. So every data scientist, everybody involved with decisions, everybody involved with complex systems will profit hugely, now and in the coming years, from a causal mindset and from causal methods.
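A Simpson's paradox reversal of the kind mentioned above can be reproduced with a few lines of arithmetic. The numbers below are invented for illustration and are not Google's data; the job levels, headcounts, and pay values are hypothetical.

```python
# (job level, gender, headcount, mean pay); all values are hypothetical.
rows = [
    ("junior", "men", 20, 50.0),
    ("junior", "women", 80, 52.0),
    ("senior", "men", 80, 100.0),
    ("senior", "women", 20, 102.0),
]


def mean_pay(rows, gender, level=None):
    """Headcount-weighted mean pay for a gender, optionally within one job level."""
    selected = [(n, pay) for lvl, g, n, pay in rows
                if g == gender and (level is None or lvl == level)]
    total = sum(n for n, _ in selected)
    return sum(n * pay for n, pay in selected) / total


# Aggregated comparison: men appear to earn more.
overall_gap = mean_pay(rows, "men") - mean_pay(rows, "women")

# Comparison within each job level: the sign flips.
within_gaps = {lvl: mean_pay(rows, "men", lvl) - mean_pay(rows, "women", lvl)
               for lvl in ("junior", "senior")}

print(f"overall gap (men - women): {overall_gap:+.1f}")
print(f"gap within levels: {within_gaps}")
```

The arithmetic alone cannot say which of the two comparisons should drive a decision. That depends on the causal role of job level, for example whether it is a mediator of the effect of gender on pay, and this is the point where a causal model is needed.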
What resources, books or other resources, had the biggest impact on your causal journey? It's quite common, I would say. I need to think whether there is something extraordinary, but okay, let me cite the most common things. I started with The Book of Why, with Pearl's standard work Causality, and with the Primer, to have a bit of a shortcut through everything. I read the book by Schölkopf and Janzing, Elements of Causal Inference,
especially the two chapters on interventional and counterfactual queries. Then, of course, there are the many toolboxes, which already have good Python examples. pgmpy was already mentioned, and there are other great toolboxes. They build the bridge between the theoretical model and actually being able to process something, and
that, as a practical guy, had a huge impact on me. Of course, causality is something where you can have the complete spectrum. You can be a philosopher and just use your mind, and you can be the hands-on practical guy who has a pot of data and wants to process it. The whole spectrum is influenced and will be influenced by causality.
Where can people learn more about your and your team's work? We published a few papers, so that's, let's say, the academic access to us. We are always happy to discuss with you personally, so you can just email me or reach out to me. We can probably link my email address or those of my team members. We have also already made
LinkedIn excursions to contact people from different industries, simply to discuss their problems with us and see if these causal methods are a perfect match. At the stage where we are, we have a prototype and we have really great algorithms to start with, but we need a pilot project. One thing is the examples you work on in
a development project, and the other is to get real benefit in an industrial setting. To get nearer to this, we need a customer with an early-adopter mindset who wants to discuss their problem with us, so that we can say whether this is a match and go through it together. So you can contact me on LinkedIn or via email, and we can have a short conversation.
I'm very open to discussing this field, which I'm very close to. Daniel, is there any question that you would like to ask me? Oh, that's an unexpected one. I would need to think for a moment. Of course there are many questions. We know each other a bit now. We had brunch together. I know that you also got into machine learning through the Andrew Ng course, and I was curious what your transition from machine learning to causality was.
What brought you there? Maybe the whole audience knows already, but I'm curious, honestly. That's a great question. My first encounter with causality was in 2017. We had a neuroscientific hackathon and we were trying to translate the ideas of so-called functional connectivity, that is, how activity in different parts of the brain might produce certain emergent qualities.
We were wondering if these ideas could be applied to other complex systems like the economy or social groups, interactions, and so on. And at some point, working with different methods for analyzing associations, like information-theoretic measures, correlation, and all this stuff, I realized we were constantly hitting a wall, because we didn't know: hey, what does it actually mean that these things are related using this method and another method?
Maybe they are not related, actually. That was the first time I started thinking about the importance of causal thinking, and then a couple of years later I read The Book of Why, and that was the moment where I really felt I wanted to dive deeper into this. I was struck by the clarity of Pearl's framework and the clarity of his thinking in the book. That was my journey. And then I was so curious how to implement this, how to put it into practice. That was around the time when Amit Sharma from Microsoft started developing DoWhy, so that was the first package that I started diving deeper into. What's your message for the causal community or causal Python community?
Stay curious, stay experimental, look out for what development brings us, and try this causal mindset. It's like a good virus, not a bad virus. Once you are infected by this virus, it will maybe be a lifetime journey of being curious, of having high standards for making sense of things, answering difficult questions, and making things more
understandable, proper, and better. That's my message. Great. Thank you so much. It was a pleasure, Daniel. I hope you will have a wonderful rest of your day. Thank you, Alex. Congrats on reaching the end of this episode of the Causal Bandits podcast. Stay tuned for the next one. If you liked this episode, click the like button to help others find it, and maybe subscribe to this channel as well.
You know, stay causal.