Causal Bandits Podcast

On Causal Inference in Fintech & Being an Author || Matheus Facure || Causal Bandits Ep. 009 (2024)

February 05, 2024 Alex Molak Season 1 Episode 9

Video version of this episode is available on YouTube
Recorded on Oct 15, 2023 in São Paulo, Brazil


Causal Inference in Fintech? For Brave and True Only

From rural Brazil to one of the country’s largest banks, Matheus’ journey could inspire many.

Similarly to our previous guest, Iyar Lin, Matheus was interested in politics, but switched to economics, where he fell in love with math.

Observing the state of the industry, he quickly realized that without causality, we cannot answer some of the most interesting business questions.

His popular online book 'Causal Inference for The Brave and True' was a side effect of his strong drive to learn causal inference and causal machine learning, while collecting as much feedback as possible along the way.

Did he succeed?

------------------------------------------------------------------------------------------------------

About The Guest
Matheus Facure is a Staff Data Scientist at Nubank and the author of "Causal Inference for The Brave and True" and "Causal Inference in Python".

Connect with Matheus:
- Matheus on Twitter/X
- Matheus on LinkedIn
- Matheus's web page

About The Host
Aleksander (Alex) Molak is an independent machine learning researcher, educator, entrepreneur and a best-selling author in the area of causality

Causal Bandits Podcast
Causal AI || Causal Machine Learning || Causal Inference & Discovery
Web: https://causalbanditspodcast.com

Connect on LinkedIn: https://www.linkedin.com/in/aleksandermolak/
Join Causal Python Weekly: https://causalpython.io
The Causal Book: https://amzn.to/3QhsRz4

 009 - CB009 - Matheus Facure - Audio

Matheus Facure: What we're calling causal inference today is actually a collection of stuff from multiple fields, and I feel that reinforcement learning is one of those fields. Can we do cross-validation with a causal inference model? Can we do feature selection in this meat-grinder way, where we try a bunch of stuff and then take the model that we like better? And I feel that at the heart of this problem is...

Intro: Hey, Causal Bandits, welcome to the Causal Bandits Podcast, the best podcast on causality and machine learning on the internet. Today we're traveling to São Paulo to meet our guest. He thinks of himself as a weird Brazilian because he's not interested in soccer or carnival.

Intro: He grew up in a rural area and loves the sense of community that he remembers from his childhood. He studied politics but switched to economics, where he fell in love with math. An author and a staff data scientist at one of Brazil's largest banks. Ladies and gentlemen, please welcome Mr. Matheus Facure.

Intro: Let me pass it to your host, Alex Molak.

Alex: Ladies and gentlemen, please welcome Matheus Facure. 

Matheus Facure: Thanks for having me, Alex. So, are you enjoying São Paulo so far?

Alex: I'm enjoying it very much, especially now that the weather has become a little sunnier. Yeah, it's beautiful. I love the parks, and there's a lot of nature here. Very beautiful nature.

Matheus Facure: Glad to hear. 

Alex: Yeah. And the city is huge. Definitely. I learned yesterday from you that it's 20 million people living here. 

Matheus Facure: Yeah. That's about right. 

Alex: That's about 50 percent of the entire population of Poland, for instance. That's the scale. Matheus, what was the biggest challenge for you when you were writing your book?

Matheus Facure: Let me pause a little bit here, but probably working around all the family duties while writing. I was lucky in some sense, because I had already written most of the content of the book in some shape or form for the open-source online book. And I thought to myself, okay, this is going to be a piece of cake.

Matheus Facure: I'll be able to do it in no time. I have a kid on the way, but it's going to be easy. It was not that easy. I didn't take into account that it would be so much harder. Taking care of the kid and of my wife, who had just had a baby, plus the book: doing all of that was insanely difficult.

Matheus Facure: I'm very blessed, because my company gave me four months of paternity leave. On top of that, I added an extra month of vacation, but it was still a lot of work. I was working around the schedule, making sure I was not behind on what I had proposed to do, and making sure the quality was right for me and that I was happy with the material, all while taking care of the family.

Matheus Facure: That was definitely the hardest part, by far. It's not the complexity of the content. Yes, it is a fascinating and complicated topic, but it's the sheer work of it that I think is the hardest part.

Alex: When was the first time you realized the value of causal inference, or causal modeling in general?

Matheus Facure: It was back when I was mostly working with traditional predictive models: you have a bunch of data, you predict some outcome, and the model outputs a number. That's interesting in some cases, but it's not the end goal. You have a number, and you have to do something with it afterwards.

Matheus Facure: The company is usually not that interested in a number. They want to take that number and transform it into a decision. And that part is a little bit fuzzy if you don't take into account stuff like causal inference. For instance, at some point I was working with debt collection. Essentially, we had to figure out how to get people who are late to pay their debts to the bank.

Matheus Facure: We had a very good predictive model that told us the probability of someone paying us. And that model was very good. But how do we use it? Should we target the people that are very likely to pay us?

Matheus Facure: Or should we target people that are very unlikely to pay us? Which end of the model's spectrum do we target? How do we transform that probability from the model into a decision, an optimization for the company? A predictive machine learning approach doesn't fully answer that. So I realized that I needed to think specifically about the actions, or the strategy, that I want to employ on top of the model.

Matheus Facure: How do I evaluate them, and how do I find the best actions and the best decisions to make in that framework? We had predictive models, we were trying to force them into a decision-making process, and I didn't like forcing it. I wanted some sort of formalized framework where I could still use the predictive model, because it was very interesting, without forcing it into a decision-making process. Causal inference came in as a very natural formalization to evaluate, decide, and optimize the decisions and actions that you can take within a company.

Matheus Facure: That's a very specific context, but afterwards I realized that it's much more general than just debt collection. It actually happens in a lot of places.

Alex: In industry, when you speak to stakeholders and talk about causality with them, you might sometimes hear the question: why should I invest in causal models if we already have machine learning in our company? What would be your answer to this question?

Matheus Facure: Yeah, great question again. I think it ties back to what I was mentioning.

Matheus Facure: In most situations (I can find exceptions), businesses don't care purely about prediction. Prediction gives you a number, but they don't care too much about it. They care about making a decision that brings in more customers, increases conversion, decreases churn, increases profitability, or cuts costs.

Matheus Facure: So they usually want to do some sort of optimization where the predictive model is just a tiny piece of the system. And for most of the convincing that I have to do, you don't even have to get that technical and mention causal inference, as in: okay, you want causal inference because causal inference is nice.

Matheus Facure: It's more like: okay, you have a predictive model, it makes predictions, and you want to make decisions on top of those predictions. What's the best way to do it? Here is a framework in which you can make this decision and see whether it is the best decision for this type of customer or for that type of customer.

Matheus Facure: So I usually frame the problem as a sort of personalization. This is a very easy sell, because managers, product managers, and project managers are usually very interested in personalization. They want to treat users differently, to treat each user how that specific user wants to be treated.

Matheus Facure: Personalization is, let's say, the selling point, and we can translate personalization into a causal inference approach very naturally. It has a very ugly name, treatment effect heterogeneity, but it is personalization at the end of the day. So it's just a matter of translating the technicality into something that sells better.

Matheus Facure: But the idea behind it is very easy to sell. Again, companies don't want predictions. They want to make better decisions. Sometimes a predictive machine learning model is good enough for this. I can think of examples like fraud: if you just predict that a transaction is going to be fraudulent, that's good enough.

Matheus Facure: But in most cases you want to actually execute decisions: you want to figure out a price, who to target with an ad, who to call, stuff like that. When you have to deal with decisions, causal inference becomes a very natural formalization of the decision-making process.

Matheus Facure: So I feel that if you understand this, it becomes much easier to sell.
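To make "treatment effect heterogeneity" concrete, here is a minimal sketch, not anything from the guest's own work. It simulates a randomized campaign and estimates the effect of the treatment separately for two customer segments. The segment names, conversion rates, and effect sizes are all made-up assumptions for illustration.

```python
import random
from statistics import mean

def simulate_experiment(n=40_000, seed=0):
    """Simulate a randomized campaign; segments and effect sizes are made up."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        segment = rng.choice(["new", "loyal"])      # hypothetical customer segments
        treated = rng.random() < 0.5                # treatment assigned by coin flip
        base = 0.10 if segment == "new" else 0.30   # assumed baseline conversion
        lift = 0.15 if segment == "new" else 0.02   # assumed true treatment effects
        converted = int(rng.random() < base + (lift if treated else 0.0))
        rows.append((segment, treated, converted))
    return rows

def effect_by_segment(rows):
    """Difference in conversion rate (treated minus control) within each segment."""
    effects = {}
    for segment in sorted({s for s, _, _ in rows}):
        treated = [y for s, t, y in rows if s == segment and t]
        control = [y for s, t, y in rows if s == segment and not t]
        effects[segment] = mean(treated) - mean(control)
    return effects

effects = effect_by_segment(simulate_experiment())
for segment, effect in effects.items():
    print(f"{segment}: estimated effect {effect:.3f}")
```

In this toy setup the "loyal" segment converts more often overall, but the treatment moves the "new" segment far more, so a decision based on the predicted conversion rate alone would target the wrong customers.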

Alex: What was your path towards understanding this and learning about heterogeneous treatment effects?

Matheus Facure: Yeah, I learned the hard way. Again, I come from economics, and I like to think that I understood causal inference from the beginning.

Matheus Facure: But when I started working, I wasn't working as an econometrician; I was mostly working with traditional data science, predictive modeling. And there were a bunch of problems that we were tackling with predictive modeling, which didn't solve them. I gave the example of debt collection.

Matheus Facure: We had many problems like that. We had a model that was getting better over time in terms of predictive metrics: AUC, cross-entropy, R squared, or whatever predictive metric you like. So we had very good models in terms of those metrics. But the decisions themselves were not bringing in more money, were not profitable, were not improving.

Matheus Facure: We took those models, put them in production, and asked: okay, where's the return? We're not seeing it. Essentially, we figured out that we were misclassifying the problem. We were treating it as a purely predictive problem when it was not. It was, again, a decision-making process that you can frame as a causal inference problem. But mostly we learned it the hard way: we tried something and it didn't work.

Matheus Facure: We had to learn something that did, and that something else was causal inference. For me, it was actually very easy at first to see that those problems reminded me of stuff that I saw in econometrics courses. And I said, oh, I know what's going on here. I know why this is not working.

Matheus Facure: This looks like a causal inference problem, and we're treating it wrong. But it took me around two years to actually learn how to phrase this problem so that people from data science could understand it. So there's also this challenge: causal inference is relatively new to data science, so you have to speak the same language as them.

Matheus Facure: You have to figure out: how can I make this econometrics stuff sound like machine learning or data science stuff, so that we can communicate? A bunch of the work was actually figuring out the communication, so that we could frame the problem in a way that we could solve it and that both econometricians like myself and traditional data scientists could understand.

Alex: You mentioned this language barrier. Within causality itself, we also experience it. We have people with a background in econometrics, like yourself, and some people with a background in epidemiology, for whom the natural language to think about causality is the language of potential outcomes.

Alex: And we have people who maybe came to causality through machine learning, through graphical models, and they speak the graphical language, the language of Judea Pearl, broadly speaking. How did you deal with the differences in framing problems in those two languages?

Matheus Facure: I didn't, actually. What I usually gravitate towards is the potential outcomes framework that we learn in econometrics. Again, most of the translation that I did was from causal inference towards people that are not familiar with causal inference, not necessarily between different flavors or languages of causal inference.

Matheus Facure: That's a very difficult thing to do. I don't think I'm qualified enough to do that because, frankly, I need to study the Judea Pearl approach much more to understand exactly what he's saying. I use graphical models a lot, but I think that's something where we differ a lot.

Matheus Facure: I read your book, and you usually start from the graph and then move along with it. The way I tend to work with graphs is: okay, I have this problem and I want to formalize it. Then I draw a graph and use it to explain the problem to the people who are going to work on the project, usually technical people. Graphs are not so simple that you can just show them to any stakeholder. But we use the graph and say: okay, this is the graph, this is what we have to be careful about.

Matheus Facure: These are the things we probably don't want to control for. These are the things that we want to control for. But it's only a formalization that gives us a clear picture of the assumptions we're making, what we want to estimate, the question we're asking, and what we should be careful about.

Matheus Facure: From then on, we don't necessarily use the graph for, let's say, estimation or identification. We move to the data and model everything with the graph only as a background picture in our heads. That's mostly my bias from coming from econometrics.

Matheus Facure: Still, I understand the value of graphs. I just haven't managed to use them more broadly, applying them throughout the entire project. I feel it's more like this: at the beginning, when you're trying to understand the problem, you use the graph, but then you leave it behind.

Alex: When you leave it behind, you leave it behind explicitly, but implicitly you have already made decisions based on the graph about which variables you will include in the model. Is that correct?

Matheus Facure: Yeah. When I say leave it behind, it's always there in the back of your head, but we don't necessarily use it for estimation. We forget about it.

Matheus Facure: We say: okay, we have to estimate this relationship while holding this and this and this constant, or we have to perform a test or an intervention here and here, which will allow us to measure this. But we rarely go back to the graph and say: okay, this is valid, this is not valid, this doesn't hold.

Matheus Facure: So it's rarely the case that we keep the graph throughout the entire project. We don't go back and do, let's say, what's it called? The final stage where you question the graph and you...

Alex: Ah, you mean like refutation tests?

Matheus Facure: Yes, yes. We usually don't do that.

Alex: Refutation tests are not that strong as tests, I would say. But it seems that the entire modeling process is very similar to what people with a graphical background would do. When you said you don't go back to the graph, I initially thought you meant that you do not go back to evaluate the correctness of the graphical assumptions, which I think some people in industry do, depending on the use case, right?

Alex: Especially in open-ended, complex systems, people do this often, and set it up as a human-in-the-loop process. What are the main challenges that you're facing today in your work?

Matheus Facure: Okay. Just to give a little bit of context, and I think that's something we also chatted about before: for what I do, confounding is usually not an issue.

Matheus Facure: To explain what this means: when you're doing causal inference, especially in academia, you're very worried about confounding bias. Let's say you want to understand if wine is good for you, because people that drink wine tend to live longer. You have to adjust for confounders, the other factors that vary together with wine consumption.

Matheus Facure: For instance, people from Europe tend to drink more wine, but they're also from richer countries where they have better health care. So how do you know if it's the wine or actually the health care? You have to isolate those factors. So this is confounding. For most of what I work on, I don't have to worry too much about this, because I get to intervene.
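The wine example can be sketched as a small simulation. Everything here is an assumption made for illustration: wealth is the only confounder, wine has no true effect on lifespan, and the numbers are invented. The point is only to show how a naive comparison and a comparison adjusted for the confounder can disagree.

```python
import random
from statistics import mean

def simulate_population(n=40_000, seed=1):
    """Wine has zero true effect here; wealth drives both wine and lifespan."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rich = rng.random() < 0.5                              # the confounder
        drinks_wine = rng.random() < (0.7 if rich else 0.2)    # richer people drink more wine
        lifespan = 70 + (5 if rich else 0) + rng.gauss(0, 3)   # wine does not appear here
        rows.append((rich, drinks_wine, lifespan))
    return rows

def naive_difference(rows):
    """Mean lifespan of wine drinkers minus mean lifespan of everyone else."""
    drinkers = [y for _, w, y in rows if w]
    others = [y for _, w, y in rows if not w]
    return mean(drinkers) - mean(others)

def adjusted_difference(rows):
    """Compare drinkers and non-drinkers within each wealth stratum, then average."""
    estimate = 0.0
    for level in (True, False):
        stratum = [row for row in rows if row[0] == level]
        estimate += naive_difference(stratum) * len(stratum) / len(rows)
    return estimate

population = simulate_population()
naive = naive_difference(population)
adjusted = adjusted_difference(population)
print(f"naive: {naive:.2f} years, adjusted for wealth: {adjusted:.2f} years")
```

The naive comparison credits wine with extra years of life, while the stratified comparison is close to zero, which is the true effect in this simulation. Randomizing the treatment, as described next in the conversation, removes the need for this adjustment.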

Matheus Facure: And I feel this is actually what you see most. We might disagree on this one, but from my experience this is what we see in most cases in companies, because companies are usually worried about: okay, I have this action, this thing that I can do. It might be sending an email, making a call, or changing the price of something, and that's something the company can control.

Matheus Facure: If they can control it, usually they can randomize it, with some exceptions, of course; marketing is hard to randomize. But usually it's something that they can change. Let's say it's price. They want to understand the impact of price on sales. They can set the price here, they can set the price there. They can do some sort of randomization or natural experiment, and they can understand the impact of price.

Matheus Facure: So I work mostly in this situation where confounding is not an issue, because I get to intervene on the variables that I care about. That doesn't mean there aren't challenges; that's only a small part. What I'm mostly concerned about when confounding is not an issue is what we mentioned before: effect heterogeneity.

Matheus Facure: We have something that we want to do, let's say call a customer because they're late. We want to figure out who we should call first. Should we call this type of customer or that type of customer? For that, I want to understand for whom my action, in this case calling someone, will have a higher incremental impact in terms of increasing the probability of that customer paying us.

Matheus Facure: So this is one example. Another example is cross-selling. Someone has a credit card, and maybe we want to cross-sell that person a prime credit card. Who do we cross-sell to? Who is likely to benefit the most from this cross-sell? In other words, for whom will cross-selling increase the probability of converting to the prime credit card the most?

Matheus Facure: So essentially it's a world where we're not very interested in finding out if an action is good or bad on average. It's more a world where we want to know: is this action, or treatment, as we call it in causal inference, better for this customer or for that customer? It's sort of a personalization.

Matheus Facure: No, it's not sort of. It's exactly personalization. So that's the world I come from. And in this world, the hardest thing for me, I would say, is first when things are nonlinear. For instance, say you want to understand the impact of credit lines on the probability of someone defaulting on their loan.

Matheus Facure: This is not linear. If we increase the credit line, the probability of default increases, because it's harder to control yourself if you have too much credit. So it increases, but at some point it saturates and becomes flat. This is the sort of nonlinearity where, if you increase the treatment above a certain point, the Y variable doesn't change that much.
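A saturating response like the one described can be sketched with a toy simulation. The functional form, the credit line levels, and the default rates below are invented assumptions, not data from any bank. Credit lines are assigned at random, and the sketch measures how much one extra unit of credit changes the default rate at different points of the curve.

```python
import math
import random

LEVELS = [1000, 2000, 4000, 8000, 16000]   # hypothetical credit lines

def default_probability(line):
    """Assumed saturating response; the functional form is made up."""
    return 0.05 + 0.25 * (1 - math.exp(-line / 3000))

def simulate_default_rates(n=100_000, seed=2):
    """Assign credit lines at random and return the observed default rate per level."""
    rng = random.Random(seed)
    counts = {level: [0, 0] for level in LEVELS}   # level -> [defaults, customers]
    for _ in range(n):
        line = rng.choice(LEVELS)                  # randomized treatment level
        counts[line][0] += int(rng.random() < default_probability(line))
        counts[line][1] += 1
    return {level: defaults / total for level, (defaults, total) in counts.items()}

rates = simulate_default_rates()

# Change in default rate per extra 1000 of credit, between adjacent levels.
slopes = [
    (rates[high] - rates[low]) / (high - low) * 1000
    for low, high in zip(LEVELS, LEVELS[1:])
]
for (low, high), slope in zip(zip(LEVELS, LEVELS[1:]), slopes):
    print(f"{low} -> {high}: {slope:+.4f} per extra 1000")
```

The marginal effect shrinks as the credit line grows, so a single linear coefficient would overstate the effect at high credit lines and understate it at low ones.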

Matheus Facure: So this is one very tough problem to crack. The other one is evaluation metrics. There are a bunch of treatment effect heterogeneity models, and we want to figure out which one is the best. We can fit a bunch of them, and we want to pick one of them, essentially do model selection. It's very hard to come up with a metric to evaluate a causal model, because causal quantities are very hard to put your finger on.

Matheus Facure: They're always unobservable, I would say. So it's very hard to estimate them and to attach a performance metric, which is very different from traditional machine learning, where you do cross-validation, look at AUC, and if it's better, okay, let's pick this model and deploy it to production.

Matheus Facure: So what we're trying to do now is take this framework from machine learning, which we call the meat-grinder framework: essentially, try a bunch of things, look at the performance metric, and deploy whichever has the best metric to production. We're trying to do something like that for causal inference.

Matheus Facure: Can we do cross-validation with a causal inference model? How do we do it easily? Can we do feature selection? Figuring out what to put in a causal model and what to take out is also an incredibly hard challenge. So can we do feature selection in this meat-grinder way, where we simply try a bunch of stuff and see if the metric goes up?

Matheus Facure: And then we take the model that we like better. I feel that at the heart of this problem is coming up with a good evaluation metric for the causal model. I think this is the core, the hardest part.
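One way such a metric can look, sketched under strong assumptions: rank customers by a model's score, then use randomized data to estimate the treatment effect among the top-ranked fraction. A score that orders customers well by their true effect should show a larger effect at the top of the ranking. The data-generating process and the two scores below are hypothetical stand-ins for fitted models, and this is one possible metric among several rather than the one the guest uses.

```python
import random
from statistics import mean

def simulate_randomized_data(n=50_000, seed=3):
    """Randomized treatment; the true effect grows with a made-up feature x."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()                           # hypothetical customer feature in [0, 1)
        treated = rng.random() < 0.5               # coin-flip assignment
        p = 0.2 + (0.3 * x if treated else 0.0)    # assumed true effect: 0.3 * x
        rows.append((x, treated, int(rng.random() < p)))
    return rows

def effect(rows):
    """Difference in outcome rate between treated and control units."""
    return mean(y for _, t, y in rows if t) - mean(y for _, t, y in rows if not t)

def cumulative_effect_curve(rows, score, steps=10):
    """Estimated effect among the top 10%, 20%, ..., 100% of units ranked by score."""
    ordered = sorted(rows, key=score, reverse=True)
    return [effect(ordered[: len(ordered) * k // steps]) for k in range(1, steps + 1)]

data = simulate_randomized_data()
good_curve = cumulative_effect_curve(data, score=lambda row: row[0])    # ranks by true driver
bad_curve = cumulative_effect_curve(data, score=lambda row: -row[0])    # ranks backwards
print("top 10% by good score:", round(good_curve[0], 3))
print("top 10% by bad score: ", round(bad_curve[0], 3))
```

Both curves end at the same value, the average effect over the whole dataset; they differ only in how quickly they get there, and that difference is what can be compared across candidate models.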

Alex: What lessons have you learned from thinking about and working on evaluating models in your company?

Matheus Facure: I think the first lesson, which we learned maybe the hard way, is that it's very hard to trust any evaluation metric if you don't have randomized data. If you don't have any sort of experiment, or if you have confounding, you always have that feeling: is the metric this good, or is it just the bias that is messing up my metric?

Matheus Facure: So the thing that I learned the most is: first, I need to have a dataset that I trust, that I know is completely free from any biases. I trust this dataset with my heart. Then I can use it as a validation set: I'll evaluate the model and compute the causal metrics on it, because I know there's no confounding whatsoever in it.

Matheus Facure: I don't even need to use these clean, pristine datasets to train my model. To train, I can throw a bunch of garbage at it. But as long as I have a dataset where everything is randomized, so that I can make the traditional unconfoundedness assumption, and I use this dataset to validate, then the problem becomes much easier, because I have something that I can trust and ground myself in. I can say: if the model doesn't perform well on the dataset where treatments are random, it probably will not perform well in production.

Matheus Facure: So I think this is the most important lesson: as much as possible, simplify the decision-making process, in this case by randomizing, so that the evaluation becomes much easier and doesn't complicate the analysis. In a sense, the idea is to have simple decision processes so that you can also have simple analyses.

Alex: I remember that at some point you shared on LinkedIn that you also worked with reinforcement learning. Are there any experiences from that period that you were able to translate to working with causal models?

Matheus Facure: Personally, I think reinforcement learning is actually a flavor of causal inference, or the other way around, depending on where you're coming from, which is very interesting.

Matheus Facure: What we're calling causal inference today is actually a collection of stuff from multiple fields, and I feel that reinforcement learning is one of those fields. If you simplify reinforcement learning and look at what it really is, it's something like this. You have information about the environment.

Matheus Facure: And you have a metric that you care about, something that you want to optimize, and then you have actions that you want to perform in order to maximize that metric. Well, this is very similar to causal inference. In causal inference we just call the action the treatment, but we are also interested in finding the treatment that maximizes the outcome, conditioned on a set of covariates, which in reinforcement learning we call the environment. It's the same thing with different names, which is very interesting, because it allows us to use a bunch of stuff from reinforcement learning.

Matheus Facure: What I particularly like is offline policy evaluation, where you can look at stuff you did in the past and figure out: if I followed a different policy, if I made other decisions, how would those decisions have performed on the data that I have?
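A minimal sketch of offline policy evaluation, using the inverse propensity scoring (IPS) estimator as one common choice; the conversation does not say which estimator is used in practice. The contexts, actions, payment probabilities, and the uniform logging policy are illustrative assumptions.

```python
import random

ACTIONS = ["call", "email"]

# Assumed payment probabilities per (context, action); purely illustrative.
PAY_PROB = {
    ("early", "call"): 0.82, ("early", "email"): 0.80,
    ("late", "call"): 0.50, ("late", "email"): 0.20,
}

def log_uniform_policy(n=40_000, seed=4):
    """Logged data from a past policy that picked each action with probability 0.5."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        context = rng.choice(["early", "late"])   # how late the customer is
        action = rng.choice(ACTIONS)
        reward = int(rng.random() < PAY_PROB[(context, action)])
        logs.append((context, action, 0.5, reward))   # 0.5 is the logged propensity
    return logs

def target_policy(context):
    """A new policy that has never run in production: call only late customers."""
    return "call" if context == "late" else "email"

def ips_value(logs, policy):
    """IPS estimate of the average reward the policy would have obtained."""
    total = 0.0
    for context, action, propensity, reward in logs:
        if action == policy(context):        # keep only logs that agree with the policy
            total += reward / propensity     # reweight by how likely the log was
    return total / len(logs)

logs = log_uniform_policy()
estimated_value = ips_value(logs, target_policy)
true_value = 0.5 * PAY_PROB[("early", "email")] + 0.5 * PAY_PROB[("late", "call")]
print(f"IPS estimate: {estimated_value:.3f}, true value in this simulation: {true_value:.3f}")
```

The estimate only works because the logging policy gave every action a known, non-zero probability in every context. With a deterministic logging policy, the actions it never took could not be evaluated this way.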

Matheus Facure: So you can use this technique to evaluate a policy even if that policy has never been in production, which is very interesting. And it comes from the reinforcement learning approach. I also want to be very careful when I say that causal inference is reinforcement learning, because when people think about reinforcement learning, they think about a model that is self-updating.

Matheus Facure: The model sees the environment, performs actions, and retrains. Sees the environment, performs actions, and retrains. And that's done automatically. I would say that you don't actually need to do that automatically. You might want to have a human in the loop who actually presses the button to refit the model.

Matheus Facure: But if the model is training on treatments that it applied in the past, it is essentially the same thing, just with a human performing every iteration of the reinforcement learning update. I think this is what I came to realize when I was working with, again, the collections strategy. We had a bunch of actions.

Matheus Facure: We had a bunch of customers, and customers had different contexts. Let's say some customers have very late delinquencies, like they're two months late. Other customers maybe just forgot to pay their bill, like they're three days late. You probably don't want to treat those two customers the same.

Matheus Facure: This fits very nicely into a reinforcement learning framework, and it also fits very nicely into the causal inference framework. So I think it's interesting to look at the same problem from different angles, because it gives you different insights into how to tackle it. But fundamentally, I think they are just different names that we give to the same formalization of a decision-making problem.

Alex: Now, on the other hand, we know that reinforcement learning agents. Might be susceptible to confounding as well, which means that they might learn world models that are associational in their nature, which might lead to suboptimal decisions.

Matheus Facure: It's not that they will learn if you're not careful, which is why I advocate for the human there retraining the model because the human can use techniques to actually debias the data set.

Matheus Facure: So, for instance, if you naively throw a model like this in the wild, it will learn from the things that it did in the past, right? Let's keep with the collections example. Say that, by chance, this machine learning agent tried to call only people that have very short delinquencies.

Matheus Facure: So they're only like three days late or five days late, okay? Those people are the ones that will probably pay. Maybe they just forgot, they were not paying attention to the billing cycle. So they will probably pay. If you look at this data naively, what it will look like is that calling has a very high impact on the probability of payment, because you only called people that pay and you did not call people that didn't pay.

Matheus Facure: But this isn't causal, it's just correlational. And this will certainly happen if you simply fit the model on the decisions that it made in the past. But there is a myriad of techniques that you can use to alleviate that. For instance, you can take the actions that the model chose and make them probabilistic instead of deterministic, so the model can decide to do this, but it has some probability of not doing that and actually doing something else.

Matheus Facure: Suddenly you have a non-deterministic policy. So you have a probability associated with each action that you can take. And then you can use propensity scores to correct for all those sorts of biases. And again, that's why I advocate for the human actually looking at that and saying, okay, this is how I'm going to refit the model, instead of just letting this model loose in the wild, because it will certainly learn correlation instead of causation.

Matheus Facure: But again, we know that there are techniques to correct for that bias and to use that biased data in a way that satisfies the causal assumptions, like unconfoundedness, so that we can use the data to update the model, let's say, or refit the model so that it learns from the biased decisions that it made in the past.
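
The propensity-score correction described here can be illustrated with a small simulation. This is only a sketch under invented assumptions: the customer segments, the payment probabilities, the call propensities, and the 10-point effect of a call are all made up for illustration and are not Nubank's data or method. The stochastic policy logs the probability with which it chose each action, and inverse propensity weighting uses that probability to undo the policy's preference for customers who would pay anyway.

```python
import random

random.seed(0)

TRUE_EFFECT = 0.10  # assumed for the simulation: a call adds 10 points to P(pay)


def simulate_customer():
    """Simulate one customer under a stochastic calling policy."""
    short_delay = random.random() < 0.5        # ~3 days late vs. ~2 months late
    base_pay = 0.8 if short_delay else 0.2     # short delays mostly pay anyway
    propensity = 0.9 if short_delay else 0.1   # P(call | context), logged by the policy
    called = random.random() < propensity      # the action is sampled, not deterministic
    p_pay = base_pay + (TRUE_EFFECT if called else 0.0)
    paid = 1 if random.random() < p_pay else 0
    return called, paid, propensity


data = [simulate_customer() for _ in range(200_000)]
n = len(data)

# Naive comparison: payment rate of called vs. not-called customers.
paid_called = [paid for called, paid, _ in data if called]
paid_not_called = [paid for called, paid, _ in data if not called]
naive = sum(paid_called) / len(paid_called) - sum(paid_not_called) / len(paid_not_called)

# Inverse propensity weighting: weight each row by 1 / P(the action that was taken).
ipw_called = sum(paid / e for called, paid, e in data if called) / n
ipw_not_called = sum(paid / (1 - e) for called, paid, e in data if not called) / n
ipw = ipw_called - ipw_not_called

print(f"true effect: {TRUE_EFFECT:.2f}  naive: {naive:.2f}  IPW: {ipw:.2f}")
```

In this setup the naive difference is dominated by who got called, while the weighted estimate should land close to the assumed effect. The correction only works because every action keeps a non-zero probability in every context; a deterministic policy would leave nothing to reweight.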

Alex: Causal ops is a new area that is interesting to more and more people and organizations that want to use causal inference in their operations. What would be your advice to people who are interested in deploying causal models at scale?

Matheus Facure: I think everything that we use for traditional ML Ops can be used for causal models, at least if you're thinking from the perspective of treatment effect heterogeneity.

Matheus Facure: So what you want in those cases is essentially to predict, for each person, what impact a treatment will have. So it's nonetheless a predictive model, although it predicts something that you cannot observe, because you can never see, for anyone, what the treatment effect is.

Matheus Facure: That is, how much that person will benefit from having this treatment applied to them. So it's an odd predictive model, but from the production aspect it is nonetheless a predictive model. So I would say most of the recommendations that I would give for traditional ML Ops translate to causal effect models.

Matheus Facure: So, like, use standard libraries, LightGBM, stuff that is easy to deploy. Try to avoid as much as possible models in pure Python. Because the causal inference community is relatively new, a lot of the models are not implemented efficiently in, let's say, C. It's usually pure Python, while if you take machine learning models, it's usually C with a Python wrapper around it.

Matheus Facure: So for instance, in our case, although I love the work of Susan Athey and causal trees, we've never managed to deploy any causal trees in production, although I find them fascinating, simply because it's too slow. Like, it's usually pure Python, it's not implemented efficiently, so we cannot use it. So most of the recommendation is: try to treat it as much as possible as a standard machine learning model, and you should be fine.
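
One common way to treat an effect model as an ordinary predictive model is a T-learner: fit one outcome model on treated units, one on untreated units, and predict the difference. The sketch below is an assumption-laden illustration rather than the approach used at Nubank. The data are simulated, the effect function is invented, and plain least squares (the hypothetical helper `fit_line`) stands in for whatever standard learner, such as LightGBM, would be used in production.

```python
import random

random.seed(1)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Simulated randomized experiment (all numbers are made up).
rows = []
for _ in range(50_000):
    days_late = random.uniform(0, 60)
    called = random.random() < 0.5                # randomized treatment
    true_effect = 30.0 - 0.4 * days_late          # assumed: calls help less as delay grows
    repaid = 100.0 - days_late + (true_effect if called else 0.0) + random.gauss(0, 10)
    rows.append((days_late, called, repaid))

# T-learner: one outcome model per treatment arm.
a1, b1 = fit_line([x for x, t, _ in rows if t], [y for _, t, y in rows if t])
a0, b0 = fit_line([x for x, t, _ in rows if not t], [y for _, t, y in rows if not t])


def predict_cate(days_late):
    """Predicted treatment effect: treated prediction minus control prediction."""
    return (a1 + b1 * days_late) - (a0 + b0 * days_late)


for d in (3, 30, 60):
    print(f"days late = {d:2d}  predicted effect = {predict_cate(d):.1f}")
```

From a deployment point of view, `predict_cate` is just two regular model predictions and a subtraction, which is why the usual serving and monitoring tooling carries over.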

Alex: I performed this experiment in my book to assess the computation requirements for different models. And I think, compared to S learner, causal forest was 39 times slower. 

Matheus Facure: Yeah. Yeah. And that's the reason. It has good and bad sides to it. Like if you want to understand how a causal forest works, I don't know.

Matheus Facure: I'm assuming you used EconML. Yeah. So if you use EconML, you can just look at it, it's all in Python code. So you can understand what the tree is doing, which is a good thing. From the learning perspective, it's very interesting, but from the production perspective, it's just not feasible to deploy something like that, at least not at this large scale.

Alex: You mentioned before that confounding is not that much of an issue for you because you are able to randomize your treatments and use randomized data either for training, for evaluation, or both. But confounding, or backdoor paths where you have a common cause, is not the only graphical structure that can lead to causal bias.

Alex: Have you experienced challenges with other types of confounding, if we talk about confounding broadly?

Matheus Facure: Confounding, or bias, in the sense of bias?

Alex: I talk about confounding as any non-causal path that is open in the model.

Matheus Facure: Definitely. I like to make the distinction with confounding, where it's usually hidden common causes.

Matheus Facure: And I like to distinguish this from selection bias, which is when you condition on something that you should not. So a collider, or something in the path between treatment and outcome, or something after the treatment, something like that. So definitely, those are the less intuitive things to work on.

Matheus Facure: And I confess, I still think about them on a weekly basis, because those are very complicated problems. I've posted about this a lot recently. So one issue is the conversion issue. I'll try to describe it very briefly. Let's say that you want to understand the impact of interest rates on the size of a person's loan.

Matheus Facure: So you would expect that if you put interest rates down, then people would borrow more. And if you increase interest rates, people would borrow less, right? Let's say a very hypothetical experiment where you want to understand the impact of interest rates on the amount lent.

Alex: It's from a homo economicus point of view.

Matheus Facure: Yeah, yeah. You have the interest rate as your treatment, you have your outcome, and you want to find that effect. But it turns out that most of your customers don't want a loan, so they don't convert. So regardless of the interest rate, the loan amount that they have is zero.

Matheus Facure: So if you plot the distribution of the outcome, you're going to see a huge spike at zero and some loan amounts above zero, because the vast majority of your customers don't get a loan, even though they have seen the price, seen the interest rate.

Matheus Facure: So if you try to naively estimate the impact of interest rates on loan amount, the effect will be very tiny, very diluted by all those zeros. This is a loan example, but the same thing happens with any sort of pricing problem where you want to understand the impact of price on sold amount or purchase volume.

Matheus Facure: So what people usually do in this case is say, okay, let's filter out the zeros. Let's only look at people that actually took a loan and then see the impact of price on loan amount among them. But the moment you do that, you open a spurious path, because you condition on people that converted.

Matheus Facure: Conversion sits in between the treatment, price, and the outcome, purchase volume, or loan amount in this situation.

Alex: So we basically lose the benefits of randomization here.

Matheus Facure: Exactly. Exactly. And this is very counterintuitive. One piece of the problem, if you look at it, is not causal, but it's fine. You have the effect of price on conversion, and you have the effect of price on loan amount given conversion, which is not a causal quantity because of the bias.

Matheus Facure: But if you multiply them together, then the bias disappears. And this is very counterintuitive, as with many things related to collider bias, or conditioning, or all those sorts of selection biases. So it turns out it's not a problem if you just want to break down a difficult problem into two easier problems and make counterfactual predictions.

Matheus Facure: So the goal here is that you simply want to know how much people will borrow depending on the interest rate. You can do that. As you broke down the problem into two and conditioned on the people that converted, you introduced bias there, but that's not an issue as long as you multiply whatever that is by the effect of price on conversion.

Matheus Facure: That's very non-intuitive. It doesn't seem like it would work. How can you have a biased measure of an effect, and then you multiply it by something else and suddenly the bias goes away? When you say it in words, it's very counterintuitive. It's easier to see with the math.

Matheus Facure: And if you do the simulation, it becomes clear, but it's not something natural, where you explain it to someone and that person says, okay, this is obvious. It's far from it. So this is my current take on it. I've looked at this problem from many different angles, and my opinion has changed over the years.

Matheus Facure: At some point it was like, oh, we cannot do this. We should not select on this. It will bias everything. We shouldn't do it. But as I studied it and looked into it, I said, okay, maybe I'm being too harsh here. Maybe this is some bias that we can live with. But I might change opinions in the future.

Matheus Facure: Like, that's where I stand today. I think some sorts of biases, you just don't have to worry about them.

Alex: Let me ask you a technical question. So would this multiplication of those subproblems, work also in nonlinear cases or only in linear cases? 

Matheus Facure: It would work in any case, because it has no causal assumption.

Matheus Facure: It's simply a result of how probabilities work. What we're estimating in this case is an expected value: the expected value of the loan amount given price. This is the goal. In this problem, we want to understand how much people will borrow given the price.

Matheus Facure: And the issue here is that this is complicated because there's a bunch of zeros in the Y variable. So this expected value will be very close to zero. So what you can do is break that expectation into a term which is the expected value of the loan amount given price, conditioned on conversion being one.

Matheus Facure: So you got rid of all those zeros. And another term, which is the probability of converting given price. There are no causal assumptions here. You're simply breaking down an expectation into the two terms that it has. So it's a simple mathematical truth, and the issue comes when you want to estimate those terms.
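
Written out, the decomposition follows from the law of total expectation. The symbols are notation introduced here for illustration, not taken from the conversation: Y is the loan amount, P the price (interest rate), and C the conversion indicator, with Y = 0 whenever C = 0.

```latex
\begin{aligned}
E[Y \mid P = p]
  &= E[Y \mid P = p,\, C = 1]\,\Pr(C = 1 \mid P = p)
   + E[Y \mid P = p,\, C = 0]\,\Pr(C = 0 \mid P = p) \\
  &= E[Y \mid P = p,\, C = 1]\,\Pr(C = 1 \mid P = p)
  \qquad \text{since } E[Y \mid P = p,\, C = 0] = 0 .
\end{aligned}
```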

Matheus Facure: One of them is going to be a causal quantity. For the probability of conversion given price, if you randomize prices, you don't have to worry, because price is randomized and you are not conditioning on anything spurious that is going to cause any trouble. But the other quantity, the expectation of loan amount given price, conditioned on conversion, that's not causal. When you come to estimate that quantity, it doesn't turn out to be a causal parameter.

Matheus Facure: But again, if you multiply things out, they work out beautifully. 
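
A small simulation can make this concrete, including for a nonlinear case. Everything in the sketch is an invented assumption: the unobserved "need for credit" variable, the logistic conversion curve, and the loan amount formula. It estimates the overall mean loan amount per price, the conversion rate, and the mean amount among converters, and also tracks who the converters are at each price.

```python
import math
import random

random.seed(2)


def simulate_customer(price):
    """One customer who sees a randomized price (interest rate)."""
    need = random.random()                                   # unobserved need for credit
    p_convert = 1 / (1 + math.exp(-(6 * need - 2 * price)))  # nonlinear in price and need
    converted = random.random() < p_convert
    amount = 1000 * need / price if converted else 0.0       # zero when no loan is taken
    return need, converted, amount


results = {}
for price in (1.0, 2.0, 3.0):
    rows = [simulate_customer(price) for _ in range(100_000)]
    takers = [(need, amount) for need, converted, amount in rows if converted]
    overall = sum(amount for _, _, amount in rows) / len(rows)   # E[Y | price]
    conv_rate = len(takers) / len(rows)                          # P(C=1 | price)
    amount_given_conv = sum(a for _, a in takers) / len(takers)  # E[Y | price, C=1]
    need_given_conv = sum(nd for nd, _ in takers) / len(takers)  # who converts?
    results[price] = (overall, conv_rate, amount_given_conv, need_given_conv)
    print(
        f"price={price:.0f}  E[Y|p]={overall:7.2f}  "
        f"P(C|p)*E[Y|p,C=1]={conv_rate * amount_given_conv:7.2f}  "
        f"mean need of converters={need_given_conv:.2f}"
    )
```

The product of the two pieces reproduces the overall mean at every price, because with plain group means it is the same sum divided up differently. The mean need of converters shifts with price, which is the selection problem: comparing converters across prices mixes the price effect with a change in who converts. With fitted models instead of group means, the identity holds only as well as each piece is estimated.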

Alex: What was your favorite part of your book when you were writing it?

Matheus Facure: The regression chapter. I love regression. I say it sort of kidding, but it's actually true. I love going back to basics and revisiting basics from different angles.

Matheus Facure: I think people, especially data scientists, are too quick to dismiss regression as a simplistic tool, but it's so much more powerful than that. Like, when they need to control for stuff, they don't understand that regression is actually controlling for stuff, and they usually want to implement a bunch of group-bys and stuff that doesn't work.

Matheus Facure: So I feel that looking at regression with all the care that it deserves was a very interesting study for me, because I also had to look back and say, okay, what is regression really doing? What's the assumption that it's making? What should I be careful about when I use regression?

Matheus Facure: And what should I not be careful about? What is regression similar to? How can I explain regression? Can I say regression is similar to X or similar to Y? So seeing regression from these multiple angles and giving it, I think, the attention that it deserves was a very interesting thing for me to study, and also for me to translate into a language that is hopefully more accessible.
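
One way to see what "regression is controlling for stuff" means is the partialling-out view (the Frisch-Waugh-Lovell theorem): regress the treatment and the outcome on the control, then regress the outcome residuals on the treatment residuals. The sketch below uses simulated data with an invented confounder and an assumed true effect of 1.5; `ols` and `residuals` are small hypothetical helpers written for this illustration.

```python
import random

random.seed(3)


def ols(xs, ys):
    """Simple least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(xs, ys):
    """What is left of y after removing its linear dependence on x."""
    intercept, slope = ols(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


TRUE_EFFECT = 1.5  # assumed effect of the treatment t on the outcome y
n = 20_000
x = [random.gauss(0, 1) for _ in range(n)]                    # confounder
t = [2.0 * xi + random.gauss(0, 1) for xi in x]               # treatment depends on x
y = [TRUE_EFFECT * ti - 2.0 * xi + random.gauss(0, 1) for ti, xi in zip(t, x)]

# Naive: regress y on t alone and ignore the confounder.
_, naive_slope = ols(t, y)

# Controlling for x: strip x out of both t and y, then regress what is left.
_, adjusted_slope = ols(residuals(x, t), residuals(x, y))

print(f"true: {TRUE_EFFECT:.2f}  naive: {naive_slope:.2f}  adjusted: {adjusted_slope:.2f}")
```

The adjusted slope is the same coefficient a multiple regression of y on t and x would give for t, which is the sense in which regression does the "controlling" without any manual group-bys.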

Matheus Facure: And I think also the part on causal model evaluation, because that's something that I've been working on for so long. It bothered me for so much time. And it's something that I feel I have something to contribute to. For instance, take what we did at Nubank.

Matheus Facure: I don't remember seeing anything relating to causal model evaluation for continuous treatments, which is something that we've been doing for four years now. And I think we have, not a definitive answer, but a more mature understanding of the problem than most things that you see out there.

Matheus Facure: So that's something definitely that I enjoyed writing about, because again, I feel that's something that I can contribute to. 

Alex: What would be your advice for people who are just starting with causality? 

Matheus Facure: So let me just see if I understand. People that are starting with causality coming from where? From a data science perspective, or a fresh undergrad that wants to join data science in general

Matheus Facure: and wants to start with causality? Who are we talking to here?

Alex: Pick the group or the groups that you feel, your advice would be the most valuable for. 

Matheus Facure: I guess I wrote the book for data scientists that want to transition into causal inference or to learn about causal inference.

Matheus Facure: I think the general advice is to try to see that the machine learning model is only a tiny piece in a much bigger system that is making decisions for whatever company you work for or whatever your project is. Yes, machine learning is a very fascinating topic, and doing predictions is a very fascinating topic.

Matheus Facure: It's not simple by any means. You have to be careful with leakages and with making sure that stationarity holds or, if it doesn't, how you deal with it. So prediction is a very fascinating problem, but it's only a piece in the puzzle. After you have a good prediction, you actually have to make decisions with that prediction.

Matheus Facure: So how do you do it? How do you go from the number that the model is outputting to an actual decision that is close to, or in the direction of, the optimal decision? So take that broader perspective and interest yourself not simply in the technical aspect of making a prediction, but in actually designing a system that is overall better.

Matheus Facure: I think having this broader perspective helps a lot, mostly in the motivation part. If you want to motivate yourself to learn causal inference, I think that's it. You should be thrilled by making better decisions, not necessarily just by making better predictions. Yes, making predictions is a part of the problem, but it's not the entire problem.

Alex: What resources would you recommend to people who are just starting? 

Matheus Facure: I think that's a very easy selling point. So I think any books, like from me or from Alex, are very interesting places to start, because we went through all this stuff and it's a very new field. And I think for us it was hard, in the sense that, I don't know your experience, but from my experience, I had to piece together many papers.

Matheus Facure: There wasn't a complete body of work that summarized all this stuff. Now you have the opportunity that this is done. You don't have to go through all the work that we had to. So definitely a work that summarizes causal inference, like my book or Alex's book. And then there's the case where you want something, let's say, open source, or something that is available for free.

Matheus Facure: I really like the American Economic Association webcasts, where they usually have a course on econometrics every year. So every year they have someone go there and talk about econometrics. The one that I liked the most is the one by Joshua Angrist and Alberto Abadie.

Matheus Facure: I think it's from 2020, where they go through the basics of econometrics. I feel that's something data scientists would benefit a lot from learning: the basics of econometrics, understanding what it is for, what a potential outcome is, what a counterfactual is, how to frame the problem as a decision-making problem, as an optimization problem.

Matheus Facure: So I think it's a very interesting place to start. There's no coding in that, so you only learn the theory. It's not directly applicable to the industry. It's a course focused on academics, but I feel the fundamentals there are amazingly taught. Josh is an amazing professor, of course, and learning those fundamentals, I think, helps a ton in learning causal inference in the applied setting later on.

Alex: Is the future causal?

Matheus Facure: I think the present is. Again, I'm a very practical guy. I didn't go into causal inference because it was fashionable. I had a problem, and I guess my company had a problem, where we had a bunch of models.

Matheus Facure: But we didn't want the models. Like, yeah, we want good models, but the end goal is not a model, it is to make decisions. And I think that's why causal inference is becoming so popular, because there are a bunch of companies that hired a bunch of data scientists, and the data scientists are building a bunch of models that don't actually solve the problem or that only partially solve the problem. They get to the prediction.

Matheus Facure: But then they have to hammer in that prediction into a decision making process. And this isn't very nice, like, you shouldn't want to hammer a model, a predictive model, into a decision framework. You would want a more natural and seamless way to integrate decision making with machine learning. I'm pretty sure causal inference is the answer for that.

Matheus Facure: So I think, as a very natural path forward, companies will see, okay, I care about making better decisions. So I want to take these machine learning and data science capabilities that I have, and I want to supercharge them so that they not only output numbers, which I don't think is the end goal for companies, but actually output decisions, and decisions that improve over time and become even more optimal.

Matheus Facure: So I think that's the end goal, at least in the industry setting. I think for sure companies will start to see that; they're already starting to see that. And that's why I believe, again, the present is probably already causal. I wouldn't say it with those words, because it doesn't sound natural, but I definitely agree.

Matheus Facure: It is a way to put it. At present, companies should be doing causal analysis, and they will be better off doing it.

Alex: What question would you like to ask me? 

Matheus Facure: I have a bunch of questions that I wanted to ask you. Like, how do you build a podcast? But let's stick with the topic. Can I ask two questions?

Matheus Facure: So the first one: you seem very concerned with confounding. So I take it that the problems that you face are not the same that I face, where confounding is not an issue. I take it that, for what you work with, confounding is most definitely an issue. That's why you're very interested in stuff like partial identification, from your book, and you're very careful with the graphs and making sure that that works.

Matheus Facure: So I'm very eager to understand why companies still suffer with identification, because from my perspective, if they can control the treatment, they shouldn't suffer from it. And the second question is regarding causal discovery. This is a topic that I'm not well versed in. I wanted to know, have you seen any instances of companies using it and applying it successfully to solve a business problem?

Matheus Facure: Where does causal discovery fit into the picture? 

Alex: Regarding the first question, about confounding, I think it really depends on the industry or on what companies can do. If you can randomize, that's always great. Even if you don't randomize for the data that you train your model on, you can use randomized data for evaluations.

Alex: And that's very valuable. But you cannot easily randomize everywhere. For instance, in certain settings in marketing, randomization would be difficult. There are certain settings where you can do it. There are certain ways that you can leverage randomization, for instance, on small subsamples of your population, but it's not always a feasible thing to do.

Alex: So I sometimes like to think about it this way: there are open-ended complex systems where we cannot necessarily randomize. And on the other end of the spectrum, there are systems that are essentially enclosed. Sometimes we can randomize, sometimes we cannot, but because the system is essentially enclosed, we have very good chances of drawing a graph that will be complete enough to give us estimates that will have practical applicability.

Alex: I think in my book the focus on confounding comes from two sources. One is thinking about those cases that are open-ended and complex.

Alex: And the second one is that it also comes from thinking about identification. And I think a lot about identification because I think a lot about cases where you cannot randomize, or sometimes you can control the treatment, but you cannot randomize fully. So those would be the two main sources of this thinking.

Alex: And also, I think I come from a place where I think about experimentation as a special case of causal inference. So when you look at Pearlian causality from a more general perspective, a measure-theoretic perspective or a topological perspective, you can see very clearly that randomization and changing the world physically or factually can be seen as a special case of performing causal inference over a system.

Alex: I think what I find very attractive in the topological perspective is that you can very clearly see what the limitations of randomization are as well. For certain queries, like the counterfactual queries that you mentioned before, randomization cannot answer those questions in general.

Alex: Of course, there are special cases, just as we also have special cases with observational data where we can answer interventional queries, but those are only special cases. Regarding the second question, about causal discovery: yeah, I've seen it in industry, and going back to those different settings, causal discovery might be much easier when we deal with enclosed systems.

Alex: Because many causal discovery algorithms require that you don't have hidden confounding in your data. Not all of them, but many of them. And those enclosed systems are also often well documented, so we can have pretty complete expert knowledge about those systems.

Alex: But this is not the only setting where causal discovery can be useful. I've also seen it used in settings that are more open, and then the process is often iterative, which means that we collect expert knowledge, we combine it with other systems, we feed it to the algorithm, we run the algorithm, and then we test the result, either with the data that we have, or with randomized data if we have some, or, if we can build a generative model, we just compare the distributions to whatever data we have.

Alex: We confront this with other people and then maybe reiterate with the algorithm. So that would be a very general framework for using this kind of method in practice.

Matheus Facure: Can I give an example, which I think is causal discovery, but feel free to correct me? I was talking with this data scientist, and she worked for a solar plant company.

Matheus Facure: The thing that they were trying to solve is that they have solar panels all across the country, and sometimes some of them fail, and they have a predictive model to predict which solar panel is going to fail next, so that they can send the technician there before the thing fails.

Matheus Facure: This is good, but she wanted to go even further, because once the technician got there, the technician didn't know what exactly the problem was, because the panel hadn't failed yet. So it could be overheating, it could be wind, it could be sand, something like that. So the technician didn't know the cause of the problem.

Matheus Facure: And she said, okay, I want to fix that. I want to give the technician not only, okay, this is the panel that is going to fail next, but also that it's going to fail because of whatever. I want to figure out what it is that is going to cause the power plant to go down.

Matheus Facure: This seems to me like a causal discovery problem, where you want to understand the cause of the issue, let's say a root cause analysis of that problem. Is this an application of causal discovery? Can you use causal discovery to tackle this problem?

Alex: Great question.

Alex: Causal discovery could be used in this case. Probably not as an ultimate solution, but as an element of a solution. So we could use causal discovery algorithms to find the connections within the system: which variables are connected to which ones? If the system is linear, then with some methods we could potentially also learn the coefficients on the edges.

Alex: But if the system is nonlinear, of course, that would not make too much sense. That said, if we learn the structure, which variables in the system can cause the failure, then we can train a structural model on this structure, and the structural model can be used to perform root cause analysis.

Alex: So that could be one way to tackle this challenge. Now, when you asked me before about causal discovery, I said that it's often used as an element in an iterative process where people are looking at the graph and so on. One of the reasons for this is that causal discovery from observational data is not possible in general without any additional assumptions.

Alex: So any causal discovery algorithm that we work with will require us to make certain assumptions. But I understand that this system with panels is more or less enclosed, plus some external forces that can also be quantified or measured, and we know the mechanism from experts. So this is a case where this could be a solution. An iterative process that involves experts and causal discovery algorithms could be a good first step to build a model that could then be leveraged to perform root cause analysis.

Alex: What was your motivation for writing your open source book? Your first book?

Matheus Facure: It's a pretty selfish motivation. It was the pandemic, 2020. We didn't know what was happening. Everyone was in lockdown. I said, oh, I need to do something to get through this pandemic.

Matheus Facure: And I thought, I want to go back to basics and review econometrics. So I went and watched the course that I mentioned earlier, by Josh and Abadie. And I said, okay, this is very interesting. The way they teach this stuff is very compelling and very clear. And I want to learn this very well. So what I did is, okay, if I want to learn this, I want to take what is here

Matheus Facure: and apply it to my own data. So I went and fetched data, or sometimes simulated data, but mostly I fetched it from the internet, and I took whatever they were teaching in the course, and then I taught it again. I translated the course into written material, with the examples and with the data that I had, and I used that data to replicate the simulations or the explanations that they gave in the course.

Matheus Facure: And this essentially gave me a translation of their course into Python, which I then sent to Josh, saying, hey, I watched your course, I really enjoyed it, and here is the Python version of it. If you want to share it, feel free to do what you want with it. And then he shared it on his Twitter, which got me a bit of traction, not much, but enough.

Matheus Facure: So some people started to comment and to track the open source repository. And this is very interesting, because at that point people started to correct me and say, okay, you did this wrong. This is not clear. Can you clarify this? And then I got a bunch of feedback, which is exactly what I wanted.

Matheus Facure: I wanted to learn the stuff. I think that writing and teaching is a very powerful tool for learning, but feedback from the community is also a very powerful tool for learning. So the reason is selfish in the sense that it was mostly for me. In essence, it still is for me. Whenever I see a paper and think, okay, this is interesting, I try to implement it and then I publish it, and people comment and give feedback.

Matheus Facure: So I guess that's how I initially tackled the book, essentially as a translation of Josh's course into Python. And later on, I tried to take the stuff that I was doing in my company, at Nubank, and lay it out in an applied format and say, okay, this is what I'm doing, which is part two of the book.

Matheus Facure: This is not traditional, let's say, not "Mostly Harmless Econometrics". You certainly have to take this with a grain of salt. I'm no scientist. I've never published on this. This is what I do in practice. This is what works for me. This is why I think it works. And I started this sort of informal science, let's say, talking about the things that I was doing and how I was using causality.

Matheus Facure: And that became part two of the book, which is essentially a huge discussion on machine learning, effect heterogeneity, and model evaluation, which at some point I managed to compile into a structure, into a book format, which is also not obvious. You have a bunch of content scattered. How do you knit it together so that it forms a cohesive structure?

Matheus Facure: So that's sort of the point. And right now I think the book is still live, the open source book is still changing. I certainly don't update it as frequently as I would like to, mostly because, again, I have a young kid and lots of work, and I need to find more time to do this stuff. But that's definitely something I plan to do: to keep updating the open source book with things that I find interesting and things that I want to learn more about.

Alex: What would be your message to the Causal Python community?

Matheus Facure: I think the Causal Python community is in a very good place, again, because I think causal inference is very practical and Python is also very practical. So if you want to apply this stuff, I think the Python causal community is the place to be. It's where you'll find models that are easy to deploy, that you can take to production, unlike with, let's say, R.

Matheus Facure: I enjoy R, but not for production. So you have the benefit of having models that are very easily deployable. You now have a bunch of materials and a bunch of amazing libraries like CausalML, EconML, and DoWhy. I feel that they are maturing to be very standard models for causal analysis. And the whole environment for machine learning, which I think is fairly interesting to use in causal analysis, is also alive and well in Python.

Matheus Facure: So I think that's a very interesting place to be if your goal is applied stuff and industry stuff, mostly tech industry stuff.

Alex: Matheus, what was the most challenging moment in your career so far? 

Matheus Facure: I think it's right now, right? I have to understand how much time I want to dedicate to work and how much time I have to dedicate to family.

Matheus Facure: That's definitely a challenge. There are a lot of priority shifts that happen once you become a father. So I'm having to deal with that. It's not a technical challenge, like "this problem is too hard". It's more a challenge of how do I juggle all those balls at once? So I think that's definitely the biggest challenge right now.

Matheus Facure: If I restrict the question to the technical challenges that I had, I would say credit, definitely. It's hard for me to say this is a part of my career that was really tough, because I still work with credit. I've been working with credit for five years now, but credit is a really tough job.

Matheus Facure: It's a really tough field. For instance, if you give a loan today, or if you give someone a credit card with, I don't know, a 5,000 or 10,000 credit limit, maybe this person will not default this month, nor next month, nor the one after, but in two years' time. So you essentially have to wait a long time to see what happens to that person once you give the loan.

Matheus Facure: So there's a huge delay between you doing something and you actually seeing how that something performs in reality. And in a sense, it's sort of an explosive business. If you mess up today, you might only figure this out one year in the future, and by then it's too late. So it's a very risky business.

Matheus Facure: You have to be extra careful. There are lots of particularities and lots of continuous treatments, which are very hard to deal with. Credit lines and interest rates are both treatments that a bank cares a lot about, and they're continuous. They don't have linear response functions. So it's a very tough problem overall. It's no joke dealing with the problem of giving credit, of giving loans to people.

Matheus Facure: So it's a tough nut to crack. On the plus side, it's very interesting. I feel like banking is a very nice place to be if you want to work with causal inference, again, because there are all these problems that are very interesting. And because banking already has a lot of the culture that comes from economics, the field of economics and econometrics is already well established.

Matheus Facure: So it's easier to find literature on it and easier to convince people that an econometric approach or a causal approach actually makes sense for the problem that we're trying to solve.
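
To make the point about continuous treatments with nonlinear response functions concrete, here is a minimal Python sketch on simulated data. Everything in it is an assumption made for illustration: the randomized credit limit, the square-root response curve, and the variable names are hypothetical and do not come from Nubank or from the book. It only shows how a single linear slope can hide the fact that the marginal effect of a credit limit shrinks as the limit grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical randomized credit limit (in thousands), uniform on [1, 10].
limit = rng.uniform(1.0, 10.0, size=n)

# Assumed concave response: spend grows with sqrt(limit), plus noise.
spend = 2.0 * np.sqrt(limit) + rng.normal(0.0, 0.5, size=n)


def fit_ols(feature, y):
    """Least-squares fit of y on an intercept and one feature."""
    X = np.column_stack([np.ones(len(y)), feature])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, slope]


# Model 1: assumes a linear response, so one slope for every limit.
linear_coef = fit_ols(limit, spend)

# Model 2: linear in sqrt(limit), which matches the simulated curve.
sqrt_coef = fit_ols(np.sqrt(limit), spend)


def marginal_effect_sqrt(t):
    """Derivative of the fitted sqrt model with respect to the limit at t."""
    return sqrt_coef[1] / (2.0 * np.sqrt(t))


print("linear model, effect of +1 limit (same everywhere):", linear_coef[1])
print("sqrt model, effect of +1 limit at limit=1:", marginal_effect_sqrt(1.0))
print("sqrt model, effect of +1 limit at limit=9:", marginal_effect_sqrt(9.0))
```

In this simulation the linear model reports one average slope, while the square-root model recovers a larger effect at low limits and a smaller one at high limits. With real credit data the functional form is unknown and the treatment is rarely randomized, so this is only a starting point for thinking about the problem.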

Alex: What non-technical skills that you learned earlier in your life do you find most helpful today in your work, or when dealing with the challenges that you mentioned?

Matheus Facure: I wouldn't say earlier. I don't know how early you learn this, but definitely writing. I feel that being able to structure your thoughts well and in a written format is very powerful. Also in spoken language or PowerPoint. Whenever you have an idea and you can structure it in a presentable manner, I think that's a very powerful skill.

Matheus Facure: And especially if you work for a big company, or if you have to work with different backgrounds, so not necessarily managers, right? Sometimes you're a data scientist and you want to work with engineering, or you want to work with someone from product. So not necessarily your boss or your boss's boss, but folks from different backgrounds that work together to solve the same problem.

Matheus Facure: You have to be able to communicate effectively with them, and being able to translate the technical difficulties and challenges of a product without leaving them in doubt, and without dumbing it down, actually capturing the essence of it in a way that's clear, I think is a very valuable skill.

Matheus Facure: I didn't master it by any means. I'm trying to, but it's something that I try my best at, and I think I'm fairly decent at it, let's say.

Alex: Who would you like to thank?

Matheus Facure: Most of the academic causal community. I think most of them are amazingly kind. Definitely Joshua Angrist, for all his work. Not only his work as a researcher, I think mostly his work as a teacher. He's a phenomenal teacher, definitely a source of inspiration. The way he teaches is amazing. Again, back to the point of communication, I think good teachers have that figured out. So we definitely should learn from them. And a bunch of nice folks that helped me along the way.

Matheus Facure: For instance, I can remember a bunch of people that were very kind to me. So, for instance, Pedro and Carlos, Pedro Sant'Anna and Carlos Cinelli, who are incredible researchers in this field and also Brazilians. They were very kind to come and talk at a meetup I organized. Also, Nick, who again has a book on causality and econometrics.

Matheus Facure: And Shan also. When I invited them to talk, they were also very kind and lent their time. Besides that, a lot of people from the causal community tend to be very open and accessible. So I generally post questions and ask them, and they reply and say, okay, you got this wrong, you have to think about it this way, or they explain something or they give a perspective on something that a plan.

Matheus Facure: So a bunch of researchers that I wasn't expecting to be so open and so approachable. So Casper, who worked a lot with putting uncertainty bounds on top of synthetic controls, and Peter Hull, from whom I learned a lot in terms of the basics of regression. I think learning the basics is very interesting.

Matheus Facure: So I think the researchers in causal inference and economics in general are very kind, and I would like to thank almost all of them. And everyone that I've interacted with, certainly, I have nothing to complain about. All of the interactions have been very positive.

Alex: That's great. I think that's a great message to the community as well. If you have a problem, if you want to ask someone a question, just reach out. There's a chance that those people will answer you. And maybe this chance is higher than you believe today.

Matheus Facure: Yeah, definitely.

Matheus Facure: Like sometimes the answer will be, oh, here's a link where you can learn. It might not be a full-fledged answer, but they will at least point you in the right direction. I think that definitely will be the case.

Alex: Where can people learn more about you and reach out to you?

Matheus Facure: So social media. All the stuff related to causal inference, I try to post on social media. I try to talk about it. I'm not that active. When I don't have anything new, I don't post. I don't do weekly posts. I usually just post when I have something new, or something that I find interesting, or a question that I want some answers for. So usually social media, LinkedIn and Twitter.

Matheus Facure: I'm usually there. You can find my email also on those networks and on GitHub. If you want to give any feedback on the open source book or reach out, I'm also there. I'm also active in the open source Python community and the causal inference stuff related to Python.

Alex: Are you planning to still update your online book?

Matheus Facure: Yeah, definitely. Again, I already have some branches that are there, gathering dust, but I definitely want to go back to them as soon as I have time. I want to talk a little bit more about reinforcement learning. I think we chatted a lot about it, but I've never formalized what I'm thinking into a written format.

Matheus Facure: So I'm definitely working on some stuff related to reinforcement learning and how it relates to what we know as causal inference.

Alex: Great, Matheus, thank you so much for your time. It was a pleasure.

Matheus Facure: Thank you for having me, Alex. You're also too kind. 

Alex: Thank you. And hope to have another conversation with you in some time.

Matheus Facure: You too.

Alex: Congrats on reaching the end of this episode of the Causal Bandits podcast. Stay tuned for the next one. If you liked this episode, click the like button to help others find it. And maybe subscribe to this channel as well. You know, stay causal.
