Causal Bandits Podcast
Causal Bandits Podcast with Alex Molak is here to help you learn about causality, causal AI and causal machine learning through the genius of others.
The podcast focuses on causality from a number of different perspectives, finding common ground between academia and industry, philosophy, theory and practice, and between different schools of thought and traditions.
Your host, Alex Molak, is a machine learning engineer, best-selling author, and educator who decided to travel the world to record conversations with the most interesting minds in causality to share them with you.
Enjoy and stay causal!
Keywords: Causal AI, Causal Machine Learning, Causality, Causal Inference, Causal Discovery, Machine Learning, AI, Artificial Intelligence
Causal Secrets of N=1 Experiments | Eric Daza S2E3 | CausalBanditsPodcast.com
FREE Online Course on Causality
Causal Inference & Discovery in Python
Causal Secrets of N=1 Experiments
Join me for a one-of-a-kind conversation on the opportunities and challenges of n-of-1 trials, Eric's causal journey, his path into statistics, his love of sci-fi, and how single-subject experiments could reshape personalized medicine.
Video version available here
About The Guest
Dr. Eric J. Daza is a biostatistician and health data scientist with over 22 years of experience (Cornell, UNC Chapel Hill, Stanford). He works at Boehringer Ingelheim. Eric is the creator of Stats-of-1, a health innovation newsletter & podcast on n-of-1 trials, single-case designs, switchback experiments, and personal AI for digital health/medicine.
All views and opinions expressed by Dr. Eric J. Daza represent no one but himself. These views and opinions do not represent the views and opinions of his employer.
Connect with Eric:
About The Host
Connect with Alex:
- Alex on the Internet
- Consulting and Causal AI Training For Your Team: hello <at> causalpython.io
Episode Links
Papers
- Daza (2018) - "Causal Analysis of Self-tracked Time Series Data Using a Counterfactual Framework for N-of-1 Trials"
- Matias, Daza et al (2022) - "What possibly affects nighttime heart rate? Conclusions from N-of-1 observational data"
Books
- Asimov, I (1991) - "Foundation"
Apps
Webpages
Causal Bandits Podcast
Causal AI || Causal Machine Learning || Causal Inference & Discovery
Web: https://causalbanditspodcast.com
Connect on LinkedIn: https://www.linkedin.com/in/aleksandermolak/
Join Causal Python Weekly: https://causalpython.io
The Causal Book: https://amzn.to/3QhsRz4
S02E03 - Eric Daza - N-of-1 Trials
Eric Daza: An N-of-1 trial is a study of one person in the trial sense. It's randomized, it's a randomized trial. How useful are these designs? They don't generalize, do they? And the answer is, that's correct. They're not meant to generalize. They're personalized models. That's the partial answer. The other part of that answer is they are meant to generalize to yourself.
Alex: For people who would be interested in running experiments like this, maybe on themselves or maybe, uh, designing trials like this, what do you think would be some of the most important aspects to pay attention to at the design stage and the implementation stage, and then analysis stage to make sure that, uh, our conclusions are really, uh, trustworthy or reliable?
Eric Daza: The self-deprecating way I say it is, you know, I have pretty bad memory. Most people can do that work and they understand it, but I had to keep repeating it, and that's the reason I had to get a doctorate, because I needed to really push it into my brain. You know, I keep coming back to Rubin's quote, design trumps analysis.
I don't know if at this point it's some kind of a mantra, but that really is important, and it extends even beyond causality, because just thinking about what the study question is, what is the question you're trying to answer, that is the root of really good science.
Marcus: Hey, causal bandits, welcome to the second season of the Causal Bandits Podcast, the best podcast on causality and AI on the internet.
Jessie: He started with neurobiology, but found his true passion in statistics, and it was theater, music, and the history of disease in his family that inspired him to switch gears. He studied at Cornell, Chapel Hill, and Stanford. Trained concert pianist, creator of Stats-of-1, and researcher, principal clinical data scientist at one of the leading pharma companies.
Please welcome Dr. Eric Daza. Let me pass it to your host, Alex Molak.
Alex: Welcome to the podcast, Eric. It's a great pleasure to have you here. I'm very glad you found time. It's very early in the morning for you. I know.
Eric Daza: Yes. It's, uh, it's, it's getting, uh, not so early, but yes, it is quite early. Thank you for having me on the show.
Alex: Amazing. I'm really happy to have you here. And I think this topic that we'll talk about, one of the topics at least that we'll talk about today, will be super interesting for our audience. And so I wanted to start with a hardcore question. What is an N-of-1 trial?
Eric Daza: Ah, perfect. Yes, the starting question.
So, an N-of-1 trial is a study of one person in the trial sense. It's randomized, it's a randomized trial, and it's essentially a crossover design. So you're curious about, oh, you know, maybe, does drinking coffee affect the amount of sleep I get? Something like that. And you want to experiment on yourself, so you would randomize yourself.
Maybe today I drink coffee, maybe tomorrow, no coffee, something like that. And then you see how much you sleep each night, and then you compare the average, uh, amount of sleep you get each night under coffee or no coffee. So that's like a, a nice playful example of an n of one trial. Some of your listeners might know it as a time series design, so that's another way of saying it.
Mm-hmm.
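The coffee-and-sleep example can be sketched as a small simulation. This is an illustrative sketch, not code from the episode: the 7-hour baseline, the noise level, the assumed true effect of -0.5 hours, and the function names are all hypothetical, and the simulation deliberately ignores carryover and autocorrelation, which come up later in the conversation.

```python
import random
import statistics

def simulate_n_of_1(n_days=200, true_effect=-0.5, seed=0):
    """Simulate one person's self-experiment: coffee (1) or no coffee (0) each day."""
    rng = random.Random(seed)
    coffee, sleep = [], []
    for _ in range(n_days):
        x = rng.randint(0, 1)  # randomize today's treatment with a coin flip
        # Hypothetical outcome model: 7 hours baseline plus effect plus noise
        y = 7.0 + true_effect * x + rng.gauss(0, 0.5)
        coffee.append(x)
        sleep.append(y)
    return coffee, sleep

def difference_in_means(treatment, outcome):
    """Average outcome on treated days minus average outcome on control days."""
    treated = [y for x, y in zip(treatment, outcome) if x == 1]
    control = [y for x, y in zip(treatment, outcome) if x == 0]
    return statistics.mean(treated) - statistics.mean(control)

coffee, sleep = simulate_n_of_1()
estimate = difference_in_means(coffee, sleep)
print(f"Estimated effect of coffee on sleep: {estimate:.2f} hours")
```

Because the treatment is randomized day by day, the simple difference in average sleep between coffee days and no-coffee days recovers the assumed effect up to sampling noise.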
Alex: And what inspired you to go into this particular direction when it comes to causality and causal methods?
Eric Daza: Well, let's see, with causal methods, I actually worked in, uh, pharma as a biostatistician, uh, with a master's degree many, many years ago. Won't say exactly when. And during that time, I wanted to learn more about the statistics underlying the software, you know, programming I was doing as a stats programmer.
And so I prepared to go back to grad school for a doctoral degree. And one of the courses that I ended up taking by chance was a causal inference course by Mark van der Laan. And lemme tell you, I had no idea what causal inference was. I'd never heard of it, and I had no idea who Mark was. So now I tell people, it's like, you know, at least in the US, if you took basketball lessons from LeBron James or Steph Curry, but A, you didn't know what basketball was, and B, you had no idea who those two people were.
So, looking back now, I'm like, oh my gosh, no wonder he was really good at it. And it's so interesting, you know, how he is able to explain these concepts. So that was causal inference for me. And then with N-of-1: after I got back into grad school, I was about to graduate at the end and realized, oh, I need to find a new focus.
So I like causal inference. Is there something I can do with it? Professor Susan Murphy came to campus and she gave a series of talks, and your listeners might know that she works on dynamic treatment regimes, which is a branch of causal inference, and it's personalized in a meaningful way.
I saw her work and I was very impressed and also very daunted, and she mentioned this other field called N-of-1 trials, so that's kind of like what sparked it for me. Plus one of my loved ones has irritable bowel syndrome, which is a very idiosyncratic, chronic condition. It's the kind of thing that you can treat with an N-of-1 trial.
So that's kind of the three streams for how I got here.
Alex: In one of your papers, you write about MoTR, model-twin randomization. That is a device that leverages the g-formula, or backdoor adjustment for those of us who are more on the Pearlian side, under serial interference. What is serial interference?
Eric Daza: Yes. So serial interference is, you'll recall interference in the, the standard causal case where you have different people in, in the study. And you know, if it's you and me, your treatment might affect my potential outcomes, right? And vice versa, my treatment might affect your potential outcomes. I come from public health, so typically, uh, that's talked about with a vaccine example.
If I get vaccinated, then both of your potential outcomes for the sickness are now kind of more protected. So my treatment has affected your potential outcomes. That's interference between people. Now imagine that Eric and Alex are the same person, but at different points in time. So now the interference is still between us.
If I get vaccinated at time point one, it affects my potential outcomes at time point two. But that's you, you're me in the future, whichever way you wanna spin it to make it look better for you, of course. But that's the idea. And so serial interference is just interference over time. So the neat thing is, you can't interfere with my potential outcomes because you're in the future for me.
So you can't, it can't interfere backwards. So that's really what serial interference is.
Alex: What are the main challenges in the setting of time series and just one individual, when we can have some carryover effects, for instance, and all those challenges that we can meet in any time-series-based scenario?
Eric Daza: Yes. So, uh, so carryover, like you said, is one of the, uh, the big ones. And, uh, that just means that, uh, the treatment that I get now, uh, could impact, uh, the, uh, outcomes I have in the future. So there's again, uh, a way to imply or induce serial interference and, uh, that's called carry over because the effect carries over into the future.
You could also have just plain old autocorrelation. So perhaps, you know, if I get migraines, perhaps my migraine chances tomorrow are affected by my migraine chances today. Whatever's predisposing me, naturally, to get migraines today also makes me more likely to get a migraine tomorrow.
So that would be like autocorrelation. And you also have exogenous factors. So things like the weather that can affect your outcomes over time. So those are things that are not affected by your treatment or exposure or the outcome itself. Again, examples like the weather: I can't affect the weather with my coffee drinking or my migraine probability, but it can affect both of those things.
So those are the typical complications when you are trying to do causal inference in this setting.
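Carryover can be made concrete with a small deterministic sketch. The outcome model below (7 hours of sleep at baseline, -1.0 for today's coffee, -0.5 carried over from yesterday's coffee), the alternating five-day periods, and all function names are assumptions chosen for illustration, not anything taken from the episode or from Eric's papers. It compares a naive difference in period means with one that drops the first day of each period as a washout.

```python
def outcome(x_today, x_yesterday):
    # Hypothetical model: immediate effect of -1.0, carryover effect of -0.5
    return 7.0 - 1.0 * x_today - 0.5 * x_yesterday

def run_blocks(n_periods=8, period_len=5):
    """Alternate control (0) and treatment (1) periods; return (x, day_in_period, y) rows."""
    rows = []
    prev = 0  # assume no treatment on the day before the study starts
    for p in range(n_periods):
        x = p % 2
        for d in range(period_len):
            rows.append((x, d, outcome(x, prev)))
            prev = x
    return rows

def mean_diff(rows, skip_days=0):
    """Treated mean minus control mean, ignoring the first skip_days of each period."""
    treated = [y for x, d, y in rows if x == 1 and d >= skip_days]
    control = [y for x, d, y in rows if x == 0 and d >= skip_days]
    return sum(treated) / len(treated) - sum(control) / len(control)

rows = run_blocks()
naive = mean_diff(rows)                # first day of each period is contaminated
washed = mean_diff(rows, skip_days=1)  # drop one washout day per period
steady_state = outcome(1, 1) - outcome(0, 0)
print(naive, washed, steady_state)
```

Under these assumptions the naive comparison gives about -1.325, while dropping one washout day per period gives -1.5, which matches the steady-state effect of sustained treatment in this toy model. Real carryover rarely lasts exactly one day, so the washout length is itself a design decision.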
Alex: How do you deal with them in those methods, in those settings with just one subject?
Eric Daza: Well, the idea is that if you're trying to figure out, for example, what's giving you migraines, or what causes them to appear and affects their intensity, then the idea is you're trying to figure out what the triggers are, and what are the recurring triggers?
Like, is it usually when I drink more coffee that my chance of a migraine increases? Maybe it's coffee for me, but for you, if you get migraines, maybe coffee doesn't do anything. It could be just the amount of sleep you get, or it could be your physical activity or exercise, or it could be some drug you're taking.
So these designs are very useful when you have outcomes that have very heterogeneous triggers across people. Uh, and by triggers I even mean the same trigger perhaps for me. I drink coffee and it makes my migraines worse. But for you, you drink coffee and it's the opposite. It actually prevents your migraines.
If we had done a standard randomized controlled trial, or even just an observational study between people like you and me, you would find no average treatment effect. The effects would cancel out because coffee is helping half the population and it's hurting the other half. So that really tells you where these designs fit in: when it's really heterogeneous.
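The cancellation argument can be illustrated with two simulated people whose effects have opposite signs. This is a sketch under stated assumptions: the effect sizes of +1 and -1, the baseline of 2.0, the noise level, and the helper names are all hypothetical.

```python
import random
import statistics

def run_person(effect, n_days, rng):
    """Randomize coffee daily for one person and record a migraine-severity score."""
    xs, ys = [], []
    for _ in range(n_days):
        x = rng.randint(0, 1)
        # Hypothetical outcome model: baseline 2.0 plus person-specific effect plus noise
        ys.append(2.0 + effect * x + rng.gauss(0, 0.5))
        xs.append(x)
    return xs, ys

def difference_in_means(xs, ys):
    treated = [y for x, y in zip(xs, ys) if x == 1]
    control = [y for x, y in zip(xs, ys) if x == 0]
    return statistics.mean(treated) - statistics.mean(control)

rng = random.Random(42)
eric_x, eric_y = run_person(effect=+1.0, n_days=400, rng=rng)  # coffee makes it worse
alex_x, alex_y = run_person(effect=-1.0, n_days=400, rng=rng)  # coffee helps

eric_effect = difference_in_means(eric_x, eric_y)
alex_effect = difference_in_means(alex_x, alex_y)
pooled_effect = difference_in_means(eric_x + alex_x, eric_y + alex_y)

print(f"Eric: {eric_effect:+.2f}  Alex: {alex_effect:+.2f}  pooled: {pooled_effect:+.2f}")
```

The pooled estimate lands near zero even though each person has a clear effect, which is the situation where analyzing each person separately is informative and the group average is not.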
Alex: In one of your papers, you talk about, uh, the fact that today we have access to many measurements on personal level. So for instance, people might record different indicators related to their personal health using things like Fitbit and, and other types of personal devices. In this paper, you also talk about an estimate that is called average period treatment effect.
So we know that in the context of traditional randomized controlled trials, we talk about the ATE, the average treatment effect. Here you add this one P, APTE, average period treatment effect. Can you tell us a little bit more about this estimate and what's special about it?
Eric Daza: Sure. So the average period treatment effect, the APTE, is similar to the ATE in the sense that it's also an average effect between two or more treatment levels.
Any pair of those, right? But just like when we were talking about the time series setting and interference just now, recall the difference between interference and serial interference: with interference, it's between people; with serial interference, it's the same person over time.
So in a sense, it's between people; it's just that it's the same person. With the APTE, the period part is kind of a placeholder for the time period. So me at time period one versus time period two. So it's separating those two time periods out of the same person. So it's just a way of describing an ATE, except across different periods of time for the same person.
But the analogy is exactly the same. We're taking a group of me at certain time periods when I drink a lot of coffee, and the other group of me when I don't drink a lot of coffee, and comparing the difference. And that's all the APTE is. Those time segments are called periods in the N-of-1 trials literature.
And that's why I picked that term.
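The analogy can be written side by side in potential-outcomes notation. This is a simplified sketch, not necessarily the exact definition used in the paper: it assumes a binary treatment and, for readability, ignores carryover, so that the potential outcome in period t depends only on the treatment in that period. The symbols i (person), t (period), n, and T are introduced here for illustration.

```latex
\begin{align}
\text{ATE}  &= \frac{1}{n}\sum_{i=1}^{n}\mathbb{E}\big[\,Y_i(1) - Y_i(0)\,\big]
  && \text{(average over people } i = 1,\dots,n\text{)} \\
\text{APTE} &= \frac{1}{T}\sum_{t=1}^{T}\mathbb{E}\big[\,Y_t(1) - Y_t(0)\,\big]
  && \text{(average over periods } t = 1,\dots,T \text{ of one person)}
\end{align}
```

The only change is the index being averaged over: people in the first line, periods within one person in the second.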
Alex: And when we analyze trials like this, is there anything specific about the statistical machinery that we use to get to conclusions, to build our inferences on top of this data coming from single-subject studies, randomized single-subject studies?
Eric Daza: It's a great question. The largest difference is that it is a time series setting.
So when you look at the assumptions in, uh, a standard group based trial or study, usually you have, uh, an independence assumption, independence between participants of the study. Here, that assumption is replaced by something like stationarity for one. So stationarity means that over time there are basically averages that stay the same over a certain time span.
And that's an additional assumption compared to the standard study that you do across people. And then the independence assumption gets replaced with, basically, wide-sense stationarity is part of it, but the errors are no longer dependent on each other after you control for things like lag and autocorrelation.
So you still have that independence assumption, but you have to work more to get to it, because now your lagged outcomes can be correlated with your present outcomes, which is a very common complication in the time series setting. So those are the two basic differences. And, you know, on top of that, they have lots of downstream implications, but the modeling itself is pretty open and it's very analogous to the group-based scenario.
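The idea that errors become approximately independent once the lag is controlled for can be sketched with a simulated first-order autoregressive series. Everything here is an assumption for illustration: the AR(1) model, the coefficient 0.6, and the helper names. The sketch measures the lag-1 autocorrelation of the raw series, regresses each value on the previous one, and measures the autocorrelation of the residuals.

```python
import random

def simulate_ar1(n=5000, phi=0.6, seed=1):
    """Stationary AR(1) series: y[t] = phi * y[t-1] + noise."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0, 1))
    return y

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

y = simulate_ar1()
slope, intercept = fit_line(y[:-1], y[1:])  # regress y[t] on y[t-1]
residuals = [y[t] - (intercept + slope * y[t - 1]) for t in range(1, len(y))]

raw_acf = lag1_autocorr(y)
resid_acf = lag1_autocorr(residuals)
print(f"raw lag-1 autocorrelation: {raw_acf:.2f}, after lag adjustment: {resid_acf:.2f}")
```

The raw series is strongly autocorrelated, while the residuals after adjusting for the lagged outcome are close to uncorrelated, which is what makes the usual independence-based inference reasonable again in this toy setting.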
Alex: What would be some of the best materials, you think, for people who would be interested in learning more about N-of-1 trials?
Eric Daza: N-of-1 trials? Well, shameless self-plug, I do have a newsletter called Stats-of-1. There it is, whoop. And so at Stats-of-1, we focus on a lot of the statistical methods, data collection, software, that kind of subject.
We focus on those materials that are connected to N-of-1 trials, but also single-case designs. That's what they're called in psychology and education. A lot of the field actually has its roots in psychology and education. But, again, I've mentioned time series, so there's an entire literature of time series, and we have resources from that literature at the newsletter.
It also touches a field called functional data analysis, where you're looking at, you know, instead of an average on a given day, you're looking at the entire curve. So maybe it's like your step counts over the day, and does your step count curve differ when you have coffee versus when you don't?
And, let's see. And there's a lot of deep work that's been done in econometrics and a little bit in finance. In finance, much of your audience will know an N-of-1 trial as a switchback experiment. So that's the same thing. It's this randomized or experimentally manipulated condition, and you wanna see, is there a repeated pattern over time?
Alex: Mm-hmm.
Eric Daza: So all those literatures have really great nuggets of information in them. And at Stats-of-1, we're just trying to bring them all together. So, you know, somebody from switchbacks can talk to somebody from N-of-1 and say, hey, oh, you worked on this covariance structure. Oh, we already did that in our field.
And then in the other way, oh, you know, you're doing this missing data thing. Oh yeah, we did that here. So, you know, let's use our methods and advance the field together that way. So, yeah. Mm-hmm. So that's one great resource. We have a resources page where you can find resources in these areas as well.
So, and then I could list off things, but I'll stop there and see if your audience gets curious about going to our resources page.
Alex: I'm sure many people will. Eric, when you think about causal identification... Let me take a step back here. You said you started with Mark van der Laan. You talked about your background in epidemiology, and we know that different schools of thought in causality sometimes refer to different traditions when it comes to the assumptions and different frameworks.
I mean here mainly the potential outcomes framework and the Pearlian, so-called structural model framework. What are your thoughts about these frameworks and their application to the topic of causal identification in your area of interest, N-of-1 trials?
Eric Daza: I did have one more thought briefly before I answer that, on the resources. Your audience should also check out the N-of-1 collaborative network. Just search for that. Mm-hmm. Fantastic group of people. They're based out of Australia. I learned a lot of my first foundational concepts in N-of-1 trials from them. They've been doing it for decades. So they're definitely the experts specifically in N-of-1 trials and that area.
So they're fantastic. And they have like hundreds of subscribers as well, so they're great to check out.
Alex: That's great. We will add a link in the description of the video, or the podcast for our audio listeners, so everybody will be able to go and visit their page.
Eric Daza: Definitely go check them out. Um, and then, so to your question about the frameworks, it, your question was, uh, you know, how, how do they relate or, you know, how do they, uh, connect? That kind of thing.
Alex: My question was specifically in the context of identification, right? Causal identification, and how do you find those frameworks? Which one maybe is more useful, or more useful in which cases, for you when you think about causal identification in your area of interest, in the area of N-of-1 trials?
Eric Daza: To be honest, I haven't, I haven't pushed into, um, other areas myself, outside of the potential outcomes framework. I will say that, uh, and that's not to say, you know, you shouldn't, it's just that me and in my recent work, I haven't really explored the other frameworks that much from what I know.
I tend to lean on the DAG framework a lot, directed acyclic graphs, from Pearl and others. And I personally find it very useful because I am visual. I like to see where the arrows go, and that helps me write out the equations that I need to see if the effect is actually identifiable, right?
And how to construct a model to estimate that effect, and to identify that effect. So the DAG framework for me is quite useful. I've definitely leaned on that one a lot. It could be especially useful with N-of-1 and single-case designs because, again, they're time series, so you're forced to draw this DAG that's recursive.
It really hits home because you start to draw arrows from the past to the present, and now that whole diagram is shifting over time. So it really helps you understand what you need to put in your model and what you don't, and what assumptions you're making when you don't draw an arrow. So I think that the TL;DR for me is, DAGs are great.
They're great for coming up with an identifiable model.
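A time-unrolled DAG of the kind described here can be sketched with Python's standard `graphlib` module. The node names and the particular arrows are assumptions for illustration: `W` stands for an exogenous factor such as weather, `X` for the exposure such as coffee, and `Y` for the outcome such as migraine, with one-step carryover and autocorrelation arrows. In a randomized N-of-1 trial the arrow from `W` into `X` would be removed by design; it is kept here to mimic observational self-tracking data.

```python
from graphlib import TopologicalSorter

def unrolled_dag(n_steps):
    """Return {node: set of parents} for a hypothetical DAG unrolled over n_steps."""
    parents = {}
    for t in range(n_steps):
        parents[f"W{t}"] = set()               # exogenous factor, no parents
        parents[f"X{t}"] = {f"W{t}"}           # exposure influenced by the exogenous factor
        parents[f"Y{t}"] = {f"X{t}", f"W{t}"}  # outcome influenced by both
        if t > 0:
            # Arrows from the past: carryover (X at t-1) and autocorrelation (Y at t-1)
            parents[f"Y{t}"] |= {f"X{t-1}", f"Y{t-1}"}
    return parents

dag = unrolled_dag(3)

# static_order() raises graphlib.CycleError if the graph is not acyclic
order = list(TopologicalSorter(dag).static_order())

print("Parents of Y2:", sorted(dag["Y2"]))
print("One valid causal ordering:", order)
```

Listing the parents of an outcome node makes explicit which lagged variables the model would need to include, and every arrow left out of `unrolled_dag` is an assumption that the corresponding dependence is absent.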
Alex: When you think about designing and running studies like this, N-of-1 studies, what would be your advice for people who would like to start with something like this? So maybe we have in our audience some people interested in things like nutrition or longevity who are aware of, for instance, Bryan Johnson.
So this is a person who is famous for doing a lot of experimentation on himself, but he always emphasizes that he cannot give us a high probability that these findings will translate to other people, because he only knows about himself. For people who would be interested in running experiments like this, maybe on themselves, or maybe designing trials like this, what do you think would be some of the most important aspects to pay attention to at the design stage and the implementation stage, and then the analysis stage, to make sure that our conclusions are really trustworthy or reliable?
Eric Daza: Great question. So two thoughts. One, with the comment about generalizability, I get that a lot. You know, how useful are these designs? They don't generalize, do they? And the answer is, that's correct. They're not meant to generalize. They're personalized models. So that's the partial answer.
The other part of that answer is they are meant to generalize to yourself. To figure out, in general, for you, what is the pattern? And then other goals, like, is there a pattern that's more general to other people? In the world of N-of-1 studies, that's a secondary question.
It's important, and it connects to methods that borrow information across people. In biostatistics, typically they're mixed effects models. That's sort of the gateway. I tell statisticians, I say, look, you know what a mixed effects model is, or random effects model, depending on your terminology.
Now take each person in that study, look at it in a repeated measures way. Voila, there's your N-of-1. The N-of-1 addition is that the models per person can themselves differ. You're not constraining them to have mostly the same model like you are in a mixed effects study or mixed effects analysis.
That was about the generalizability piece. Now, if you're designing your own study. You, it, it's, it's like designing a research study. You, uh, have to, as much as possible, specify the variables you're going to collect what you're going to measure. You know, think about how you're going to measure them. Uh, is it feasible?
Are you going to use a passive sensor like, you know, a wrist-worn sensor? Are you going to use a daily diary? Are you going to take lab values? So think about what you wanna collect. But most importantly, think about the question that you want to answer, because you can collect a lot of things, and that's great.
You know, if you're curious, that's awesome, but you have limited time and money and resources, and also probably patience, if you're involving other people that are helping you collect these data on yourself. So really think about the question you want to answer and then start to figure out what might help answer that question, number one.
And number two, which of those things can you measure and how are you gonna measure them? And that will inform your design. You know, there's that wonderful quote, of course, by Don Rubin: design trumps analysis. The point being that you really have to think about the study question and the variables involved and the timeline involved.
And, to your point, how are you gonna collect that data and then analyze it, what models you're gonna use. Those things are all more important than the analysis, because otherwise it's the typical garbage in, garbage out. If you don't design it well, you could find lots of things that don't really translate to being meaningful or useful for you.
Alex: On your webpage, you have a section that talks about software that can be used when we work with N-of-1 trials. Can you give us a little bit of a glimpse into the software ecosystem that can be helpful in designing or analyzing these types of studies?
Eric Daza: Yeah. I've talked to a number of folks over the years who have developed like self data collection and experimentation, uh, type platforms, you know, for different applications.
One of the earliest was called TummyTrials, and it was for people with irritable bowel syndrome, which was close to my heart because that's what motivated me to go into the field. I don't know if they progressed with that app. It was done for a study, and they may have converted it to a different kind of app.
I'm not sure. I think Migraine Buddy is another one, but that's more on the self-tracking side. I don't know how much they help you then design your own self-experiment. Mm-hmm. One of the newest, really awesome-looking packages is called StudyMe, I believe. Mm-hmm. I don't know much about the package itself, but I know the PI, Stefan Konigorski, and he's excellent.
Does really fantastic work. So I would encourage your listeners to go check out that package. And of course we can put these links in, you know, so they can check.
Alex: Yeah, definitely. We'll put the links in the description. Depends if you watch this as a video or you just listen.
Your journey into statistics, or your statistical journey, was very adventurous, at least it looks like. So you've been at Stanford, you've been at Chapel Hill, you've been at Cornell, and you were doing various things at various moments in time. Can you tell us a little bit more about your journey and how it started?
What inspired you to go into statistics and, and what was your trajectory later on when, when you already knew that you want to learn more about statistics and then causal inference?
Eric Daza: I hope this is inspiring for your listeners, especially the ones that, uh, don't think of themselves as math people. So I started off in undergrad as a neurobiology major.
I was very interested in consciousness and, you know, what makes us conscious and all that. But very early on I developed another interest, in theater, in student theater. Now, I am a pianist. I'm a trained concert pianist, and so I had a musical background. I also lived, on purpose, in the dorm on campus that was the theater dorm.
We had our own student run theater. And so that naturally drew me into that entire crowd. It also distracted the hell out of me. Uh, I was constantly working on theater projects instead of my homework. Uh, this appears, if you look at my grades, you'll see that there's this, uh, inverse curve of my, uh, my academic performance and my theater involvement goes like that over time.
Um, so I was still very interested in neurobiology, but I had a lot of trouble concentrating. So towards the end of my undergrad, I was, uh, taking a cognitive studies course, and we had a lab component where you had to design your own experiment and analyze, collect the data from your fellow students who were subjects, and then analyze the data.
And that was really the bug that got me. 'Cause at the end of undergrad, I still loved neurobiology, but I realized, after doing the classwork and doing a bit of lab work and summer jobs in neurobio, oh, I don't really necessarily want to do this kind of work. I think it's very interesting.
The findings that come out of it are fascinating. But to do the bench work, the field work, eh, it's okay. So discovering this cognitive studies class and the spark of designing the experiment and analyzing the data, that was a really nice find at the end of my undergrad career.
So that sparked it. But, you know, it was my fourth year; in the US it's four years long. And I was about to graduate, and then my school came up with this brand new one-year master's program in applied statistics. So, to me that was, I like to say, a statistically significant moment. The universe is telling me, ah, you're getting interested in stats and now you have this program.
We have this new program for you. Huh. And it's only one year long. So I had to prepare for it. Of course, I was not a math major. And again, I, at that point, my grades had dropped. I had a lot of trouble focusing. Um, but I really like, uh, hooked onto it and prepared for it. Um, I fought to get in, uh, you know, with, with myself, with, you know, training myself to take the intro math courses and all that.
I got in, one of my proudest moments, and finished the master's a year later. And that set me off on the rest of my journey. So that journey would take me, briefly, through pharma for about five years. I was a master's-level biostatistician at a small, mid-size pharma company. And during that time I realized the work I was doing was statistical programming, but I really wanted to know more about the underlying concepts and statistics.
You know, like, why do I do a t-test here? I know it says so in the manual, like, do the t-test. But why? The self-deprecating way I say it is, you know, I have pretty bad memory. Most people can do that work and they understand it, but I had to keep repeating it. And that's the reason I had to get a doctorate, because I needed to really push it into my brain as much as possible just to do the work that other people could already do.
So that was my drive to get back in, and yeah. So briefly, after three years, I applied to doctoral programs in biostatistics. I didn't get in. I applied to 10 programs. I didn't get into any of them, but I knew my academics were mediocre at the time.
I just needed to go out there, apply to these programs and, you know, literally get data on what I need to do to get into the program. UNC Chapel Hill was fantastic. They flew maybe 30 or 40 prospective students out to Chapel Hill and gave us a tour. You know, we stayed there for a few days.
They told us about the program and they told us, you know, we won't accept all of you, but we just wanted you to understand this kind of program. So it really left a good impression on me, and the graduate admissions director at the time, Suchi, rest in peace, was a wonderful man. He was very encouraging and targeted, and he said, look, we love your work experience.
Uh, you could do better academically, so here's what you do. It's kinda like a doctor, like, take two pills and call me in the morning. Uh, and he said, you know, take these courses to prepare you for this program, and then you'll be much more competitive. So I did that over two years, uh, from that point on. Uh, and that's when I met, uh, Mark van der Laan, because he was one of the, the, uh, professors whose class I took by chance, uh, to get into grad school.
And that set me off. And then, and then there's that whole journey.
Alex: And what inspired you to go to Neurobiology in the first place?
Eric Daza: Yeah. When I was in high school and before that, you know, I moved to the US from the Philippines and I had the kind of stereotypical middle, upper middle class Asian upbringing where my parents, they encouraged their kids, okay, go be a doctor or a lawyer or an engineer.
Those three things, pretty stable. They were very encouraging. We could pursue whatever we wanted, but that was kind of the average good advice, right? And that's true of a lot of immigrant groups, not just Asians, of course. So pre-med, that's the short answer. Uh, I was encouraged to be pre-med, but again, my, my brain was kind of all over the place and I also wanted to do music, so I pursued a music major for a while.
That's part of what explained my grades tanking, 'cause I was trying to do music and biology. So yeah, the pre-med is what got me into biology. And then the biology is what got me into neuropsych and cognitive studies. And then that got me into statistics, which is where I am today. The one little bit that your audience will appreciate is that, uh, I, I discovered last year, uh, two years ago, I was diagnosed with ADHD, so neurodivergence.
And that explained a lot of why I wasn't able to focus in my early days. And it explained a lot of how I was able to do work as a functioning adult because I would procrastinate until the last moment and then do it procrastinate and do it. Mm-hmm. And I got very good at turning those time periods into small chunks.
So then I would still procrastinate, but I would do it, and then I would do these little tasks along the way. So that's something that, yeah, hopefully your audience finds some inspiration and also some encouragement. Along those lines.
Alex: That's very interesting. I haven't summarized it statistically, but I think maybe even the majority of guests on this podcast have something to do with music.
They play instruments, they used to be musicians, and so on and so on. And I myself used to be a music producer and a musician also, and I had my own studio and, and it was a large, uh, large part of my life at some point. I was very curious, are there any lessons from music, like playing music or understanding music, understanding harmony, understanding structure, that you found useful in your work as a scientist or statistician?
Eric Daza: It's a fantastic question that I have not really thought about in any depth, but it, it has definitely come up because, uh, in, in these quant fields, again anecdotally, it, it, there does seem to be a sense that there are a lot of folks who have some tie to music, um, or some kind of creative outlet, but music comes up a lot.
The closest, off the top of my head, that comes to mind, there's two things. One is, uh, um, I also used to write a little bit. I write a little bit of music and, uh, I love the notation structure, and I would just write those notes and yada yada. And if you're familiar with Western musical notation, they come in bars.
And bars are kind of like time periods that repeat. So, you know, in my head I'm like, maybe, maybe that, it's gotta be, right? That's gotta be some part of what, in the back, put me into this time series and N-of-1 world, that's like, it really clicked for me. The second thought is, uh, this is super nerdy. Uh, I'll just reveal, it's not really an Easter egg, but uh, the newest paper I've been working on that will hopefully be in print, uh, this year.
Uh, in it I defined a quantity, uh, that's a, uh, so like, um, at a given point in time, say I have like my migraine probability, but maybe it's been affected by my coffee drinking in the past. Um, or there's different patterns of coffee drinking I could have taken to affect that migraine probability. Uh, and, uh, and now I have my, my, uh, potential, uh, outcome of my migraine probability if I drink coffee today versus I don't, given my entire past.
Right? So if you average over all of those past streams of drinking coffee or not drinking coffee, you have this average potential outcome and you have them under both the coffee drinking and the not coffee drinking conditions. So I call that in the paper, I call that the current average potential outcome.
It's based on an approach that Hudgens and Halloran have taken, you know, some folks, a lot of folks in the interference literature. Um, so side plug, this happens a lot when you have ADHD, side thoughts. Side plug, uh, Hudgens was my advisor in grad school, so that's another, that's another very clear reason I got interested in this area of N-of-1, 'cause of the interference.
Anyways, it's called the current average potential outcome, and current average potential outcome spells CAPO. So from my work as a musician, the capo, right, like that was on purpose. That was definitely on purpose because I also love wordplay and puns.
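One way to write down the quantity Eric describes, as a rough sketch only: the notation below is assumed for illustration and is not taken from his paper, and weighting each past history by its probability is one possible reading of "average over all of those past streams". The contrast on the last line is likewise an assumed way of comparing the two conditions.

```latex
% Assumed notation (hypothetical, not from the paper):
%   A_t \in \{0,1\}          treatment at time t (e.g., coffee today or not)
%   \bar{a}_{t-1}            one possible treatment history up to time t-1
%   Y_t(a, \bar{a}_{t-1})    potential outcome at time t (e.g., migraine probability)
\[
\mathrm{CAPO}_t(a) \;=\; \sum_{\bar{a}_{t-1}}
  Y_t\!\left(a, \bar{a}_{t-1}\right)\,
  \Pr\!\left(\bar{A}_{t-1} = \bar{a}_{t-1}\right),
\qquad a \in \{0, 1\}
\]
\[
\mathrm{CAPO}_t(1) - \mathrm{CAPO}_t(0)
\]
```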
Alex: Taking advantage of the, of the fact that we got a little bit back into, into the main topic of our conversation, or main technical topic of our conversation, so to say.
Um, N-of-1 trials. Yeah. I wanted to ask you, uh, what are some of the most interesting or surprising results that you've seen? Um, as conclusions from the single, single individual trials. And I'm asking here about your own research, but also anything else that you've seen in the literature that really drew your attention.
Eric Daza: There's nothing, nothing. No one thing particular comes to mind. It's, it's really, you know, looking at the literature, for example, on migraines, um, and irritable bowel syndrome, and just seeing the heterogeneity across the, the different participants. I mean, nowadays you can find a lot of studies that have, that do secondary data analysis.
They'll take, uh, a paper, uh, that has a lot of heterogeneity, or perhaps there doesn't seem to be, uh, like an average, uh, association or an effect. And then they'll look at the different participants and there's a lot of heterogeneity between the participants, right? So there's no one particular finding.
It's, it's more, um, this realization that looking at N-of-1s is kind of the natural progression if you're interested in heterogeneous treatment effects or the current, or I'm sorry, the, um, conditional average treatment effect. Right? So in both of those cases, you're essentially looking at smaller and smaller subgroups of people.
Well, if you keep going down that logical chain, you get to one person. And, and in fact, in my opinion, and that of some of my colleagues, there's a misnomer in the literature of calling that the individual treatment effect or individualized treatment effect. And it's really not, it's a small group treatment effect.
It's just that the groups keep getting smaller. So it's, it's an individual treatment effect if, uh, you know, if everybody in the group has that same effect, then sure, but the, the bottom-up approach would be an N-of-1 approach. And it doesn't always work, right? It, it works for recurring conditions. If you have something irreversible like death, then you can't study it like an N-of-1 because it only happens once.
Um, but if it's recurring, uh, then you can study it that way. So, yeah,
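The distinction Eric draws here, a small-group average versus a per-person effect estimated from that person's own repeated periods, can be illustrated with a toy simulation. This is a minimal sketch under made-up assumptions: the participants, effect sizes, noise level, and the helper names `simulate_person` and `within_person_effect` are all hypothetical and do not come from any study mentioned in the episode.

```python
import random
from statistics import mean

def simulate_person(true_effect, n_periods=40, seed=0):
    """Simulate one person's recurring outcome over randomized periods.

    Half of the periods are treated and half are control, in random order.
    """
    rng = random.Random(seed)
    schedule = [True] * (n_periods // 2) + [False] * (n_periods // 2)
    rng.shuffle(schedule)  # randomize the order of treated/control periods
    records = []
    for treated in schedule:
        effect = true_effect if treated else 0.0
        outcome = 5.0 + effect + rng.gauss(0, 0.5)  # baseline + effect + noise
        records.append((treated, outcome))
    return records

def within_person_effect(records):
    """Difference in mean outcome between treated and control periods."""
    treated = [y for a, y in records if a]
    control = [y for a, y in records if not a]
    return mean(treated) - mean(control)

# Hypothetical subgroup: two people helped by the treatment, two harmed.
true_effects = {"p1": 2.0, "p2": -2.0, "p3": 2.0, "p4": -2.0}

estimates = {
    pid: within_person_effect(simulate_person(effect, seed=i))
    for i, (pid, effect) in enumerate(true_effects.items())
}
group_average = mean(list(estimates.values()))

for pid, est in estimates.items():
    print(f"{pid}: estimated effect {est:+.2f}")
print(f"subgroup average: {group_average:+.2f}")
```

In this toy setup the subgroup average lands near zero even though every individual has a clear effect, which is the sense in which a subgroup effect is not an individual effect. As Eric notes, the within-person comparison only makes sense for recurring outcomes.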
Alex: I'm very much with you regarding the, the misnomer diagnosis here, especially when it comes to individual treatment effects, because clearly, uh, conditional average treatment effects are not individual effects. Some people call them, as you mentioned, individualized treatment effects.
And I think that's a better name because, well, that makes sense. They are more individual than, than the average. Uh, but the individual treatment effect is a, is a purely counterfactual quantity, right, that we never have access to. Uh, if we look from the, um, paradigm identification point of view. And I think, I think that's, that's good to make this distinction here.
I remember I was reviewing a paper for a British journal of, uh, of psychology, maybe? I don't remember. Yeah. British Journal of Psychology. And, and there was a review article about causal inference methods, and that was one of my main comments, that, that using this nomenclature can be very, very confusing for some people, right?
Because without a clear definition, it can mean, uh, many, many different things. Uh, before we, before we move forward, one thought that comes to my mind often when I think about designs like this, like N-of-1 trials, is about the fact that we might not always know if there are some changes in the background, in the distribution of, of the outcome, right?
That we, that we just are not aware of. How do you deal with things like this, that there, there is no stationarity for some factors, maybe, maybe for all, maybe for just some of them?
Eric Daza: As always, I have a trailing thought from the last thing, uh, that I'll mention, uh, briefly first, but, uh, with the individualized treatment effect, um, even with N-of-1s, you still have the fundamental problem of causal inference, right?
So I can get an average individual treatment effect for me. But the fundamental problem, that at any given point I only took one exposure, one treatment, so I don't know what the other one is, it's still there, but it's now just done over time. So, just, just a quick note there. So, with background, uh, information or like non-stationarity, this is definitely one of the upfront challenges.
In an N-of-1 trial, uh, a classic N-of-1 trial, it's designed by the clinician, uh, and co-designed with a patient. Uh, and they know what the exposures are, they're, they're therapeutics, they're drugs, and so you control a lot of the, the components that go into the model, right? Because you know what the treatments are, number one. So you know how long they stay in your body, number two.
So that means you can wait for it to wash out before you randomize again. Um, and number three, you're randomizing it. So you've already controlled that piece. You only do it within, say, a few weeks. So you know that the time series that are involved are fairly stationary by design. So that's in that highly controlled setting of an N-of-1 trial.
And you're also, you know, you're talking to your patients, so you know if, uh, they, you know, got sick, or you're aware of other things that could impact, uh, whatever that outcome is. In the real world, it is, uh, it is tough and, uh, I have no good answer here. I did set up a framework to address those concerns, to formalize, you know, um, what the variables should be called and, and whatnot.
There is work that's being done on how to conduct this kind of analysis if the time series are not stationary. Um, so I think like, uh, people have taken, uh, Markov chain type processes, uh, to look or like, um, what is it? Um, Markov states, um, and now I'm blanking, but, uh, that idea where you could look at the latent state and then look at the transitions, right?
In my opinion, that changes the question because if you look at a transition probability, it's, it's useful of course. Um, but it's a slightly different question maybe from your original question. So there's always like trade-offs. Um, in my 2018 paper where I set up this framework, I did address, uh, stationarization.
So you could try to stationarize your, your, uh, time series by taking like the first difference. But that changes the study question. Now you're asking about the derivative, the difference. Again, not that it's bad, but it's changing what you're answering and you should be aware of that. So, so there are some, some things you can do like that.
The trade-off is they change the study question, and then of course if there's factors in the background, you know, you're gonna need some diagnostics in your modeling to see, oh, is, is the time series still shifting? Um
Alex: mm-hmm.
Eric Daza: Like, even with stationarity, like, uh, in, in my 2018 paper, in the analysis I conducted, I, I ran, uh, two tests for stationarity on the residuals, because if they don't hold, it's, it's like the independence assumption in group-based designs, like you need that assumption to hold.
And if not, you gotta find a way. Maybe your time series is stationary from here to here and then from here to here. So you can segment it that way. Changepoint detection and other methods are used to try to find those stationary periods, you know, so that's another, uh, remedial strategy, I think.
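The first-differencing idea mentioned above can be sketched on simulated data. This is a minimal illustration, not the analysis from the 2018 paper: the drifting random walk, the function names, and the crude "compare the two halves" check are all assumptions made for the example. A real analysis would use formal stationarity tests (for instance augmented Dickey-Fuller or KPSS) rather than this eyeball check.

```python
import random
from statistics import mean

def random_walk_with_drift(n, drift, seed=0):
    """Simulated non-stationary series: each value adds drift plus noise."""
    rng = random.Random(seed)
    level, series = 0.0, []
    for _ in range(n):
        level += drift + rng.gauss(0, 1)
        series.append(level)
    return series

def first_difference(series):
    """Return y[t] - y[t-1]; the result is one element shorter."""
    return [b - a for a, b in zip(series, series[1:])]

def half_mean_gap(series):
    """Crude diagnostic: absolute gap between first-half and second-half means."""
    mid = len(series) // 2
    return abs(mean(series[mid:]) - mean(series[:mid]))

raw = random_walk_with_drift(n=200, drift=1.0)
diffed = first_difference(raw)

raw_gap = half_mean_gap(raw)      # large: the level keeps shifting over time
diff_gap = half_mean_gap(diffed)  # small: the changes look stable over time

print(f"raw series gap between halves:         {raw_gap:.2f}")
print(f"differenced series gap between halves: {diff_gap:.2f}")
```

The differenced series is much more stable, but, as Eric points out, any effect estimated on it is an effect on the change in the outcome rather than on its level, so the study question has shifted.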
Alex: What's next for N-of-1 trials? Uh, what are the big, uh, questions or big challenges that, maybe for somebody who, who's, uh, interested in those topics, you would say it would be worthwhile for them to focus on these areas to move the entire field forward?
Eric Daza: Oh my gosh. Well, um, there's the, the, there's the field of N-of-1 trials itself.
And, uh, those, because of the, the, like, the fantastic work they're doing in Australia at the N-of-1 collaborative network, they're really helping to, uh, they've been helping for, for decades. Again, they've been helping to popularize the design and the implementation of it. They actually run a consulting service, so listeners should check them out.
You can get involved with that group. I think they're called the N-of-1, um, it's N-of-1 something, but they have this, it's this consulting group that's affiliated with that team. And then for my area, um, mine's at Stats-of-1. We're trying to connect a lot of different fields, not just N-of-1 trials and studies and single-case designs.
So, like, I'm really excited about functional data analysis in that area. I'm actually in the middle of a conference right now that I'm attending virtually, um, on digital health, uh, technologies and digital health analysis and designs. And there are a lot of my colleagues and talks that focus on person-specific designs.
And they're looking at not just the means and the medians that are repeated, but your curves of, again, I mentioned like step count, but you can also look at blood glucose. Um, if you are a type 1 or type 2 diabetic, you can look at athletic performance. So all those things can have these interesting curves that are average for you that, uh, perhaps are different, um, are likely different under different conditions.
So functional data analysis is another place. And then I very recently got, uh, turned onto this really, really nice, uh, I think it's just, it's a preprint for now. I hope it gets published. It connects N-of-1 and causal inference to, uh, control theory and system identification.
Alex: Mm.
Eric Daza: And that is fantastic. Like, I, I, I, I can't go into that myself.
You know, I don't have time, but I've wanted to see that connection for a few years now, because one of my early colleagues in N-of-1, Eric Hekler, uh, brought that up. He's a psychologist by training. He's at UCSD, and, uh, he's been interested in using those engineering techniques to, to ask questions about human behavior and like, how can we optimize to get you back to a stable point or to bring you to a stable point?
Can we use these, uh, impulse response system identification techniques from control theory, uh, to do that? So yeah, this new preprint is out. Uh, it's just, it's really, it's really yummy. I hope to see it published. Yeah.
Alex: that sounds very timely when we think about agentic systems as well. Right? Uh, this, this, the problems, the problem with learning, uh, reliable world models is, is I think very, very crucial today.
And it's probably one of those problems that are at the, at the forefront of, uh, of, of the questions that we need to answer somehow if we want to move forward with the entire field. So that sounds very, really, very interesting and that was actually one of the questions I had in the back of my head and was thinking about asking you if we have still some time, but you already partially answered, so that's really great.
Eric, what are two books that changed your life?
Eric Daza: Let's see. So when I was younger, I, uh, I read a lot of sci-fi and, uh, a book that has kind of, uh, really appeared a lot in my later professional life was, it was the, uh, the Foundation novels by Asimov. And I read those probably in high school, maybe college.
But I didn't realize how foundational those books would become to my current career, right? Because it's all about, uh, using, you know, data and, uh, an understanding of human behavior to try to predict large events in human history. And lo and behold, I am now a statistician. So that book, in retrospect, had an outsized impact on, uh, the rest of my life.
It's one of those things where I, I, I'm, I wasn't aware of it at the time, or at least, I can't remember if I was aware of it when I was thinking of going into statistics, but clearly now it's become, um, a big piece of, of what I do. And it's a book I will, uh, try to encourage other folks to read because you can, you know, get this wonderful, playful idea or also dark idea of how things can go.
Sci-fi is a wonderful field because it's, it offers, it's human beings trying to, uh, see these, they're not counterfactual yet 'cause they don't exist, but these possible futures, these potential future outcomes. Right. And uh, and that book did, did a great job of it, it really kind of, um, resonated with me when I was, when I was younger.
Yep. So that's, that would be the one that's, uh, at least relevant to this conversation.
Alex: What would be your advice for people who are just starting in an advanced field like statistics, machine learning, causality?
Eric Daza: Get into the data, get, you know, uh, start working on things. Start coding, start working through the math.
Really just start doing the work. Uh, don't be, don't, uh, it's, it's easy to find it very daunting. Uh, I certainly did, and I just kind of hammered at it. But what I learned to appreciate were these little steps, and the little steps you take, that, you know, the five, uh, practice problems in the back of the textbook or on the, on the webpage that you do, they start to feel really good.
And then before you know it, you're much further up the mountain than you thought. I'm sure I've, I'm probably stealing this from some philosopher somewhere, but I like to think of it like the hypotenuse, uh, for the quantitative folks, where, you know, you have like this big step that you, it's like, oh my gosh, I, I want to take this big step forward.
Ah, but it's so big. But if I break it up, okay, now it's a little more manageable, and then you break it up and it's a little more manageable. And if you add up those distances, they still add up to the same big step. But if you take it in the limit to that hypotenuse, ah, it's actually shorter than that. So you've, so you've kind of, in a strange way, uh, mentally, you know, emotionally, psychologically, you've helped yourself get there by taking all these little steps.
So really the summary is like, just start, find something interesting, collect that data or find that data and start, uh, tinkering with it, playing with it, analyzing it. What's the study question, but, um, you know, what are the data variables I need? And you'll start to learn very quickly about the things that we have to deal with.
Oh, the data are not clean. Oh, I can't even get the data from this. Oh, what time did I collect that data? All these like wonderful little challenges that you never really thought of. You won't see them until you start trying to do it. So just mm-hmm. Start trying to do it. That's pretty, pretty much it. Just do it.
But, you know, Nike said that, so I can't say it.
Alex: I think that's great advice. What question would you like to ask me?
Eric Daza: Oh my gosh. Um, I think you, you, uh, actually partially answered it already, which was, I was curious about how you got into this, uh, this field yourself. Well, actually, I guess you, you have it 'cause you talked about you, you did a lot of work, uh, musically, uh, which explains your finesse with the podcasting.
So I am very glad of that. Yeah. How did, yeah, what was that journey like for you? Like how did you get in?
Alex: That's a long story. Um, but in a nutshell, um, I think, you know, the real starting point I think was when I was a child. I just imagined that, you know, the world seemed very big for me. But, um, but I imagined that if you change your perspective, you, if you just like, kind of zoom out.
You yourself can see yourself as like a very small point, right? Like, um, like, just like an in, like a tiny insect on, on a huge dog or something. And then you zoom out and the earth itself becomes like such a small point and so on and so on. And so this was one of the moments I, I think really that really somehow switched something in me when, when I had this realization.
I was, I was, I don't know, maybe four or five years old back then. Um, and, and I think that was one of the things that inspired me to go and study philosophy later on. Hmm. And when I started, when I started studying philosophy, I got interested in, uh, in logic, in formal logic and, and this kind of stuff.
And later on in my life, I had a long period of, uh, relatively long period of time when, when I was doing music professionally, I had my own studio. Uh, later on I decided to go back and study something that was very interesting, became very interesting to me at the moment. And it was psychology and neuroscience.
Hmm. So when I went to study, uh, psychology, uh, in the first year I had stats class in, in introductory stats course. And I just, in the stats course, you know, I was, I was, I was always interested in science. So I, I like to go to Wikipedia and like read about physics or something like this. And, um, one of the, one of the huge, uh, challenges for me was mathematical notation.
You know, I, because yeah, I was just like, I just didn't know any, any, any more complex mathematical notation. And so in the stats class, and that's, I don't know, that's, um, even maybe embarrassing to, to an extent. But I remember that the, the, our professor, the teacher, she was, she was an amazing teacher, really amazing teacher.
She noted some formula, maybe even for, just for, for the average, using the sigma notation for the sum, right. And I didn't know what, what that is. I don't know how, but I didn't know. And I was like, sorry, what, what does it mean? You know? And she was like, you don't know? I, no, I don't know. So it's sigma, you just sum, you just sum the values.
And I was like, what? I've seen these, these Wikipedia pages and I thought this is something very complex, and this is such a simple idea. And, and it just like opened my mind so much, because I understood that now, if I understand the sigma, I can understand a lot of stuff. And if that was so simple, all, all other notation probably is not much more complicated.
It just maybe, you know, maybe it takes more time, but I can just learn it. That was a time when I just, just, something switched in my mind and opened me to, to statistics and I got very, very fascinated in statistics. Uh, but then doing my own research, I realized that, you know, um, those methods, I had so many questions, I dunno, mediation analysis.
I was taught mediation analysis in a, in a statistical approach for mediation analysis, which, which is not really, uh, causally meaningful, I would say. Uh, which, which I understood, uh, years later when, when I started reading Pearl and the counterfactual approach to, to mediation analysis. Uh, but, but somewhere on the way I got, I got, I heard about machine learning, um, and then I got fascinated with machine learning and I said like,
I need to understand this. So I started studying machine learning on my own, and I started working in industry and so on. And, and, and this problem, uh, that certain questions that we, we actually cannot answer certain questions with machine learning, similarly to what my experience was, uh, when, when I was studying, uh, psychology and when I was trying to apply those statistical methods that I loved, you know, to answer different types of questions, I, I was, I was always hitting the same wall.
Mm-hmm. And so this brought me to, to causality. That was, and I think there were many coincidences on the way. You know, I just was at some conference, somebody said something about, hey, The Book of Why, somebody said something about something else, you know? And I started reading about it. So that was my journey, uh, in a, in a nutshell.
Uh, I think it's, it's, um, yeah. Um. That, that, that's it in a nutshell.
Eric Daza: Wonderful. Thank you. Yeah. I, I, I apologize. I realize coming out my mouth like, oh no, this is a very big question. Usually I'm good at being more targeted with the questions. That's fine. That's very, I I'm glad to hear that we, uh, we seem to hit a lot of, uh, very similar notes in our path.
Alex: Yes, yes. I, I feel so it's music and neuroscience, right? And, and statistics.
Eric Daza: Yeah. And for also, again, for your, your audience, the, um, um, I love what you were saying, your, your story about the Sigma notation. And I know you and I talked about this a little earlier, but, uh, statistics and math for me was very hard.
It was specifically stats, is one of the hardest things I have ever done, and that's part of my story for how I got into it. Um, it was something I could really focus on. Uh mm-hmm. Right. And in retrospect, now that I understand my own context, my conditional, uh, statement that I, uh, have ADHD, uh, when you find something like that, that you could focus on, it's, it's a gold mine because now you obsess about it.
It's so hard. Oh, I gotta figure this out. And, uh, and lemme tell you, and even in my first year of the doctoral program at the age of, I think I was 28 at the time, you know, my, my fellow students, almost all of them had come straight from undergrad. They were quant background people. Uh, they'd been doing math notation for a long time, and they kept doing it.
And I was so happy when I, I remembered how to integrate, like, oh, this is how you integrate. Uh, and even in stats, right? The big X versus the little x notation, I, I got so confused. Like it's the same letter and, you know, you have to learn it's not the same letter. But because of that, all those little things felt so good for me.
And so now I'm very, uh, proud of, and like I really focus on, um, can we, can we say this in a mathematical way? 'cause I spent a lot of time learning those skills, which are really, for me, really hard. Um, so yeah. Anyways, just for your audience to know, like, yes, you could do it. And like I, it was, I wouldn't call myself a natural by any means.
I worked my butt off at it. And, uh, and now it's become very satisfying. Yeah.
Alex: That's really great. And, and congrats on your, and congrats on your perseverance. It's, uh, I think it's, it's really, really impressive, Eric.
Eric Daza: Thank you. Yeah, yeah, yeah.
Alex: Yeah. Before we finish, uh, what's your message to the causal python community or causal community in general?
Eric Daza: Oh, wow. Uh, I would say, you know, I keep coming back to Rubin's quote, "design trumps analysis." Uh, I don't know if at this point it's some kind of a mantra, but, uh, that really is important. Um, but it extends even beyond causality because, just thinking about, you know, what the study question is. What, what is the question you're trying to answer?
Uh, that is the root of really good science. Um, because from that question, you start to, you start to create the details, the specifics. What does the question mean? What does it specifically mean? What do, uh, what variables do I need to answer that question? It doesn't even have to be quantitative. It's just, what are the pieces I need to answer that question.
So when, when you go into a doctoral program, that's really the, the big lesson that you get. It's, in my opinion, it's not so much about the field of study. It's training in the ability to ask and then start to answer a question deeply. Um, you're trained to push, push yourself on. On what the question is, what does that mean?
Now specify, okay, what are you assuming when you specify this, this, this, and this? Are those assumptions okay? Yeah. So that's really it. Now I'm not saying go get a, go get doctoral training or anything, but that's the skillset that I find very useful in approaching questions and, okay, so I'm gonna go even bigger.
Um, stats for me has been really fulfilling 'cause it helps even with my personal life and, uh, just thinking about, oh, why did this person react this way? Or Why did this person love this? Or like, they were upset at this when I said it, or they were upset at somebody else, or they really loved it when this person did this.
It helps, helps you kind of tease apart what's going on in your life. You can apply that to personal things, to, uh, politics, to, you know, your budget, whatever. But those skills are things that I learned, for me, through, uh, the practice of statistical reasoning. So, mm-hmm. That was not a short answer. I, I'm trying to keep it short, but
Alex: Eric, thank you so much.
I, I really appreciate, uh, I really appreciate your time. It was a great conversation and I hope to see you again in the podcast in some time.
Eric Daza: Thank you, Alex. Yeah, thank you so much for having me on your show. Um, I really love this show, so it's a, it's a treat for me to be on, so thank you.
Alex: Great. Amazing.
Thank you so much.