Causal inference diary: skiing causes snow

LessWrong · Apr 28, 2026, 10:21 PM

I've been playing with causal inference lately, as one does.[1] I was thinking of writing a more formal sequence about how to do causal discovery and model comparison, and I might still do that.[2] Meanwhile, I'm going to start with a sort of informal diary of what I'm learning as I go.

I started a while ago, so this diary entry is coming partway into the process. I'll go back and fill in the history if I write that formal sequence later. Here's the little bit of backstory you need for today's episode to make sense.

I had a data set about diabetes that I found in an ML data repository and I already played a lot with it over the last few weeks. I extracted a set of variables from it that seemed maybe causally connected and ran a bunch of analysis on them. In the end I was disappointed with that set; they all just seemed very entangled and I did not find any causal graph, or set of graphs, that seemed clearly better than the rest.

I started again with a different set of variables. This time, I picked ones that I believed were more likely to tell a story about the data - ideally, a story with at least one branch point, where the value of a variable would have downstream effects on other variables. Indeed, this time I did get clusters of causal graphs that fit the data far better than the rest, and in a way that made sense to me. Hurray!

Today's tasks were to:

- Refactor my code to work over any data set, with separate configuration, DAG-generation, DAG-scoring, and results-reporting modules;
- Add some code to plot DAG fit over all possible DAGs, so I could see the clusters and identify what changed;
- Create a synthetic test data set over a known DAG and run my code over it, to make sure everything was working right.

The code refactor was dull in the best possible way and went just fine. I won't tell you about it.

Plotting DAG scores

When I started this morning, my code was printing out the total log evidence for the best causal model, and also showing me which edges in the DAG were
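
The preview does not show the DAG-generation module mentioned in the task list, so what follows is only a minimal sketch of one way to enumerate "all possible DAGs" over a small variable set. It assumes a brute-force approach: give each pair of variables one of three states (no edge, or an edge in either direction) and keep the graphs that contain no cycle. The function names `is_acyclic` and `all_dags` and the variable labels are hypothetical, not taken from the author's code.

```python
from itertools import combinations, product


def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # Repeatedly remove nodes that have no remaining incoming edges.
    ready = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    # If a cycle exists, its nodes never reach indegree zero.
    return visited == len(nodes)


def all_dags(nodes):
    """Yield every DAG over `nodes` as a tuple of (parent, child) edges."""
    pairs = list(combinations(nodes, 2))
    # Each unordered pair is absent, oriented a -> b, or oriented b -> a.
    for states in product((None, "forward", "backward"), repeat=len(pairs)):
        edges = []
        for (a, b), state in zip(pairs, states):
            if state == "forward":
                edges.append((a, b))
            elif state == "backward":
                edges.append((b, a))
        if is_acyclic(nodes, edges):
            yield tuple(edges)


# Hypothetical variable labels, chosen only for illustration.
dags = list(all_dags(["A", "B", "C"]))
print(len(dags))  # 25 DAGs over three labelled nodes
```

The number of DAGs grows very quickly with the number of variables (25 for three, 543 for four, 29,281 for five), so scoring every graph this way is only practical for small variable sets.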

Article preview — originally published by LessWrong. Full story at the source.