Irving Street Functionality

This is supposed to be a blog.

posted 17 Jun, 2022

Experiment Selection for Causal Discovery; 2013

author Antti Hyttinen, Frederick Eberhardt, Patrik O. Hoyer
title Experiment Selection for Causal Discovery
howpublished Journal of Machine Learning Research 14
year 2013
where Helsinki Institute for Information Technology; California Institute of Technology
url https://www.jmlr.org/papers/volume14/hyttinen13a/hyttinen13a.pdf

I read this paper as part of a research project to build a static analysis tool for validating experiment design. While the basic model they're operating within is the one we want, it's not yet clear which of the results they summarize we can actually use.

Common assumptions:

Common conditions for experiment sets:

These are frequently quantified over all pairs of vertices:

The discussion on page 5 suggests that we don't really care about the Ordered Pair Condition?
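To keep myself honest about what the two pair conditions actually demand, here's a small checker in my own paraphrase of the paper's definitions, representing an experiment just by its intervention set (everything not intervened on is assumed observed):

```python
from itertools import combinations, permutations

# An experiment is represented by its intervention set J; everything else
# is assumed to be passively observed in that experiment.
def satisfies_unordered_pair(experiments, x, y):
    """Some experiment intervenes on exactly one of {x, y}."""
    return any((x in J) != (y in J) for J in experiments)

def satisfies_ordered_pair(experiments, x, y):
    """Some experiment intervenes on x while observing y."""
    return any(x in J and y not in J for J in experiments)

def unsatisfied_pair_conditions(variables, experiments):
    """List the pair conditions a batch of experiments leaves uncovered."""
    unordered = [set(p) for p in combinations(variables, 2)
                 if not satisfies_unordered_pair(experiments, *p)]
    ordered = [p for p in permutations(variables, 2)
               if not satisfies_ordered_pair(experiments, *p)]
    return unordered, ordered
```

E.g. the experiments [{a}, {b}] over {a, b, c} satisfy every Unordered Pair Condition but leave the ordered pairs (c, a) and (c, b) uncovered, since c is never intervened on.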

Some more set-oriented properties:

Fig. 3 (page 9) shows how to construct experiments satisfying all Unordered Pair Conditions.
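As I read it, the construction amounts to indexing the variables in binary: experiment k intervenes on every variable whose k-th bit is 1, so ceil(log2 |V|) experiments separate every pair. A sketch of that idea, not the paper's figure verbatim:

```python
from math import ceil, log2

def unordered_pair_experiments(variables):
    """ceil(log2 |V|) experiments whose intervention sets separate every pair.

    Variable i gets the binary codeword of i; experiment k intervenes on
    the variables whose k-th bit is 1.  Any two distinct codewords differ
    in some bit, so some experiment intervenes on exactly one of the pair.
    """
    variables = list(variables)
    n = len(variables)
    k = ceil(log2(n)) if n > 1 else 0
    return [{v for i, v in enumerate(variables) if (i >> bit) & 1}
            for bit in range(k)]
```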

Fig. 7 (page 14) and Algorithm 2 show how to construct a fixed-size batch of experiments that is minimal with respect to interventions, while satisfying all Unordered Pair Conditions.
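If I'm reading it right, once the number of experiments K is fixed, minimizing the total number of interventions comes down to handing out distinct K-bit codewords in order of increasing Hamming weight (distinctness is exactly what the Unordered Pair Conditions need). A sketch under that reading, not Algorithm 2 verbatim:

```python
from itertools import combinations

def min_intervention_experiments(variables, k):
    """K experiments over the variables with few total interventions.

    Each variable gets a distinct K-bit codeword (as the set of positions
    that are 1), taken in order of increasing Hamming weight; experiment j
    intervenes on the variables whose codeword contains j.  Distinct
    codewords give every Unordered Pair Condition, and low weight keeps
    the total number of interventions down.
    """
    variables = list(variables)
    if len(variables) > 2 ** k:
        raise ValueError("need at least ceil(log2(|V|)) experiments")
    codewords = [frozenset(c) for w in range(k + 1)
                 for c in combinations(range(k), w)]
    assignment = dict(zip(variables, codewords))
    return [{v for v, code in assignment.items() if j in code}
            for j in range(k)]
```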

Fig. 6 (page 12) shows how to construct experiments satisfying all Ordered Pair Conditions.
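The Ordered Pair Conditions additionally require that no variable's intervention pattern be contained in another's; otherwise the "contained" variable is never intervened on while the other is observed. A construction in the spirit of Fig. 6, as I understand it, is to use only codewords of one middle Hamming weight, which form an antichain:

```python
from itertools import combinations
from math import comb

def ordered_pair_experiments(variables):
    """Experiments satisfying every Ordered Pair Condition.

    Each variable gets a distinct codeword of the same Hamming weight, so
    the codewords form an antichain: for any ordered pair (x, y), x's
    codeword has a 1 in some position where y's has a 0, i.e. some
    experiment intervenes on x while observing y.
    """
    variables = list(variables)
    n = len(variables)
    k = 1
    while comb(k, k // 2) < n:   # smallest K with C(K, floor(K/2)) >= |V|
        k += 1
    codewords = list(combinations(range(k), k // 2))[:n]
    assignment = dict(zip(variables, codewords))
    return [{v for v, code in assignment.items() if j in code}
            for j in range(k)]
```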

Getting minimal experiment batches satisfying all Ordered Pair Conditions is more complicated, and depending on what's being minimized it may not even be a generally solved problem? Section 5.2 (page 16) covers this.

Background Knowledge

Given a graph with vertices/variables V, there are O(|V|^2) pairs of vertices, which means we may be interested in O(|V|^2) [Un]Ordered Pair Conditions. Much of the domain/background knowledge we might bring into a problem can be represented by "checking off" pair conditions as unnecessary. Instead of searching for minimal cut coverings of complete [un]directed graphs over V, we can consider incompletely connected graphs.

Doing this in a minimal way is NP-hard, and basically reduces to graph coloring. This isn't bad: existing approximations of the graph coloring problem can be applied without additional loss.
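Concretely, my sketch of that reduction (not code from the paper): colour the graph of pairs still needing coverage, then give each colour class a distinct binary codeword; any remaining pair has endpoints of different colours, hence different codewords, hence some experiment intervenes on exactly one of them.

```python
from math import ceil, log2

def experiments_from_colouring(variables, pairs_needed):
    """Cover the remaining Unordered Pair Conditions via a greedy colouring.

    `pairs_needed` holds the unordered pairs that background knowledge does
    NOT check off.  Colour that graph greedily, give each colour class a
    distinct binary codeword, and let experiment `bit` intervene on the
    variables whose colour has a 1 in that bit.
    """
    adj = {v: set() for v in variables}
    for x, y in pairs_needed:
        adj[x].add(y)
        adj[y].add(x)
    colour = {}
    for v in sorted(variables, key=lambda u: -len(adj[u])):
        used = {colour[u] for u in adj[v] if u in colour}
        colour[v] = next(c for c in range(len(variables)) if c not in used)
    n_colours = max(colour.values(), default=0) + 1
    k = ceil(log2(n_colours)) if n_colours > 1 else 0
    return [{v for v in variables if (colour[v] >> bit) & 1}
            for bit in range(k)]
```

Any off-the-shelf colouring heuristic could be dropped in for the greedy loop; the number of experiments is then ceil(log2) of however many colours it finds.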

When

then we can (usually? depending on the observability of a "skeleton"?) construct a Markov Equivalence Class representing the possible underlying models that the background knowledge would be incapable of distinguishing.