An individualized treatment rule assigns treatments to individuals based on their measured covariates. The value of a rule is the population mean outcome when treatment is assigned according to that rule, and an optimal rule is a rule with the maximal value. The estimator presented in Luedtke and van der Laan (2015) estimates this maximal value, the optimal mean outcome, by (i) estimating the optimal rule on a subset of the data, and (ii) estimating the mean outcome under this estimated rule on a new chunk of data. Suppose we do this five times, so that we have five estimates of the optimal mean outcome, each based on its own estimated rule. The dots above have the following interpretation:
Blue dots: True mean outcome under each of the five rules estimated in (i);
Red dot: Optimal mean outcome.
By definition, each blue dot lies at or to the left of the red dot, since no rule can have a larger mean outcome than an optimal rule. Because the data used in (ii) are not used to estimate the rule in (i), the sample splitting ensures that each estimate from (ii) is minimally biased for its corresponding blue dot.
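The ordering of the dots can be written in symbols. The notation here is assumed for illustration and is not taken from the papers cited below: V(d) denotes the mean outcome under rule d, d_0 an optimal rule, \hat d_1, ..., \hat d_5 the five estimated rules, and \omega_1, ..., \omega_5 the weights of a weighted average.

```latex
\[
V(\hat d_k) \le V(d_0) = \max_d V(d), \qquad k = 1, \dots, 5,
\]
\[
\sum_{k=1}^{5} \omega_k V(\hat d_k) \le V(d_0)
\quad \text{whenever } \omega_k \ge 0 \text{ and } \sum_{k=1}^{5} \omega_k = 1 .
\]
```

The second line is why a lower confidence bound for a weighted average of the blue dots is also a lower confidence bound for the red dot.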
Our final estimate of the optimal mean outcome is a weighted average of the estimates of the five blue dots. This estimate is unbiased for the corresponding average of the five blue dots, and that average falls at or below the red dot. It is thus unsurprising that the lower bound of our confidence interval for the optimal mean outcome, which is really a lower bound for a weighted average of the blue dots, is valid regardless of how well the optimal rule is estimated. Of course, estimating the optimal rule well (blue dots close to the red dot) will tighten this lower bound.
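The split-estimate-average idea can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the estimator from the paper: it assumes a randomized trial with known treatment probability 0.5, estimates each rule with two straight-line fits, estimates each blue dot by inverse probability weighting on the held-out fold, weights the folds by their size, and uses a normal approximation for a one-sided 95% lower bound. All function names and the simulated data are hypothetical.

```python
import math
import random
from statistics import fmean


def simulate_trial(n, rng):
    """Hypothetical trial: treatment helps when w > 0 and harms otherwise."""
    rows = []
    for _ in range(n):
        w = rng.gauss(0, 1)
        a = rng.randint(0, 1)  # randomized treatment, P(a = 1) = 0.5
        y = 0.5 * a * math.copysign(1, w) + rng.gauss(0, 1)
        rows.append((w, a, y))
    return rows


def fit_line(pairs):
    """Least-squares intercept and slope for y regressed on w."""
    w_bar = fmean(w for w, _ in pairs)
    y_bar = fmean(y for _, y in pairs)
    sxy = sum((w - w_bar) * (y - y_bar) for w, y in pairs)
    sxx = sum((w - w_bar) ** 2 for w, _ in pairs)
    slope = sxy / sxx
    return y_bar - slope * w_bar, slope


def fit_rule(train):
    """Step (i): treat when the treated-arm line exceeds the control-arm line."""
    b1, s1 = fit_line([(w, y) for w, a, y in train if a == 1])
    b0, s0 = fit_line([(w, y) for w, a, y in train if a == 0])
    return lambda w: int(b1 + s1 * w > b0 + s0 * w)


def lower_bound_optimal_value(rows, n_folds=5, p_treat=0.5, z=1.645):
    """Return (estimate, lower bound, per-fold estimates of the blue dots)."""
    n = len(rows)
    fold_values, fold_sizes, squared_dev = [], [], 0.0
    for k in range(n_folds):
        train = [r for i, r in enumerate(rows) if i % n_folds != k]
        held_out = [r for i, r in enumerate(rows) if i % n_folds == k]
        rule = fit_rule(train)
        # Step (ii): inverse-probability-weighted outcome under the estimated
        # rule, computed only on data not used to estimate that rule.
        terms = []
        for w, a, y in held_out:
            d = rule(w)
            prob = p_treat if d == 1 else 1 - p_treat
            terms.append((a == d) * y / prob)
        value = fmean(terms)
        fold_values.append(value)
        fold_sizes.append(len(terms))
        squared_dev += sum((t - value) ** 2 for t in terms)
    # Weighted average of the fold estimates (weights = fold size / n; assumed).
    estimate = sum(m * v for m, v in zip(fold_sizes, fold_values)) / n
    std_err = math.sqrt(squared_dev / n) / math.sqrt(n)
    return estimate, estimate - z * std_err, fold_values


rng = random.Random(0)
rows = simulate_trial(5000, rng)
estimate, lower, fold_values = lower_bound_optimal_value(rows)
print(f"estimate={estimate:.3f}, one-sided 95% lower bound={lower:.3f}")
```

In this simulated trial the optimal rule treats exactly when w > 0, so the optimal mean outcome (the red dot) is 0.25 by construction, which gives a reference point for the printed lower bound.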
In a clinical trial setting, this means that we can give a lower bound on the efficacy of a drug when treatment is individualized according to patient covariates. The tightness of the lower bound depends on how well the optimal rule can be estimated.
The estimator described above is introduced in
A. R. Luedtke and M. J. van der Laan, “Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy,” Annals of Statistics (to appear), 2015. [link]
In that work we also give conditions for the validity of an upper bound for the optimal mean outcome. An estimator of the optimal rule with proven optimality guarantees (it gets the blue dots as close to the red dot as possible) is described in
A. R. Luedtke and M. J. van der Laan, “Super-learning of an optimal dynamic treatment rule,” International Journal of Biostatistics (to appear), 2014. [tech rep]