# Under Review

**A. R. Luedtke**, J. Wu, "Efficient Principally Stratified Treatment Effect Estimation in Crossover Studies with Absorbent Binary Endpoints," under review, arXiv:1712.05835, 2017. [tech rep]

Suppose one wishes to estimate the effect of a binary treatment on a binary endpoint conditional on a post-randomization quantity in a counterfactual world in which all subjects received treatment. It is generally difficult to identify this parameter without strong, untestable assumptions. It has been shown that identifiability assumptions become much weaker under a crossover design in which subjects not receiving treatment are later given treatment. Under the assumption that the post-treatment biomarker observed in these crossover subjects is the same as would have been observed had they received treatment at the start of the study, one can identify the treatment effect with only mild additional assumptions. This remains true if the endpoint is absorbent, i.e. an endpoint such as death or HIV infection such that the post-crossover treatment biomarker is not meaningful if the endpoint has already occurred. In this work, we review identifiability results for a parameter of the distribution of the data observed under a crossover design with the principally stratified treatment effect of interest. We describe situations in which these assumptions would be falsifiable, and show that these assumptions are not otherwise falsifiable. We then provide a targeted minimum loss-based estimator for the setting that makes no assumptions on the distribution that generated the data. When the semiparametric efficiency bound is well defined, for which the primary condition is that the biomarker is discrete-valued, this estimator is efficient among all regular and asymptotically linear estimators. We also present a version of this estimator for situations in which the biomarker is continuous. Implications to closeout designs for vaccine trials are discussed.

**A. R. Luedtke**, O. Sofrygin, M. J. van der Laan, and M. Carone, "Sequential double robustness in right-censored longitudinal models," under review, arXiv:1705.02459, 2017. [tech rep]

Consider estimating the G-formula for the counterfactual mean outcome under a given treatment regime in a longitudinal study. Bang and Robins provided an estimator for this quantity that relies on a sequential regression formulation of this parameter. This approach is doubly robust in that it is consistent if either the outcome regressions or the treatment mechanisms are consistently estimated. We define a stronger notion of double robustness, termed sequential double robustness, for estimators of the longitudinal G-formula. The definition emerges naturally from a more general definition of sequential double robustness for the outcome regression estimators. An outcome regression estimator is sequentially doubly robust (SDR) if, at each subsequent time point, either the outcome regression or the treatment mechanism is consistently estimated. This form of robustness is exactly what one would anticipate is attainable by studying the remainder term of a first-order expansion of the G-formula parameter. We show that a particular implementation of an existing procedure is SDR. We also introduce a novel SDR estimator, whose development involves a novel translation of ideas used in targeted minimum loss-based estimation to the infinite-dimensional setting.

**A. R. Luedtke** and A. Chambaz, "Faster rates for policy learning," under review, arXiv:1704.06431, 2017. [tech rep]

This article improves the existing proven rates of regret decay in optimal policy estimation. We give a margin-free result showing that the regret decay for estimating a within-class optimal policy is second-order for empirical risk minimizers over Donsker classes, with regret decaying at a faster rate than the standard error of an efficient estimator of the value of an optimal policy. We also give a result from the classification literature that shows that faster regret decay is possible via plug-in estimation provided a margin condition holds. Four examples are considered. In these examples, the regret is expressed in terms of either the mean value or the median value; the number of possible actions is either two or finitely many; and the sampling scheme is either independent and identically distributed or sequential, where the latter represents a contextual bandit sampling scheme.

**A. R. Luedtke** and P. B. Gilbert, "Partial bridging of vaccine efficacy to new populations," under review, arXiv:1701.06739, 2017. [tech rep]

Suppose one has data from one or more completed vaccine efficacy trials and wishes to estimate the efficacy in a new setting. Often logistical or ethical considerations make running another efficacy trial impossible. Fortunately, if there is a biomarker that is the primary modifier of efficacy, then the biomarker-conditional efficacy may be identical in the completed trials and the new setting, or at least informative enough to meaningfully bound this quantity. Given a sample of this biomarker from the new population, we might hope we can bridge the results of the completed trials to estimate the vaccine efficacy in this new population. Unfortunately, even knowing the true conditional efficacy in the new population fails to identify the marginal efficacy due to the unknown conditional unvaccinated risk. We define a curve that partially identifies (lower bounds) the marginal efficacy in the new population as a function of the population's marginal unvaccinated risk, under the assumption that one can identify bounds on the conditional unvaccinated risk in the new population. Interpreting the curve only requires identifying plausible regions of the marginal unvaccinated risk in the new population. We present a nonparametric estimator of this curve and develop valid lower confidence bounds that concentrate at a parametric rate. We use vaccine terminology throughout, but the results apply to general binary interventions and bounded outcomes.

# Publications

R. C. Kessler, R. M. Bossarte, **A. Luedtke**, A. M. Zaslavsky, and J. R. Zubizarreta, "Suicide prediction models: a critical review of recent research with recommendations for the way forward," *Molecular Psychiatry*, 2019. [link]
Suicide is a leading cause of death. A substantial proportion of the people who die by suicide come into contact with the health care system in the year before their death. This observation has resulted in the development of numerous suicide prediction tools to help target patients for preventive interventions. However, low sensitivity and low positive predictive value have led critics to argue that these tools have no clinical value. We review these tools and critiques here. We conclude that existing tools are suboptimal and that improvements, if they can be made, will require developers to work with more comprehensive predictor sets, staged screening designs, and advanced statistical analysis methods. We also conclude that although existing suicide prediction tools currently have little clinical value, and in some cases might do more harm than good, an even-handed assessment of the potential value of refined tools of this sort cannot currently be made because such an assessment would depend on evidence that currently does not exist about the effectiveness of preventive interventions. We argue that the only way to resolve this uncertainty is to link future efforts to develop or evaluate suicide prediction tools with concrete questions about specific clinical decisions aimed at reducing suicides and to evaluate the clinical value of these tools in terms of net benefit rather than sensitivity or positive predictive value. We also argue for a focus on the development of individualized treatment rules to help select the right suicide-focused treatments for the right patients at the right times. Challenges will exist in doing this because of the rarity of suicide even among patients considered high-risk, but we offer practical suggestions for how these challenges can be addressed.

R. C. Kessler, R. M. Bossarte, **A. Luedtke**, A. M. Zaslavsky, and J. R. Zubizarreta, "Machine learning methods for developing precision treatment rules with observational data," *Behaviour Research and Therapy*, 2019. [link]

Clinical trials have identified a variety of predictor variables for use in precision treatment protocols, ranging from clinical biomarkers and symptom profiles to self-report measures of various sorts. Although such variables are informative collectively, none has proven sufficiently powerful to guide optimal treatment selection individually. This has prompted growing interest in the development of composite precision treatment rules (PTRs) that are constructed by combining information across a range of predictors. But this work has been hampered by the generally small samples in randomized clinical trials and the use of suboptimal analysis methods to analyze the resulting data. In this paper, we propose to address the sample size problem by: working with large observational electronic medical record databases rather than controlled clinical trials to develop preliminary PTRs; validating these preliminary PTRs in subsequent pragmatic trials; and using ensemble machine learning methods rather than individual algorithms to carry out statistical analyses to develop the PTRs. The major challenges in this proposed approach are that treatments are not randomly assigned in observational databases and that these databases often lack measures of key prescriptive predictors and mental disorder treatment outcomes. We propose a tiered case-cohort design approach that uses innovative methods for measuring and balancing baseline covariates and estimating PTRs to address these challenges.

**A. R. Luedtke**, E. Kaufmann, and A. Chambaz, "Asymptotically optimal algorithms for budgeted multiple play bandits," *Machine Learning*, 2019. [link] [tech rep]

We study a generalization of the multi-armed bandit problem with multiple plays where there is a cost associated with pulling each arm and the agent has a budget at each time that dictates how much she can expect to spend. We derive an asymptotic regret lower bound for any uniformly efficient algorithm in our setting. We then study a variant of Thompson sampling for Bernoulli rewards and a variant of KL-UCB for both single-parameter exponential families and bounded, finitely supported rewards. We show these algorithms are asymptotically optimal, both in rate and leading problem-dependent constants, including in the thick margin setting where multiple arms fall on the decision boundary.

P. B. Gilbert, Y. Huang, M. Juraska, Z. Moodie, Y. Fong, **A. R. Luedtke**, et al., "Bridging efficacy of a tetravalent dengue vaccine from children/adolescents to adults in high endemic countries based on neutralizing antibody response," *American Journal of Tropical Medicine and Hygiene*, 2019.

H. Qiu, **A. Luedtke**, M. J. van der Laan, "Comment on 'Entropy Learning for Dynamic Treatment Regimes' by Binyan Jiang, Rui Song, et al.," *Statistica Sinica* (in press), 2019. [link]

T. J. VanderWeele, **A. R. Luedtke**, M. J. van der Laan, R. C. Kessler, "Selecting optimal subgroups for treatment using many covariates," *Epidemiology*, 2019. [link] [tech rep]
We consider the problem of selecting the optimal subgroup to treat when data on covariates is available from a randomized trial or observational study. We distinguish between four different settings including (i) treatment selection when resources are constrained, (ii) treatment selection when resources are not constrained, (iii) treatment selection in the presence of side effects and costs, and (iv) treatment selection to maximize effect heterogeneity. We show that, in each of these cases, the optimal treatment selection rule involves treating those for whom the predicted mean difference in outcomes comparing those with versus without treatment, conditional on covariates, exceeds a certain threshold. The threshold varies across these four scenarios but the form of the optimal treatment selection rule does not. The results suggest a move away from traditional subgroup analysis for personalized medicine. New randomized trial designs are proposed so as to implement and make use of optimal treatment selection rules in health care practice.
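The common form of the rule described in this abstract can be sketched in a few lines. The following is a minimal Python illustration, not the authors' implementation: `cate_hat` (estimated conditional mean outcome differences, treated versus untreated), the function name `treat_above_threshold`, and all numeric values are hypothetical. It assumes larger outcomes are better, so that a threshold of 0 corresponds to the setting without resource constraints or costs; other settings would only change the threshold.

```python
def treat_above_threshold(cate_hat, threshold=0.0):
    """Return a treat/do-not-treat decision per individual.

    cate_hat: hypothetical estimates of the conditional mean difference in
    outcomes (treated minus untreated) given covariates.
    threshold: scenario-specific cutoff (assumption: 0 when treatment is free
    and unconstrained; larger when treatment carries costs or side effects).
    """
    return [effect > threshold for effect in cate_hat]


# Hypothetical estimated effects for four individuals.
cate_hat = [-0.2, 0.05, 0.3, 0.1]

no_cost_rule = treat_above_threshold(cate_hat)        # threshold 0
costly_rule = treat_above_threshold(cate_hat, 0.08)   # hypothetical cost of 0.08
```

The form of the decision is the same in both calls; only the cutoff changes.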

L. Wang, **A. R. Luedtke**, and Y. Huang, "Assessing the incremental value of new biomarkers based on OR rules," *Biostatistics*, 2018. [tech rep]
In early detection of disease, a single biomarker often has inadequate classification performance, making it important to identify new biomarkers to combine with the existing marker for improved performance. A biologically natural method to combine biomarkers is to use logic rules, e.g. the OR/AND rules. In our motivating example of early detection of pancreatic cancer, the established biomarker CA19-9 is only present in a subclass of cancer; it is of interest to identify new biomarkers present in the other subclasses and declare disease when either marker is positive. While there has been research on developing biomarker combinations using the OR/AND rules, the inference regarding the incremental value of the new marker within this framework is lacking and challenging due to a statistical non-regularity. In this paper, we aim to answer the inferential question of whether combining the new biomarker achieves better classification performance than using the existing biomarker alone, based on a nonparametrically estimated OR rule that maximizes the weighted average of sensitivity and specificity. We propose and compare various procedures for testing the incremental value of the new biomarker and constructing its confidence interval, using bootstrap, cross-validation, and a novel fuzzy p-value-based technique. We compare the performance of different methods via extensive simulation studies and apply them to the pancreatic cancer example.

**A. Luedtke**, E. Sadikova, R. C. Kessler, "Sample size requirements for multivariate models to predict between-patient differences in best treatments of major depressive disorder," *Clinical Psychological Science*, 2019. [link]

Clinical trials have documented numerous clinical features, social characteristics, and biomarkers that are “prescriptive” predictors of depression treatment response, that is, predictors of which types of treatments are best for which patients. On the basis of these results, research is actively under way to develop multivariate prescriptive prediction models to guide precision depression treatment planning. However, the sample size requirements for such models have not been analyzed. We present such an analysis here. Simulations using realistic parameter values and a state-of-the-art cross-validated targeted minimum loss-based prescription treatment response estimator show that at least 300 patients per treatment arm are needed to have adequate statistical power to detect clinically significant underlying marginal improvements in treatment response because of precision treatment selection. This is a considerably larger sample size than in most existing studies. We close with a discussion of practical study design options to address the need for larger sample sizes in future studies.

**A. R. Luedtke**, M. Carone, and M. J. van der Laan, "An omnibus test of equality in distribution for unknown functions," *Journal of the Royal Statistical Society: Series B*, 2019. [link] [tech rep]

We present a novel family of nonparametric omnibus tests of the hypothesis that two unknown but estimable functions are equal in distribution when applied to the observed data structure. We developed these tests, which represent a generalization of the maximum mean discrepancy tests described in Gretton et al. [2006], using recent developments from the higher-order pathwise differentiability literature. Despite their complex derivation, the associated test statistics can be expressed rather simply as U-statistics. We study the asymptotic behavior of the proposed tests under the null hypothesis and under both fixed and local alternatives. We provide examples to which our tests can be applied and show that they perform well in a simulation study. As an important special case, our proposed tests can be used to determine whether an unknown function, such as the conditional average treatment effect, is equal to zero almost surely.

P. B. Gilbert and **A. R. Luedtke**. “Statistical Learning Methods to Determine Immune Correlates of Herpes Zoster in Vaccine Efficacy Trials”. *The Journal of Infectious Diseases*, 2018. [link]
Using Super Learner, a machine learning statistical method, we assessed varicella zoster virus-specific glycoprotein-based enzyme-linked immunosorbent assay (gpELISA) antibody titer as an individual-level signature of herpes zoster (HZ) risk in the Zostavax Efficacy and Safety Trial. Gender and pre- and postvaccination gpELISA titers had moderate ability to predict whether a 50–59 year old experienced HZ over 1–2 years of follow-up, with equal classification accuracy (cross-validated area under the receiver operator curve = 0.65) for vaccine and placebo recipients. Previous analyses suggested that fold-rise gpELISA titer is a statistical correlate of protection and supported the hypothesis that it is not a mechanistic correlate of protection. Our results also support this hypothesis.

S. Sridhar, **A. Luedtke**, E. Langevin, M. Zhu, M. Bonaparte, et al. “Effect of Dengue Serostatus on Dengue Vaccine Safety and Efficacy”. *New England Journal of Medicine*, 2018. [link]

M. Carone, **A. R. Luedtke**, and M. J. van der Laan, "Toward computerized efficient estimation in infinite-dimensional models," *Journal of the American Statistical Association*, 2018. [link] [tech rep]
Despite the risk of misspecification they are tied to, parametric models continue to be used in statistical practice because they are accessible to all. In particular, efficient estimation procedures in parametric models are simple to describe and implement. Unfortunately, the same cannot be said of semiparametric and nonparametric models. While the latter often reflect the level of available scientific knowledge more appropriately, performing efficient inference in these models is generally challenging. The efficient influence function is a key analytic object from which the construction of asymptotically efficient estimators can potentially be streamlined. However, the theoretical derivation of the efficient influence function requires specialized knowledge and is often a difficult task, even for experts. In this paper, we propose and discuss a numerical procedure for approximating the efficient influence function. The approach generalizes the simple nonparametric procedures described recently by Frangakis et al. (2015) and Luedtke et al. (2015) to arbitrary models. We present theoretical results to support our proposal, and also illustrate the method in the context of two examples. The proposed approach is an important step toward automating efficient estimation in general statistical models, thereby rendering the use of realistic models in statistical analyses much more accessible.

J. M. Platt, K. A. McLaughlin, **A. R. Luedtke**, J. Ahern, A. S. Kaufman, and K. M. Keyes. “Targeted estimation of the relationship between childhood adversity and fluid intelligence in a U.S. population sample of adolescents”. *American Journal of Epidemiology*, 2018. [link]
Many studies have shown inverse associations between childhood adversity and intelligence (IQ), though most are based on small clinical samples and fail to account for the effects of multiple co-occurring adversities. Using data from the 2001–2004 National Comorbidity Survey Adolescent Supplement, a cross-sectional US population study of adolescents ages 13–18 (n = 10,073), we examined the associations between 11 childhood adversities on IQ, using targeted maximum likelihood estimation. Targeted maximum likelihood estimation incorporates machine-learning to identify the relationships between exposures and outcomes without over-fitting, including interactions and non-linearity. The Kaufman Brief Intelligence Test nonverbal score was used as a standardized measure of fluid reasoning. Child adversities were grouped into deprivation- and threat-types based on recent conceptual models. Adjusted marginal mean differences compared the mean IQ score if all adolescents experienced each adversity to the mean in the absence of the adversity. The largest associations were observed for deprivation-type experiences, including poverty and low parental education, which were related to reduced IQ. Though lower in magnitude, threat events related to IQ included physical abuse and witnessing domestic violence. Violence prevention and poverty-reduction measures would improve childhood cognitive outcomes.

**A. R. Luedtke** and M. J. van der Laan, "Evaluating the impact of treating the optimal subgroup," *Statistical Methods in Medical Research*, 2017. [link] [tech rep]

Suppose we have a binary treatment used to influence an outcome. Given data from an observational or controlled study, we wish to determine whether or not there exists some subset of observed covariates in which the treatment is more effective than the standard practice of no treatment. Furthermore, we wish to quantify the improvement in population mean outcome that will be seen if this subgroup receives treatment and the rest of the population remains untreated. We show that this problem is surprisingly challenging given how often it is an (at least implicit) study objective. Blindly applying standard techniques fails to yield any apparent asymptotic results, while using existing techniques to confront the non-regularity does not necessarily help at distributions where there is no treatment effect. Here we describe an approach to estimate the impact of treating the subgroup which benefits from treatment that is valid in a nonparametric model and is able to deal with the case where there is no treatment effect. The approach is a slight modification of an approach that recently appeared in the individualized medicine literature.

**A. R. Luedtke** and M. J. van der Laan, "Parametric-rate inference for one-sided differentiable parameters," *Journal of the American Statistical Association*, 2017. [link] [tech rep]

Suppose one has a collection of parameters indexed by a (possibly infinite dimensional) set. Given data generated from some distribution, the objective is to estimate the maximal parameter in this collection evaluated at this distribution. This estimation problem is typically non-regular when the maximizing parameter is non-unique, and as a result standard asymptotic techniques generally fail in this case. We present a technique for developing parametric-rate confidence intervals for the quantity of interest in these non-regular settings. We show that our estimator is asymptotically efficient when the maximizing parameter is unique so that regular estimation is possible.
We apply our technique to a recent example from the literature in which one wishes to report the maximal absolute correlation between a prespecified outcome and one of *p* predictors. The simplicity of our technique enables an analysis of the previously open case where *p* grows with sample size. Specifically, we only require that log *p* grows slower than the square root of *n*, where *n* is the sample size. We show that, unlike earlier approaches, our method scales to massive data sets: the point estimate and confidence intervals can be constructed in O(*np*) time.

**A. R. Luedtke** and M. J. van der Laan, "Comment," *Journal of the American Statistical Association*, 2016. [link]

**A. R. Luedtke** and M. J. van der Laan, "Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy," *Annals of Statistics*, 2016. [link]

We consider challenges that arise in the estimation of the mean outcome under an optimal individualized treatment strategy defined as the treatment rule that maximizes the population mean outcome, where the candidate treatment rules are restricted to depend on baseline covariates. We prove a necessary and sufficient condition for the pathwise differentiability of the optimal value, a key condition needed to develop a regular and asymptotically linear (RAL) estimator of the optimal value. The stated condition is slightly more general than the previous condition implied in the literature. We then describe an approach to obtain root-n rate confidence intervals for the optimal value even when the parameter is not pathwise differentiable. We provide conditions under which our estimator is RAL and asymptotically efficient when the mean outcome is pathwise differentiable. We also outline an extension of our approach to a multiple time point problem. All of our results are supported by simulations.

**A. R. Luedtke** and M. J. van der Laan, "Optimal individualized treatments in resource-limited settings," *International Journal of Biostatistics*, 2016. [tech rep]

A dynamic treatment rule (DTR) is a treatment rule which assigns treatments to individuals based on (a subset of) their measured covariates. An optimal DTR is the DTR which maximizes the population mean outcome. Previous works in this area have assumed that treatment is an unlimited resource so that the entire population can be treated if this strategy maximizes the population mean outcome. We consider optimal DTRs in settings where the treatment resource is limited so that there is a maximum proportion of the population which can be treated. We give a general closed-form expression for an optimal stochastic DTR in this resource-limited setting, and a closed-form expression for the optimal deterministic DTR under an additional assumption. We also present an estimator of the mean outcome under the optimal stochastic DTR in a large semiparametric model that at most places restrictions on the probability of treatment assignment given covariates. We give conditions under which our estimator is efficient among all regular and asymptotically linear estimators. All of our results are supported by simulations.
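To make the resource constraint concrete, here is a minimal Python sketch of a deterministic rule that treats at most a fraction `kappa` of a sample, giving priority to the largest positive estimated effects. It is an illustration only, not the closed-form stochastic rule or the estimator from the paper: the names `resource_limited_rule`, `cate_hat`, and `kappa` are hypothetical, and the sketch assumes there are no ties among the estimated effects and that larger outcomes are better.

```python
def resource_limited_rule(cate_hat, kappa):
    """Treat at most a fraction kappa of individuals.

    cate_hat: hypothetical estimated treatment effects given covariates.
    kappa: maximum proportion of the sample that can be treated (assumption).
    Individuals are ranked by estimated effect; only those within the budget
    AND with a positive estimated effect receive treatment.
    """
    n = len(cate_hat)
    budget = int(kappa * n)  # number of treatment slots available
    ranked = sorted(range(n), key=lambda i: cate_hat[i], reverse=True)
    treat = [False] * n
    for i in ranked[:budget]:
        if cate_hat[i] > 0:
            treat[i] = True
    return treat


# Hypothetical estimated effects for four individuals.
cate_hat = [0.3, -0.1, 0.2, 0.05]

half_budget = resource_limited_rule(cate_hat, kappa=0.5)
full_budget = resource_limited_rule(cate_hat, kappa=1.0)
```

With an unlimited budget the rule reduces to treating everyone with a positive estimated effect, which is the unconstrained case that the abstract contrasts with.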

**A. R. Luedtke** and M. J. van der Laan, "Super-learning of an optimal dynamic treatment rule," *International Journal of Biostatistics*, 2016. [tech rep]

We consider the estimation of an optimal dynamic two time-point treatment rule defined as the rule that maximizes the mean outcome under the dynamic treatment, where the candidate rules are restricted to depend only on a user-supplied subset of the baseline and intermediate covariates. This estimation problem is addressed in a statistical model for the data distribution that is nonparametric, beyond possible knowledge about the treatment and censoring mechanisms. We propose data adaptive estimators of this optimal dynamic regime which are defined by sequential loss-based learning under both the blip function and weighted classification frameworks. Rather than *a priori* selecting an estimation framework and algorithm, we propose combining estimators from both frameworks using a super-learning based cross-validation selector that seeks to minimize an appropriate cross-validated risk. One of the proposed risks directly measures the performance of the mean outcome under the optimal rule. The resulting selector is guaranteed to asymptotically perform as well as the best convex combination of candidate algorithms in terms of loss-based dissimilarity under conditions. We offer simulation results to support our theoretical findings. This work expands upon that of an earlier technical report (van der Laan, 2013) with new results and simulations, and is accompanied by a work which develops inference for the mean outcome under the optimal rule (van der Laan and Luedtke, 2014).

J. Ahern, D. Kasarek, **A. R. Luedtke**, T. A. Bruckner, and M. J. van der Laan, "Racial/ethnic differences in the role of childhood adversities for mental health disorders among a nationally representative sample of adolescents," *Epidemiology*, 2016.

**Background:** Childhood adversities may play a key role in the onset of mental disorders and influence patterns by race/ethnicity. We examined the relations between childhood adversities and mental disorders by race/ethnicity in the National Comorbidity Survey-Adolescent Supplement (NCS-A).

**Methods:** Using targeted maximum likelihood estimation (TMLE), a rigorous and flexible estimation procedure, we estimated the relation of each adversity with mental disorders (behavior, distress, fear and substance use), and estimated the distribution of disorders by race/ethnicity in the absence of adversities. TMLE addresses the challenge of a multidimensional exposure such as a set of adversities because it facilitates "learning" from the data the strength of the relations between each adversity and outcome, incorporating any interactions or non-linearity, specific to each racial/ethnic group. Cross-validation is used to select the best model without over-fitting.

**Results:** Among adversities, physical abuse, emotional abuse, and sexual abuse had the strongest associations with mental disorders. Of all outcomes, behavior disorders were most strongly associated with adversities. Our comparisons of observed mental disorder prevalences to estimates in the absence of adversities suggest lower prevalences of behavior disorders across all racial/ethnic groups. Estimates for distress disorders and substance use disorders varied in magnitude among groups, but some estimates were imprecise. Interestingly, results suggest the adversities examined here do not play a major role in patterns of racial/ethnic differences in mental disorders.

**Conclusions:** Although causal interpretation relies on assumptions, growing work on this topic suggests interventions that reduce or mitigate childhood adversities might prevent mental disorder development in adolescents.

**A. R. Luedtke**, M. Carone, and M. J. van der Laan, "Discussion of 'Deductive derivation and Turing-computerization of semiparametric efficient estimation' by Frangakis et al.," *Biometrics*, 2015. [link]

Comment on:

C. E. Frangakis, T. Qian, Z. Wu, and I. Díaz, "Deductive derivation and Turing-computerization of semiparametric efficient estimation," *Biometrics*, 2015. [link]

Rejoinder:

C. E. Frangakis, T. Qian, Z. Wu, and I. Díaz, "Rejoinder to 'Discussions on: Deductive derivation and turing-computerization of semiparametric efficient estimation'," *Biometrics*, 2015. [link]

M. J. van der Laan and **A. R. Luedtke**, "Targeted learning of the mean outcome under an optimal dynamic treatment rule," *Journal of Causal Inference*, 2015. [link]
We consider estimation of and inference for the mean outcome under the optimal dynamic two time-point treatment rule defined as the rule that maximizes the mean outcome under the dynamic treatment, where the candidate rules are restricted to depend only on a user-supplied subset of the baseline and intermediate covariates. This estimation problem is addressed in a statistical model for the data distribution that is nonparametric beyond possible knowledge about the treatment and censoring mechanism. This contrasts from the current literature that relies on parametric assumptions. We establish that the mean of the counterfactual outcome under the optimal dynamic treatment is a pathwise differentiable parameter under conditions, and develop a targeted minimum loss-based estimator (TMLE) of this target parameter. We establish asymptotic linearity and statistical inference for this estimator under specified conditions. In a sequentially randomized trial the statistical inference relies upon a second-order difference between the estimator of the optimal dynamic treatment and the optimal dynamic treatment to be asymptotically negligible, which may be a problematic condition when the rule is based on multivariate time-dependent covariates. To avoid this condition, we also develop TMLEs and statistical inference for data adaptive target parameters that are defined in terms of the mean outcome under the estimate of the optimal dynamic treatment. In particular, we develop a novel cross-validated TMLE approach that provides asymptotic inference under minimal conditions, avoiding the need for any empirical process conditions. We offer simulation results to support our theoretical findings.

M. J. van der Laan, **A. R. Luedtke**, and I. Díaz, "Discussion of 'Identification, estimation and approximation of risk under interventions that depend on the natural value of treatment using observational data', by Jessica Young, Miguel Hernan, and James Robins," *Epidemiologic Methods*, 2014. [link]

Young, Hernán, and Robins consider the mean outcome under a dynamic intervention that may rely on the natural value of treatment. They first identify this value with a statistical target parameter, and then show that this statistical target parameter can also be identified with a causal parameter which gives the mean outcome under a stochastic intervention. The authors then describe estimation strategies for these quantities. Here we augment the authors' insightful discussion by sharing our experiences in situations where two causal questions lead to the same statistical estimand, or the newer problem that arises in the study of data adaptive parameters, where two statistical estimands can lead to the same estimation problem. Given a statistical estimation problem, we encourage others to always use a robust estimation framework where the data generating distribution truly belongs to the statistical model. We close with a discussion of a framework which has these properties.

A. Beck, **A. Luedtke**, K. Liu, and N. Tintle, "A powerful method for including genotype uncertainty in tests of Hardy-Weinberg equilibrium," in *Pacific Symposium on Biocomputing*, 2016, p. 368. [link]
The use of posterior probabilities to summarize genotype uncertainty is pervasive across genotype, sequencing and imputation platforms. Prior work in many contexts has shown the utility of incorporating genotype uncertainty (posterior probabilities) in downstream statistical tests. Typical approaches to incorporating genotype uncertainty when testing Hardy-Weinberg equilibrium tend to lack calibration in the type I error rate, especially as genotype uncertainty increases. We propose a new approach in the spirit of genomic control that properly calibrates the type I error rate, while yielding improved power to detect deviations from Hardy-Weinberg Equilibrium. We demonstrate the improved performance of our method on both simulated and real genotypes.

B. Greco, **A. R. Luedtke**, A. Hainline, C. Alvarez, A. Beck, and N. L. Tintle, "Application of family-based tests of association for rare variants to pathways," in *BMC Proceedings*, 2014. [link]
Pathway analysis approaches for sequence data typically either operate in a single stage (all variants within all genes in the pathway are combined into a single, very large set of variants that can then be analyzed using standard "gene-based" test statistics) or in 2-stages (gene-based p values are computed for all genes in the pathway, and then the gene-based p values are combined into a single pathway p value). To date, little consideration has been given to the performance of gene-based tests (typically designed for a smaller number of single-nucleotide variants [SNVs]) when the number of SNVs in the gene or in the pathway is very large and the genotypes come from sequence data organized in large pedigrees. We consider recently proposed gene-based tests for rare variants from complex pedigrees that test for association between a large set of SNVs and a qualitative phenotype of interest (1-stage analyses) as well as 2-stage approaches. We find that many of these methods show inflated type I errors when the number of SNVs in the gene or the pathway is large (>200 SNVs) and when using standard approaches to estimate the genotype covariance matrix. Alternative methods are needed when testing very large sets of SNVs in 1-stage approaches.

A. Hainline, C. Alvarez, **A. R. Luedtke**, B. Greco, A. Beck, and N. L. Tintle, "Evaluation of the power and type I error of recently proposed family-based tests of association for rare variants," in *BMC Proceedings*, 2014. [link]
Until very recently, few methods existed to analyze rare-variant association with binary phenotypes in complex pedigrees. We consider a set of recently proposed methods applied to the simulated and real hypertension phenotype as part of the Genetic Analysis Workshop 18. Minimal power of the methods is observed for genes containing variants with weak effects on the phenotype. Application of the methods to the real hypertension phenotype yielded no genes meeting a strict Bonferroni cutoff of significance. Some prior literature connects 3 of the 5 most associated genes (p < 10^{−4}) to hypertension or related phenotypes. Further methodological development is needed to extend these methods to handle covariates, and to explore more powerful test alternatives.

K. Liu, **A. R. Luedtke**, and N. L. Tintle, "Optimal methods for using posterior probabilities in association testing," *Human Heredity*, 2013. [link]

**Objective:** The use of haplotypes to impute the genotypes of unmeasured single nucleotide variants continues to rise in popularity. Simulation results suggest that the use of the dosage as a one-dimensional summary statistic of imputation posterior probabilities may be optimal both in terms of statistical power and computational efficiency, however little theoretical understanding is available to explain and unify these simulation results. In our analysis, we provide a theoretical foundation for the use of the dosage as a one-dimensional summary statistic of genotype posterior probabilities from any technology.

**Methods:** We analytically evaluate the dosage, mode and the more general set of all one-dimensional summary statistics of two-dimensional (three posterior probabilities that must sum to 1) genotype posterior probability vectors.

**Results:** We prove that the dosage is an optimal one-dimensional summary statistic under a typical linear disease model and is robust to violations of this model. Simulation results confirm our theoretical findings.

**Conclusions:** Our analysis provides a strong theoretical basis for the use of the dosage as a one-dimensional summary statistic of genotype posterior probability vectors in related tests of genetic association across a wide variety of genetic disease models.

**A. R. Luedtke** and H. P. Stahl, "Commentary on multivariable parametric cost model for ground optical telescope assembly," *Optical Engineering*, 2012. [link]

In 2005, Stahl et al. published a multivariable parametric cost model for ground telescopes that included primary mirror diameter, diffraction-limited wavelength, and year of development. The model also included a factor for primary mirror segmentation and/or duplication. While the original multivariable model is still relevant, we better explain the rationale behind the model and present a model framework that may better account for prescription similarities. Also, we correct the single-variable diameter model presented in the 2005 Stahl paper with the addition of a leading multiplier.

**A. R. Luedtke**, S. Powers, A. Petersen, A. Sitarik, A. Bekmetjev, and N. L. Tintle, "Evaluating methods for the analysis of rare variants in sequence data," in *BMC Proceedings*, 2011. [link]

A number of rare variant statistical methods have been proposed for analysis of the impending wave of next-generation sequencing data. To date, there are few direct comparisons of these methods on real sequence data. Furthermore, there is a strong need for practical advice on the proper analytic strategies for rare variant analysis. We compare four recently proposed rare variant methods (combined multivariate and collapsing, weighted sum, proportion regression, and cumulative minor allele test) on simulated phenotype and next-generation sequencing data as part of Genetic Analysis Workshop 17. Overall, we find that all analyzed methods have serious practical limitations on identifying causal genes. Specifically, no method has more than a 5% true discovery rate (percentage of truly causal genes among all those identified as significantly associated with the phenotype). Further exploration shows that all methods suffer from inflated false-positive error rates (chance that a noncausal gene will be identified as associated with the phenotype) because of population stratification and gametic phase disequilibrium between noncausal SNPs and causal SNPs. Furthermore, observed true-positive rates (chance that a truly causal gene will be identified as significantly associated with the phenotype) for each of the four methods was very low (<19%). The combination of larger than anticipated false-positive rates, low true-positive rates, and only about 1% of all genes being causal yields poor discriminatory ability for all four methods. Gametic phase disequilibrium and population stratification are important areas for further research in the analysis of rare variant data.

A. Petersen, A. Sitarik, **A. R. Luedtke**, S. Powers, A. Bekmetjev, and N. L. Tintle, "Evaluating methods for combining rare variant data in pathway-based tests of genetic association," in *BMC Proceedings*, 2011. [link]
Analyzing sets of genes in genome-wide association studies is a relatively new approach that aims to capitalize on biological knowledge about the interactions of genes in biological pathways. This approach, called pathway analysis or gene set analysis, has not yet been applied to the analysis of rare variants. Applying pathway analysis to rare variants offers two competing approaches. In the first approach rare variant statistics are used to generate p-values for each gene (e.g., combined multivariate collapsing [CMC] or weighted-sum [WS]) and the gene-level p-values are combined using standard pathway analysis methods (e.g., gene set enrichment analysis or Fisher's combined probability method). In the second approach, rare variant methods (e.g., CMC and WS) are applied directly to sets of single-nucleotide polymorphisms (SNPs) representing all SNPs within genes in a pathway. In this paper we use simulated phenotype and real next-generation sequencing data from Genetic Analysis Workshop 17 to analyze sets of rare variants using these two competing approaches. The initial results suggest substantial differences in the methods, with Fisher's combined probability method and the direct application of the WS method yielding the best power. Evidence suggests that the WS method works well in most situations, although Fisher's method was more likely to be optimal when the number of causal SNPs in the set was low but the risk of the causal SNPs was high.

H. P. Stahl, T. Henrichs, **A. R. Luedtke**, and M. West, "Update on parametric cost models for space telescopes," in *SPIE Optical Engineering + Applications*, International Society for Optics and Photonics, 2011. [link]
Parametric cost models can be used by designers and project managers to perform relative cost comparisons between major architectural cost drivers and allow high-level design trades; enable cost-benefit analysis for technology development investment; and, provide a basis for estimating total project cost between related concepts. This paper reports on recent revisions and improvements to our ground telescope cost model and refinements of our understanding of space telescope cost models. One interesting observation is that while space telescopes are 50X to 100X more expensive than ground telescopes, their respective scaling relationships are similar. Another interesting speculation is that the role of technology development may be different between ground and space telescopes. For ground telescopes, the data indicates that technology development tends to reduce cost by approximately 50% every 20 years. But for space telescopes, there appears to be no such cost reduction because we do not tend to re-fly similar systems. Thus, instead of reducing cost, 20 years of technology development may be required to enable a doubling of space telescope capability. Other findings include: mass should not be used to estimate cost; spacecraft and science instrument costs account for approximately 50% of total mission cost; and, integration and testing accounts for only about 10% of total mission cost.

# Book Chapters

R. C. Kessler, S. L. Bernecker, R. M. Bossarte, **A. R. Luedtke**, J. F. McCarthy, M. K. Nock, W. R. Pigeon, M. V. Petukhova, E. Sadikova, T. J. VanderWeele, K. L. Zuromski, A. M. Zaslavsky. The Role of Big Data Analytics in Predicting Suicide. In: *Personalized and Predictive Psychiatry- Big Data Analytics in Mental Health*. Ed. by I C Passos, B Mwangi, F Kapczinski. Springer Nature.

M J van der Laan, A Bibaut, and **A. R. Luedtke**. “CV-TMLE for Nonpathwise Differentiable Target Parameters”. In: *Targeted Learning in Data Science*. Ed. by M J van der Laan and S Rose. New York: Springer, New York, 2018. Chap. 25.

I. Díaz, **A. R. Luedtke**, and M. J. van der Laan. “Sensitivity Analysis”. In: *Targeted Learning in Data Science*. Ed. by M J van der Laan and S Rose. New York: Springer, New York, 2018. Chap. 27.

*Note:* Some of this text appears in the technical report “The statistics of sensitivity analyses”.

# Technical Reports

**A. R. Luedtke**, I. Díaz, and M. J. van der Laan, "The statistics of sensitivity analyses," Division of Biostatistics, University of California, Berkeley, Tech. Rep. 341, 2015. [tech rep]

*Note:* Some of this text appears in Chap. 27 of *Targeted Learning in Data Science*.

Suppose one wishes to estimate a causal parameter given a sample of observations. This requires making unidentifiable assumptions about an underlying causal mechanism. Sensitivity analyses help investigators understand what impact violations of these assumptions could have on the causal conclusions drawn from a study, though themselves rely on untestable (but hopefully more interpretable) assumptions. Díaz and van der Laan (2013) advocate the use of a sequence (or continuum) of interpretable untestable assumptions of increasing plausibility for the sensitivity analysis so that experts can have informed opinions about which are true. In this work, we argue that using appropriate statistical procedures when conducting a sensitivity analysis is crucial to drawing valid conclusions about a causal question and understanding what assumptions one would need to make to do so. Conducting a sensitivity analysis typically relies on estimating features of the unknown observed data distribution, and thus naturally leads to statistical problems about which optimality results are already known. We present a general template for efficiently estimating the bounds on the causal parameter resulting from a given untestable assumption. The sequence of assumptions yields a sequence of confidence intervals which, given a suitable statistical procedure, attain proper coverage for the causal parameter if the corresponding assumption is true. We illustrate the pitfalls of an inappropriate statistical procedure with a toy example, and apply our approach to data from the Western Collaborative Group Study to show its utility in practice.

M. J. van der Laan, M. Carone, and **A. R. Luedtke**, "Computerizing Efficient Estimation of a Pathwise Differentiable Target Parameter," Tech. Rep. 340, 2015. [tech rep]
Frangakis et al. (2015) proposed a numerical method for computing the efficient influence function of a parameter in a nonparametric model at a specified distribution and observation (provided such an influence function exists). Their approach is based on the assumption that the efficient influence function is given by the directional derivative of the target parameter mapping in the direction of a perturbation of the data distribution defined as the convex line from the data distribution to a pointmass at the observation. In our discussion paper Luedtke et al. (2015) we propose a regularization of this procedure and establish the validity of this method in great generality. In this article we propose a generalization of the latter regularized numerical delta method for computing the efficient influence function for general statistical models, and formally establish its validity under appropriate regularity conditions. Our proposed method consists of applying the regularized numerical delta-method for nonparametrically-defined target parameters proposed in Luedtke et al. 2015 to the nonparametrically-defined maximum likelihood mapping that maps a data distribution (normally the empirical distribution) into its Kullback-Leibler projection onto the model. This method formalizes the notion that an algorithm for computing a maximum likelihood estimator also yields an algorithm for computing the efficient influence function at a user-supplied data distribution. We generalize this method to a minimum loss-based mapping. We also show how the method extends to compute the higher-order efficient influence function at an observation pair for higher-order pathwise differentiable target parameters. Finally, we propose a new method for computing the efficient influence function as a whole curve by applying the maximum likelihood mapping to a perturbation of the data distribution with score equal to an initial gradient of the pathwise derivative. 
We demonstrate each method with a variety of examples.

M. J. van der Laan and **A. R. Luedtke**, "Targeted learning of an optimal dynamic treatment, and statistical inference for its mean outcome," UC Berkeley, Tech. Rep. 329, 2014. [tech rep]
Suppose we observe n independent and identically distributed observations of a time-dependent random variable consisting of baseline covariates, initial treatment and censoring indicator, intermediate covariates, subsequent treatment and censoring indicator, and a final outcome. For example, this could be data generated by a sequentially randomized controlled trial, where subjects are sequentially randomized to a first line and second line treatment, possibly assigned in response to an intermediate biomarker, and are subject to right-censoring. In this article we consider estimation of an optimal dynamic multiple time-point treatment rule defined as the rule that maximizes the mean outcome under the dynamic treatment, where the candidate rules are restricted to only respond to a user-supplied subset of the baseline and intermediate covariates. This estimation problem is addressed in a statistical model for the data distribution that is nonparametric beyond possible knowledge about the treatment and censoring mechanism, while still providing statistical inference for the mean outcome under the optimal rule. This contrasts from the current literature that relies on parametric assumptions. For the sake of presentation, we first consider the case that the treatment/censoring is only assigned at a single time-point, and subsequently, we cover the multiple time-point case. We characterize the optimal dynamic treatment as a statistical target parameter in the nonparametric statistical model, and we propose highly data adaptive estimators of this optimal dynamic regimen, utilizing sequential loss-based super-learning of sequentially defined (so called) blip-functions, based on newly proposed loss-functions. We also propose a cross-validation selector ... [rest of abstract]

**A. R. Luedtke** and L. Tran, "The generalized mean information coefficient," arXiv:1308.5712, 2013. [tech rep]

Reshef & Reshef recently published a paper in which they present a method called the Maximal Information Coefficient (MIC) that can detect all forms of statistical dependence between pairs of variables as sample size goes to infinity. While this method has been praised by some, it has also been criticized for its lack of power in finite samples. We seek to modify MIC so that it has higher power in detecting associations for limited sample sizes. Here we present the Generalized Mean Information Coefficient (GMIC), a generalization of MIC which incorporates a tuning parameter that can be used to modify the complexity of the association favored by the measure. We define GMIC and prove it maintains several key asymptotic properties of MIC. Its increased power over MIC is demonstrated using a simulation of eight different functional relationships at sixty different noise levels. The results are compared to the Pearson correlation, distance correlation, and MIC. Simulation results suggest that while generally GMIC has slightly lower power than the distance correlation measure, it achieves higher power than MIC for many forms of underlying association. For some functional relationships, GMIC surpasses all other statistics calculated. Preliminary results suggest choosing a moderate value of the tuning parameter for GMIC will yield a test that is robust across underlying relationships. GMIC is a promising new method that mitigates the power issues suffered by MIC, at the possible expense of equitability. Nonetheless, distance correlation was in our simulations more powerful for many forms of underlying relationships. At a minimum, this work motivates further consideration of maximal information-based nonparametric exploration (MINE) methods as statistical tests of independence.