We propose methods for inference on the average effect of a treatment on a scalar outcome in the presence of very many controls. Our setting is a partially linear regression model containing the treatment/policy variable and a large number p of controls or series terms, where p is possibly much larger than the sample size n, but where only s ≪ n unknown controls or series terms are needed to approximate the regression function accurately. The latter sparsity condition makes it possible to estimate the entire regression function, as well as the average treatment effect, by selecting approximately the right set of controls using Lasso and related methods. We develop estimation and inference methods for the average treatment effect in this setting, proposing a novel "post-double-selection" method that provides attractive inferential and estimation properties. In our analysis, in order to cover realistic applications, we expressly allow for imperfect selection of the controls and account for the impact of selection errors on estimation and inference. In order to cover typical applications in economics, we employ selection methods designed to deal with non-Gaussian and heteroscedastic disturbances. We illustrate the use of the new methods with numerical simulations and an application to the effect of abortion on crime rates. Keywords: treatment effects; high-dimensional regression; inference under imperfect model selection.
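
To make the procedure concrete, here is a minimal Python sketch of the post-double-selection idea. It is illustrative only: scikit-learn's cross-validated LassoCV stands in for the paper's data-driven, heteroscedasticity-robust penalty choice, and the function name post_double_selection is ours, not the paper's.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LassoCV

def post_double_selection(y, d, X):
    """Effect of treatment d on outcome y, controlling for columns of X."""
    # Step 1: Lasso of the outcome on the controls.
    s1 = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
    # Step 2: Lasso of the treatment on the controls.
    s2 = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)
    # Step 3: OLS of y on d plus the UNION of the controls selected in
    # either step, with heteroscedasticity-robust standard errors.
    keep = np.union1d(s1, s2).astype(int)
    Z = sm.add_constant(np.column_stack([d, X[:, keep]]))
    fit = sm.OLS(y, Z).fit(cov_type="HC1")
    return fit.params[1], fit.bse[1]  # estimate and s.e. for d
```

Taking the union of the two selected sets is what protects the final OLS step against moderate selection mistakes in either equation: a control missed by one Lasso step can still be picked up by the other.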

In this paper we study post-model-selection estimators that apply ordinary least squares (OLS) to the model selected by first-step penalized estimators, typically Lasso. It is well known that Lasso can estimate the nonparametric regression function at nearly the oracle rate, and is thus hard to improve upon. We show that the OLS post-Lasso estimator performs at least as well as Lasso in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the Lasso-based model selection "fails" in the sense of missing some components of the "true" regression model. By the "true" model we mean here the best s-dimensional approximation to the nonparametric regression function chosen by the oracle. Furthermore, the OLS post-Lasso estimator can perform strictly better than Lasso, in the sense of a strictly faster rate of convergence, if the Lasso-based model selection correctly includes all components of the "true" model as a subset and also achieves sufficient sparsity. In the extreme case, when Lasso perfectly selects the "true" model, the OLS post-Lasso estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by Lasso, which guarantees that this dimension is at most of the same order as the dimension of the "true" model. Our rate results are non-asymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the Lasso estimator acting as the selector in the first step, but applies to any other estimator with good rates and good sparsity properties, for example various forms of thresholded Lasso. Our analysis covers both traditional thresholding and a new practical, data-driven thresholding scheme that induces maximal sparsity subject to maintaining a certain goodness of fit. The latter scheme has theoretical guarantees similar to those of Lasso or OLS post-Lasso, but it dominates those procedures, as well as traditional thresholding, in a wide variety of experiments.
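
The two-step construction itself is short; the following sketch assumes scikit-learn, with LassoCV again standing in for a theoretically driven penalty level:

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def ols_post_lasso(X, y):
    lasso = LassoCV(cv=5).fit(X, y)          # first step: Lasso selects
    support = np.flatnonzero(lasso.coef_)    # the model chosen by Lasso
    if support.size == 0:
        return lasso                         # nothing selected: keep the Lasso fit
    # Second step: refit by OLS on the selected columns only, which
    # removes the shrinkage bias on the retained coefficients.
    return LinearRegression().fit(X[:, support], y)
```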

We develop results for the use of Lasso and post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, p. Our results apply even when p is much larger than the sample size, n. We show that the IV estimator based on using Lasso or post-Lasso in the first stage is root-n consistent and asymptotically normal when the first stage is approximately sparse, that is, when the conditional expectation of the endogenous variables given the instruments can be well approximated by a relatively small set of variables whose identities may be unknown. We also show that the estimator is semiparametrically efficient when the structural error is homoscedastic. Notably, our results allow for imperfect model selection and do not rely upon the unrealistic "beta-min" conditions that are widely used to establish validity of inference following model selection (see also Belloni, Chernozhukov, and Hansen (2011b)). In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark. Optimal instruments are conditional expectations. In developing the IV results, we establish a series of new results for Lasso and post-Lasso estimators of nonparametric conditional expectation functions which are of independent theoretical and practical interest. We construct a modification of Lasso designed to deal with non-Gaussian, heteroscedastic disturbances that uses a data-weighted l1-penalty function. By innovatively using moderate deviation theory for self-normalized sums, we provide convergence rates for the resulting Lasso and post-Lasso estimators that are as sharp as the corresponding rates in the homoscedastic Gaussian case under the condition that log p = o(n^{1/3}). We also provide a data-driven method for choosing the penalty level that must be specified in obtaining Lasso and post-Lasso estimates, and establish its asymptotic validity under non-Gaussian, heteroscedastic disturbances.
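
The first-stage idea can be sketched as follows, assuming numpy and scikit-learn: a post-Lasso fit approximates the optimal instrument E[d | z], and the fitted values then serve as the instrument in a standard IV step. The sketch omits the paper's heteroscedasticity-weighted penalty, handles no included exogenous regressors, and assumes the first-stage selection is non-empty.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def lasso_iv(y, d, Z):
    # First stage: post-Lasso estimate of the optimal instrument E[d | z].
    support = np.flatnonzero(LassoCV(cv=5).fit(Z, d).coef_)
    first_stage = LinearRegression().fit(Z[:, support], d)
    d_hat = first_stage.predict(Z[:, support])
    # Second stage: use d_hat as the instrument for d in y = d * alpha + e,
    # giving alpha_hat = (d_hat'd)^{-1} d_hat'y in this simple setting.
    return (d_hat @ y) / (d_hat @ d)
```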

Given a convex body S and a point x in S, let sym(x,S) denote the symmetry value of x in S: sym(x,S) := max{t : x + t(x − y) ∈ S for every y ∈ S}, which essentially measures how symmetric S is about the point x, and define sym(S) := max{sym(x,S) : x ∈ S}. We call x* a symmetry point of S if x* achieves the above maximum. These symmetry measures are all invariant under invertible affine transformation and/or change in norm, and so are of interest in the study of the geometry of convex sets. In this study we demonstrate various properties of sym(x,S), including relations with convex geometry quantities like volume, distance, diameter, and cross-ratio distance. When S is polyhedral of the form {x : Ax ≤ b}, …
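
In the polyhedral case the definition itself yields a computation via one linear program per constraint row: for each row i, the binding case is δ_i = min{a_i'y : Ay ≤ b}, giving sym(x,S) = min_i (b_i − a_i'x)/(a_i'x − δ_i). A rough Python sketch of this reduction (our own helper built on scipy.optimize.linprog, not code from the paper, and assuming S is a bounded polyhedron):

```python
import numpy as np
from scipy.optimize import linprog

def sym_value(x, A, b):
    """sym(x, S) for S = {y : Ay <= b}, via one LP per row of A."""
    t = np.inf
    for a_i, b_i in zip(A, b):
        # delta_i = min{ a_i'y : y in S }: the worst-case y for row i.
        delta_i = linprog(c=a_i, A_ub=A, b_ub=b, bounds=(None, None)).fun
        slack = b_i - a_i @ x        # room left in constraint i at x
        spread = a_i @ x - delta_i   # how far a_i'y can fall below a_i'x
        if spread > 1e-12:           # rows with zero spread never bind
            t = min(t, slack / spread)
    return t
```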

    This work proposes new inference methods for the estimation of a regression coefficient of interest in quantile regression models. We consider high-dimensional models where the number of regressors potentially exceeds the sample size but a subset of them suffice to construct a reasonable approximation of the unknown quantile regression function in the model. The proposed methods are protected against moderate model selection mistakes, which are often inevitable in the approximately sparse model considered here. The methods construct (implicitly or explicitly) an optimal instrument as a residual from a density-weighted projection of the regressor of interest on other regressors. Under regularity conditions, the proposed estimators of the quantile regression coefficient are asymptotically root-n normal, with variance equal to the semi-parametric efficiency bound of the partially linear quantile regression model. In addition, the performance of the technique is illustrated through Monte-carlo experiments and an empirical example, dealing with risk factors in childhood malnutrition. The numerical results confirm the theoretical findings that the proposed methods should outperform the naive post-model selection methods in non-parametric settings. Moreover, the empirical results demonstrate soundness of the proposed methods.
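
One of the proposed procedures has a double-selection flavor that is easy to sketch. The caveat is that ordinary LassoCV below replaces the paper's l1-penalized, density-weighted selection steps, so this is only an unweighted illustration under assumed helper names:

```python
import numpy as np
from sklearn.linear_model import LassoCV
from statsmodels.regression.quantile_regression import QuantReg

def double_selection_qr(y, d, X, tau=0.5):
    s1 = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)  # controls predicting y
    s2 = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)  # controls predicting d
    keep = np.union1d(s1, s2).astype(int)
    # Refit quantile regression of y on d plus the union of selected controls.
    Z = np.column_stack([np.ones(len(y)), d, X[:, keep]])
    return QuantReg(y, Z).fit(q=tau).params[1]          # coefficient on d
```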
