--- title: "rddtools" author: "Matthieu Stigler" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{rddtools} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, message = FALSE} knitr::opts_chunk$set(collapse = T, comment = "#>") ``` **RDDtools** works in an object-oriented way: the user has to define once the characteristic of the data, creating a *rdd_data* object, on which different anaylsis tools can be applied. # Data Preparation and Visualisation Load the package, and load the built-in dataset from [Lee 2008]: ```{r} library(rddtools) data(house) ``` Declare the data to be a *rdd_data* object: ```{r} house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0) ``` You can now directly summarise and visualise this data: ```{r dataPlot} summary(house_rdd) plot(house_rdd) ``` # Parametric Estimation Estimate parametrically, by fitting a 4th order polynomial. ```{r reg_para} reg_para <- rdd_reg_lm(rdd_object=house_rdd, order=4) reg_para plot(reg_para) ``` # Non-parametric Estimation Run a simple local regression, using the [Imbens and Kalyanaraman 2012] bandwidth. ```{r RegPlot} bw_ik <- rdd_bw_ik(house_rdd) reg_nonpara <- rdd_reg_np(rdd_object=house_rdd, bw=bw_ik) print(reg_nonpara) ``` # Regression Sensitivity tests: One can easily check the sensitivity of the estimate to different bandwidths: ```{r SensiPlot} plotSensi(reg_nonpara, from=0.05, to=1, by=0.1) ``` Or run the Placebo test, estimating the RDD effect based on fake cutpoints: ```{r placeboPlot} plotPlacebo(reg_nonpara) ``` # Design Sensitivity tests: Design sensitivity tests check whether the discontinuity found can actually be attributed ot other causes. Two types of tests are available: + Discontinuity comes from manipulation: test whether there is possible manipulation around the cutoff, McCrary 2008 test: **dens_test()** + Discontinuity comes from other variables: should test whether discontinuity arises with covariates. Currently, only simple tests of equality of covariates around the threshold are available: ## Discontinuity comes from manipulation: McCrary test use simply the function **dens_test()**, on either the raw data, or the regression output: ```{r DensPlot} dens_test(reg_nonpara) ``` ## Discontinuity comes from covariates: covariates balance tests Two tests available: + equal means of covariates: **covarTest_mean()** + equal density of covariates: **covarTest_dens()** We need here to simulate some data, given that the Lee (2008) dataset contains no covariates. We here simulate three variables, with the second having a different mean on the left and the right. ```{r} set.seed(123) n_Lee <- nrow(house) Z <- data.frame(z1 = rnorm(n_Lee, sd=2), z2 = rnorm(n_Lee, mean = ifelse(house<0, 5, 8)), z3 = sample(letters, size = n_Lee, replace = TRUE)) house_rdd_Z <- rdd_data(y = house$y, x = house$x, covar = Z, cutpoint = 0) ``` Tests correctly reject equality of the second, and correctly do not reject equality for the first and third.