Title: | Nonparametric Estimation and Inference Procedures using Partitioning-Based Least Squares Regression |
---|---|
Description: | Tools for statistical analysis using partitioning-based least squares regression as described in Cattaneo, Farrell and Feng (2019a, <arXiv:1804.04916>) and Cattaneo, Farrell and Feng (2019b, <arXiv:1906.00202>): lsprobust() for nonparametric point estimation of regression functions and their derivatives and for robust bias-corrected (pointwise and uniform) inference; lspkselect() for data-driven selection of the IMSE-optimal number of knots; lsprobust.plot() for regression plots with robust confidence intervals and confidence bands; lsplincom() for estimation and inference for linear combinations of regression functions from different groups. |
Authors: | Matias D. Cattaneo, Max H. Farrell, Yingjie Feng |
Maintainer: | Yingjie Feng <[email protected]> |
License: | GPL-2 |
Version: | 0.4 |
Built: | 2024-11-02 03:26:42 UTC |
Source: | https://github.com/cran/lspartition |
This package provides tools for statistical analysis using B-splines, wavelets, and piecewise polynomials as described in Cattaneo, Farrell and Feng (2019a): lsprobust for least squares point estimation with robust bias-corrected pointwise and uniform inference procedures; lspkselect for data-driven selection of the IMSE-optimal number of partitioning knots; lsprobust.plot for regression plots with robust confidence intervals and confidence bands; lsplincom for estimation and inference for linear combinations of regression functions from different groups.
The companion software article, Cattaneo, Farrell and Feng (2019b), provides further implementation details and empirical illustrations.
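To give a feel for what "partitioning-based least squares" means, the following base R sketch fits the simplest case by hand: a piecewise-constant fit on an evenly spaced partition. It does not call the package; the number of cells and the variable names (knots, cell, cell_means) are illustrative assumptions, not package defaults.

```r
set.seed(1)
n <- 500
x <- runif(n)
y <- sin(4 * x) + rnorm(n)

# Evenly spaced partition of [0, 1] into 5 cells (4 inner knots); illustrative choice.
knots <- seq(0, 1, length.out = 6)
cell <- cut(x, breaks = knots, include.lowest = TRUE)

# Least squares on the cell-indicator basis (no intercept) ...
fit <- lm(y ~ cell - 1)

# ... reproduces the within-cell sample means of y.
cell_means <- tapply(y, cell, mean)
max_gap <- max(abs(unname(coef(fit)) - as.numeric(cell_means)))
print(max_gap)
```

B-splines, wavelets, and higher-order piecewise polynomials replace the indicator basis with richer local bases on the same kind of partition, which is where the choice of the number of knots becomes important.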
Matias D. Cattaneo, Princeton University, Princeton, NJ. [email protected].
Max H. Farrell, University of Chicago, Chicago, IL. [email protected].
Yingjie Feng (maintainer), Princeton University, Princeton, NJ. [email protected].
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019a): Large Sample Properties of Partitioning-Based Series Estimators. Annals of Statistics, forthcoming. arXiv:1804.04916.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019b): lspartition: Partitioning-Based Least Squares Regression. R Journal, forthcoming. arXiv:1906.00202.
lspkselect implements data-driven procedures to select the Integrated Mean Squared Error (IMSE) optimal number of partitioning knots for partitioning-based least squares regression estimators. Three series methods are supported: B-splines, compactly supported wavelets, and piecewise polynomials.
See Cattaneo and Farrell (2013) and Cattaneo, Farrell and Feng (2019a) for complete details.
Companion commands: lsprobust for partitioning-based least squares regression estimation and inference; lsprobust.plot for plotting results; lsplincom for multiple sample estimation and inference.
A detailed introduction to this command is given in Cattaneo, Farrell and Feng (2019b).
For more details, and related Stata and R packages useful for empirical analysis, visit https://sites.google.com/site/nppackages/.
lspkselect(y, x, m = NULL, m.bc = NULL, smooth = NULL, bsmooth = NULL,
           deriv = NULL, method = "bs", ktype = "uni", kselect = "imse-dpi",
           proj = TRUE, bc = "bc3", vce = "hc2", subset = NULL, rotnorm = TRUE)

## S3 method for class 'lspkselect'
print(x, ...)

## S3 method for class 'lspkselect'
summary(object, ...)
y |
Outcome variable. |
x |
Independent variable. A matrix or data frame. |
m |
Order of basis used in the main regression. Default is |
m.bc |
Order of basis used to estimate leading bias. Default is |
smooth |
Smoothness of B-splines for point estimation. When |
bsmooth |
Smoothness of B-splines for bias correction. Default is |
deriv |
Derivative order of the regression function to be estimated. A vector object of the same
length as |
method |
Type of basis used for expansion. Options are |
ktype |
Knot placement. Options are |
kselect |
Method for selecting the number of inner knots used by |
proj |
If |
bc |
Bias correction method. Options are |
vce |
Procedure to compute the heteroskedasticity-consistent (HCk) variance-covariance matrix estimator with plug-in residuals. Options are
|
subset |
Optional rule specifying a subset of observations to be used. |
rotnorm |
If |
... |
further arguments |
object |
class |
ks |
A matrix may contain |
opt |
A list containing options passed to the function. |
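As a small illustration of the returned components, the sketch below runs the selector with the defaults from the usage line written out explicitly and then inspects ks and opt. The column names x1 and x2 and the object name sel are assumptions for the example, and the call is guarded so the script still runs when lspartition is not installed.

```r
set.seed(42)
x <- data.frame(x1 = runif(500), x2 = runif(500))
y <- sin(4 * x[, 1]) + cos(x[, 2]) + rnorm(500)

if (requireNamespace("lspartition", quietly = TRUE)) {
  # Defaults from the usage line, spelled out for readability.
  sel <- lspartition::lspkselect(y, x, method = "bs", ktype = "uni",
                                 kselect = "imse-dpi")
  print(sel$ks)   # matrix of selected numbers of knots
  str(sel$opt)    # options recorded by the function
}
```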
print: print method for class "lspkselect".
summary: summary method for class "lspkselect".
Cattaneo, M. D., and M. H. Farrell (2013): Optimal convergence rates, Bahadur representation, and asymptotic normality of partitioning estimators. Journal of Econometrics 174(2): 127-143.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019a): Large Sample Properties of Partitioning-Based Series Estimators. Annals of Statistics, forthcoming. arXiv:1804.04916.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019b): lspartition: Partitioning-Based Least Squares Regression. R Journal, forthcoming. arXiv:1906.00202.
Cohen, A., I. Daubechies, and P. Vial (1993): Wavelets on the Interval and Fast Wavelet Transforms. Applied and Computational Harmonic Analysis 1(1): 54-81.
lsprobust, lsprobust.plot, lsplincom
x <- data.frame(runif(500), runif(500))
y <- sin(4 * x[, 1]) + cos(x[, 2]) + rnorm(500)
est <- lspkselect(y, x)
summary(est)
lsplincom estimates user-specified linear combinations of regression functions across different data subgroups and computes the corresponding (pointwise and uniform) robust bias-corrected inference measures. Estimation and inference are implemented using the lspartition package.
See Cattaneo and Farrell (2013) and Cattaneo, Farrell and Feng (2019a) for complete details.
A detailed introduction to this command is given in Cattaneo, Farrell and Feng (2019b).
For more details, and related Stata and R packages useful for empirical analysis, visit https://sites.google.com/site/nppackages/.
lsplincom(y, x, G, R, eval = NULL, neval = NULL, level = 95, band = FALSE,
          cb.method = NULL, cb.grid = NULL, cb.ngrid = 50, B = 1000,
          subset = NULL, knot = NULL, ...)

## S3 method for class 'lsplincom'
print(x, ...)

## S3 method for class 'lsplincom'
summary(object, ...)
y |
Outcome variable. |
x |
Independent variable. A matrix or data frame. |
G |
Group indicator. It may take on multiple discrete values. |
R |
A numeric vector giving the linear combination of interest. Each element is the coefficient
of the conditional mean estimator of one group, and they are ordered ascendingly along the value
of |
eval |
Evaluation points. A matrix or data frame. |
neval |
Number of quantile-spaced evaluating points. |
level |
Confidence level used for confidence intervals; default is |
band |
If |
cb.method |
Method used to calculate the critical value for confidence bands.
Options are |
cb.grid |
A matrix containing all grid points used to construct confidence bands. Each row corresponds to the coordinates of one grid point. |
cb.ngrid |
A numeric vector of the same length as |
B |
Number of simulated samples used to obtain the critical value for confidence bands.
Default is |
subset |
Optional rule specifying a subset of observations to be used. |
knot |
A list of numeric vectors giving the knot positions (including boundary knots) for each dimension
which are used in the main regression. The length of the list is equal to |
... |
Arguments to be passed to the function. See |
object |
class |
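The roles of B and the confidence band grid can be illustrated with a generic simulation of a sup-t critical value. This is a base R sketch of the general idea only, not the package's algorithm: the exponentially decaying correlation matrix Sigma is a hypothetical stand-in for the estimated correlation of the t-statistic process over the grid.

```r
set.seed(123)
B <- 1000      # number of simulated samples
ngrid <- 50    # number of grid points
grid <- seq(0, 1, length.out = ngrid)

# Hypothetical correlation structure across grid points (assumption).
Sigma <- exp(-abs(outer(grid, grid, "-")) / 0.1)
U <- chol(Sigma)   # upper triangular factor, t(U) %*% U equals Sigma

# Each row of Z is one simulated Gaussian process on the grid with correlation Sigma.
Z <- matrix(rnorm(B * ngrid), nrow = B, ncol = ngrid) %*% U

# Critical value for a 95% band: the 0.95 quantile of the simulated suprema.
sup_abs <- apply(abs(Z), 1, max)
cval_band <- unname(quantile(sup_abs, 0.95))
cval_point <- qnorm(0.975)
print(c(band = cval_band, pointwise = cval_point))
```

The band critical value exceeds the pointwise normal quantile because it has to cover the whole grid simultaneously; a larger B reduces simulation noise in the quantile.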
Estimate |
A matrix containing eval (grid points), N (effective sample sizes),
tau.cl (point estimates with a basis of order |
sup.cval |
Critical value for constructing confidence bands. |
opt |
A list containing options passed to the function. |
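Because the coefficients in R are matched to groups in ascending order of the group values, it can help to label them before calling the function. The sketch below does this in base R and then makes a guarded call; the simulated shift of 0.5 between groups and the name weights are assumptions for illustration.

```r
set.seed(7)
x <- runif(500)
G <- c(rep(0, 250), rep(1, 250))
y <- sin(4 * x) + 0.5 * G + rnorm(500)

# Coefficients line up with the sorted distinct values of G,
# so c(-1, 1) means (group 1) minus (group 0).
R <- c(-1, 1)
weights <- setNames(R, sort(unique(G)))
print(weights)

if (requireNamespace("lspartition", quietly = TRUE)) {
  est <- lspartition::lsplincom(y, x, G, R, neval = 10, level = 90)
  print(head(est$Estimate))
  summary(est)
}
```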
print: print method for class "lsplincom".
summary: summary method for class "lsplincom".
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019a): Large Sample Properties of Partitioning-Based Series Estimators. Annals of Statistics, forthcoming. arXiv:1804.04916.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019b): lspartition: Partitioning-Based Least Squares Regression. R Journal, forthcoming. arXiv:1906.00202.
lsprobust, lspkselect, lsprobust.plot
x <- runif(500)
y <- sin(4 * x) + rnorm(500)
z <- c(rep(0, 250), rep(1, 250))
est <- lsplincom(y, x, z, c(-1, 1))
summary(est)
lsprobust implements partitioning-based least squares point estimators for the regression function and its derivatives. It also provides robust bias-corrected (pointwise and uniform) inference, including simulation-based confidence bands. Three series methods are supported: B-splines, compactly supported wavelets, and piecewise polynomials.
See Cattaneo and Farrell (2013) and Cattaneo, Farrell and Feng (2019a) for complete details.
Companion commands: lspkselect for data-driven IMSE-optimal selection of the number of knots on rectangular partitions; lsprobust.plot for plotting results; lsplincom for multiple sample estimation and inference.
A detailed introduction to this command is given in Cattaneo, Farrell and Feng (2019b).
For more details, and related Stata and R packages useful for empirical analysis, visit https://sites.google.com/site/nppackages/.
lsprobust(y, x, eval = NULL, neval = NULL, method = "bs", m = NULL,
          m.bc = NULL, deriv = NULL, smooth = NULL, bsmooth = NULL,
          ktype = "uni", knot = NULL, nknot = NULL, same = TRUE,
          bknot = NULL, bnknot = NULL, J = NULL, bc = "bc3", proj = TRUE,
          kselect = "imse-dpi", vce = "hc2", level = 95, uni.method = NULL,
          uni.grid = NULL, uni.ngrid = 50, uni.out = FALSE, band = FALSE,
          B = 1000, subset = NULL, rotnorm = TRUE)

## S3 method for class 'lsprobust'
print(x, ...)

## S3 method for class 'lsprobust'
summary(object, ...)
y |
Outcome variable. |
x |
Independent variable. A matrix or data frame. |
eval |
Evaluation points. A matrix or data frame. |
neval |
Number of quantile-spaced evaluating points. |
method |
Type of basis used for expansion. Options are |
m |
Order of basis used in the main regression. Default is |
m.bc |
Order of basis used to estimate leading bias. Default is |
deriv |
Derivative order of the regression function to be estimated. A vector object of the same
length as |
smooth |
Smoothness of B-splines for point estimation. When |
bsmooth |
Smoothness of B-splines for bias correction. Default is |
ktype |
Knot placement. Options are |
knot |
A list of numeric vectors giving the knot positions (including boundary knots) for each dimension
which are used in the main regression. The length of the list is equal to |
nknot |
A numeric vector of the same length as |
same |
If |
bknot |
A list of numeric vectors giving knot positions used for bias correction. If not
specified and |
bnknot |
A numeric vector of the same length as |
J |
A numeric vector containing resolution levels of father wavelets for each dimension. |
bc |
Bias correction method. Options are |
proj |
If |
kselect |
Method for selecting the number of inner knots used by |
vce |
Procedure to compute the heteroskedasticity-consistent (HCk) variance-covariance matrix estimator with plug-in residuals. Options are
|
level |
Confidence level used for confidence intervals; default is |
uni.method |
Method used to implement uniform inference. Options are |
uni.grid |
A matrix containing all grid points used to implement uniform inference. Each row corresponds to the coordinates of one grid point. |
uni.ngrid |
A numeric vector of the same length as |
uni.out |
If |
band |
If |
B |
Number of simulated samples used to obtain the critical value for confidence bands.
Default is |
subset |
Optional rule specifying a subset of observations to be used. |
rotnorm |
If |
... |
further arguments |
object |
class |
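For readers unfamiliar with the HCk family behind the vce argument, the following base R sketch computes the standard HC0 and HC2 sandwich estimators for an ordinary least squares fit. It uses the textbook formulas on a simple quadratic design and is not taken from the package source; the helper name sandwich_vcov is hypothetical.

```r
set.seed(2)
n <- 200
x <- runif(n)
y <- sin(4 * x) + (0.5 + x) * rnorm(n)   # heteroskedastic errors

X <- cbind(1, x, x^2)                    # simple illustrative design matrix
XtX_inv <- solve(crossprod(X))
beta <- XtX_inv %*% crossprod(X, y)
res <- as.vector(y - X %*% beta)         # plug-in residuals
h <- rowSums((X %*% XtX_inv) * X)        # leverages (diagonal of the hat matrix)

# Sandwich form: (X'X)^{-1} X' diag(w) X (X'X)^{-1}
sandwich_vcov <- function(w) XtX_inv %*% crossprod(X * sqrt(w)) %*% XtX_inv

V_hc0 <- sandwich_vcov(res^2)            # HC0: squared residuals
V_hc2 <- sandwich_vcov(res^2 / (1 - h))  # HC2: leverage-adjusted weights

se <- cbind(hc0 = sqrt(diag(V_hc0)), hc2 = sqrt(diag(V_hc2)))
print(se)
```

HC2 inflates each squared residual by 1 / (1 - h), so its standard errors are never smaller than the HC0 ones; the difference matters most when some cells of the partition contain few observations.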
Estimate |
A matrix containing eval (grid points), N (effective sample sizes),
tau.cl (point estimates with a basis of order |
k.num |
A matrix containing the number of inner partitioning knots used in the main regression and bias correction for each covariate. |
knot |
A list of knots for point estimation. |
bknot |
A list of knots for bias correction. |
sup.cval |
Critical value for constructing confidence band. |
uni.output |
A list containing quantities used to implement uniform inference. |
opt |
A list containing options passed to the function. |
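The sketch below shows one way to supply evaluation points explicitly and then read off some of the returned components. Using the deciles of x as the grid is an illustrative assumption (it is not claimed to match what neval does internally), and the call is guarded so the script still runs when lspartition is not installed.

```r
set.seed(10)
n <- 500
x <- runif(n)
y <- sin(4 * x) + rnorm(n)

# Nine evaluation points at the deciles of x, stored as a one-column matrix
# because eval is documented as a matrix or data frame.
eval_pts <- matrix(quantile(x, probs = seq(0.1, 0.9, by = 0.1)), ncol = 1)

if (requireNamespace("lspartition", quietly = TRUE)) {
  est <- lspartition::lsprobust(y, x, eval = eval_pts, level = 90)
  print(est$Estimate)   # one row per evaluation point
  print(est$k.num)      # inner knots for main regression and bias correction
  str(est$knot)         # knots used for point estimation
}
```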
print: print method for class "lsprobust".
summary: summary method for class "lsprobust".
Cattaneo, M. D., and M. H. Farrell (2013): Optimal convergence rates, Bahadur representation, and asymptotic normality of partitioning estimators. Journal of Econometrics 174(2): 127-143.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019a): Large Sample Properties of Partitioning-Based Series Estimators. Annals of Statistics, forthcoming. arXiv:1804.04916.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019b): lspartition: Partitioning-Based Least Squares Regression. R Journal, forthcoming. arXiv:1906.00202.
Cohen, A., I. Daubechies, and P. Vial (1993): Wavelets on the Interval and Fast Wavelet Transforms. Applied and Computational Harmonic Analysis 1(1): 54-81.
lspkselect, lsprobust.plot, lsplincom
x <- data.frame(runif(500), runif(500))
y <- sin(4 * x[, 1]) + cos(x[, 2]) + rnorm(500)
est <- lsprobust(y, x)
summary(est)
lsprobust.plot plots estimated regression functions and confidence regions using the lspartition package.
See Cattaneo and Farrell (2013) and Cattaneo, Farrell and Feng (2019a) for complete details.
Companion commands: lsprobust for partitioning-based least squares regression estimation and inference; lsplincom for multiple sample estimation and inference.
A detailed introduction to this command is given in Cattaneo, Farrell and Feng (2019b).
For more details, and related Stata and R packages useful for empirical analysis, visit https://sites.google.com/site/nppackages/.
lsprobust.plot(..., alpha = NULL, type = NULL, CS = "ci", CStype = NULL,
               title = "", xlabel = "", ylabel = "", lty = NULL, lwd = NULL,
               lcol = NULL, pty = NULL, pwd = NULL, pcol = NULL,
               CSshade = NULL, CScol = NULL, legendTitle = NULL,
               legendGroups = NULL)
... |
Objects returned by |
alpha |
Numeric scalar between 0 and 1, the significance level for plotting confidence regions. If more than one is provided, they will be applied to data series accordingly. |
type |
String, one of |
CS |
String, type of confidence sets. Options are |
CStype |
String, one of |
title |
String, title of the plot. |
xlabel |
String, label for the x-axis. |
ylabel |
String, label for the y-axis. |
lty |
Line type for point estimates, only effective if |
lwd |
Line width for point estimates, only effective if |
lcol |
Line color for point estimates, only effective if |
pty |
Scatter plot type for point estimates, only effective if |
pwd |
Scatter plot size for point estimates, only effective if |
pcol |
Scatter plot color for point estimates, only effective if |
CSshade |
Numeric, opaqueness of the confidence region, should be between 0 (transparent) and 1. Default is 0.2. If more than one is provided, they will be applied to data series accordingly. |
CScol |
Color for confidence region. |
legendTitle |
String, title of legend. |
legendGroups |
String vector, group names used in legend. |
Companion command: lsprobust for partitioning-based least squares regression estimation.
A standard ggplot2 object is returned, so it can be customized further with the usual ggplot2 tools.
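Since the return value is a ggplot2 object, ordinary ggplot2 layers and themes can be added to it. The following is a minimal sketch assuming both lspartition and ggplot2 are installed; the title, axis labels, and the dashed reference line at 0.5 are arbitrary choices for illustration.

```r
set.seed(3)
x <- runif(500)
y <- sin(4 * x) + rnorm(500)

if (requireNamespace("lspartition", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  est <- lspartition::lsprobust(y, x)
  p <- lspartition::lsprobust.plot(est, title = "Partitioning-based fit",
                                   xlabel = "x", ylabel = "E[y | x]")
  # Add a theme and a reference line on top of the returned plot object.
  p <- p + ggplot2::theme_minimal() +
    ggplot2::geom_vline(xintercept = 0.5, linetype = "dashed")
  print(p)
}
```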
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019a): Large Sample Properties of Partitioning-Based Series Estimators. Annals of Statistics, forthcoming. arXiv:1804.04916.
Cattaneo, M. D., M. H. Farrell, and Y. Feng (2019b): lspartition: Partitioning-Based Least Squares Regression. R Journal, forthcoming. arXiv:1906.00202.
lsprobust, lspkselect, lsplincom, ggplot2.
x <- runif(500)
y <- sin(4 * x) + rnorm(500)
est <- lsprobust(y, x)
lsprobust.plot(est)