Fits a linear regression when the covariates contain missing values, using Iterative Least Squares Estimation (ILSE) to exploit the correlation among covariates.

ilse(...)

# S3 method for formula
ilse(formula, data=NULL, bw=NULL, k.type=NULL, method="Par.cond", ...)

# S3 method for numeric
ilse(Y, X, bw=NULL, k.type=NULL, method="Par.cond", max.iter=20,
  peps=1e-5, feps=1e-7, arma=TRUE, verbose=FALSE, ...)

Arguments

...

Arguments passed to other methods.

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under 'Details'.

Y

a numeric vector, the response variable.

X

a numeric matrix that may include NAs, the covariate matrix.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which ilse is called.

bw

a positive value specifying the bandwidth used when estimating missing values; defaults to NULL, in which case the bandwidth is selected automatically by an empirical method.

k.type

an optional character string specifying the kernel used in the iterative estimation algorithm. The current version supports 'epk', 'biweight', 'triangle', 'gaussian', 'triweight', 'tricube', 'cosine' and 'uniform'; the default is 'gaussian'.

method

an optional character string specifying the iterative algorithm; the current version supports 'Par.cond' (the default) and 'Full.cond'.

max.iter

an optional positive integer, the maximum number of iterations; defaults to 20.

peps

an optional positive value, the tolerance on the relative variation rate of the estimated parameter vector; defaults to 1e-5.

feps

an optional positive value, the tolerance on the relative variation rate of the objective function value; defaults to 1e-7.

arma

an optional logical value indicating whether to use Armadillo and Rcpp to speed up computation; defaults to TRUE.

verbose

an optional logical value indicating whether to print information about each iteration; defaults to FALSE.
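The peps and feps tolerances bound a "relative variation rate" between successive iterations. The exact formula used inside ilse is not given here, so the following is only a sketch under an assumed definition (Euclidean norm of the change divided by the norm of the previous value). The helper rel_change and the example vectors are hypothetical and not part of the package.

```r
# Hypothetical helper: relative change between two successive estimates.
# ASSUMPTION: ilse may define its "relative variation rate" differently.
rel_change <- function(new, old, tiny = 1e-12) {
  sqrt(sum((new - old)^2)) / max(sqrt(sum(old^2)), tiny)
}

# Two made-up successive coefficient vectors.
beta_old <- c(2.0, -0.10, 0.60)
beta_new <- c(2.0, -0.10, 0.60 + 1e-8)

# Stop once the relative change falls below the parameter tolerance.
peps <- 1e-5
converged <- rel_change(beta_new, beta_old) < peps
print(converged)
```

Under this assumed definition, a smaller peps or feps demands a tighter agreement between iterations before stopping, and max.iter caps the loop if that agreement is never reached.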

Details

Models for ilse are specified symbolically. A typical model has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.
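The formula operators described above can be inspected with base R alone, without fitting anything. The sketch below expands a few formulas into their term labels; the helper labels_of and the variable names y, a and b are placeholders chosen for illustration.

```r
# Hypothetical helper: list the terms a formula expands to (base R only).
labels_of <- function(f) attr(terms(f), "term.labels")

# first + second: union of terms, duplicates removed.
plus_terms <- labels_of(y ~ a + b + a)

# first:second: the interaction term only.
inter_terms <- labels_of(y ~ a:b)

# first*second: main effects plus the interaction.
cross_terms <- labels_of(y ~ a * b)

print(plus_terms)
print(inter_terms)
print(cross_terms)

# a*b expands to the same terms as a + b + a:b.
same <- identical(cross_terms, labels_of(y ~ a + b + a:b))
print(same)
```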

Value

ilse returns an object of class "ilse".

The functions summary and anova are used to obtain and print a summary and analysis of variance table of the results. The generic accessor functions coefficients, effects, fitted.values and residuals extract various useful features of the value returned by ilse.

An object of class "ilse" is a list containing at least the following components:

beta

a named vector of coefficients

hX

an imputed design matrix

d.fn

a nonnegative value, the relative variation rate of the objective function value when the algorithm stopped.

d.par

a nonnegative value, the relative variation rate of the estimated parameter vector when the algorithm stopped.

iterations

a positive integer, the total number of iterations performed.

residuals

the residuals, that is, the response minus the fitted values.

fitted.values

the fitted mean values.

inargs

a list including all input arguments.
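The components listed above can be read directly from the returned list. The following is a minimal sketch of the numeric interface, assuming that ilse is provided by an installed package named ILSE (the package name is an assumption, it is not stated on this page). The simulated data and the column names x1 and x2 are hypothetical, and the call is skipped when the package is unavailable.

```r
# Simulate a small data set with two correlated covariates (made-up values).
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.5)
X  <- cbind(x1 = x1, x2 = x2)
Y  <- as.numeric(1 + 2 * x1 - x2 + rnorm(n, sd = 0.1))

# Introduce missing values in one covariate.
X[sample(n, 5), "x2"] <- NA

# ASSUMPTION: the package is called ILSE; skip the fit if it is not installed.
if (requireNamespace("ILSE", quietly = TRUE)) {
  fit <- ILSE::ilse(Y, X)   # numeric interface: response vector, covariate matrix
  print(fit$beta)           # estimated coefficients
  print(fit$iterations)     # number of iterations performed
  print(dim(fit$hX))        # dimensions of the imputed design matrix
}
```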

References

Huazhen Lin, Wei Liu, & Wei Lan (2021). Regression Analysis with individual-specific patterns of missing covariates. Journal of Business & Economic Statistics, 39(1), 179-188.

Author

Wei Liu

Examples

## example one: data with missing values
data(nhanes)
NAlm1 <- ilse(age~., data=nhanes, bw=1,
  method = 'Par.cond', k.type='gaussian', verbose = TRUE)
#> iter=2, d_fn=1.000000, d_par = 0.362223
#> iter=3, d_fn=0.001542, d_par = 0.001063
#> iter=4, d_fn=0.000009, d_par = 0.000021
#> iter=5, d_fn=0.000000, d_par = 0.000000
print(NAlm1)
#> $beta
#> (Intercept)         bmi         hyp         chl
#>  2.06942664 -0.11656313  0.62109971  0.01056364
#>
#> $d.fn
#> [1] 6.654795e-08
#>
#> $d.par
#> [1] 1.096466e-07
#>
#> $iterations
#> [1] 5
#>
NAlm2 <- ilse(age~., data=nhanes, method = 'Full.cond')
print(NAlm2)
#> $beta
#> (Intercept)         bmi         hyp         chl
#>  2.06942664 -0.11656313  0.62109971  0.01056364
#>
#> $d.fn
#> [1] 6.654795e-08
#>
#> $d.par
#> [1] 1.096466e-07
#>
#> $iterations
#> [1] 5
#>

## example two: no missing values
n <- 100
group <- rnorm(n, sd=4)
weight <- 3.2*group + 1.5 + rnorm(n, sd=0.1)
NAlm3 <- ilse(weight~group, data=data.frame(weight=weight, group=group),
  intercept = FALSE)
print(NAlm3)
#> $beta
#> (Intercept)       group
#>    1.518525    3.197213
#>
#> $d.fn
#> [1] 1
#>
#> $d.par
#> [1] 1.986027e-15
#>
#> $iterations
#> [1] 2
#>