Isaac Scientific Publishing

Journal of Advanced Statistics

Variable Selection for Additive Models with Missing Response at Random

PP. 1-9. Pub. Date: March 1, 2017

DOI: 10.22606/jas.2017.21001


  • Jian Wu
    College of Science, Northeastern University, Shenyang 110189, China
  • Junhua Zhang
    College of Mechanical Engineering, Beijing Information Science and Technology University, Beijing 100192, China
  • Gaorong Li*
    Beijing Institute for Scientific and Engineering Computing, Beijing University of Technology, Beijing 100124, China


This paper studies variable selection and estimation in additive models with responses missing at random. Based on a centered spline basis approximation of the component functions, we propose two new imputed estimating equation methods that perform variable selection in these models by means of smooth-threshold estimating equations. Both methods select the significant variables and estimate the unknown functions simultaneously. They avoid solving a convex optimization problem and also reduce the computational burden. With a proper choice of the regularization parameter, we show that the resulting estimators enjoy the oracle property. A data-driven method is used to choose the tuning parameter. A numerical study is conducted to confirm the performance of the proposed methods.


Additive model, smooth-threshold estimating equations, variable selection, missing data, oracle property.

