Software
Statistical Theory and Methods
-
COAP: COAP is a package that implements covariate-augmented overdispersed Poisson factor models. Existing Poisson factor models often assume that the factors are unknown, which overlooks the explanatory potential of certain observable covariates. The focus is on high-dimensional settings, where the number of count response variables and/or covariates can diverge as the sample size increases. A covariate-augmented overdispersed Poisson factor model is proposed to jointly perform a high-dimensional Poisson factor analysis and estimate a large coefficient matrix for overdispersed count data. Please see the following paper for details: Wei Liu, Qingzhi Zhong* (2024). High-dimensional covariate-augmented Overdispersed Poisson factor models. Biometrics, 80(2), ujae031.
-
TOSI: TOSI is a package that provides a general framework for two-directional simultaneous inference (TOSI) in high-dimensional as well as fixed-dimensional models with manifest-variable or latent-variable structure, such as high-dimensional mean models, high-dimensional sparse regression models, and high-dimensional latent factor models. Please see the following paper for details: Wei Liu, Huazhen Lin*, Jin Liu & Shurong Zheng (2023). Two-directional simultaneous inference for high-dimensional models. Journal of Business & Economic Statistics, 42(1), 298-309.
-
GFM: This is an R package for ultra-high dimensional reduction and nonlinear feature extraction via a generalized factor model that can handle ultra-high dimensional variables with mixed types. A GFM MATLAB toolbox is also provided and runs faster in the MATLAB environment. Please see the following paper and package website for details: Wei Liu, Huazhen Lin*, Shurong Zheng & Jin Liu (2023). Generalized factor model for ultra-high dimensional correlated variables with mixed types. Journal of the American Statistical Association, 118(542), 1385-1401. and GFM’s usage and applications.
-
ILSE: This is an R package for Iterative Least Square Estimation or Full Information Maximum Likelihood estimation and inference for linear regression when data include missing values. Please see the following paper and package website for details: Huazhen Lin*, Wei Liu, & Wei Lan (2021). Regression Analysis with individual-specific patterns of missing covariates. Journal of Business & Economic Statistics, 39(1), 179-188. and ILSE’s usage and applications.
-
nptvcmPD: This is an R package simplified from NonParametric Time-Varying Coefficient Model for Panel Data (nptvcmPD). It implements a penalty-based nonparametric time-varying coefficient model for longitudinal data with pre-specified finite time points, also known as panel data. Please see the following paper for details: Lin, H., Hong, H. G., Yang, B., Liu, W., Zhang, Y., Fan, G. Z., & Li, Y*. (2019). Nonparametric Time-Varying Coefficient Models for Panel Data. Statistics in Biosciences, 11(3), 548-566.
Statistical Genomics
-
ProFAST: ProFAST is an R package that incorporates the FAST and CoFAST methods and is designed specifically for the comprehensive analysis of multiple spatially resolved transcriptomics (SRT) datasets. Recently, we developed the coembedding method CoFAST, which simultaneously estimates the cell/spot embeddings and the feature embeddings in the same space. This coembedding space benefits signature gene identification and visualization. Please see the following paper and package website for details: Wei Liu, Xiao Zhang, Xiaoran Chai, Zhenqian Fan, Huazhen Lin, Jinmiao Chen, Lei Sun, Tianwei Yu, Joe Yeong, and Jin Liu*. Fast: a fast and scalable factor analysis for spatially aware dimension reduction of multi-section spatial transcriptomics data. and ProFAST’s usage and applications.
-
iSC.MEB: iSC.MEB is an R package for integrating and analyzing multiple spatially resolved transcriptomics (SRT) datasets, which lets users simultaneously estimate the batch effect and perform spatial clustering for low-dimensional representations of multiple SRT datasets. Please see the following paper and package website for details: Xiao Zhang, Wei Liu, Fangda Song, Jin Liu, iSC.MEB: an R package for multi-sample spatial clustering analysis of spatial transcriptomics data, Bioinformatics Advances, 2023, vbad019. and iSC.MEB’s usage and applications.
-
PRECAST: PRECAST is an R package for integrating and analyzing multiple spatially resolved transcriptomics (SRT) datasets. It unifies spatial factor analysis with spatial clustering and embedding alignment in a single procedure, requiring only partially shared cell/domain clusters across datasets. Please see the following paper and package website for details: Wei Liu, Liao, X., Luo, Z. et al. Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST. Nature Communications, 14, 296 (2023). and PRECAST’s usage and applications.
-
DR.SC: This is an R package for joint dimension reduction and spatial clustering of single-cell RNA sequencing and spatial transcriptomics data. It is computationally efficient, scales with increasing sample size, and can also choose the smoothness parameter and the number of clusters. Please see the following paper and package website for details: Wei Liu, Xu Liao, Yi Yang, Huazhen Lin, Joe Yeong, Xiang Zhou*, Xingjie Shi* & Jin Liu* (2022). Joint dimension reduction and clustering analysis of single-cell RNA-seq and spatial transcriptomics data, Nucleic Acids Research, gkac219. and DR.SC’s usage and applications.