Efficient data integration as well as spatial clustering for multiple spatial transcriptomics data

ICM.EM_structure(XList,  K, AdjList, q=15,parameterList=NULL)

Arguments

XList

an M-length list consisting of multiple matrices with class dgCMatrix or matrix that specify the log-normalization gene expression matrix for each data sample used for PRECAST model.

K

a positive integer allowing scalar or vector, specify the number of clusters in model fitting.

AdjList

an M-length list of sparse matrices with class dgCMatrix, specify the adjacency matrix used for Potts model and Intrisic CAR model in PRECAST model. We provide this interface for those users who would like to define the adjacency matrix by their own.

q

a positive integer, specify the number of latent features to be extracted, default as 15.

parameterList

Other arguments in PRECAST model, it can be set by model_set.

Details

Nothing

Value

ICM.EM_structure returns a list with class "SeqK_PRECAST_Object" with the number of components equal to the length of K, where each component includes the model fitting results for one number of cluster and is a list consisting of following components:

cluster

an M-length list that includes the inferred class labels for each data sample.

hZ

an M-length list that includes the batch corrected low-dimensional embeddings for each data sample.

hV

an M-length list that includes the estimate the ICAR component for each sample.

Rf

an M-length list that includes the posterior probability of domain clusters for each sample.

beta

an M-length vector that includes the estimated smoothing parameters for each sample.

Mu

mean vectors of mixtures components.

Sigma

covariance matrix of mixtures components.

W

estimated loading matrix

Lam

estimated variance of errors in probabilistic PCA model

loglik

pseudo observed log-likelihood.

Author

Wei Liu

Note

nothing

See also

None

Examples

# \donttest{
  ## we generate the spatial transcriptomics data with lattice neighborhood, i.e. ST platform.
  library(Matrix)
  q <- 10; K <- 4
  data(PRECASTObj)
  posList <- lapply(PRECASTObj@seulist, function(x) cbind(x$row, x$col))
  AdjList <- lapply(posList, getAdj_reg, platform='ST')
#> Neighbors were identified for 900 out of 900 spots.
#> Neighbors were identified for 900 out of 900 spots.
  XList <- lapply(PRECASTObj@seulist, function(x) t(x[['RNA']]@data))
  XList <- lapply(XList, scale, scale=FALSE)
  parList <- model_set(maxIter=4)
  resList <- ICM.EM_structure(XList,  AdjList = AdjList, 
                   q=q, K=K, parameterList=parList)
#> Intergrative data info.: 2 samples, 97 genes X 1800 spots------
#> PRECAST model setting: error_heter=TRUE, Sigma_equal=FALSE, Sigma_diag=TRUE, mix_prop_heter=TRUE
#> Start computing intial values... 
#> fitting ...
#> 
  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |======================================================================| 100%
#> ----Fitting PRECAST model----------------
#> variable initialize finish! 
#> predict Y and V! 
#> diff Energy = 44.205532 
#> Finish ICM step! 
#> iter = 2, loglik= -59510.035156, dloglik=0.999972 
#> predict Y and V! 
#> diff Energy = 12.547775 
#> diff Energy = 3.939711 
#> Finish ICM step! 
#> iter = 3, loglik= -53027.710938, dloglik=0.108928 
#> predict Y and V! 
#> diff Energy = 7.640621 
#> diff Energy = 1.688112 
#> Finish ICM step! 
#> iter = 4, loglik= -50711.550781, dloglik=0.043678 
# }