Integrate multiple SRT datasets using a PRECASTObj obtained from PRECAST model fitting.

IntegrateSpaData(PRECASTObj, species="Human", 
                 custom_housekeep=NULL, covariates_use=NULL,
                 seuList=NULL, subsample_rate=1, sample_seed=1)



a PRECASTObj object obtained after PRECAST model fitting and model selection.


an optional string, one of 'Human', 'Mouse' and 'Unknown', specifying the species of the SRT data to help choose the housekeeping genes. 'Unknown' means that only the PRECAST results are used to reconstruct the aligned gene expression.
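The choice between the housekeeping-gene route and the PRECAST-only route can be pictured with the minimal base R sketch below. It is not the package's code: `use_housekeeping()` and the gene names are hypothetical, the threshold of 5 overlapping genes is taken from the message printed in the example output, and the case-insensitive species check is an assumption.

```r
# Hypothetical helper, not part of PRECAST: decide whether housekeeping genes
# can be used, based on the species and the overlap with the gene list.
use_housekeeping <- function(species, genes, housekeeping, min_overlap = 5) {
  # Assumption: the species string is compared case-insensitively.
  if (tolower(species) == "unknown") {
    return(FALSE)
  }
  # Fewer than `min_overlap` shared genes falls back to PRECAST results only.
  length(intersect(genes, housekeeping)) >= min_overlap
}

# Illustrative gene names only; these are not a real housekeeping list.
genes        <- paste0("gene", 1:20)
housekeeping <- paste0("gene", c(1, 2, 3, 4, 5, 101, 102))

hk_human   <- use_housekeeping("Human", genes, housekeeping)
hk_unknown <- use_housekeeping("unknown", genes, housekeeping)
hk_sparse  <- use_housekeeping("Human", genes[6:20], housekeeping)
```

In this sketch only the first call takes the housekeeping route; the other two fall back to the PRECAST-only reconstruction.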


user-specified housekeeping genes.


a string vector, the colnames in `PRECASTObj@seulist[[1]]`, representing other biological covariates to be considered when removing batch effects. This is achieved by adding covariates for biological conditions, such as case or control, to the regression. Defaults to `NULL`, meaning that no other covariates are considered.
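To illustrate why a biological covariate is added to the regression, here is a minimal base R sketch on simulated data. It is not PRECAST's estimator: it uses an ordinary linear model, and all variable names and effect sizes are made up for illustration. The point is that the batch term is removed while the case/control difference is kept.

```r
set.seed(1)
n <- 200
batch     <- factor(rep(c("b1", "b2"), each = n / 2))
condition <- factor(rep(c("case", "control"), times = n / 2))

# Simulated expression of one gene: a batch shift of 1.5 and a
# biological case/control difference of 2, plus small noise.
y <- 1.5 * (batch == "b2") + 2 * (condition == "case") + rnorm(n, sd = 0.1)

# Include the biological covariate in the regression so that its effect
# is estimated separately from the batch effect.
fit <- lm(y ~ batch + condition)

# Subtract only the estimated batch term; the condition effect is kept.
y_corrected <- y - coef(fit)["batchb2"] * (batch == "b2")

batch_gap     <- abs(diff(tapply(y_corrected, batch, mean)))
condition_gap <- abs(diff(tapply(y_corrected, condition, mean)))
```

After the correction, `batch_gap` is close to zero while `condition_gap` stays near the simulated biological difference.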


an optional Seurat list object that determines which genes are used for integration. If `seuList` is not NULL, the integration directly uses the genes in `seuList`. If `seuList` is NULL and `PRECASTObj@seuList` is not NULL, `seuList` takes the value of `PRECASTObj@seuList`, and the genes in it are used for integration. If both `seuList` and `PRECASTObj@seuList` are NULL, the integration uses the genes in `PRECASTObj@seulist`, i.e., the variable genes. To keep `PRECASTObj@seuList` non-NULL, users can set `rawData.preserve=TRUE` when running `CreatePRECASTObject`. This parameter lets users integrate the entire set of genes in `seuList` rather than only the variable genes in `PRECASTObj@seulist`.
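The order of precedence described for `seuList` can be summarized in a minimal base R sketch. The helper `resolve_gene_source()` is hypothetical and not part of PRECAST, and plain lists of gene names stand in for the Seurat lists stored in the `seuList` and `seulist` slots.

```r
# Hypothetical helper, not part of PRECAST: pick the data source whose genes
# are used for integration, following the precedence described above.
resolve_gene_source <- function(seuList = NULL, obj_seuList = NULL,
                                obj_seulist = NULL) {
  if (!is.null(seuList)) {
    # A user-supplied seuList always wins.
    return(list(source = "seuList argument", data = seuList))
  }
  if (!is.null(obj_seuList)) {
    # Raw data preserved in the object (rawData.preserve=TRUE).
    return(list(source = "PRECASTObj@seuList", data = obj_seuList))
  }
  # Fall back to the variable genes kept for model fitting.
  list(source = "PRECASTObj@seulist", data = obj_seulist)
}

# Plain character vectors stand in for Seurat objects in this sketch.
raw_genes <- list(sample1 = c("g1", "g2", "g3", "g4"))
var_genes <- list(sample1 = c("g1", "g3"))

res_default <- resolve_gene_source(NULL, NULL, var_genes)
res_raw     <- resolve_gene_source(NULL, raw_genes, var_genes)
res_user    <- resolve_gene_source(list(sample1 = "g2"), raw_genes, var_genes)
```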


an optional real number between zero and one that specifies the subsampling rate used during integration to improve computational efficiency. Defaults to 1 (no subsampling).


an optional integer, with a default value of 1, that sets the random seed used when `subsample_rate` is less than one, ensuring that the sampling is reproducible.
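How `subsample_rate` and `sample_seed` interact can be sketched in base R as follows. This is an illustration under assumptions rather than the package's implementation: `subsample_spots()` is a hypothetical helper, and the rounding rule (floor, with at least one spot kept) is a choice made for this sketch.

```r
# Hypothetical helper, not part of PRECAST: return the indices of the spots
# kept when subsampling at a given rate with a fixed seed.
subsample_spots <- function(n_spots, subsample_rate = 1, sample_seed = 1) {
  stopifnot(subsample_rate > 0, subsample_rate <= 1)
  if (subsample_rate == 1) {
    return(seq_len(n_spots))  # no subsampling: keep every spot
  }
  set.seed(sample_seed)       # same seed, same subsample
  n_keep <- max(1L, floor(n_spots * subsample_rate))
  sort(sample.int(n_spots, n_keep))
}

idx_a   <- subsample_spots(1000, subsample_rate = 0.2, sample_seed = 1)
idx_b   <- subsample_spots(1000, subsample_rate = 0.2, sample_seed = 1)
idx_all <- subsample_spots(10)
```

Calling the helper twice with the same seed returns the same indices, which is the reproducibility that `sample_seed` is meant to provide.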




Return a Seurat object that integrates all SRT data batches into a single SRT dataset, where the column "batch" in the meta.data represents the batch ID, and the column "cluster" represents the clusters obtained by PRECAST.


Liu, W., Liao, X., Luo, Z., et al., Liu, J.* (2023). Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST. Nature Communications, 14, 296.

Gagnon-Bartsch, J. A., Jacob, L., & Speed, T. P. (2013). Removing unwanted variation from high dimensional data with negative controls. Berkeley: Tech Reports from Dep Stat Univ California, 1-112.


Wei Liu



See also



  PRECASTObj <- SelectModel(PRECASTObj)
  seuInt <- IntegrateSpaData(PRECASTObj, species='unknown')
#> Using only PRECAST results to obtain the batch corrected gene expressions since species is unknown or the genelist in PRECASTObj has less than 5 overlapp with the housekeeping genes of given species.
#> Start integration...
#> 2024-01-24 14:35:44 : ***** Data integration finished!, 0.001 mins elapsed.
#> Put the data into a new Seurat object...
#> 2024-01-24 14:35:45 : ***** New Seurat object is generated!, 0.004 mins elapsed.