PRECAST: a probabilistic embedding and clustering with alignment for spatial transcriptomics data integration.

PRECAST is a package for integrating and analyzing multiple spatially resolved transcriptomics (SRT) datasets, developed by Jin Liu’s lab. It unifies spatial factor analysis with spatial clustering and embedding alignment, and requires only partially shared cell/domain clusters across datasets.

Check out our bioRxiv paper for a more complete description of the methods and analyses.

PRECAST can be used to compare and contrast experimental datasets in a variety of contexts, for instance:

  • Across experimental batches
  • Across individuals
  • Across different conditions (e.g., case and control)
  • Across datasets with only partially shared cell/domain clusters

Once multiple datasets are integrated, the package provides functionality for further data exploration, analysis, and visualization. Users can:

  • Identify clusters using all data information
  • Extract aligned low-dimensional embeddings across datasets
  • Recover comparable gene expression matrices among datasets
  • Find significant shared (and dataset-specific) gene markers
  • Perform conditional spatially variable gene analysis
  • Compare clusters with previously identified domain/cell types
  • Visualize extracted embeddings using 3-dim tSNE and UMAP
  • Visualize clusters and gene expression using tSNE and UMAP


“PRECAST” depends on the ‘Rcpp’ and ‘RcppArmadillo’ packages, which require a system that is set up to compile C++ code. For users whose systems are already set up to compile C++ files, the following installation commands will work.

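# Method 1: install PRECAST from CRAN
install.packages("PRECAST")

# Method 2: install PRECAST from GitHub
if (!require("remotes", quietly = TRUE))
    install.packages("remotes")
# replace <owner> with the GitHub account that hosts the PRECAST repository
remotes::install_github("<owner>/PRECAST")

# If some dependent packages on Bioconductor (such as `scater`) cannot be installed normally, run the following commands, then rerun the command above.
if (!require("BiocManager", quietly = TRUE)) ## install BiocManager
    install.packages("BiocManager")
# install the package from Bioconductor
BiocManager::install("scater")
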

For usage examples and guided walkthroughs, check the vignettes directory of the repo.

For users whose systems are not yet set up, the following setup instructions for different systems can be used as a reference.

Setup on Windows system

First, download Rtools; second, add the Rtools directory to the PATH environment variable. Users can follow the guide here to add an entry to the Windows PATH environment variable.

Setup on MacOS system

First, install Xcode. Installation instructions for Xcode can be found here.

Second, install “gfortran”, which is needed for compiling C++ and Fortran code, from here.
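
Once both steps are done, a quick check from the terminal can confirm that the build tools are visible on the PATH. The sketch below is only an illustration: the helper name `check_tools` is hypothetical, and the tool list (`clang++`, `gfortran`, `make`) is an assumption about what a typical macOS setup for compiling Rcpp code provides.

```shell
#!/bin/sh
# Report whether each named build tool is on PATH.
# Prints one "found"/"missing" line per tool and returns non-zero if any is missing.
# (check_tools is a hypothetical helper, not part of PRECAST.)
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Assumed tool list for compiling C++ and Fortran code on macOS.
check_tools clang++ gfortran make || echo "Install the missing tools before installing PRECAST."
```

If a tool is reported as missing even though it was installed, the directory containing it is probably not on the PATH of the current shell.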

Setup on Linux system

If you use a conda environment on a Linux system and some dependent packages (such as scater) cannot be installed normally, you can search for the R package on the website. We take the scater package as an example: once you have found it in the search results, you can install it in the conda environment with the following command.

conda install -c bioconda bioconductor-scater
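
The package name in the command above follows the usual bioconda naming convention, in which a Bioconductor package is published as `bioconductor-` followed by its lowercase name. As a small sketch under that assumption, the hypothetical helper below builds the conda name for any Bioconductor package and prints the matching `conda search` command, so the name can be verified before installing.

```shell
#!/bin/sh
# Build the conda package name for a Bioconductor package, assuming the
# bioconda convention "bioconductor-<lowercase package name>".
# (conda_name_for_bioc is a hypothetical helper, not part of conda or PRECAST.)
conda_name_for_bioc() {
    printf 'bioconductor-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Print the search command rather than running it, so this works without conda.
echo "conda search -c bioconda $(conda_name_for_bioc scater)"
```

Running the printed command lists the versions available on the bioconda channel; if nothing is found, the package may not follow the convention or may not be published there.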

For users not using a conda environment, if the dependent packages that cannot be installed normally (such as scater) are on Bioconductor, use the following commands to install them.

# install BiocManager
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# install the package from Bioconductor
BiocManager::install("scater")

If the dependent packages that cannot be installed normally (such as DR.SC) are on CRAN, use the following command to install them.

# install the package from CRAN
install.packages("DR.SC")

Other notes

When running large datasets, users may encounter the R error “C stack usage is too close to the limit”. In that case, use the following system command in the shell, before starting R, to make the C stack size unlimited.

ulimit -s unlimited
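
On some systems the command above is refused because the hard stack limit is lower than unlimited. The following is a minimal sketch of one way to handle that case: it prints the current limits and raises the soft limit as far as the hard limit allows. The helper name `raise_stack_limit` and the script name `my_analysis.R` are hypothetical, and the `-S`/`-H` options assume a shell such as bash or dash.

```shell
#!/bin/sh
# Show the current soft and hard stack limits (a size in KiB, or "unlimited").
echo "soft stack limit: $(ulimit -S -s)"
echo "hard stack limit: $(ulimit -H -s)"

# Try to lift the soft limit to unlimited; if that is refused,
# fall back to the largest value permitted, which is the hard limit.
# (raise_stack_limit is a hypothetical helper.)
raise_stack_limit() {
    ulimit -S -s unlimited 2>/dev/null || ulimit -S -s "$(ulimit -H -s)"
}

# Example usage: run in a subshell so only the R session started there is affected.
# my_analysis.R is a hypothetical script name.
# ( raise_stack_limit && Rscript my_analysis.R )
```

The limit applies only to processes started from the shell in which it was changed, so R has to be launched from that same shell.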


For an example of typical PRECAST usage, please see our Package Website for a demonstration and overview of the functions included in PRECAST.