Select the number of study-shared and study-specified factors for the high-dimensional multi-study multi-modality covariate-augmented generalized factor model.

selectFac.MMGFM(
  XList,
  ZList,
  numvarmat,
  q.max = 15,
  qsvec.max = rep(4, length(XList)),
  threshold.vec = c(0.01, 0.001),
  tauList = NULL,
  init = c("MSFRVI", "random", "LFM"),
  epsELBO = 1e-12,
  maxIter = 30,
  verbose = TRUE,
  seed = 1
)

Arguments

XList

a list of length S, one component per source/study. Each component is itself a list of length m, where m is the number of modality types; each element is the observed matrix formed by combining all modalities of the same type for that study.
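
The nested layout described above can be sketched in base R. This is only an illustration under stated assumptions: the sample sizes and variable counts are borrowed from the Examples section, the `draw` helper list is hypothetical, and it is assumed that modalities of the same type are combined column-wise, so that each matrix has one row per sample and one column per variable of that type.

```r
set.seed(1)
nvec <- c(100, 120, 100)   # sample size of each of the S = 3 studies
pveclist <- list(gaussian = 150, poisson = c(50, 50), binomial = c(60, 60))

# Hypothetical generators, one per modality type (placeholder data only)
draw <- list(
  gaussian = function(n, p) matrix(rnorm(n * p), n, p),
  poisson  = function(n, p) matrix(rpois(n * p, lambda = 1), n, p),
  binomial = function(n, p) matrix(rbinom(n * p, size = 1, prob = 0.5), n, p)
)

# Outer list: one component per study; inner list: one matrix per modality type.
# Assumption: each combined matrix has sum(p) columns for its type.
XList <- lapply(nvec, function(n) {
  lapply(seq_along(pveclist), function(k) {
    draw[[names(pveclist)[k]]](n, sum(pveclist[[k]]))
  })
})

length(XList)          # number of studies, S
length(XList[[1]])     # number of modality types, m
dim(XList[[2]][[2]])   # study 2, poisson type: n_2 rows, 50 + 50 columns
```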

ZList

a list of length S; each component is the covariate matrix of the corresponding study.

numvarmat

an m-by-T matrix whose row names are the modality types. It specifies the number of variables in each modality of each modality type, where m is the number of modality types and T is the maximum number of modalities within any one modality type.
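
As a minimal sketch of what such a matrix might look like, the following builds one from the per-type variable counts used in the Examples section. Padding unused entries with 0 when a type has fewer than T modalities is an assumption made here for illustration; in practice the matrix returned by `gendata_mmgfm` (as `datlist$numvarmat`) is the reference.

```r
# Variable counts per modality, grouped by modality type (from the Examples)
pveclist <- list(gaussian = rep(150, 1), poisson = rep(50, 2), binomial = rep(60, 2))

m <- length(pveclist)            # number of modality types
Tmax <- max(lengths(pveclist))   # maximum number of modalities in any one type

# Assumption: entries for non-existent modalities are padded with 0
numvarmat <- matrix(0, nrow = m, ncol = Tmax,
                    dimnames = list(names(pveclist), NULL))
for (k in seq_len(m)) {
  numvarmat[k, seq_along(pveclist[[k]])] <- pveclist[[k]]
}

numvarmat
rowSums(numvarmat)   # total number of variables per modality type
```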

q.max

an optional integer specifying the upper bound for the number of study-shared factors; the default is 15.

qsvec.max

an optional integer vector of length S specifying the upper bound for the number of study-specified factors in each study; the default is 4 for each study.

threshold.vec

an optional real vector of length 2 specifying the thresholds applied to the singular values of the study-shared loading matrix and the study-specified loading matrices, respectively; the default is c(0.01, 0.001).
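
The exact selection rule is not spelled out in this help page. The sketch below shows one plausible reading of singular-value thresholding, using base R only: fit with an over-specified number of factors, then count the singular values of the loading matrix that exceed the threshold. Scaling the singular values by the largest one is an assumption of this sketch, not a documented detail of `selectFac.MMGFM`, and the matrices `A_true` and `A_hat` are simulated stand-ins.

```r
set.seed(1)
p <- 50        # number of variables
q_true <- 3    # true number of factors in this toy example
q_max <- 6     # over-specified upper bound, playing the role of q.max

# Stand-in for an estimated loading matrix: 3 informative columns
# plus 3 columns that are numerically negligible
A_true <- matrix(rnorm(p * q_true), p, q_true)
A_hat <- cbind(A_true,
               matrix(rnorm(p * (q_max - q_true), sd = 1e-4), p, q_max - q_true))

d <- svd(A_hat)$d
ratio <- d / d[1]          # assumption: compare on a relative scale
threshold <- 0.01          # first entry of the default threshold.vec
q_sel <- sum(ratio > threshold)

round(ratio, 5)
q_sel
```

Under this reading, a smaller threshold retains more factors, so the two entries of threshold.vec control how aggressively weak shared and study-specified factors are discarded.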

tauList

an optional list of length S; each component is a list of length m giving the offset term for each combined modality matrix of that study. By default (NULL), all-zero matrices are used.

init

an optional string specifying the initialization method; one of "MSFRVI", "random" and "LFM". The default is "MSFRVI".

epsELBO

an optional positive value, the tolerance for the relative change of the evidence lower bound (ELBO) between iterations; the default is 1e-12.

maxIter

the maximum number of iterations of the VEM algorithm; the default is 30.

verbose

a logical value indicating whether to print progress information at each iteration; the default is TRUE.

seed

an optional integer specifying the random seed, for reproducibility of the initialization; the default is 1.

Value

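A list with two components: q, the selected number of study-shared factors, and qs.vec, the vector of selected numbers of study-specified factors, one entry per study.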

Examples

q <- 3
qsvec <- rep(2, 3)
nvec <- c(100, 120, 100)
pveclist <- list('gaussian' = rep(150, 1), 'poisson' = rep(50, 2), 'binomial' = rep(60, 2))
datlist <- gendata_mmgfm(seed = 1, nvec = nvec, pveclist = pveclist,
                         q = q, d = 3, qs = qsvec, rho = rep(3, length(pveclist)), rho_z = 0.5,
                         sigmavec = rep(0.5, length(pveclist)), sigma_eps = 1)
XList <- datlist$XList
ZList <- datlist$ZList
numvarmat <- datlist$numvarmat
### For illustration, we set maxIter = 3. Set maxIter = 50 for a formal run.
selectFac.MMGFM(XList, ZList = ZList, numvarmat, q.max = 6, qsvec.max = rep(4, 3),
                init = 'MSFRVI', maxIter = 3)
#> Data include 3 studies/sources
#> variables belong to 3 types: gaussian, poisson, binomial
#> Initialization...
#> Initialization using MSFR...
#> Finish Initialization!
#> Initialize the paramters related to s!
#> Initialize the paramters not related to s!
#> iter = 2, ELBO= 2052314215.845796, dELBO=1.000000 
#> iter = 3, ELBO= 2052329561.726041, dELBO=0.000007 
#> Data include 3 studies/sources
#> variables belong to 3 types: gaussian, poisson, binomial
#> Initialization...
#> Initialization using MSFR...
#> Finish Initialization!
#> Initialize the paramters related to s!
#> Initialize the paramters not related to s!
#> iter = 2, ELBO= 2052316646.430007, dELBO=1.000000 
#> iter = 3, ELBO= 2052331255.119864, dELBO=0.000007 
#> $q
#> [1] 3
#> 
#> $qs.vec
#> [1] 2 2 2
#>