Generate simulated data from the covariate-augmented generalized factor model

gendata_cmgfm(
  seed = 1,
  n = 300,
  pveclist = list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60)),
  q = 6,
  d = 3,
  rho = rep(1, length(pveclist)),
  rho_z = 1,
  sigmavec = rep(0.5, length(pveclist)),
  n_bin = 1,
  sigma_eps = 1,
  seed.para = 1
)

Arguments

seed

a positive integer, the random seed for reproducibility of the data generation process.

n

a positive integer, specifying the sample size.

pveclist

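a named list, specifying the number of modalities for each variable type and the number of variables in each modality. Each element is named by a variable type (e.g. gaussian, poisson, binomial) and holds a vector with one entry per modality of that type.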

q

a positive integer, specifying the number of modality-shared factors.

d

a positive integer, specifying the dimension of the covariate matrix.

rho

a numeric vector of length length(pveclist) with positive elements, specifying the signal strength of the loading matrices for the modalities of each variable type.

rho_z

a positive real, specifying the signal strength of the covariates.

sigmavec

a positive vector of length length(pveclist), the variance of the modality-specific latent factors.

n_bin

a positive integer, specifying the number of trials in the Binomial distribution.

sigma_eps

a positive real, the variance of the overdispersion error.

seed.para

a positive integer, the random seed used to fix the regression coefficient vectors and loading matrices, so that these parameters are reproducible across data generation runs.
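
As a rough illustration of how the pveclist layout described above can be read, the base R sketch below uses the same list as the usage defaults and does not call gendata_cmgfm. The helper names types, n_modalities and p_per_type are hypothetical and exist only for this illustration.

```r
# Same layout as the default `pveclist` in the usage section.
pveclist <- list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60))

# One list element per variable type.
types <- names(pveclist)

# Number of modalities of each type (length of each vector).
n_modalities <- lengths(pveclist)

# Total number of variables of each type (sum of each vector).
p_per_type <- vapply(pveclist, sum, numeric(1))

# The defaults for `rho` and `sigmavec` have one entry per variable type.
rho <- rep(1, length(pveclist))
sigmavec <- rep(0.5, length(pveclist))

print(n_modalities)
print(p_per_type)
```

With this list there are two Gaussian modalities, one Poisson modality and two binomial modalities, and the per-type totals (200, 50 and 160) agree with the column counts of the XList matrices shown in the Examples output.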

Value

A list with the following components:

  • XList - a list of matrices, in which each matrix holds values of a single type, i.e., continuous, count, or binomial/binary values;

  • Z - a matrix, the fixed-dimensional covariate matrix with control variables;

  • types - a character vector, the variable types of the matrices in XList;

  • Alist - the offset vector for each modality;

  • B0List - the true loading matrix for each modality;

  • mu0List - the true intercept vector for each modality;

  • U0List - the modality-specific factor vectors;

  • F0 - the modality-shared factor matrix;

  • Uplist - the true intercept-loading matrix for each modality;

  • beta0List - the true regression coefficient vector for each modality;

  • sigma_eps - the standard deviation of error term;

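  • numvarmat - a matrix with length(types) rows and one column per modality (up to the largest number of modalities of any type); each row gives the number of variables in the modalities of one type, padded with zeros. Note that the number of columns is not the covariate dimension d.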
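
Judging only from the Examples output on this page, numvarmat appears to hold the entries of pveclist row by row, with zeros filling unused modality slots. The base R sketch below reproduces that layout under this assumption. It is not the package's own code, and numvarmat_sketch and max_mod are hypothetical names.

```r
pveclist <- list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60))

# Largest number of modalities among the variable types.
max_mod <- max(lengths(pveclist))

# Start from zeros so that types with fewer modalities are zero-padded.
numvarmat_sketch <- matrix(
  0,
  nrow = length(pveclist),
  ncol = max_mod,
  dimnames = list(names(pveclist), NULL)
)

# Fill row k with the variable counts of the modalities of type k.
for (k in seq_along(pveclist)) {
  p <- pveclist[[k]]
  numvarmat_sketch[k, seq_along(p)] <- p
}

print(numvarmat_sketch)
```

For the default pveclist this gives a 3-by-2 matrix whose poisson row is c(50, 0), matching the numvarmat entry in the str() output.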

Details

None

References

None

See also

Examples

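# Sample size, modality layout, covariate dimension and number of shared factors
n <- 300
pveclist <- list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60))
d <- 20
q <- 6
# Simulate the data; all other arguments keep their defaults
datlist <- gendata_cmgfm(n = n, pveclist = pveclist, q = q, d = d)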
str(datlist)
#> List of 12
#>  $ XList    :List of 3
#>   ..$ : num [1:300, 1:200] 25.38 -2.37 7.17 -13.35 -8.38 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : NULL
#>   ..$ : num [1:300, 1:50] 330 2248349 237378 1 0 ...
#>   ..$ : int [1:300, 1:160] 0 1 1 0 1 0 0 1 0 0 ...
#>  $ Z        : num [1:300, 1:20] 1.574 0.563 1.317 -1.295 -0.35 ...
#>  $ types    : chr [1:3] "gaussian" "poisson" "binomial"
#>  $ Alist    :List of 1
#>   ..$ : num [1:300, 1:5] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ numvarmat: num [1:3, 1:2] 50 50 100 150 0 60
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:3] "gaussian" "poisson" "binomial"
#>   .. ..$ : NULL
#>  $ B0List   :List of 3
#>   ..$ :List of 2
#>   .. ..$ : num [1:50, 1:6] 0.296 -0.782 0.148 -0.398 -0.45 ...
#>   .. ..$ : num [1:150, 1:6] 0.199 0.111 -0.379 0.192 0.739 ...
#>   ..$ :List of 1
#>   .. ..$ : num [1:50, 1:6] 0.602 -0.134 -0.317 -0.697 -0.262 ...
#>   ..$ :List of 2
#>   .. ..$ : num [1:100, 1:6] 0.6966 -0.2756 -0.0865 0.0696 0.0452 ...
#>   .. ..$ : num [1:60, 1:6] 0.0186 -0.6655 -0.393 -0.5609 -0.486 ...
#>  $ mu0List  :List of 3
#>   ..$ : num [1:200] -0.0658 -0.1013 0.2788 0.2227 -0.2755 ...
#>   ..$ : num [1:50] -0.0249 -0.1412 -0.2353 0.3261 -0.132 ...
#>   ..$ : num [1:160] 0.373 0.827 0.179 0.386 -0.143 ...
#>  $ beta0List:List of 3
#>   ..$ : num [1:20, 1:2] -1.253 0.367 -1.671 3.191 0.659 ...
#>   ..$ : num [1:20, 1] 0.683 0.827 0.244 -3.179 -1.575 ...
#>   ..$ : num [1:20, 1:2] -0.79393 -0.00955 -2.1993 -0.69375 2.07231 ...
#>  $ Uplist   :List of 3
#>   ..$ :List of 2
#>   .. ..$ : num [1:50, 1:7] -0.0658 -0.1013 0.2788 0.2227 -0.2755 ...
#>   .. ..$ : num [1:150, 1:7] -0.217 0.483 0.464 0.28 0.635 ...
#>   ..$ :List of 1
#>   .. ..$ : num [1:50, 1:7] -0.0249 -0.1412 -0.2353 0.3261 -0.132 ...
#>   ..$ :List of 2
#>   .. ..$ : num [1:100, 1:7] 0.373 0.827 0.179 0.386 -0.143 ...
#>   .. ..$ : num [1:60, 1:7] 0.373 -0.1221 -0.6367 -0.0417 0.255 ...
#>  $ U0List   :List of 3
#>   ..$ : num [1:300, 1:2] -0.443 0.13 -0.591 1.128 0.233 ...
#>   ..$ : num [1:300, 1] -0.443 0.13 -0.591 1.128 0.233 ...
#>   ..$ : num [1:300, 1:2] -0.443 0.13 -0.591 1.128 0.233 ...
#>  $ F0       : num [1:300, 1:6] -0.626 0.184 -0.836 1.595 0.33 ...
#>  $ sigma_eps: num 1