Function for prediction at new locations for multi-season multi-species spatial occupancy models

The function predict collects posterior predictive samples for a set of new locations given an object of class `stMsPGOcc`. Prediction is possible for both the latent occupancy state as well as detection. Predictions are currently only possible for sampled primary time periods.

Usage

# S3 method for stMsPGOcc
predict(object, X.0, coords.0, t.cols, n.omp.threads = 1, 
                          verbose = TRUE, n.report = 100, 
                          ignore.RE = FALSE, type = 'occupancy', grid.index.0, ...)

Arguments

object: an object of class stMsPGOcc
X.0: the design matrix of covariates at the prediction locations. This should be a three-dimensional array, with dimensions corresponding to site, primary time period, and covariate, respectively. Note that the first covariate should consist of all 1s for the intercept if an intercept is included in the model. If random effects are included in the occupancy (or detection if type = 'detection') portion of the model, the levels of the random effects at the new locations/time periods should be included as an element of the three-dimensional array. The ordering of the levels should match the ordering used to fit the data in stMsPGOcc. The covariates should be organized in the same order as they were specified in the corresponding formula argument of stMsPGOcc. Names of the third dimension (covariates) of any random effects in X.0 must match the name of the random effects used to fit the model, if specified in the corresponding formula argument of stMsPGOcc. See example below.
coords.0: the spatial coordinates corresponding to X.0. Note that spOccupancy assumes coordinates are specified in a projected coordinate system.
t.cols: an indexing vector used to denote which primary time periods are contained in the design matrix of covariates at the prediction locations (X.0). The values should denote the specific primary time periods used to fit the model. The values should indicate the columns in data$y used to fit the model for which prediction is desired. See example below.
n.omp.threads: a positive integer indicating the number of threads to use for SMP parallel processing. The package must be compiled for OpenMP support. For most Intel-based machines, we recommend setting n.omp.threads up to the number of hyperthreaded cores. Note, n.omp.threads > 1 might not work on some systems.
verbose: if TRUE, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen.
ignore.RE: logical value that specifies whether or not to remove random unstructured occurrence (or detection if type = 'detection') effects from the subsequent predictions. If TRUE, random effects will be included. If FALSE, unstructured random effects will be set to 0 and predictions will only be generated from the fixed effects, the spatial random effects, and AR(1) random effects if the model was fit with ar1 = TRUE.
n.report: the interval to report sampling progress.
type: a quoted keyword indicating what type of prediction to produce. Valid keywords are 'occupancy' to predict latent occupancy probability and latent occupancy values (this is the default), or 'detection' to predict detection probability given new values of detection covariates.
grid.index.0: an indexing vector used to specify how each row in X.0 corresponds to the coordinates specified in coords.0. Only relevant if the spatial random effect was estimated at a higher spatial resolution (e.g., grid cells) than point locations.
...: currently no additional arguments

Note

When ignore.RE = FALSE, both sampled levels and non-sampled levels of unstructured random effects are supported for prediction. For sampled levels, the posterior distribution for the random intercept corresponding to that level of the random effect will be used in the prediction. For non-sampled levels, random values are drawn from a normal distribution using the posterior samples of the random effect variance, which results in fully propagated uncertainty in predictions with models that incorporate random effects.

Occurrence predictions at sites that are only sampled for a subset of the total number of primary time periods are obtained directly when fitting the model. See the psi.samples and z.samples portions of the output list from the model object of class stMsPGOcc.

Author

Jeffrey W. Doser doserjef@msu.edu,
Andrew O. Finley finleya@msu.edu

Value

A list object of class predict.stMsPGOcc. When type = 'occupancy', the list consists of:

psi.0.samples: a four-dimensional object of posterior predictive samples for the latent occupancy probability values with dimensions corresponding to posterior predictive sample, species, site, and primary time period.
z.0.samples: a three-dimensional object of posterior predictive samples for the latent occupancy values with dimensions corresponding to posterior predictive sample, species, site, and primary time period.
w.0.samples: a three-dimensional array of posterior predictive samples for the latent spatial factors with dimensions correpsonding to MCMC sample, latent factor, and site.

When type = 'detection', the list consists of:

p.0.samples: a four-dimensional object of posterior predictive samples for the detection probability values with dimensions corresponding to posterior predictive sample, site, and primary time period.

The return object will include additional objects used for standard extractor functions.

Examples

# Simulate Data -----------------------------------------------------------
set.seed(500)
J.x <- 8
J.y <- 8
J <- J.x * J.y
# Years sampled
n.time <- sample(3:10, J, replace = TRUE)
# n.time <- rep(10, J)
n.time.max <- max(n.time)
# Replicates
n.rep <- matrix(NA, J, max(n.time))
for (j in 1:J) {
  n.rep[j, 1:n.time[j]] <- sample(2:4, n.time[j], replace = TRUE)
  # n.rep[j, 1:n.time[j]] <- rep(4, n.time[j])
}
N <- 7
# Community-level covariate effects
# Occurrence
beta.mean <- c(-3, -0.2, 0.5)
trend <- FALSE
sp.only <- 0
p.occ <- length(beta.mean)
tau.sq.beta <- c(0.6, 1.5, 1.4)
# Detection
alpha.mean <- c(0, 1.2, -1.5)
tau.sq.alpha <- c(1, 0.5, 2.3)
p.det <- length(alpha.mean)
# Random effects
psi.RE <- list()
p.RE <- list()
# Draw species-level effects from community means.
beta <- matrix(NA, nrow = N, ncol = p.occ)
alpha <- matrix(NA, nrow = N, ncol = p.det)
for (i in 1:p.occ) {
  beta[, i] <- rnorm(N, beta.mean[i], sqrt(tau.sq.beta[i]))
}
for (i in 1:p.det) {
  alpha[, i] <- rnorm(N, alpha.mean[i], sqrt(tau.sq.alpha[i]))
}
sp <- TRUE
svc.cols <- c(1)
p.svc <- length(svc.cols)
n.factors <- 3
phi <- runif(p.svc * n.factors, 3 / .9, 3 / .3)
factor.model <- TRUE
cov.model <- 'exponential'
ar1 <- TRUE
sigma.sq.t <- runif(N, 0.05, 1)
rho <- runif(N, 0.1, 1)

dat <- simTMsOcc(J.x = J.x, J.y = J.y, n.time = n.time, n.rep = n.rep, N = N,
     beta = beta, alpha = alpha, sp.only = sp.only, trend = trend,
     psi.RE = psi.RE, p.RE = p.RE, factor.model = factor.model,
                 svc.cols = svc.cols, n.factors = n.factors, phi = phi, sp = sp,
                 cov.model = cov.model, ar1 = ar1, sigma.sq.t = sigma.sq.t, rho = rho)

# Subset data for prediction
pred.indx <- sample(1:J, round(J * .25), replace = FALSE)
y <- dat$y[, -pred.indx, , , drop = FALSE]
# Occupancy covariates
X <- dat$X[-pred.indx, , , drop = FALSE]
# Prediction covariates
X.0 <- dat$X[pred.indx, , , drop = FALSE]
# Detection covariates
X.p <- dat$X.p[-pred.indx, , , , drop = FALSE]
# Coordinates
coords <- dat$coords[-pred.indx, ]
coords.0 <- dat$coords[pred.indx, ]

occ.covs <- list(occ.cov.1 = X[, , 2],
     occ.cov.2 = X[, , 3])
det.covs <- list(det.cov.1 = X.p[, , , 2],
     det.cov.2 = X.p[, , , 3])

data.list <- list(y = y, occ.covs = occ.covs,
                  det.covs = det.covs,
                  coords = coords)
# Priors
prior.list <- list(beta.comm.normal = list(mean = 0, var = 2.72),
       alpha.comm.normal = list(mean = 0, var = 2.72),
       tau.sq.beta.ig = list(a = 0.1, b = 0.1),
       tau.sq.alpha.ig = list(a = 0.1, b = 0.1),
       rho.unif = list(a = -1, b = 1),
       sigma.sq.t.ig = list(a = 0.1, b = 0.1),
                   phi.unif = list(a = 3 / .9, b = 3 / .1))
z.init <- apply(y, c(1, 2, 3), function(a) as.numeric(sum(a, na.rm = TRUE) > 0))
inits.list <- list(alpha.comm = 0, beta.comm = 0, beta = 0,
       alpha = 0, tau.sq.beta = 1, tau.sq.alpha = 1,
       rho = 0.5, sigma.sq.t = 0.5,
       phi = 3 / .5, z = z.init)
# Tuning
tuning.list <- list(phi = 1, rho = 0.5)

# Number of batches
n.batch <- 5
# Batch length
batch.length <- 25
n.burn <- 25
n.thin <- 1
n.samples <- n.batch * batch.length

out <- stMsPGOcc(occ.formula = ~ occ.cov.1 + occ.cov.2,
                det.formula = ~ det.cov.1 + det.cov.2,
                data = data.list,
                inits = inits.list,
                n.batch = n.batch,
                batch.length = batch.length,
                accept.rate = 0.43,
    ar1 = TRUE,
    NNGP = TRUE,
    n.neighbors = 5,
    n.factors = n.factors,
    cov.model = 'exponential',
                priors = prior.list,
                tuning = tuning.list,
                n.omp.threads = 1,
                verbose = TRUE,
                n.report = 1,
                n.burn = n.burn,
    n.thin = n.thin,
    n.chains = 1)
#> ----------------------------------------
#> 	Preparing to run the model
#> ----------------------------------------
#> lambda is not specified in initial values.
#> Setting initial values of the lower triangle to 0
#> ----------------------------------------
#> 	Building the neighbor list
#> ----------------------------------------
#> ----------------------------------------
#> Building the neighbors of neighbors list
#> ----------------------------------------
#> ----------------------------------------
#> 	Model description
#> ----------------------------------------
#> Spatial Factor NNGP Multi-season Multi-species Occupancy Model with Polya-Gamma latent
#> variables with 48 sites, 7 species, and 10 primary time periods.
#> 
#> Samples per chain: 125 (5 batches of length 25)
#> Burn-in: 25 
#> Thinning Rate: 1 
#> Number of Chains: 1 
#> Total Posterior Samples: 100 
#> 
#> Using the exponential spatial correlation model.
#> 
#> Using 3 latent spatial factors.
#> Using 5 nearest neighbors.
#> 
#> Source compiled with OpenMP support and model fit using 1 thread(s).
#> 
#> Adaptive Metropolis with target acceptance rate: 43.0
#> ----------------------------------------
#> 	Chain 1
#> ----------------------------------------
#> Sampling ... 
#> Batch: 1 of 5, 20.00%
#> 	Latent Factor	Parameter	Acceptance	Tuning
#> 	1		phi		64.0		1.02020
#> 	2		phi		56.0		1.02020
#> 	3		phi		88.0		1.02020
#> 	Species		Parameter	Acceptance	Tuning
#> 	1		rho		96.0		0.51010
#> 	2		rho		84.0		0.51010
#> 	3		rho		72.0		0.51010
#> 	4		rho		84.0		0.51010
#> 	5		rho		68.0		0.51010
#> 	6		rho		80.0		0.51010
#> 	7		rho		68.0		0.51010
#> -------------------------------------------------
#> Batch: 2 of 5, 40.00%
#> 	Latent Factor	Parameter	Acceptance	Tuning
#> 	1		phi		76.0		1.03045
#> 	2		phi		64.0		1.03045
#> 	3		phi		76.0		1.03045
#> 	Species		Parameter	Acceptance	Tuning
#> 	1		rho		68.0		0.51523
#> 	2		rho		64.0		0.51523
#> 	3		rho		72.0		0.51523
#> 	4		rho		68.0		0.51523
#> 	5		rho		68.0		0.51523
#> 	6		rho		80.0		0.51523
#> 	7		rho		56.0		0.51523
#> -------------------------------------------------
#> Batch: 3 of 5, 60.00%
#> 	Latent Factor	Parameter	Acceptance	Tuning
#> 	1		phi		64.0		1.04081
#> 	2		phi		76.0		1.04081
#> 	3		phi		64.0		1.04081
#> 	Species		Parameter	Acceptance	Tuning
#> 	1		rho		76.0		0.52041
#> 	2		rho		64.0		0.52041
#> 	3		rho		80.0		0.52041
#> 	4		rho		64.0		0.52041
#> 	5		rho		64.0		0.52041
#> 	6		rho		72.0		0.52041
#> 	7		rho		88.0		0.52041
#> -------------------------------------------------
#> Batch: 4 of 5, 80.00%
#> 	Latent Factor	Parameter	Acceptance	Tuning
#> 	1		phi		68.0		1.05127
#> 	2		phi		64.0		1.05127
#> 	3		phi		72.0		1.05127
#> 	Species		Parameter	Acceptance	Tuning
#> 	1		rho		84.0		0.52564
#> 	2		rho		68.0		0.52564
#> 	3		rho		84.0		0.52564
#> 	4		rho		80.0		0.52564
#> 	5		rho		84.0		0.52564
#> 	6		rho		84.0		0.52564
#> 	7		rho		88.0		0.52564
#> -------------------------------------------------
#> Batch: 5 of 5, 100.00%

summary(out)
#> 
#> Call:
#> stMsPGOcc(occ.formula = ~occ.cov.1 + occ.cov.2, det.formula = ~det.cov.1 + 
#>     det.cov.2, data = data.list, inits = inits.list, priors = prior.list, 
#>     tuning = tuning.list, cov.model = "exponential", NNGP = TRUE, 
#>     n.neighbors = 5, n.factors = n.factors, n.batch = n.batch, 
#>     batch.length = batch.length, accept.rate = 0.43, n.omp.threads = 1, 
#>     verbose = TRUE, ar1 = TRUE, n.report = 1, n.burn = n.burn, 
#>     n.thin = n.thin, n.chains = 1)
#> 
#> Samples per Chain: 125
#> Burn-in: 25
#> Thinning Rate: 1
#> Number of Chains: 1
#> Total Posterior Samples: 100
#> Run Time (min): 0.0096
#> 
#> ----------------------------------------
#> 	Community Level
#> ----------------------------------------
#> Occurrence Means (logit scale): 
#>                Mean     SD    2.5%     50%   97.5% Rhat ESS
#> (Intercept) -3.1892 0.2887 -3.7924 -3.1763 -2.7037   NA  63
#> occ.cov.1    0.5453 0.2075  0.1863  0.5507  0.8574   NA  53
#> occ.cov.2    0.6311 0.4656 -0.2421  0.6613  1.4593   NA  29
#> 
#> Occurrence Variances (logit scale): 
#>               Mean     SD   2.5%    50%  97.5% Rhat ESS
#> (Intercept) 0.5167 0.5371 0.0378 0.3882 1.7570   NA   4
#> occ.cov.1   0.2772 0.3596 0.0484 0.1601 1.1616   NA  57
#> occ.cov.2   1.9535 2.1525 0.6564 1.4419 5.8089   NA 100
#> 
#> Detection Means (logit scale): 
#>                Mean     SD    2.5%     50%  97.5% Rhat ESS
#> (Intercept)  0.2507 0.3312 -0.4508  0.2789 0.9027   NA  75
#> det.cov.1    0.9749 0.2797  0.5273  0.9632 1.4611   NA  26
#> det.cov.2   -1.0750 0.6589 -2.2953 -1.1260 0.5085   NA  67
#> 
#> Detection Variances (logit scale): 
#>               Mean     SD   2.5%    50%   97.5% Rhat ESS
#> (Intercept) 0.8170 0.5602 0.2469 0.6299  2.1294   NA 100
#> det.cov.1   0.4024 0.6038 0.0493 0.2339  1.7841   NA  39
#> det.cov.2   3.0619 2.7529 0.6539 2.0213 10.8417   NA  48
#> 
#> ----------------------------------------
#> 	Species Level
#> ----------------------------------------
#> Occurrence (logit scale): 
#>                    Mean     SD    2.5%     50%   97.5% Rhat ESS
#> (Intercept)-sp1 -3.2732 0.2725 -3.8563 -3.2514 -2.8637   NA  12
#> (Intercept)-sp2 -3.4062 0.2867 -3.9686 -3.3926 -2.9389   NA  15
#> (Intercept)-sp3 -3.5280 0.3194 -4.0536 -3.4960 -3.0327   NA   8
#> (Intercept)-sp4 -3.5353 0.4406 -4.4259 -3.4155 -2.8879   NA   3
#> (Intercept)-sp5 -3.6507 0.5085 -4.7056 -3.5912 -2.7773   NA   4
#> (Intercept)-sp6 -2.2058 0.5255 -3.0747 -2.1396 -1.4290   NA   3
#> (Intercept)-sp7 -3.5367 0.4063 -4.3694 -3.5598 -2.8187   NA   8
#> occ.cov.1-sp1    0.2959 0.2053 -0.0496  0.2767  0.7348   NA  33
#> occ.cov.1-sp2    0.6333 0.2046  0.2151  0.6382  1.0431   NA  21
#> occ.cov.1-sp3    0.4328 0.2332 -0.0225  0.4745  0.7934   NA  16
#> occ.cov.1-sp4    0.3326 0.2437 -0.1069  0.3048  0.7747   NA  22
#> occ.cov.1-sp5    0.7649 0.2278  0.4115  0.7424  1.3900   NA  25
#> occ.cov.1-sp6    0.3150 0.1559  0.0412  0.2901  0.6351   NA  38
#> occ.cov.1-sp7    1.1406 0.2881  0.6458  1.1385  1.6647   NA   6
#> occ.cov.2-sp1    1.5165 0.2945  0.9954  1.5136  2.0052   NA  12
#> occ.cov.2-sp2    2.0867 0.3505  1.4934  2.0570  2.8774   NA   6
#> occ.cov.2-sp3   -0.9705 0.2248 -1.3669 -0.9720 -0.5782   NA  12
#> occ.cov.2-sp4   -0.5996 0.2197 -0.9346 -0.6016 -0.1575   NA  28
#> occ.cov.2-sp5    0.0523 0.2414 -0.3904  0.0283  0.5249   NA  24
#> occ.cov.2-sp6    1.1954 0.2239  0.7906  1.2009  1.6189   NA  34
#> occ.cov.2-sp7    1.6138 0.2826  1.1430  1.5884  2.1132   NA  17
#> 
#> Detection (logit scale): 
#>                    Mean     SD    2.5%     50%   97.5% Rhat ESS
#> (Intercept)-sp1  1.1716 0.3574  0.5362  1.1188  2.0363   NA  26
#> (Intercept)-sp2 -0.8150 0.2390 -1.3591 -0.8086 -0.4188   NA  32
#> (Intercept)-sp3 -0.3361 0.4865 -1.1586 -0.3053  0.7124   NA  31
#> (Intercept)-sp4  0.8879 0.2652  0.4080  0.8444  1.4208   NA  36
#> (Intercept)-sp5  0.3327 0.4151 -0.4318  0.4155  0.9785   NA  29
#> (Intercept)-sp6  0.6418 0.1832  0.3151  0.6279  1.0091   NA  54
#> (Intercept)-sp7 -0.0815 0.3068 -0.6471 -0.0480  0.5140   NA  26
#> det.cov.1-sp1    1.4003 0.3841  0.8432  1.3468  2.2751   NA  18
#> det.cov.1-sp2    0.7737 0.1668  0.5019  0.7565  1.0919   NA  58
#> det.cov.1-sp3    0.3068 0.3539 -0.4080  0.3382  1.0260   NA  29
#> det.cov.1-sp4    0.8820 0.2229  0.4442  0.8871  1.2849   NA 100
#> det.cov.1-sp5    1.2319 0.3761  0.6722  1.1722  2.1349   NA  34
#> det.cov.1-sp6    0.9129 0.1651  0.5860  0.9241  1.2052   NA  49
#> det.cov.1-sp7    1.2019 0.2769  0.6709  1.2226  1.7079   NA  40
#> det.cov.2-sp1   -0.1934 0.2830 -0.7942 -0.1647  0.2583   NA  41
#> det.cov.2-sp2   -0.0412 0.1973 -0.4173 -0.0602  0.3463   NA  42
#> det.cov.2-sp3   -4.0881 0.9371 -6.3019 -4.0531 -2.2762   NA   7
#> det.cov.2-sp4   -1.0814 0.2844 -1.6566 -1.0761 -0.5852   NA  36
#> det.cov.2-sp5   -0.6675 0.3668 -1.3896 -0.6619  0.0690   NA  56
#> det.cov.2-sp6   -2.1844 0.3017 -2.7639 -2.1910 -1.6324   NA  22
#> det.cov.2-sp7   -1.8352 0.3555 -2.5222 -1.7941 -1.3038   NA  34
#> 
#> ----------------------------------------
#> 	Spatio-temporal Covariance: 
#> ----------------------------------------
#>                   Mean     SD    2.5%     50%   97.5% Rhat ESS
#> phi-1          20.7204 8.3989  4.4904 22.0156 29.9821   NA   7
#> phi-2          12.2180 7.4260  3.8022 10.5779 27.2741   NA   5
#> phi-3          10.4698 7.3945  3.9123  7.2604 27.0794   NA   3
#> sigma.sq.t-sp1  1.6895 4.0730  0.0417  0.3006 17.6742   NA  11
#> sigma.sq.t-sp2  1.1140 2.0154  0.0711  0.4465  8.1345   NA  12
#> sigma.sq.t-sp3  0.4518 0.5307  0.0414  0.3031  1.9224   NA  26
#> sigma.sq.t-sp4  0.6883 0.6165  0.0921  0.5123  2.3636   NA   5
#> sigma.sq.t-sp5  0.7240 0.9920  0.0568  0.3952  3.4574   NA  21
#> sigma.sq.t-sp6  0.2539 0.2327  0.0293  0.1856  0.8224   NA  10
#> sigma.sq.t-sp7  4.8389 6.4847  0.4020  2.2605 26.3290   NA   7
#> rho-sp1         0.5669 0.3134 -0.0904  0.5892  0.9527   NA   6
#> rho-sp2         0.7045 0.2546  0.0477  0.7591  0.9672   NA   5
#> rho-sp3         0.0069 0.6058 -0.9624  0.1315  0.8025   NA   2
#> rho-sp4         0.6666 0.2584  0.1393  0.7959  0.9349   NA   2
#> rho-sp5         0.5225 0.3844 -0.3379  0.7012  0.9206   NA   3
#> rho-sp6         0.2639 0.3307 -0.4224  0.3167  0.7266   NA   7
#> rho-sp7        -0.0965 0.4254 -0.7415 -0.1230  0.6360   NA   5

# Predict at new sites across all n.max.years
# Take a look at array of covariates for prediction
str(X.0)
#>  num [1:16, 1:10, 1:3] 1 1 1 1 1 1 1 1 1 1 ...
# Subset to only grab time periods 1, 2, and 5
t.cols <- c(1, 2, 5)
X.pred <- X.0[, t.cols, ]
out.pred <- predict(out, X.pred, coords.0, t.cols = t.cols, type = 'occupancy')
#> ----------------------------------------
#> 	Prediction description
#> ----------------------------------------
#> Spatial Factor NNGP Multi-season, Multi-species Occupancy model with Polya-Gamma latent
#> variable fit with 48 sites and 3 years.
#> 
#> Number of covariates 3 (including intercept if specified).
#> 
#> Number of spatially-varying covariates 1 (including intercept if specified).
#> 
#> Using the exponential spatial correlation model.
#> 
#> Using 5 nearest neighbors.
#> Using 3 latent spatial factors.
#> 
#> Number of MCMC samples 100.
#> 
#> Predicting at 16 non-sampled locations.
#> 
#> 
#> Source compiled with OpenMP support and model fit using 1 threads.
#> -------------------------------------------------
#> 		Predicting
#> -------------------------------------------------
#> Location: 16 of 16, 100.00%
#> Generating latent occupancy state
str(out.pred)
#> List of 6
#>  $ z.0.samples  : num [1:100, 1:7, 1:16, 1:3] 0 0 0 0 0 0 0 0 0 1 ...
#>  $ w.0.samples  : num [1:100, 1:3, 1:16] 0.431 -0.894 -0.15 -0.113 -0.959 ...
#>  $ psi.0.samples: num [1:100, 1:7, 1:16, 1:3] 0.03207 0.00273 0.0121 0.01081 0.00487 ...
#>  $ run.time     : 'proc_time' Named num [1:5] 0.018 0.09 0.014 0 0
#>   ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
#>  $ call         : language predict.svcTMsPGOcc(object = object, X.0 = X.0, coords.0 = coords.0, t.cols = t.cols,      n.omp.threads = n.omp.| __truncated__ ...
#>  $ object.class : chr "stMsPGOcc"
#>  - attr(*, "class")= chr "predict.svcTMsPGOcc"