Simulate Single-Species Count Data with Imperfect Detection

The function simNMix simulates single-species count data for simulation studies, power assessments, or function testing. Optionally, data can be simulated with a spatial Gaussian process in the abundance portion of the model. Non-spatial random intercepts and slopes can also be included in the detection or abundance portions of the N-mixture model.
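
For orientation, the data-generating process described above can be written in the usual N-mixture form. This is a sketch of the standard formulation, not a statement taken from this page: the symbols \(x_j\), \(v_{j,k}\), and \(o_j\), the log and logit links, and the way the offset enters the mean are assumptions.

\[
\begin{aligned}
N_j &\sim \text{Poisson}(\mu_j \, o_j) \quad \text{or} \quad N_j \sim \text{NB}(\mu_j \, o_j, \kappa), \\
\log(\mu_j) &= x_j^\top \beta + w_j, \\
y_{j,k} \mid N_j &\sim \text{Binomial}(N_j, p_{j,k}), \\
\text{logit}(p_{j,k}) &= v_{j,k}^\top \alpha,
\end{aligned}
\]

Here \(j\) indexes sites and \(k\) indexes replicate surveys, \(o_j\) is the offset, and \(w_j\) is the spatial random effect (zero when sp = FALSE). Any non-spatial random effects would add further terms to the two linear predictors.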

Usage

simNMix(J.x, J.y, n.rep, n.rep.max, beta, alpha, kappa, mu.RE = list(), 
        p.RE = list(), offset = 1, sp = FALSE, cov.model, sigma.sq, phi, nu, 
        family = 'Poisson', ...)

Arguments

J.x: a single numeric value indicating the number of sites to simulate count data along the horizontal axis. Total number of sites with simulated data is \(J.x \times J.y\).
J.y: a single numeric value indicating the number of sites to simulate count data along the vertical axis. Total number of sites with simulated data is \(J.x \times J.y\).
n.rep: a numeric vector of length \(J = J.x \times J.y\) indicating the number of repeat visits at each of the \(J\) sites.
n.rep.max: a single numeric value indicating the maximum number of replicate surveys. This argument is optional and defaults to max(n.rep). It can be used to generate data sets with different types of missingness (e.g., simulate data across 20 days (replicate surveys) where each site is sampled at most ten times).
beta: a numeric vector containing the intercept and regression coefficient parameters for the abundance portion of the single-species N-mixture model.
alpha: a numeric vector containing the intercept and regression coefficient parameters for the detection portion of the single-species N-mixture model.
kappa: a single numeric value containing the dispersion parameter for the abundance portion of the N-mixture model. Only relevant when family = 'NB'.
mu.RE: a list used to specify the non-spatial random effects included in the abundance portion of the model. The list must have two tags: levels and sigma.sq.mu. levels is a vector with one element per distinct random effect, giving the number of levels in each effect. sigma.sq.mu is a vector of the same length giving the variance of each random effect. If not specified, no random effects are included in the abundance portion of the model. An optional third tag, beta.indx, is a list of integers indicating which element of beta each random effect is associated with. This allows specification of random intercepts as well as slopes. By default, all effects are assumed to be random intercepts.
p.RE: a list used to specify the non-spatial random effects included in the detection portion of the model. The list must have two tags: levels and sigma.sq.p. levels is a vector with one element per distinct random effect, giving the number of levels in each effect. sigma.sq.p is a vector of the same length giving the variance of each random effect. If not specified, no random effects are included in the detection portion of the model. An optional third tag, alpha.indx, is a list of integers indicating which element of alpha each random effect is associated with. This allows specification of random intercepts as well as slopes. By default, all effects are assumed to be random intercepts.
offset: either a single numeric value or a vector of length J that contains the offset for each location in the data set.
sp: a logical value indicating whether to simulate a spatially-explicit N-mixture model with a Gaussian process. By default set to FALSE.
cov.model: a quoted keyword that specifies the covariance function used to model the spatial dependence structure among the latent abundance values. Supported covariance model keywords are: "exponential", "matern", "spherical", and "gaussian".
sigma.sq: a numeric value indicating the spatial variance parameter. Ignored when sp = FALSE.
phi: a numeric value indicating the spatial decay parameter. Ignored when sp = FALSE.
nu: a numeric value indicating the spatial smoothness parameter. Only used when sp = TRUE and cov.model = "matern".
family: the distribution to use for the latent abundance process. Currently supports 'NB' (negative binomial) and 'Poisson'.
...: currently no additional arguments
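
As an illustration of the arguments above, the following sketch builds an unequal n.rep vector and an mu.RE list with one random intercept and one random slope. It assumes simNMix is provided by the spAbundance package (the package name is not stated on this page), and the specific numbers are arbitrary choices for illustration. The call is guarded so the argument construction can be run even if the package is not installed.

```r
set.seed(1)

# Grid of sites: J.x * J.y sites in total
J.x <- 5
J.y <- 4
J <- J.x * J.y

# Unequal sampling effort: each site is visited 2, 3, or 4 times
n.rep <- sample(2:4, J, replace = TRUE)

# Two abundance random effects:
#   effect 1 has 10 levels, variance 1.2, and acts on beta[1] (random intercept)
#   effect 2 has 8 levels, variance 0.5, and acts on beta[2] (random slope)
mu.RE <- list(levels = c(10, 8),
              sigma.sq.mu = c(1.2, 0.5),
              beta.indx = list(1, 2))

# Assumption: simNMix lives in the spAbundance package
if (requireNamespace("spAbundance", quietly = TRUE)) {
  dat.re <- spAbundance::simNMix(J.x = J.x, J.y = J.y, n.rep = n.rep,
                                 beta = c(0.5, -0.15), alpha = c(0.7, 0.4),
                                 mu.RE = mu.RE, sp = FALSE,
                                 family = 'Poisson')
  str(dat.re$beta.star)
}
```

Because sp = FALSE and family = 'Poisson', the spatial arguments (cov.model, sigma.sq, phi, nu) and kappa are left out of this call.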

Author

Jeffrey W. Doser doserjef@msu.edu

Value

A list containing:

X: a \(J \times p.abund\) numeric design matrix for the abundance portion of the model.
X.p: a three-dimensional numeric array with dimensions corresponding to sites, repeat visits, and number of detection regression coefficients. This is the design matrix used for the detection portion of the N-mixture model.
coords: a \(J \times 2\) numeric matrix of coordinates of each site. Required for spatial models.
w: a \(J \times 1\) matrix of the spatial random effects. Only used to simulate data when sp = TRUE.
mu: a \(J \times 1\) matrix of the expected abundance values for each site.
N: a length \(J\) vector of the latent abundances at each site.
p: a \(J \times \max(n.rep)\) matrix of the detection probabilities for each site and replicate combination. Sites with fewer than max(n.rep) replicates will contain NA values.
y: a \(J \times \max(n.rep)\) matrix of the raw count data for each site and replicate combination.
X.p.re: a three-dimensional numeric array containing the levels of any detection random effect included in the model. Only relevant when detection random effects are specified in p.RE.
X.re: a numeric matrix containing the levels of any abundance random effect included in the model. Only relevant when abundance random effects are specified in mu.RE.
alpha.star: a numeric vector that contains the simulated detection random effects for each given level of the random effects included in the detection model. Only relevant when detection random effects are included in the model.
beta.star: a numeric vector that contains the simulated abundance random effects for each given level of the random effects included in the N-mixture model. Only relevant when abundance random effects are included in the model.
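
To make the shapes of these objects concrete, here is a minimal base-R sketch of a non-spatial Poisson N-mixture simulation that produces objects with the same names and dimensions. It is not the implementation used by simNMix: the standard normal covariates, the log and logit links, and filling the first n.rep[j] replicate slots at each site are all assumptions made for illustration.

```r
set.seed(1)

J <- 6                          # number of sites
n.rep <- c(3, 2, 3, 1, 3, 2)    # replicate surveys per site
K <- max(n.rep)                 # number of columns in p and y
beta <- c(0.5, -0.15)           # abundance intercept and slope
alpha <- c(0.7, 0.4)            # detection intercept and slope

# Abundance: J x p.abund design matrix, J x 1 expected abundance, length-J N
X <- cbind(1, rnorm(J))
mu <- exp(X %*% beta)
N <- rpois(J, mu)

# Detection: sites x replicates x coefficients array, NA where not surveyed
X.p <- array(NA, dim = c(J, K, length(alpha)))
p <- matrix(NA, J, K)
y <- matrix(NA, J, K)
for (j in 1:J) {
  for (k in seq_len(n.rep[j])) {
    X.p[j, k, ] <- c(1, rnorm(1))
    p[j, k] <- plogis(sum(X.p[j, k, ] * alpha))
    # Each count is a binomial draw from the latent abundance
    y[j, k] <- rbinom(1, N[j], p[j, k])
  }
}

dim(y)
```

Sites with fewer than max(n.rep) replicates keep NA in the unused columns of p and y, and no observed count exceeds the latent abundance at its site.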

Examples

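set.seed(400)
# Number of sites along each axis (100 sites in total)
J.x <- 10
J.y <- 10
# Four replicate surveys at every site
n.rep <- rep(4, J.x * J.y)
# Abundance and detection intercepts and regression coefficients
beta <- c(0.5, -0.15)
alpha <- c(0.7, 0.4)
# Negative binomial dispersion parameter
kappa <- 0.5
# Spatial decay and spatial variance parameters
phi <- 3 / .6
sigma.sq <- 2
# One random intercept in each of the abundance and detection models
mu.RE <- list(levels = 10, sigma.sq.mu = 1.2)
p.RE <- list(levels = 15, sigma.sq.p = 0.8)
dat <- simNMix(J.x = J.x, J.y = J.y, n.rep = n.rep, beta = beta, alpha = alpha,
               kappa = kappa, mu.RE = mu.RE, p.RE = p.RE, sp = TRUE, 
               cov.model = 'spherical', sigma.sq = sigma.sq, phi = phi, 
               family = 'NB')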
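
A second, hedged example sketches the missingness scenario described for n.rep.max: 20 possible replicate surveys, with each site sampled between five and ten times. It assumes simNMix is provided by the spAbundance package, and the object name dat.miss is chosen here only for illustration. The call is guarded so that the setup runs even without the package installed.

```r
set.seed(400)
J.x <- 10
J.y <- 10
J <- J.x * J.y

# Each site is surveyed between 5 and 10 times ...
n.rep <- sample(5:10, J, replace = TRUE)
# ... out of 20 possible replicate surveys
n.rep.max <- 20

# Assumption: simNMix lives in the spAbundance package
if (requireNamespace("spAbundance", quietly = TRUE)) {
  dat.miss <- spAbundance::simNMix(J.x = J.x, J.y = J.y, n.rep = n.rep,
                                   n.rep.max = n.rep.max,
                                   beta = c(0.5, -0.15), alpha = c(0.7, 0.4),
                                   sp = FALSE, family = 'Poisson')
  # Dimensions of the count matrix and observed surveys per site
  print(dim(dat.miss$y))
  print(summary(rowSums(!is.na(dat.miss$y))))
}
```

Comparing the per-site counts of non-missing values in y against n.rep is a quick check that the simulated sampling effort matches what was requested.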