Several goodness-of-fit measures are available and can be calculated for a fitted gllvm model or for predicted values.
goodnessOfFit(
object = NULL,
y = NULL,
pred = NULL,
measure = c("cor", "RMSE", "MAE", "MARNE"),
species = FALSE
)
object: an object of class 'gllvm' for which the goodness of fit is calculated.
y: a response matrix of new observations.
pred: predicted values for the response matrix y, needed if you want to calculate prediction accuracy for new values. Note that for an ordinal model you need to give the predicted classes.
measure: the goodness-of-fit measure(s) to be calculated. Options are "cor" (correlation between observed and predicted values), "scor" (Spearman correlation between observed and predicted values), "RMSE" (root mean squared error of prediction), "MAE" (mean absolute error), "MARNE" (mean absolute range normalized error), "TjurR2" (Tjur's R2 measure, only for binary data), "R2" (R-squared as the square of the correlation), "AUC" (area under the ROC curve) and "sR2" (R-squared as the square of the Spearman correlation). The likelihood-based pseudo-R2 measures "NagelkerkeR2", "McFaddenR2" and "CoxSnellR2" can currently be calculated only for the training data; they measure the goodness of fit of the model to the full data and are not response-specific.
species: logical. If TRUE, goodness-of-fit measures are calculated for each species separately. If FALSE, they are calculated for all species together.
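The correlation-based options ("cor", "scor", "R2", "sR2") can be illustrated with base R alone. The following is a minimal sketch for a single species, not the package's internal code: the vectors `y_j` and `pred_j` are hypothetical toy values, and it only assumes that "R2" and "sR2" are the squares of the Pearson and Spearman correlations, as described above.

```r
# Hypothetical observed and predicted values for one species (toy data).
y_j    <- c(0, 1, 3, 2, 8)
pred_j <- c(0.5, 1.2, 2.0, 2.5, 6.0)

# "cor": Pearson correlation between observed and predicted values.
r <- cor(y_j, pred_j)

# "scor": Spearman (rank) correlation between observed and predicted values.
rho <- cor(y_j, pred_j, method = "spearman")

# "R2" and "sR2": squares of the two correlations above.
r2  <- r^2
sr2 <- rho^2

print(c(cor = r, scor = rho, R2 = r2, sR2 = sr2))
```

The Spearman version depends only on the ranking of the predictions, so it is less sensitive to a few very large counts than the Pearson version.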
goodnessOfFit is used for evaluating the goodness of fit of a model or the accuracy of predictions. Available goodness-of-fit measures include correlation, RMSE, MAE, MARNE, and R2 measures. Definitions are below. Denote the observed response of species j at sample i, \(i=1,...,n\), by \(y_{ij}\), and the corresponding predicted value by \(\hat y_{ij}\).
$$RMSE(\boldsymbol{y_{j}}, \boldsymbol{\hat y_{j}}) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_{ij} - \hat y_{ij})^2} $$
$$MAE(\boldsymbol{y_{j}}, \boldsymbol{\hat y_{j}}) = \frac{1}{n}\sum_{i=1}^{n} |y_{ij} - \hat y_{ij}| $$
$$MARNE(\boldsymbol{y_{j}}, \boldsymbol{\hat y_{j}}) = \frac{1}{n}\sum_{i=1}^{n} \frac{|y_{ij} - \hat y_{ij}|}{\max(\boldsymbol{y_{j}}) - \min(\boldsymbol{y_{j}})} $$
$$R^2_{Tjur}(\boldsymbol{y_{j}}, \boldsymbol{\hat y_{j}}) = \frac{1}{n_1}\sum_{i=1}^{n} \hat y_{ij}\boldsymbol{1}_{y=1}(y_{ij}) - \frac{1}{n_0}\sum_{i=1}^{n} \hat y_{ij}\boldsymbol{1}_{y=0}(y_{ij}) $$
In Tjur's R2, \(n_1\) and \(n_0\) are the numbers of samples with \(y_{ij}=1\) and \(y_{ij}=0\), respectively, so the measure is the difference between the mean predicted value among presences and the mean predicted value among absences.
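The definitions above can be checked by hand in base R. This is a sketch under stated assumptions rather than the package's implementation: the matrices `y` and `pred` and the vectors `y_bin` and `p_hat` are hypothetical toy data, and each measure is computed per species (column), which is assumed to correspond to `species = TRUE`.

```r
# Hypothetical toy data: 4 samples (rows) x 2 species (columns).
# matrix() fills column by column, so the first four values are species 1.
y    <- matrix(c(0, 2, 4, 6,
                 1, 1, 3, 3), ncol = 2)
pred <- matrix(c(1, 2, 3, 6,
                 1, 2, 2, 3), ncol = 2)

# RMSE: square root of the mean squared error, per species.
rmse <- sqrt(colMeans((y - pred)^2))

# MAE: mean absolute error, per species.
mae <- colMeans(abs(y - pred))

# MARNE: MAE divided by the observed range of each species.
y_range <- apply(y, 2, max) - apply(y, 2, min)
marne   <- mae / y_range

# Tjur's R2 for one binary species: mean predicted probability among
# presences minus mean predicted probability among absences.
y_bin <- c(1, 0, 1, 1, 0)
p_hat <- c(0.8, 0.3, 0.6, 0.7, 0.2)
tjur  <- mean(p_hat[y_bin == 1]) - mean(p_hat[y_bin == 0])

print(rbind(RMSE = rmse, MAE = mae, MARNE = marne))
print(tjur)
```

Because the range in the denominator of MARNE does not depend on i, dividing the MAE by the range gives the same value as averaging the range-normalized absolute errors.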
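library(gllvm)

# Fit a gllvm model with the Poisson family
data(microbialdata)
X <- microbialdata$Xenv
# Keep 20 species: those ranked 21 to 40 by prevalence (proportion of non-zero counts)
y <- microbialdata$Y[, order(colMeans(microbialdata$Y > 0),
                             decreasing = TRUE)[21:40]]
fit <- gllvm(y, X, formula = ~ pH + Phosp, family = poisson())

# Calculate metrics for all species together
goodnessOfFit(object = fit, measure = c("cor", "RMSE"))
# Calculate the same metrics for each species separately
goodnessOfFit(object = fit, measure = c("cor", "RMSE"), species = TRUE)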