Introduction to gllvm

R package gllvm

# From CRAN
install.packages(gllvm)
# OR
# From GitHub using devtools package's function install_github
devtools::install_github("JenniNiku/gllvm")

Problems?

gllvm package depends on R packages TMB and mvabund, try to install these first.

  • GLLVMs are computationally intensive to fit due the integral in log-likelihood.

  • gllvm package overcomes computational problems by applying closed form approximations to log-likelihood and using automatic differentiation in C++ to accelerate computation times (TMB2).

  • Estimation is performed using either variational approximation (VA3), extended variational approximation method (EVA4) or Laplace approximation (LA5) method implemented via R package TMB.

  • VA method is faster and more accurate than LA, but not applicable for all distributions and link functions.

  • Using gllvm we can fit

    • GLLVM without covariates gives model-based ordination and biplots
    • GLLVM with environmental covariates for studying factors explaining species abundance
    • Fourth corner models with latent variables for studying environmental-trait interactions
    • GLLVM without latent variables fits basic multivariate GLMs
  • Additional tools: model checking, model selection, inference, visualization.

Distributions

Response Distribution Method Link
Counts Poisson VA/LA log
NB VA/LA log
ZIP LA log
Binary Bernoulli VA/LA probit
EVA/LA logit
Ordinal Ordinal VA probit
Normal Gaussian VA/LA identity
Positive continuous Gamma VA/LA log
Non-negative continuous Exponential VA/LA log
Biomass Tweedie LA/EVA log
Percent cover beta LA/EVA probit/logit

Data input

Main function of the gllvm package is gllvm(), which can be used to fit GLLVMs for multivariate data with the most important arguments listed in the following:

gllvm(y = NULL, X = NULL, TR = NULL, family, num.lv = 2, 
 formula = NULL, method = "VA", row.eff = FALSE, n.init=1, starting.val ="res", ...)
  • y: matrix of abundances
  • X: matrix or data.frame of environmental variables
  • TR: matrix or data.frame of trait variables
  • family: distribution for responses
  • num.lv: number of latent variables
  • method: approximation used “VA” or “LA”
  • row.eff: type of row effects
  • n.init: number of random starting points for latent variables
  • starting.val: starting value method
library(gllvm)
## Loading required package: TMB
## Loading required package: mvabund
## 
## Attaching package: 'gllvm'
## The following object is masked from 'package:mvabund':
## 
##     coefplot
## The following objects are masked from 'package:stats':
## 
##     nobs, predict, simulate, vcov

Example: Spiders

  • Abundances of 12 hunting spider species measured as a count at 28 sites6.
  • Six environmental variables measured at each site.
    • soil.dry: Soil dry mass
    • bare.sand: cover of bare sand
    • fallen.leaves: cover of fallen leaves/twigs
    • moss: cover of moss
    • herb.layer: cover of herb layer
    • reflection: reflection of the soil surface with a cloudless sky

Data fitting

Fitting basic GLLVM \(g(E(y_{ij})) = \beta_{0j} + \boldsymbol{u}_i'\boldsymbol{\theta}_j\) with gllvm:

library(mvabund)
data("spider")
library(gllvm)
fitnb <- gllvm(y = spider$abund, family = "negative.binomial")
fitnb
## Call: 
## gllvm(y = spider$abund, family = "negative.binomial")
## family: 
## [1] "negative.binomial"
## method: 
## [1] "VA"
## 
## log-likelihood:  -733.6806 
## Residual degrees of freedom:  289 
## AIC:  1561.361 
## AICc:  1577.028 
## BIC:  1623.975

Residual analysis

  • Residual analysis can be used to assess the appropriateness of the fitted model (eg. in terms of mean-variance relationship).

  • Randomized quantile/Dunn-Smyth residuals7 are used in the package, as they provide standard normal distributed residuals, even for discrete responses, in the case of a proper model.

par(mfrow = c(1,2))
plot(fitnb, which = 1:2)

Model selection

  • Information criterias can be used for model selection.
  • For example, compare distributions or choose suitable number of latent variables.
fitp <- gllvm(y = spider$abund, family = poisson())
fitnb <- gllvm(y = spider$abund, family = "negative.binomial")
AIC(fitp)
## [1] 1761.655
AIC(fitnb)
## [1] 1561.361

Exercises

Try to do these exercises for the next 10 minutes, as many as time is enough for.

E1. Load spider data from mvabund package and take a look at the dataset.

library(gllvm)
#Package **mvabund** is loaded with **gllvm** so just load with a function `data()`.
data("spider")
# more info: 
# ?spider

Show me the answers.

1. Print the data and covariates and draw a boxplot of the data.

# response matrix:
spider$abund
##    Alopacce Alopcune Alopfabr Arctlute Arctperi Auloalbi Pardlugu Pardmont
## 1        25       10        0        0        0        4        0       60
## 2         0        2        0        0        0       30        1        1
## 3        15       20        2        2        0        9        1       29
## 4         2        6        0        1        0       24        1        7
## 5         1       20        0        2        0        9        1        2
## 6         0        6        0        6        0        6        0       11
## 7         2        7        0       12        0       16        1       30
## 8         0       11        0        0        0        7       55        2
## 9         1        1        0        0        0        0        0       26
## 10        3        0        1        0        0        0        0       22
## 11       15        1        2        0        0        1        0       95
## 12       16       13        0        0        0        0        0       96
## 13        3       43        1        2        0       18        1       24
## 14        0        2        0        1        0        4        3       14
## 15        0        0        0        0        0        0        6        0
## 16        0        3        0        0        0        0        6        0
## 17        0        0        0        0        0        0        2        0
## 18        0        1        0        0        0        0        5        0
## 19        0        1        0        0        0        0       12        0
## 20        0        2        0        0        0        0       13        0
## 21        0        1        0        0        0        0       16        1
## 22        7        0       16        0        4        0        0        2
## 23       17        0       15        0        7        0        2        6
## 24       11        0       20        0        5        0        0        3
## 25        9        1        9        0        0        2        1       11
## 26        3        0        6        0       18        0        0        0
## 27       29        0       11        0        4        0        0        1
## 28       15        0       14        0        1        0        0        6
##    Pardnigr Pardpull Trocterr Zoraspin
## 1        12       45       57        4
## 2        15       37       65        9
## 3        18       45       66        1
## 4        29       94       86       25
## 5       135       76       91       17
## 6        27       24       63       34
## 7        89      105      118       16
## 8         2        1       30        3
## 9         1        1        2        0
## 10        0        0        1        0
## 11        0        1        4        0
## 12        1        8       13        0
## 13       53       72       97       22
## 14       15       72       94       32
## 15        0        0       25        3
## 16        2        0       28        4
## 17        0        0       23        2
## 18        0        0       25        0
## 19        1        0       22        3
## 20        0        0       22        2
## 21        0        1       18        2
## 22        0        0        1        0
## 23        0        0        1        0
## 24        0        0        0        0
## 25        6        0       16        6
## 26        0        0        1        0
## 27        0        0        0        0
## 28        0        0        2        0
# Environmental variables
spider$x
##       soil.dry bare.sand fallen.leaves   moss herb.layer reflection
##  [1,]   2.3321    0.0000        0.0000 3.0445     4.4543     3.9120
##  [2,]   3.0493    0.0000        1.7918 1.0986     4.5643     1.6094
##  [3,]   2.5572    0.0000        0.0000 2.3979     4.6052     3.6889
##  [4,]   2.6741    0.0000        0.0000 2.3979     4.6151     2.9957
##  [5,]   3.0155    0.0000        0.0000 0.0000     4.6151     2.3026
##  [6,]   3.3810    2.3979        3.4340 2.3979     3.4340     0.6931
##  [7,]   3.1781    0.0000        0.0000 0.6931     4.6151     2.3026
##  [8,]   2.6247    0.0000        4.2627 1.0986     3.4340     0.6931
##  [9,]   2.4849    0.0000        0.0000 4.3307     3.2581     3.4012
## [10,]   2.1972    3.9318        0.0000 3.4340     3.0445     3.6889
## [11,]   2.2192    0.0000        0.0000 4.1109     3.7136     3.6889
## [12,]   2.2925    0.0000        0.0000 3.8286     4.0254     3.6889
## [13,]   3.5175    1.7918        1.7918 0.6931     4.5109     3.4012
## [14,]   3.0865    0.0000        0.0000 1.7918     4.5643     1.0986
## [15,]   3.2696    0.0000        4.3944 0.6931     3.0445     0.6931
## [16,]   3.0301    0.0000        4.6052 0.6931     0.6931     0.0000
## [17,]   3.3322    0.0000        4.4543 0.6931     3.0445     1.0986
## [18,]   3.1224    0.0000        4.3944 0.0000     3.0445     1.0986
## [19,]   2.9232    0.0000        4.5109 1.6094     1.6094     0.0000
## [20,]   3.1091    0.0000        4.5951 0.6931     0.6931     0.0000
## [21,]   2.9755    0.0000        4.5643 0.6931     1.7918     0.0000
## [22,]   1.2528    3.2581        0.0000 4.3307     0.6931     3.9120
## [23,]   1.1939    3.0445        0.0000 4.0254     3.2581     4.0943
## [24,]   1.6487    3.2581        0.0000 4.0254     3.0445     4.0073
## [25,]   1.8245    3.5835        0.0000 1.0986     4.1109     2.3026
## [26,]   0.9933    4.5109        0.0000 1.7918     1.7918     4.3820
## [27,]   0.9555    2.3979        0.0000 3.8286     3.4340     3.6889
## [28,]   0.9555    3.4340        0.0000 3.7136     3.4340     3.6889
# Plot data using boxplot:
boxplot(spider$abund)

E2. Fit GLLVM to spider data with a suitable distribution. Data consists of counts of spider species.

# Take a look at the function documentation for help: 
?gllvm

Show me the answers.

2. Response variables in spider data are counts, so Poisson, negative binomial and zero inflated Poisson are possible. However, ZIP is implemented only with Laplace method, so it need to be noticed, that if models are fitted with different methods they can not be compared with information criterias. Let’s try just with a Poisson and NB. NOTE THAT the results may not be exactly the same as below, as the initial values for each model fit are slightly different, so the results may also differ slightly.

# Fit a GLLVM to data
fitp <- gllvm(y=spider$abund, family = poisson())
fitp
## Call: 
## gllvm(y = spider$abund, family = poisson())
## family: 
## [1] "poisson"
## method: 
## [1] "VA"
## 
## log-likelihood:  -845.8277 
## Residual degrees of freedom:  301 
## AIC:  1761.655 
## AICc:  1770.055 
## BIC:  1808.283
fitnb <- gllvm(y=spider$abund, family = "negative.binomial")
fitnb
## Call: 
## gllvm(y = spider$abund, family = "negative.binomial")
## family: 
## [1] "negative.binomial"
## method: 
## [1] "VA"
## 
## log-likelihood:  -733.6806 
## Residual degrees of freedom:  289 
## AIC:  1561.361 
## AICc:  1577.028 
## BIC:  1623.975

Based on AIC, NB distribution suits better. How about residual analysis: NOTE THAT The package uses randomized quantile residuals so each time you plot the residuals, they look a little different.

# Fit a GLLVM to data
plot(fitp)

plot(fitnb)

You could do these comparisons with Laplace method as well, using the code below, and it would give the same conclusion that NB distribution suits best:

fitLAp <- gllvm(y=spider$abund, family = poisson(), method = "LA")
fitLAnb <- gllvm(y=spider$abund, family = "negative.binomial", method = "LA")
fitLAzip <- gllvm(y=spider$abund, family = "ZIP", method = "LA")
AIC(fitLAp)
AIC(fitLAnb)
AIC(fitLAzip)

E3. Explore the fitted model. Where are the estimates for parameters? What about predicted latent variables? Standard errors?

Show me the answers.

3. Lets explore the fitted model:

# Parameters:
coef(fitnb)
## $theta
##                 LV1         LV2
## Alopacce  1.0000000  0.00000000
## Alopcune -0.2631467  1.00000000
## Alopfabr  0.9015779 -0.48406032
## Arctlute -0.7036255  2.03029523
## Arctperi  0.5939606 -1.58945033
## Auloalbi -0.6526990  1.53309945
## Pardlugu -0.8886185 -0.02804052
## Pardmont  0.7498083  0.58844320
## Pardnigr -0.5748828  1.78024800
## Pardpull -0.4147008  1.94807769
## Trocterr -0.4694632  0.78799068
## Zoraspin -0.6964329  1.15696857
## 
## $sigma.lv
##      LV1      LV2 
## 1.803172 1.829258 
## 
## $Intercept
##   Alopacce   Alopcune   Alopfabr   Arctlute   Arctperi   Auloalbi   Pardlugu 
##  0.7740376  0.5515002 -0.1420859 -3.5925856 -3.3660300 -0.7311104  0.3187531 
##   Pardmont   Pardnigr   Pardpull   Trocterr   Zoraspin 
##  1.6247852 -0.2755510 -0.1350650  2.6556997  0.2459376 
## 
## $inv.phi
##   Alopacce   Alopcune   Alopfabr   Arctlute   Arctperi   Auloalbi   Pardlugu 
##  1.3062150  1.4686321  0.7051269  0.8975835  2.3318796  1.5391961  1.1528903 
##   Pardmont   Pardnigr   Pardpull   Trocterr   Zoraspin 
##  1.6178793  2.5410742  2.1962637 12.9779219  2.5728353 
## 
## $phi
##   Alopacce   Alopcune   Alopfabr   Arctlute   Arctperi   Auloalbi   Pardlugu 
## 0.76557077 0.68090572 1.41818438 1.11410253 0.42883860 0.64968977 0.86738525 
##   Pardmont   Pardnigr   Pardpull   Trocterr   Zoraspin 
## 0.61809309 0.39353436 0.45531872 0.07705394 0.38867626
# Where are the predicted latent variable values? just fitp$lvs or
getLV(fitnb)
##               LV1          LV2
## Row1   0.92852295  1.270514416
## Row2  -0.68267655  0.793797989
## Row3   0.67620406  1.274444115
## Row4  -0.24718172  1.149837200
## Row5  -0.36105828  1.217222539
## Row6  -0.35233419  1.008674498
## Row7   0.02923041  1.426051792
## Row8  -1.52131113 -0.070723918
## Row9   0.91073383  0.001223507
## Row10  1.03735952 -0.493619954
## Row11  1.48874665  0.303008216
## Row12  1.32412244  0.807363479
## Row13  0.11539427  1.356624051
## Row14 -0.43680034  1.014588342
## Row15 -1.31925318 -0.544685728
## Row16 -1.17464473 -0.242696718
## Row17 -1.11252047 -0.512801437
## Row18 -1.15895679 -0.542973151
## Row19 -1.34065463 -0.502786270
## Row20 -1.36733582 -0.592091903
## Row21 -1.13971516 -0.475642319
## Row22  0.76440466 -1.345950160
## Row23  0.92078806 -1.380628888
## Row24  0.97904465 -1.395245956
## Row25  0.63124795  0.576822420
## Row26  0.27411148 -1.964089752
## Row27  1.11557339 -1.309376059
## Row28  1.01882377 -0.825802849
# Standard errors for parameters:
fitnb$sd
## $theta
##                LV1       LV2
## Alopacce 0.0000000 0.0000000
## Alopcune 0.2736345 0.0000000
## Alopfabr 0.3301736 0.2043809
## Arctlute 0.6749186 0.7848587
## Arctperi 0.4757480 0.5592111
## Auloalbi 0.4407962 0.3892404
## Pardlugu 0.2275835 0.2005406
## Pardmont 0.2196752 0.1816343
## Pardnigr 0.4734996 0.3799003
## Pardpull 0.4966306 0.4106195
## Trocterr 0.2098740 0.1568623
## Zoraspin 0.3480424 0.2796451
## 
## $sigma.lv
##       LV1       LV2 
## 0.4163208 0.4257394 
## 
## $beta0
##  Alopacce  Alopcune  Alopfabr  Arctlute  Arctperi  Auloalbi  Pardlugu  Pardmont 
## 0.4518412 0.4630159 0.5478085 1.7103834 1.4646015 0.7892939 0.4391385 0.3996809 
##  Pardnigr  Pardpull  Trocterr  Zoraspin 
## 0.7870384 0.8340577 0.3339017 0.5830953 
## 
## $inv.phi
##  Alopacce  Alopcune  Alopfabr  Arctlute  Arctperi  Auloalbi  Pardlugu  Pardmont 
## 0.6025155 0.6025936 0.3957361 0.5787035 1.8042530 0.8383858 0.5810623 0.7756757 
##  Pardnigr  Pardpull  Trocterr  Zoraspin 
## 1.3150227 1.2472695 6.8923695 1.2850962 
## 
## $phi
##   Alopacce   Alopcune   Alopfabr   Arctlute   Arctperi   Auloalbi   Pardlugu 
## 0.35313348 0.27938205 0.79592296 0.71830097 0.33180673 0.35387997 0.43716639 
##   Pardmont   Pardnigr   Pardpull   Trocterr   Zoraspin 
## 0.29633844 0.20365664 0.25857784 0.04092213 0.19413850

E4. Fit model with different numbers of latent variables.

Show me the answers.

4. Default number of latent variables is 2. Let’s try 1 and 3 latent variables as well:

# In exercise 2, we fitted GLLVM with two latent variables 
fitnb
## Call: 
## gllvm(y = spider$abund, family = "negative.binomial")
## family: 
## [1] "negative.binomial"
## method: 
## [1] "VA"
## 
## log-likelihood:  -733.6806 
## Residual degrees of freedom:  289 
## AIC:  1561.361 
## AICc:  1577.028 
## BIC:  1623.975
# How about 1 or 3 LVs
fitnb1 <- gllvm(y=spider$abund, family = "negative.binomial", num.lv = 1)
fitnb1
## Call: 
## gllvm(y = spider$abund, num.lv = 1, family = "negative.binomial")
## family: 
## [1] "negative.binomial"
## method: 
## [1] "VA"
## 
## log-likelihood:  -759.4012 
## Residual degrees of freedom:  300 
## AIC:  1590.802 
## AICc:  1599.712 
## BIC:  1638.762
getLV(fitnb1)
##              [,1]
## Row1  -0.81680096
## Row2  -0.94960133
## Row3  -0.91686860
## Row4  -1.10569018
## Row5  -1.22147786
## Row6  -1.03402154
## Row7  -1.26769984
## Row8  -0.53474357
## Row9   0.36605908
## Row10  1.00892833
## Row11  0.31291662
## Row12 -0.31670590
## Row13 -1.17620934
## Row14 -1.03498651
## Row15 -0.09009225
## Row16 -0.27458320
## Row17 -0.02446666
## Row18 -0.02080146
## Row19 -0.12969356
## Row20 -0.06888764
## Row21 -0.04323099
## Row22  1.54083081
## Row23  1.65024701
## Row24  1.68379528
## Row25 -0.28901228
## Row26  1.92917349
## Row27  1.62339239
## Row28  1.20019873
fitnb3 <- gllvm(y=spider$abund, family = "negative.binomial", num.lv = 3)
fitnb3
## Call: 
## gllvm(y = spider$abund, num.lv = 3, family = "negative.binomial")
## family: 
## [1] "negative.binomial"
## method: 
## [1] "VA"
## 
## log-likelihood:  -733.6806 
## Residual degrees of freedom:  279 
## AIC:  1581.361 
## AICc:  1605.145 
## BIC:  1657.297
getLV(fitnb3)
##               LV1          LV2           LV3
## Row1   0.92851391  1.270548810  9.431962e-06
## Row2  -0.68267006  0.793855142 -8.386363e-06
## Row3   0.67620128  1.274487384  8.445526e-06
## Row4  -0.24714651  1.149901494 -2.185045e-06
## Row5  -0.36103967  1.217284606 -4.044272e-06
## Row6  -0.35232406  1.008733704 -2.146452e-06
## Row7   0.02921248  1.426094102  2.048064e-06
## Row8  -1.52136688 -0.070684422 -1.653403e-05
## Row9   0.91074139  0.001253858 -1.128049e-05
## Row10  1.03734225 -0.493594303 -7.966341e-06
## Row11  1.48870317  0.303022912 -8.517011e-06
## Row12  1.32414883  0.807398841 -4.677532e-06
## Row13  0.11538559  1.356669683  3.898320e-06
## Row14 -0.43681871  1.014633128  9.734836e-07
## Row15 -1.31926181 -0.544634423  7.542452e-06
## Row16 -1.17469016 -0.242659470  3.558241e-06
## Row17 -1.11254234 -0.512757976  8.367344e-06
## Row18 -1.15897652 -0.542931597  9.083275e-06
## Row19 -1.34067065 -0.502735985  2.020586e-06
## Row20 -1.36735438 -0.592045364  3.947129e-06
## Row21 -1.13971618 -0.475593326 -2.679309e-06
## Row22  0.76438429 -1.345943436  1.975783e-06
## Row23  0.92078131 -1.380613197 -7.417643e-07
## Row24  0.97904900 -1.395211312 -9.505309e-07
## Row25  0.63125550  0.576867107  4.699412e-06
## Row26  0.27406906 -1.964095537  3.924502e-06
## Row27  1.11555668 -1.309361894 -9.369011e-07
## Row28  1.01880731 -0.825778957  1.724848e-06

E5. Include environmental variables to the GLLVM and explore the model fit.

Show me the answers.

5. Environmental variables can be included with an argument X:

fitnbx <- gllvm(y = spider$abund, X = spider$x, family = "negative.binomial", seed = 123)
fitnbx
## Call: 
## gllvm(y = spider$abund, X = spider$x, family = "negative.binomial", 
##     seed = 123)
## family: 
## [1] "negative.binomial"
## method: 
## [1] "VA"
## 
## log-likelihood:  -588.9187 
## Residual degrees of freedom:  217 
## AIC:  1415.837 
## AICc:  1548.06 
## BIC:  1574.37
coef(fitnbx)
## $theta
##                   LV1         LV2
## Alopacce 1.000000e+00  0.00000000
## Alopcune 2.201030e+00  1.00000000
## Alopfabr 1.600673e+00 -6.19175541
## Arctlute 2.864009e+00  3.14015521
## Arctperi 4.225987e-06 -1.59827630
## Auloalbi 1.490626e+00 -1.09087040
## Pardlugu 3.944534e-01 -0.01181004
## Pardmont 8.906196e-01 -4.40722335
## Pardnigr 2.417023e+00  3.04636451
## Pardpull 1.680649e+00  3.91430696
## Trocterr 1.030496e+00 -0.26949953
## Zoraspin 1.570712e+00 -0.06703990
## 
## $sigma.lv
##          LV1          LV2 
## 4.769707e-01 5.407191e-08 
## 
## $Intercept
##    Alopacce    Alopcune    Alopfabr    Arctlute    Arctperi    Auloalbi 
##  -1.8071409  -4.6621993  -4.2203179 -15.5708545 -14.9172687 -16.3509261 
##    Pardlugu    Pardmont    Pardnigr    Pardpull    Trocterr    Zoraspin 
##   7.9304033  -6.3995234  -5.6548543 -13.6113376  -0.3335285  -4.1373207 
## 
## $Xcoef
##            soil.dry    bare.sand fallen.leaves         moss  herb.layer
## Alopacce -0.8877652 -0.009236264  -0.291154481  0.274071405  0.65126327
## Alopcune  1.3897158 -0.516769565   0.007388244 -0.136643741  0.48677242
## Alopfabr -0.7970493  0.823486985  -0.144746849  0.531528428  0.42607137
## Arctlute  6.2600527 -0.007168804  -1.824212208  0.502290940 -0.09611211
## Arctperi -1.7800845  0.238640897  -2.189657279  0.496151396  0.02133837
## Auloalbi  0.3201391 -0.134812812   0.608772421  0.129547190  4.18602975
## Pardlugu -2.5395186 -0.732454425   0.364918500 -0.358296603  0.60353930
## Pardmont  1.6381338  0.137516050  -0.546832391  0.917168251  0.70950686
## Pardnigr  2.6358769 -0.161193176  -0.856933463 -0.350885549  0.84822861
## Pardpull  3.0900509 -0.454410066  -0.478356968  0.447287457  2.07114013
## Trocterr  1.2576619 -0.263429950  -0.228842528 -0.158045801  0.47718240
## Zoraspin  2.1514878  0.046106909  -0.577735329 -0.005377085  0.67480473
##           reflection
## Alopacce  0.69600460
## Alopcune  0.31245470
## Alopfabr  0.52418739
## Arctlute -0.85635784
## Arctperi  4.00811796
## Auloalbi -0.66896638
## Pardlugu -1.19124482
## Pardmont  0.04129952
## Pardnigr -0.68172099
## Pardpull -0.48743757
## Trocterr -0.33631811
## Zoraspin -1.00338989
## 
## $inv.phi
##     Alopacce     Alopcune     Alopfabr     Arctlute     Arctperi     Auloalbi 
## 2.062133e+01 3.531758e+00 9.529654e+00 2.265815e+00 2.462368e+06 1.097022e+01 
##     Pardlugu     Pardmont     Pardnigr     Pardpull     Trocterr     Zoraspin 
## 3.464835e+01 2.199516e+00 9.420664e+00 1.903325e+01 2.690830e+01 6.433597e+00 
## 
## $phi
##     Alopacce     Alopcune     Alopfabr     Arctlute     Arctperi     Auloalbi 
## 4.849347e-02 2.831451e-01 1.049356e-01 4.413423e-01 4.061131e-07 9.115588e-02 
##     Pardlugu     Pardmont     Pardnigr     Pardpull     Trocterr     Zoraspin 
## 2.886141e-02 4.546456e-01 1.061496e-01 5.253964e-02 3.716325e-02 1.554340e-01
# confidence intervals for parameters:
confint(fitnbx)
##                                      2.5 %        97.5 %
## theta.LV1.1                   3.184338e-01  6.355076e-01
## theta.LV1.2                  -1.476236e-04  1.477317e-04
## theta.LV1.3                   1.000000e+00  1.000000e+00
## theta.LV1.4                   1.431245e+00  2.970814e+00
## theta.LV1.5                   7.997578e-01  2.401589e+00
## theta.LV1.6                   2.581374e+00  3.146645e+00
## theta.LV1.7                  -3.872607e-03  3.881059e-03
## theta.LV1.8                   7.682947e-01  2.212958e+00
## theta.LV1.9                  -1.209328e-01  9.098397e-01
## theta.LV1.10                  7.055019e-02  1.710689e+00
## theta.LV1.11                  1.704302e+00  3.129744e+00
## theta.LV1.12                  1.136453e+00  2.224844e+00
## theta.LV2.1                   7.035219e-01  1.357471e+00
## theta.LV2.2                   8.624481e-01  2.278975e+00
## theta.LV2.3                   0.000000e+00  0.000000e+00
## theta.LV2.4                   1.000000e+00  1.000000e+00
## theta.LV2.5                  -6.191755e+00 -6.191755e+00
## theta.LV2.6                   3.140155e+00  3.140155e+00
## theta.LV2.7                  -1.598276e+00 -1.598276e+00
## theta.LV2.8                  -1.090870e+00 -1.090870e+00
## theta.LV2.9                  -1.181004e-02 -1.181004e-02
## theta.LV2.10                 -4.407223e+00 -4.407223e+00
## theta.LV2.11                  3.046365e+00  3.046365e+00
## theta.LV2.12                  3.914307e+00  3.914307e+00
## sigma.LV1                    -2.694995e-01 -2.694995e-01
## sigma.LV2                    -6.703990e-02 -6.703990e-02
## Intercept.Alopacce           -1.870663e+00 -1.743618e+00
## Intercept.Alopcune           -4.788344e+00 -4.536055e+00
## Intercept.Alopfabr           -4.301739e+00 -4.138896e+00
## Intercept.Arctlute           -1.562596e+01 -1.551575e+01
## Intercept.Arctperi           -1.502024e+01 -1.481430e+01
## Intercept.Auloalbi           -1.649582e+01 -1.620603e+01
## Intercept.Pardlugu            7.840459e+00  8.020348e+00
## Intercept.Pardmont           -6.443712e+00 -6.355335e+00
## Intercept.Pardnigr           -5.773772e+00 -5.535936e+00
## Intercept.Pardpull           -1.369927e+01 -1.352341e+01
## Intercept.Trocterr           -6.768591e-01  9.802049e-03
## Intercept.Zoraspin           -4.246020e+00 -4.028622e+00
## Xcoef.soil.dry:Alopacce      -1.414328e+00 -3.612023e-01
## Xcoef.soil.dry:Alopcune       7.753925e-01  2.004039e+00
## Xcoef.soil.dry:Alopfabr      -1.299647e+00 -2.944512e-01
## Xcoef.soil.dry:Arctlute       5.967398e+00  6.552707e+00
## Xcoef.soil.dry:Arctperi      -2.532814e+00 -1.027355e+00
## Xcoef.soil.dry:Auloalbi      -4.266244e-01  1.066902e+00
## Xcoef.soil.dry:Pardlugu      -2.936680e+00 -2.142357e+00
## Xcoef.soil.dry:Pardmont       8.946451e-01  2.381622e+00
## Xcoef.soil.dry:Pardnigr       2.038569e+00  3.233184e+00
## Xcoef.soil.dry:Pardpull       2.594877e+00  3.585225e+00
## Xcoef.soil.dry:Trocterr       9.542470e-01  1.561077e+00
## Xcoef.soil.dry:Zoraspin       1.551622e+00  2.751353e+00
## Xcoef.bare.sand:Alopacce     -2.203398e-01  2.018673e-01
## Xcoef.bare.sand:Alopcune     -1.014603e+00 -1.893600e-02
## Xcoef.bare.sand:Alopfabr      4.937318e-01  1.153242e+00
## Xcoef.bare.sand:Arctlute     -2.836152e-01  2.692776e-01
## Xcoef.bare.sand:Arctperi     -3.503039e-01  8.275857e-01
## Xcoef.bare.sand:Auloalbi     -5.375863e-01  2.679607e-01
## Xcoef.bare.sand:Pardlugu     -1.157607e+00 -3.073014e-01
## Xcoef.bare.sand:Pardmont     -1.120486e-01  3.870807e-01
## Xcoef.bare.sand:Pardnigr     -6.258147e-01  3.034284e-01
## Xcoef.bare.sand:Pardpull     -8.681391e-01 -4.068103e-02
## Xcoef.bare.sand:Trocterr     -4.522607e-01 -7.459919e-02
## Xcoef.bare.sand:Zoraspin     -3.233881e-01  4.156019e-01
## Xcoef.fallen.leaves:Alopacce -9.274371e-01  3.451282e-01
## Xcoef.fallen.leaves:Alopcune -3.563748e-01  3.711513e-01
## Xcoef.fallen.leaves:Alopfabr -7.582484e-01  4.687547e-01
## Xcoef.fallen.leaves:Arctlute -2.237477e+00 -1.410947e+00
## Xcoef.fallen.leaves:Arctperi -2.189703e+00 -2.189611e+00
## Xcoef.fallen.leaves:Auloalbi  3.183468e-01  8.991980e-01
## Xcoef.fallen.leaves:Pardlugu  1.172327e-01  6.126043e-01
## Xcoef.fallen.leaves:Pardmont -9.876157e-01 -1.060491e-01
## Xcoef.fallen.leaves:Pardnigr -1.210708e+00 -5.031587e-01
## Xcoef.fallen.leaves:Pardpull -7.554122e-01 -2.013017e-01
## Xcoef.fallen.leaves:Trocterr -3.851551e-01 -7.252995e-02
## Xcoef.fallen.leaves:Zoraspin -8.946508e-01 -2.608198e-01
## Xcoef.moss:Alopacce          -3.968837e-02  5.878312e-01
## Xcoef.moss:Alopcune          -5.303392e-01  2.570517e-01
## Xcoef.moss:Alopfabr           4.396333e-02  1.019094e+00
## Xcoef.moss:Arctlute          -2.027232e-01  1.207305e+00
## Xcoef.moss:Arctperi           8.369031e-02  9.086125e-01
## Xcoef.moss:Auloalbi          -2.475179e-01  5.066123e-01
## Xcoef.moss:Pardlugu          -7.357190e-01  1.912578e-02
## Xcoef.moss:Pardmont           5.622800e-01  1.272057e+00
## Xcoef.moss:Pardnigr          -7.573340e-01  5.556292e-02
## Xcoef.moss:Pardpull           1.677742e-01  7.268008e-01
## Xcoef.moss:Trocterr          -3.292764e-01  1.318482e-02
## Xcoef.moss:Zoraspin          -3.594569e-01  3.487027e-01
## Xcoef.herb.layer:Alopacce     3.205666e-01  9.819599e-01
## Xcoef.herb.layer:Alopcune     1.371091e-02  9.598339e-01
## Xcoef.herb.layer:Alopfabr    -1.767265e-02  8.698154e-01
## Xcoef.herb.layer:Arctlute    -3.673092e-01  1.750850e-01
## Xcoef.herb.layer:Arctperi    -3.727353e-01  4.154120e-01
## Xcoef.herb.layer:Auloalbi     3.620393e+00  4.751666e+00
## Xcoef.herb.layer:Pardlugu     3.381626e-01  8.689160e-01
## Xcoef.herb.layer:Pardmont     2.058513e-01  1.213162e+00
## Xcoef.herb.layer:Pardnigr     3.675671e-01  1.328890e+00
## Xcoef.herb.layer:Pardpull     1.649511e+00  2.492769e+00
## Xcoef.herb.layer:Trocterr     2.593820e-01  6.949828e-01
## Xcoef.herb.layer:Zoraspin     2.156566e-01  1.133953e+00
## Xcoef.reflection:Alopacce     2.237039e-01  1.168305e+00
## Xcoef.reflection:Alopcune    -1.548741e-01  7.797835e-01
## Xcoef.reflection:Alopfabr    -8.434985e-02  1.132725e+00
## Xcoef.reflection:Arctlute    -1.363675e+00 -3.490405e-01
## Xcoef.reflection:Arctperi     3.402525e+00  4.613710e+00
## Xcoef.reflection:Auloalbi    -1.086119e+00 -2.518137e-01
## Xcoef.reflection:Pardlugu    -1.729742e+00 -6.527477e-01
## Xcoef.reflection:Pardmont    -4.722716e-01  5.548706e-01
## Xcoef.reflection:Pardnigr    -1.071291e+00 -2.921513e-01
## Xcoef.reflection:Pardpull    -7.965912e-01 -1.782839e-01
## Xcoef.reflection:Trocterr    -5.269803e-01 -1.456559e-01
## Xcoef.reflection:Zoraspin    -1.404893e+00 -6.018868e-01
## inv.phi.Alopacce              1.494798e+01  2.629469e+01
## inv.phi.Alopcune             -1.143196e-01  7.177836e+00
## inv.phi.Alopfabr              6.373615e+00  1.268569e+01
## inv.phi.Arctlute              1.502191e+00  3.029438e+00
## inv.phi.Arctperi              2.462365e+06  2.462372e+06
## inv.phi.Auloalbi              5.022272e+00  1.691817e+01
## inv.phi.Pardlugu              3.265899e+01  3.663770e+01
## inv.phi.Pardmont              5.237794e-01  3.875252e+00
## inv.phi.Pardnigr              2.349377e-01  1.860639e+01
## inv.phi.Pardpull              4.303903e+00  3.376259e+01
## inv.phi.Trocterr              1.522366e+00  5.229424e+01
## inv.phi.Zoraspin              1.928417e+00  1.093878e+01

Problems? See hints:

I have problems in model fitting. My model converges to infinity or local maxima: GLLVMs are complex models where starting values have a big role. Choosing a different starting value method (see argument starting.val) or use multiple runs and pick up the one giving highest log-likelihood value using argument n.init. More variation to the starting points can be added with jitter.var.

My results does not look the same as in answers: The results may not be exactly the same as in the answers, as the initial values for each model fit are slightly different, so the results may also differ slightly.

Ordination

GLLVM as a model based ordination method

  • GLLVMs can be used as a model-based approach to unconstrained ordination by including two latent variables in the model: \(g(E(y_{ij})) = \beta_{0j} + \boldsymbol{u}_i'\boldsymbol{\theta}_j\)

  • The latent variable term try to capture the underlying factors driving species abundances at sites.

  • Predictions for the two latent variables, \(\boldsymbol{\hat u}_i=(\hat u_{i1}, \hat u_{i2})\), then provide coordinates for sites in the ordination plot and then provides a graphical representation of which sites are similar in terms of their species composition.

Ordination plot

  • ordiplot() produces ordination plots based on fitted GLLVMs.
  • Uncertainty of the ordination points in model based ordination can be assessed with prediction regions based on the prediction errors of latent variables.
  • Prediction regions may also help interpreting which differences between ordination points are really a sign of the difference between species composition at those sites.

Biplot

  • Between species correlations can be visualized with biplot8 by adding latent variable loadings \(\boldsymbol{\theta}_j\) to the ordination of sites, by producing a biplot, (argument biplot = TRUE in ordiplot()).
  • In a biplot latent variables and their loadings are rotated so that the LV loadings of the species are in the same direction with the sites where they are most abundant.
  • Biplot can be used for finding groups of correlated species or finding indicator species common at specific sites.
  • For example, species named Pardlugu is common in the group of sites located on the bottom (eg sites 8, 19, 20 and 21).
  • x and y axes may help the interpretation.
ordiplot(fitnb, biplot = TRUE)
abline(h = 0, v = 0, lty=2)

Environmental gradients

  • The potential impact of environmental variables on species communities can be viewed by coloring ordination points according to the variables.
  • For example, species named Pardlugu seems to prefer sites with lot of dry soil mass and low reflection of the soil surface with a cloudless sky.
# Arbitrary color palette, a vector length of 20. Can use, for example, colorRampPalette from package grDevices
rbPal <- c("#00FA9A", "#00EC9F", "#00DFA4", "#00D2A9", "#00C5AF", "#00B8B4", "#00ABB9", "#009DBF", "#0090C4", "#0083C9", "#0076CF", "#0069D4", "#005CD9", "#004EDF", "#0041E4", "#0034E9", "#0027EF", "#001AF4", "#000DF9", "#0000FF")
X <- spider$x
par(mfrow = c(3,2), mar=c(4,4,2,2))
for(i in 1:ncol(X)){
Col <- rbPal[as.numeric(cut(X[,i], breaks = 20))]
ordiplot(fitnb, symbols = T, s.colors = Col, main = colnames(X)[i], 
         biplot = TRUE)
}

  • Here environmental gradients stand out quite clearly, indicating that, at least, some of the differences in species compositions at sites can be explained by the differences in environmental conditions.

  • The next step would be to include covariates to the model to study more precisely the effects of environmental variables: \(g(E(y_{ij})) = \beta_{0j} + \boldsymbol{x}_i'\boldsymbol{\beta}_{j} + \boldsymbol{u}_i'\boldsymbol{\theta}_j\)


  1. Niku, J., F.K.C. Hui, S. Taskinen, and D.I. Warton. 2019. Gllvm - Fast Analysis of Multivariate Abundance Data with Generalized Linear Latent Variable Models in R. 10. Methods in Ecology and Evolution: 2173–82↩︎

  2. Kasper Kristensen, Anders Nielsen, Casper W. Berg, Hans Skaug, Bradley M. Bell (2016). TMB: Automatic Differentiation and Laplace Approximation. Journal of Statistical Software, 70(5), 1-21↩︎

  3. Hui, F. K. C., Warton, D., Ormerod, J., Haapaniemi, V., and Taskinen, S. (2017). Variational approximations for generalized linear latent variable models. Journal of Computational and Graphical Statistics. Journal of Computational and Graphical Statistics, 26:35-43↩︎

  4. Korhonen, P., Hui, F. K. C., Niku, J., and Taskinen, S. (2021). Fast, universal estimation of latent variable models using extended variational approximations, arXiv:2107.02627 .↩︎

  5. Niku, J., Warton, D. I., Hui, F. K. C., and Taskinen, S. (2017). Generalized linear latent variable models for multivariate count and biomass data in ecology. Journal of Agricultural, Biological, and Environmental Statistics, 22:498-522.↩︎

  6. van der Aart, P. J. M., and Smeenk-Enserink, N. (1975) Correlations between distributions of hunting spiders (Lycosidae, Ctenidae) and environmental characteristics in a dune area. Netherlands Journal of Zoology 25, 1-45.↩︎

  7. Dunn, P. K., and Smyth, G. K. (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics, 5, 236-244.↩︎

  8. Gabriel, K. R. (1971). The biplot graphic display of matrices with application to principal component analysis. Biometrika, 58, 453-467↩︎