GAMs and GAMMs handle nonlinear relationships and can include random effects (e.g., site or tree identity) to account for hierarchical structure and temporal or spatial dependence, making them well suited to modelling complex dendrochronological data. The growthTrendR package implements a suite of GAM/GAMM models to accommodate different types of datasets. Here, we use a single model, gamm_spatial, to demonstrate how to fit a model and generate a fitting and diagnostic report from raw data.
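# Load packages (fread() and setorder() come from data.table)
library(growthTrendR)
library(data.table)
# Load processed ring measurements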
dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR"))
# climate
dt.clim <- fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR"))
# Merge sample metadata with ring widths
dt.samples_clim <- merge(
  dt.samples_trt$tr_all_wide[, c("uid_site", "site_id", "latitude", "longitude",
                                 "species", "uid_tree", "uid_radius")],
  dt.samples_trt$tr_all_long$tr_7_ring_widths,
  by = "uid_radius"
)
# Calculate BAI
dt.samples_clim <- calc_bai(dt.samples_clim)
dt.samples_clim <- merge(dt.samples_clim, dt.clim, by = c("site_id", "year"))

This example uses gamm_spatial; other functions with the same arguments (gamm_radius, gamm_site, bam_spatial, gam_mod) may be used depending on the data and analysis goals.

setorder(dt.samples_clim, uid_tree, year)
# Remove ageC == 1 prior to fitting log-scale models.
dt.samples_clim <- dt.samples_clim[ageC > 1]
m.sp <- gamm_spatial(
  data = dt.samples_clim,
  resp_scale = "resp_log",
  m.candidates = c(
    "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)",
    "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD"
  )
)

resp_scale
The function provides three options for specifying the response variable; choose the one that best suits the modelling purpose:
“resp_gaussian”: the response variable is used on its original scale and is modelled under a Gaussian distribution with an identity link (no transformation applied).
“resp_log”: the response variable is log-transformed prior to modelling. The transformed response is then assumed to follow a Gaussian distribution and is fitted using an identity link.
“resp_gamma”: the response variable is kept on its original scale, and the model is fitted under a Gamma distribution with a log link, appropriate for strictly positive and right-skewed data.
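
In mgcv terms, the three options correspond roughly to the model specifications below. This is a minimal sketch on simulated data, assuming the mgcv package is installed; the names `sim`, `x`, and `y` are made up for illustration, and the code does not claim to reproduce how growthTrendR builds its models internally.

```r
library(mgcv)

set.seed(1)
# Simulated, strictly positive, right-skewed response (hypothetical data)
sim <- data.frame(x = runif(200, 1, 10))
sim$y <- exp(0.5 + 0.2 * sim$x + rnorm(200, sd = 0.3))

# "resp_gaussian": original scale, Gaussian family, identity link
m_gaussian <- gam(y ~ s(x), data = sim, family = gaussian(link = "identity"))

# "resp_log": log-transformed response, Gaussian family, identity link
m_log <- gam(log(y) ~ s(x), data = sim, family = gaussian(link = "identity"))

# "resp_gamma": original scale, Gamma family, log link
m_gamma <- gam(y ~ s(x), data = sim, family = Gamma(link = "log"))

# Family and link used by each fit
fits <- list(resp_gaussian = m_gaussian, resp_log = m_log, resp_gamma = m_gamma)
print(sapply(fits, function(m) paste(m$family$family, m$family$link)))
```
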
m.candidates
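A character vector of candidate model equations. Write the response variable on its original scale in every equation, even with the option “resp_log”; the function applies the log transformation itself, as the fitted formula log(bai_cm2) ~ ... in the model summary shows.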
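
One way to see how an equation written on the original scale can become a log-scale model is base R's `update()`. The snippet is only an illustration: whether gamm_spatial uses this mechanism internally is an assumption, and the object names `f_orig` and `f_log` are hypothetical.

```r
# Candidate equation written with the response on its original scale
f_orig <- as.formula("bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD")

# Wrap the left-hand side in log(); the right-hand side is left unchanged
f_log <- update(f_orig, log(.) ~ .)

print(f_log)
```

Keeping the candidate equations on the original scale means the same vector can be reused when switching resp_scale between options.
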
# generate_report(robj = m.sp)
gam_model <- m.sp$model$gam
# summary of the model
summary(gam_model)
#>
#> Family: gaussian
#> Link function: identity
#>
#> Formula:
#> log(bai_cm2) ~ log(ba_cm2_t_1) + s(ageC) + FFD + s(uid_site.fac,
#> bs = "re")
#>
#> Parametric coefficients:
#>                  Estimate Std. Error t value Pr(>|t|)
#> (Intercept)     -0.240826   0.343068  -0.702   0.4834
#> log(ba_cm2_t_1)  0.540556   0.078397   6.895 5.04e-11 ***
#> FFD             -0.003385   0.001400  -2.418   0.0164 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Approximate significance of smooth terms:
#>                   edf Ref.df     F  p-value
#> s(ageC)         5.885  5.885 4.194 0.000709 ***
#> s(uid_site.fac) 1.768  2.000 5.625 0.001394 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> R-sq.(adj) = 0.924
#> Scale est. = 0.069672 n = 243
# Relative importance of the smooth terms
term_important <- sterm_imp(gam_model)
print(term_important)
#>               term importance_pct method
#>             <char>          <num> <char>
#> 1:         s(ageC)           59.7    ssq
#> 2: s(uid_site.fac)           40.3    ssq
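
The `ssq` label suggests a sum-of-squares measure. As a rough sketch of one such measure (an assumption about the general idea, not the actual implementation of sterm_imp), the share of each smooth term can be computed from its fitted contribution to the linear predictor. The simulated data and the names `sim`, `site`, and `contrib` are hypothetical, and the mgcv package is assumed to be installed.

```r
library(mgcv)

set.seed(42)
# Simulated data: one smooth covariate and a site-level random intercept
n <- 300
sim <- data.frame(
  x    = runif(n),
  site = factor(sample(paste0("site", 1:5), n, replace = TRUE))
)
site_eff <- rnorm(5, sd = 0.5)
sim$y <- sin(2 * pi * sim$x) + site_eff[as.integer(sim$site)] + rnorm(n, sd = 0.3)

m <- gam(y ~ s(x) + s(site, bs = "re"), data = sim, method = "REML")

# Per-term contributions to the linear predictor (one column per term)
contrib <- predict(m, type = "terms")

# Sum of squares of each term's contribution, expressed as a percentage
ssq <- colSums(contrib^2)
importance_pct <- round(100 * ssq / sum(ssq), 1)
print(importance_pct)
```

Under this definition the percentages sum to 100 across the smooth terms, which matches the shape of the sterm_imp output above, although the package may compute them differently.
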