The hottest ticket in R will be @gavinsimpson's live stream on What's New in Generalized Additive Models in R

2026-03-06 (17:00–19:00 CET) at https://youtube.com/live/A9U8e1KdlU4?feature=share

• what GAMs are and how they work
• recent {mgcv} updates (incl. Hierarchical GAMs)
• new features in {gratia}
• deeper inference with {marginaleffects}

Post questions at https://github.com/gavinsimpson/gratia/discussions/categories/q-a?discussions_q=is%3Aopen+category%3AQ%26A+label%3Alivestream

#RStats #mgcv #gratia #statistics #GAMs

What's new in the world of Generalized Additive Models

📈 Yes you can do that in mgcv update

big thanks to Zachary Susswein for spotting that my code was out of date in my neighbourhood cross-validation examples: https://calgary.converged.yt/articles/ncv.html https://calgary.converged.yt/articles/ncv_timeseries.html

They are now up-to-date, as is the helper package mgcvUtils: https://github.com/dill/mgcvUtils

#mgcvchat #mgcv

Neighbourhood cross-validation – Yes! You can do that in mgcv!

Anyone got anything on using #mgcv with #mrf and #sf objects in #rstats? The package seems to want its own format for polygon regions and (can) compute its own adjacency list etc. But I haz sf objects...

#quarto #rstats friends who use github action to publish articles:

it's currently taking github actions ~30 mins to publish my little #mgcv help site (https://calgary.converged.yt/). This seems to be because it's installing a lot of R packages from source.

What's the current state-of-the-art to get these things to render quickly? (And using minimal power.)

(I'd like to not use github but I would also like to encourage PRs etc from folks without a huge overhead from them, so let's stick to github-based solutions for now.)

new (out for a while but sitting in my browser from before Christmas) paper in Biometrika from Benjamin Säfken, Thomas Kneib and Simon Wood on smoothing parameter degrees of freedom

Green OA @ Edinburgh https://www.pure.ed.ac.uk/ws/portalfiles/portal/475921820/asae052.pdf

#mgcvchat #mgcv

#mgcv mini-lifehack:

(assuming you have multithreading enabled) you can get a rough idea of what's happening when fitting a big model by looking at your CPU usage. If only 1 core is being used, the model is still "building" (assembling the design/penalty matrices); once usage switches to all cores, you're actually fitting the model. If that first construction phase takes a long time (as it can with a very big model), the fit itself will probably take a very, very long time. So buckle up.
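
For anyone wanting to try this, a minimal sketch of switching multithreading on. The simulated data from `gamSim()` and the thread count of 2 are placeholders for illustration, not from a real analysis:

```r
library(mgcv)

set.seed(1)
# Simulated example data (placeholder for your own big data set)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)

# Placeholder thread count; pick whatever your machine can spare
n_threads <- 2

# gam(): threads are requested through gam.control()
m_gam <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat,
             method = "REML",
             control = gam.control(nthreads = n_threads))

# bam(): threads are requested via nthreads, here with discrete = TRUE
m_bam <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat,
             discrete = TRUE, nthreads = n_threads)
```

With a model this small you won't see the two phases in your CPU monitor; the pattern only becomes visible on much bigger fits.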

#mgcvchat

#Poisson regression with #mgcv and #glmmTMB in #rstats just rocks

spending some more time thinking about neighbourhood cross-validation in #mgcv (see original post here: https://calgary.converged.yt/articles/ncv.html), but for time series.

Pretty nice to be able to get back to a yearly trend here without needing to specify an autoregressive structure. We just need to specify a cross-validation scheme and the autocorrelation is "dealt with" during fitting.

Full post on this soon. #mgcvchat #rstats

Ok, a more *specific* #mgcv #GAM question: When using tensor product interaction terms with `ti()`, do the knots have to match? E.g. do I have to do `ti(x, k = 10) + ti(y, k = 20) + ti(x, y, k = c(10, 20))`? Or can the knots in the interaction term be whatever? Would I want them to be different for some reason?

#rstats

A unifying modelling approach for hierarchical distributed lag models, by Theo Economou et al:

https://doi.org/10.48550/arXiv.2407.13374

code: https://zenodo.org/records/10458640

#rstats #mgcv

We present a statistical modelling framework for implementing Distributed Lag Models (DLMs), encompassing several extensions of the approach to capture the temporally distributed effect from covariates via regression. We place DLMs in the context of penalised Generalized Additive Models (GAMs) and illustrate that implementation via the R package mgcv, which allows for flexible and interpretable inference in addition to thorough model assessment. We show how the interpretation of penalised splines as random quantities enables approximate Bayesian inference and hierarchical structures in the same practical setting. We focus on epidemiological studies and demonstrate the approach with application to mortality data from Cyprus and Greece. For the Cyprus case study, we investigate, for the first time, the joint lagged effects from both temperature and humidity on mortality risk, with the unexpected result that humidity severely increases risk during cold rather than hot conditions. Another novel application is the use of the proposed framework for hierarchical pooling, to estimate district-specific covariate-lag risk on mortality, and the use of posterior simulation to compare risk across districts.