Michael DeWitt

@Medewitt
197 Followers
378 Following
372 Posts
Infectious disease researcher and applied statistician. Doing Bayesian stuff with #rstats, #stan, and #julia. (He/him)
Website: https://michaeldewittjr.com

Every time I hear about HPV vaccines & cervical cancer I want to shout from the rooftops - it's incredible to live in a time where we can prevent this devastating disease!

But as Kate Cuschieri highlights, we have to tackle this globally & reduce inequity in access! #ESCV2024

My paper with A. Christen (@cimatoficial):

"Dynamic survival analysis: modelling the hazard function via ordinary differential equations"

has been accepted for publication in Statistical Methods in Medical Research.

https://doi.org/10.48550/arXiv.2308.05205

GitHub: https://github.com/FJRubio67/ODESurv

Dynamic survival analysis: modelling the hazard function via ordinary differential equations

The hazard function represents one of the main quantities of interest in the analysis of survival data. We propose a general approach for parametrically modelling the dynamics of the hazard function using systems of autonomous ordinary differential equations (ODEs). This modelling approach can be used to provide qualitative and quantitative analyses of the evolution of the hazard function over time. Our proposal capitalises on the extensive literature on ODEs which, in particular, allows for establishing basic rules or laws on the dynamics of the hazard function via the use of autonomous ODEs. We show how to implement the proposed modelling framework in cases where there is an analytic solution to the system of ODEs or where an ODE solver is required to obtain a numerical solution. We focus on the use of a Bayesian modelling approach, but the proposed methodology can also be coupled with maximum likelihood estimation. A simulation study is presented to illustrate the performance of these models and the interplay of sample size and censoring. Two case studies using real data are presented to illustrate the use of the proposed approach and to highlight the interpretability of the corresponding models. We conclude with a discussion on potential extensions of our work and strategies to include covariates into our framework. Although we focus on examples in medical statistics, the proposed framework is applicable in any context where the interest lies in estimating and interpreting the dynamics of the hazard function.

arXiv.org
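A toy sketch (mine, not from the paper) of the core idea: pick an autonomous ODE for the hazard, integrate it numerically, and accumulate the cumulative hazard to get survival. Here the ODE h'(t) = b·h(t) has the Gompertz hazard a·exp(bt) as its analytic solution, so the Euler result can be checked against the closed form.

```python
import math

def hazard_ode_survival(a, b, t_max, dt=1e-4):
    """Euler-integrate the autonomous ODE h'(t) = b*h(t), h(0) = a
    (whose solution is the Gompertz hazard a*exp(b*t)), accumulating
    the cumulative hazard H(t) so that S(t) = exp(-H(t))."""
    h, H, t = a, 0.0, 0.0
    while t < t_max:
        H += h * dt          # left-Riemann step for the cumulative hazard
        h += b * h * dt      # Euler step for the hazard ODE itself
        t += dt
    return math.exp(-H)

def gompertz_survival(a, b, t):
    """Analytic Gompertz survival, for comparison."""
    return math.exp(-a / b * (math.exp(b * t) - 1.0))
```

With a = 0.1, b = 0.5, t = 2 the two agree to well under 1%; in the paper's setting the same numeric route works for ODEs with no closed-form solution.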

My ideal manuscript layout:

[Intro]
This is what I want to know.
Here is what we know so far.
Here is what we don't know yet.
This is what I'm going to do to fill this gap.

[Methods]
Here is what I did.

[Results]
Here is what I found.

[Discussion]
How does this relate to what we know.
How does this resolve what we didn't know.

[Conclusion]
Here is my answer to what I wanted to know.

(you can use the same template for research proposals just swapping out "did" with "will do")

Years ago, I spent a lot of time working on expectation propagation (EP), and I'm still delighted to see others keep improving it. The paper "Fearless Stochasticity in Expectation Propagation" by Jonathan So and Richard Turner is excellent! https://arxiv.org/abs/2406.01801
Fearless Stochasticity in Expectation Propagation

Expectation propagation (EP) is a family of algorithms for performing approximate inference in probabilistic models. The updates of EP involve the evaluation of moments -- expectations of certain functions -- which can be estimated from Monte Carlo (MC) samples. However, the updates are not robust to MC noise when performed naively, and various prior works have attempted to address this issue in different ways. In this work, we provide a novel perspective on the moment-matching updates of EP; namely, that they perform natural-gradient-based optimisation of a variational objective. We use this insight to motivate two new EP variants, with updates that are particularly well-suited to MC estimation. They remain stable and are most sample-efficient when estimated with just a single sample. These new variants combine the benefits of their predecessors and address key weaknesses. In particular, they are easier to tune, offer an improved speed-accuracy trade-off, and do not rely on the use of debiasing estimators. We demonstrate their efficacy on a variety of probabilistic inference tasks.

arXiv.org
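For intuition on the "moments estimated from MC samples" part: a minimal sketch (my own, not the paper's algorithm) of estimating the moments of a tilted distribution by self-normalised importance sampling, using a Gaussian cavity times a Gaussian likelihood factor so the answer is known analytically (here the tilted distribution is N(0.5, 0.5)).

```python
import math
import random

def tilted_moments_mc(cavity_mean, cavity_var, obs, obs_var, n=200_000, seed=0):
    """MC estimate of the mean/variance of the tilted distribution
    q(x) * N(obs | x, obs_var): draw from the Gaussian cavity q and
    weight each draw by the (unnormalised) likelihood factor."""
    rng = random.Random(seed)
    s = math.sqrt(cavity_var)
    w_sum = m1 = m2 = 0.0
    for _ in range(n):
        x = rng.gauss(cavity_mean, s)
        w = math.exp(-0.5 * (obs - x) ** 2 / obs_var)  # likelihood weight
        w_sum += w
        m1 += w * x
        m2 += w * x * x
    mean = m1 / w_sum
    var = m2 / w_sum - mean ** 2
    return mean, var
```

The paper's point is precisely that plugging such noisy estimates into naive EP updates is unstable, and that natural-gradient-style updates tolerate the noise, down to single-sample estimates.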

webr-rs-js 🤝 WebR

Running R in the browser via Rust

Rust 👈🏼😎👉🏼 browser
#webr #rust #rstats

PSA: All #rstats packages on #CRAN will get an official DOI!

This will facilitate bibliometrics and give credit to package authors.

Registering all 20,000+ packages will still take a few more days. But the first couple of thousand are already live. Example:

Preprint from Simon Wood on the new cross-validation smoothness estimation in #mgcv: https://arxiv.org/abs/2404.16490. It's a neat performant + data-efficient way to estimate GAMs based on complex CV splits (like spatial/temporal/phylo ones).

See ?NCV in latest {mgcv} for examples (https://cran.r-universe.dev/mgcv/doc/manual.html#NCV)

I might write a helper to convert {rsample}/{spatialsample} objects into mgcv's funny CV indexing structure.

#rstats #ml #tidymodels #mgcvchat @MikeMahoney218 @gavinsimpson @ericJpedersen @millerdl

On Neighbourhood Cross Validation

Many varieties of cross validation would be statistically appealing for the estimation of smoothing and other penalized regression hyperparameters, were it not for the high cost of evaluating such criteria. Here it is shown how to efficiently and accurately compute and optimize a broad variety of cross validation criteria for a wide range of models estimated by minimizing a quadratically penalized loss. The leading order computational cost of hyperparameter estimation is made comparable to the cost of a single model fit given hyperparameters. In many cases this represents an $O(n)$ computational saving when modelling $n$ data. This development makes it feasible, for the first time, to use leave-out-neighbourhood cross validation to deal with the widespread problem of un-modelled short range autocorrelation, which otherwise leads to underestimation of smoothing parameters. It is also shown how to accurately quantify uncertainty in this case, despite the un-modelled autocorrelation. Practical examples are provided, including smooth quantile regression and generalized additive models for location, scale and shape, focussing particularly on dealing with un-modelled autocorrelation.

arXiv.org
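To see what the criterion is doing (this is a brute-force toy of mine, not Wood's efficient computation, and the box-kernel smoother is just a stand-in): for each point, drop its whole neighbourhood rather than only the point itself before predicting it, so short-range autocorrelation can't leak into the score.

```python
def loo_neighbourhood_cv(y, x, bandwidth, radius):
    """Leave-out-neighbourhood CV score for a box-kernel running-mean
    smoother: predict each point after dropping every observation
    within `radius` of it (in x), then average squared errors.
    radius=0 recovers ordinary leave-one-out CV."""
    n = len(y)
    score = 0.0
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            if abs(x[j] - x[i]) <= radius:     # drop the neighbourhood, not just i
                continue
            if abs(x[j] - x[i]) <= bandwidth:  # box kernel: unweighted local mean
                num += y[j]
                den += 1.0
        if den == 0.0:
            continue  # no usable neighbours at this bandwidth; skip the point
        score += (y[i] - num / den) ** 2
    return score / n
```

On correlated data the radius > 0 score penalises over-wiggly fits that plain LOO would reward; the paper's contribution is computing this class of criteria at roughly the cost of one fit.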

I need to estimate a delay distribution (e.g. incubation period, reporting delay, etc.) for an infectious disease. What should I do?

This question is addressed in new work from #epinowcast community member @kcharniga: https://www.epinowcast.org/posts/2024-05-17-best-practices-delays/

Epinowcast - Best practices for estimating and reporting epidemiological delay distributions of infectious diseases using public health surveillance and healthcare data

Epinowcast community site

Epinowcast
Suggestions for best practice in scaling analysis

Hi everyone, I'm exploring whether there are good examples around of using Julia at scale that people can link? At the moment I'm helping develop a package for model-based epidemiological inference Rt-without-renewal/EpiAware at main · CDCgov/Rt-without-renewal · GitHub. We're at the stage where we want to collect inference results across a number of different scenarios to answer some interesting questions about effective epi modelling. Looking at the space of handy workflow packages in juli...

Julia Programming Language

Following on from our last work (https://www.medrxiv.org/content/10.1101/2024.01.12.24301247v1) Kelly Charniga has led a piece looking at best practices for estimating and reporting epidemiological delay distributions.

https://hal.science/hal-04572940v1

The aim here is to provide a checklist for both those producing and using epidemiological delay distributions.

Estimating epidemiological delay distributions for infectious diseases

Understanding and accurately estimating epidemiological delay distributions is important for public health policy. These estimates directly influence epidemic situational awareness, control strategies, and resource allocation. In this study, we explore challenges in estimating these distributions, including truncation, interval censoring, and dynamical biases. Despite their importance, these issues are frequently overlooked in the current literature, often resulting in biased conclusions. This study aims to shed light on these challenges, providing valuable insights for epidemiologists and infectious disease modellers. Our work motivates comprehensive approaches for accounting for these issues based on the underlying theoretical concepts. We also discuss simpler methods that are widely used, which do not fully account for known biases. We evaluate the statistical performance of these methods using simulated exponential growth and epidemic scenarios informed by data from the 2014-2016 Sierra Leone Ebola virus disease epidemic. Our findings highlight that using simpler methods can lead to biased estimates of vital epidemiological parameters. An approximate-latent-variable method emerges as the best overall performer, while an efficient, widely implemented interval-reduced-censoring-and-truncation method was only slightly worse. Other methods, such as a joint-primary-incidence-and-delay method and a dynamic-correction method, demonstrated good performance under certain conditions, although they have inherent limitations and may not be the best choice for more complex problems. Despite presenting a range of methods that performed well in the contexts we evaluated, residual biases persisted, predominantly due to the simplifying assumption that the distribution of event time within the censoring interval follows a uniform distribution; instead, this distribution should depend on epidemic dynamics. 
However, in realistic scenarios with daily censoring, these biases appeared minimal. This study underscores the need for caution when estimating epidemiological delay distributions in real-time, provides an overview of the theory that practitioners need to keep in mind when doing so with useful tools to avoid common methodological errors, and points towards areas for future research. All code used in the present study is available at https://github.com/parksw3/epidist-paper

medRxiv
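A quick toy simulation of my own (not from the paper) showing the right-truncation bias they warn about: during exponential growth, most primary events are recent, so long delays haven't finished by the observation cutoff, and the naive mean of fully observed delays lands well below the true mean.

```python
import math
import random

def naive_truncated_delay_mean(true_mean=5.0, growth_rate=0.15,
                               window=60.0, n=50_000, seed=1):
    """Simulate primary events with exponentially growing incidence on
    [0, window], attach exponential delays with mean `true_mean`, and
    keep only delays whose secondary event falls before the cutoff
    (right truncation, as in real-time surveillance). Returns the
    naive (biased) mean of the observed delays."""
    rng = random.Random(seed)
    observed = []
    scale = math.exp(growth_rate * window) - 1.0
    for _ in range(n):
        # inverse-CDF draw from density proportional to exp(growth_rate * t)
        t0 = math.log(1.0 + rng.random() * scale) / growth_rate
        delay = rng.expovariate(1.0 / true_mean)
        if t0 + delay <= window:   # only fully observed pairs survive the cutoff
            observed.append(delay)
    return sum(observed) / len(observed)
```

With a true mean of 5 days and 15%/day growth, the naive estimate comes out near 3 days, which is the kind of underestimate the truncation-aware methods in the paper are built to correct.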