Tim Morris

@Tim_P_Morris
316 Followers
105 Following
85 Posts
Principal research fellow at the MRC Clinical Trials Unit, UCL, working on statistical methods: which ones work well when? 
Missing data, simulation studies, estimands, covariate adjustment, meta-analysis. Stata user who carves spoons.
Google Scholar profile: https://scholar.google.com/citations?user=l6OPzY0AAAAJ&hl=en
UCL page: https://iris.ucl.ac.uk/iris/browse/profile?upi=TNMOR17

I am starting to see more and more cases where people have used tools like ChatGPT and Copilot to generate code, the code doesn't work, and they then go to StackOverflow or elsewhere to ask why. I am not sure how I feel about this. On the one hand, such tools can be useful; on the other, at some point one has to make the effort to figure out how things actually work. Otherwise, even if the code runs, you have no idea whether it is doing what you want it to do.

#ChatGPT #Copilot #RStats

This is *exactly* how I feel about generalised linear models. #statistics

Just appeared in Biometrical Journal:

"Phases of methodological research in biostatistics—Building the evidence base for new methods"

by @GeorgHeinze with M. Kammer @Tim_P_Morris and I. White

https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.202200222

@davepmiller I've never argued once there is zero value, current or future. My pushback is against obvious hype...are we not allowed to call it out because one day someone will do something to maybe warrant it? Lol, not enough hype about AI in medicine...get real.

Looking forward to giving a seminar to UCL at 1pm today - thanks for the invite @Tim_P_Morris
I will be speaking on:

"Stability of clinical prediction models developed using statistical or machine learning methods"

- based on the pre-print here https://arxiv.org/abs/2211.01061

Stability of clinical prediction models developed using statistical or machine learning methods

Clinical prediction models estimate an individual's risk of a particular health outcome, conditional on their values of multiple predictors. A developed model is a consequence of the development dataset and the chosen model building strategy, including the sample size, number of predictors and analysis method (e.g., regression or machine learning).

Here, we raise the concern that many models are developed using small datasets that lead to instability in the model and its predictions (estimated risks). We define four levels of model stability in estimated risks, moving from the overall mean to the individual level. Then, through simulation and case studies of statistical and machine learning approaches, we show instability in a model's estimated risks is often considerable, and ultimately manifests itself as miscalibration of predictions in new data.

Therefore, we recommend researchers should always examine instability at the model development stage and propose instability plots and measures to do so. This entails repeating the model building steps (those used in the development of the original prediction model) in each of multiple (e.g., 1000) bootstrap samples, to produce multiple bootstrap models, and then deriving (i) a prediction instability plot of bootstrap model predictions (y-axis) versus original model predictions (x-axis); (ii) a calibration instability plot showing calibration curves for the bootstrap models in the original sample; and (iii) the instability index, which is the mean absolute difference between individuals' original and bootstrap model predictions.

A case study is used to illustrate how these instability assessments help reassure (or not) whether model predictions are likely to be reliable (or not), whilst also informing a model's critical appraisal (risk of bias rating), fairness assessment and further validation requirements.
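A minimal sketch (in Python, not from the paper) of the instability index described in the abstract, for the simplest toy case: an intercept-only model, where every individual's estimated risk is the overall event proportion. The data, sample sizes and variable names here are my own illustrative assumptions; the paper's actual examples use full regression and machine learning models, where the same bootstrap-repeat logic applies to each individual's prediction.

```python
import random
import statistics

random.seed(42)

# Toy development dataset: binary outcomes (1 = event), n = 100,
# with a true event probability of 0.3.
y = [1 if random.random() < 0.3 else 0 for _ in range(100)]

# "Model building" step for an intercept-only model: the estimated
# risk for every individual is simply the overall event proportion.
original_pred = statistics.mean(y)

# Repeat the model building step in each of B bootstrap samples,
# producing B bootstrap-model predictions.
B = 1000
bootstrap_preds = []
for _ in range(B):
    sample = [random.choice(y) for _ in range(len(y))]
    bootstrap_preds.append(statistics.mean(sample))

# Instability index: mean absolute difference between the original
# and bootstrap model predictions (identical across individuals here,
# because this toy model assigns everyone the same risk).
instability_index = statistics.mean(
    abs(p - original_pred) for p in bootstrap_preds
)
print(instability_index)
```

With a richer model, `original_pred` and each bootstrap prediction become per-individual vectors, and plotting them against each other gives the prediction instability plot the abstract describes.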

arXiv.org

A few related thoughts:

We need more evidence synthesis like this in the context of methodological research and, more generally, Phase III and IV studies; see
https://arxiv.org/abs/2209.13358 by @GeorgHeinze @Tim_P_Morris et al.

See also the works by @ppgardne in computational biology, e.g.:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6322486/

2/4

Phases of methodological research in biostatistics - building the evidence base for new methods

Although the biostatistical scientific literature publishes new methods at a very high rate, many of these developments are not trustworthy enough to be adopted by the scientific community. We propose a framework to think about how a piece of methodological work contributes to the evidence base for a method. Similarly to the well-known phases of clinical research in drug development, we define four phases of methodological research. These four phases cover (I) providing logical reasoning and proofs, (II) providing empirical evidence, first in a narrow target setting, then (III) in an extended range of settings and for various outcomes, accompanied by appropriate application examples, and (IV) investigations that establish a method as sufficiently well-understood to know when it is preferred over others and when it is not.

We provide basic definitions of the four phases but acknowledge that more work is needed to facilitate unambiguous classification of studies into phases. Methodological developments that have undergone all four proposed phases are still rare, but we give two examples with references. Our concept rebalances the emphasis towards studies in phases III and IV, i.e., carefully planned methods comparison studies and studies that explore the empirical properties of existing methods in a wider range of problems.

arXiv.org

@HeidiSeibold @statsepi

Mary Shelley's Frankenstein is pretty good, and has inspired lots of imitators, but I don't think the science at the heart of the story has ever been replicated.

Still an awesome tune, IMHO...

https://youtu.be/LLs-JP5FGAg

Skunk Anansie - Hedonism

YouTube

I'm not a real statistician, and you can be one too.

https://statsepi.substack.com/p/im-not-a-real-statistician-and-you

I’m not a real statistician, and you can be one too

Cara and I were hand-in-hand, slowly strolling around Cork on a bright, summer evening.

Life is pain, especially your data

There iS no greater indicTment of academia than hOw we've sheePishly normalIzed The abuse of acronyms.

STOP IT