My forthcoming book, Test-Driven Data Analysis, is finally available for pre-order from the publisher, with 20% off for the next three days. (I don’t set the price; if I did, it would be much lower.)

It covers data validation, testing of analytical pipelines and a lot more, with exercises, examples, checklists and anecdotes. I think it will help almost any data professional/data wrangler/analyst/modeller, and people seem to have found it more readable than you might expect given the subject matter.

Version 3.0 of the accompanying tdda library will be released slightly before or with the book, around 19th May. It’s at RC10 and has lots of new and extended functionality.

#tdda #data #analysis #reproducibility #ML #AI #reproducibleresearch #ETL #QA

PyData London 2026 runs from 5–7 June 2026. In…London.

https://pydata.org/london2026/

The Call for Proposals is still OPEN, but closes on 16th February (Monday coming).

Maybe you would like to submit a talk?

https://pydata.org/london2026/cfp#submit

PyData is very much the UNION of Python and Data, rather than only the intersection. And it’s inclusive, fun, diverse, parent-friendly, committed to accessibility, and has diversity scholarships.

#PyDataLondon2026 #Python #Data #PyData #London #TDDA #ML #AI
Probably some #LLM and #GenAI too.

PyData London | 2026

PyData London is a 3-day in-person event for the international community of data scientists, data engineers, and developers of data analysis tools.

Well, my book on TDDA has become slightly more real:

It’s not expected to be available until April, but you can see it on the publisher’s website at

https://www.routledge.com/Test-Driven-Data-Analysis/Radcliffe/p/book/9781032897158

Although the publisher won’t let you pre-order till the end of March, the paper copy is listed on Blackwells and Waterstones:

https://blackwells.co.uk/bookshop/product/Test-Driven-Data-Analysis-by-Nicholas-J-Radcliffe/9781032897158

https://www.waterstones.com/book/test-driven-data-analysis/nicholas-j-radcliffe/9781032897158

and Amazon will let you pre-order paper or Kindle copies.

#TDDA #books #data #analysis #testing #datascience #quality #AI #ML

A Month of CHOP (Chat-Oriented Programming): my write-up of a month pair-programming with Claude Code.

https://checkeagle.com/checklists/njr/a-month-of-chat-oriented-programming/

#LLM #VibeCoding #claude #code #claudecode #coding #tdda

A Month of Chat-Oriented Programming - CheckEagle

Believing a large language model (like ChatGPT) is believing a prediction.

Predictions can be useful, but we shouldn't confuse them with facts or reality. And it's odd to use a prediction instead of looking up an available answer. Yet that is how LLMs are often used these days. That's some of what I find disturbing about them (in addition to the hype, environmental concerns, toxic and harmful output, unrecompensed and uncredited use of creative works, etc.).

A predicted answer to a question is not the same as an answer.

A predicted summary of some text is not the same as a summary of the text.

A predicted program to perform a task is not the same as a program to perform that task.

LLMs are hypothesis generators (bullshit generators, if you prefer). If they help you get to a valid/correct answer, that's great (subject to the negatives). But naïvely believing them is silly.

https://njr.prose.sh/believing-an-llm

#LLM #AI #TDDA

Believing an LLM is Believing a Prediction

I'm known (slightly) for criticising Jupyter and other computational notebooks. So here's something more positive: Suggested best practices for safe notebook use.

Clickable version at: https://www.tdda.info/best-practices-for-notebook-users

Printable PDF at: https://www.stochasticsolutions.com/pdf/nbp.pdf

#jupyter #notebook #tdda

Best Practices for Notebook Users

In a previous post, I discussed some of the challenges, dangers and weaknesses of Jupyter Notebooks, JupyterLab and their ilk. I used The Parables of Anne and Beth as a device to illustrate what I think of as good and bad practices for data science. A reasonable criticism …

Test-Driven Data Analysis

Jupyter Notebooks Considered Harmful: The Parables of Anne and Beth.
https://www.tdda.info/jupyter-notebooks-considered-harmful-the-parables-of-anne-and-beth

This is a post that has been brewing for years.

#jupyter #datascience #notebooks #tdda #tdd #reproducibility

Jupyter Notebooks Considered Harmful: The Parables of Anne and Beth

I have long considered writing a post about the various problems I see with computational notebooks such as Jupyter Notebooks. As part of a book I am writing on TDDA, I created four parables about good and bad development practices for analytical workflows. They were not intended to form this …

Test-Driven Data Analysis