Chapter 5 of my book, Test-Driven Data Analysis, is now freely available online at:
https://book.tdda.info/book/chapter5.html.
The chapter is called Constraint Discovery and Validation, and covers the automatic generation of constraints from believed-to-be-good data and the use of those constraints to validate new data.
The open-source Python tdda library makes this functionality available for data in Parquet files, CSV files and databases through language-neutral command-line tools: 'tdda discover' for generating constraints, and 'tdda verify' and 'tdda detect' for validating data. There is also a Python API for the same purpose.
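For anyone new to the idea, here is a rough plain-Python sketch of what discovering constraints from good data and then checking new data against them can look like. The discover and verify helpers, the constraint names and the sample records are all hypothetical and exist only for illustration. This is not the tdda library's API, nor the constraint set it actually generates.

```python
# Illustrative sketch only: hypothetical helpers, NOT the tdda library's API.

def discover(rows):
    """Infer simple per-field constraints from believed-to-be-good rows."""
    constraints = {}
    fields = sorted({name for row in rows for name in row})
    for name in fields:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        # Only record min/max for numeric fields (bool is excluded on purpose).
        numeric = bool(present) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )
        constraints[name] = {
            "type": type(present[0]).__name__ if present else None,
            "min": min(present) if numeric else None,
            "max": max(present) if numeric else None,
            "allow_nulls": len(present) < len(values),
        }
    return constraints


def verify(rows, constraints):
    """Return (row_index, field, problem) for every constraint violation."""
    failures = []
    for i, row in enumerate(rows):
        for name, c in constraints.items():
            value = row.get(name)
            if value is None:
                if not c["allow_nulls"]:
                    failures.append((i, name, "null"))
            elif type(value).__name__ != c["type"]:
                failures.append((i, name, "type"))
            elif c["min"] is not None and value < c["min"]:
                failures.append((i, name, "min"))
            elif c["max"] is not None and value > c["max"]:
                failures.append((i, name, "max"))
    return failures


# Made-up sample data: "good" is trusted, "new" is to be validated.
good = [{"age": 34, "name": "Ann"}, {"age": 51, "name": "Bob"}]
new = [{"age": 40, "name": "Cy"}, {"age": -3, "name": None}]

constraints = discover(good)
failures = verify(new, constraints)
print(failures)
```

In this toy version the second new record fails twice: its age falls below the smallest age seen in the good data, and its name is null although no nulls were seen there.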
The print edition of the book remains available from all good booksellers and all sellers of good books, and the publisher has a 20% discount available until 30 June at https://www.routledge.com/Test-Driven-Data-Analysis/Radcliffe/p/book/9781032897158 with code 26SMA1.
#book #tdda #data #testing #datavalidation #datascience #ML #AI #books







