When Models Eat Their Own Output, Lineage Is the Only Defence
As artificial intelligence (AI) models increasingly train on the output of other models, the lineage of data collapses into a fog. I argue that provenance, a signed and offline-verifiable chain of custody for synthetic data, is the only durable defence. This is the case for treating data lineage as infrastructure, not paperwork.
https://mickai.co.uk/articles/provenance-for-synthetic-data
#syntheticdata #dataprovenance #AIgovernance #modelcollapse #postquantum

When Models Eat Their Own Output, Lineage Is the Only Defence
As artificial intelligence (AI) models increasingly train on the output of other models, the lineage of data collapses into a fog. I argue that provenance, a signed and offline-verifiable chain of custody for synthetic data, is the only durable defence. This is the case for treating data lineage as infrastructure, not paperwork.