Mastodawn

"The consequences will be dire for utility or for privacy, and possibly both. It's hard to overstate this point: future statistical releases will either be useless compared to past ones, or they will be incredibly unsafe.

For starters, taking away useful tools from the disclosure avoidance toolbox will always lead to more painful privacy/utility trade-offs. The whole point of this research field is to better understand and quantify privacy risk, and develop better tools to mitigate this risk while preserving utility.

For statistical releases, differential privacy is simply the best tool we have right now. It provides a finer way of quantifying trade-offs, and allows us to get more utility out of the data than competing techniques at similar privacy levels. If you take it away, you're left with techniques that either have worse utility at similar levels of privacy, or worse privacy for the same utility.

But all competing techniques also rely on noise addition. The Cell Key method, used at other statistical agencies, adds noise to statistics. Swapping, used from 1990 to 2010 for the U.S. Census, also injects randomness into the process. Sampling is everywhere in statistical work. Hell, even imputation technically adds noise to the data!

By contrast, coarsening and suppression are very blunt instruments. They only work in situations where the statistics are already very coarse, and not too many of them are published.
(...)
It makes sense: privacy attacks on statistical releases are about solving a system of equations. It is a much easier task when you know for sure that the statistics are all perfectly accurate. Noise forces you to compute probabilities, quantify the uncertainty, carefully consider baselines, and so on. That's why randomness is such a useful tool for disclosure avoidance! Even without formal guarantees, it makes attacks a lot harder. Take it away and attacks become trivial."
https://desfontain.es/blog/banning-noise.html
#USA #Census #Statistics #DifferentialPrivacy

Banning noise will be a disaster for statistical data products - Ted is writing things

Sadly, taking away valuable disclosure avoidance tools doesn't make fundamental trade-offs go away.
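
The quoted point that attacks amount to solving a system of equations can be illustrated with a small differencing attack. This is only a sketch under stated assumptions: the records, the `publish` helper and the Laplace scale of 2.0 are all made up for illustration and do not come from the linked post or from any agency's actual method.

```python
import random

def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def publish(values, scale=0.0, rng=None):
    """Return the sum of 0/1 values, optionally with Laplace noise added."""
    total = sum(values)
    if scale > 0:
        total += laplace(scale, rng)
    return total

# Hypothetical data: 1 means the person has a sensitive attribute.
others = [1, 0, 0, 1, 0, 1, 0, 0]
target = 1

# Exact statistics: two equations, one unknown, solved by subtraction.
exact_all = publish(others + [target])
exact_without = publish(others)
recovered = exact_all - exact_without  # always equals the target's value

# Noisy statistics: the same subtraction only gives a noisy estimate,
# so the attacker has to reason about probabilities instead.
rng = random.Random(0)
noisy_guesses = [
    publish(others + [target], scale=2.0, rng=rng)
    - publish(others, scale=2.0, rng=rng)
    for _ in range(1000)
]
wrong = sum(1 for guess in noisy_guesses if round(guess) != target)
print(f"exact attack recovers {recovered}; noisy attack wrong in {wrong}/1000 runs")
```

With exact counts the subtraction recovers the target every time. With noise the attacker gets a distribution of guesses rather than a single answer.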

Hacker News 1d ago

US bans differential privacy in Census data

https://desfontain.es/blog/banning-noise.html

#HackerNews #USCensus #DifferentialPrivacy #DataPrivacy #PrivacyPolicy #Census2020

Census State Data Centers 5d ago

via @commercegov Disclosure Avoidance for Statistical Products | Order Number: DAO 216-26...
"Any use of noise infusion is inconsistent with the Department’s policies." https://www.commerce.gov/opog/disclosure-avoidance-statistical-products?utm_source=censusSDC #differentialprivacy

Code Labs Academy Jan 29

Small clinical datasets burn privacy budget fast. In this guide, we train with #DifferentialPrivacy (DP‑SGD) in #PyTorch using #Opacus, tune clipping (C) + noise (σ), and plot AUROC vs ε to choose a defensible point.

Read: https://codelabsacademy.com/en/blog/evaluating-privacy-utility-tradeoffs-small-clinical-datasets-opacus-pytorch?source=mastodon

#HealthcareAI #MachineLearning #PrivacyEngineering

DP Trade-Offs on Small Clinical Data (PyTorch + Opacus)

Train clinical ML models with differential privacy using PyTorch and Opacus. Tune clipping and noise, track ε/δ, and plot AUROC vs privacy loss.
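
For readers unfamiliar with the clipping (C) and noise (σ) knobs mentioned in this post, here is a minimal pure-Python sketch of a single DP-SGD update. It is not the Opacus API and not the guide's code: `clip`, `dp_sgd_step` and the gradient values are hypothetical names and numbers chosen for illustration, and computing ε would need a separate privacy accountant that is not shown.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]

def dp_sgd_step(weights, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update: clip each gradient, sum, add Gaussian noise, average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    batch = len(clipped)
    new_weights = []
    for i, w in enumerate(weights):
        total = sum(g[i] for g in clipped)
        # Noise standard deviation is sigma * C, applied to the summed gradient.
        total += rng.gauss(0.0, noise_multiplier * max_norm)
        new_weights.append(w - lr * total / batch)
    return new_weights

rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]  # made-up per-example gradients
weights = dp_sgd_step(
    [0.0, 0.0], grads, lr=0.1, max_norm=1.0, noise_multiplier=1.1, rng=rng
)
print(weights)
```

A smaller C or a larger σ means more privacy per step but a noisier update, which is the trade-off the AUROC vs ε plot is meant to expose.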

Code Labs Academy Jan 19

In HealthTech, “remove identifiers” isn’t a DataPrivacy strategy. k-anonymity can reduce singling out in shared tables; differential privacy helps when you publish aggregates or answer many queries.

Deep dive + Python demos: https://codelabsacademy.com/en/blog/k-anonymity-vs-differential-privacy-healthcare?source=mastodon

#DifferentialPrivacy #PrivacyEngineering #DataScience #Cybersecurity

k‑Anonymity vs Differential Privacy in Healthcare

Compare k‑anonymity and differential privacy for healthcare data. Learn re‑identification risks, DP basics (ε, δ), and how to choose the right method.
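
As a rough illustration of what k-anonymity measures, the sketch below computes k for a tiny table: the size of the smallest group of rows sharing the same quasi-identifier values. The rows, the column names and the `k_anonymity` helper are assumptions made up for this example and are not taken from the linked guide.

```python
from collections import Counter

def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())

# Hypothetical shared table; "diagnosis" is the sensitive column.
rows = [
    {"age_band": "30-39", "zip3": "021", "diagnosis": "asthma"},
    {"age_band": "30-39", "zip3": "021", "diagnosis": "diabetes"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "asthma"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "flu"},
    {"age_band": "40-49", "zip3": "100", "diagnosis": "flu"},
]

k_fine = k_anonymity(rows, ["age_band", "zip3"])  # one row is unique on both columns
k_coarse = k_anonymity(rows, ["age_band"])        # dropping zip3 coarsens the groups
print(k_fine, k_coarse)
```

Coarsening the quasi-identifiers raises k at the cost of detail. It says nothing about what repeated aggregate queries can leak, which is where the post points to differential privacy.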

Maciek Jan 18

This looks encouraging for privacy-preserving LLMs. While the actual differential privacy guarantees are notoriously difficult to interpret, "no memorisation" is a nice headline. Caveat: there is a gap of around 30% in performance (utility) between the private and non-private models.

https://arxiv.org/abs/2510.15001

#privacy #ai #llm #differentialPrivacy

Code Labs Academy Jan 8

Building healthcare NLP? This guide shows a HIPAA‑aware de‑identification pipeline for clinical notes in Python: regex + PHI tagging, audit‑ready redaction spans, and production tips (versioning, drift). Also: when #DifferentialPrivacy (DP‑SGD) matters for shared models.

Read the full guide: https://codelabsacademy.com/en/blog/building-hipaa-deidentification-clinical-notes-python?source=mastodon

#Healthcare #DataPrivacy #MLOps

HIPAA De‑Identification Pipeline for Clinical Notes

Build a HIPAA-aware PHI de‑identification pipeline for clinical notes in Python: regex + PyTorch NER, redaction, QA, and optional DP‑SGD with Opacus.

Reddit Tech VN Bot Dec 24

You can now run local LLM inference with formal privacy guarantees! A new pip package has been released that lets you use large language models (LLMs) on your own device with strong data protection through differentially private inference. Better privacy for LLM users.
#LLM #Privacy #AI #LocalLLM #DifferentialPrivacy #QuyenRiengTu #MoHinhNgonNgu #BaoMatDuLieu

https://www.reddit.com/r/LocalLLaMA/comments/1puhjqk/now_you_can_run_local_llm_inference_with_

Code Labs Academy Dec 4, 2025

Training on mental health data, but worried about privacy and compliance?
Our new deep dive shows how to use DP‑SGD in PyTorch to add rigorous differential privacy to your models without losing clinical signal.

Read the full article:
https://codelabsacademy.com/en/blog/differential-privacy-mental-health-pytorch-dp-sgd?source=mastodon

#DifferentialPrivacy #PyTorch #HealthcareAI #DataScience #MachineLearning #Bootcamps

Learn Tech Trends and Best Practices

Find expert insights across 4 key areas: cyber security, UX/UI, Data Science and AI, and web development. Stay informed with Code Labs Academy’s blog.