Data Governance Best Practices: Ensuring Data Quality and Security.

Discover how data governance shapes business success. Real stories, best practices, and debate on data quality and security. Data is everywhere. We collect it, store it, and analyze it. Yet, many…

Medium

Data lineage increases trust in government data


Governments often use data to make policy, improve services, and address societal issues. But how do you know whether that data is reliable? According to a new report from the Wetenschappelijk Onderzoek- en Documentatiecentrum (WODC), data lineage can help.

What is data lineage?

Data lineage literally means 'the ancestry of data'. It involves mapping the full journey that data takes: from the moment it is collected (for example via a form), through processing and transformation, to its eventual use in, for example, dashboards or reports. With data lineage you can trace:

  • where the data comes from;
  • which operations or transformations have been applied;
  • in which systems or reports the data ultimately ends up.
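
The three questions above (origin, transformations, final destinations) can be sketched as a small in-memory graph. This is only an illustration, not something taken from the WODC report: the class `LineageGraph`, its methods, and the dataset names (`web_form`, `policy_dashboard`, and so on) are all hypothetical, and the sketch assumes the lineage has no cycles.

```python
from collections import defaultdict


class LineageGraph:
    """Hypothetical in-memory lineage graph (assumes no cycles).

    Nodes are datasets or reports; each edge records the transformation
    that produced the target from the source.
    """

    def __init__(self):
        self.upstream = defaultdict(list)    # target -> [(source, transformation)]
        self.downstream = defaultdict(list)  # source -> [target]

    def add_step(self, source, target, transformation):
        self.upstream[target].append((source, transformation))
        self.downstream[source].append(target)

    def origins(self, dataset):
        """Where does the data come from? Returns the root sources."""
        parents = self.upstream.get(dataset, [])
        if not parents:
            return {dataset}
        found = set()
        for source, _ in parents:
            found |= self.origins(source)
        return found

    def transformations(self, dataset):
        """Which transformations were applied, earliest first?"""
        steps = []
        for source, transformation in self.upstream.get(dataset, []):
            steps += self.transformations(source)
            steps.append(transformation)
        return steps

    def consumers(self, dataset):
        """In which systems or reports does the data end up?"""
        found = set()
        for target in self.downstream.get(dataset, []):
            found.add(target)
            found |= self.consumers(target)
        return found


# Example journey: form -> raw table -> cleaned table -> dashboard.
graph = LineageGraph()
graph.add_step("web_form", "raw_applications", "ingest")
graph.add_step("raw_applications", "clean_applications", "deduplicate")
graph.add_step("clean_applications", "policy_dashboard", "aggregate by month")

print(graph.origins("policy_dashboard"))
print(graph.transformations("policy_dashboard"))
print(graph.consumers("web_form"))
```

Real lineage tools usually collect these edges automatically from pipelines rather than by hand, but the questions they answer are the same three.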

Why does this matter for government?

Data lineage helps to detect errors early, make risks visible, and increase trust in policy information, both inside and outside the organisation. The WODC stresses that data lineage is not only a technical tool but also a step towards professionalising data management within government.

Read the WODC news item on their website and view the English-language report.

This is an automatically posted message. Questions or comments can be sent to @DigitaleOverheid@social.overheid.nl

#BetrouwbareData #DataLineage #nieuwsbrief62025 #WODC

Data lineage increases trust in government data - Digitale Overheid

According to the WODC report, data lineage (tracing the origin and use of data) can contribute to greater trust in data quality within government.

Digitale Overheid

🗺️ What You Should Know Before Implementing A Data Catalog. Implementing a data catalog is a necessity if you want to leverage your data. While the allure of cutting-edge technology is strong, success hinges on a solid foundation of non-technical considerations.

👉 Read our guide & explore what you need to know to avoid common pitfalls and ensure success.
https://www.datalumen.eu/should_know_before_implementing_datacatalog/

#DataCatalog #DataGovernance #DataManagement #DataLineage #MetaDataManagement #DataAgenda #DataStrategy

WHAT YOU SHOULD KNOW BEFORE IMPLEMENTING A DATA CATALOG

The success of your data catalog initiative relies on a solid foundation of non-technical considerations. Explore them to help you out.

Datalumen

"AI is all about data. Reams and reams of data are needed to train algorithms to do what we want, and what goes into the AI models determines what comes out. But here's the problem: AI developers and researchers don't really know much about the sources of the data they are using. AI's data collection practices are immature compared with the sophistication of AI model development. Massive data sets often lack clear information about what is in them and where it came from.

The Data Provenance Initiative, a group of over 50 researchers from both academia and industry, wanted to fix that. They wanted to know, very simply: Where does the data to build AI come from? They audited nearly 4,000 public data sets spanning over 600 languages, 67 countries, and three decades. The data came from 800 unique sources and nearly 700 organizations.

Their findings, shared exclusively with MIT Technology Review, show a worrying trend: AI's data practices risk concentrating power overwhelmingly in the hands of a few dominant technology companies."

https://www.technologyreview.com/2024/12/18/1108796/this-is-where-the-data-to-build-ai-comes-from/

#AI #GenerativeAI #AITraining #DataLineage

This is where the data to build AI comes from

New findings show how the sources of data are concentrating power in the hands of the most powerful tech companies.

MIT Technology Review

From Chaos to Clarity? 🔍 Find out how you can make data lineage simple. Data moving through complex architectures doesn't have to be a mystery. 🔆 Check out our latest blog to learn how OpenLineage brings order to your data stack!

👉 Read more to be informed:
https://www.datalumen.eu/openlineage/

#OpenLineage #DataLineage #DataCompliancy #DataGovernance #DataPipelineMonitoring #MetadataManagement

OPENLINEAGE: UNVEILING DATA LINEAGE FOR MODERN DATA ECOSYSTEMS

Discover OpenLineage, the powerful open-source standard for simplifying data lineage tracking and ensuring data quality and compliance.

Datalumen
#ModelExplainability, #DataLineage, and editing the #TrainingData set are topics that will be in the news next year…assuming we make it.
https://social.lol/@rom/112543674749743641
Oh 2 ten (@rom@social.lol)

Here we go - copyright in the US, privacy in the EU. What is next? https://noyb.eu/en/chatgpt-provides-false-information-about-people-and-openai-cant-correct-it

social.lol

๐”๐ง๐๐ž๐ซ๐ฌ๐ญ๐š๐ง๐๐ข๐ง๐  ๐ญ๐ก๐ž ๐’๐ฉ๐ž๐œ๐ญ๐ซ๐ฎ๐ฆ ๐จ๐Ÿ ๐ƒ๐š๐ญ๐š ๐‹๐ข๐ง๐ž๐š๐ ๐ž ๐€๐ง๐š๐ฅ๐ฒ๐ฌ๐ข๐ฌ

#Datalineage analysis is the backbone of #datagovernance, its the journey of data from origin to consumption. It not only ensures #dataintegrity & #compliance but also aids in decision-making processes & enhances data-driven strategies. Within the realm of data lineage analysis, various methodologies & approaches exist, each tailored to specific needs & objectives: https://www.foxconsulting.co/post/understanding-the-spectrum-of-data-lineage-analysis

#dataflow #dataquality

"[#DataAnalysts]..should know how the data was born, with all details of measurement... Few things have more devastating consequences ... than someone in the audience pointing out...measurement issues the analyst didn't consider." Békés and Kézdi, 2021: Data Analysis for Business, Economics, and Policy

If you're having trouble helping your org understand the value of #datalineage and #metadata, share this with them and ask if they know how all the data they're using was gathered and measured.

I wrote about the Lineage Diff for dbt projects feature of PipeRider:

You can compare the lineage DAG from before and after making code changes in dbt. It's really useful for debugging issues, seeing impact, etc.:

https://medium.com/inthepipeline/dbt-data-lineage-diff-impact-analysis-visualized-bec9927b0c4e

#DataOps #DataLineage #DataViz #DataQuality #DataTesting #DataEngineering
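
The before/after comparison described in the post above can be sketched in a few lines of Python. This is not PipeRider's implementation or API. It is a minimal illustration that assumes each lineage DAG is given as a plain dict mapping a node to the set of its parents; the functions `lineage_diff` and `impacted` and the model names are hypothetical.

```python
def lineage_diff(before, after):
    """Compare two lineage DAGs given as {node: set_of_parent_nodes}."""
    before_edges = {(parent, node) for node, parents in before.items() for parent in parents}
    after_edges = {(parent, node) for node, parents in after.items() for parent in parents}
    return {
        "added_nodes": set(after) - set(before),
        "removed_nodes": set(before) - set(after),
        "added_edges": after_edges - before_edges,
        "removed_edges": before_edges - after_edges,
    }


def impacted(dag, changed):
    """Return every node downstream of the changed nodes (assumes no cycles)."""
    hit = set()
    frontier = set(changed)
    while frontier:
        # Next layer: nodes with at least one parent in the current frontier.
        frontier = {
            node for node, parents in dag.items()
            if parents & frontier and node not in hit
        }
        hit |= frontier
    return hit


# Hypothetical models: a new staging model is wired into `orders`.
before = {
    "stg_orders": set(),
    "orders": {"stg_orders"},
    "revenue": {"orders"},
}
after = {
    "stg_orders": set(),
    "stg_payments": set(),
    "orders": {"stg_orders", "stg_payments"},
    "revenue": {"orders"},
}

diff = lineage_diff(before, after)
print(diff["added_nodes"])
print(diff["added_edges"])
print(impacted(after, diff["added_nodes"]))
```

Diffing edges as well as nodes matters because a change can rewire dependencies without adding or removing any model.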

Looking for options to track #datalineage on #AWS while processing data via MWAA DAGs. Other than Airflow's own lineage feature and solutions like #openlineage, what else does the community use?