End-to-End Storage Drive Analytics Platform Complete! 🚀
Spent the past few weeks on my Data Engineering Zoomcamp final project, and I'm excited to share it: an end-to-end platform that analyzes Backblaze hard drive data to make enterprise drive telemetry accessible to everyday consumers.
The pipeline ingests daily SMART snapshots into GCS, builds a star schema with dbt, and serves insights via a Streamlit dashboard showing failure rates by brand and model. Infrastructure is managed with Terraform, and the warehouse tables are partitioned to improve query performance.
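For anyone curious what "failure rate by model" means in practice, here is a minimal sketch (not the code from the repo). It assumes each row is one drive on one day with a `model` field and a `failure` flag of 0 or 1, as in Backblaze's public CSVs, and it uses the annualized failure rate formula Backblaze publishes: failures / (drive days / 365) × 100. The function name and sample model names are made up for illustration.

```python
from collections import defaultdict


def annualized_failure_rates(rows):
    """Return {model: AFR in percent} from daily drive snapshot rows.

    Each row is assumed to be a dict with a 'model' string and a
    'failure' flag (0 or 1), one row per drive per day.
    """
    drive_days = defaultdict(int)
    failures = defaultdict(int)
    for row in rows:
        drive_days[row["model"]] += 1              # one row = one drive day
        failures[row["model"]] += int(row["failure"])
    # AFR = failures / (drive days / 365) * 100
    return {
        model: failures[model] / (drive_days[model] / 365) * 100
        for model in drive_days
    }


# Hypothetical sample: MODEL_A has 365 drive days and 1 failure,
# MODEL_B has 730 drive days and no failures.
sample = (
    [{"model": "MODEL_A", "failure": 0}] * 364
    + [{"model": "MODEL_A", "failure": 1}]
    + [{"model": "MODEL_B", "failure": 0}] * 730
)
rates = annualized_failure_rates(sample)
print(rates)
```

In a warehouse setup like the one described above, the same aggregation would more likely live in a dbt model as a SQL `GROUP BY`; the Python version is just easier to read in a post.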
To increase accessibility, I'm switching to open-source/free-ish tools so anyone can dive in without a cloud signup (plus, my trial expired 🙈). My goal is to provide drive reliability data so creators, homelabbers, businesses, and casual users can feel informed about their next storage purchase. 😊
Check out the repo for details: https://github.com/ammartin8/hard_drive_analytics_dashboard
#DataEngineering #harddrive #opensource #cloud #streamlit #buildinpublic #selfhosting #mastodon #python #fediverse #backblaze