Hi all! I started posting about my data engineering learning journey and thought it would be great to share it here as well!
🚀 Week 3 of Data Engineering Zoomcamp by DataTalksClub complete! I'm really enjoying how hands-on and practical the course is so far!
This week I focused on data warehousing with #Google #BigQuery. Coming from the world of #Microsoft Azure, it was a great experience to get familiar with BigQuery's serverless architecture and how it manages and processes big data at scale. Here's what I learned:
✅️ Created external tables from GCS bucket data sources (CSV/Parquet)
✅️ Used partitioning/clustering to cut query costs & speed up SQL queries
✅️ Used both #Docker & #Kestra to orchestrate the extraction, transfer, and loading of 20+ million NYC taxi records into a GCS bucket
✅️ Learned the advantages of columnar storage and query optimization
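For anyone curious what the external table and partitioning/clustering steps look like, here's a rough sketch in Python that just builds the BigQuery DDL strings. The project, dataset, bucket, and column names are placeholders I made up for illustration (not the exact ones from my repo), so swap in your own:

```python
# Minimal sketch: build BigQuery DDL strings for an external table and a
# partitioned + clustered table. All table, bucket, and column names used
# below are hypothetical placeholders.


def external_table_ddl(table: str, uris: list[str], file_format: str = "PARQUET") -> str:
    """DDL for an external table that reads files directly from GCS."""
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table}`\n"
        f"OPTIONS (format = '{file_format}', uris = [{uri_list}]);"
    )


def partitioned_table_ddl(
    table: str, source: str, partition_col: str, cluster_cols: list[str]
) -> str:
    """DDL for a native table partitioned by date and clustered by columns."""
    return (
        f"CREATE OR REPLACE TABLE `{table}`\n"
        f"PARTITION BY DATE({partition_col})\n"
        f"CLUSTER BY {', '.join(cluster_cols)} AS\n"
        f"SELECT * FROM `{source}`;"
    )


if __name__ == "__main__":
    # Placeholder identifiers; replace with your own project/dataset/bucket.
    ext = external_table_ddl(
        "my_project.my_dataset.taxi_external",
        ["gs://my-bucket/taxi/*.parquet"],
    )
    part = partitioned_table_ddl(
        "my_project.my_dataset.taxi_partitioned",
        "my_project.my_dataset.taxi_external",
        "pickup_datetime",
        ["vendor_id"],
    )
    print(ext)
    print(part)
```

Filtering on the partition column lets BigQuery skip whole partitions, which is where most of the cost savings come from; clustering then sorts the data within each partition so filters on those columns scan fewer blocks.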
Check out my work here: https://github.com/ammartin8/data_engineering_zoom_camp/blob/main/modules/module_3/project_03/README.md
#googlecloud #dataengineering #microsoft #cloud #bigdata #dataanalytics #fedihire #linux #data





