🚕💡 The model is up and running! It predicts ride durations for NY Yellow Taxi trips, and I’m loving the MLOps journey. Now focusing on deploying the model and automating the process. #DataScience #AI #MachineLearning #MLOps #ZoomCamp #DataTalksClub
📊💻 Just completed the linear regression model to predict ride durations based on data from Jan-Feb 2023. Now on to tuning and integrating the model into a Docker container. Next steps ahead! #MachineLearning #DataScience #MLOps #ZoomCamp #DataTalksClub
🗽🚖 Starting with the NY Yellow Taxi dataset from Jan-Feb 2023! Preparing to build a regression model to predict ride durations. Time to dive into the data and start exploring! #MLOps #ZoomCamp #DataTalksClub #MachineLearning
🌟 Just wrapped up the homework for Batch 5 of the Zoomcamp!
I processed and analyzed the yellow_tripdata_2024-10.parquet and taxi_zone_lookup.csv datasets using PySpark and Spark SQL. Feels great to finish a hands-on project! 🏆
#DataEngineering #Zoomcamp #DataTalks #ETL #PySpark #SparkSQL
📈 Spark SQL is amazing!
Today I worked on SQL queries within PySpark to analyze and transform large datasets. This is such a powerful tool for data engineering! 🚀
#DataEngineering #Zoomcamp #PySpark #DataTalks #SparkSQL
💥 Today, I started using Spark on GCP with PySpark.
Worked with yellow_tripdata_2024-10.parquet and taxi_zone_lookup.csv to process data. Learning how Spark handles big data in the cloud is incredible! 🚗
#DataEngineering #Zoomcamp #DataTalks #PySpark #BigData #Spark
🚀 I’ve just started the Zoomcamp Data Engineering by @DataTalksClub!
This module focuses on ETL processing with Spark, Spark SQL, and DataFrames. Excited to dive into big data processing and learn how to use Spark at scale! 🔥
#DataEngineering #Zoomcamp #DataTalks #PySpark
🔧 Now we’re building the pipeline! 🛠️ Transforming the #NYCTaxi data into something useful by processing it with DLT and sending it to DuckDB. 🚂💾 Stay tuned as we turn raw data into insights! #DataEngineering #DuckDB #DLT #Zoomcamp
📥 Next step: pulling the #NYCTaxi dataset using RESTClient. 🗽🔄 It's all about getting that raw data into our system so we can transform it into valuable insights. Excited to see how DLT can streamline this process! 🚀 #DataEngineering #Zoomcamp
🚀 Kicking off the #Zoomcamp by @DataTalksClub today! We’re diving into #DLT, starting with a hands-on workshop. The journey begins by watching the video to understand the fundamentals and get ready for some serious data work! 📊💻 #DataEngineering