Mastodawn

Kedro Dec 7, 2023

Today, Juan Luis Cano Rodríguez from QuantumBlack, AI by McKinsey, will give a workshop at PyData Global titled "Who needs ChatGPT? Rock solid AI pipelines with Hugging Face and Kedro". Attendees will learn how to create a complex AI pipeline using Hugging Face transformers and turn it into a Kedro project that cleanly separates code from configuration and data.

Tune in at 16:00 UTC! https://global2023.pydata.org/cfp/talk/NFZDPN/

#python #pydata #pydataglobal #pydataglobal2023 #kedro #huggingface #aipipelines

Who needs ChatGPT? Rock solid AI pipelines with Hugging Face and Kedro PyData Global 2023

In this tutorial you will learn how to create a complex AI pipeline using Hugging Face transformers, turn it into a Kedro project that cleanly separates code from configuration and data, and deploy it to production so it starts delivering value. To that end, we will build a system that summarizes and classifies social media posts using several Hugging Face pre-trained models. The outline will be as follows:

1. Introduction (5m)
2. Who needs ChatGPT? Commercial vs open-source AI (5m)
3. Fighting spaghetti data science with Kedro (15m)
4. Using Hugging Face models (15m)
5. Separating code from data using the Kedro catalog (10m)
6. Refactoring the code using Kedro pipelines (20m)
7. Deploying to production (15m)
8. Conclusions

Yes, THAT commandasaurus 🦖 Dec 6, 2023

Thanks to everyone who could make our #PyDataGlobal2023 talk today — "Data Tales from an Open Source Research Team"

(Kind of like Duck Tales, but less rolling around in 💰💰💰💰)

And a BIG thank you to the conference organizers and volunteers who made it all happen <3

You can find the slides + our speaker notes here >
https://bit.ly/datatales-pydata-global-2023

#opensource #data #enginerdery

Melissa Santos Dec 6, 2023

anyone else at #PyDataGlobal2023 ? I am excited enough that I got up at 5am without an alarm

Juan Luis Dec 6, 2023

#PyDataGlobal2023 just started and the very first talk was about good practices around Jupyter notebooks: write functions, use git, don't deploy them to production, etc.

We've been on this theme for *years*, and we keep insisting. Aren't we missing some key usability issues around the workflows we propose?

For example, functions: `%autoreload` is known to be flaky (see https://ipython.readthedocs.io/en/stable/config/extensions/autoreload.html#caveats), and yet there's no good solution for developing library code and notebooks together.

#PyDataGlobal #python

autoreload — IPython 8.18.1 documentation
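
To make the reloading complaint above concrete, here is a minimal sketch of one way module reloading can go wrong when library code is edited mid-session. It uses plain `importlib.reload` from the standard library rather than `%autoreload` itself, so it only illustrates the general stale-reference problem that reloading tools have to work around, not the exact behaviour of the IPython extension. The module name `hypothetical_lib` and the function `greet` are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_stale_reference():
    """Edit a module on disk, reload it, and compare old and new references."""
    previous_flag = sys.dont_write_bytecode
    # Skip .pyc files so the second version of the source is always recompiled.
    sys.dont_write_bytecode = True
    with tempfile.TemporaryDirectory() as tmp:
        # "hypothetical_lib" stands in for library code developed next to a notebook.
        module_path = Path(tmp) / "hypothetical_lib.py"
        module_path.write_text("def greet():\n    return 'v1'\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            lib = importlib.import_module("hypothetical_lib")
            # Equivalent to `from hypothetical_lib import greet` in a notebook cell.
            greet = lib.greet

            # Simulate editing the library in an editor while the session is open.
            module_path.write_text("def greet():\n    return 'version 2'\n")
            importlib.reload(lib)

            # The name bound before the reload still points at the old function.
            return greet(), lib.greet()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("hypothetical_lib", None)
            sys.dont_write_bytecode = previous_flag


if __name__ == "__main__":
    old_result, new_result = demo_stale_reference()
    print("reference held before reload:", old_result)
    print("attribute looked up after reload:", new_result)
```

After a plain reload, any name that was bound before the edit keeps pointing at the old object, while attribute lookups on the module see the new code. Reloading tools try to patch over this kind of mismatch, which is one reason the approach has documented caveats.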