merve 🤗

I work at 🤗 and share what I know on #machinelearning & #python 🐍
🇫🇷
What do I do? 👩🏻‍💻🐍🤖
Did you know that you can caption images or retrieve characters from an image (known as OCR) using Image-to-Text models? 🖼️📄
If you want to use them, we've got you covered 🤗
See this task page to learn about it 👉 huggingface.co/tasks/image-to-text
Alara Dirik and I wrote up our disaster response efforts using machine learning, published on the Hugging Face blog https://t.co/yiNVylxNER
Using Machine Learning to Aid Survivors and Race through Time

Our blog post on skops is out! https://www.kdnuggets.com/2023/02/skops-new-library-improve-scikitlearn-production.html 🙇🏻‍♀️ @adrin
skops: A New Library to Improve Scikit-learn in Production - KDnuggets

There are various challenges in MLOps and model sharing, including security and reproducibility. To tackle these for scikit-learn models, we've developed a new open-source library: skops. In this article, I will walk you through how it works and how to use it with an end-to-end example.

this was the best thing I ever tweeted

One piece of structural discrimination: When men who have failed at their agreed-upon goal are given more resources & people to keep working on it; while women who have succeeded at their agreed-upon goal are told they haven't succeeded and the goal is actually different now.

The first part is a type of "failing upwards". The second "shifting goalpost".
Having language to talk about the patterns helps to make them a bit more apparent, I believe, and ultimately can change the system.

Hello 👋🏼
I just uploaded all of my cheatsheets (there's 100+ of them!) on various topics including machine learning, data structures, statistics and more.
📔 GitHub repository: https://github.com/merveenoyan/my_notes
📓 Hugging Face dataset repository: https://huggingface.co/datasets/merve/my_notes
GitHub - merveenoyan/my_notes: My small cheatsheets for data science, ML, computer science and more.

You don't even need to host them openly: when pushing, set `private` to True and your repositories will be completely private (or, if you push to an organization such as a lab, a company or an ML competition team, visible only to people in that organization)
A cool thing about versioning on the Hugging Face Hub is that you can access the info in the model card programmatically to analyze your experiments: run multiple experiments, automatically create and push model cards for all of them, then pull the info from the cards and write a small script to find which model is the best! 🌟
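A small sketch of that last analysis step, using only plain Python. It assumes the metadata of each model card has already been pulled into a dict (for example with the `huggingface_hub` client, not shown here). The function name `best_experiment`, the repository names, the `metrics` key and the numbers are all made up for illustration.

```python
def best_experiment(cards, metric, higher_is_better=True):
    """Return (repo_id, value) for the card with the best value of `metric`.

    `cards` maps a repository name to that card's metadata dict.
    Cards that do not report `metric` are skipped.
    """
    scored = {
        repo_id: meta["metrics"][metric]
        for repo_id, meta in cards.items()
        if metric in meta.get("metrics", {})
    }
    if not scored:
        raise ValueError(f"no card reports the metric {metric!r}")
    pick = max if higher_is_better else min
    repo_id = pick(scored, key=scored.get)
    return repo_id, scored[repo_id]


# Hypothetical metadata, as it might look after reading three model cards.
cards = {
    "my-org/experiment-1": {"metrics": {"accuracy": 0.87, "f1": 0.80}},
    "my-org/experiment-2": {"metrics": {"accuracy": 0.91, "f1": 0.84}},
    "my-org/experiment-3": {"metrics": {"f1": 0.86}},
}

print(best_experiment(cards, "accuracy"))
print(best_experiment(cards, "f1"))
```

Skipping cards that lack the metric keeps one incomplete experiment from breaking the whole comparison.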
skops already saves some of the above information for you, and the rest can be added (as tables, metrics, plots and more!), so you can write one script for your training and run it every time you train a model
and it's a very light dependency ☁️✨

When versioning your experiments, it's best to keep a few pieces of information for better reproducibility:

🧑🏻‍🔬 Your hyperparameters and attributes of preprocessors and architecture (pipeline)
📈 Metrics
👑 Feature importances (which I used ELI5 for)
✅ Requirements of your environment
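A minimal sketch of collecting those four items into one JSON record, using only the Python standard library. The function name `build_experiment_record`, the field names and the example values are hypothetical; this is not the skops or Hub API, just one way the list above could be written down per experiment.

```python
import json
import sys
from importlib import metadata


def build_experiment_record(hyperparameters, metrics, feature_importances, packages):
    """Bundle the items listed above into one JSON-serializable dict."""
    requirements = {}
    for name in packages:
        try:
            requirements[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record a missing package explicitly instead of failing.
            requirements[name] = None
    return {
        "hyperparameters": dict(hyperparameters),
        "metrics": dict(metrics),
        "feature_importances": dict(feature_importances),
        "requirements": requirements,
        "python_version": sys.version.split()[0],
    }


# Hypothetical values for illustration only.
record = build_experiment_record(
    hyperparameters={"classifier__C": 1.0, "scaler__with_mean": True},
    metrics={"accuracy": 0.91},
    feature_importances={"age": 0.6, "income": 0.4},
    packages=["scikit-learn", "skops"],
)
record_json = json.dumps(record, indent=2, sort_keys=True)
print(record_json)
```

Storing the installed package versions next to the metrics means a result can later be matched to the environment that produced it.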