- Tokenizing raw text and converting tokens into token IDs
- Applying byte pair encoding
- Setting up data loaders in PyTorch for efficient training
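The first two steps above can be sketched in a few lines of plain Python. This is a deliberately simplified toy example (a regex-based tokenizer and a vocabulary built from the text itself, both assumptions for illustration); real LLM pipelines use byte pair encoding, and batching is handled by PyTorch's `DataLoader`:

```python
import re

# Toy tokenizer: split on whitespace and punctuation, keep punctuation tokens
text = "Hello, world. Is this-- a test?"
tokens = [t for t in re.split(r'([,.?_!"()\']|--|\s)', text) if t.strip()]

# Build a vocabulary that maps each unique token to an integer token ID
vocab = {token: idx for idx, token in enumerate(sorted(set(tokens)))}

# Encode: tokens -> token IDs
token_ids = [vocab[t] for t in tokens]

# Decode: token IDs -> tokens (inverse mapping)
inverse_vocab = {idx: token for token, idx in vocab.items()}
decoded = [inverse_vocab[i] for i in token_ids]

assert decoded == tokens
```

A BPE tokenizer replaces the regex split with learned subword merges, so out-of-vocabulary words are broken into known subword units instead of failing the `vocab[t]` lookup.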
| Website | https://sebastianraschka.com |
| Blog | https://magazine.sebastianraschka.com |
| GitHub | https://github.com/rasbt |
It's been another wild month in AI & Deep Learning research.
I curated and summarized noteworthy papers here:
https://magazine.sebastianraschka.com/p/ai-research-highlights-in-3-sentences-2a1/
The highlights range from new optimizers for LLMs to new scaling laws for vision transformers.
This article is a compilation of 23 AI research highlights, handpicked and summarized. A lot of exciting developments are currently happening in the fields of natural language processing and computer vision! In addition, if you are curious about last month's highlights, you can find them here:
A new Ahead of AI issue is out, where I am covering the latest research highlights concerning LLM tuning and dataset efficiency:
https://magazine.sebastianraschka.com/p/ahead-of-ai-9-llm-tuning-and-dataset/
In the last couple of months, we have seen a lot of people and companies sharing and open-sourcing various kinds of LLMs and datasets, which is awesome. However, from a research perspective, it has felt more like a race to be out there first (which is understandable) than an effort to do principled analyses.
I just saw that my Ahead of AI magazine crossed the 20k subscriber mark!
https://magazine.sebastianraschka.com
I am incredibly grateful for all the support. Knowing that so many people find my writings useful is very, very motivating!
And stay tuned for the next article featuring the most recent research on finetuning LLMs with less data and LLM evaluation pitfalls.
And beyond LLMs, I am also excited to talk about the most recent, efficient computer vision transformers as well!
Ahead of AI specializes in Machine Learning & AI research and is read by tens of thousands of researchers and practitioners who want to stay ahead in the ever-evolving field.
Just put together a list of papers to highlight 4 interesting things about transformers & LLMs.
Including a discussion on why the original transformer architecture figure is wrong, and a related approach published in 1991!
https://magazine.sebastianraschka.com/p/why-the-original-transformer-figure
A few months ago, I shared the article, Understanding Large Language Models: A Cross-Section of the Most Relevant Literature To Get Up to Speed, and the positive feedback was very motivating! So, I also added a few papers here and there to keep the list fresh and relevant.
New AI research & news everywhere! A short post on my personal approach to keeping up with things.
https://sebastianraschka.com/blog/2023/keeping-up-with-ai.html
The two scenarios in which fully connected layers are equivalent to convolutional layers.
PyTorch-based code illustration here: https://github.com/rasbt/MachineLearning-QandAI-book/blob/main/supplementary/q12-fc-cnn-equivalence/q12-fc-cnn-equivalence.ipynb
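One of the two equivalences can be verified numerically in a few lines: a 1x1 convolution applies the same fully connected layer to the channel vector at every spatial position. (The other scenario is a convolution whose kernel covers the entire input.) The sketch below is a framework-free illustration using plain Python lists; the helper names `fc` and `conv1x1` are mine, not from the linked notebook:

```python
def fc(x, W):
    # Fully connected layer: y[j] = sum_i W[j][i] * x[i]
    return [sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) for w_j in W]

def conv1x1(image, W):
    # 1x1 convolution over an image of shape (height, width, channels):
    # applies the same FC weight matrix to each pixel's channel vector
    return [[fc(pixel, W) for pixel in row] for row in image]

# Weight matrix mapping 2 input channels -> 3 output channels
W = [[1.0, 2.0],
     [0.5, -1.0],
     [3.0, 0.0]]

# A 2x2 "image" with 2 channels per pixel
image = [[[1.0, 0.0], [0.0, 1.0]],
         [[2.0, 3.0], [1.0, 1.0]]]

out = conv1x1(image, W)

# Every output pixel equals the FC layer applied to that pixel's channels
assert out[0][0] == fc([1.0, 0.0], W)
```

In PyTorch terms, this is why an `nn.Conv2d` with `kernel_size=1` has exactly the same weights as an `nn.Linear` mapping the channel dimension, just applied position-wise.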
What an awesome week for open source and the PyTorch ecosystem with three big launches!
- PyTorch 2.0
- Lightning Trainer 2.0 for PyTorch
- Fabric for PyTorch!
Just updated my "faster PyTorch" article to include the latest tools!
Do you love using PyTorch for Deep Learning but wish your code were a bit more organized, so it's easier to take advantage of more advanced features?
Great news: Unit 5 of my free Deep Learning Fundamentals course is finally live! In Unit 5, I'll show you how to train PyTorch models with the Lightning Trainer!
Link to the course: https://lightning.ai/pages/courses/deep-learning-fundamentals/overview-organizing-your-code-with-pytorch-lightning/