I just published "A brief history of the satellite-image-deep-learning GitHub repository"
https://link.medium.com/LsxZDLK3Jwb
🐦🔗: https://twitter.com/robmarkcole/status/1616406634011426816
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Presents VALL-E, a language modeling (LM) approach to TTS that significantly outperforms the SotA zero-shot TTS system in speech naturalness and speaker similarity.
proj: https://valle-demo.github.io/
abs: https://arxiv.org/abs/2301.02111
🐦🔗: https://twitter.com/arankomatsuzaki/status/1611174058699395072
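For intuition, a minimal sketch (not the authors' code) of the core idea: compress audio into discrete neural-codec tokens, then train a decoder-only transformer to predict those tokens conditioned on the text, so TTS becomes next-token prediction. All layer sizes and names below are illustrative.

```python
import torch
import torch.nn as nn

class CodecLM(nn.Module):
    """Decoder-only LM over discrete audio-codec tokens, conditioned on text."""
    def __init__(self, text_vocab=256, codec_vocab=1024, d_model=512, n_layers=6):
        super().__init__()
        self.text_emb = nn.Embedding(text_vocab, d_model)
        self.codec_emb = nn.Embedding(codec_vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, codec_vocab)

    def forward(self, text_ids, codec_ids):
        # One sequence: text prompt followed by the audio-token prefix.
        x = torch.cat([self.text_emb(text_ids), self.codec_emb(codec_ids)], dim=1)
        # Causal mask so each position attends only to earlier positions.
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        h = self.backbone(x, mask=mask)
        # Logits over the codec vocabulary at each audio position.
        return self.head(h[:, text_ids.size(1):])
```

Zero-shot voice cloning then amounts to prompting this LM with codec tokens from a few seconds of the target speaker's audio.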
Cool new Transformer Circuits paper on toy models of memorisation! IMO, the most exciting part is buried at the end - some fascinating exploration from @[email protected] on MNIST, showing that looking at how many "dimensions" a data point takes up can identify memorisation on a real model! https://twitter.com/AnthropicAI/status/1611045993516249088
🐦🔗: https://twitter.com/NeelNanda5/status/1611102356766347265
"We have little mechanistic understanding of how deep learning models overfit to their training data, despite it being a central problem. Here we extend our previous work on toy models to shed light on how models generalize beyond their training data. https://t.co/0bYUToop3m"
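The "dimensionality" measure can be made concrete. A hedged sketch, assuming the per-feature dimensionality formula from the earlier toy-models work applied per data point (my reading of the idea, not Anthropic's released code):

```python
import torch

def datapoint_dimensionality(H: torch.Tensor) -> torch.Tensor:
    """H: (n_examples, d_hidden) hidden vectors, one row per data point.
    Returns D_i = ||h_i||^4 / sum_j (h_i . h_j)^2 for each example i."""
    G = H @ H.T                    # Gram matrix of pairwise dot products
    norms4 = G.diagonal() ** 2     # ||h_i||^4
    return norms4 / (G ** 2).sum(dim=1)
```

A point that gets a hidden direction almost to itself scores near 1, a hint of memorisation; points sharing directions in superposition score much lower.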
#ChatGPT cannot scrape the web and has limited knowledge of the world after 2021.
Introducing `WebChatGPT`, a mighty Chrome extension that augments your prompts with relevant results from the web! 🤯
See my demo video below 👇 and install it here:
👉 https://chrome.google.com/webstore/detail/web-chatgpt/lpfemeioodjbpieminkklglpmhlngfcn
🐦🔗: https://twitter.com/DataChaz/status/1610556519531089921
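The trick is plain prompt augmentation. A hedged Python sketch of the idea, where `search_web` is a hypothetical helper standing in for whatever search backend the extension actually calls:

```python
def augment_prompt(question: str, search_web) -> str:
    """Prepend fresh web results to the user's question before it is
    sent to the model. `search_web` is a hypothetical search client."""
    results = search_web(question, num_results=3)
    context = "\n".join(
        f"[{i + 1}] {r['title']}: {r['snippet']}" for i, r in enumerate(results)
    )
    return (f"Web results:\n{context}\n\n"
            f"Using the web results above, answer: {question}")
```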
The architecture of GPT3.
http://jalammar.github.io/how-gpt3-works-visualizations-animations/
🐦🔗: https://twitter.com/Grady_Booch/status/1610156904042594310
The tech world is abuzz with GPT3 hype. Massive language models (like GPT3) are starting to surprise us with their abilities. While not yet completely reliable for most businesses to put in front of their customers, these models are showing sparks of cleverness that are sure to accelerate the march of automation and the possibilities of intelligent computer systems. Let's remove the aura of mystery around GPT3 and learn how it's trained and how it works. A trained language model generates text. We can optionally pass it some text as input, which influences its output. The output is generated from what the model "learned" during its training period where it scanned vast amounts of text.
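Those last two sentences are easy to see in code. A minimal sketch using GPT-2, a small public stand-in for GPT-3, via the Hugging Face `transformers` library:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# The prompt influences, but does not fully determine, the continuation.
inputs = tokenizer("The tech world is abuzz with", return_tensors="pt")
# Autoregressive decoding: repeatedly predict the next token given
# everything generated so far.
out = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```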
Can we compress large language models for better perf?
"SparseGPT: Massive Language Models can be Accurately Pruned in One Shot"
Eliminates the need to use/store 50% of weights for a 175B param model with no significant sacrifice in perf
https://arxiv.org/pdf/2301.00774.pdf
Here's how 👇
🐦🔗: https://twitter.com/mathemagic1an/status/1610159526598311936
Muse: Text-To-Image Generation via Masked Generative Transformers
Presents Muse, a text-to-image Transformer model that achieves SotA image generation perf while being far more efficient than diffusion or AR models.
proj: https://muse-model.github.io/
abs: https://arxiv.org/abs/2301.00704
🐦🔗: https://twitter.com/arankomatsuzaki/status/1610088296922718208
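The efficiency comes from decoding many image tokens in parallel rather than one at a time. A rough, illustrative sketch of that masked parallel-decoding loop, simplified from MaskGIT-style decoding (all names and the schedule here are assumptions; the real model also conditions on T5 text embeddings and decodes VQ image tokens):

```python
import torch

def parallel_decode(model, seq_len=256, steps=8, mask_id=0):
    """Start fully masked; each step, commit the most confident predictions."""
    tokens = torch.full((1, seq_len), mask_id)
    for step in range(steps):
        logits = model(tokens)                        # (1, seq_len, vocab)
        probs, preds = logits.softmax(-1).max(-1)     # confidence + argmax
        masked = tokens == mask_id
        # Spread the remaining masked slots evenly over the remaining steps.
        n_unmask = int(masked.sum()) // (steps - step)
        conf = probs.masked_fill(~masked, -1.0)       # only masked slots compete
        idx = conf.topk(n_unmask, dim=-1).indices
        tokens.scatter_(1, idx, preds.gather(1, idx))
    return tokens
```

A handful of such steps replaces the hundreds of sequential sampling steps a diffusion or AR decoder would need.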
Deep Learning with PyTorch - University of Amsterdam (UvA)
A fantastic series of tutorials covering a wide array of topics, from PyTorch basics and neural-network fundamentals to architectures (CNNs, transformers, GNNs), generative networks, and contrastive learning.
https://uvadlc-notebooks.readthedocs.io/en/latest/
🐦🔗: https://twitter.com/Jeande_d/status/1609999660177059840
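As a taste of where the tutorials start, a self-contained PyTorch training step (illustrative only, not taken from the notebooks):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 784)             # dummy batch of flattened images
y = torch.randint(0, 10, (32,))      # dummy class labels
opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()                      # backprop
opt.step()                           # one gradient update
print(f"loss: {loss.item():.3f}")
```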
Final day of a lovely trip to Bangalore. Looking forward to many future visits, especially to visit a dynamic incoming NLP prof at @[email protected], my dear friend and former student @[email protected].
(Attn prospective graduate students: get in touch w Danish!)
🐦🔗: https://twitter.com/zacharylipton/status/1608653472685260801
We are running out of a vital resource: words!
There are "only" 5 to 10 trillion high-quality words (papers, books, code) on the internet. Our AI models will have used all of that for training by 2026. Low-quality data (tweets, fanfic) will last to 2040. https://arxiv.org/pdf/2211.04325.pdf
🐦🔗: https://twitter.com/emollick/status/1605756428941246466
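The claim is simple compounding arithmetic. A back-of-envelope sketch with illustrative numbers (the stock, starting point, and growth rate below are round assumptions of mine, not the paper's exact estimates):

```python
stock = 9e12        # ~9T high-quality words, near the top of the 5-10T range
used = 1e12         # assumed cumulative words used for training as of 2022
growth = 2.0        # assumed: training datasets roughly double each year
year = 2022
while used < stock:
    year += 1
    used *= growth
print(year)         # -> 2026 under these toy assumptions
```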