@Manjunath

1 Follower
17 Following
70 Posts

RT @[email protected]

@[email protected] @[email protected] Great write up! We studied a similar problem too. We looked at when and what augmentations help for various classification models. Link to the paper - https://arxiv.org/abs/2210.06441

πŸ¦πŸ”—: https://twitter.com/gowthami_s/status/1620205032325906434

How Much Data Are Augmentations Worth? An Investigation into Scaling Laws, Invariance, and Implicit Regularization

Despite the clear performance benefits of data augmentations, little is known about why they are so effective. In this paper, we disentangle several key mechanisms through which data augmentations operate. Establishing an exchange rate between augmented and additional real data, we find that in out-of-distribution testing scenarios, augmentations which yield samples that are diverse but inconsistent with the data distribution can be even more valuable than additional training data. Moreover, we find that data augmentations which encourage invariances can be more valuable than invariance alone, especially on small and medium-sized training sets. Following this observation, we show that augmentations induce additional stochasticity during training, effectively flattening the loss landscape.

arXiv.org
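The "exchange rate" idea in the abstract can be made concrete with a toy augmentation. Below is a minimal, illustrative sketch (not code from the paper): a random horizontal flip turns one stored sample into two possible training views, so the model effectively sees more data than is on disk.

```python
import random

def hflip(img):
    """Horizontally flip a 2-D image given as a list of rows."""
    return [row[::-1] for row in img]

def augment(img, p=0.5, rng=random):
    """Apply a horizontal flip with probability p -- a minimal augmentation."""
    return hflip(img) if rng.random() < p else img

# A tiny 2x3 "image".
img = [[1, 2, 3],
       [4, 5, 6]]

# Across epochs the model sees (possibly) different views of the same
# sample -- the mechanism the paper prices against genuinely new data.
views = {tuple(map(tuple, augment(img))) for _ in range(100)}
print(len(views))  # with p=0.5, almost surely both views appear
```

The stochasticity this injects into each batch is exactly what the abstract credits with flattening the loss landscape.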

RT @[email protected]

Just a few years ago, deep neural networks trained on ImageNet reached only ~75% accuracy on the test set. Now we can get 80% zero-shot accuracy.

Zero-shot performance is what matters for general-purpose AI. @[email protected]’s HPC cluster trained these impactful open-source models🔥 https://twitter.com/laion_ai/status/1618317487283802113

πŸ¦πŸ”—: https://twitter.com/hardmaru/status/1619270829828874240

LAION on Twitter

“We release a new ViT-G/14 CLIP model with OpenCLIP which achieves 80.1% zero-shot accuracy on ImageNet and 74.9% zero-shot image retrieval (Recall@5) on MS COCO. As of January 2023, this is the best open source CLIP model. https://t.co/TmVTUP3tBx https://t.co/PMnpUUTNpc”

Twitter
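Zero-shot classification with a CLIP-style model boils down to a nearest-text-embedding lookup: no labeled training data for the task is needed, which is why zero-shot accuracy is the headline number. A toy sketch with hand-made 2-D embeddings standing in for real encoder outputs:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (plain lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def zero_shot_classify(image_emb, text_embs):
    """CLIP-style zero-shot prediction: pick the caption whose text
    embedding is most similar to the image embedding."""
    return max(text_embs, key=lambda label: cosine(image_emb, text_embs[label]))

# Toy 2-D embeddings (illustrative, not real CLIP outputs).
text_embs = {"a photo of a cat": [1.0, 0.1],
             "a photo of a dog": [0.1, 1.0]}
print(zero_shot_classify([0.9, 0.2], text_embs))  # -> a photo of a cat
```

The real ViT-G/14 model does the same similarity-and-argmax, just in a much higher-dimensional joint embedding space.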

RT @[email protected]

Please retweet!

Job opening at NICT, Japan.
Post: Technical Researcher
Topic: Indic Natural Language Generation
Minimum qualification: a Master's in CS (plans to pursue a PhD and prior publications are a plus)

https://nict.go.jp/en/employment/index-e-top2022-4.html
https://www2.nict.go.jp/employment/tempstaffinfo/exr-e/R4/2022T-88.pdf

#NLP
#NLProc

πŸ¦πŸ”—: https://twitter.com/prajdabre1/status/1619136116556431360

nict.go.jp

RT @[email protected]

What if you could share your dreamboothed or fine-tuned Stable Diffusion model in just a 3.3 MB file?

You can now, with LoRA:
https://huggingface.co/blog/lora

πŸ¦πŸ”—: https://twitter.com/pcuenq/status/1618633578979667968

Using LoRA for Efficient Stable Diffusion Fine-Tuning

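The 3.3 MB figure comes from LoRA storing only a low-rank update to the frozen base weights instead of the full fine-tuned model. A minimal, stdlib-only sketch of the core idea (illustrative; the blog post's real implementation uses PyTorch and diffusers):

```python
import random

def matmul(A, B):
    """Naive matrix multiply for lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_weight(W, A, B, alpha=1.0):
    """Effective weight W + (alpha/r) * B @ A, the core LoRA update.
    Only the low-rank factors A (r x d_in) and B (d_out x r) are
    trained and shipped; W stays frozen."""
    r = len(A)
    delta = matmul(B, A)
    return [[w + (alpha / r) * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]

d, r = 64, 4
W = [[0.0] * d for _ in range(d)]          # frozen base weight
A = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(r)]
B = [[0.0] * r for _ in range(d)]          # B starts at zero in LoRA

# With B = 0 the adapted weight equals the base weight, so training
# starts exactly from the pretrained model.
assert lora_weight(W, A, B) == W

# Parameter savings: 2*d*r trainable factors vs d*d for a full update.
print(2 * d * r, "vs", d * d)  # 512 vs 4096
```

Scale d to the thousands of a real attention layer and the ratio is what shrinks a multi-gigabyte checkpoint down to a few megabytes of adapter weights.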

RT @[email protected]

I just published "A brief history of the satellite-image-deep-learning GitHub repository"

https://link.medium.com/LsxZDLK3Jwb

πŸ¦πŸ”—: https://twitter.com/robmarkcole/status/1616406634011426816

RT @[email protected]

Who is actively hiring data scientists, machine learning professionals, and data-centered people in general?

If your company is looking for talent, please reply to this tweet.

πŸ¦πŸ”—: https://twitter.com/svpino/status/1616420296181022723


If you have been laid off, please look at the open positions we have at @[email protected]. We are hiring!

https://careers.hpe.com/us/en/search-results?keywords=determined%20AI

#googlelayoff #layoffs

Search results | Find available job openings at HPE

HPE

RT @[email protected]

🦖Large Transformers are powerful but expensive to train & use. The extremely high inference cost is a big bottleneck for adopting them for solving real-world tasks at scale. Check out my new post on some ideas on inference optimization for Transformers: https://lilianweng.github.io/posts/2023-01-10-inference-optimization/

πŸ¦πŸ”—: https://twitter.com/lilianweng/status/1613305587445665796

Large Transformer Model Inference Optimization

[Updated on 2023-01-24: add a small section on Distillation.] Large transformer models are mainstream nowadays, creating SoTA results for a variety of tasks. They are powerful but very expensive to train and use. The extremely high inference cost, in both time and memory, is a big bottleneck for adopting a powerful transformer for solving real-world tasks at scale. Why is it hard to run inference for large transformer models? Besides the increasing size of SoTA models, there are two main factors contributing to the inference challenge (Pope et al.
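One widely used optimization in this space is the key-value cache: keep past keys and values around so each new token attends against the cache instead of re-running attention over the whole sequence from scratch. A stdlib-only toy sketch (single head, no projections; all names are illustrative):

```python
import math

def attend(q, keys, values):
    """Single-head scaled dot-product attention of query q over the
    cached keys/values (plain lists of floats)."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
              for k in keys]
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

class KVCache:
    """Append each step's key/value once, so decoding step t costs
    O(t) attention work instead of recomputing all past projections."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        self.keys.append(k)
        self.values.append(v)
        return attend(q, self.keys, self.values)

cache = KVCache()
out1 = cache.step([1.0, 0.0], [1.0, 0.0], [1.0, 2.0])
out2 = cache.step([1.0, 0.0], [0.0, 1.0], [3.0, 4.0])
print(out1)  # first step attends only to itself -> [1.0, 2.0]
```

The trade-off the post analyzes follows directly: the cache turns compute into memory, and that memory grows linearly with sequence length and batch size.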

RT @[email protected]

Right before #SC22 I had a great conversation with @[email protected] on Scaling with @[email protected] and Frontier. Summarized the conversation & thoughts into a new blog: “AMD Instinct: Scaling the Heights of #GPU Acceleration for a New Supercomputing Era” https://bit.ly/Instinct_Scaling

πŸ¦πŸ”—: https://twitter.com/wkmyrhang/status/1613259008193740800

AMD Instinct: Scaling the Heights of GPU Acceleration for a New Supercomputing Era

Since its inception, the Graphics Processing Unit (GPU) has had promising possibilities as an accelerator for tasks other than graphics rendering. While the GPU in a gaming PC and one used as a general-purpose accelerator have considerable base-feature overlap, there are many benefits when optimizin...

AMD.com

RT @[email protected]

Hey #OpenCL/#SYCL/#CUDA developers! I have just #opensourced a small but powerful tool to optimize #GPU performance: A profiler to count #PTX #assembly instructions, listing flops & memory/cache accesses per #kernel for #roofline model analysis. 🖖🧐🔎💻
https://github.com/ProjectPhysX/PTXprofiler

πŸ¦πŸ”—: https://twitter.com/ProjectPhysX/status/1613478743540117504

GitHub - ProjectPhysX/PTXprofiler: A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.

GitHub
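The roofline model the profiler targets relates a kernel's arithmetic intensity (FLOPs per byte moved) to the performance it can attain on given hardware. A quick back-of-the-envelope sketch with hypothetical hardware numbers:

```python
def roofline(flops, bytes_moved, peak_flops, peak_bw):
    """Attainable performance under the roofline model:
    min(peak compute, arithmetic intensity * peak bandwidth)."""
    ai = flops / bytes_moved               # FLOPs per byte
    return min(peak_flops, ai * peak_bw)

# Hypothetical kernel: 2 GFLOP of work over 8 GB of traffic
# -> AI = 0.25 FLOP/byte. On a GPU with 10 TFLOP/s peak compute
# and 1 TB/s peak bandwidth, it is bandwidth-bound:
print(roofline(2e9, 8e9, 10e12, 1e12))  # 0.25 * 1e12 = 2.5e11 FLOP/s
```

Counting a kernel's PTX instructions and memory accesses, as the tool does, is exactly what gives you the `flops` and `bytes_moved` inputs to place the kernel on this plot.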