Mastodawn

Mehrdad Yazdani Dec 17, 2024

Checking in on my AI character

Mehrdad Yazdani Sep 15, 2024

AI-generated image of a popular meme in the style of medieval art: https://glif.app/@fab1an/glifs/cm128fz6e000312o77ai2ae97

glif - Weird Medieval Style by fab1an

glif

Mehrdad Yazdani Sep 9, 2024

Attention is All You Need gets so much, ahem, attention that Google had to put this on top of the PDF.

Mehrdad Yazdani Sep 4, 2024

Mixture of experts, when implemented right, is so impressive: https://arxiv.org/abs/2409.02060

OLMoE: Open Mixture-of-Experts Language Models

We introduce OLMoE, a fully open, state-of-the-art language model leveraging sparse Mixture-of-Experts (MoE). OLMoE-1B-7B has 7 billion (B) parameters but uses only 1B per input token. We pretrain it on 5 trillion tokens and further adapt it to create OLMoE-1B-7B-Instruct. Our models outperform all available models with similar active parameters, even surpassing larger ones like Llama2-13B-Chat and DeepSeekMoE-16B. We present various experiments on MoE training, analyze routing in our model showing high specialization, and open-source all aspects of our work: model weights, training data, code, and logs.

arXiv.org
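
The abstract's point that only a fraction of the parameters is used per token comes from sparse routing: a router scores every expert, and only the top few actually run. Below is a minimal sketch of top-k routing, not OLMoE's actual implementation. The expert count, the value of `k`, the scalar "experts", and the names `top_k_route` and `moe_layer` are all illustrative assumptions, and normalizing over only the selected experts is just one common variant.

```python
import math


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, experts, router_logits, k):
    """Weighted sum of the outputs of the selected experts only."""
    out = 0.0
    for idx, weight in top_k_route(router_logits, k):
        out += weight * experts[idx](x)  # unselected experts never run
    return out


# Toy setup (hypothetical): 4 scalar "experts" that scale their input.
experts = [lambda x, s=s: s * x for s in range(4)]
router_logits = [0.0, 1.0, 3.0, 2.0]  # in a real model, computed from the token

routes = top_k_route(router_logits, k=2)
output = moe_layer(1.0, experts, router_logits, k=2)
print(routes, output)
```

With `k=2` out of 4 experts, only half of the expert parameters are touched for this input, which is the same idea behind a model whose active parameter count is much smaller than its total.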

Mehrdad Yazdani Sep 1, 2024

Big Bear early in the morning on Labor Day weekend.

Mehrdad Yazdani Aug 29, 2024

Adapting to the switch from Jupyter notebooks to VSCode. I still like the look and feel of Jupyter more, but the additional bells and whistles VSCode has are extra nice.

Mehrdad Yazdani Aug 16, 2024

Man I love this thing

Mehrdad Yazdani Aug 14, 2024

Neat free tool to generate directory tree structures in ASCII: https://ascii-tree-generator.com/

Pretty useful when trying to figure out the structure for a project and brainstorming. Curious how the ASCII renders here:

```
my_project/
├─ models/
│ ├─ model_1.py
│ ├─ model_2.py
├─ data/
│ ├─ data_prep.py
│ ├─ meta_data.json
├─ results/
├─ checkpoints/
trainer.py
runner.py
```

ASCII Tree Generator

Online interactive folder structure generator. Easily create and visualise your development tree for your new projects and your documentations.
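
For anyone who would rather generate such a tree from a script, here is a minimal sketch that renders a nested dict with box-drawing connectors. It is not how the linked tool works; `render_tree` and the `project` dict are hypothetical names, and placing `trainer.py` and `runner.py` inside `my_project/` is an assumption about the intended layout.

```python
def render_tree(tree, prefix=""):
    """Render a nested dict as tree lines; a dict is a folder, None is a file."""
    lines = []
    entries = list(tree.items())
    for i, (name, subtree) in enumerate(entries):
        last = i == len(entries) - 1
        connector = "└─ " if last else "├─ "
        label = name + "/" if isinstance(subtree, dict) else name
        lines.append(prefix + connector + label)
        if isinstance(subtree, dict):
            # Keep the vertical bar going only while siblings remain below.
            extension = "   " if last else "│  "
            lines.extend(render_tree(subtree, prefix + extension))
    return lines


# Assumed layout, mirroring the example in the post.
project = {
    "models": {"model_1.py": None, "model_2.py": None},
    "data": {"data_prep.py": None, "meta_data.json": None},
    "results": {},
    "checkpoints": {},
    "trainer.py": None,
    "runner.py": None,
}

lines = render_tree(project)
print("my_project/")
print("\n".join(lines))
```

Using a plain dict keeps the sketch independent of the filesystem, so it works for brainstorming a layout before any folders exist.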

Mehrdad Yazdani Aug 9, 2024

OK, ok, I may be converting into a wandb user. Have to admit this feature of showing the run history as ASCII art is particularly neat.

Mehrdad Yazdani Aug 9, 2024

Pretty neat transformer visualization project. Wonder if this could help with debugging 😜

https://poloclub.github.io/transformer-explainer/

Transformer Explainer: LLM Transformer Model Visually Explained

An interactive visualization tool showing you how transformer models work in large language models (LLM) like GPT.

Blog	https://crude2refined.wordpress.com/
GitHub	https://github.com/myazdani
Twitter	@crude2refined