J Λ M Ξ S

@jmsdnns
388 Followers
524 Following
5K Posts
If I had to live my life again, I'd make the same mistakes, only sooner.
--
🖖🤘✌️
About Me: https://jmsdnns.com
Code: https://github.com/jmsdnns
Band: https://soundcloud.com/americanf00d

@Kingwulf fascinated to see that Hierarchical Reasoning Model proving itself over time

https://bsky.app/profile/did:plc:oxbzsf2flrqyobhxemowaceh/post/3m2rkwru5fk2y

Sebastian Raschka (rasbt) (@sebastianraschka.com)

From the Hierarchical Reasoning Model (HRM) to a new Tiny Recursive Model (TRM). A few months ago, the HRM made big waves in the AI research community as it showed really good performance on the ARC challenge despite its small 27M size. (That's about 22x smaller than the smallest Qwen3 0.6B model.)
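The "about 22x smaller" figure checks out as simple arithmetic on the stated parameter counts; a quick back-of-envelope verification (counts are approximate):

```python
# Size comparison from the post: HRM has ~27M parameters,
# the smallest Qwen3 model has ~0.6B (600M) parameters.
hrm_params = 27_000_000
qwen3_smallest_params = 600_000_000

ratio = qwen3_smallest_params / hrm_params
print(f"Qwen3 0.6B is ~{ratio:.0f}x larger than HRM")  # ~22x
```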

looks like I'll be helping Wharton with their new GenAI Studio this fall. pretty awesome folks will be running it too, including my buddy CCB

https://gail.wharton.upenn.edu/gen-ai-studio/

Generative AI Studio

Wharton Generative AI Labs

I used Claude to build a static web server in COBOL. I don't have a clue how to read or write COBOL, but the server works!

https://github.com/jmsdnns/webbol

GitHub - jmsdnns/webbol: A minimal static web server written in COBOL

🎶 Terminus, a techstep mix by f1rstpers0n

if you like drum n bass that sounds like it's from the future, CLICK PLAY

https://youtu.be/P3oA9Tuykr4

Terminus | Techstep + Drum and Bass Mix

Bold claims from Nvidia!

> Our Jet-Nemotron-2B model achieves comparable or superior accuracy to Qwen3, Qwen2.5, Gemma3, and Llama3.2 across a comprehensive suite of benchmarks while delivering up to 53.6x generation throughput speedup and 6.1x prefilling speedup.

https://arxiv.org/abs/2508.15884v1

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

We present Jet-Nemotron, a new family of hybrid-architecture language models, which matches or exceeds the accuracy of leading full-attention models while significantly improving generation throughput. Jet-Nemotron is developed using Post Neural Architecture Search (PostNAS), a novel neural architecture exploration pipeline that enables efficient model design. Unlike prior approaches, PostNAS begins with a pre-trained full-attention model and freezes its MLP weights, allowing efficient exploration of attention block designs. The pipeline includes four key components: (1) learning optimal full-attention layer placement and elimination, (2) linear attention block selection, (3) designing new attention blocks, and (4) performing hardware-aware hyperparameter search. Our Jet-Nemotron-2B model achieves comparable or superior accuracy to Qwen3, Qwen2.5, Gemma3, and Llama3.2 across a comprehensive suite of benchmarks while delivering up to 53.6x generation throughput speedup and 6.1x prefilling speedup. It also achieves higher accuracy on MMLU and MMLU-Pro than recent advanced MoE full-attention models, such as DeepSeek-V3-Small and Moonlight, despite their larger scale with 15B total and 2.2B activated parameters.
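The abstract lays out PostNAS as a four-stage search over attention blocks with MLP weights held fixed. A heavily simplified, hypothetical sketch of that flow (every name, structure, and heuristic below is illustrative, not from the Jet-Nemotron code):

```python
# Hypothetical sketch of the four PostNAS stages named in the abstract.
# All structures and choices here are illustrative placeholders.

def post_nas(n_layers: int) -> dict:
    # Stage 0: start from a "pre-trained" full-attention model with
    # frozen MLP weights, so the search only touches attention blocks.
    model = {"layers": [{"attn": "full", "mlp_frozen": True}
                        for _ in range(n_layers)]}

    # (1) Learn full-attention layer placement/elimination. As a
    # stand-in heuristic, keep full attention in every fourth layer.
    for i, layer in enumerate(model["layers"]):
        if i % 4 != 0:
            layer["attn"] = "linear"

    # (2) Select a linear attention block for the converted layers,
    # and (3) optionally design a new block instead of an existing one.
    chosen_block = "gated_linear_attention"  # stand-in choice
    for layer in model["layers"]:
        if layer["attn"] == "linear":
            layer["attn"] = chosen_block

    # (4) Hardware-aware hyperparameter search (e.g. head dimension).
    model["head_dim"] = 128  # stand-in for a searched value
    return model

model = post_nas(8)
full = sum(layer["attn"] == "full" for layer in model["layers"])
print(f"{full} full-attention layers out of {len(model['layers'])}")
```

The interesting design choice the abstract highlights is freezing the expensive-to-train MLP weights, which shrinks the search space to attention blocks only.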

🤖 Finance Language Model Evaluation (FLaME)

> We are the first research paper to comprehensively study LMs against 'reasoning-reinforced' LMs, with an empirical study of 23 foundation LMs over 20 core NLP tasks in finance. We open-source our framework software along with all data and results.

https://arxiv.org/abs/2506.15846

Finance Language Model Evaluation (FLaME)

Language Models (LMs) have demonstrated impressive capabilities with core Natural Language Processing (NLP) tasks. The effectiveness of LMs for highly specialized knowledge-intensive tasks in finance remains difficult to assess due to major gaps in the methodologies of existing evaluation frameworks, which have caused an erroneous belief in a far lower bound of LMs' performance on common Finance NLP (FinNLP) tasks. To demonstrate the potential of LMs for these FinNLP tasks, we present the first holistic benchmarking suite for Financial Language Model Evaluation (FLaME). We are the first research paper to comprehensively study LMs against 'reasoning-reinforced' LMs, with an empirical study of 23 foundation LMs over 20 core NLP tasks in finance. We open-source our framework software along with all data and results.
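The study's scale (23 foundation LMs over 20 finance NLP tasks) is essentially a large evaluation grid; a minimal sketch of that shape, with placeholder model and task names not taken from the paper:

```python
# Illustrative sketch of a models-by-tasks evaluation grid like the one
# FLaME describes. All names and the scorer below are placeholders.

models = ["base_lm_a", "base_lm_b", "reasoning_lm_c"]
tasks = ["sentiment", "causal_classification", "summarization", "qa"]

def run_task(model: str, task: str) -> float:
    # Stand-in scorer; a real harness would prompt the model and
    # score its outputs against task-specific gold labels.
    return round(len(model + task) % 10 / 10, 1)

results = {m: {t: run_task(m, t) for t in tasks} for m in models}
print(f"ran {len(models) * len(tasks)} (model, task) evaluations")
```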

🎶 All Cops Are Biomechs, by Bonginator

> my name is robocop
> your name is no one fucking cares

https://youtu.be/1-WEUPMfYCw

Bonginator - All Cops Are Biomechs [Official Music Video]

anyone out there hiring #rustlang programmers? looking for a full-time role after doing time in academia

my most recent rust project is killabeez https://github.com/jmsdnns/killabeez

GitHub - jmsdnns/killabeez: A tool for using pools of EC2 instances to do all kinds of things.

🎸 Panasonic Youth, by Dillinger Escape Plan (guitar cover)

what a beast of a song... cripes. i spent a month watching ben eller videos to get the skills to play this.

if you know who Dillinger is, you know how insane their music is

https://www.youtube.com/watch?v=jdYtasXC1d0

Dillinger Escape Plan - Panasonic Youth (guitar cover)
