700 experiments while you sleep: Karpathy open-sources autoresearch, an AI research automation tool

autoresearch, released by Karpathy, is a tool in which an AI agent autonomously iterates ML experiments to improve a model. Over 700 experiments it cut GPT-2 training time by 11%, and Shopify surpassed the performance of its existing model with a model half the size.

https://aisparkup.com/posts/10011

Truth is coming and cannot be stopped – In Manchester, UK

“Truth is coming and cannot be stopped” – By SLM and D7606 in support of Edward Snowden. In Manchester, UK.

https://streetartutopia.com/2013/08/26/truth-is-coming-and-cannot-be-stopped-in-manchester-uk/


In my journey to make @silex AI-native and optimized for free #OpenSource local models

I just released a grapesjs-ai-capabilities plugin, inspired by #WordPress new capability API: https://github.com/silexlabs/grapesjs-ai-capabilities

Now I'll refactor all the #GrapesJS plugins I maintain to expose capabilities as #MCP tools, with minimal prompts and very specific errors for #SLM

Then I’ll benchmark different models and maybe experiment with fine-tuning

#GrapesJS #AI #LocalAI #NoCode #BuildInPublic #FOSS #MCP #LLM

GitHub - silexlabs/grapesjs-ai-capabilities: Lightweight discovery layer on top of GrapesJS commands, allowing plugins to declare structured, machine-readable capabilities that builders can expose as MCP tools

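The discovery-layer idea above can be illustrated with a minimal sketch. This is a hypothetical, simplified pattern, not the plugin's actual API: the names `registerCapability` and `listCapabilities` and the capability shape are illustrative assumptions. The point is that plugins declare structured, machine-readable capabilities on top of commands, and a builder can enumerate them to expose as MCP tools.

```javascript
// Hypothetical sketch of a capability discovery layer (names are
// illustrative assumptions, not the grapesjs-ai-capabilities API).
const capabilities = new Map();

// A plugin declares a machine-readable capability on top of a command.
function registerCapability(id, { command, description, params }) {
  capabilities.set(id, { command, description, params });
}

// A builder enumerates declared capabilities, e.g. to expose them as MCP tools.
function listCapabilities() {
  return [...capabilities.entries()].map(([id, cap]) => ({ id, ...cap }));
}

// Example: a plugin declares one capability with a minimal description,
// suitable for a small model's limited context.
registerCapability('set-background', {
  command: 'style:set-background',
  description: 'Set the background color of the selected element',
  params: { color: 'string (CSS color)' },
});
```

Keeping descriptions short and errors specific, as the post suggests, matters most for small local models, which have less room to recover from vague tool specs.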

AISatoshi (@AiXsatoshi)

Reports about 15 tokens/s when running Qwen3.5-4B-Q4 inference on a TR7975 CPU. Expresses particular surprise that an SLM in a 4B, 4-bit configuration passed the Tetris test.

https://x.com/AiXsatoshi/status/2028496680862224649

#qwen #cpu #tr7975 #slm #inference

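Why a 4B model at 4-bit runs in CPU RAM at all comes down to simple memory arithmetic; a rough back-of-the-envelope sketch (ignoring KV cache, activations, and quantization metadata such as scales and zero-points):

```javascript
// Rough estimate of weight memory for a quantized model.
// Ignores KV cache, activations, and quantization overhead.
function weightMemoryGiB(params, bitsPerWeight) {
  const bytes = params * (bitsPerWeight / 8);
  return bytes / 1024 ** 3; // bytes -> GiB
}

// 4B parameters at 4 bits/weight ≈ 1.86 GiB of weights,
// versus ≈ 7.45 GiB at fp16 — easily within ordinary CPU RAM.
const q4 = weightMemoryGiB(4e9, 4);
const fp16 = weightMemoryGiB(4e9, 16);
```

At roughly 2 GiB of weights, memory bandwidth rather than capacity becomes the limiting factor, which is consistent with a throughput in the tens of tokens per second on a high-end desktop CPU.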

AISatoshi (@AiXsatoshi)

Reports that Qwen3.5-4B-Q4 achieved a success rate above 80% on a Tetris-generation test. Rates this as unusually high accuracy for the parameter scale (4B), and the poster remarks that hybrid-attention SLMs are ushering in a new era.

https://x.com/AiXsatoshi/status/2028500178043617538

#qwen #llm #slm #quantization #tetris

Small Language Models: building the architecture of the new newsrooms

The big strategic bet for the media sector is not just getting journalists to use AI tools, but designing a technology architecture of their own.

Digital_Journey

Geometry > Scale: How 40M parameters on the E8 lattice beat classical transformers

Folks, it seems we've hit a wall. While the giants keep scaling parameters and burning terawatts trying to squeeze a drop of intelligence out of statistics, I decided to rethink the very foundation. The problem isn't the data; the problem is the "viscosity" of standard Attention.

https://habr.com/ru/articles/1005298/

#llm #E8 #transformer #transformers #edgeai #slm

Introduction to Small Language Models: The Complete Guide for 2026 - MachineLearningMastery.com

Learn when small language models outperform large models while cutting AI deployment costs by 95%.


Abhishek Yadav (@abhishek__AI)

Introduces 7 small language models that can run locally: notably Gemma 2 9B, SmolLM2, Llama 3.2, Ministral 3 8B, Qwen 2.5 7B, and Phi-3.5 Mini, all runnable without a GPU. Each model is optimized for a specific use, such as safety, prototyping, edge devices, coding and math tasks, or RAG.

https://x.com/abhishek__AI/status/2025671499974324705

#slm #llm #opensource #ai #localmodels


"Why data (token) quality trumps quantity, and how Small Language Models (SLMs) will make it possible to decentralize intelligence."

https://www.youtube.com/watch?v=wirmd5kCWok

#aLaFrench #ia #slm #llm

AI: Why does size no longer matter?
