Mastodawn

Hume AI just dropped TADA (Text-Acoustic Dual Alignment) – an open-source framework for expressive speech generation.
✨ 1:1 text-audio token alignment
✨ Precise prosody & timing control
✨ Multilingual (DE/EN/FR/ES/JA/AR)
✨ Built on Llama 3.2 (1B/3B)
🔗 https://github.com/HumeAI/tada

AIME Mar 6

Instant LLM adaptation via text prompts? 🧠⚡️

Sakana AI's new Text-to-LoRA (T2L) uses a hypernetwork to generate task-specific LoRAs from simple text descriptions, with no expensive per-task fine-tuning required.

✅ Compresses 100s of adapters
✅ Generalizes to unseen tasks
✅ ICML 2025 Paper & Code: https://github.com/SakanaAI/text-to-lora

GitHub - SakanaAI/text-to-lora: Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input
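A rough sketch of the idea in the post: a hypernetwork takes an embedding of the task description and emits the two low-rank LoRA matrices, which are then added to a frozen base weight. This is a toy illustration, not Sakana AI's implementation. `embed`, `HyperNetwork`, and `adapt` are hypothetical names, the "text encoder" is a seeded random vector, and the hypernetwork is a single untrained linear layer.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def embed(text, dim=8):
    """Hypothetical stand-in for a text encoder: a deterministic vector per string."""
    rng = random.Random(text)
    return [rng.uniform(-1, 1) for _ in range(dim)]

class HyperNetwork:
    """Toy hypernetwork: one linear map from a task embedding to LoRA factors A and B."""

    def __init__(self, emb_dim, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        n_out = rank * d_in + d_out * rank  # flattened A followed by flattened B
        self.w = [[rng.gauss(0, 0.1) for _ in range(emb_dim)] for _ in range(n_out)]

    def generate(self, task_embedding):
        flat = [sum(w * e for w, e in zip(row, task_embedding)) for row in self.w]
        split = self.rank * self.d_in
        a_flat, b_flat = flat[:split], flat[split:]
        a = [a_flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        b = [b_flat[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return a, b  # A is rank x d_in, B is d_out x rank

def adapt(base_weight, a, b, scale=1.0):
    """Apply the LoRA update: W' = W + scale * (B @ A)."""
    delta = matmul(b, a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base_weight, delta)]

hyper = HyperNetwork(emb_dim=8, d_in=4, d_out=3, rank=2)
a, b = hyper.generate(embed("solve grade-school math word problems"))
base = [[0.0] * 4 for _ in range(3)]  # frozen base weight, d_out x d_in
adapted = adapt(base, a, b)
```

In the real system the hypernetwork is trained so that the generated adapters work on the target tasks; here the weights are random, so only the data flow and the shapes are meaningful.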


AIME Feb 20

Voicebox: open-source, locally run TTS studio—no cloud, no subscriptions.
✅ Powered by Qwen3-TTS for expressive voice cloning
✅ Multi-track editor + inline audio editing
✅ Tauri/Rust app: 10× smaller than Electron
✅ MIT license, full privacy
A self-hosted alternative to ElevenLabs 👇
https://github.com/jamiepine/voicebox

GitHub - jamiepine/voicebox: The open-source voice synthesis studio


AIME Feb 18

AKI.IO is now live: Curated open-source and open-weight models such as #MiniMax M2.5, #Apertus 70B, #Qwen Image Edit and many more as an API – hosted entirely in European data centers w/o hyperscalers. Happy to get your feedback!
The playground is open, API key via free registration at aki.io

https://www.aki.io/


Token-based access to leading open-source AI models on EU infrastructure. Evaluate, build and scale your AI product without self-hosting or vendor lock-in.


AIME Feb 16

Qwen3.5 is out: Alibaba's open-weight series built for agentic AI with native multimodality.
✅ Flagship: 397B total / 17B active params (MoE)
✅ 1M-token context → 2h audio/video in one pass
✅ 60% cheaper, 8× more efficient than predecessor
✅ MIT license, full open weights
Unified vision-language reasoning for agent workflows 👇
https://github.com/QwenLM/Qwen3.5

GitHub - QwenLM/Qwen3.5: Qwen3.5 is the large language model series developed by Qwen team, Alibaba Cloud.


AIME Feb 13

Microsoft's VibeVoice-ASR transcribes 60-minute audio in a single pass—no chunking needed.
✅ 9B params, 64K-token context
✅ ASR + speaker diarization + timestamps in one inference
✅ MIT license, fully open source
A leap for meeting/podcast transcription 👇
https://huggingface.co/microsoft/VibeVoice-ASR


AIME Feb 13

MiniMax M2.5 is out: a frontier model optimized via massive RL for agentic workflows.
✅ 80.2% on SWE-Bench Verified
✅ $0.30/hour at 50 tokens/sec (1/10–1/20 cost of Claude Opus/Gemini 3 Pro)
✅ 76.3% on BrowseComp, 20% fewer search rounds
✅ Full-stack coding + office productivity (Word/Excel/PPT)
Forge RL Framework enables near-linear scaling across 10k+ real agent scenarios.
Weights not yet public—MiniMax historically releases them later.
https://www.minimax.io/news/minimax-m25

MiniMax M2.5: Faster, stronger, smarter, built for real-world productivity
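The hourly price in the post can be converted into a per-token figure. A minimal sketch, assuming the quoted 50 tokens/sec is sustained output throughput for the whole hour (the post does not say how input tokens are billed); `usd_per_million_tokens` is a hypothetical helper name.

```python
def usd_per_million_tokens(usd_per_hour, tokens_per_second):
    """Convert an hourly price at a fixed generation speed into USD per 1M tokens."""
    tokens_per_hour = tokens_per_second * 3600  # 50 tok/s -> 180,000 tokens per hour
    return usd_per_hour / tokens_per_hour * 1_000_000

price = usd_per_million_tokens(0.30, 50)
print(round(price, 2))  # 1.67
```

Under that assumption, $0.30/hour works out to roughly $1.67 per million generated tokens.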


AIME Feb 12

GLM-5 is here: Z.ai's new MoE flagship (744B total / 40B active params) built for agentic engineering.
✅ #1 open-source on Vending Bench 2 (long-horizon planning)
✅ Closes gap with Claude Opus on CC-Bench-V2 (coding/agents)
✅ DeepSeek Sparse Attention for efficient 200K context
✅ Apache 2.0 license – commercial use allowed
A major leap for open, production-grade agent models 👇
https://github.com/zai-org/GLM-5
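Several posts in this feed quote MoE model sizes as total / active parameters. A small sketch of what that ratio means: only the active share of the weights is used for each token, which is why these models are cheaper to run than their total size suggests. The figures are taken from the posts as written; `active_share` is a hypothetical helper name.

```python
# (total, active) parameter counts in billions, as quoted in the posts in this feed
models = {
    "Qwen3.5": (397, 17),
    "GLM-5": (744, 40),
    "Qwen3-Coder-Next": (80, 3),
}

def active_share(total_b, active_b):
    """Fraction of the model's parameters that are active for a single token."""
    return active_b / total_b

for name, (total_b, active_b) in models.items():
    print(f"{name}: {active_share(total_b, active_b):.1%} of parameters active per token")
```

All three land at around 4 to 5 percent, so per-token compute is closer to that of a dense model of the active size, while memory still has to hold the full weights.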

AIME Feb 12

ACE-Step v1.5 is out: an open-source music generation model that runs locally with <4 GB VRAM.
✅ Hybrid LM+DiT architecture
✅ 8 diffusion steps → full songs in ~2s (A100)
✅ 4-min tracks with lyrics, 50+ languages
✅ MIT license, full training code included
A leap for accessible, commercial-grade audio AI 👇
https://github.com/ace-step/ACE-Step-1.5

GitHub - ace-step/ACE-Step-1.5: The most powerful local music generation model that outperforms most commercial alternatives


AIME Feb 4

Qwen3-Coder-Next is out: an open-weight MoE model (80B total / 3B active params) built for agentic coding workflows.
✅ 256K context length
✅ Tool-use & multi-step reasoning optimized
✅ Apache 2.0 license – local, dev, and commercial use allowed
Great step for open coding agents 👇
https://huggingface.co/Qwen/Qwen3-Coder-Next


Website	https://www.aime.info
LinkedIn	https://www.linkedin.com/company/a-i-m-e/
Blog	https://www.aime.info/blog/
Location	Berlin