AIME

@AIME_hq
AIME provides GPU cloud compute and develops AI-machines for deep learning and model inference (Multi-GPU workstations & HPC servers). We are in Berlin, Germany.
Website: https://www.aime.info
LinkedIn: https://www.linkedin.com/company/a-i-m-e/
Blog: https://www.aime.info/blog/
Location: Berlin

Nex-AGI introduces Nex-N2: an open-source agent model for complex workflows.

Core: Agentic Thinking for adaptive reasoning & tool use. Available in Pro & Mini versions on Hugging Face, delivering top-tier performance.

https://github.com/nex-agi/Nex-N2

NVIDIA's Nemotron-3-Ultra-550B-A55B-NVFP4 is now on Hugging Face: 550B total params (55B active), 1M token context, NVFP4 quantization, 11 languages. OpenMDW-1.1 license permits commercial use. Optimized for agents, RAG & long-context inference.
https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4 · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Moonshot AI open-sources Kimi-Audio-7B: a unified foundation model for audio understanding, generation, and conversation.

Trained on 13M+ hours of data, achieves SOTA results on LibriSpeech, AISHELL, and VoiceBench. Includes inference code, fine-tuning examples, and evaluation toolkit.

https://github.com/MoonshotAI/Kimi-Audio

#OpenSource #AI #SpeechTechnology #MultimodalAI

EAGLE-3 speeds up speculative decoding with direct token prediction and multi-layer feature fusion.
RedHatAI now offers a ready-to-use EAGLE-3 speculator for Gemma-4-26B-A4B-it on Hugging Face.

https://huggingface.co/RedHatAI/gemma-4-26B-A4B-it-speculator.eagle3
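
For readers new to speculative decoding, here is a minimal sketch of the generic draft-and-verify loop in its greedy form. It is not EAGLE-3 itself: the multi-layer feature fusion is not modelled, and `toy_target`, `toy_draft`, and `speculative_step` are hypothetical names invented for illustration. Real systems verify all drafted tokens in one batched forward pass of the target model; the sketch calls the target sequentially for clarity.

```python
from typing import Callable, List

# A "model" here is any function mapping a token context to the next token (greedy).
NextToken = Callable[[List[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     context: List[int], k: int) -> List[int]:
    """Draft k tokens cheaply, then keep the prefix the target agrees with."""
    # 1) The small draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(context)
    for _ in range(k):
        token = draft(ctx)
        proposed.append(token)
        ctx.append(token)

    # 2) The target checks each proposal against its own greedy choice.
    accepted: List[int] = []
    ctx = list(context)
    for token in proposed:
        expected = target(ctx)
        if expected != token:
            # First mismatch: emit the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3) All k proposals accepted: the target contributes one extra token.
    accepted.append(target(ctx))
    return accepted


def toy_target(ctx: List[int]) -> int:
    # Hypothetical target model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    # Hypothetical draft model: agrees with the target except after token 3.
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_step(toy_target, toy_draft, [0], k=4))
```

The output always matches what the target alone would have generated greedily; the draft only changes how many target calls are needed per emitted token.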

TurboQuant compresses LLM KV-caches via 3-bit key & 2/4-bit value quantization, cutting memory use by up to 4.4×. Enables longer contexts & higher throughput under GPU constraints. Open-source (GPL-3.0). github.com/0xSero/turboquant #LLM #Inference #Quantization #vLLM #OpenSource
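
As a rough back-of-the-envelope check on the bit widths above, the sketch below compares an FP16 KV cache with a quantized one. It is plain arithmetic, not TurboQuant's code: the model shape (32 layers, 8 KV heads, head dimension 128) is a hypothetical example, and metadata such as scales or zero points is ignored, so these are idealized upper bounds rather than measured figures.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, key_bits: int, value_bits: int) -> float:
    """Size of the KV cache for one sequence, ignoring quantization metadata."""
    elems_per_tensor = num_layers * num_kv_heads * head_dim * seq_len
    # One key tensor and one value tensor, each with its own bit width.
    return elems_per_tensor * (key_bits + value_bits) / 8


def compression_ratio(key_bits: int, value_bits: int,
                      baseline_bits: int = 16) -> float:
    """Idealized ratio of baseline cache size to quantized cache size."""
    return (2 * baseline_bits) / (key_bits + value_bits)


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192)
    fp16 = kv_cache_bytes(**shape, key_bits=16, value_bits=16)
    quant = kv_cache_bytes(**shape, key_bits=3, value_bits=4)
    print(f"FP16 cache:     {fp16 / 2**20:.0f} MiB")
    print(f"3-bit K, 4-bit V: {quant / 2**20:.0f} MiB")
    print(f"Idealized ratio: {compression_ratio(3, 4):.2f}x")
```

Any per-group scales or tensors kept at higher precision would pull the real ratio below this idealized number.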
Xiaomi releases MiMo-V2.5 (310B params) and MiMo-V2.5-Pro (1.02T params) as open-source models. Both support 1M token context, hybrid attention, and native multimodal capabilities. Available via API and open weights for agentic AI and code generation tasks. https://huggingface.co/XiaomiMiMo/MiMo-V2.5

Robbyant released LingBot-Map, a new feed-forward 3D foundation model for streaming-based reconstruction – now open source.
✨ Geometric Context Transformer
⚡ ~20 FPS streaming inference
🏆 SOTA results on multiple benchmarks
🔗 https://technology.robbyant.com/lingbot-map
💻 https://github.com/Robbyant/lingbot-map
Robbyant (蚂蚁灵波科技) - Exploring the frontiers of embodied intelligence and building an AGI platform for the physical world

Technology-driven and application-oriented. We build foundational large models for embodied AI: spatial perception (LingBot-Depth), VLA (LingBot-VLA), world models (LingBot-World), video action (LingBot-VA). Jointly embrace the new era of embodied intelligence.

Robbyant 蚂蚁灵波科技
Cohere Transcribe is here – a new open-weights ASR model (Apache 2.0) that's up to 4× faster than Whisper Large v3 with a best-in-class 5.42 WER. 🚀
🔹 2B params | 14 languages | Conformer architecture
🔹 Easy 🤗 Transformers integration
🔗 https://huggingface.co/CohereLabs/cohere-transcribe-03-2026
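
For context on the WER figure (presumably a percentage averaged over a benchmark suite; the post does not say which), word error rate is the word-level edit distance between reference and transcript divided by the number of reference words. The sketch below is a minimal, assumption-laden version: it splits on whitespace and applies no text normalization, which real ASR evaluations usually do. `word_error_rate` is a name chosen here, not part of any Cohere tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    # One substitution (sat -> sit) and one deletion (the) over six words.
    print(f"WER: {word_error_rate(ref, hyp):.2%}")
```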

🎙️ Mistral AI's Voxtral TTS is live: 4B params, 9 languages, ~70ms latency, zero-shot voice cloning. Open weights on HF.

https://huggingface.co/mistralai/Voxtral-Mini-4B-Realtime-2602


Hume AI just dropped TADA (Text-Acoustic Dual Alignment) – an open-source framework for expressive speech generation.
✨ 1:1 text-audio token alignment
✨ Precise prosody & timing control
✨ Multilingual (DE/EN/FR/ES/JA/AR)
✨ Built on Llama 3.2 (1B/3B)
🔗 https://github.com/HumeAI/tada