Mastodawn

Mulight 沐光 (@0xMulight)

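Needle, a small 26-million-parameter model distilled from Gemini's tool-calling capability, has been released. It runs locally without a GPU at very low latency, which makes it a good fit for implementing function calling on edge devices or in the browser.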

https://x.com/0xMulight/status/2055462340091646109

#gemini #toolcalling #distillation #edgeai #functioncalling
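To make the function-calling idea in this post concrete, here is a minimal sketch of the host-side half: parsing a tool call emitted by a small local model and running the matching function. The JSON shape (`name` plus `arguments`), the `dispatch` helper, and both example tools are assumptions for illustration; the post does not describe Needle's actual output format or API.

```python
import json

# Hypothetical local tools; names and signatures are illustrative only.
def get_weather(city: str) -> str:
    return f"weather for {city}: sunny"

def add(a: float, b: float) -> float:
    return a + b

TOOLS = {"get_weather": get_weather, "add": add}

def dispatch(model_output: str):
    """Parse a tool call of the assumed form
    {"name": ..., "arguments": {...}} and run the matching local function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        # Reject anything the host did not register explicitly.
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))

result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

Keeping an explicit registry means the model can only trigger functions the host has chosen to expose, which matters more on an edge device where nothing sits between the model and the local system.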

Mulight 沐光🌟 (@0xMulight) on X

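Gemini's tool-calling capability has been distilled into a small 26-million-parameter model. Needle runs locally, needs no GPU, and has latency low enough to ignore. It suits anyone who wants to do function calling on edge devices, in the browser, or in embedded scenarios. It is open source on GitHub and can be used directly as a base for further development. One-step install: git clone https://t.co/SIw1i4UWwa cd needle &&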

X (formerly Twitter)

sayzard 16h ago

Design Arena (@Designarena)

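Introduces a ranking service built on slide-creation work from Design Arena's 3.7M+ user base, along with the evaluation and execution infrastructure behind it: a modified open-source agent harness that supports tool calls such as image generation. A practical reference for developers interested in evaluation pipelines for slide-generation agents.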

https://x.com/Designarena/status/2055400470693466332

#designarena #agentic #opensource #toolcalling #evaluation

Design Arena (@Designarena) on X

@AnthropicAI @Zai_org Slide Arena is powered by 3.7M+ users on Design Arena, creating slides for real-world use cases Check out the leaderboard live at https://t.co/9QNkOYQRqN The harness is a modified version of our open-source agent harness, with access to tool calls like generate_image,


sayzard 4d ago

stevibe (@stevibe)

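A test that gave several open-source LLMs a sliding puzzle and a tool-calling task to compare their long-horizon reasoning. Five of the six models failed and only one succeeded, which makes this an interesting evaluation case that exposes real reasoning and tool-use ability better than a simple benchmark.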

https://x.com/stevibe/status/2054206771292692592

#opensource #llm #reasoning #toolcalling #benchmark

stevibe (@stevibe) on X

Six open-source LLMs. One sliding puzzle. A brutal test of long-horizon reasoning and tool calling. Five of them broke. One didn't. I gave each model a move_tile tool and a scrambled 3×3 board, then asked it to solve the puzzle through pure turn-by-turn reasoning. The deeper
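For readers who want to picture the setup, the following is a rough sketch of what a `move_tile` tool over a 3×3 board could look like. Only the tool name and the board size come from the post; the board encoding, the goal layout, the signature, and the error behavior are assumptions, not the author's harness.

```python
# 3x3 board stored row-major as a tuple; 0 marks the blank square (assumed encoding).
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def move_tile(board: tuple, tile: int) -> tuple:
    """Slide `tile` into the blank if the two are orthogonally adjacent.
    Returns the new board; raises ValueError on an illegal move."""
    if tile == 0 or tile not in board:
        raise ValueError(f"no such tile: {tile}")
    t, b = board.index(tile), board.index(0)
    tile_row, tile_col = divmod(t, 3)
    blank_row, blank_col = divmod(b, 3)
    # Manhattan distance of exactly 1 means the tile touches the blank.
    if abs(tile_row - blank_row) + abs(tile_col - blank_col) != 1:
        raise ValueError(f"tile {tile} is not next to the blank")
    cells = list(board)
    cells[t], cells[b] = cells[b], cells[t]
    return tuple(cells)

def is_solved(board: tuple) -> bool:
    return board == GOAL

board = (1, 2, 3, 4, 5, 6, 7, 0, 8)
board = move_tile(board, 8)  # one move away from the goal
```

Because every call changes the board, a model driving this tool has to track state across many turns, which is where a long-horizon test of this kind would be expected to put pressure.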


Foojay.io 4d ago

AI agents are transforming how we build software. Unlike traditional chatbots that just answer questions, agents can reason about what tools they need, decide when to use them, chain multiple actions together, and remember what happened earlier in a conversation. In...
#agenticAI #AIagents #BoxLang #BoxLangAI #Claude #Developertools #Gemini #GenerativeAI #Java #JVM #LLM #LocalAI #Ollama #openai #ToolCalling
https://foojay.io/today/how-to-develop-ai-agents-using-boxlang-ai-a-practical-guide/

foojay – a place for friends of OpenJDK

foojay is the place for all OpenJDK Update Release Information. Learn More.

foojay

Donweb Media May 10

GPT-Realtime-2: parallel tool calling and 128K context

How much did tool calling improve in GPT-Realtime-2? 66.5% on benchmark, 128K context, and parallel calls. Try it step by step with Pruebas GPT-Realtime-2...

https://blog.donweb.com/pruebas-gpt-realtime-2-herramientas-tool-calling/

#gptrealtime2 #openai #toolcalling #agentesdevoz #realtimeapi
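As a generic illustration of what "parallel tool calls" means on the host side, the sketch below runs several requested calls concurrently with `asyncio.gather`. It does not use the OpenAI Realtime API; the call format, the `run_parallel` helper, and both tools are hypothetical stand-ins.

```python
import asyncio

# Hypothetical async tools; the sleep stands in for network latency.
async def lookup_order(order_id: str) -> dict:
    await asyncio.sleep(0.05)
    return {"order_id": order_id, "status": "shipped"}

async def lookup_weather(city: str) -> dict:
    await asyncio.sleep(0.05)
    return {"city": city, "forecast": "clear"}

ASYNC_TOOLS = {"lookup_order": lookup_order, "lookup_weather": lookup_weather}

async def run_parallel(calls: list) -> list:
    """Start every requested tool call at once; results come back in request order."""
    pending = [ASYNC_TOOLS[c["name"]](**c["arguments"]) for c in calls]
    return await asyncio.gather(*pending)

results = asyncio.run(run_parallel([
    {"name": "lookup_order", "arguments": {"order_id": "A17"}},
    {"name": "lookup_weather", "arguments": {"city": "Lima"}},
]))
```

With calls issued this way, total wait time is roughly that of the slowest tool rather than the sum of all of them, which is the practical appeal for voice agents.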

Testing GPT-Realtime-2 tools: 2026 guide


Blog Donweb

Foojay.io May 7

BoxLang AI 3.0 Series · Part 7 of 7 The AI ecosystem has a tool problem. Every framework has its own way of defining tools, every agent has its own way of calling them, and every integration requires custom code on both sides. An agent built in...
#agenticAI #AIagents #AIArchitecture #AIIntegration #APIs #BoxLang #Developertools #Java #JVM #LLM #MCP #ModelContextProtocol #Protocols #ToolCalling
https://foojay.io/today/boxlang-ai-deep-dive-part-7-of-7-mcp-the-protocol-that-connects-everything/


Barret May 5

From the .NET blog...

Python Trending 🇺🇦 (@pythontrending) on X

Rapid-MLX - The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in... https://t.co/72U6MreOtw


Barret May 4

From the .NET blog...

sayzard May 3

Sudo su (@sudoingX)

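Running the same local model through OpenClaw and Hermes Agent showed that OpenClaw's tool calling was unreliable, while Hermes ran its agent loops stably. The post stresses that the core problem is not the model but the framework consuming too much of the context budget.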

https://x.com/sudoingX/status/2050979814992118249

#agent #framework #toolcalling #localai #llm

Sudo su (@sudoingX) on X

i have been saying this for months. ran the same local model through openclaw and hermes agent back to back, openclaw could not reliably call a tool, hermes ran clean agentic loops. it is not the model that is broken, it is the framework eating half the context budget on its
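The context-budget claim comes down to simple arithmetic, sketched below. The post is cut off before it says what exactly consumes the budget, so the sketch treats framework overhead as a single opaque token count; the function name and all numbers are hypothetical, not measurements of either framework.

```python
def remaining_budget(context_window: int, overhead_tokens: int) -> tuple:
    """Return (tokens left for the task, fraction of the window used by overhead)."""
    if not 0 <= overhead_tokens <= context_window:
        raise ValueError("overhead must be between 0 and the context window")
    left = context_window - overhead_tokens
    return left, overhead_tokens / context_window

# Hypothetical numbers for illustration only.
left, used = remaining_budget(context_window=32_768, overhead_tokens=16_384)
print(f"{left} tokens left, {used:.0%} consumed by overhead")
# prints "16384 tokens left, 50% consumed by overhead"
```

On a small local window, halving the usable context also halves the room for tool results and conversation history, so the same model can behave very differently depending on how much a framework prepends.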
