Who's played with https://paperclip.ing/? Not sure how I came across it, and I'll get around to trying it eventually, but I wondered whether anyone has any hot takes on it.

#ai #paperclip #agents

Paperclip — Open-source orchestration for zero-human companies

Manage a team of AI agents to run your business. Org charts, budgets, governance, and goals — all in one deployment.

Microsoft Research (@MSFTResearch)

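AsgardBench is a benchmark that evaluates whether embodied agents can revise their plans mid-task based on visual observations. By focusing on perception-driven planning, it exposes agents' limitations and points to the improvements needed for better reliability.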

https://x.com/MSFTResearch/status/2037244033475453210

#ai #benchmark #agents #embodiedai #planning

Microsoft Research (@MSFTResearch) on X

AsgardBench evaluates whether embodied agents can revise their plans based on visual observations as tasks unfold. By focusing on perception-driven planning, it exposes key limitations and guides improvements in agent reliability. https://t.co/6jAXzgCLvH

X (formerly Twitter)

Z.ai for Startups (@ZaiforStartups)

CodeBuddy and GLM are hosting a global AI hackathon in Singapore. Participants build and deploy real AI agents and see firsthand what breaks. More than $1,000 in prizes plus mentorship are on offer, and applications are open until April 20.

https://x.com/ZaiforStartups/status/2037173389451006052

#ai #hackathon #agents #glm #codebuddy

Z.ai for Startups (@ZaiforStartups) on X

Agents don’t learn by watching. They learn by building. CodeBuddy × GLM — Global AI Hackathon Singapore 🇸🇬 Build. Ship. See what actually breaks. $1,000+ prizes + mentorship. Apply by April 20th.

BOOTOSHI (@KingBootoshi)

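The post points out that when one agent cannot solve a problem, you can put more compute to work by bringing in additional agents and comparing and complementing the proposals from different LLM models. It is a practical tip for combining several models to get a wider range of solutions.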

https://x.com/KingBootoshi/status/2037284826047537415

#llm #agents #multiagent #aiworkflow #inference

BOOTOSHI 👑 (@KingBootoshi) on X

you guys know you can throw more compute at a problem yourself right? if one agent couldn't solve it, throw a diff agent at it who can then see the proposed solutions and offer a variety of different ones BIG help especially when they're different LLM models works everytime!
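A minimal sketch of the idea in that post, assuming each agent is just a callable that receives the task plus every earlier proposal. The names `escalate`, `agent_a`, `agent_b`, and `accept` are hypothetical and not from the post; in practice each agent would wrap a different LLM, and `accept` would be whatever check you trust (tests, a reviewer, a human).

```python
from typing import Callable, Optional

# Hypothetical agent signature: (task, earlier proposals) -> new proposal.
Agent = Callable[[str, list[str]], str]


def escalate(
    task: str,
    agents: list[Agent],
    accept: Callable[[str], bool],
) -> tuple[Optional[str], list[str]]:
    """Try agents in order; each one sees all proposals made so far."""
    proposals: list[str] = []
    for agent in agents:
        proposal = agent(task, list(proposals))
        proposals.append(proposal)
        if accept(proposal):
            return proposal, proposals
    # Nobody produced an accepted proposal; return the full history anyway.
    return None, proposals


# Toy stand-ins for two different models.
def agent_a(task: str, prior: list[str]) -> str:
    return "retry the request"


def agent_b(task: str, prior: list[str]) -> str:
    # The second agent avoids repeating what was already proposed.
    if "retry the request" in prior:
        return "add a timeout and backoff"
    return "retry the request"


solution, history = escalate(
    "fix flaky API call",
    [agent_a, agent_b],
    accept=lambda p: "backoff" in p,
)
print(solution)
print(history)
```

Passing the earlier proposals along is the part that matters: the second agent is not starting from scratch, it is reacting to what already failed.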

Will agent clusters running agent clusters eat the whole?

#GenAI #AI #Agents #Software #Technology #Programming #SoftwareDevelopment #Coding #SoftwareEngineering

GitHub - salespeak-ai/buyer-eval-skill: B2B software vendor evaluation skill for Claude Code — domain-expert questions, vendor AI agent conversations, evidence-based scoring

Andrej Karpathy's recent podcast interview is worth your time

Key ideas: agent orchestration over single-session prompting, AutoResearch loops that remove the human researcher from hyperparameter tuning, and a prediction that digital transformation leads while physical robotics lags by years

His take on open source (~6-8 months behind frontier) being a healthy power balance is worth sitting with

"Centralization has a very poor track record." Hard to argue

#OpenSource #LLMs #AIResearch #Agents