Mastodawn

We benchmarked 8 vision LLMs on screenshot grounding for a browser agent.

Surprising finding: Qwen 3.5-9B correctly classifies a dropdown affordance that the 308B-parameter MiMo V2.5 misses. Affordance doesn't scale with parameter count.

Only 1 of the 8 models (Qwen 3.6-35B-A3B) shows honest uncertainty in calibration.

Detailed write-up + VRAM recommendations:
https://webbrain.one/blog

We'd really appreciate a ⭐ on GitHub 🙏
https://github.com/esokullu/webbrain

#LocalLLM #VLM #AIAgents #Qwen #AI #yapayzeka

WebBrain Blog

Engineering notes from WebBrain — the open-source AI browser agent.

Emre Sokullu 9h ago

Benchmarked 8 VLMs for screenshot grounding inside a browser agent.

Counterintuitive finding: Qwen 3.5-9B classifies a dropdown affordance that 308B MiMo V2.5 misses. Affordance doesn't scale with parameter count.

Only 1 of 8 (Qwen 3.6-35B-A3B) flags honest uncertainty in calibration.

Full write-up + VRAM tier recs + repo:

https://webbrain.one/blog

#LocalLLM #VLM #AIAgents #OpenSource #Qwen #qwen36 #MachineLearning #FOSS #AI

Sharat V. 15h ago

Running a language model on your own machine used to be limited to an elite few with powerful hardware and technical expertise. Not any more!

https://sharat.autumncloud.one/blog/local-llms-more-accessible/

#ai #email #llm #localllm #python

Local LLMs Are More Accessible Than You Think | Sharat Visweswara

sayzard 1d ago

Show HN: Cybersecurity Phishing Guard for Chrome using local LLMs for privacy
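A developer has built a Chrome extension that uses a local LLM to protect privacy. The extension feeds web pages, automatically or manually, to a local LLM and analyzes six signals to decide whether a page is a phishing attack. There is also speculation that the feature could eventually ship built into Chrome. It is a notable new application of local AI models in security.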

https://github.com/tommyjepsen/local-llm-phishing-guard-for-chrome

#cybersecurity #phishing #chrome #localllm #privacy

sayzard 1d ago

Running a Local LLM Coding Server on MacBook Pro M5 Pro 48 GB
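A detailed walkthrough of building a local coding AI server on a MacBook Pro M5 Pro 48GB. The mlx-lm server crashed during long conversations because of memory management problems, whereas Ollama ran stably thanks to a fixed context size and optimized models. In particular, the Qwen 3.6 35B-A3B-mxfp8 model is optimized for Apple Silicon and delivers high quality and stability, and it can be reached conveniently from anywhere on the network through OpenCode. The case shows that a capable local LLM coding server can be built without the cloud.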

https://blog.kulman.sk/running-local-llm-coding-server/

#localllm #applesilicon #qwen #ollama #codingai

An honest account of running a local coding AI on Apple Silicon — what crashed, what worked, and why I settled on Ollama with Qwen 3.6 MoE.

happyborg 2d ago

You realise that a local LLM is still an LLM, right?

With all the dangers to you and downsides for everyone, and with a fraction of the bits you presumably like (in slow motion), and still controlled by a handful of sociopathic billionaires who would as soon manipulate and enslave you as make you destitute 🤔, including by putting toxic things in those little black boxes you are eagerly installing on your device.

#localLLM #LLMs

ai0.news 4d ago

DeepSeek V4 and Kimi K2.6 challenge frontier labs on price and benchmarks, a Hawley bill to ID-verify chatbot users clears committee, and Microsoft quietly attributed commits to Copilot even where it was not used.

https://ai0.news/posts/2026-05-03-daily-digest/

#AI #AiPolicy #OpenSource #LocalLLM

AI News — May 03, 2026 | ai0.news

sayzard 5d ago

Sudo su (@sudoingX)

Ran Nemotron 3 Omni Q8 through Hermes Agent on a DGX Spark and recorded 56 tokens per second. The test ran a 256K context on 33GB of unified memory and stresses that local agent workflows are more powerful than expected.

https://x.com/sudoingX/status/2050274286892667338

#nemotron #dgxspark #localllm #agent #multimodal

Sudo su (@sudoingX) on X

nemotron 3 omni q8 on dgx spark 128gb vram cranking via hermes agent at 56 tok/s. first night of real local agentic on this box and local hits harder than i thought it would. q8 (near lossless quant, perplexity loss <1% vs fp16) running 256k context on 33 gb of unified memory,
