#bergetai just announced a service that I find very interesting: inference with powerful models like Kimi K2.6.

They previously offered Mistral 3.5 & Gemma 4 as their medium and light models, which I assume was to build up know-how in running good inference endpoints.

Hope to see more of this! Local models and more providers mean less lock-in, making inference a commodity instead of putting all our eggs in the OpenAI/Anthropic basket.

https://berget.ai/code

#sovereignai #swedishai #openweightmodels #kimi

Berget Code — Agentic coding on European infrastructure

A sovereign AI coding platform for regulated European organisations. OpenCode agents running on dedicated GPU hardware in Sweden, under EU jurisdiction.

Berget AI
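
One reason more providers means less lock-in: most hosts of open-weight models expose OpenAI-compatible APIs, so switching providers is often just a base-URL change. A minimal sketch of that pattern, assuming Berget exposes such an endpoint (the URL and model id below are assumptions, not confirmed values):

```python
# Hypothetical sketch: swapping inference providers by changing only the
# base URL of an OpenAI-compatible client. The endpoint URL and model id
# are assumptions for illustration, not confirmed Berget values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.berget.ai/v1",  # assumed endpoint
    api_key="YOUR_BERGET_API_KEY",
)

resp = client.chat.completions.create(
    model="kimi-k2.6",  # hypothetical model id
    messages=[{"role": "user", "content": "Explain provider lock-in in one sentence."}],
)
print(resp.choices[0].message.content)
```

The same code pointed at another base URL talks to another provider, which is what makes inference feel like a commodity.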

Since mid-April, OpenCode (Go, the $10 per month plan) has been my primary coding assistant.
* Initially: Kimi K2.6
* Nowadays: DeepSeek v4, which is good enough to assist me in 98% of cases WITHOUT hitting any rate limit.

#OpenWeightModels

With compute costs plummeting, open‑weight models like Llama and Mistral are finally within reach of more developers. Faster token processing and cheaper training mean the next wave of generative AI can be built, shared, and improved by the community. Dive into how affordability is reshaping the LLM landscape. #OpenWeightModels #TokenProcessing #Llama #Mistral

🔗 https://aidailypost.com/news/falling-costs-drive-expansive-accessibility-language-models

The release of the first widely adopted reasoning model, #o1, marked a #turningpoint in the evolution of #LLMs. An empirical #study using the #OpenRouter platform analysed over 100 trillion tokens of real-world LLM interactions, revealing substantial adoption of #openweightmodels, the popularity of #creativeroleplay and #codingassistance, and the rise of #agenticinference. https://openrouter.ai/state-of-ai?eicker.news #tech #media #news
State of AI 2025: 100T Token LLM Usage Study | OpenRouter

Read OpenRouter's 2025 State of AI report — an empirical 100 trillion token study of real LLM usage, model trends, and developer insights.

OpenRouter

EdgeRunner AI just launched an offline assistant built on the open-source gpt-oss model, marking the first time open-weight LLMs are deployed with the US Army and Air Force. This could reshape how the military uses AI without internet reliance. Read more to see the implications for open-source AI and defense. #EdgeRunnerAI #gptOSS #OpenWeightModels #USMilitary

🔗 https://aidailypost.com/news/edgerunner-ai-runs-assistant-gpt-oss-open-weight-models-join-us
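
The technically interesting part is that an open-weight model can serve requests with no network access at all once the weights are local. A minimal sketch with Hugging Face transformers, using the smaller public gpt-oss variant (setup details like quantization are omitted; this illustrates the offline pattern, not EdgeRunner's actual stack):

```python
# Sketch: fully offline inference once weights are already cached locally.
# HF_HUB_OFFLINE=1 tells the hub client to use only the local cache and
# never touch the network. Not EdgeRunner's implementation, just the idea.
import os
os.environ["HF_HUB_OFFLINE"] = "1"

from transformers import pipeline

# Assumes openai/gpt-oss-20b was downloaded earlier while online.
chat = pipeline("text-generation", model="openai/gpt-oss-20b", device_map="auto")
out = chat(
    [{"role": "user", "content": "Summarize this maintenance checklist."}],
    max_new_tokens=128,
)
print(out[0]["generated_text"][-1]["content"])  # assistant reply
```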

Have the USA and EU given up on open-weight models? Lately it seems only models from China are being released. Have they changed strategy? #MôHìnhTrọngLượngMở #OpenWeightModels #TrungQuốc #USA #EU #CôngNghệ #Technology

https://www.reddit.com/r/LocalLLaMA/comments/1ov9lug/has_the_usaeu_given_up_on_open_weight_models/

#ArtificialAnalysis published a #benchmark comparing the performance of #OpenAI’s #gptoss-120b across different #hostedproviders. The results showed #significantvariance. This highlights the challenges faced by customers of #openweightmodels, as #performance can vary depending on the #provider and their implementation. https://simonwillison.net/2025/Aug/15/inconsistent-performance/?eicker.news #tech #media #news
Open weight LLMs exhibit inconsistent performance across providers

Artificial Analysis published a new benchmark the other day, this time focusing on how an individual model—OpenAI’s gpt-oss-120b—performs across different hosted providers. The results showed some surprising differences. Here’s the …

Simon Willison’s Weblog
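
The variance Artificial Analysis measured can be reproduced in miniature by running one small eval against the same model at several hosts. A rough sketch, assuming each provider exposes an OpenAI-compatible endpoint (the provider names, URLs, and the toy eval below are placeholders, not the report's setup):

```python
# Sketch: score the same open-weight model served by different providers.
# Endpoints, keys, and the one-item eval are placeholders for illustration.
from openai import OpenAI

PROVIDERS = {
    "provider-a": "https://provider-a.example/v1",
    "provider-b": "https://provider-b.example/v1",
}
EVAL = [("What is 17 * 24? Answer with the number only.", "408")]

for name, base_url in PROVIDERS.items():
    client = OpenAI(base_url=base_url, api_key="PLACEHOLDER_KEY")
    correct = 0
    for question, expected in EVAL:
        resp = client.chat.completions.create(
            model="gpt-oss-120b",  # same weights everywhere; the serving stack differs
            messages=[{"role": "user", "content": question}],
            temperature=0,  # damp sampling noise so differences reflect the stack
        )
        correct += expected in (resp.choices[0].message.content or "")
    print(f"{name}: {correct}/{len(EVAL)}")
```

Differences that survive temperature 0 point at the serving implementation itself: quantization choices, chat-template handling, or truncated context, which is exactly the inconsistency the benchmark highlighted.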
Extracting memorized pieces of (copyrighted) books from open-weight language models

Plaintiffs and defendants in copyright lawsuits over generative AI often make sweeping, opposing claims about the extent to which large language models (LLMs) have memorized plaintiffs' protected expression in their training data. Drawing on both machine learning and copyright law, we show that these polarized positions dramatically oversimplify the relationship between memorization and copyright. To do so, we extend a recent probabilistic extraction technique to measure memorization of 50 books in 17 open-weight LLMs. Through thousands of experiments, we show that the extent of memorization varies both by model and by book. With respect to our specific extraction methodology, we find that most LLMs do not memorize most books -- either in whole or in part. However, we also find that Llama 3.1 70B entirely memorizes some books, like the first Harry Potter book and 1984. In fact, the first Harry Potter is so memorized that, using a seed prompt consisting of just the first few tokens of the first chapter, we can deterministically generate the entire book near-verbatim. We discuss why our results have significant implications for copyright cases, though not ones that unambiguously favor either side.

arXiv.org
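
The extraction technique the abstract describes can be illustrated with a much simpler probe: seed a model with the opening tokens of a text, decode greedily, and measure how much of the reference comes back verbatim. A sketch of that idea (greedy decoding stands in for the paper's probabilistic extraction method; the reference text is omitted):

```python
# Illustrative memorization probe: greedy continuation from a short seed,
# then token-overlap against the reference text. This is a simplification
# of the paper's probabilistic extraction technique, not its actual method.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-70B"  # a model the study found highly memorized
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

seed = "Mr. and Mrs. Dursley, of number four, Privet Drive,"  # opening tokens
reference = "..."  # ground-truth continuation, not reproduced here

inputs = tok(seed, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200, do_sample=False)  # deterministic
continuation = tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# Crude score: fraction of generated words that match the reference in order.
gen, ref = continuation.split(), reference.split()
matches = sum(g == r for g, r in zip(gen, ref))
print(f"verbatim match rate: {matches / max(1, len(gen)):.1%}")
```

A high match rate from just a few seed tokens is the behaviour the authors report for the first Harry Potter book and 1984.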