Big News! The completely #opensource #LLM #Apertus 🇨🇭 has been released today:

📰 https://www.swisscom.ch/en/about/news/2025/09/02-apertus.html

🤝 The model supports over 1000 languages [EDIT: an earlier version claimed over 1800] and respects opt-out consent of data owners.

▶ This is great for #publicAI and #transparentAI. If you want to test it for yourself, head over to: https://publicai.co/

🤗 And if you want to download weights, datasets & FULL TRAINING DETAILS, you can find them here:
https://huggingface.co/collections/swiss-ai/apertus-llm-68b699e65415c231ace3b059

🔧 Tech report: https://huggingface.co/swiss-ai/Apertus-70B-2509/blob/main/Apertus_Tech_Report.pdf

After #Teuken7b and #Olmo2, Apertus is the next big jump in capabilities and performance of #FOSS #LLMs, while also improving #epistemicresilience and #epistemicautonomy with its multilingual approach.

I believe that especially for sensitive areas like #education, #healthcare, or #academia, there is no alternative to fully open #AI models. Everybody should start building upon them and improving them.

#KIMündigkeit #SovereignAI #FOSS #ethicalAI #swissai #LernenmitKI

On 19 September 2025, the #fraunhoferinstitut IAIS is offering a free seminar on #OpenGPTX #Teuken7B: GenAI in the public sector!

https://www.iais.fraunhofer.de/de/branchen-themen/themen/generative-ki/opengpt-x.html

In addition, a 5-step guide shows the path to a custom application, including with RAG, and free consultation appointments are offered.

This brings your own #ki within reach! #teamdatenschutz #FediKirche #datenschutz

OpenGPT-X: Teuken 7B - Fraunhofer IAIS

The European, open, multilingual AI language model. Companies in all industries can now implement AI applications with Teuken 7B free of charge.

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS
Since late 2024, the German #LLM #Teuken7B has been available to science and industry. Our colleagues Stefan Kesselheim and Andreas Herten helped develop Teuken-7B and explain in an interview how it differs from commercial #AI models such as #ChatGPT. https://go.fzj.de/teuken-7b
A transparent multilingual talent

Since the end of 2024, the German AI language model Teuken-7B has been available to anyone interested from science and industry. Stefan Kesselheim and Andreas Herten from the Jülich Supercomputing Centre helped to develop Teuken-7B. Here they explain what distinguishes the multilingual open-source model from commercial products such as ChatGPT.

What does it take for an AI language model "Made in Europe" to keep up with the tech giants from North America?

Since late 2024, the German #KI language model #Teuken7B has been available to interested parties from science and industry. Stefan Kesselheim and @andih of @fzj_jsc helped develop Teuken-7B and explain in an interview what distinguishes the multilingual open-source model from commercial offerings such as #ChatGPT or #Gemini.

Read the interview: https://www.fz-juelich.de/en/news/effzett/2025/teuken-7b

🎩 The European AI brainiacs have finally decided to play catch-up with the US tech giants by unveiling their own fancy-sounding #LLMs. Meanwhile, the rest of us are left wondering if Teuken-7B is the name of their robot overlord or just another 🤖 geeky alphabet soup that'll end up giving us more #buzzwords and fewer answers. 🍝
https://arxiv.org/abs/2410.03730 #EuropeanAI #TechGiants #Teuken7B #Innovation #HackerNews #ngated
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs

We present two multilingual LLMs designed to embrace Europe's linguistic diversity by supporting all 24 official languages of the European Union. Trained on a dataset comprising around 60% non-English data and utilizing a custom multilingual tokenizer, our models address the limitations of existing LLMs that predominantly focus on English or a few high-resource languages. We detail the models' development principles, i.e., data composition, tokenizer optimization, and training methodologies. The models demonstrate competitive performance across multilingual benchmarks, as evidenced by their performance on European versions of ARC, HellaSwag, MMLU, and TruthfulQA.

arXiv.org

Teuken-7B-Base and Teuken-7B-Instruct: Towards European LLMs

https://arxiv.org/abs/2410.03730

#HackerNews #Teuken7B #EuropeanLLMs #LLMs #AIResearch #NLP #Innovations

ChatGPT from Germany: "Teuken-7B" now freely available https://www.forschung-und-lehre.de/zeitfragen/chatgpt-aus-deutschland-teuken-7b-ab-sofort-frei-verfuegbar-6787 - have any of you tried it yet? What do you think of it? #LLM #genAI #chatgpt #teuken7b cc @chpollin
ChatGPT from Germany: "Teuken-7B" now freely available

The AI language model developed in Germany is multilingual and open source. Data sovereignty in particular makes it attractive for research.

#ChatGPT #claude #Llama: most #LargeLanguageModels come from the USA, and data protection concerns weigh against using them. t3n Magazin looks at the European project #OpenGPT and its language model #teuken7b, which is freely available via the Hugging Face platform. Article: https://t3n.de/news/europa-ki-offensiveteuken-7b-opengpt-x-1660108/
Teuken-7B: https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4
OpenGPT: https://opengpt-x.de/models/teuken-7b-de/

#Sprachmodelle, #KI, #Europe, #IT4Science

Europe's AI offensive: OpenGPT-X project presents Teuken-7B language model

GPT-4, Claude, Grok, Llama and Gemini: the most important AI language models all come from the USA. With the proj

t3n Magazin

The large language model developed by the OpenGPT-X research project is now available for download at #huggingface

#Teuken7B was trained from scratch on the 24 official languages of the 🇪🇺EU and has 7 billion parameters.

The open-source model allows companies and orgs to operate their own customised models in real applications. Sensitive company data can remain within the company.👏

European AI Innovation: Teuken-7B Language Model Launch 🌍

🤖 #OpenGPTX introduces #Teuken7B, a multilingual #AI model with 7B parameters supporting all 24 official #EU languages

🤗 https://huggingface.co/openGPT-X
🇩🇪 News: https://bit.ly/41ciNhL

openGPT-X (OpenGPT-X)

OpenGPT-X develops big AI language models that enable new data-driven business solutions and specifically address European needs.