Built a 1.58-bit LLM engine in Rust with AVX-512 that runs 117 tokens/second on a single CPU core, but a bug in the activation layer makes the output always <unk>. Looking for help with: (1) weight tying in BitNet: is a scale factor missing? (2) how to scale the integer accumulations from VPOPCNTDQ before feeding them into RMSNorm/SiLU. Open-source project, zero-copy, no heap allocations. #Rust #AVX512 #LLM #MachineLearning #AI #R3Engine #BitNet #LocalAI #HPC #Inference #ArtificialIntelligence #LanguageModels #ParallelProcessing #HighPerformanceComputing
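The scaling question in the post above can be sketched in scalar Rust. This is a minimal sketch, not the R3Engine code: it assumes BitNet b1.58's per-tensor weight scale plus absmax int8 activation quantization, and models the VPOPCNTDQ bit-plane accumulation as a plain ternary dot product. The point is that the raw integer accumulator is dimensionless until it is multiplied by *both* scales, and only then handed to RMSNorm; skipping either scale (e.g. on a tied lm_head matrix) can collapse the logits, which would match the always-`<unk>` symptom. All names here are illustrative.

```rust
/// Ternary dot product: weights in {-1, 0, +1}, activations quantized to i8.
/// A scalar model of what the popcount kernel computes on bit-planes:
/// (#matching +1 lanes) - (#matching -1 lanes), accumulated as an integer.
fn ternary_dot(weights: &[i8], acts: &[i8]) -> i32 {
    weights.iter().zip(acts).map(|(&w, &a)| w as i32 * a as i32).sum()
}

/// The integer accumulator must be rescaled before any float nonlinearity:
/// y = acc * weight_scale * act_scale, where (assumption, b1.58 convention)
/// weight_scale = mean(|W|) and act_scale = absmax(x) / 127.
fn dequantize(acc: i32, weight_scale: f32, act_scale: f32) -> f32 {
    acc as f32 * weight_scale * act_scale
}

/// Plain RMSNorm (no gain vector, for brevity); expects *dequantized* floats.
fn rms_norm(x: &mut [f32], eps: f32) {
    let ms = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv = 1.0 / (ms + eps).sqrt();
    for v in x.iter_mut() {
        *v *= inv;
    }
}

fn main() {
    let w: [i8; 4] = [1, -1, 0, 1];
    let a: [i8; 4] = [100, -50, 3, 20];
    let acc = ternary_dot(&w, &a); // 100 + 50 + 0 + 20 = 170
    let y = dequantize(acc, 0.5, 1.0 / 127.0);
    let mut hidden = vec![y, -y, 0.5 * y, -0.5 * y];
    rms_norm(&mut hidden, 1e-6);
    println!("acc = {acc}, y = {y:.4}, normed = {hidden:?}");
}
```

Note that because RMSNorm divides by the root-mean-square, a *uniform* missing scale cancels out of the norm itself but still corrupts any residual-stream addition before it, which is one hedged reading of where the bug could hide.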

https://www.reddit.

We are moving from the MatMul era to "additive AI", with BitNet (ternary weights), L-Mul (addition in place of multiplication), and mHC (stability guarantees at scale). If 70B+ models can run on 1/100th of the energy, today's GPUs become obsolete and addition-centric ASICs take over. Do you think we should stop buying GPUs and focus on additive architectures? #AI #AdditiveAI #BitNet #L_Mul #mHC #Technology #ArtificialIntelligence

https://www.reddit.com/r/LocalLLaMA/comments/1qjr074/the_end_of_the_matmul_hegemony_why_we_must_pivot/

Episode 166 – 30 years of commercial Internet in Brazil – Part A - Retrópolis

Welcome to the Retrópolis podcast! Presented by the Municipality of Retrópolis. This is Part A of Episode 166. About the episode: it may not seem like it, but commercial Internet in Brazil is already 30 years old. Let's celebrate the network of networks' first Saturn return with a specially invited now-villain guest. About this part: a very brief

Retrópolis - The city of the classics
An #LLM on a #Pentium2 with 128MB of RAM? Yes. Up to 15M parameters, using the #bitnet architecture, which uses ternary weights (-1, 0, 1) to reduce computational complexity.
#AI #retro #retrocomputing #LLM #llama
From: https://mastodon.social/@mindsConnected/114727256228518845
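The memory math behind the post above can be sketched. A ternary weight carries log2(3) ≈ 1.58 bits of information, and one common packing (an illustrative scheme, not necessarily this project's format) stores 5 trits per byte, since 3^5 = 243 ≤ 256, i.e. 1.6 bits per weight; 15M parameters then fit in roughly 3 MB, comfortably inside 128 MB of RAM.

```rust
/// Pack 5 ternary weights ("trits", each in {-1, 0, +1}) into one byte
/// using base-3 digits: each trit maps to 0..=2, and 3^5 = 243 fits in a u8.
fn pack5(trits: [i8; 5]) -> u8 {
    trits.iter().rev().fold(0u8, |acc, &t| acc * 3 + (t + 1) as u8)
}

/// Inverse: peel base-3 digits back off and shift them to {-1, 0, +1}.
fn unpack5(mut b: u8) -> [i8; 5] {
    let mut out = [0i8; 5];
    for o in out.iter_mut() {
        *o = (b % 3) as i8 - 1;
        b /= 3;
    }
    out
}

fn main() {
    let w: [i8; 5] = [1, -1, 0, 1, 0];
    let packed = pack5(w);
    assert_eq!(unpack5(packed), w); // lossless round trip
    // 15M ternary weights at 5 per byte ≈ 3 MB packed.
    println!("15M params -> {} bytes packed", 15_000_000 / 5);
}
```

This 1.6-bit packing is within 1.3% of the 1.58-bit entropy limit, which is why "1.58-bit" models really can shrink this far without extra compression.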
Their public availability allows for widespread experimentation and adaptation. However, a significant barrier hinders their broader adoption: the substantial computational resources required for deployment and inference. State-of-the-art open LLMs typically require large memory footprints, consume considerable energy, and exhibit notable inference latency, rendering them impractical for many edge devices, resource-constrained environments, and real-time applications. #bitnet

Weird thing is that I am on a martial arts mailing list (originally created to mock the newbie rec.martial-arts poseurs) that I have been on since 1987, and I am by far the youngest member of the group. I have no idea why they invited me; the weird old cranks probably just wanted a youngster's perspective. Everyone on there is still alive and kicking, though - not very high kicking, but still!

#usenet #bitnet #history #martialarts #wiseguys #meikdo

🔬🤯 1-bit models are a revolution in AI! Neural-network weights are stored with just 1 bit instead of 32 or 16. That means up to 16x smaller models and huge energy savings, while keeping the quality of classical LLMs. The future of AI is lightweight! 🚀 #AI #LLM #quantization #BitNet
Rest assured: the authors have all been paid. ^^' #OhWait! #AI #BitNet #SpywareWithASmile ^^'
Anything that helps reduce the environmental impacts of LLMs is a good thing.
bitnet.cpp is the official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPU (with NPU and GPU support coming next).

The first release of bitnet.cpp supports inference on CPUs. bitnet.cpp achieves speedups of 1.37x to 5.07x on ARM CPUs, with larger models seeing greater gains. It also reduces energy consumption by 55.4% to 70.0%, further boosting overall efficiency. On x86 CPUs, speedups range from 2.37x to 6.17x, with energy reductions between 71.9% and 82.2%. Furthermore, bitnet.cpp can run a 100B BitNet b1.58 model on a single CPU at speeds comparable to human reading (5-7 tokens per second), significantly enhancing the potential for running LLMs on local devices.
https://github.com/microsoft/BitNet #BitNet
GitHub - microsoft/BitNet: Official inference framework for 1-bit LLMs

1 bit instead of billions of parameters: Microsoft's BitNet b1.58 shows that AI can be capable even without high-end hardware. A radical approach with potential for more sustainability and accessibility. Is this the beginning of the end of GPU dependence? 👉 https://www.all-ai.de/news/top-news24/bitnet-microsofts-cpu-ki-fordert-die-gro%C3%9Fen-heraus #Microsoft #BitNet #AI
BitNet: Microsoft's CPU AI challenges the big players

Fewer bits, but no less performance: with BitNet b1.58, Microsoft delivers a model that combines efficiency and performance. A new standard?