Ted (@tudoroancea)

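The post says that more batching, combined with optimizations that use 4 SIMD lanes and 12 threads (the 12 performance cores of an M4 Max), can reach a processing speed of 70 to 80 million tokens per second. It is technical content about inference performance and hardware optimization.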

https://x.com/tudoroancea/status/2050937732034187760

#batching #simd #performance #tokens #optimization

@alexocheema with more batching over 4 simd lanes and 12 threads (on 12 perf cores on m4 max) and some other tricks we can get up to 70-80M tok/sec

Ivan Fioravanti ᯅ (@ivanfioravanti)

A question asking whether Ollama supports continuous batching of concurrent requests. It can be read as an important inquiry about a developer-tool feature that affects LLM serving performance and throughput.

https://x.com/ivanfioravanti/status/2042622686128476553

#ollama #llm #serving #batching #inference

Does @ollama support Continuous batching of concurrent requests? 🤔
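
For readers unfamiliar with the term in the question above, here is a minimal sketch of why continuous batching matters. It is a toy step counter, not Ollama's or any other server's implementation: the function names and the output lengths are hypothetical, and each request is assumed to produce at least one token, one token per decode step.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each batch runs until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a freed slot is refilled from the queue right away."""
    queue = deque(lengths)
    active = []  # tokens still to generate for each in-flight request
    steps = 0
    while queue or active:
        # Fill any free slots before the next decode step.
        while queue and len(active) < batch_size:
            active.append(queue.popleft())
        steps += 1
        # Every in-flight request emits one token; finished requests leave.
        active = [r - 1 for r in active if r - 1 > 0]
    return steps


lengths = [2, 8, 2, 8]  # hypothetical output lengths, in tokens
print(static_batching_steps(lengths, 2))      # 16
print(continuous_batching_steps(lengths, 2))  # 12
```

In this toy case the short requests no longer hold a slot idle while a long one finishes, so the same work completes in fewer decode steps.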

Ivan Fioravanti ᯅ (@ivanfioravanti)

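The post mentions running Qwen3.5-122B-A10B on an M2 Ultra with continuous batching, and credits the fixes in the PR with resolving the hybrid cache problems encountered in vllm-mlx. It is an important update for the stability of large models on Apple Silicon and for batch processing.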

https://x.com/ivanfioravanti/status/2040044611779870878

#qwen #vllm #mlx #batching #llm

**How do you apply batching in Llama.cpp? Does speed actually drop, LOL?** 🤔

@ClimateBoss shares their experience running `./llama-server --parallel 2 --cont-batching...` and ran into:
- Context cut in half 😮
- 2 users = 20% slower than 1 user? 🤯
- Batching not as effective as expected?

NVIDIA says that adding users increases total throughput. So how do you make the speed go up? 🚀

#LlamaCPP #AI #Performance #Batching #MLOptimизация #ViệcLàmAI #TốcĐộ #Debug #NVIDIA #AIvn
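
The two numbers in the post above can be reconciled with a little arithmetic. The sketch below assumes the server splits the total context window evenly across parallel slots, which would match the halving reported, and it treats "20% slower" as each user getting 0.8 times the single-user speed. The function names and the 8192-token context size are hypothetical.

```python
def per_slot_context(total_ctx, parallel):
    """Context per slot, assuming an even split across parallel slots."""
    return total_ctx // parallel


def aggregate_speedup(users, per_user_relative_speed):
    """Total throughput relative to one user running alone."""
    return users * per_user_relative_speed


# Hypothetical total context of 8192 tokens shared by 2 slots.
print(per_slot_context(8192, 2))

# Each of 2 users runs at 0.8x the single-user speed (20% slower).
print(aggregate_speedup(2, 0.8))
```

Under these assumptions each user sees slower generation, yet the server as a whole produces about 1.6 times as many tokens per second. That is the sense in which total throughput rises with more users.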

What a Sustainable Workflow Looks Like for Chronic Pain: Real-Life Examples

Sustainable workflows prioritize adapting work habits to individual needs, especially for neurodivergent and disabled individuals. By recognizing that conditions and energies fluctuate, one can create a supportive, flexible routine that fosters productivity without burnout. Embracing rest and automation are essential strategies for maintaining creativity and honoring personal limits, ultimately leading to a more fulfilling work experience.

https://dreamspacestudio.net/what-a-sustainable-workflow-looks-like-for-chronic-pain-real-life-examples/

Why DeepSeek is cheap at scale but expensive to run locally

Why is DeepSeek-V3 supposedly fast and cheap to serve at scale, but too slow and expensive to run locally? Why are some AI models slow to respond but fast once…

It's so easy, when we are trying to work on a task, to get sidetracked. To jump from one thing to another, to another, to another and never quite get anything finished. https://lttr.ai/AUwEg via @ThatHoarder #batching #productivity #GettingThingsDone #MentalHealth

I created a Simple Sprite Batcher for SFML. It batches them for you automatically and improves performance! Work with your normal sprites!

It's a part of my SFML Snippets repository on GitHub:
https://github.com/Hapaxia/SfmlSnippets
and also available on the SFML wiki:
https://github.com/SFML/SFML/wiki/Source:-Simple-Sprite-Batcher

The (small) class is there along with an example program.

Here's a video using it (excuse the compression issues):
Sprite batching in SFML https://youtu.be/2P13hhMgeFs?si=dbsmnlt7NqKhy4pJ

#sfml #sprite #spritebatching #batching #performance

GitHub - Hapaxia/SfmlSnippets: Miscellaneous Snippets for use with SFML

Nobody wants to waste #bitcoins $BTC – #batching, a solution for the future of #Bitcoin $BTC https://journalducoin.com/analyses-dossiers/optimisation-bitcoinale-gestion-frais-espace-bitcoin/

Managing block space and transaction fees is crucial for Bitcoin, but how should this problem be handled? Here is an overview.
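
As a rough illustration of why payment batching saves block space, the sketch below compares sending several payments as separate transactions with sending them in one transaction. It does not come from the linked article. The sizes are approximate virtual-byte figures for P2WPKH transactions, and it assumes a single input can fund the whole batch; all names are hypothetical.

```python
# Approximate sizes in virtual bytes (vbytes) for P2WPKH; rough assumptions.
OVERHEAD_VB = 11  # version, locktime, counts, segwit marker (about 10.5)
INPUT_VB = 68
OUTPUT_VB = 31


def tx_vbytes(n_inputs, n_outputs):
    """Estimated size of one transaction."""
    return OVERHEAD_VB + n_inputs * INPUT_VB + n_outputs * OUTPUT_VB


def separate_vbytes(n_payments):
    """Each payment is its own transaction: 1 input, payment + change output."""
    return n_payments * tx_vbytes(1, 2)


def batched_vbytes(n_payments):
    """One transaction: 1 input, one output per payment plus one change output."""
    return tx_vbytes(1, n_payments + 1)


print(separate_vbytes(10))  # 1410
print(batched_vbytes(10))   # 420
```

Because fees are paid per vbyte, the batched transaction in this estimate would cost well under half of what the ten separate transactions would cost at the same fee rate.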

Journal du Coin