Theo - t3.gg (@theo)
An update that customers hosting OpenAI models on Azure can now see roughly a 10x improvement in latency and throughput. Azure-based deployments using OpenAI models have gotten significantly faster, which looks like a meaningful gain in both response speed and processing efficiency.
RT @HotAisle: Kimi K2.6 + DFlash: 508 tok/s on 8x H100
more at Arint.info
#inference #LLM #LLMServing #throughput #transformers #arint_info
https://x.com/HotAisle/status/2046620289984057634#m
How one letter in assembly costs 3× performance
I want to show you how a single letter in assembly can cost 3× performance. Not in theory, but in live measurements. Along the way we'll peek inside the processor: the Register Alias Table, partial register merges, the scheduler, latency vs throughput, and we'll even discover that the divider produces the remainder before the quotient. But we'll start with the basics. Get ready: the rabbit hole goes deeper than it looks.
https://habr.com/ru/articles/1024862/
#x86 #assembly #NASM #div #partial_register_merge #latency #throughput #microarchitecture #Skylake #optimization
We have an onboarding guide:
Your First Dataset from #Jira, #Trello, or #OpenProject
It shows how easy it is to get your data from those systems. Just follow these simple steps and analyze your data!
#Cylenivo #Agile #CycleTime #LeadTime #Throughput #Flow #Scrum #Kanban
If you want a precise measure of your Cycle Time, Lead Time, and Throughput on your local machine, check out #Cylenivo - a free app for Windows, Linux, and Mac. Connect to Jira (or Trello or OpenProject), download your ticket data, and analyze it. Give it a try - it's free and Open Source. And boost if you like it!
#Jira #Trello #OpenProject #CycleTime #LeadTime #Throughput #App #FOSS
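For a sense of what those metrics mean, here's a minimal Python sketch (hypothetical field names and data, not Cylenivo's actual schema): cycle time runs from work started to done, lead time from created to done, and throughput counts finished items per week.

```python
# Hypothetical sketch of the three flow metrics from exported ticket timestamps.
from datetime import date

tickets = [  # made-up export: created, work started, done
    {"created": date(2024, 1, 1), "started": date(2024, 1, 3), "done": date(2024, 1, 8)},
    {"created": date(2024, 1, 2), "started": date(2024, 1, 5), "done": date(2024, 1, 9)},
    {"created": date(2024, 1, 4), "started": date(2024, 1, 4), "done": date(2024, 1, 15)},
]

# Cycle time: started -> done.  Lead time: created -> done.
cycle_times = [(t["done"] - t["started"]).days for t in tickets]
lead_times = [(t["done"] - t["created"]).days for t in tickets]

# Throughput: finished items per ISO (year, week).
throughput = {}
for t in tickets:
    week = tuple(t["done"].isocalendar())[:2]
    throughput[week] = throughput.get(week, 0) + 1

print("cycle times:", cycle_times)
print("lead times:", lead_times)
print("throughput per week:", throughput)
```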
Jira tells you your tickets exist. It won't tell you how fast your team actually learns. So I built Cylenivo: cycle time, lead time, throughput, and Monte Carlo delivery forecasts, pulled straight from your Jira data.
Fully local. No cloud. Open Source.
This is my passion project! I'd love to hear what flow metrics matter most to your team.
Beta is live: https://cylenivo.org/
Boosts are very welcome!
#agile #cycletime #leadtime #throughput #kanban #scrum #flowmetrics #foss #jira
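The Monte Carlo delivery forecast can be sketched in a few lines of Python (a hypothetical illustration of the common flow-metrics approach, not Cylenivo's implementation): resample historical weekly throughput with replacement until the backlog is burned down, repeat many times, and read forecasts off the percentiles.

```python
# Hypothetical Monte Carlo "when will N items be done?" forecast from
# historical weekly throughput samples.
import random

def forecast_weeks(throughput_history, backlog, trials=10000, seed=42):
    """For each trial, draw past weekly throughputs at random until `backlog`
    items are finished; return the sorted weeks-needed across all trials."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        done, weeks = 0, 0
        while done < backlog:
            done += rng.choice(throughput_history)
            weeks += 1
        results.append(weeks)
    return sorted(results)

def percentile(sorted_results, p):
    """p-th percentile, e.g. p=0.85 for an '85% likely by then' forecast."""
    idx = min(len(sorted_results) - 1, int(p * len(sorted_results)))
    return sorted_results[idx]

history = [3, 5, 2, 4, 6, 3, 4]   # items finished per week (made-up data)
runs = forecast_weeks(history, backlog=30)
print("50%:", percentile(runs, 0.50), "weeks;  85%:", percentile(runs, 0.85), "weeks")
```

Reporting the 85th percentile rather than the average is what makes the forecast useful: it answers "by when are we 85% likely to be done?" instead of quoting a best case.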
Throughput and CFD added...
xman (@xuconz)
A benchmark of Qwen3.5-35B-A3B running via vLLM on a single RTX PRO 6000 96GB in a single-request, single-user setup with no batching. Generating 512 tokens for one prompt took 3.03 s (about 169 tok/s), and the post notes that with continuous batching under concurrent load, aggregate throughput would increase further.

@ivanfioravanti @danieltvela @alexocheema @Prince_Canuma Single request, single user — no batching at all. One prompt, 512 completion tokens, 3.03s wall clock = ~169 tok/s raw generation speed. Running Qwen3.5-35B-A3B on a single RTX PRO 6000 96GB via vLLM. With continuous batching under concurrent load, aggregate throughput would go up.
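The quoted rate follows directly from the post's own numbers; a trivial check:

```python
# Sanity check of the benchmark arithmetic: 512 completion tokens in 3.03 s.
completion_tokens = 512
wall_clock_s = 3.03
tok_per_s = completion_tokens / wall_clock_s
print(f"{tok_per_s:.0f} tok/s")  # matches the ~169 tok/s quoted
```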
@catsalad ha! You slay me.
#humor #buffy #slayer #vampire #buffering #throughput #latency