OpenAI and Broadcom unveil LLM-optimized inference chip

OpenAI and Broadcom introduce Jalapeño, a custom AI chip built for LLM inference to improve performance, efficiency, and scale across AI systems.

OpenAI

yzma v1.18 is out, with the freshest of llama.cpp multimedia features including video input, batching, & more.

Go get it while it's hot!

https://github.com/hybridgroup/yzma

#golang #llamacpp #inference #gguf #yzma

GitHub - hybridgroup/yzma: Go with your own intelligence - Go applications that directly integrate llama.cpp for local inference using hardware acceleration.

GitHub

Inference cost at scale with napkin math

InJuly

Baseten is close to closing a $1.5B round at a $13B valuation, per WSJ via TechCrunch on June 18. Just five months after a $300M Series E at $5B, the raise is reportedly split-priced and co-led by Spark, Sands, Altimeter, and Wellington. The bet: inference is its own infrastructure category. https://go.aintelligencehub.com/ma-baseten13braise2026 #AI #Inference #Venture #Baseten

Baseten is set to raise $1.5B at $13B as inference demand soars

Baseten has reportedly raised $1.5B in AI inference funding, only months after a mega-round. The 'inference gold rush' continues as AI infrastructure becomes crucial for scaling AI applications. #AI #Fintech #Baseten #Inference

https://dev.to/souju_hegde_5a1f66b0979fb/baseten-raises-15b-in-ai-inference-funding-amid-hot-inference-gold-rush-47m

Baseten Raises $1.5B in AI Inference Funding Amid Hot 'Inference Gold Rush'

In the rapidly evolving landscape of AI infrastructure, Baseten is making waves with a reported $1.5...

DEV Community

#statstab #554 Does Regression Produce Representative Estimates of Causal Effects?

Thoughts: Regression creates an “effective sample” that may be unrepresentative of the population of interest, even when the original sample is representative.

#ate #causalinference #regression #sample #weights #representative #observational #inference #LATE

https://onlinelibrary.wiley.com/doi/abs/10.1111/ajps.12185
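
A toy illustration of the "effective sample" idea, as a minimal sketch rather than the paper's own code. It assumes a single binary covariate, so that residualizing treatment on the covariate reduces to subtracting the stratum mean, and it uses the standard result that the regression coefficient on treatment then weights each unit by its squared treatment residual. The data, the effect sizes, and the helper `group_mean` are hypothetical.

```python
def group_mean(values, groups, g):
    """Mean of `values` over the units whose group label equals g."""
    sel = [v for v, k in zip(values, groups) if k == g]
    return sum(sel) / len(sel)

# Hypothetical data: two equally sized strata (x = 0 and x = 1).
x = [0] * 10 + [1] * 10
# Treatment is common in stratum 0 (5 of 10) and rare in stratum 1 (1 of 10).
d = [1] * 5 + [0] * 5 + [1] * 1 + [0] * 9
# Assumed unit-level effects: 1.0 in stratum 0, 3.0 in stratum 1.
tau = [1.0 if g == 0 else 3.0 for g in x]
# Noise-free outcome: stratum baseline plus the unit's effect if treated.
y = [0.5 * g + t * di for g, t, di in zip(x, tau, d)]

# Residualize treatment on the covariate (stratum mean for a binary x).
d_res = [di - group_mean(d, x, g) for di, g in zip(d, x)]
# Regression weights: squared treatment residuals.
weights = [r * r for r in d_res]

# Coefficient on d from regressing y on d and x (Frisch-Waugh-Lovell form).
beta_ols = sum(r * yi for r, yi in zip(d_res, y)) / sum(weights)
# Average effect in the nominal sample versus in the effective sample.
ate = sum(tau) / len(tau)
effective = sum(w * t for w, t in zip(weights, tau)) / sum(weights)

print(f"ATE={ate:.3f}  regression={beta_ols:.3f}  weighted={effective:.3f}")
```

The regression estimate matches the weighted average, not the plain average: the stratum where treatment varies more dominates the effective sample even though both strata are equally represented in the data.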

Recommended: A new #deepseek v4 toolbox over at https://strix-halo-toolboxes.com/ lets me run a powerful DeepSeek v4 flash Q4 quant LLM on dual Strix Halo 128GB. Give it a try! #strixhalo #localAI #inference

Strix Halo AI Toolboxes

The Case Against the Inference API

Depending on a remote inference endpoint means depending on someone else's off switch: their uptime, their pricing, their policy changes, their access to your data. This essay makes the case for sovereign on-device inference and a signed, post-quantum, offline-verifiable record you can hold without trusting the vendor.

https://mickai.co.uk/articles/case-against-the-inference-api

#sovereignAI #inference #ondeviceAI #AIsecurity #vendorlockin

#statstab #547 Statistical inference for exploratory data analysis and model diagnostics

Thoughts: A rather odd and provocative article. Taking visual inference to its limit.

#exploratory #eda #plots #Rorschach #inference #simulation #lineups

https://www.researchgate.net/publication/26871625_Statistical_Inference_for_Exploratory_Data_Analysis_and_Model_Diagnostics

I have written up a more detailed post about linear regression with #probula using grid approximation. The post focuses on:

  • Explaining the example of linear regression in probula (which is an internal DSL embedded in Scala 3)
  • Showing how grid approximation is implemented in probula
  • Discussing how this implementation is tested and how it is used in testing other #BayesianInference methods

Enjoy!

https://wasowski.dukla.ch/posts/2606-grid-approximation-with-probula/

#ProbabilisticProgramming #Bayesian #Inference #DataAnalysis #Scala #Scala3 #keep

Simple Univariate Regression with Grid Approximation in Probula

Let me share an update on probula, my small, purely functional Bayesian inference library written in Scala 3. The primary goal of this write-up is to force myself (and you) to think about testing probabilistic models, inference algorithms, and the languages or APIs in which they are formulated. Admittedly, this is a very modest start, but more is on the way! I would like to start with the first inference scheme you encounter when reading McElreath’s Statistical Rethinking: grid approximation. It is by far the least efficient of the methods discussed in the book, but it remains useful as a testing baseline. Its simplicity and determinism let it serve as ground truth and oracle for other, more complex inference methods.

Andrzej Wąsowski
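
For readers who have not met grid approximation, here is a minimal Python sketch of the idea for a univariate linear regression. It is not probula's API or implementation (that is a Scala 3 DSL described in the linked post). The names `grid_posterior` and `linspace`, the flat prior, the fixed noise scale `sigma`, and the toy data are all assumptions made for illustration.

```python
import math

def linspace(lo, hi, n):
    """n evenly spaced points from lo to hi, inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def grid_posterior(xs, ys, a_grid, b_grid, sigma=1.0):
    """Posterior over (intercept a, slope b) on a finite grid.

    Assumes y ~ Normal(a + b * x, sigma) with known sigma and a flat
    prior over the grid, so the posterior is the normalized likelihood.
    """
    log_post = {}
    for a in a_grid:
        for b in b_grid:
            # Gaussian log-likelihood, up to an additive constant.
            log_post[(a, b)] = sum(
                -0.5 * ((y - (a + b * x)) / sigma) ** 2
                for x, y in zip(xs, ys)
            )
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post.values())
    weights = {k: math.exp(v - m) for k, v in log_post.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Toy data generated without noise from a = 1, b = 2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]

post = grid_posterior(xs, ys, linspace(-1.0, 3.0, 41), linspace(0.0, 4.0, 41))
a_map, b_map = max(post, key=post.get)  # grid point with highest posterior
print(f"MAP estimate on the grid: a={a_map:.2f}, b={b_map:.2f}")
```

Every grid point is evaluated exactly once and nothing is sampled, so repeated runs give identical posteriors. That determinism is what makes the scheme usable as a reference when checking stochastic methods, at the price of a cost that grows with the product of the grid sizes.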