Steerling-8B is the first interpretable model that can trace any token it generates to its input context, to concepts a human can understand, and to its training data.

https://www.guidelabs.ai/post/steerling-8b-base-model-release/

#AI #InterpretableAI #DiffusionModel #DiffusionModels

Steerling-8B: The First Inherently Interpretable Language Model

We release Steerling-8B, an 8B-parameter causal diffusion language model that is interpretable by construction — its predictions are routed through concepts you can measure, audit, and control.

Guide Labs