We present Context Length Probing, an embarrassingly simple, model-agnostic, #blackbox explanation technique for causal (#GPT-like) language models.
The idea is simply to check how predictions change as the left-hand context is extended token by token. This lets us assign "differential importance scores" to different contexts, as shown in the video.
Paper: https://arxiv.org/abs/2212.14815
Code: https://github.com/cifkao/context-probing
Demo: https://cifkao.github.io/context-probing/
#explainability #interpretability #Transformer #NLProc
🧵1/4
Black-box language model explanation by context length probing
The increasingly widespread adoption of large language models has highlighted the need for improving their explainability. We present context length probing, a novel explanation technique for causal language models, based on tracking the predictions of a model as a function of the length of available context, which makes it possible to assign differential importance scores to different contexts. The technique is model-agnostic and does not rely on access to model internals beyond computing token-level probabilities. We apply context length probing to large pre-trained language models and offer some initial analyses and insights, including the potential for studying long-range dependencies. The source code and an interactive demo of the method are available.
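To make the idea concrete, here is a minimal sketch of the probing loop, not the authors' implementation (see the linked repository for that). It assumes the model is available only as a black-box function `log_prob(context, target)` returning a token-level log-probability; that function name, the toy model standing in for a real language model, the empty-context baseline, and the choice of log-probability as the tracked quantity are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical black-box interface: log p(target | context).
LogProbFn = Callable[[Sequence[str], str], float]


def toy_log_prob(context: Sequence[str], target: str) -> float:
    """Toy stand-in for a language model (an assumption, not a real LM):
    the target is more likely if it already occurred in the context."""
    return math.log(0.5 if target in context else 0.1)


def differential_scores(
    tokens: Sequence[str], target_index: int, log_prob: LogProbFn
) -> List[float]:
    """Extend the left-hand context one token at a time and record how
    much the target's log-probability changes at each step.

    scores[k] is the change attributed to tokens[target_index - 1 - k],
    i.e. the token that becomes visible at context length k + 1.
    """
    target = tokens[target_index]
    previous = log_prob([], target)  # baseline: no context at all
    scores = []
    for length in range(1, target_index + 1):
        context = tokens[target_index - length:target_index]
        current = log_prob(context, target)
        scores.append(current - previous)
        previous = current
    return scores


tokens = ["the", "cat", "saw", "the", "cat"]
scores = differential_scores(tokens, target_index=4, log_prob=toy_log_prob)
# Only the step that reveals the earlier "cat" (context length 3)
# changes the prediction, so only scores[2] is nonzero.
print(scores)
```

Because each score is a difference between consecutive context lengths, the scores sum to the total change between the empty and the full context. With a real causal language model, `toy_log_prob` would be replaced by a call that scores the target token given a truncated context.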