A downloadable project

How would a transformer know when to use ‘an’ rather than ‘a’? An autoregressive transformer predicts only one token at a time, yet the choice between the two articles depends on the word that follows. Using a prompt inspired by the Indirect Object Identification (IOI) task together with activation patching, we isolate a single MLP neuron in GPT-2 Large that is largely responsible for predicting the token ‘ an’. By patching the activation of each neuron in turn, we identify which one increases the logit of this token. We present our method and further evidence in the paper below.
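As a rough illustration of the neuron-level patching described above, here is a minimal sketch using the TransformerLens library. The prompts, the layer number, and the helper name are illustrative assumptions, not taken from the paper or the released notebook; it caches activations from a clean run (where ‘ an’ is correct), then reruns a corrupted prompt while restoring one MLP neuron at a time and reading off the ‘ an’ logit.

```python
# A minimal sketch of per-neuron activation patching with TransformerLens.
# Prompts, layer choice, and helper name are illustrative assumptions.
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2-large")

# Clean prompt: the correct next article is ' an'; the corrupted prompt
# swaps in a noun that takes ' a' instead.
clean_prompt = "I climbed up the pear tree and picked a pear. I climbed up the apple tree and picked"
corrupt_prompt = "I climbed up the pear tree and picked a pear. I climbed up the lemon tree and picked"
an_token = model.to_single_token(" an")

# Cache activations from the clean run once.
_, clean_cache = model.run_with_cache(clean_prompt)

def patch_neuron_an_logit(layer: int, neuron: int) -> float:
    """Run the corrupted prompt, restoring one MLP neuron to its clean
    value at the final position (where the prediction is made), and
    return the resulting logit of ' an'."""
    def hook(value, hook):
        # value has shape [batch, pos, d_mlp]
        value[:, -1, neuron] = clean_cache[hook.name][:, -1, neuron]
        return value

    logits = model.run_with_hooks(
        corrupt_prompt,
        fwd_hooks=[(f"blocks.{layer}.mlp.hook_post", hook)],
    )
    return logits[0, -1, an_token].item()

# Sweep every neuron in one layer and rank by how much each restores
# the ' an' logit (one forward pass per neuron, so this is slow).
layer = 31  # an illustrative layer, not the one reported in the paper
scores = torch.tensor(
    [patch_neuron_an_logit(layer, n) for n in range(model.cfg.d_mlp)]
)
print(scores.topk(5))
```

The neurons whose restored clean value most raises the ‘ an’ logit on the corrupted prompt are the candidates for the behavior described above.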

Download

a(n)-2.pdf
a_an_investigation.ipynb (1 MB)
