"The transformer approach it describes has become the main architecture of a wide variety of AI, such as #LargeLanguageModels" #OutputProbabilities #Softmax Linear #Add&Norm #FeedForward #MultiHead Attention #MaskedMultiHead Attention #PositionalEncoding #OutputEmbedding #FeedForward