#subquadraticsparseattention tackles the #quadratic cost of attention differently from #1.58bit. 1.58-bit reduces weights to #ternary values so the math collapses into simple logic; sparse attention instead explores the heaviest connections first so the rest can be mostly ignored. The two approaches are complementary. #Attention
youtube.com/shorts/IupOu...
SubQ: The Attention Matrix Dis...
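
A rough sketch of the "heaviest connections first" idea, in plain Python. This is not the method from the linked video: `topk_sparse_attention` and its parameters are hypothetical names, and this toy version still scores every key before picking the top `k`, so it only illustrates the sparsity, not the subquadratic search a real method would need.

```python
import math

def topk_sparse_attention(q, keys, values, k):
    """Attend one query to only its k highest-scoring keys (hypothetical helper)."""
    # Score every key against the query (scaled dot product).
    # NOTE: scanning all keys is still linear per query, quadratic overall;
    # a real subquadratic method would find heavy candidates more cheaply.
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, key)) for key in keys]

    # Keep the indices of the k heaviest connections; ignore the rest.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]

    # Softmax over the kept scores only (subtract the max for stability).
    m = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - m) for i in top]
    total = sum(weights)

    # Weighted sum of the kept value vectors.
    dim = len(values[0])
    out = [0.0] * dim
    for w, i in zip(weights, top):
        for d in range(dim):
            out[d] += (w / total) * values[i][d]
    return out, top


# Example: four keys, but only the two heaviest are used.
q = [1.0, 0.0]
keys = [[10.0, 0.0], [0.0, 10.0], [9.0, 0.0], [-5.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]
out, kept = topk_sparse_attention(q, keys, values, k=2)
print(kept)
```

With `k` equal to the number of keys this reduces to ordinary softmax attention for that query, which is why the two ideas stack: ternary weights make each score cheaper, sparsity means fewer scores matter.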
