We have an IEEE CAL 2026 paper on a neat little area-efficient integer dot-product hardware unit.
📄Paper: https://www.eecg.utoronto.ca/~mcj/papers/2026.fased.cal.pdf
⚙️Code: https://github.com/mcj-group/fased-verilog
The goal is to efficiently support quantized AI models whose bit widths vary across layers (e.g., LLMs with 2-, 4-, or 8-bit weights targeting edge devices or even servers). Our key insight is to optimize the dot product holistically, rather than only its subcomponents. Our proposed design, FASED, builds on the Booth multiplier and eliminates nearly half of all full adders in the dot-product unit by fusing the multiplication and reduction steps. FASED reduces area by up to 1.9x relative to prior variable-width integer dot-product designs. This was a fun collaboration with Pavel Golikov, Karthik Ganesan, and Gennady Pekhimenko.
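
To give a feel for what "fusing the multiplication and reduction steps" means, here is a rough behavioral sketch in Python. It is not the FASED hardware design from the paper: it assumes plain radix-4 Booth recoding of the weights, and the function names (`booth_radix4_digits`, `dot_separate`, `dot_fused`) are hypothetical, chosen only for illustration. It shows that the Booth partial products of every weight/activation pair can be summed in one shared reduction instead of finishing each product first and then adding the products.

```python
def booth_radix4_digits(w, bits):
    """Radix-4 Booth digits (each in -2..2) of a signed `bits`-bit integer.

    The digits satisfy w == sum(d * 4**i for i, d in enumerate(digits)).
    """
    assert bits % 2 == 0, "this sketch assumes an even bit width"
    assert -(1 << (bits - 1)) <= w < (1 << (bits - 1)), "w out of range"
    u = w & ((1 << bits) - 1)  # two's-complement bit pattern of w
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def dot_separate(weights, acts, bits):
    """Baseline view: reduce each product on its own, then add the products."""
    products = []
    for w, a in zip(weights, acts):
        partials = [d * a * 4**i
                    for i, d in enumerate(booth_radix4_digits(w, bits))]
        products.append(sum(partials))  # one reduction per multiplier
    return sum(products)                # plus a final reduction of products


def dot_fused(weights, acts, bits):
    """Fused view: pool all partial products and reduce them once."""
    partials = []
    for w, a in zip(weights, acts):
        for i, d in enumerate(booth_radix4_digits(w, bits)):
            partials.append(d * a * 4**i)
    return sum(partials)                # a single shared reduction


if __name__ == "__main__":
    # Illustrative 4-bit weights and small activations (made-up values).
    weights = [-8, 7, 3, -1]
    acts = [5, -3, 2, 7]
    reference = sum(w * a for w, a in zip(weights, acts))
    print(dot_separate(weights, acts, 4) == reference)
    print(dot_fused(weights, acts, 4) == reference)
```

Both functions compute the same value; the difference is only where the additions happen. In hardware, that choice determines how many adders the reduction needs, which is the part the paper optimizes.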
