@FreakyFwoof @rojun The quality from 27B might be a little better, but it absolutely runs slower than 35B because it's a dense model. The 35B model is an MoE architecture with only 3B active parameters, so far fewer weights have to be read for each token. This matters even more on Macs because Apple Silicon's unified memory has much lower memory bandwidth than NVIDIA cards.
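
To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes token generation is purely memory-bandwidth bound, so speed is about bandwidth divided by the bytes of active weights read per token. The 400 GB/s bandwidth and 4-bit quantization (0.5 bytes per parameter) are placeholder assumptions, not measurements, and `est_tokens_per_sec` is a made-up helper name. It ignores compute, KV cache reads, and MoE routing overhead, so treat the result as an optimistic ceiling.

```python
def est_tokens_per_sec(active_params_billion, bandwidth_gb_s, bytes_per_param=0.5):
    """Rough upper bound on decode speed for a bandwidth-bound model.

    Assumes every active parameter is read from memory once per token.
    bytes_per_param=0.5 is an assumed 4-bit quantization.
    """
    gb_read_per_token = active_params_billion * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


BANDWIDTH_GB_S = 400  # hypothetical figure, swap in your own hardware's number

dense_27b = est_tokens_per_sec(27, BANDWIDTH_GB_S)  # dense: all 27B read per token
moe_35b = est_tokens_per_sec(3, BANDWIDTH_GB_S)     # MoE: only ~3B active per token

print(f"dense 27B ceiling: {dense_27b:.1f} tok/s")
print(f"MoE 35B (3B active) ceiling: {moe_35b:.1f} tok/s")
print(f"ratio: {moe_35b / dense_27b:.1f}x")
```

Under these assumptions the ratio depends only on active parameters, not on the bandwidth you plug in. The MoE model still needs all 35B weights to fit in memory, it just reads a small slice of them per token.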