Listening to Paige Bailey talk about tradeoffs between small and large language models in terms of cost/latency vs quality of output. #sw2con

Small models today can compete with large models from 6-9 months ago.

She thinks smaller models augmented with retrieval are probably the sweet spot.

(Also, her general rule of thumb is that code assistants need to round-trip in <500 ms.)
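
To make that 500 ms rule of thumb concrete, here is a rough sketch of timing a completion call against the budget. Nothing here comes from the talk beyond the number itself: `fake_completion`, `timed_call`, and `within_budget` are hypothetical names, and the sleep is only a stand-in for a real model request.

```python
import time

BUDGET_MS = 500  # round-trip budget taken from the rule of thumb above


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def within_budget(elapsed_ms, budget_ms=BUDGET_MS):
    """True if the measured round trip is strictly under the budget."""
    return elapsed_ms < budget_ms


def fake_completion(prompt):
    # Hypothetical stand-in for a model call; a real assistant would
    # send the prompt to a model here instead of sleeping.
    time.sleep(0.05)
    return prompt + "a, b):"


if __name__ == "__main__":
    completion, ms = timed_call(fake_completion, "def add(")
    status = "within" if within_budget(ms) else "over"
    print(f"{completion!r} took {ms:.0f} ms ({status} the {BUDGET_MS} ms budget)")
```

In practice the measurement would wrap the whole path (network, retrieval, generation), since the budget is about what the user feels, not just model inference time.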