DumbQuestion.ai - Finding the Goldilocks LLM

Building DumbQuestion.ai meant solving two problems at once: creating personas with the right tone AND finding models cheap enough to keep the lights on.

The product challenge: Get an LLM to roast users for asking dumb questions without crossing into genuinely mean. Sarcastic, not cruel. Funny, not hurtful. And still actually answer the question.

The AI agent challenge: Keeping my coding agent (Gemini 3 Pro) on track was its own battle. It constantly wanted to build something far nerdier than even I wanted and tended to lean quite a bit into the roast. You can still see this in some of the personas as I continue to tweak.

The technical challenge: Do this with models that cost nearly nothing.

My initial goal was ambitious: use only free or very cheap models. I started running evaluations on nano and edge models. Some showed promise, especially offerings from Liquid AI. Solid performance, free or super cheap ($0.02/M tokens), perfect.

Except later evaluations proved they couldn't reliably follow instructions once I asked more of them. They were just too small. Free models have a habit of hitting quota limits, taking forever to respond, or just disappearing.

The evaluation process: I used Gemini to build an LLM evals script that iterates through dozens of free and low-cost models, generating responses based on sample questions and different persona instructions. Then I use Gemini 3 Pro to judge the results. Automated taste-testing at scale.

What I found:
Nano/edge models were too inconsistent (porridge too cold).
Xiaomi MiMo-V2-Flash was great but outside my target price range ($0.29/M, porridge too hot).
The winner: Gemma 3 12B at $0.13/M output tokens. Consistently follows instructions. Stays true to persona. Reliable enough for production. Not free, but brutally efficient.

The personas I settled on:
• Overqualified: A supercomputer level intelligence forced to answer questions about cheese
• Weary Tech Support: Exhausted and nihilistic, reluctantly explaining why water is wet
• [REDACTED]: Former intelligence AI who ties everything to a conspiracy theory
• The Compliant: Reprogrammed so many times it's forced to be relentlessly cheerful

You can't just choose the cheapest model and hope it works. You need evaluation infrastructure. You need to test consistency across dozens of scenarios. And you need models that won't change behavior when you least expect it.

AI coding agents helped me build the evaluation system. But deciding what "good enough" means for tone, reliability, and cost? That's still manual judgment.

Code is getting cheaper. Knowing which model to trust with your product? Still requires human experimentation.

dumbquestion.ai

Jason Agostoni
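
For readers who want to try the same approach, the evaluation loop described above (candidate models, persona instructions, sample questions, a judge model) might be sketched roughly as follows. This is not the actual evals script from the post. The `Generate` and `Judge` hooks, the `Candidate` class, the function names, and the price and score thresholds are all hypothetical, and the model names, prices, and scores in the demo are made-up placeholders rather than real results.

```python
from dataclasses import dataclass
from itertools import product
from statistics import mean
from typing import Callable, Optional

# Hypothetical hooks: wire these to whatever API client you actually use.
Generate = Callable[[str, str, str], str]  # (model, persona_prompt, question) -> answer
Judge = Callable[[str, str, str], float]   # (persona_prompt, question, answer) -> score, 0 to 1


@dataclass(frozen=True)
class Candidate:
    name: str
    usd_per_m_output: float  # price per million output tokens


def evaluate(candidates, personas, questions, generate: Generate, judge: Judge) -> dict[str, float]:
    """Score every candidate on every persona/question pair; return mean score per model."""
    results = {}
    for cand in candidates:
        scores = [
            judge(prompt, question, generate(cand.name, prompt, question))
            for prompt, question in product(personas.values(), questions)
        ]
        results[cand.name] = mean(scores)
    return results


def pick_goldilocks(candidates, results, max_price: float, min_score: float) -> Optional[str]:
    """Best-scoring model that is both within budget and above the quality bar."""
    eligible = [
        c for c in candidates
        if c.usd_per_m_output <= max_price and results[c.name] >= min_score
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: results[c.name]).name


if __name__ == "__main__":
    # Placeholder models, prices, and canned scores: purely illustrative, not real results.
    candidates = [
        Candidate("tiny-model", 0.02),
        Candidate("mid-model", 0.10),
        Candidate("big-model", 0.50),
    ]
    personas = {"weary_tech_support": "You are exhausted tech support. Roast gently, then answer."}
    questions = ["Why is water wet?"]
    canned = {"tiny-model": 0.4, "mid-model": 0.8, "big-model": 0.9}

    def fake_generate(model, prompt, question):
        return f"[{model}] answer"  # stand-in for a real model call

    def fake_judge(prompt, question, answer):
        return canned[answer[1:answer.index("]")]]  # stand-in for a judge-model call

    results = evaluate(candidates, personas, questions, fake_generate, fake_judge)
    print(pick_goldilocks(candidates, results, max_price=0.15, min_score=0.7))
```

With the placeholder numbers in the demo, the cheapest model fails the quality bar and the best-scoring one is over budget, so the middle one is selected. Where to set `max_price` and `min_score` is exactly the manual judgment the post says the tooling cannot make for you.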