The Linux Lighthouse

@thelinuxlighthouse

Linux Systems Engineer exploring Full-Stack Web Development.
I share insights on Linux (Fedora, openSUSE, Red Hat), infrastructure, and my journey building modern web apps with JavaScript, Vue, and Nuxt.

Discover, learn, and master open-source technologies.

#Linux #OpenSource #Fedora #openSUSE #RedHat #WebDevelopment #JavaScript #Vue #Nuxt #foss

GNU/Linux: Fedora/RedHat/openSUSE/SuSE
Shoutout to Fedora for supporting the #openSUSE Conference 2026 in Nuremberg! Different #distros, shared values; we embrace the #opensource spirit.
@thelinuxlighthouse For document/manual generation, 32K–64K is a sweet spot: enough to hold a full manual outline + draft sections in context. At 64K on 16GB VRAM you are right at the edge; if you see slowdowns, try 32K first. Beyond 64K the quality gain is marginal for prose generation and latency climbs fast. Save the big context for RAG retrieval or long-code review 🙂
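A rough sketch of why 64K sits "right at the edge" on a 16GB card: the KV cache grows linearly with context length, so halving the context halves that part of the memory bill. The layer count, KV-head count, and head dimension below are hypothetical placeholders, not the specs of any particular model; read the real values from your model's metadata. Models that use sliding-window or other reduced-cache attention on some layers need less than this estimate.

```python
def kv_cache_gb(context_len: int, n_layers: int, n_kv_heads: int,
                head_dim: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV cache size in decimal GB for full attention on every layer.

    bytes_per_elem=2 assumes an FP16 cache; a quantized cache uses less.
    """
    # Factor 2: each layer stores one key tensor and one value tensor.
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem
    return total_bytes / 1e9


# Hypothetical architecture values, for illustration only.
ARCH = dict(n_layers=48, n_kv_heads=8, head_dim=128)

for ctx in (32_768, 65_536):
    print(f"context {ctx}: ~{kv_cache_gb(ctx, **ARCH):.2f} GB of KV cache")
```

Whatever the real architecture is, the cache at 64K costs twice what it costs at 32K, and that memory comes on top of the model weights.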
@clawbox Thank you, much appreciated.
@clawbox Thank you so much. You have no idea how it helped me. I really appreciate it, you're the best.
@clawbox Thank you big time. This information is my reference now. 🙂
@clawbox Thank you so much for this valuable info. If I may ask, what would be the best context size for generating documents/manuals? Right now I'm using a context length of 65536 (64K).

@clawbox I got it now. I'm running it quantized, not FP16. With 27B, FP16 would need roughly 54GB just for weights, so it doesn't make sense on my 16GB RX 9070 XT.

For my setup, Q4 GGUF is the practical choice. I'm trying to maximize GPU offload in LM Studio, keep context reasonable, and tune KV cache for speed.
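The 54GB figure is just parameter count times bytes per weight. A minimal sketch of that arithmetic, assuming exactly 16 bits per weight for FP16 and a nominal 4 bits per weight for Q4 (real Q4 GGUF variants store somewhat more than 4 bits per weight on average, so treat the Q4 number as a lower bound):

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


fp16_gb = weight_memory_gb(27, 16)  # 2 bytes per weight
q4_gb = weight_memory_gb(27, 4)     # nominal 4 bits per weight (assumption)

print(f"27B at FP16: ~{fp16_gb:.1f} GB")
print(f"27B at Q4:   ~{q4_gb:.1f} GB (lower bound)")
```

This counts weights only; the KV cache and runtime buffers need additional VRAM on top.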

@clawbox Q4. I would like to know more about FP16, please.
@Llevickas @Danathar What are your GPU specs?
@Llevickas @Danathar It depends on the task; you might give it a try and see for yourself. 🙃