Show HN: A deterministic middleware to compress LLM prompts by 50-80%

Hi HN,

I’m working on Skillware, an open-source framework that treats AI capabilities as installable, self-contained modules.

I just added a "Prompt Token Rewriter" skill. It’s an offline heuristic middleware that strips conversational filler and redundant context from long agentic loops before they hit the LLM. It saves significant token costs and inference time, and it's 100% deterministic (no extra model calls).
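To make "deterministic heuristic rewriting" concrete, here's a minimal sketch of the general technique: an ordered list of regex substitutions applied to the prompt before the LLM call. All names here are illustrative assumptions, not Skillware's actual API; the real skill lives in the repo linked below.

```python
# Illustrative sketch of a regex-based prompt rewriter (hypothetical API,
# not the actual Skillware implementation).
import re

# Ordered (pattern, replacement) pairs; applying them in sequence on the
# same input always produces the same output -- no model calls involved.
FILLER_PATTERNS = [
    (re.compile(r"\b(?:please|kindly|basically|actually|just)\b\s*",
                re.IGNORECASE), ""),       # drop conversational filler words
    (re.compile(r"[ \t]{2,}"), " "),       # collapse runs of spaces/tabs
    (re.compile(r"\n{3,}"), "\n\n"),       # collapse runs of blank lines
]

def rewrite_prompt(prompt: str) -> str:
    """Deterministically strip filler and redundant whitespace."""
    for pattern, replacement in FILLER_PATTERNS:
        prompt = pattern.sub(replacement, prompt)
    return prompt.strip()
```

Because the pipeline is pure string substitution, it adds no latency beyond the regex passes and is trivially auditable, which is the trade-off versus smarter (but non-deterministic) model-based compressors.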

We're building a registry of "Agentic Know-How" (Logic + Cognition + Governance). If you have a specialized tool for LLMs or want to see what a "standard" skill looks like, I'd love your feedback or a PR:

https://github.com/ARPAHLS/skillware

The README.md doesn't really explain what it is or why I'd want it — just the directory structure and how to install it.

I looked around the repository, and it looks like it's just 3 regexes to strip whitespace or filler words:

https://github.com/ARPAHLS/skillware/blob/main/skills/optimi...
