5 Followers
37 Following
11 Posts

> How can we make use of AI without giving up ourselves?

I am a software engineer peeking into the next chapter of digital evolution. Wondering what technology brings us this time and how to make best use of artificial intelligence. Caring about our personal data, transparency and freedom.

Blog: https://venkado.org/about
@loleg Interesting, looking forward to it. So far Olmo 3.1 is my favourite. It works quite well, for example as a q4_0 quantization.
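For readers unfamiliar with the term: q4_0 is one of ggml's block-quantization formats, where groups of 32 weights share one scale and each weight is stored in 4 bits. A minimal sketch of the idea in plain Python (my own simplified reading of the scheme, not ggml's actual code):

```python
import random

def quantize_q4_0(block):
    """Quantize a block of 32 floats to 4-bit codes, roughly following
    ggml's q4_0 scheme: one shared scale per block, values mapped into
    the signed range [-8, 7]."""
    assert len(block) == 32
    amax = max(block, key=abs)              # element with the largest magnitude
    d = amax / -8.0 if amax != 0 else 1.0   # scale; the sign trick uses the full range
    qs = [min(15, max(0, round(x / d + 8))) for x in block]
    return d, qs

def dequantize_q4_0(d, qs):
    """Reconstruct approximate floats from the scale and 4-bit codes."""
    return [(q - 8) * d for q in qs]

random.seed(0)
block = [random.uniform(-1, 1) for _ in range(32)]
d, qs = quantize_q4_0(block)
restored = dequantize_q4_0(d, qs)
max_err = max(abs(a - b) for a, b in zip(block, restored))
# reconstruction error stays within about one quantization step
assert max_err <= abs(d) * 1.001
```

This is why q4_0 models are roughly a quarter the size of fp16 weights while staying usable: the per-block scale keeps the error proportional to the block's largest value.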

RE: https://venkado.org/2026/03/07/selfhosted-ai-assistant-relagent-reaches/

The web interface (PWA), Android and desktop clients for the 100% #selfhosted #ai assistant are making good progress. The on-device voice integration is also coming along nicely, and not just in English. But I lack access to Apple hardware to test iOS or macOS myself.

Any thoughts on a #privacy focused #opensource assistant that won't steal your data or need a credit card? Please let me know. 🗒️

@f As a rule of thumb, I agree. But in the case of someone who cannot or does not want to run things on self-owned hardware:

**What is the next best solution when it comes to running LLM queries?** Renting a GPU is hardly practical for most users.

I should have been clearer. The project I am working on clearly advocates running LLM inference locally. It would still be nice to offer an alternative.

@n_dimension Yes, running the model directly is the desired use case for the selfhosted assistant I am creating (https://gitlab.com/RmMsr/relagent). And that works great.

To let people try it, or in case someone has no other option, I want to suggest something more suitable than just ChatGPT, Gemini, OpenRouter or whatever turns up on the internet that day. ;-)

Roman / relagent · GitLab

A privacy focused AI agent that runs completely on your hardware. Stay in control of your data, costs and dependencies.

GitLab

@Alain Thank you.

Yes, OpenRouter is an option. However, their terms of service say little about data protection. Plus you waive rights if you opt in to logging.

I tried Parasail. No visible cost control, and no tool calling for gpt-oss. Olmo3 behaves oddly inflexible.

I will take a look at Pindown.

Can anyone recommend a privacy friendly SaaS #llm #inference provider? It needs to support *function calling* on at least one of the more recent #openweights models:

- gpt-oss
- Olmo3
- Apertus? (I have not yet succeeded in using it)

There should be some level of cost control, ideally an hourly rate limit. European solutions are preferred.

Use case is to have a fallback for demos or experiments where local inference is not practical. Monthly costs should go towards 0 when not used.
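For context, "function calling" here means the OpenAI-style `tools` field in a chat completion request, which most compatible providers accept. A sketch of the request shape I need supported (the endpoint URL, model name and `get_weather` tool below are placeholders, not any real provider's API):

```python
import json

# Hypothetical endpoint; a compatible provider exposes an
# OpenAI-style /v1/chat/completions route.
ENDPOINT = "https://api.example.eu/v1/chat/completions"

payload = {
    "model": "gpt-oss-20b",
    "messages": [
        {"role": "user", "content": "What is the weather in Zurich?"}
    ],
    # Instead of plain text, the model may reply with a tool_calls entry
    # naming this function; the client then runs it and sends the result back.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

body = json.dumps(payload)
```

A provider that returns a well-formed `tool_calls` response for this shape on one of the models above would cover my use case.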

#selfhosting

Hello fellow beings,

my name is Roman and it is time for an #Introduction . I earn my living as an IT engineer and care a lot about #opensource, #privacy and independence.

I am still quite surprised by the rise and the risks of #genAI . Instead of just being affected by it, I'd rather understand it. So I am putting my curiosity and time into creating a completely local AI assistant for #selfhosting .

https://gitlab.com/RmMsr/relagent

Happy tooting!

Roman / relagent · GitLab

A privacy focused AI agent that runs completely on your hardware. Stay in control of your data, costs and dependencies.

GitLab

@duncan_bayne ovh.com (France) has been working well. The user interface is messy and overloaded, but in terms of hosting they have a lot to offer.
I have heard good things about gandi.net (also French) if you are not hunting for the lowest prices.

All 11 Australian registrars are listed by ICANN, but I have no comment on those: https://www.icann.org/en/contracted-parties/accredited-registrars/list-of-accredited-registrars?page=1&country=Australia

List of Accredited Registrars

@taschenorakel I got this using gpt-oss-20b-mxfp4-GGUF:

```
#include <print>

int main() {
    std::println("Hello, World!");
}
```

I have been out of the loop with C++ for some time, so I am not sure about including <print> and the std namespace.

@iamlayer8 True. The model itself is stateless. The state (or context, if you like) is everything the service memorizes and adds to every request. And that part is what makes those services more and more valuable.

Which cuts both ways: usefulness for the user and better data for the provider. I agree that privacy and information ownership are major concerns here.

Reading this blog post I learned how little innovation beyond the LLM itself was probably needed to get here: https://manthanguptaa.in/posts/chatgpt_memory/

I Reverse Engineered ChatGPT's Memory System, and Here's What I Found! - Manthan

When I asked ChatGPT what it remembered about me, it listed 33 facts from my name and career goals to my current fitness routine. But how does it actually store and retrieve this information? And why does it feel so seamless? After extensive experimentation, I discovered that ChatGPT’s memory system is far simpler than I expected. No vector databases. No RAG over conversation history. Instead, it uses four distinct layers: session metadata that adapts to your environment, explicit facts stored long-term, lightweight summaries of recent chats, and a sliding window of your current conversation.
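The four layers the post describes can be sketched as a plain prompt-assembly step. This is only my reading of the structure, not ChatGPT's actual code, and all names are made up:

```python
def build_context(metadata, facts, summaries, conversation, window=6):
    """Assemble the request context from the four layers the blog post
    describes: session metadata, long-term explicit facts, lightweight
    summaries of recent chats, and a sliding window over the current
    conversation."""
    parts = [
        "## Session metadata\n" + "\n".join(f"{k}: {v}" for k, v in metadata.items()),
        "## Known facts about the user\n" + "\n".join(f"- {f}" for f in facts),
        "## Recent conversation summaries\n" + "\n".join(f"- {s}" for s in summaries),
        # Only the last `window` turns are sent; older turns fall out of context.
        "## Current conversation\n" + "\n".join(conversation[-window:]),
    ]
    return "\n\n".join(parts)

ctx = build_context(
    metadata={"device": "Android", "locale": "de-CH"},
    facts=["Name: Roman", "Works as an IT engineer"],
    summaries=["Discussed self-hosted LLM inference"],
    conversation=[f"turn {i}" for i in range(10)],
)
```

The striking point is that everything here is string concatenation on top of a stateless model: no vector database, no retrieval, just layered text prepended to each request.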