Are there useful or interesting ways to use LLMs other than prompting them? I feel like compressing all the text in the world via a hierarchically structured statistical model is probably useful, but that we're using it in a way that is unlikely to do what we'd hope.
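To make the compression framing concrete: a language model's per-token probabilities bound the size you'd get from arithmetic coding, roughly the sum of -log2 p over the tokens. A toy sketch (the probability numbers below are made up for illustration, not from any real model):

```python
import math

def compressed_size_bits(token_probs):
    """Ideal compressed size in bits for a token sequence, given the
    probability the model assigned to each token in context. This is the
    arithmetic-coding bound: sum of -log2 p per token."""
    return sum(-math.log2(p) for p in token_probs)

# Hypothetical per-token probabilities for the same 5-token sentence:
strong_model = [0.5, 0.8, 0.9, 0.6, 0.7]  # confident, mostly right
weak_model   = [0.1, 0.2, 0.3, 0.1, 0.2]  # spreads probability thinly

print(compressed_size_bits(strong_model))  # fewer bits
print(compressed_size_bits(weak_model))    # many more bits
```

A better model of the text means fewer bits, which is the sense in which training an LLM and compressing the training corpus are the same objective.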
Like everyone, I'm impressed and amazed by what they can do, but also very frequently nonplussed by their stupidity. The amazing things convince me there's something important here, but the stupidity has a consistent character that makes me think we're using them in a non-optimal way.
For example, I would expect them to be good at gathering text from their training data that talks about the same thing in two different ways. This seems like it would be very helpful for synthesizing views on a complex question, but in practice they don't seem to be.
I think that part of synthesizing multiple views is building a mental model of the underlying meaning, finding the points of disagreement, and putting that into a new framing. This feels like an inherently back-and-forth process that LLMs, by their very structure, can't do.
"Reasoning" models with chain of thought get a little way towards this, but they feel like an overpowered workaround that still doesn't really address it. I have to admit I don't know much about their internals, though, and I've never really had the chance to use them myself.
Another aspect is that I'm not sure it's possible, in some sense, to train models to produce truth. I feel like we learn what's true by living in the world and trying to use the imperfect pieces of knowledge and skill we have to accomplish things. Without that connection, can a model go beyond compression?
So I'm wondering whether there's another way to use LLMs that more directly exploits the fact that they're an incredible compression scheme. Search seems like one possibility, but preserving sources would undermine their role as compressors, I guess.
Maybe generating good keywords, and the alternative phrases people use when talking about something, as the starting point of a literature search? Has anyone tried using them this way just via prompting? Or is there another way to use the core model without prompting at all?
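A minimal sketch of the prompting side of this idea, leaving the actual LLM call abstract (the prompt wording and the parsing heuristics here are my own assumptions, not a tested recipe):

```python
import re

def expansion_prompt(topic, n=10):
    """Build a prompt asking a model for alternative phrasings of a topic,
    one per line, usable as literature-search keywords."""
    return (
        f"List {n} keywords or alternative phrasings that different "
        f"communities use when writing about: {topic}\n"
        "Return one phrase per line, with no numbering or commentary."
    )

def parse_phrases(llm_output):
    """Parse the model's reply into a clean list of search phrases,
    dropping blank lines and stray bullet or number prefixes that
    models often add despite instructions."""
    phrases = []
    for line in llm_output.splitlines():
        line = line.strip().lstrip("-*\u2022 ").strip()
        line = re.sub(r"^\d+[.)]\s*", "", line)  # strip "1." / "2)" prefixes
        if line:
            phrases.append(line)
    return phrases

# Usage: send expansion_prompt("catastrophic forgetting") to whichever
# model you have, then feed parse_phrases(reply) into a search engine.
reply = "- continual learning\n2) lifelong learning\nstability-plasticity dilemma"
print(parse_phrases(reply))
```

The point of the two-function split is that only the call in the middle touches a model; everything around it stays deterministic and inspectable, so the LLM is used purely as a phrase generator feeding an ordinary search pipeline.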