Mastodawn

khalic

This account is a replica from Hacker News. Its author can't see your replies.

khalic 2d ago

Is the gram pen a design for Caran d’Ache? Because it looks a lot like their 849 model.

khalic 3d ago

This is crazy, thank you for the link!

khalic 3d ago

It's a little disturbing, but also very fun to just discover by probing, building and breaking.

khalic 3d ago

Another example of the mindf@#$ these systems are: I was fine-tuning a small model to take data fields and make a sentence out of them. I was running into mode collapse (basically when the AI simplifies too much and always outputs the same thing).

I got unstuck by randomizing the field order for each row?!? That was at training time, and now I'm thinking I should do the same at inference time...
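For anyone wanting to try the same trick, here is a minimal sketch of per-row field shuffling. It assumes each row is a flat dict serialized to JSON as the prompt; the names `shuffled_prompt` and `build_training_rows`, and the `prompt`/`completion` keys, are hypothetical and not taken from any particular fine-tuning library.

```python
import json
import random


def shuffled_prompt(record, rng):
    """Serialize a flat record with its fields in a random order."""
    items = list(record.items())
    rng.shuffle(items)  # in-place shuffle of the (key, value) pairs
    # dict preserves insertion order, so the JSON keeps the shuffled order
    return json.dumps(dict(items), ensure_ascii=False)


def build_training_rows(records, targets, seed=0):
    """Pair each record (fields shuffled per row) with its target sentence."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    return [
        {"prompt": shuffled_prompt(rec, rng), "completion": tgt}
        for rec, tgt in zip(records, targets)
    ]


if __name__ == "__main__":
    record = {"name": "Ada", "city": "Paris", "age": 36}
    rows = build_training_rows([record] * 3, ["Ada, 36, lives in Paris."] * 3)
    for row in rows:
        print(row["prompt"], "->", row["completion"])
```

The same `shuffled_prompt` call could be reused at inference time, so the model sees the same kind of ordering noise it was trained on. Whether that helps is something to measure, not assume.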

khalic 3d ago

Incredible, this will translate to better coding models in the near future.

We really need to develop better tools to understand what's happening inside these NNs. Working with high-D spaces is not something we're good at, and we're basically throwing stuff at it and seeing if it sticks.

khalic 4d ago

If you’re looking into small models for tiny local tasks, you should try Qwen Coder 0.5B. It’s more of an experiment, but it can output decent functions given the right context instructions.

khalic 4d ago

Making a sentence out of a JSON object.

khalic 4d ago

AFM models are very impressive, but they’re not made for conversation, so keep your expectations down in chat mode.

khalic Mar 29

This study assumes everybody is oblivious to contamination, and explicitly says they can't differentiate. Not useful, and bordering on the tautological.