@yaxu i see people describe LLMs as algorithms quite often but i'm pretty sure that's not the right word
i suppose they can *mimic* algorithms though
@yaxu idk - do they even qualify as technology?
they're weird statistical models, not designed or controlled, they just kind of *are*
@yaxu idk if you can say designed or controlled really, maybe slightly
you could maybe say built
it's not really a set of instructions, more a set of weights (most of which are not being set intentionally unless you only include human-generated data and even that would be a bit of a stretch imo)
not often i end up in a semantic hole, pls forgive me
@sean_ae @yaxu yes. what is it modeling? what relations between data are important? what does “training” mean in detail? those are all choices that change what is being modeled.
models are a human invention… the dictionary is also an llm, it models the alphabetical relation between words and associates them with definitions …
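to make that concrete, a toy sketch (python, made-up entries):

```python
# a "dictionary" as a model: the chosen relation is alphabetical order,
# and each word is associated with a definition. nothing statistical here.
entries = {"cat": "a small domesticated felid", "bat": "a flying mammal"}
for word in sorted(entries):           # the modelled relation: alphabetical
    print(word, "--", entries[word])
```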
@sean_ae i.e., if you wanted an alphabetical model, that system can't do it... the algorithm for "training a model" as it's commonly used these days is a specific type of modelling... it's not objective or representing any "natural" state of the data. it's using the data to produce a very specific representation.
@emenel @yaxu the network is the thing you are training, the model is a result of training that network
i suppose what you are saying is that since networks are designed, some of that design is affecting the quality of the model (and it is, but i would not call the model designed - otherwise they would actually work, and they frequently don't)
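a toy sketch of that distinction (plain python, invented names, not any real framework):

```python
# the *network* here is the designed part: one weight, one bias, and a
# squared-error objective. the *model* is whatever (w, b) falls out of
# training that network on some particular data.
def train(data, steps=1000, lr=0.01):
    w, b = 0.0, 0.0                     # the network: a fixed, designed shape
    for _ in range(steps):
        for x, y in data:
            err = (w * x + b) - y       # designed choice: squared error
            w -= lr * err * x           # gradient step on the weight
            b -= lr * err               # gradient step on the bias
    return w, b                         # the model: a result, not a design

# same designed network, two datasets, two different models:
print(train([(1, 2), (2, 4), (3, 6)]))  # roughly (2.0, 0.0)
print(train([(1, 5), (2, 7), (3, 9)]))  # roughly (2.0, 3.0)
```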
@sean_ae @emenel @yaxu as a librarian, i can give another big example: model it by description. that is, create a metadata model.
it is also very much not objective or neutral (we talk about this a lot in librarianship). but simply labeling and categorizing the pieces of information is another form of modeling that is not necessarily stats oriented. we simply think of this as a model that aids precise info retrieval.
@sean_ae @yaxu you can create a model from any aspect of the data. for text it could be statistical n-gram frequency (ie llms), but could be any other quantifiable aspect of the data.
and there are many ways to make a statistical model as well, whoever made the one you’re using had to make decisions about how to process the data. statistics itself is far from objective.
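a toy n-gram frequency model makes those decisions visible (python, invented names):

```python
# every line embeds a decision: how to tokenise, whether to fold case,
# what counts as context (here n=2). change any of them, different model.
from collections import Counter

def bigram_model(text, fold_case=True):
    tokens = text.lower().split() if fold_case else text.split()  # a decision
    return Counter(zip(tokens, tokens[1:]))                       # a decision

text = "the cat sat on the mat The cat slept"
print(bigram_model(text).most_common(2))                   # one model
print(bigram_model(text, fold_case=False).most_common(2))  # a different model
```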
@sean_ae @yaxu yes, a quantifiable computational model lends itself to statistics. which is also already a non-neutral choice. computation is fundamentally numeric, so at some level you’ll always be working with mathematical relationships.
(we have models of things that are not computational as well, but that’s probably veering into different territory)
@sean_ae @yaxu yeah... so all numeric relations aren’t statistical. but computation lends itself to statistical approaches (it’s basically why it was invented)
you could make a model from other types of data relations (ie not mean or avg or frequency etc) but tbh i would have to put some work into thinking through it.
@sean_ae tbh it’s not the quality, it’s the ontology. what is being modeled and what does the model represent? there’s no neutral or objective answer. someone (group, org, corp, etc) designed a system that models a specific kind of data in a specific way.
any and all modeling is designed…
@emenel but what if the data is arbitrary?
(sorry for repeating myself)
i realise 'arbitrary data' is a very spherical cow here, but just hypothetically, if you had arbitrary data - would you still consider the model to have been designed?
@sean_ae sure, it could be arbitrary data. it could be a list of random numbers ...
the question then is what do you model? how is the model derived from the data?
@emenel well yeah, i guess there are levels of arbitrary
i can work with some pretty arbitrary sound sets but there are always gonna be questions about how or why files ended up in the set - i get that
i just find this idea that [the model is therefore designed] to be a bit weak, but maybe we can agree on 'weakly designed' or something
and i obvs get where you're coming from wrt the bigger LLMs being horribly skewed and manipulated - but to me there's a big difference between manipulating something and designing it
@sean_ae when you make a model with your own sound files, how are they processed into the model? what attributes are modelled? how are those relations represented? what attributes are left out? who made the decisions about how the data becomes a statistical model?
those are the design ... and in this case they've already been done by, i.e., google (tensorflow) or etc.
the type of model that the process produces is a designed artefact, even if the specific contents of the model (weights, nodes, etc) are different for each output.
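a toy illustration (plain python, not tensorflow, made-up numbers): two attribute choices, two different "models" of the same clip:

```python
# which attribute you extract is a design decision made before any training.
def rms(samples):                         # attribute 1: overall energy
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def zero_crossings(samples):              # attribute 2: a rough noisiness proxy
    return sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)

clip = [0.0, 0.5, -0.5, 0.3, -0.2, 0.1]   # stand-in for samples from a file
print(rms(clip))                          # one representation of the clip
print(zero_crossings(clip))               # a completely different one
```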
@emenel well you can tune the weights later on anyway (just ignore the ones you don't want to use)
but yeah you're performing [some analysis] and then working with that data
i mean, what you seem to be saying is that the output of any statistical analysis is designed - and this is kinda like saying the data you get from doing a scientific experiment is designed - and idk if you can really say that
i mean you can say the experiment was designed but the data is just [the data you got from that experiment]
@sean_ae sure... but in this case the output, the model, is the result of a specific designed process. that process creates this artefact, and this artefact only. a different process would produce a different artefact.
a set of data doesn't have a natural model, order, or set of relations. those are all choices and decisions.
"a model" in this context specifically means a computational statistical model of datat-point relations based on frequency and context-frequency. it creates the weights of the relations from a self-refining gradient descent ... (and etc whatever else the training code does with the data).
that's a very specific way of analysing and representing the data that was chosen by people to do a specific thing.
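a toy of just the "context-frequency" part (python, invented names, nothing like a full training run):

```python
# count which tokens appear within a window of each other. window size,
# directionality, tokenisation: all chosen by whoever designed the process.
from collections import defaultdict

def cooccurrence(tokens, window=2):       # window=2 is itself a design choice
    counts = defaultdict(int)
    for i, t in enumerate(tokens):
        for j in range(max(0, i - window), i):
            counts[(tokens[j], t)] += 1   # directed pair within the window
    return counts

toks = "the cat sat on the mat".split()
print(cooccurrence(toks)[("the", "cat")])  # 1
```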
@emenel i mean this is why we repeat experiments isn't it?
i'm not really understanding why you think the results of [some analysis] are themselves designed - i get that the analysis may be - but the results?
and you seem to be disregarding that the data itself was not designed either
and/or you seem to be implying that 'selected' and 'designed' are the same thing, and i don't think they are
@emenel yeah, if you carry out some analysis you will get the results of that analysis
idk what you mean by
>natural representation of the data
@sean_ae i mean that the specific types of analysis aren't neutral. someone is always deciding what and how to analyse, and then how to model those results. it's a designed process...
like how science is a specific methodology and produces specific kinds of outputs/outcomes that aren't neutral or objective (contrary to the current popular understanding of sciences).
to go back to the original point of all of this (lol), when you create a ml model of a dataset it is a designed process that produces a specific and limited kind of result, regardless of the data. that is an intentional set of decisions made by somebody.
@emenel no one said it was neutral though, and lack of neutrality (whatever neutrality is supposed to be) isn't in itself an indication of design
i may just be having a hard time understanding what you mean by these terms like 'natural', 'neutral' and 'objective'
i suspect the word "network" gets in the way a bit here. we use it in a lot of contexts. so would it be better to use a more precise term like "graph" (mathematical kind: nodes and edges) so that you can then begin to get to the next set of clarifying questions? for example, "is this a directed graph (do the edges point in one direction only)?"
also, a graph is the fundamental structure of a "neural network", but can also be both a) the structure of the input data, and b) the structure of the output data.
then if we think of it in these terms, we can start to deal with the other generic, overloaded term, the "model." is it correct to interpret an LLM "model" as the (non-neutral, designed) set of decisions or instructions for how to *traverse* the output graph produced by the LLM?
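a toy of that framing (python, invented names): the graph is one artefact, the traversal rule is a separate set of decisions:

```python
# a directed graph as an adjacency map: edges point one way only.
graph = {"a": ["b", "c"], "b": ["c"], "c": []}

def traverse(g, node, visited=None):
    visited = visited if visited is not None else []
    if node in visited:
        return visited
    visited.append(node)
    for nxt in g[node]:                   # decision: depth-first, stored order
        traverse(g, nxt, visited)
    return visited

print(traverse(graph, "a"))               # ['a', 'b', 'c']
```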