@emenel @yaxu yeah ok, let me reframe this
if you didn't do any curation at all - and just trained a network on some arbitrary data - could the model be described as having been designed?
@sean_ae @yaxu yes. what is it modeling? what relations between data are important? what does “training” mean in detail? those are all choices that change what is being modeled.
models are a human invention… the dictionary is also an llm, it models the alphabetical relation between words and associates them with definitions …
@emenel @yaxu the network is the thing you are training, the model is a result of training that network
i suppose what you are saying is that since networks are designed, some of that design is affecting the quality of the model (and it is, but i would not call the model designed - otherwise they would actually work, and they frequently don't)
@sean_ae @emenel @yaxu as a librarian, i can give another big example: model it by description. that is, create a metadata model.
it is also very much not objective or neutral (we talk about this a lot in librarianship). but simply labeling and categorizing the pieces of information is another form of modeling that is not necessarily stats oriented. we simply think of this as a model that aids precise info retrieval.
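a descriptive metadata model like that can be sketched in a few lines (a hypothetical record; the choice of fields is mine, which is exactly the point):

```python
# A toy metadata "model" of a document, as in library cataloguing.
# Which fields exist at all -- and which don't -- is itself the design.
record = {
    "title": "On Computable Numbers",
    "creator": "Turing, Alan",
    "subject": ["computability", "mathematics"],  # chosen controlled terms
    "date": "1936",
    # no field for, say, paper quality or marginalia: those aren't modelled
}
print(sorted(record.keys()))
```

nothing statistical here, but retrieval is still shaped by what the cataloguer decided to record.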
@sean_ae @yaxu you can create a model from any aspect of the data. for text it could be statistical n-gram frequency (i.e. LLMs), but it could be any other quantifiable aspect of the data.
and there are many ways to make a statistical model as well, whoever made the one you’re using had to make decisions about how to process the data. statistics itself is far from objective.
@sean_ae @yaxu yes, a quantifiable computational model lends itself to statistics. which is also already a non-neutral choice. computation is fundamentally numeric, so at some level you’ll always be working with mathematical relationships.
(we have models of things that are not computational as well, but that’s probably veering into different territory)
@sean_ae @yaxu yeah… so not all numeric relations are statistical. but computation lends itself to statistical approaches (it's basically why it was invented)
you could make a model from other types of data relations (i.e. not mean, average, frequency, etc.) but tbh i would have to put some work into thinking through it.
@sean_ae tbh it’s not the quality, it’s the ontology. what is being modeled and what does the model represent? there’s no neutral or objective answer. someone (group, org, corp, etc) designed a system that models a specific kind of data in a specific way.
any and all modeling is designed…
@emenel but what if the data is arbitrary?
(sorry for repeating myself)
i realise 'arbitrary data' is a very spherical cow here, but just hypothetically, if you had arbitrary data - would you still consider the model to have been designed?
@sean_ae sure, it could be arbitrary data. it could be a list of random numbers ...
the question then is what do you model? how is the model derived from the data?
@emenel well yeah, i guess there are levels of arbitrary
i can work with some pretty arbitrary sound sets but there are always gonna be questions about how or why files ended up in the set - i get that
i just find this idea that [the model is therefore designed] to be a bit weak, but maybe we can agree on 'weakly designed' or something
and i obvs get where you're coming from wrt the bigger LLMs being horribly skewed and manipulated - but to me there's a big difference between manipulating something and designing it
@sean_ae when you make a model with your own sound files, how are they processed into the model? what attributes are modelled? how are those relations represented? what attributes are left out? who made the decisions about how the data becomes a statistical model?
those are the design ... and in this case they've already been made by, e.g., google (tensorflow) etc.
the type of model that the process produces is a designed artefact, even if the specific contents of the model (weights, nodes, etc) are different for each output.
@emenel well you can tune the weights later on anyway (just ignore the ones you don't want to use)
but yeah you're performing [some analysis] and then working with that data
i mean, what you seem to be saying is that the output of any statistical analysis is designed - and this is kinda like saying the data you get from doing a scientific experiment is designed - and idk if you can really say that
i mean you can say the experiment was designed but the data is just [the data you got from that experiment]
@sean_ae sure... but in this case the output, the model, is the result of a specific designed process. that process creates this artefact, and this artefact only. a different process would produce a different artefact.
a set of data doesn't have a natural model, order, or set of relations. those are all choices and decisions.
"a model" in this context specifically means a computational statistical model of data-point relations based on frequency and context-frequency. it derives the weights of those relations via iterative gradient descent ... (and whatever else the training code does with the data).
that's a very specific way of analysing and representing the data, chosen by people to do a specific thing.
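the gradient-descent part can be sketched minimally (an illustrative toy, not anyone's actual pipeline): fit a single weight w so that w·x approximates y, by minimising squared error. every ingredient — the loss function, the learning rate, the stopping rule — is a design choice that shapes the resulting "model":

```python
# Minimal gradient descent: fit one weight w so w*x ~ y, data y = 2x.
# Loss, learning rate and step count are all designed-in choices.
def fit_weight(xs, ys, lr=0.01, steps=1000):
    w = 0.0
    for _ in range(steps):
        # gradient of mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

w = fit_weight([1, 2, 3], [2, 4, 6])
print(round(w, 3))  # converges towards 2.0
```

change the loss or the learning rate and the same data yields a different artefact — which is the point being argued above.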
@emenel i mean this is why we repeat experiments isn't it?
i'm not really understanding why you think the results of [some analysis] are themselves designed - i get that the analysis may be - but the results?
and you seem to be disregarding that the data itself was not designed either
and/or you seem to be implying that 'selected' and 'designed' are the same thing, and i don't think they are
@emenel yeah, if you carry out some analysis you will get the results of that analysis
idk what you mean by
>natural representation of the data
@sean_ae i mean that the specific types of analysis aren't neutral. someone is always deciding what and how to analyse, and then how to model those results. it's a designed process...
like how science is a specific methodology and produces specific kinds of outputs/outcomes that aren't neutral or objective (contrary to the current popular understanding of sciences).
to go back to the original point of all of this (lol), when you create a ml model of a dataset it is a designed process that produces a specific and limited kind of result, regardless of the data. that is an intentional set of decisions made by somebody.
@emenel no one said it was neutral though, and lack of neutrality (whatever neutrality is supposed to be) isn't in itself an indication of design
i may just be having a hard time understanding what you mean by these terms like 'natural', 'neutral' and 'objective'
@emenel ok sure, i guess you *could* say that performing an analysis on some data is in some way creating something
i don't see it that way personally, i see it as an act of observation
i think the thing you are observing exists regardless of whether you choose to observe it or not
what you're describing is more akin to editing than writing
@emenel yeah all of those things will affect your observation - but it's still an observation
you can say the same for any kind of analysis, recording, data gathering and whatnot (at least at the macro scale)
but none of that implies design
and the data you observe will exist regardless of whether or not you choose to observe it, or however you choose to
i suppose this is just a difference in philosophical approaches or something, we're prob never gonna agree :D
@sean_ae agreed, it is like all those other things. any form of analysis is limited and contextual.
so all i'm saying is that this specific method of analysis (machine learning and its offshoots) does a specific thing and produces a specific kind of output (a set of statistical weights etc). the way this is done was designed for specific reasons, choices and decisions were made.
when you employ those methods and tools you are also enacting those decisions on your data, and the resulting model is a representation of those decisions as much, or more, than of the data itself.
the specific model of that specific data will be unique, but it's a designed representation.
@emenel that's not really true, you can train a network on any data, either for reasons or for no reason at all, and you can choose whatever analysis methods you want
obvs there are loads of shitty LLMs out there but that's cos shitty people are making them
you can't really apply that crit to ML as a whole though
i suppose this is where you tell me claude shannon was a satanist or something
@emenel @sean_ae Not barging into your interesting thread any further, just wanted to point to similar supplementary points made here (re: these models and outputs being designed implicitly and explicitly [via chosen analysis/training/refinement methods/processes and source selections]):
@[email protected] @[email protected] @[email protected] @[email protected] Btw. The inventor of RSS 2.0 has been entering similar Dawkinsian terrain: https://social.vivaldi.net/@ianbetteridge/116522439483754901 FWIW Instead of being ND or not, I think the propensity to see and believe LLMs to be "active minds" or "new forms of consciousness" runs along different axes, including that we're evolutionary geared towards a tendency to see minds everywhere (being more cautious, even with a lot of false positives, meant better survival chances, but for LLMs it feels the inverse is just as important 🤷♂️). Then, there are also these factors: - LLMs are (still?) complete black-box functions (even if introspectable by experts, they're not fully understood, and for normal users probably the closest to "magic" post-modernism ever got to) - The mistake of considering language as direct proxy for minds, amplified by the turn-based interaction/conversational pattern imposed by LLM UIs (starting with ELIZA). Growing context windows and larger potential/scale of variability of answers have amplified that effect even further (and therefore also lowered the threshold of believability for some susceptible people). - A ton of anthropomorphic design choices (dark patterns?) at different layers, e.g. default use of first person voice & conversational/empathetic tone, system prompts to nudge the model output to sound more human & sycophantic... These all are intentional decisions by the model providers! Also important: Most users NEVER encounter the models during training and generally have no clue (or even interest!) about the incredible amount of human effort required to filter, curate, transform, massage & fine-tune inputs, training data and outputs to reduce the amount of public failure modes. 
I've been working with generative art/design processes for ~30 years and know from my own experience that the human input/effort (but also our intent) and curation of these systems is _the_ most valuable aspect of this way of working... Even deeply appreciating the sometimes truly "magic" moments of serendipity in the generated outputs has never once made me doubt my own agency in the process, or made me start assigning these systems a form of consciousness, neural networks and genetic programming included... Lastly, I think Dawkins is a special case, exactly because for decades he's been extremely vocal about the delusion of religion and the religious tendencies of people, only to succumb to the lure of LLMs himself... There's a bittersweet irony alone in that!
@emenel i mean it can do, but i'd say it's a bit like news photography
i mean you have some editorial control but essentially you're just dealing with what you can capture
is that design? not really
creative? possibly, depends
i suspect the word "network" gets in the way a bit here. we use it in a lot of contexts. so would it be better to use a more precise term like "graph" (mathematical kind: nodes and edges) so that you can then begin to get to the next set of clarifying questions? for example, "is this a directed graph (do the edges point in one direction only)?"
also, a graph is the fundamental structure of a "neural network", but can also be both a) the structure of the input data, and b) the structure of the output data.
then if we think of it in these terms, we can start to deal with the other generic, overloaded term, the "model." is it correct to interpret an LLM "model" as the (non-neutral, designed) set of decisions or instructions for how to *traverse* the output graph produced by the LLM?
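a tiny directed graph makes the traversal point concrete (my own toy example: the graph is the data, but how we walk it — depth-first here, visiting edges left-to-right — is a separate set of choices):

```python
# A small directed graph as an adjacency dict: nodes and one-way edges.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

def dfs(g, start):
    # Iterative depth-first traversal. The visit order is not a property
    # of the graph alone: it depends on these choices too.
    seen, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            order.append(node)
            stack.extend(reversed(g[node]))  # choice: explore edges left-to-right
    return order

print(dfs(graph, "a"))  # one of several valid orderings: ['a', 'b', 'd', 'c']
```

breadth-first, or reversing the edge order, would yield a different walk over the very same nodes and edges — which maps onto the model-vs-traversal question above.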