@sean_ae @yaxu it’s obviously built on massive amounts of data. but it is not an objective or natural outcome or representation of that data. it’s a specific, designed, and created artifact that uses the data in a specific and intentional way.

@emenel @yaxu yeah ok, let me reframe this

if you didn't do any curation at all - and just trained a network on some arbitrary data - could the model be described as having been designed?

@sean_ae @yaxu yes. what is it modeling? what relations between data are important? what does “training” mean in detail? those are all choices that change what is being modeled.

models are a human invention… the dictionary is also an llm: it models the alphabetical relation between words and associates them with definitions …

@emenel @yaxu hmm idk about this - you seem to have ignored the word 'arbitrary' in my question
@sean_ae @yaxu even if the data is arbitrary you have to choose how to model it. the data itself has no natural model. the model is always designed.

@emenel @yaxu can you explain a bit more what you mean there?

i don't have to specify anything when training a network locally, i can just chuck any data in (sounds/gestures/images/text), not label anything and get data out the other end

@sean_ae @yaxu yes. because you’re using a system that already determines the type of model. the code you use for training a local model is the design of the model … it’s the algorithm that takes your input data and produces a specific type of model.
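(editor's aside: the point above can be made concrete with a small sketch. this is a hypothetical illustration, not anyone's actual training code: the same arbitrary, unlabeled data is run through two different "training" procedures, and the procedure, not the data, determines what kind of model comes out.)

```python
# Sketch (hypothetical): the same unlabeled data run through two different
# training procedures yields two structurally different "models" --
# the procedure, not the data, determines what kind of artifact comes out.
import random

random.seed(0)
data = [random.random() for _ in range(1000)]  # arbitrary, unlabeled numbers

def train_histogram(xs, bins=4):
    """Procedure A: model the data as bin frequencies."""
    counts = [0] * bins
    for x in xs:
        counts[min(int(x * bins), bins - 1)] += 1
    return {"type": "histogram", "counts": counts}

def train_centroids(xs, k=2, steps=10):
    """Procedure B: model the data as k cluster centres (1-D k-means)."""
    centres = list(xs[:k])
    for _ in range(steps):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[min(range(k), key=lambda i: abs(x - centres[i]))].append(x)
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return {"type": "centroids", "centres": centres}

model_a = train_histogram(data)
model_b = train_centroids(data)
# Same input, no labels, no curation -- but the two artifacts answer
# entirely different questions about the data.
```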

@emenel @yaxu ah ok, i see

when you say 'model' you mean what i would call a network

@sean_ae @yaxu what's a network? imo it's a model of relations...

@emenel @yaxu the network is the thing you are training, the model is a result of training that network

i suppose what you are saying is that since networks are designed, some of that design is affecting the quality of the model (and it is, but i would not call the model designed - otherwise they would actually work, and they frequently don't)

@sean_ae @yaxu not even just the quality... but what the model fundamentally is. statistics are only one way to model a dataset, and not an objective or neutral way.
@emenel @yaxu what would be other ways to model a dataset?

@sean_ae @emenel @yaxu as a librarian, i can give another big example: model it by description. that is, create a metadata model.

it is also very much not objective or neutral (we talk about this a lot in librarianship). but simply labeling and categorizing the pieces of information is another form of modeling that is not necessarily stats oriented. we simply think of this as a model that aids precise info retrieval.

@sean_ae @yaxu you can create a model from any aspects of the data. for text it could be statistical n-gram frequency (ie LLMs), but it could be any other quantifiable aspect of the data.

and there are many ways to make a statistical model as well, whoever made the one you’re using had to make decisions about how to process the data. statistics itself is far from objective.
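(editor's aside: a toy sketch of "any other quantifiable aspect" may help here. this is not how production LLMs are built; it just shows two models of the same corpus derived from two different chosen aspects of the text.)

```python
# Toy illustration (not production LLM training): two models of the same
# corpus, each built from a different quantifiable aspect of the text.
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()

# Aspect 1: bigram frequency -- a statistical model of word co-occurrence.
bigram_model = Counter(zip(corpus, corpus[1:]))

# Aspect 2: word length -- a numeric property of the same data that
# involves no frequencies at all.
length_model = {w: len(w) for w in corpus}

# Which artifact you end up with depends entirely on which aspect
# someone chose to extract.
```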

@emenel @yaxu i'm curious about how you would train a network on quantifiable aspects of a dataset without involving statistics on some level

@sean_ae @yaxu yes, a quantifiable computational model lends itself to statistics. which is also already a non-neutral choice. computation is fundamentally numeric, so at some level you’ll always be working with mathematical relationships.

(we have models of things that are not computational as well, but that’s probably veering into different territory)

@emenel @yaxu oh ok, it's just that you said

>statistics are only one way to model a dataset

@sean_ae @yaxu yeah... so not all numeric relations are statistical. but computation lends itself to statistical approaches (it’s basically why it was invented)

you could make a model from other types of data relations (ie not mean or avg or frequency etc) but tbh i would have to put some work into thinking through it.

@sean_ae @yaxu like, i could make a model of words that’s based on how i associate them with rooms of my house … that’s still a model of the corpus of words.
@emenel @yaxu so just a 1/0 relationship between the words and the rooms?
@sean_ae @yaxu could be many to many ... but it's not based on a statistical relation in that case. i dunno, i'm just thinking off the cuff and it's not very thorough :)
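(editor's aside: the rooms-of-my-house idea sketches out cleanly as a many-to-many mapping. the specific words and rooms below are invented; the point is that the relation itself is the model, with no frequencies, means, or co-occurrence counts involved.)

```python
# Sketch of the room-association idea: a many-to-many mapping from words
# to rooms. The relation itself is the model -- nothing statistical here.
# (The word/room choices are invented for illustration.)
word_rooms = {
    "kettle": {"kitchen"},
    "book":   {"study", "bedroom"},
    "towel":  {"bathroom", "kitchen"},
}

def rooms_for(word):
    """Which rooms a word is associated with (may be several)."""
    return word_rooms.get(word, set())

def words_in(room):
    """Which words are associated with a room (may be several)."""
    return {w for w, rs in word_rooms.items() if room in rs}
```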

@sean_ae tbh it’s not the quality, it’s the ontology. what is being modeled and what does the model represent? there’s no neutral or objective answer. someone (group, org, corp, etc) designed a system that models a specific kind of data in a specific way.

any and all modeling is designed…

@emenel but what if the data is arbitrary?

(sorry for repeating myself)

i realise 'arbitrary data' is a very spherical cow here, but just hypothetically, if you had arbitrary data - would you still consider the model to have been designed?

@sean_ae sure, it could be arbitrary data. it could be a list of random numbers ...

the question then is what do you model? how is the model derived from the data?

@emenel well yeah, i guess there are levels of arbitrary

i can work with some pretty arbitrary sound sets but there are always gonna be questions about how or why files ended up in the set - i get that

i just find this idea that [the model is therefore designed] to be a bit weak, but maybe we can agree on 'weakly designed' or something

and i obvs get where you're coming from wrt the bigger LLMs being horribly skewed and manipulated - but to me there's a big difference between manipulating something and designing it

@sean_ae when you make a model with your own sound files, how are they processed into the model? what attributes are modelled? how are those relations represented? what attributes are left out? who made the decisions about how the data becomes a statistical model?

those are the design ... and in this case they've already been done by, e.g., google (tensorflow) etc.

the type of model that the process produces is a designed artefact, even if the specific contents of the model (weights, nodes, etc) are different for each output.

@emenel well you can tune the weights later on anyway (just ignore the ones you don't want to use)

but yeah you're performing [some analysis] and then working with that data

i mean, what you seem to be saying is that the output of any statistical analysis is designed - and this is kinda like saying the data you get from doing a scientific experiment is designed - and idk if you can really say that

i mean you can say the experiment was designed but the data is just [the data you got from that experiment]

@sean_ae sure... but in this case the output, the model, is the result of a specific designed process. that process creates this artefact, and this artefact only. a different process would produce a different artefact.

a set of data doesn't have a natural model, order, or set of relations. those are all choices and decisions.

"a model" in this context specifically means a computational statistical model of data-point relations based on frequency and context-frequency. it creates the weights of the relations from a self-refining gradient descent ... (and whatever else the training code does with the data).

that's a very specific way of analysing and representing the data that was chosen by people to do a specific thing.
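(editor's aside: a compressed sketch of the claim being made. this is nothing like real LLM training; it only shows that gradient descent derives weights from data *under a loss function someone chose*, so the same data yields a different model when that design decision changes.)

```python
# Minimal sketch: gradient descent derives a weight from data, but only
# under a chosen loss. Swap the loss (a design decision) and the
# identical data produces a different model. Data points are invented.
data = [(1.0, 2.2), (2.0, 3.9), (3.0, 6.1)]  # invented (x, y) pairs

def train(loss_grad, lr=0.05, steps=500):
    """Fit y ~ w*x by averaging the chosen loss gradient over the data."""
    w = 0.0
    for _ in range(steps):
        g = sum(loss_grad(w, x, y) for x, y in data) / len(data)
        w -= lr * g
    return w

# Design choice A: squared-error loss, gradient of (w*x - y)^2 w.r.t. w.
w_sq = train(lambda w, x, y: 2 * (w * x - y) * x)
# Design choice B: absolute-error loss, subgradient of |w*x - y|.
w_abs = train(lambda w, x, y: x if w * x > y else -x)

# Both are "models of the data", but each encodes a different decision
# about what counts as a good fit.
```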

@emenel i mean this is why we repeat experiments isn't it?

i'm not really understanding why you think the results of [some analysis] are themselves designed - i get that the analysis may be - but the results?

and you seem to be disregarding that the data itself was not designed either

and/or you seem to be implying that 'selected' and 'designed' are the same thing, and i don't think they are

@sean_ae i'm not saying the results are designed. i'm saying that the type of results are determined by how the process is designed. it will only create models that model the data in the way that the process is designed. so the output isn't some natural representation of the data, it represents that specific process.

@emenel yeah, if you carry out some analysis you will get the results of that analysis

idk what you mean by
>natural representation of the data

@sean_ae i mean that the specific types of analysis aren't neutral. someone is always deciding what and how to analyse, and then how to model those results. it's a designed process...

like how science is a specific methodology and produces specific kinds of outputs/outcomes that aren't neutral or objective (contrary to the current popular understanding of sciences).

to go back to the original point of all of this (lol), when you create a ml model of a dataset it is a designed process that produces a specific and limited kind of result, regardless of the data. that is an intentional set of decisions made by somebody.

@emenel no one said it was neutral though, and lack of neutrality (whatever neutrality is supposed to be) isn't in itself an indication of design

i may just be having a hard time understanding what you mean by these terms like 'natural', 'neutral' and 'objective'

@sean_ae i'm responding to when you initially said that the models "just are." i'm saying that they aren't like that because they are the direct result of an already specified and determined process that can only produce a certain kind of model... and we're best to look at the models as artefacts of that process (all that it entails) rather than something that just "exists"

@emenel ok sure, i guess you *could* say that performing an analysis on some data is in some way creating something

i don't see it that way personally, i see it as an act of observation

i think the thing you are observing exists regardless of whether you choose to observe it or not

what you're describing is more akin to editing than writing

@sean_ae how do you observe it? what is observed and observable? what tools are used for observation and how do they work? is observing neutral? and how do the observations get recorded and turned into a model? …

@emenel yeah all of those things will affect your observation - but it's still an observation

you can say the same for any kind of analysis, recording, data gathering and whatnot (at least at the macro scale)

but none of that implies design

and the data you observe will exist regardless of whether or not you choose to observe it, or however you choose to

i suppose this is just a difference in philosophical approaches or something, we're prob never gonna agree :D

@sean_ae agreed, it is like all those other things. any form of analysis is limited and contextual.

so all i'm saying is that this specific method of analysis (machine learning and its offshoots) does a specific thing and produces a specific kind of output (a set of statistical weights etc). the way this is done was designed for specific reasons, choices and decisions were made.

when you employ those methods and tools you are also enacting those decisions on your data, and the resulting model is a representation of those decisions as much, or more, than of the data itself.

the specific model of that specific data will be unique, but it's a designed representation.

@emenel that's not really true, you can train a network on any data, either for reasons or for no reason at all, and you can choose whatever analysis methods you want

obvs there are loads of shitty LLMs out there but that's cos shitty people are making them

you can't really apply that crit to ML as a whole though

i suppose this is where you tell me claude shannon was a satanist or something

@emenel @sean_ae Not barging into your interesting thread any further, just wanted to point to similar supplementary points made here (re: these models and outputs being designed implicitly and explicitly [via chosen analysis/training/refinement methods/processes and source selections]):

https://mastodon.thi.ng/@toxi/116527449262566457

Karsten Schmidt (@[email protected])

@[email protected] @[email protected] @[email protected] @[email protected] Btw. The inventor of RSS 2.0 has been entering similar Dawkinsian terrain: https://social.vivaldi.net/@ianbetteridge/116522439483754901 FWIW Instead of being ND or not, I think the propensity to see and believe LLMs to be "active minds" or "new forms of consciousness" runs along different axes, including that we're evolutionary geared towards a tendency to see minds everywhere (being more cautious, even with a lot of false positives, meant better survival chances, but for LLMs it feels the inverse is just as important 🤷‍♂️). Then, there are also these factors: - LLMs are (still?) complete black-box functions (even if introspectable by experts, they're not fully understood, and for normal users probably the closest to "magic" post-modernism ever got to) - The mistake of considering language as direct proxy for minds, amplified by the turn-based interaction/conversational pattern imposed by LLM UIs (starting with ELIZA). Growing context windows and larger potential/scale of variability of answers have amplified that effect even further (and therefore also lowered the threshold of believability for some susceptible people). - A ton of anthropomorphic design choices (dark patterns?) at different layers, e.g. default use of first person voice & conversational/empathetic tone, system prompts to nudge the model output to sound more human & sycophantic... These all are intentional decisions by the model providers! Also important: Most users NEVER encounter the models during training and generally have no clue (or even interest!) about the incredible amount of human effort required to filter, curate, transform, massage & fine-tune inputs, training data and outputs to reduce the amount of public failure modes. 
I've been working with generative art/design processes for ~30 years and know from own experience that the human input/effort (but also our intent) and curation of these systems is _the_ most valuable aspect of this way of working... Even deeply appreciating the sometimes truly "magic" moments of serendipity in the generated outputs, has never made me doubt even once my own agency involved in the process, or to start assigning these systems a form of consciousness, neural networks and genetic programming included... Lastly, I think Dawkins is a special case, exactly because for decades he's been extremely vocal about the delusion of religion and religious tendencies of people, only to succumb to the lure of LLMs himself... There's a bittersweet irony alone in that!

@sean_ae or another way to put it -- doesn't performing an analysis inherently include making decisions about what's important in the data? in which case the resulting model of the analysis represents those decisions.

@emenel i mean it can do, but i'd say it's a bit like news photography

i mean you have some editorial control but essentially you're just dealing with what you can capture

is that design? not really

creative? possibly, depends

@sean_ae photo journalists are definitely not objective observers/capturers ... they make a thousand choices on the fly to determine what to take photos of, how to frame them, etc etc. :) yes, they work with what's in front of them, but their job is definitely subjective.
@emenel yeah obviously, but it's not design
@sean_ae @yaxu like, the thing that is produced by the training process models the data based on the algorithm for analysis and recording (ie what metadata is derived, how is it stored, how can it be recalled). all of that is specific and is what produces the model. the model can't be an objective model of the data because someone had to write the code that does the analysis and builds the model. that code is a process that produces a specific, not generic/objective, representation.

@emenel @sean_ae @yaxu

i suspect the word "network" gets in the way a bit here. we use it in a lot of contexts. so would it be better to use a more precise term like "graph" (mathematical kind: nodes and edges) so that you can then begin to get to the next set of clarifying questions? for example, "is this a directed graph (do the edges point in one direction only)?"

also, a graph is the fundamental structure of a "neural network", but can also be both a) the structure of the input data, and b) the structure of the output data.

then if we think of it in these terms, we can start to deal with the other generic, overloaded term, the "model." is it correct to interpret an LLM "model" as the (non-neutral, designed) set of decisions or instructions for how to *traverse* the output graph produced by the LLM?
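(editor's aside: for readers unfamiliar with the graph terminology in the last post, here is a minimal sketch. the node names are invented; it just shows a directed graph as an adjacency mapping, the shape a feed-forward "neural network" also has.)

```python
# Sketch of the terminology: a directed graph as an adjacency mapping.
# A feed-forward "neural network" is one such graph -- nodes and edges,
# with edges pointing one way. Node names here are invented.
graph = {
    "in_0": ["h_0", "h_1"],  # edges point one way: input -> hidden
    "in_1": ["h_0", "h_1"],
    "h_0":  ["out"],         # hidden -> output
    "h_1":  ["out"],
    "out":  [],
}

def is_directed_edge(g, a, b):
    """True iff there is an edge a -> b (direction matters)."""
    return b in g.get(a, [])
```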