@yaxu i see people describe LLMs as algorithms quite often but i'm pretty sure that's not the right word
i suppose they can *mimic* algorithms though
@yaxu idk - do they even qualify as technology?
they're weird statistical models, not designed or controlled, they just kind of *are*
@yaxu idk if you can say designed or controlled really, maybe slightly
you could maybe say built
it's not really a set of instructions, more a set of weights (most of which are not being set intentionally unless you only include human-generated data and even that would be a bit of a stretch imo)
not often i end up in a semantic hole, pls forgive me
@sean_ae @yaxu yes. what is it modeling? what relations between data are important? what does “training” mean in detail? those are all choices that change what is being modeled.
models are a human invention… the dictionary is also an llm, it models the alphabetical relation between words and associates them with definitions …
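to make that concrete, a toy sketch (python, made-up entries):

```python
# a "dictionary" as a model: the chosen relation is alphabetical order,
# and each word is associated with a definition. nothing statistical here.
entries = {"cat": "a small domesticated felid", "bat": "a flying mammal"}
for word in sorted(entries):           # the modelled relation: alphabetical
    print(word, "--", entries[word])
```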
@sean_ae i.e., if you wanted an alphabetical model, that system can't do it... the algorithm for "training a model" as it's commonly used these days is a specific type of modelling... it's not objective or representing any "natural" state of the data. it's using the data to produce a very specific representation.
@emenel @yaxu the network is the thing you are training, the model is a result of training that network
i suppose what you are saying is that since networks are designed, some of that design is affecting the quality of the model (and it is, but i would not call the model designed - otherwise they would actually work, and they frequently don't)
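a toy sketch of that distinction (plain python, invented names, not any real framework):

```python
# the *network* here is the designed part: one weight, one bias, and a
# squared-error objective. the *model* is whatever (w, b) falls out of
# training that network on some particular data.
def train(data, steps=1000, lr=0.01):
    w, b = 0.0, 0.0                     # the network: a fixed, designed shape
    for _ in range(steps):
        for x, y in data:
            err = (w * x + b) - y       # designed choice: squared error
            w -= lr * err * x           # gradient step on the weight
            b -= lr * err               # gradient step on the bias
    return w, b                         # the model: a result, not a design

# same designed network, two datasets, two different models:
print(train([(1, 2), (2, 4), (3, 6)]))  # roughly (2.0, 0.0)
print(train([(1, 5), (2, 7), (3, 9)]))  # roughly (2.0, 3.0)
```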
@sean_ae @emenel @yaxu as a librarian, i can give another big example: model it by description. that is, create a metadata model.
it is also very much not objective or neutral (we talk about this a lot in librarianship). but simply labeling and categorizing the pieces of information is another form of modeling that is not necessarily stats oriented. we simply think of this as a model that aids precise info retrieval.
@sean_ae @yaxu you can create a model from any aspect of the data. for text it could be statistical n-gram frequency (ie llms), but could be any other quantifiable aspect of the data.
and there are many ways to make a statistical model as well, whoever made the one you’re using had to make decisions about how to process the data. statistics itself is far from objective.
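a toy n-gram frequency model makes those decisions visible (python, invented names):

```python
# every line embeds a decision: how to tokenise, whether to fold case,
# what counts as context (here n=2). change any of them, different model.
from collections import Counter

def bigram_model(text, fold_case=True):
    tokens = text.lower().split() if fold_case else text.split()  # a decision
    return Counter(zip(tokens, tokens[1:]))                       # a decision

text = "the cat sat on the mat The cat slept"
print(bigram_model(text).most_common(2))                   # one model
print(bigram_model(text, fold_case=False).most_common(2))  # a different model
```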
@sean_ae @yaxu yes, a quantifiable computational model lends itself to statistics. which is also already a non-neutral choice. computation is fundamentally numeric, so at some level you’ll always be working with mathematical relationships.
(we have models of things that are not computational as well, but that’s probably veering into different territory)
@sean_ae @yaxu yeah... so all numeric relations aren’t statistical. but computation lends itself to statistical approaches (it’s basically why it was invented)
you could make a model from other types of data relations (ie not mean or avg or frequency etc) but tbh i would have to put some work into thinking through it.
@sean_ae tbh it’s not the quality, it’s the ontology. what is being modeled and what does the model represent? there’s no neutral or objective answer. someone (group, org, corp, etc) designed a system that models a specific kind of data in a specific way.
any and all modeling is designed…
@emenel but what if the data is arbitrary?
(sorry for repeating myself)
i realise 'arbitrary data' is a very spherical cow here, but just hypothetically, if you had arbitrary data - would you still consider the model to have been designed?
@sean_ae sure, it could be arbitrary data. it could be a list of random numbers ...
the question then is what do you model? how is the model derived from the data?
@emenel well yeah, i guess there are levels of arbitrary
i can work with some pretty arbitrary sound sets but there are always gonna be questions about how or why files ended up in the set - i get that
i just find this idea that [the model is therefore designed] to be a bit weak, but maybe we can agree on 'weakly designed' or something
and i obvs get where you're coming from wrt the bigger LLMs being horribly skewed and manipulated - but to me there's a big difference between manipulating something and designing it
@sean_ae when you make a model with your own sound files, how are they processed into the model? what attributes are modelled? how are those relations represented? what attributes are left out? who made the decisions about how the data becomes a statistical model?
those are the design ... and in this case they've already been done by, i.e., google (tensorflow) or etc.
the type of model that the process produces is a designed artefact, even if the specific contents of the model (weights, nodes, etc) are different for each output.
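a toy illustration (plain python, not tensorflow, made-up numbers): two attribute choices, two different "models" of the same clip:

```python
# which attribute you extract is a design decision made before any training.
def rms(samples):                         # attribute 1: overall energy
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def zero_crossings(samples):              # attribute 2: a rough noisiness proxy
    return sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)

clip = [0.0, 0.5, -0.5, 0.3, -0.2, 0.1]   # stand-in for samples from a file
print(rms(clip))                          # one representation of the clip
print(zero_crossings(clip))               # a completely different one
```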
@emenel well you can tune the weights later on anyway (just ignore the ones you don't want to use)
but yeah you're performing [some analysis] and then working with that data
i mean, what you seem to be saying is that the output of any statistical analysis is designed - and this is kinda like saying the data you get from doing a scientific experiment is designed - and idk if you can really say that
i mean you can say the experiment was designed but the data is just [the data you got from that experiment]
@sean_ae sure... but in this case the output, the model, is the result of a specific designed process. that process creates this artefact, and this artefact only. a different process would produce a different artefact.
a set of data doesn't have a natural model, order, or set of relations. those are all choices and decisions.
"a model" in this context specifically means a computational statistical model of datat-point relations based on frequency and context-frequency. it creates the weights of the relations from a self-refining gradient descent ... (and etc whatever else the training code does with the data).
that's a very specific way of analysing and representing the data that was chosen by people to do a specific thing.
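a toy of just the "context-frequency" part (python, invented names, nothing like a full training run):

```python
# count which tokens appear within a window of each other. window size,
# directionality, tokenisation: all chosen by whoever designed the process.
from collections import defaultdict

def cooccurrence(tokens, window=2):       # window=2 is itself a design choice
    counts = defaultdict(int)
    for i, t in enumerate(tokens):
        for j in range(max(0, i - window), i):
            counts[(tokens[j], t)] += 1   # directed pair within the window
    return counts

toks = "the cat sat on the mat".split()
print(cooccurrence(toks)[("the", "cat")])  # 1
```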
@emenel i mean this is why we repeat experiments isn't it?
i'm not really understanding why you think the results of [some analysis] are themselves designed - i get that the analysis may be - but the results?
and you seem to be disregarding that the data itself was not designed either
and/or you seem to be implying that 'selected' and 'designed' are the same thing, and i don't think they are
@emenel yeah, if you carry out some analysis you will get the results of that analysis
idk what you mean by
>natural representation of the data
@sean_ae i mean that the specific types of analysis aren't neutral. someone is always deciding what and how to analyse, and then how to model those results. it's a designed process...
like how science is a specific methodology and produces specific kinds of outputs/outcomes that aren't neutral or objective (contrary to the current popular understanding of sciences).
to go back to the original point of all of this (lol), when you create a ml model of a dataset it is a designed process that produces a specific and limited kind of result, regardless of the data. that is an intentional set of decisions made by somebody.
@emenel no one said it was neutral though, and lack of neutrality (whatever neutrality is supposed to be) isn't in itself an indication of design
i may just be having a hard time understanding what you mean by these terms like 'natural', 'neutral' and 'objective'
i suspect the word "network" gets in the way a bit here. we use it in a lot of contexts. so would it be better to use a more precise term like "graph" (mathematical kind: nodes and edges) so that you can then begin to get to the next set of clarifying questions? for example, "is this a directed graph (do the edges point in one direction only)?"
also, a graph is the fundamental structure of a "neural network", but can also be both a) the structure of the input data, and b) the structure of the output data.
then if we think of it in these terms, we can start to deal with the other generic, overloaded term, the "model." is it correct to interpret an LLM "model" as the (non-neutral, designed) set of decisions or instructions for how to *traverse* the output graph produced by the LLM?
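a toy of that framing (python, invented names): the graph is one artefact, the traversal rule is a separate set of decisions:

```python
# a directed graph as an adjacency map: edges point one way only.
graph = {"a": ["b", "c"], "b": ["c"], "c": []}

def traverse(g, node, visited=None):
    visited = visited if visited is not None else []
    if node in visited:
        return visited
    visited.append(node)
    for nxt in g[node]:                   # decision: depth-first, stored order
        traverse(g, nxt, visited)
    return visited

print(traverse(graph, "a"))               # ['a', 'b', 'c']
```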