Had a lot of fun with my stats students today. I gave them two data sets: one from a random number generator, the other one I made up that was not random but designed to look random. They were able to figure out which one was fake.

Then we had ChatGPT make the same kind of data set (100 random numbers from 1 to 6) and it had the same problems as my fake set, but in a different way.
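
The thread doesn't say which checks the class used, so this is only a rough sketch of two common ones, assuming the data is a list of die rolls: a chi-square statistic against a fair die, and the fraction of immediate repeats. The function names and the `too_tidy` data set are made up for illustration. Hand-made "random" data often has counts that are too even and too few back-to-back repeats.

```python
import random
from collections import Counter

def chi_square_uniform(rolls, sides=6):
    """Chi-square statistic against a fair die (equal expected counts)."""
    expected = len(rolls) / sides
    counts = Counter(rolls)
    return sum((counts.get(face, 0) - expected) ** 2 / expected
               for face in range(1, sides + 1))

def repeat_fraction(rolls):
    """Fraction of adjacent pairs where the same face shows twice in a row."""
    pairs = list(zip(rolls, rolls[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)

rng = random.Random(0)  # fixed seed so the run is repeatable
genuine = [rng.randint(1, 6) for _ in range(100)]

# Hypothetical "designed to look random" set: near-even counts, no repeats.
too_tidy = [(i % 6) + 1 for i in range(100)]

for name, rolls in [("genuine", genuine), ("too_tidy", too_tidy)]:
    print(name, round(chi_square_uniform(rolls), 2),
          round(repeat_fraction(rolls), 3))
```

For a fair die the chi-square statistic (5 degrees of freedom) averages around 5 and about 1 in 6 adjacent pairs repeat. The `too_tidy` set scores about 0.08 and 0.0, which is suspiciously neat.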

We talked about the study on AI-generated passwords.

"Why don't you just load a library to find the mean and SD?"

Because I'M OLD. I like to write my own function. I do it for integration sometimes... kids these days.
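
For anyone curious what "write my own function" might look like here, a minimal sketch in Python. The post doesn't show the actual code or say whether it uses the sample or population formula. This assumes the sample SD (dividing by n - 1), and the function names are made up.

```python
import math

def mean(values):
    """Arithmetic mean: sum divided by count."""
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation, two-pass: find the mean, then the spread."""
    m = mean(values)
    squared_dev = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared_dev / (len(values) - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(data))       # 5.0
print(sample_sd(data))  # square root of 32/7
```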

@futurebird Faster than finding a library and RTFM too.

@futurebird

And if you keep using someone else's functions, you'll never truly understand the underlying math.

So many nights spent poring over books and trying out code to learn the what and how... 😵‍💫😅

@futurebird
When I was a kid, we solved integrals in the snow and rain uphill in both directions.
@ohmu @futurebird LOL 42 and 73 are my picks for "random" numbers out of the LLMs, for now.
@ai6yr @ohmu @futurebird wait so... is that the ultimate question? "What number will an LLM always include when generating random numbers?"
@meuwese @ohmu @futurebird Apparently humans have willed that into existence, yes. LOL. (err... Douglas Adams, precisely)
@futurebird I know how to find the SD and I will use the php-stats library every day of the week and twice on Sunday. I would much rather be able to depend on well-supported community code. (At least until it is all replaced by AI slop)
@ldpm @futurebird
AIUI, there's also the issue that the formulas for mean, and especially stdev, that we learn in school don't work great with the way computers represent floating-point numbers and round them. Hopefully the stats library uses more obscure formulas that take care of that, which is what they call "numerical stability".
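
A small sketch of that point, assuming ordinary 64-bit floats. It compares the one-pass "sum of squares minus square of sum" shortcut with Welford's online algorithm, one well-known stable alternative. Whether any particular stats library uses Welford is not claimed here, and the function names are illustrative.

```python
def naive_variance(values):
    """One-pass shortcut: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in values:
        n += 1
        total += x
        total_sq += x * x  # huge squares lose their low-order digits
    return (total_sq - total * total / n) / (n - 1)

def welford_variance(values):
    """Welford's algorithm: update a running mean and sum of squared deviations."""
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)

# Small spread sitting on a large offset; the true sample variance is 30.
data = [1e9 + 4.0, 1e9 + 7.0, 1e9 + 13.0, 1e9 + 16.0]
print(naive_variance(data))    # far from 30, and can even be negative
print(welford_variance(data))  # 30.0
```

The shortcut subtracts two numbers near 4e18 to recover a difference of about 90, which is well below the precision a float has left at that size.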

@ldpm

I don't mind using libraries, but it's fun to write my own versions of things just so I know how they work.

When we make projects where we share code I encourage them to use libraries more often. I'm just a grumpy old lady about it sometimes.

@futurebird I found out quickly that the entropy tools from NIST and Fourmilab don’t work well with a data set that’s log2(6) bits per element.
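
For a sense of the numbers: a fair six-sided die carries at most log2(6), about 2.585, bits per roll. A quick sketch of measuring Shannon entropy per roll directly on the symbols follows. This is an illustration with a made-up function name, not how the NIST or Fourmilab tools work internally.

```python
import math
from collections import Counter

def shannon_entropy_bits(symbols):
    """Shannon entropy in bits per symbol, from observed frequencies."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

max_bits = math.log2(6)  # upper bound for a six-sided die, about 2.585

rolls = [1, 2, 3, 4, 5, 6] * 10  # perfectly even counts hit the bound
print(shannon_entropy_bits(rolls), max_bits)
```

Note that frequency-based entropy only looks at counts, so a repeating 1-2-3-4-5-6 pattern scores the maximum even though it is not random at all.
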
@futurebird I assume from this post someone already mentioned statistics from the python standard library?
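
For reference, a short sketch of the standard-library `statistics` module mentioned here. The data values are made up for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

avg = statistics.mean(data)            # arithmetic mean
sample_sd = statistics.stdev(data)     # sample SD (divides by n - 1)
population_sd = statistics.pstdev(data)  # population SD (divides by n)

print(avg, sample_sd, population_sd)
```

The module offers both `stdev` and `pstdev`, so it is worth checking which one matches the formula used in class.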