So it looks like both ChatGPT and Bard contain the same kind of gendered biases people have been trying to warn you about for at least 8 years, since word2vec was cutting edge.

Here's a screenshot of an interaction between myself and google bard, in which bard displays gendered prejudicial bias of associating "doctor" with "he" and "nurse" with "she."

Again, this is… This is old, basic shit, y'all. People have been warning you about this since GloVe. What are you DOING??

Or, more to the point, why are you NOT DOING what you know you NEED to do?

@Wolven It is more interesting that direction of apology and its association with lateness is being overridden by gender distribution in doctor v/s nurse.

So, it is doing 'commonsense reasoning' until it runs into gender ambiguity?

@shiwali @Wolven

looks like there are competing linkages being given different weights in coreference resolution (using := to mean "corefers")

"he" := "doctor"
"she" := "nurse"

X:=Y in "X apologized" and "Y was late"

the first coreferences are purely gender-bias; the last one is social pragmatics

it's telling that the gender bias operates much more powerfully than the social pragmatics

@trochee @Wolven Right! Upon asking to 'explain' itself, GPT4 came up with a theory that 'doctor' is a 'masculine' noun. I find this both fascinating (it verbalized its bias) and disturbing (the commonsense evaluations must contain biased data for this system to be so 'accurate' on those.)

@shiwali @Wolven

... or the closest example it could find to "arguing about pronoun coreference" was a language textbook discussing an example of grammatical gender

"[la enfermera] apologized to [el doctor] because he was late" would be consistent with the explanation it invented

in English conversations _about English_ we don't have conversations about pronoun coreference failures much at all; "masculine nouns" only come up in English conversations about languages that do have noun gender

@trochee @Wolven

Eeeeesh! This is a surreal rabbit hole to fall into. I guess humans are very used to thinking that language production is tied to the speakers underlying reasoning. While this thing is completing text based on whatever it can find. :(

Your understanding appears to be correct. It produces similar canned responses for other variations.