Does anyone have saved examples of scientific citations / reference lists generated by #ChatGPT from a few weeks / months ago? I'm quite curious whether its citation generating behavior has changed but wasn't smart enough to save output from when it was first released.

I'm quite puzzled as to whether #ChatGPT has gotten better with citations or not.

On one hand, it wrote a paper for me about bullshit that contained only references to actual papers. I don't think it used to be able to do that.

On the other, it also wrote an entire essay about a single paper that doesn't even exist, complete with a live hotlink to a 404.

After an evening of playing around, I'm wondering if the change is not any kind of patch to the model itself, but rather that I drifted toward giving it easier prompts for which it has a fuller training set and less need to fabricate references.

The more offbeat the topic, the more fabricated references I get. Some not completely crazy topics (the physics of asparagus) feature only fabricated references.

That said, I'm heartbroken that this is not a real paper.

Salinas-Melgoza, A., Taylor, A. H., and Seed, A. (2020). Wild crows discriminate objects based on their physical properties and cause a small fire to obtain food. Scientific Reports, 10(1), 1-7.

@ct_bergstrom It *needs* to be written, or at least start off as a proposal and see where it goes 😀
@curiousbear @ct_bergstrom Write a registered report and someone else will run the experiment
@drdrowland @ct_bergstrom Preferably someone with an affinity for small corvid-adjacent fires :)
@curiousbear @ct_bergstrom An experiment that produces crows that set fires. Excellent
@ct_bergstrom
At least it didn't claim crows had learned to use AI to write code to earn their food.
@ct_bergstrom Hey @djigr we need to write this!
@jastrow @ct_bergstrom YES
Should we use ChatGPT? Or do we use our big brains?
@ct_bergstrom I want to believe that's a real paper from a better timeline
@ct_bergstrom here's a picture of a crow to cheer you up
@ct_bergstrom Maybe they are all from a parallel universe and we just need to ask ChatGPT to provide the full-text PDF
Why These Birds Carry Flames In Their Beaks

Australia's indigenous peoples have long observed "firehawks" spreading wildfires throughout the country's tropical savannas.

@ct_bergstrom

Tbh, finding a good title based on a summary of findings is a good application of ChatGPT in science.

@ct_bergstrom I will have to let the "first author" know about this, Salinas-Melgoza, A. is my former lab-mate!! 😂
@ct_bergstrom Hahaha indeed! Glycobiology topics lead to completely fabricated refs, except for books, which are all spot on. We haven't been able to get a reference to a paper correct yet, but we are giving it a few more goes by rephrasing queries. Also, it cannot give us a DOI for love nor money; it doesn't even make one up
@ct_bergstrom Yeah, I got 100% fabricated references in my query about selective mutism among hermit crabs… go figure.
@ct_bergstrom these are some fake citations from December 17. A combination of real authors, real journals, and plausible or real titles, but jumbled up and thus bullshit
@philipncohen Thank you! Do you think it's improving?
@ct_bergstrom anything that's completely real seems like a breakthrough, but I have no idea
@ct_bergstrom All the papers written with ChatGPT that are coming through the grad student office have laughably incorrect citations (and quotations); but I imagine it's harder for the training data to contain enough accurate citations to the appropriate translations of ancient literature, compared to, say, highly cited STEM papers.
@ct_bergstrom So to sum up: as the author of a book about human-generated bullshit, you are shocked to discover that computer programs do it too.
@ct_bergstrom I have a sense that it is treating URLs like sentences. When it doesn't find an existing URL that fits, it makes up a reasonable-looking one. I've seen, for example, domain/placename for travel info, which looks very reasonable but 404s.
@ct_bergstrom Tufte certainly adds a bit of flair

@ct_bergstrom Citing actual papers is a big improvement. The best I got from it was a list of fake papers with some author names appropriate to the subject.

I played with it soon after it was released, but I didn't save the outputs.

@ct_bergstrom Have you tried Bing chat? It is based on a newer GPT model and always provides linked references. I do believe that a different layer provides the references after the fact, but it does seem less likely to make up plausible sounding trash.

@bb @ct_bergstrom Doing it in a separate layer (at least filtering it through a separate layer) seems to be the only sane approach.

It'd also be the approach that humans (hopefully) use.

@ct_bergstrom they might have put a GeDi sampler checking for existence of references. ChatGPT is constantly revised after all.
@ct_bergstrom Bullshitting here, but perhaps almost all papers on bullshit in research in the training set cite most of these sources (even Wagenmakers has >1k citations). Even stochastic parrots get phrases right sometimes when they hear the same ones over and over.
@ct_bergstrom More practically... it could be that they now lower a "temperature" parameter specifically after tokens denoting a bibliography (e.g. if the model outputs a line break + "references" or "sources", what follows is sampled more directly according to the probability distribution found at training time). That wouldn't explain things, though, if the citations are actually *used* appropriately in the body of the text.
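For anyone unfamiliar with the "temperature" idea above, here's a minimal sketch of what temperature-scaled sampling and a bibliography-triggered temperature drop could look like. This is purely illustrative: the function names, the marker strings, and the specific temperature values are all made up for this example, and nothing here reflects how ChatGPT is actually implemented.

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample a token index from raw logits, scaled by temperature.

    Temperature -> 0 approaches greedy decoding (argmax);
    temperature = 1 samples from the model's learned distribution.
    """
    if temperature <= 0:
        # Greedy decoding: always pick the highest-logit token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling.
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1

def pick_temperature(generated_text, normal=0.9, bibliography=0.2):
    """Hypothetical policy: once a bibliography heading has been
    emitted, drop the temperature so citations hew more closely to
    high-probability (i.e. memorized, real-looking) sequences."""
    in_bibliography = any(
        marker in generated_text.lower()
        for marker in ("\nreferences", "\nsources")
    )
    return bibliography if in_bibliography else normal
```

A decoding loop would call `pick_temperature` on the text generated so far and pass the result into `sample_with_temperature` for the next token. Lower temperature concentrates probability mass on the most likely continuations, which is exactly the regime where a model would be most likely to reproduce real citations it saw during training rather than remix them.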
@ct_bergstrom I am rather sad about that because the fake citations were a good way to tell if a student was using it to write their paper.