Carl T. Bergstrom @carlbergstrom.com

The only problem is that the citations go to papers that don't actually exist.

jul 7, 2025, 6:45 am • 520 74

Replies

paulduffy1192 @paulduffy1192.bsky.social

Yet!

jul 7, 2025, 6:56 am • 2 0 • view
Dr Kitson Wulf (bird guy) @vessellover.bsky.social

How do you put citations to papers that DON'T EXIST?? That just sounds… so… idiotic?? Like you're actively expecting people not to check your sources, which, the fact it's about OpenAI makes me not too surprised

jul 7, 2025, 8:45 am • 1 0 • view
Rarian Rakista @rakista.bsky.social

Gemini will even talk about fake citations as if they are real for a while.

jul 7, 2025, 11:24 am • 0 0 • view
Christopher Lauer @schmidtlepp.bsky.social

I like to work with AI, but the idea of letting it write something for you instead of just proofreading something for you is just wild.

jul 7, 2025, 6:47 am • 8 0 • view
Skyline @skyline4438.bsky.social

I actually use it in the reverse, to draft tables or summary text and then review it. But I usually have a structured text that I expect it to produce from multiple documents.

jul 7, 2025, 11:16 am • 0 0 • view
randomanybody.bsky.social @randomanybody.bsky.social

The problem being that some people don't see this as a problem 🥺

jul 7, 2025, 8:00 am • 4 0 • view
Alex Rubinsteyn @alexr.bsky.social

Huh, I use o3 + Deep Research almost daily and have yet to get a fake reference. It's been great at finding important papers in a research niche for focused lit reviews. My main problem with it is how credulous it is w.r.t. claimed findings, but as long as I read everything it's citing, it's great

jul 7, 2025, 9:41 am • 1 0 • view
Alex Rubinsteyn @alexr.bsky.social

Out of curiosity, where do those two links go? (I wonder if I miss times where it mis-describes a paper but still links to something useful)

jul 7, 2025, 9:46 am • 0 0 • view
Steven Phelps @evolbrain.bsky.social

💯 I find that the made up citations issue is real, but seems to vary with model and mode. "Deep research" doesn't usually make stuff up, but it does repeatedly cite the one not very good source - a Frontiers article, a Wikipedia page, etc - it happens to have access to on a given topic.

jul 7, 2025, 10:20 am • 1 0 • view
Steven Phelps @evolbrain.bsky.social

The made-up links often go to some other paper, maybe with related authors or a related topic. The imaginary references are often very credible, with authors who work on the topic, real journal names, etc.

jul 7, 2025, 10:24 am • 1 0 • view
Ian Sudbery @iansudbery.bsky.social

I find exactly this using Gemini Deep Research when I feed it lit-review titles I give to students. The references it provides are real, and the links go to the right place. They are on topic, but almost exclusively low-quality papers, and it rarely provides any critical analysis of what it cites.

jul 7, 2025, 6:58 pm • 1 0 • view
Ian Sudbery @iansudbery.bsky.social

But then you find that is also true of the majority of student lit reviews.

jul 7, 2025, 6:59 pm • 1 0 • view
David Navarro @davidbnavarro.bsky.social

When it generates wrong citations, they often look very credible: a reference author, a title in line with what you are looking for, and a reputable journal. Almost too good to be true, which should be the hint that it's fake.

jul 7, 2025, 10:35 am • 0 0 • view
Alex Rubinsteyn @alexr.bsky.social

They’re all links and I have yet to open one that’s not very topical to my query

jul 7, 2025, 12:33 pm • 0 0 • view
Carl T. Bergstrom @carlbergstrom.com

In this case, the link goes to a completely different paper.

jul 7, 2025, 5:55 pm • 1 0 • view
Karla Holmboe @karlaholmboe.bsky.social

Exactly, or they do exist but are mis-cited

jul 7, 2025, 12:13 pm • 0 0 • view
Allosaur @allosaur.bsky.social

They were just saying any old shit five months ago

jul 7, 2025, 7:57 am • 0 0 • view
Tali (NumberW1tch) @numberw1tch.bsky.social

So, I asked ChatGPT about this. They cited a paper by A. Pseudo saying that it can pull accurate citations. I couldn't find the paper referenced but that may just be a skill issue on my part.

jul 7, 2025, 6:55 am • 6 0 • view
Marius Loots @mariusloots.bsky.social

Is it a statistical model of words, or a search engine? If it searches, it's not an AI, or am I missing something?

jul 7, 2025, 7:44 am • 0 0 • view
Carl T. Bergstrom @carlbergstrom.com

letmegooglethat.com?q=openAI+%22...

jul 7, 2025, 7:55 am • 2 0 • view
Steve T PhD @thelasttheorist.bsky.social

"make"

jul 7, 2025, 6:53 am • 4 0 • view
Zahra Fakhraai @zahrafakhraai.bsky.social

Not a problem in their opinion though! If AI reads what AI writes, does citation even matter?

jul 7, 2025, 11:28 pm • 0 0 • view
andyroz Go Green @andyroz.bsky.social

I'm sure there must be other problems with it...?

jul 8, 2025, 12:37 am • 0 0 • view
Paolo De Los Rios @paolodelosrios.bsky.social

To be honest, with the ~$20/month version, I have not found a single invented citation. Often they are not the most relevant ones, but they do exist. Maybe I have just been lucky?

jul 7, 2025, 7:05 am • 2 0 • view
Carl T. Bergstrom @carlbergstrom.com

You've just been lucky. This is Deep Research, the $20/month OpenAI tool specifically designed for doing citation-based research reports.

jul 7, 2025, 7:09 am • 20 0 • view
Paolo De Los Rios @paolodelosrios.bsky.social

I hope it lasts. That said, they are seldom useful because, as I wrote, they are rarely the most relevant ones

jul 7, 2025, 12:28 pm • 0 0 • view
Tom @tomgj.bsky.social

We are truly living in an age of blatant disinformation and rampant misinformation.

jul 7, 2025, 6:47 am • 22 0 • view
Laszlo Sragner @xlaszlo.bsky.social

There are details in here, because RAGs are exactly for this, but most people just yolo-prompt into a vanilla ChatGPT and get surprised when it makes things up.

jul 7, 2025, 7:34 am • 0 0 • view
Carl T. Bergstrom @carlbergstrom.com

This is OpenAI's paid RAG system Deep Research.

jul 7, 2025, 7:42 am • 3 0 • view
Laszlo Sragner @xlaszlo.bsky.social

And it is still making up references?

jul 7, 2025, 8:09 am • 0 0 • view
707Kat @707kat.bsky.social

Yes it still hallucinates. From the article.

While Deep Research is based on a reasoning model, and not an LLM, it still uses a language model to work with the input, and generate the output text. OpenAI warns that the Deep Research model can still hallucinate and make up facts, so it's still better to keep an eye on the research output, and not to trust it blindly.

jul 7, 2025, 8:41 am • 2 0 • view
Laszlo Sragner @xlaszlo.bsky.social

If they have a db with all the processed articles, it would be pretty trivial to double-check whether it hallucinated (run a query and feed the result to the AI again, iterating until conclusion), so I am not sure why they don't do this.

jul 7, 2025, 9:50 am • 0 0 • view
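The double-checking loop described in the post above can be sketched in a few lines. This is a hypothetical illustration only, not OpenAI's actual pipeline; the in-memory "database", the function name, and the DOIs are all invented for the example:

```python
# Hypothetical sketch of the loop described above: after the model emits
# citations, look each one up in a bounded database of known papers and
# separate verified hits from likely hallucinations. The misses could then
# be fed back to the model to re-search or drop, iterating until none remain.

# Stand-in for a real article database keyed by a unique ID such as a DOI.
KNOWN_PAPERS = {
    "10.1000/real.1": "A real paper on citation accuracy",
}

def verify_citations(citations):
    """Split generated citations into verified hits and likely hallucinations."""
    verified, suspect = [], []
    for c in citations:
        (verified if c["doi"] in KNOWN_PAPERS else suspect).append(c)
    return verified, suspect

generated = [
    {"doi": "10.1000/real.1", "title": "A real paper on citation accuracy"},
    {"doi": "10.1000/fake.9", "title": "Plausible but nonexistent"},
]
verified, suspect = verify_citations(generated)
# `suspect` is the list that would be sent back to the model for correction.
```

In practice the lookup would hit a real bibliographic index rather than a dict, but the control flow (generate, check against a bounded store, iterate on the misses) is the same idea the thread is discussing.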
707Kat @707kat.bsky.social

1/ Most likely because it could be used as evidence of their piracy and make them further liable in lawsuits. Currently the NYT is suing them for training on their whole body of work, and in one of their latest acts they are asking OpenAI to keep user prompts. Terrible for the privacy of users, but-

jul 7, 2025, 9:58 am • 0 0 • view
707Kat @707kat.bsky.social

2/ Consider the two recent lawsuits that wrapped up against Meta and Anthropic, where one of the judges said "the use of purchased material without a license was fair; the piracy wasn't". Having a database of evidence of your possible crime is not in their interest.

jul 7, 2025, 9:58 am • 0 0 • view
Laszlo Sragner @xlaszlo.bsky.social

I would consider this a legitimate and excellent use case for GenAI. (I bet Elsevier has really devious plans on this.) Even if you just collect all content from arXiv (with permission), that would be a huge win. If you can cut time on background research, it would speed up R&D

jul 7, 2025, 10:14 am • 0 0 • view
Laszlo Sragner @xlaszlo.bsky.social

I am trying to find some reference on whether they actually use a RAG (a vector database with all scientific articles processed), but to me it looks like they are doing the research with online queries, which cannot feasibly be complete (you won't have enough recall for professional background research)

jul 7, 2025, 10:15 am • 2 0 • view
Aaron Tay @aarontay.bsky.social

I think academic RAG (e.g. Scopus AI) or academic deep search (like Undermind.ai) search a bounded database via API, with unique IDs, so they can check whether a generated citation exists. OpenAI/Gemini Deep Research do a "live search" of the web, making it harder to verify fake citations.

jul 7, 2025, 11:35 am • 1 0 • view
Carl T. Bergstrom @carlbergstrom.com

Yes. That's the point of my post.

jul 7, 2025, 6:05 pm • 1 0 • view
OG_McDuck @ogmcduck.bsky.social

Don't quibble with a gift horse

jul 7, 2025, 10:17 am • 1 0 • view
Carl T. Bergstrom @carlbergstrom.com

Even if that gift horse is making pretty much every facet of my job worse?

jul 7, 2025, 5:57 pm • 2 0 • view
OG_McDuck @ogmcduck.bsky.social

I was most assuredly not serious about that

jul 7, 2025, 5:59 pm • 2 0 • view