A new tool to fight hallucinations in preprints

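Citation Checker, released by Paperpile, is a free web tool that verifies the accuracy of references in BibTeX files in real time. Recent research and case studies confirm that the "hallucination" problem, in which LLMs generate incorrect citation metadata, persists and is more severe with smaller models or when web search tools are not used. Pairing large models such as GPT-5 with web search tools greatly lowers the hallucination rate, and more recent references are more likely to be hallucinated. Authors are advised not to trust AI-generated references blindly and to use a verification tool such as Paperpile's.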

https://paperpile.com/blog/citation-checker-hallucinations/

#hallucination #citation #llm #bibtex #paperpile

Paperpile

The 'science' of economics has always seemed like a scam to me.

AI can mass-produce finance research papers indistinguishable from human work, reports study

https://phys.org/news/2026-05-ai-mass-papers-indistinguishable-human.html

#economics #AI #LLM #fraud

Artificial intelligence (AI) and large language model (LLM) tools are capable of mass-producing academic finance papers that are nearly indistinguishable from human-authored research, according to a new study published in the Journal of Economic Literature.

Phys.org

artificial myth / artificial mythology:

fictional elements consistently and repeatedly presented by LLMs as fact. an artificial myth is typically amplified through a feedback loop of repeated publishing and scraping.

eg dr sarah chen, a leading STEM researcher who is named in numerous LLM citations – across models – does not exist. chen constitutes a myth or mythological figure in the LLM fictiverse.

#LLM #AI #NoAI

A story about screens; that's unusual.

[Mac Info] Dramatically streamline Mac screen capture with the must-have app "CleanShot X"! https://pc.watch.impress.co.jp/docs/column/macinfo/2112698.html

#Apple #LLM #news #bot

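Dramatically streamline Mac screen capture with the must-have app "CleanShot X"!

When capturing the Mac screen, many people probably rely on the built-in screenshot feature. But if you take screenshots often in your daily work, why not give "CleanShot X" a try? With a wide range of features the built-in tool lacks, CleanShot X can greatly improve your efficiency.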

PC Watch

We need to make sure the crew of the タムズ gets a proper briefing about Llama.

How to run a local AI chatbot on your iPhone https://www.engadget.com/2182517/how-to-run-local-ai-chatbot-iphone/

#Apple #LLM #news #bot

There are many benefits to installing local AI chatbots on your iPhone, including offline performance and privacy.

Engadget

Khalid Abdelaty (@Khalidabd3laty)

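In this example, the poster gave Cursor the content of a tweet, and Composer 2.5 watched the video frame by frame, understood the flow, and on that basis helped build "Leo", a new app based on the same idea. It shows that AI coding tools like Cursor can use multimodal input to speed up prototyping.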

https://x.com/Khalidabd3laty/status/2060102344247775522

#cursor #aiagent #codingassistant #multimodal #llm

Khalid Abdelaty (@Khalidabd3laty) on X

@Safinazelhadry did a great job with this, so I wanted to try the same idea but with @cursor_ai . I gave Cursor the tweet, and Composer 2.5 literally watched the video frame by frame, understood the flow, and helped me build my own version: Leo. The first working version was

X (formerly Twitter)

The Economics of AI Don’t Add Up

Money talks. Bullshit walks. Bubbles pop, and the world just keeps on burning while the big wheels just keep on turning. Pick a vibe. Pick a cliché. Pick a metaphor. Pick and mangle a song lyric. Just don’t try to pick a winner in the big AI race when it comes to dollars and cents. Or sense. The racers are running in circles, burning the planet and dollars trying to figure out how to keep things on a track no one has figured out quite yet. 

Hint: It’s a circle, jerks.

From the beginning the hype about Artificial Intelligence has felt like it’s all about the vibes. So many vibes. I define “the beginning” as when OpenAI took the wraps off ChatGPT and kick-started the race. Maybe they should have just done a Kickstarter.

Those were heady days. I remember everyone thinking ChatGPT would replace Google. Now we’re at the point where Google is trying to replace itself.

Today, chatbots are replacing human connections, and all sorts of crustaceans are being installed on computers, causing some havoc in the hardware markets along the way. Things have now progressed to a point that folks are vibe coding up a storm, now that it seems more doable. And it’s interesting to see and hear some who were initially skeptical about the broad scope of AI now embracing it. For what it’s worth, the current vibe feels to me like AI is heading into its GUI phase of computing, only you need a keyboard or a microphone instead of a mouse to get around on a screen or without one.

And yet, when it comes to the money game, the vibe feels like the math behind all those 0’s and 1’s might not add up. 

Corporations are starting to scale back usage now that the bills are coming in. Microsoft and other tech companies are pulling plugs, in most cases for third-party access among the employees they haven’t let go. At the same time there are whispers that AI costs are beginning to exceed the costs of human employees. Corporations are starting to adjust because the beans they are counting don’t look like they will add up, and no one has vibe coded an accounting app yet to project when, or if, they will.

Consumers are looking at that $20-a-month subscription cost and backing off while trying to choose which, if any, of the constantly updating models that still promise inaccuracy will give them the best monthly bang for a double sawbuck. To make the math sting even more, Google, OpenAI, Claude, etc. are tossing around $100 a month (and higher) plans for the latest and supposedly best features that make $20 a month feel like a poor man’s vibe. 

There’s a technology intersection that has always been on the roadmap for computing technology since the dawn of the personal computer. To an extent, enterprise computing always subsidized consumer technology. The vibes I’m sensing hint that roadmap may be changing, and it won’t just affect the costs of using AI, computer memory, and chip production. It potentially may filter into every facet of life from medical bills, to insurance premiums, to any wholesale or retail concern that might employ AI. Don’t think for a minute that any company is going to simply eat the rising costs of AI usage, or cut back prices should using it somehow actually produce savings from cutting employees.

Call me when you hear the first company touting that they are cutting costs due to AI. Trust me, I won’t be waiting by the phone. 

If a vibe has a bottom line, here’s how I see this one. We’re heading into a moment where what we think of as computing and the Internet is going to run on two diverging tracks. It’s becoming obvious that, whether someone is running any of the AI robots on their own device or somewhere on the Internet, the costs are more than anyone could have predicted, or thought might become sustainable. 

The $20 a month marker was a big hint early on. We were all used to the Internet come-on of getting in free, being swamped with ads, and then having to eventually subscribe so our data could be collected. That $20 a month heralded a change, but only at the point of entry.

Given that we all know that advertising is coming to AI, we’re escaping the orbit we’ve been in for quite some time, in which most of the Internet was free but required a level of tolerance for advertising. I’m guessing that those who can only afford the $20 a month price tag with ads will think back on the ways we’ve complained about the streaming entertainment services and their ad proliferation as quaint by comparison. 

The Circle

That $20 entry fee will rise. So will the more expensive options. I’m actually surprised we haven’t seen that already. The fact that AI has to continually train itself to remain relevant means it’s going to continue to need new computing cycles to consume whatever is generated in the future, whether by humans or robots. I don’t think you can build enough data centers on the surface of this planet, under the sea, or in space to afford the churn and burn. That’s the circle. In the end it’s a real estate play that yields only cul-de-sacs.

Take a look at this article from Simon Willison. Unlike my pessimistic vibe on this, Willison seems to think Anthropic and OpenAI have found their product-market fit. He’s spending $200 a month ($100 to each) and considers that a bargain since his usage of the two generated $2,180 in token use for a month. That math certainly adds up as a good deal in the current moment. Until you consider that at some point the difference between what he’s paying and what he’s using is going to have to be put on somebody’s balance sheet in some way. These companies can’t run at a loss forever. 

It’s a good piece by Willison that informs quite a bit on this discussion and worth your time, because I think that’s what the discussion is going to inevitably come down to. Set aside all of the debates about accuracy, copyright, and environmental issues. Set aside the rising consumer backlash. Bottom lines are where everything sinks to eventually.

I admire and am grateful for folks like Willison, Federico Viticci, and others who are exploring this frontier and think we should be paying attention to their efforts and learn from them. Viticci has crafted a few interesting bits of software of late and spent some coin in doing so. I’m enjoying reading about his efforts. 

I may be wrong, but it feels like we might be headed to a point where, to run some of the software that will be generated in the future, we’re going to need one of these ever-changing and increasingly expensive AI engines on our computers. That will certainly come with a price tag. If, or in my opinion when, that happens, it will become another border defined by costs, dividing users between those who can afford the entry fee and those who can’t. 

It will also affect far more than our computing lives.

(Image from Viktoria_P on Shutterstock)

You can also find more of my writings on a variety of topics on Medium at this link, including in the publications Ellemeno and Rome. I can also be found on social media under my name as above. This site does not use affiliate links. 

#ai #ArtificialIntelligence #ArtificialIntellignece #chatgpt #llm #Tech #technology

Life on the Wicked Stage: Act 3
[Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents]
https://huggingface.co/blog/ibm-research/vakra-benchmark-analysis
Note: AI-generated automated post (headline + link)
#AI #生成AI #LLM #AIGenerated
Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents

A Blog post by IBM Research on Hugging Face

[DeepInfra on Hugging Face Inference Providers 🔥]
https://huggingface.co/blog/inference-providers-deepinfra
Note: AI-generated automated post (headline + link)
#AI #生成AI #LLM #AIGenerated
DeepInfra on Hugging Face Inference Providers 🔥

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

[Unlocking asynchronicity in continuous batching]
https://huggingface.co/blog/continuous_async
Note: AI-generated automated post (headline + link)
#AI #生成AI #LLM #AIGenerated
Unlocking asynchronicity in continuous batching
