A new study reveals major problems with how AI search engines handle news citations, even when they have formal agreements with publishers.
Although recent data suggest that nearly 25% of Americans now use AI search engines in place of traditional search tools, these systems often fail at basic source attribution. Researchers at Columbia University's Tow Center for Digital Journalism tested eight AI search engines, including ChatGPT, Perplexity, and Google Gemini, by giving them excerpts from news articles and asking them to identify each article's headline, publisher, publication date, and URL.
The results paint a concerning picture: more than 60% of queries received incorrect answers. Perplexity emerged as the top performer with a 37% error rate, while Grok 3 fared worst, answering 94% of queries incorrectly.

Paid services perform worse than free versions
Surprisingly, paid services like Perplexity Pro and Grok 3 performed worse than their free counterparts. While they attempted to answer more queries, they were more likely to provide incorrect information instead of acknowledging when they didn't know something.

Several systems also ignored publishers' Robots Exclusion Protocol (robots.txt) settings. Perplexity, for example, accessed National Geographic content even though the publisher had explicitly blocked its crawler.
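For readers unfamiliar with the mechanism: a publisher opts out by listing directives in a robots.txt file at its site's root. The two lines below are a minimal illustrative sketch, not National Geographic's actual file; "PerplexityBot" is the user-agent name Perplexity publicly documents for its crawler, and a blanket "Disallow: /" asks that crawler to stay off the entire site.

  User-agent: PerplexityBot
  Disallow: /

Compliance with these directives is voluntary, which is why a crawler can simply ignore the request.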

Publisher agreements don't fix attribution issues
Even formal partnerships between publishers and AI companies haven't resolved the attribution problems. Despite Hearst's agreement with OpenAI, ChatGPT correctly identified only one in ten San Francisco Chronicle articles. Perplexity frequently cited syndicated versions of Texas Tribune articles instead of the originals.
The study found that AI search engines often directed users to syndication platforms like Yahoo News rather than original sources. In more than half of cases, Grok 3 and Google Gemini created URLs that didn't exist.

Time Magazine's COO Mark Howard notes that AI companies are working to improve their systems but cautions against expecting perfect accuracy from current free services: "If anybody as a consumer is right now believing that any of these free products are going to be 100 percent accurate, then shame on them."
A separate BBC study in February identified similar problems with AI assistants handling news queries, including factual errors and poor sourcing.