Content
summary Summary

AI search startup Perplexity sometimes copies media content word-for-word for its "Perplexity Pages" feature and doesn't link to sources well enough. This is legally questionable and points to big problems with its product.

Ad

The new Perplexity Pages let users put together content on a specific topic. But it looks like Perplexity itself is using the tool in a questionable way.

As reported by Forbes and other media outlets, some Perplexity Pages consist of exclusive articles from various publications such as Forbes, CNBC, and Bloomberg. Some posts have verbatim passages from paywalled content and garnered tens of thousands of views.

For instance, a post by Perplexity about former Google CEO Eric Schmidt's secret drone project contains several parts that appear to have been lifted from Forbes, including a graphic. Forbes had reported exclusively on Schmidt's secret activities lately. Perplexity Pages are indexed by Google search and in Google AI overviews, making matters worse for Forbes.

Ad
Ad
Image: John Paczkowski via X

Sources are only shown by small logos that link to the original articles and are easy to miss. The media names are not said in the text itself, Forbes reports.

Perplexity CEO doesn't get journalism or his own product

Aravind Srinivas, CEO of Perplexity, responded on X to the claims made by Forbes editor John Paczkowski. He said the Pages feature still has "rough edges" and will get better over time with feedback.

"We agree with the feedback you've shared that it should be a lot easier to find the contributing sources and highlight them more prominently," Srinivas writes.

At least Perplexity gives credit and cites sources in a clear way, which most other chatbots don't do, Srinivas says, in a strong case of whataboutism.

Sources are more visible in Perplexity's core search product, according to Srinivas. Still, a product like Perplexity - or Google's AI Overviews - is likely to drastically reduce traffic to websites if it succeeds.

Recommendation

Srinivas previously told Forbes that the Web is free and anyone can crawl it. Perplexity is no different from other news sites that cite journalistic primary sources in that sense, he suggests.

"Take journalism where you're writing a new article. What do you do, you say according to the New York Times, you cite others. That's what we are also doing."

This quote suggests that Srinivas does not understand journalism or his own product. Journalistic curation is about people making decisions about which stories to cover and to what extent, creating a diversity of human perspectives that help shape public opinion over time. Deciding not to cover a story is one of those decisions. To imply that Perplexity is part of, or similar to, journalistic creation is like saying that Metacritic is a review platform.

Moreover, no news outlet that quotes others claims to be the only platform with all the answers. A claim Perplexity recently underscored with its "The Know-It-Alls" commercial, which aired during Game 1 of the NBA Finals. Journalism is about plurality, a product like Perplexity could kill it.

Ad
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

News outlets also have to take responsibility for what they publish. Here, Perplexity and others show little drive, instead pointing to the responsibility of sources.

To avoid legal issues, Perplexity could try to rewrite web content more. But in doing so, the company raises the risk of made up info and misinformation.

The fundamental dilemma of AI search engines

The case illustrates a fundamental dilemma of AI search engines, of which misinformation is only a small part: : to be successful, they must make the web they depend on redundant. At the same time, they live off its content. This equation doesn't work.

Even more prominent source citations, as Perplexity is now announcing, won't change the low click rates on the linked pages. It's a band-aid on a gaping wound.

Why should users still click on media sources when the content is already bundled on the AI platform? So far, no chatbot provider has published data on link clicks, and there are likely good reasons for this.

Google's approach to its AI Overviews is even more questionable: Presumably to avoid legal disputes and licensing costs with publishers, the company often copies content from Reddit users or unknown sites where there is little or no pushback. For months, Google has boosted Reddit in search results and invested millions in the platform to prepare for this strategy.

Ad
Ad

Memorable in this context is a recent interview with Google CEO Sundar Pichai, who had no answer when confronted with the fact that a Google AI summary reproduced a website's content almost verbatim. This shows how rushed, destructive and ill-conceived the approach of AI search companies currently is. Speed is all that matters.

OpenAI has recognized this dilemma and is taking a different tack: the company behind ChatGPT is partnering with select publishers. Their content is preferentially displayed and linked in ChatGTP - and paid for by OpenAI.

Despite the licensing, OpenAI's approach is still problematic. Because if the AI search product takes hold, OpenAI will become the arbiter of media diversity.

The company will then decide which media will benefit from partnerships and thus from traffic and ad revenue - and which will miss out and possibly vanish from the market. Publishers will be further disempowered.

Ultimately, lawmakers and courts will have to grapple with these complex issues. One thing is clear: those who technically control access to information hold immense power. Google has demonstrated this in recent years with products like Google News and Google Discover, which have an enormous impact on the behavior of publishers and thus our information ecosystem.

The possible arrival of a chatbot or voice interface that basically turns the whole Internet into a product of Google, Perplexity, Microsoft, or OpenAI makes the problem worse. Policymakers should act before user behavior shifts to a new platform.

Ad
Ad

We need a debate about what a sustainable AI ecosystem can look like, one that values journalistic work rather than exploiting it, and a fair balance of interests between AI companies, the media, and the public.

Right now, AI companies see the web as a rich field to harvest and process as they wish because it supposedly belongs to no one. But this is not true.

Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • For its new "Perplexity Pages" feature, AI startup Perplexity copied content from media outlets such as Forbes, CNBC, and Bloomberg, sometimes verbatim, without properly attributing or linking to the sources.
  • The case highlights a dilemma for AI search engines: to be successful, they must make the web they depend on redundant. Even more prominent attribution, as promised by Perplexity's CEO, would do little to change the low click-through rates to original pages.
  • What is needed is regulation and a debate about what a sustainable AI ecosystem that values journalistic work might look like. Currently, AI companies see the web as a free-for-all to harvest at will.
Sources
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.