From misinformation to liability to economics, there are many unanswered questions surrounding chatbots. One of the most important is how chatbots fit into the content ecosystem. Can they be more than just text parasites?
In an interview with The Verge, Microsoft CEO Satya Nadella assured publishers that Microsoft’s Bing offering would “live or die” based on its ability to drive people to publishers’ content.
More competition in the search market from Microsoft and others could also help publishers get more consistent traffic from different sources. Advertisers could get better prices, Nadella said.
“And so publishers will make more money, advertisers will make more money, and users will have great innovation. Think about what a great day it’ll be,” Nadella said.
Nadella cites “fair use”
The search software category is about fair use, Nadella said. Microsoft wants to make outbound traffic a key performance indicator (KPI), with a target of “100 percent,” he said. Otherwise, publishers would likely block Bing’s crawlers. It is unclear whether Nadella was referring to classic Bing search or only to chat responses in this context.
Nadella went on to say, “In other places, again, it’ll have to be really thought through as to what is the fair use. And then sometimes, I think there’ll be some legal cases that will also have to create precedent.” Nadella added that legal frameworks and financial incentives are necessary.
Microsoft’s Bing Chat implementation still cites other sites as sources, but click-through rates to these sites are likely to be significantly lower than in traditional search. Many businesses rely on search engine traffic, but a drop in that traffic would likely hit publishers the hardest.
Chatbots vs. publishers
Traditional Internet search drives traffic to other websites. The organizations that create and deliver that content – typically publishers, small and large media companies, or independent content creators – live off that traffic.
These content creators are now threatened by chatbot providers such as Microsoft and Google, which could gradually remix parts of that content on their own platforms to keep users there as long as possible and monetize them more effectively.
Content providers would lose revenue in this scenario, even though their content was used to train the AI model, which will likely also be the case for news topics in the future. According to a leak, Microsoft’s Bing chatbot is apparently still based on GPT-3.5 but enriches its answers with current information from Internet sources.
Where does this up-to-date information come from, and how is it generated? At the moment, only Microsoft and OpenAI know. Presumably, the “next-gen AI models optimized for search” aggregate existing sources for news topics and enrich them with context from the 2021 GPT-3.5 model.
The situation is reminiscent of the unauthorized use of Internet images for AI training, which also led to numerous protests. Getty Images is currently suing Stability AI, one of the organizations behind the open-source image AI Stable Diffusion. It could take years to settle the lawsuit.