A Deepmind developer gives a glimpse into Google’s chat AI lab on Twitter, demonstrating a capability that ChatGPT still lacks.
In May, Deepmind introduced Flamingo, a multimodal AI model that combines image processing (Deepmind Perceiver) and language (Deepmind Chinchilla). This visual language model processed images and associated text during training, developing a rudimentary understanding of the subjects in images.
Thanks to the language model and a complementary dialog interface, the Flamingo system can answer questions about the image content, be guided to a more detailed result, or be asked for background information. ChatGPT does not yet have this multimodal capability.
Ice crusher, garlic or potato press
Oriol Vinyals, Research Director and Deep Learning Lead at Google’s sister company Deepmind, now shows an impressive new Flamingo presentation on Twitter.
A visually challenging photo of his father’s used potato press serves as the starting point: Flamingo first guesses an ice crusher and is then guided to the correct result by the user through two rounds of feedback. When asked, Flamingo explains how the potato press works.
Back in May, Deepmind researcher Roman Ring demonstrated Flamingo’s image analysis and dialog capabilities using a special photo (see tweet below). What makes it special is the humor, which is only revealed through contextual and social understanding: President Obama secretly puts his foot on the scale so that it shows a higher weight for the person standing on it. People in the scene laugh.
The photo is also special because Tesla’s former AI chief, Andrej Karpathy, wrote about a decade ago that the AI industry was still “very, very far from” understanding its content. Does Flamingo get the humor?
10 yrs ago @karpathy wrote a blog post on the outlook of AI: https://t.co/bbp5in8tfc in which he describes how difficult it would be for an AI to understand a given photo, concluding "we are very, very far and this depresses me."
Today, our Flamingo steps up to the challenge. pic.twitter.com/JFmrMZTrUw
— Roman Ring (@Inoryy) May 6, 2022
Karpathy called the Flamingo demonstration in May “not exactly convincing, but cute.” He criticized answers that were inaccurate and sometimes incorrect, as well as the strong guidance provided by the questions. Based on the demo, it was not clear whether Flamingo really got the joke, but the system was “clearly on track to,” Karpathy wrote.
Is Google under pressure from ChatGPT?
Some commentators in the press and on social media claim that ChatGPT is a threat to Google’s core Internet search business. The New York Times also reports that Google has declared a “Code Red” over ChatGPT, saying the system poses a fundamental threat to Google’s business. This is despite the fact that ChatGPT currently mostly generates fictitious and generic text or corrects simple code. It is not good at providing reliable answers to questions.
The ChatGPT experience is undoubtedly impressive, but it lacks the factual reliability a Google competitor would need. Even if OpenAI were to achieve high reliability, there would still be many unanswered questions about timeliness, source transparency, copyright, and scaling, as I point out elsewhere.
The recent Flamingo demo shows that Google would probably be able to respond quickly to ChatGPT on a purely technical level. In addition to Flamingo, Deepmind also has a similarly capable chatbot, Sparrow, in development, while Google itself is likely working on Assistant 2.0 with LaMDA. Moreover, Google has the most experience in using language AI for search, as even Meta’s AI chief Yann LeCun writes.
Google is in much better position to bring the latest NLP tech to search than any LLM company is to building a search engine (including OpenAI).
And yes, Google has been doing it for years.
Just as Facebook has been doing it for content ranking. https://t.co/QghhUqeMHo
— Yann LeCun (@ylecun) December 20, 2022
While OpenAI’s ChatGPT may not be a technical challenge for Google, it could still pose an economic threat: Google earns its money from advertising.
A large portion of its revenue, about $39 billion out of about $70 billion in the third quarter of 2022, comes from ads in Google Search. A new form of chatbot search would have to be monetizable at the level of Google Search so that Google does not lose market value.
In this respect, Google might actually feel pressure from ChatGPT: the company could dominate Internet search even in the chatbot era, but still lose a lot of money because it is forced to change its system.
OpenAI is also backed by Microsoft, which has almost nothing to lose in Internet search and can therefore afford to take full risk.