Google is using AI and Pixel phones to understand what dolphins are saying
Key Points
- Google, along with the Wild Dolphin Project and Georgia Tech, has developed an AI model called DolphinGemma that can analyze dolphin sounds and generate realistic sound sequences to better understand Atlantic spotted dolphin communication.
- The model is based on decades of audio and video recordings of a single dolphin group and uses Google audio technologies, including the SoundStream tokenizer, to digitize and predict sound patterns such as whistles and clicks.
- DolphinGemma is used directly underwater with Google Pixel smartphones and combined with the CHAT system to teach dolphins artificial sounds associated with objects, with the goal of enabling a simple form of human-dolphin communication.
Google has unveiled a new AI model called DolphinGemma, developed in collaboration with the Wild Dolphin Project (WDP) and researchers from Georgia Tech. The project aims to better understand the communication patterns of wild Atlantic spotted dolphins (Stenella frontalis).
The WDP has been studying a dolphin group in the Bahamas for nearly 40 years, building an extensive database of audio and video recordings. This collection contains detailed information about individual dolphins, their sounds, behaviors, and social interactions.
DolphinGemma was trained on this data and uses Google's audio technologies, including the SoundStream tokenizer, to represent dolphin sounds as sequences of discrete tokens. The model can identify, analyze, and even generate realistic sequences of typical sound patterns like whistles, clicks, and burst pulses. It works much like a human language model, predicting the next sounds in a sequence.
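To make that analogy concrete, here is a minimal, self-contained sketch of the audio-language-model idea: quantize audio into discrete tokens, then learn to predict the next token. Everything in it (the energy-based tokenizer, the bigram model, the synthetic "recording") is a toy stand-in for illustration, not DolphinGemma's or SoundStream's actual implementation, which uses a learned neural codec and a transformer.

```python
# Toy sketch of the audio-LM idea behind DolphinGemma (hypothetical, simplified):
# 1) quantize audio frames into discrete tokens (stand-in for the SoundStream tokenizer),
# 2) fit a next-token model over the token sequence,
# 3) generate a continuation, analogous to a text LM predicting the next word.
import numpy as np

rng = np.random.default_rng(0)

def tokenize(audio: np.ndarray, frame: int = 160, n_tokens: int = 32) -> np.ndarray:
    """Toy stand-in for a neural audio tokenizer: bucket each frame's energy
    into one of n_tokens discrete codes. SoundStream does this with a learned
    residual vector quantizer; this is only an illustration."""
    n = len(audio) // frame
    frames = audio[: n * frame].reshape(n, frame)
    energy = np.log1p((frames ** 2).mean(axis=1))
    edges = np.quantile(energy, np.linspace(0, 1, n_tokens + 1)[1:-1])
    return np.digitize(energy, edges)

def fit_bigram(tokens: np.ndarray, n_tokens: int = 32) -> np.ndarray:
    """Count-based next-token model: P(next | current). A real model like
    DolphinGemma would use a transformer, but the objective is the same."""
    counts = np.ones((n_tokens, n_tokens))  # add-one smoothing
    for a, b in zip(tokens[:-1], tokens[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def generate(probs: np.ndarray, start: int, length: int) -> list[int]:
    """Sample a plausible continuation of the token sequence."""
    seq = [start]
    for _ in range(length):
        seq.append(int(rng.choice(len(probs), p=probs[seq[-1]])))
    return seq

# Fake "recording": amplitude-modulated noise standing in for whistles/clicks.
audio = rng.standard_normal(16_000) * (1 + np.sin(np.linspace(0, 40, 16_000)))
tokens = tokenize(audio)
model = fit_bigram(tokens)
print(generate(model, start=int(tokens[0]), length=10))
```

The key point is the training objective: given a history of sound tokens, predict the most likely next token, exactly as a text model predicts the next word.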
Taking DolphinGemma underwater with Pixel smartphones
Researchers are deploying DolphinGemma directly in the field, using Google Pixel smartphones to record and analyze data underwater. In parallel, the team uses the CHAT system (Cetacean Hearing Augmentation Telemetry), which associates specially developed artificial whistles with specific objects like seaweed or play cloths. The goal is for dolphins to learn these sounds and use them to interact with researchers. A Pixel smartphone recognizes in real time which whistle a dolphin imitates and acoustically signals to the diver which object is being requested.
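As an illustration of that matching step, the following sketch compares an incoming whistle contour against a small catalog of artificial whistles and reports the associated object. The object names, the contour representation, and the distance threshold are all invented for this example; CHAT's actual detection pipeline is not public in this form.

```python
# Hypothetical sketch of CHAT's matching step: compare an incoming whistle
# against a catalog of artificial whistles and report the linked object.
# Names, thresholds, and the contour-distance matcher are assumptions,
# not the actual CHAT implementation.
import numpy as np

# Catalog: artificial whistle "templates" (toy pitch contours, Hz) -> object.
CATALOG = {
    "seaweed": np.linspace(8_000, 12_000, 50),   # rising contour
    "scarf": np.linspace(12_000, 8_000, 50),     # falling contour
    "rope": np.full(50, 10_000.0),               # flat contour
}

def match_whistle(contour: np.ndarray, max_dist_hz: float = 500.0) -> str | None:
    """Return the object whose template contour is closest (mean absolute
    frequency distance), or None if nothing is close enough."""
    best, best_dist = None, np.inf
    for obj, template in CATALOG.items():
        dist = np.abs(contour - template).mean()
        if dist < best_dist:
            best, best_dist = obj, dist
    return best if best_dist <= max_dist_hz else None

# A dolphin imitation is never exact; add noise to the "seaweed" whistle.
heard = CATALOG["seaweed"] + np.random.default_rng(1).normal(0, 200, 50)
obj = match_whistle(heard)
if obj:
    print(f"Match: dolphin requested '{obj}' -> play audio cue to diver")
else:
    print("No confident match")
```

In practice the hard part is the front end (detecting and isolating a whistle in noisy open-water audio in real time on a phone), which is exactly where a model like DolphinGemma can help.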
This combination of AI, mobile technology, and long-term field research aims to uncover structure in dolphin vocalizations and eventually enable a simple form of two-way communication between humans and dolphins. Google plans to release DolphinGemma as an open model in summer 2025, allowing other research teams to apply it to the communication of other marine mammals.
DolphinGemma is part of Google's broader effort to apply AI to animal communication research, particularly for marine mammals. As part of its "AI for Social Good" program, Google partnered with NOAA on a whale-detection AI that analyzes hydrophone audio recorded at twelve Pacific locations since 2005. A Google AI model also recently helped identify a mysterious underwater sound, described as "Biotwang," as a previously unknown call of Bryde's whale by matching visual sightings with acoustic recordings.
The Earth Species Project is also working on creating representations for animal communication, both for individual species and across multiple species simultaneously. Their goal includes understanding non-verbal forms of communication such as bee dances.