Google has unveiled a new AI model called DolphinGemma, developed in collaboration with the Wild Dolphin Project (WDP) and researchers from Georgia Tech. The project aims to better understand the communication patterns of wild Atlantic spotted dolphins (Stenella frontalis).
The WDP has been studying a dolphin group in the Bahamas for nearly 40 years, building an extensive database of audio and video recordings. This collection contains detailed information about individual dolphins, their sounds, behaviors, and social interactions.
DolphinGemma was trained on this data and uses Google's SoundStream tokenizer to represent dolphin sounds as sequences of discrete audio tokens. The model can identify, analyze, and even generate realistic sequences of typical sound patterns such as whistles, clicks, and burst pulses. It works much like a human language model, predicting the most likely next sounds in a sequence.
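The overall pattern (tokenize the audio into discrete IDs, then model the token sequence autoregressively) can be sketched in a few lines. The sketch below is purely illustrative: ToyAudioTokenizer and NextTokenModel are hypothetical stand-ins, not SoundStream or the actual Gemma-based model.

```python
# Illustrative sketch of the "tokenize audio, then predict the next token"
# pattern DolphinGemma is described as using. Both classes are hypothetical
# stand-ins, not Google's actual components.
import torch
import torch.nn as nn

class ToyAudioTokenizer:
    """Stand-in for a neural audio tokenizer such as SoundStream:
    maps a raw waveform to a sequence of discrete token IDs."""
    def __init__(self, vocab_size: int = 1024, frame_len: int = 320):
        self.vocab_size = vocab_size
        self.frame_len = frame_len

    def encode(self, waveform: torch.Tensor) -> torch.Tensor:
        # Crude placeholder: quantize the mean energy of each audio frame.
        usable = len(waveform) // self.frame_len * self.frame_len
        frames = waveform[:usable].reshape(-1, self.frame_len)
        energy = frames.abs().mean(dim=1)
        return (energy / energy.max() * (self.vocab_size - 1)).long()

class NextTokenModel(nn.Module):
    """Tiny causal model over audio tokens: embed, apply masked
    self-attention, project back to the vocabulary; the same shape of
    computation as a text language model predicting the next word."""
    def __init__(self, vocab_size: int = 1024, dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        causal_mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        hidden = self.encoder(self.embed(tokens), mask=causal_mask)
        return self.head(hidden)

tokenizer = ToyAudioTokenizer()
model = NextTokenModel()
waveform = torch.randn(16_000)             # one second of placeholder audio
tokens = tokenizer.encode(waveform)[None]  # shape: (batch=1, sequence length)
logits = model(tokens)
predicted_next = logits[0, -1].argmax()    # most likely continuation token
```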
Taking DolphinGemma underwater with Pixel smartphones
Researchers are deploying DolphinGemma directly in the field, using Google Pixel smartphones to record and analyze audio underwater. In parallel, the team uses the CHAT system (Cetacean Hearing Augmentation Telemetry), which associates specially developed artificial whistles with specific objects such as seaweed or play cloths. The goal is for dolphins to learn these sounds and use them to interact with researchers. A Pixel smartphone detects in real time which whistle a dolphin is imitating and signals to the diver, through an audio cue, which object is being requested.
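A rough sketch of such a matching loop follows, assuming the incoming sound is compared against stored whistle templates by its spectral contour. The templates, threshold, and matching method here are all hypothetical stand-ins, not the actual CHAT implementation.

```python
# Conceptual sketch of a CHAT-style matching loop, not the actual Pixel app.
import numpy as np
from scipy.signal import spectrogram

SAMPLE_RATE = 48_000

def synthetic_whistle(f0: float, f1: float, dur: float = 0.5) -> np.ndarray:
    """Generate a linear frequency sweep as a stand-in for an artificial whistle."""
    t = np.linspace(0, dur, int(SAMPLE_RATE * dur), endpoint=False)
    return np.sin(2 * np.pi * (f0 + (f1 - f0) * t / dur / 2) * t)

# Each artificial whistle is paired with an object, as in the CHAT protocol.
TEMPLATES = {
    "seaweed":    synthetic_whistle(8_000, 12_000),
    "play cloth": synthetic_whistle(12_000, 6_000),
}

def contour(signal: np.ndarray) -> np.ndarray:
    """Reduce a sound to its dominant-frequency contour over time."""
    freqs, _, spec = spectrogram(signal, fs=SAMPLE_RATE, nperseg=1024)
    return freqs[spec.argmax(axis=0)]

def classify(recording: np.ndarray, threshold: float = 2_000.0):
    """Compare the recorded contour against each template; report the
    object whose whistle it most closely matches, if close enough."""
    rec = contour(recording)
    best_obj, best_dist = None, np.inf
    for obj, template in TEMPLATES.items():
        tmpl = contour(template)
        n = min(len(rec), len(tmpl))
        dist = np.mean(np.abs(rec[:n] - tmpl[:n]))
        if dist < best_dist:
            best_obj, best_dist = obj, dist
    return best_obj if best_dist < threshold else None

# Simulate a dolphin imitating the "seaweed" whistle, with some noise.
mimic = TEMPLATES["seaweed"] + 0.1 * np.random.randn(len(TEMPLATES["seaweed"]))
match = classify(mimic)
if match:
    print(f"Cue diver: dolphin requested '{match}'")  # played as audio in the field
```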
This combination of AI, mobile technology, and long-term field research aims to uncover recurring structure in dolphin vocalizations and eventually enable a basic form of two-way communication between humans and dolphins. Google plans to release DolphinGemma as an open model in summer 2025, allowing other research teams to use it to analyze marine mammal communication.
DolphinGemma is part of Google's broader effort to apply AI to animal communication research, particularly for marine mammals. As part of its "AI for Social Good" program, Google partnered with NOAA to develop a whale-detection AI that analyzes audio from hydrophones recording marine mammal calls at twelve Pacific locations since 2005. A Google AI model also recently helped attribute a mysterious underwater sound, dubbed "Biotwang," to the Bryde's whale as a previously unknown call; the attribution was made by combining visual sightings with acoustic recordings.
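In broad strokes, such detectors treat hydrophone audio as spectrogram images and classify windows of them. The toy CNN below illustrates only that general pattern and is not Google's actual classifier.

```python
# Sketch of spectrogram-based call detection; illustrative only.
import torch
import torch.nn as nn

class CallDetector(nn.Module):
    """Binary classifier over spectrogram windows: whale call vs. background."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, 1),
        )

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        return self.net(spec)  # logit; > 0 means "call present"

detector = CallDetector()
# One fake spectrogram window: (batch, channel, freq bins, time frames).
window = torch.randn(1, 1, 128, 64)
print("call detected:", detector(window).item() > 0)
```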
The Earth Species Project is also working on learned representations (embeddings) of animal communication, both for individual species and across multiple species simultaneously. Its goals include understanding non-vocal forms of communication such as bee dances.
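As a loose illustration of what a shared representation could mean in practice, the following hypothetical sketch (not Earth Species Project code) maps signals from different species into one embedding space.

```python
# Hypothetical sketch of a shared cross-species embedding space.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Maps spectrogram patches from any species into one vector space,
    so that functionally similar signals can land near each other."""
    def __init__(self, freq_bins: int = 128, frames: int = 64, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(freq_bins * frames, 256), nn.ReLU(),
            nn.Linear(256, dim),
        )

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        z = self.net(spec)
        return z / z.norm(dim=-1, keepdim=True)  # unit length for cosine similarity

encoder = SharedEncoder()
dolphin_spec = torch.randn(1, 128, 64)  # placeholder dolphin spectrogram
whale_spec = torch.randn(1, 128, 64)    # placeholder whale spectrogram
similarity = (encoder(dolphin_spec) * encoder(whale_spec)).sum().item()
print(f"cosine similarity in shared space: {similarity:.3f}")
```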