Google, in collaboration with marine biologists and AI researchers, has embarked on a groundbreaking project to understand and potentially communicate with dolphins. This venture utilizes a large language model (LLM) called DolphinGemma, trained not for human interaction, but to decipher the complex vocalizations of these intelligent marine mammals. Developed in partnership with the Wild Dolphin Project (WDP) and Georgia Tech researchers, DolphinGemma represents a significant leap in a four-decade-long quest to unlock the secrets of dolphin communication.
A Legacy of Dolphin Research Meets Cutting-Edge AI
Since 1985, the WDP has conducted the world’s longest-running underwater study of Atlantic spotted dolphins (Stenella frontalis) in the Bahamas. Their non-invasive research, involving decades of underwater audio and video recordings linked to individual dolphins, provides a rich dataset of behavioral contexts. This includes courtship rituals, aggressive interactions, and unique “signature whistles” that function as individual identifiers.
This extensive, labeled dataset of vocalizations became the foundation for training DolphinGemma. Built on the same research and technology underpinning Google’s Gemini models, the roughly 400-million-parameter model processes dolphin sounds much as ChatGPT processes human text. DolphinGemma operates on an audio-in, audio-out basis: it listens to a sequence of vocalizations and predicts the sounds likely to follow, effectively learning the statistical structure of dolphin communication.
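The prediction idea can be illustrated with a toy model. Real systems like DolphinGemma first discretize audio into tokens with a neural codec and use a large transformer; the minimal sketch below substitutes invented whistle labels and a simple bigram count to show what “predicting the next sound” means.

```python
from collections import Counter, defaultdict

# Toy illustration of audio-in, audio-out next-sound prediction.
# The token names and sequences are invented for demonstration.
recordings = [
    ["whistle_a", "click", "whistle_b", "click", "whistle_a"],
    ["whistle_a", "click", "whistle_a", "burst", "whistle_b"],
]

# Count which token tends to follow each token (a bigram model).
transitions = defaultdict(Counter)
for seq in recordings:
    for current, nxt in zip(seq, seq[1:]):
        transitions[current][nxt] += 1

def predict_next(token):
    """Return the most frequently observed next sound, or None if unseen."""
    counts = transitions.get(token)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("whistle_a"))  # → "click"
```

A production model replaces the bigram table with learned probabilities over long contexts, but the training objective is the same: given the sounds so far, predict what comes next.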
AI: A Powerful Tool for Understanding Animal Communication
AI is revolutionizing the field of animal communication research. From dog barks to bird songs, large language models can analyze vast amounts of acoustic data, identifying patterns and potential meanings. Last year, researchers used AI to infer emotion, sex, and individual identity from dog barks, demonstrating what the technology can extract from raw recordings.
Cetaceans, including dolphins and whales, are particularly well-suited for AI-driven analysis. Their sophisticated social structures and complex communication systems offer a wealth of data to explore. The distinct clicks and whistles they use are readily recordable, enabling models like DolphinGemma to uncover the underlying “grammar” of their communication. Project CETI, for example, used AI to analyze sperm whale codas, revealing rhythmic patterns and contributing to the development of a whale phonetic alphabet.
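One simple way such rhythmic patterns can be surfaced, sketched under assumed data: represent each coda by its inter-click intervals (ICIs) and flag codas whose rhythms nearly match. The coda names, interval values, and threshold below are all illustrative, not from Project CETI.

```python
from itertools import combinations

# Each coda is summarized as a list of inter-click intervals in
# seconds. These values are invented for illustration only.
codas = {
    "coda_1": [0.20, 0.20, 0.20, 0.60],
    "coda_2": [0.21, 0.19, 0.20, 0.61],
    "coda_3": [0.10, 0.10, 0.10, 0.10],
}

def rhythm_distance(a, b):
    """Sum of absolute differences between two ICI patterns."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Two codas share a rhythm type if their ICI patterns differ by
# less than a (hypothetical) threshold.
THRESHOLD = 0.1
for (n1, icis1), (n2, icis2) in combinations(codas.items(), 2):
    if rhythm_distance(icis1, icis2) < THRESHOLD:
        print(f"{n1} and {n2} share a rhythm type")
```

Grouping recurring rhythm types across thousands of recorded codas is, in spirit, how candidate units of a repertoire get catalogued.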
CHAT: Bridging the Gap Between Humans and Dolphins
DolphinGemma can also generate novel, dolphin-like sound sequences that follow the acoustic patterns it has learned, opening up possibilities for two-way communication with these animals. This interaction relies on Cetacean Hearing Augmentation Telemetry (CHAT), an underwater computer system that generates synthetic dolphin sounds associated with objects the dolphins regularly engage with, such as seagrass and researchers’ scarves.
The goal is for dolphins to learn to mimic these synthetic whistles to request specific items. As our understanding of dolphin vocalizations grows, more natural sounds can be integrated into the system. Deployed on modified smartphones, CHAT aims to establish a basic shared vocabulary between humans and dolphins, facilitating interactions akin to a game of charades.
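At its core, the shared-vocabulary idea pairs each object with a distinct synthetic whistle and then matches a heard mimic to the nearest known whistle. The sketch below is hypothetical: the objects, the two-number whistle summary (start frequency in kHz, duration in seconds), and the values are all invented, not the actual CHAT signal processing.

```python
import math

# Hypothetical whistle vocabulary: object -> (start freq kHz, duration s).
vocabulary = {
    "seagrass": (8.0, 0.4),
    "scarf": (12.0, 0.6),
    "rope": (15.0, 0.3),
}

def identify_request(heard_features):
    """Match a heard whistle to the nearest known synthetic whistle."""
    def distance(obj):
        return math.dist(vocabulary[obj], heard_features)
    return min(vocabulary, key=distance)

# A dolphin's imperfect mimic of the "scarf" whistle still resolves
# to the right object:
print(identify_request((11.5, 0.55)))  # → "scarf"
```

The real system must recognize mimics in noisy open water and in real time, which is where the on-device models and faster processing mentioned below come in; the lookup logic here only conveys the concept.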
Future versions of CHAT will incorporate more advanced processing and algorithms, enabling faster, clearer interactions. However, as this technology evolves, ethical considerations surrounding human interaction with wild dolphins will need careful attention.
A Promising Future for Dolphin Communication Research
Google plans to release DolphinGemma as an open model this summer, empowering researchers studying other dolphin species to utilize this powerful tool. While we are not yet at the point of having a dolphin TED Talk, DolphinGemma represents a significant advancement in our understanding of these remarkable creatures and offers a tantalizing glimpse into the potential of AI to bridge the communication gap between humans and the animal kingdom.