Summary

Google DeepMind has extended its SynthID AI watermarking technology to text. The company is integrating the feature into its Gemini models and releasing it as an open-source project.


SynthID for Text intervenes directly in the text generation process of large language models (LLMs). These models generate text token by token, where a token can be a single character, a word, or part of a phrase.

As an LLM generates text, it assigns a probability score to each candidate next token based on the preceding words, and tokens with higher scores are more likely to be chosen. SynthID slightly adjusts these probability scores, but, according to Google DeepMind, only where doing so won't compromise the output's quality, accuracy, or creativity.
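To make the idea of adjusting probability scores concrete, here is a minimal Python sketch. It is a generic keyed-bias scheme written for illustration, not Google DeepMind's actual algorithm. The function names, the `key` and `strength` parameters, and the toy probabilities are all assumptions.

```python
import hashlib
import math

def g_value(key: int, context: tuple, token: str) -> int:
    """Hypothetical keyed pseudorandom bit (0 or 1) for a candidate token.

    The bit depends on a secret key and the preceding tokens, so only
    someone who knows the key can recompute it later.
    """
    data = f"{key}|{'|'.join(context)}|{token}".encode()
    return hashlib.sha256(data).digest()[0] & 1

def adjust_probabilities(probs: dict, context: tuple, key: int,
                         strength: float = 0.5) -> dict:
    """Slightly boost tokens whose keyed bit is 1, then renormalize."""
    weighted = {
        tok: p * math.exp(strength * g_value(key, context, tok))
        for tok, p in probs.items()
    }
    total = sum(weighted.values())
    return {tok: w / total for tok, w in weighted.items()}

# Toy next-token distribution after the context "the" (made-up numbers).
probs = {"cat": 0.5, "dog": 0.3, "fish": 0.2}
adjusted = adjust_probabilities(probs, ("the",), key=42)
print(adjusted)
```

With a small `strength`, the adjusted distribution stays close to the original, which is the intuition behind leaving output quality largely untouched. With `strength=0.0`, the distribution is unchanged.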

Google's SynthID for Text manipulates the prediction probabilities for tokens to create an AI text watermark. | Video: Google DeepMind


Google DeepMind explains that this process repeats throughout the generated text. A single sentence may contain ten or more adjusted probability scores, and a full page may contain hundreds. The final pattern of scores, combining the model's word choices with the adjusted probabilities, forms the watermark.
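How such a pattern could be detected can be sketched with a toy simulation. The code below is an illustrative assumption, not SynthID's detector: it uses a hypothetical keyed bit per token, a made-up 50-word vocabulary with uniform probabilities, and a simple average as the score.

```python
import hashlib
import math
import random

def g_value(key, context, token):
    """Hypothetical keyed pseudorandom bit (0 or 1) for a token in context."""
    data = f"{key}|{'|'.join(context)}|{token}".encode()
    return hashlib.sha256(data).digest()[0] & 1

def generate(vocab, length, key, strength, rng):
    """Sample tokens from a toy uniform model; strength=0 means no watermark."""
    tokens = ["<s>"]
    for _ in range(length):
        context = (tokens[-1],)
        # Tokens whose keyed bit is 1 get a higher sampling weight.
        weights = [math.exp(strength * g_value(key, context, tok)) for tok in vocab]
        tokens.append(rng.choices(vocab, weights=weights, k=1)[0])
    return tokens

def watermark_score(tokens, key):
    """Mean keyed bit over the text: about 0.5 by chance, higher if watermarked."""
    bits = [g_value(key, (prev,), tok) for prev, tok in zip(tokens, tokens[1:])]
    return sum(bits) / len(bits)

vocab = [f"w{i}" for i in range(50)]
marked = generate(vocab, 300, key=42, strength=2.0, rng=random.Random(0))
plain = generate(vocab, 300, key=42, strength=0.0, rng=random.Random(1))

print("watermarked:", watermark_score(marked, key=42))
print("unwatermarked:", watermark_score(plain, key=42))
```

No single token reveals the watermark in this sketch. The signal only shows up as a statistical skew across many token choices, and only to a detector that knows the key.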

According to Google DeepMind, the technique works on texts as short as three sentences, and the watermark becomes more robust and accurate as the text gets longer. The method works well across languages, but it is less reliable when AI-generated text has been heavily edited.
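A generic statistical argument suggests why longer texts help. The sketch below assumes a hypothetical detector that averages one keyed bit per token, where each bit is a fair coin flip in unwatermarked text. This is an illustration under those assumptions, not a description of SynthID's scoring.

```python
import math

def z_score(mean_g: float, n_tokens: int) -> float:
    """Standard errors by which the mean keyed bit exceeds the chance level of 0.5.

    Assumes each bit is an independent fair coin flip in unwatermarked text,
    so the mean of n bits has a standard error of 0.5 / sqrt(n).
    """
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    standard_error = 0.5 / math.sqrt(n_tokens)
    return (mean_g - 0.5) / standard_error

# Same per-token skew (mean bit of 0.6), increasing text length.
for n in (25, 100, 400, 1600):
    print(n, round(z_score(0.6, n), 1))
```

Under these assumptions, the same slight skew becomes twice as statistically significant each time the text gets four times longer. Editing the text replaces some watermarked tokens with unmarked ones, which pulls the mean back toward 0.5 and weakens the signal.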

Gemini integration and open-source release

Google DeepMind has integrated SynthID into the Gemini app and website to watermark and identify generated texts. The technology is also available as an open-source project on GitHub, in the Google Responsible Generative AI Toolkit, and on Hugging Face.

Google DeepMind has published a detailed description of the technology in the scientific journal Nature. The company claims SynthID performs better than existing text watermarking systems. Previously, Google DeepMind introduced SynthID for images, voices, and music.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Google DeepMind has extended its SynthID AI watermarking technology to text. The company has integrated the feature into its Gemini models and released it as an open-source project.
  • SynthID for Text works by subtly adjusting token probability scores during text generation. According to Google DeepMind, this creates a watermark pattern without affecting the output's quality or creativity.
  • The technology works across multiple languages but is less reliable with heavily edited text. Google DeepMind has made SynthID available on Hugging Face and as part of its Responsible Generative AI Toolkit.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.