Microsoft releases Florence 2 Vision models that can outperform larger specialist models

Microsoft has released Florence 2, a set of prompt-based vision models designed for computer vision and image processing tasks such as image description, object recognition, localization, and segmentation. According to Microsoft, Florence 2 can outperform larger and more specialized vision models on some tasks. To train Florence 2, Microsoft created the FLD-5B dataset, which contains 5.4 billion annotations for 126 million images. The models come in two sizes, with 0.23 billion and 0.77 billion parameters, and are available on Hugging Face under the MIT license, which permits commercial use.
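To put the dataset figures in perspective, the reported numbers imply an average of roughly 43 annotations per image. The short Python sketch below only reproduces that arithmetic from the rounded figures given above. The constant and function names are illustrative and do not come from Microsoft or from the dataset itself.

```python
# Rounded figures as reported for the FLD-5B dataset.
# Names below are illustrative, not part of any Microsoft API.
ANNOTATIONS = 5_400_000_000  # 5.4 billion annotations
IMAGES = 126_000_000         # 126 million images


def annotations_per_image(annotations: int, images: int) -> float:
    """Return the average number of annotations per image."""
    return annotations / images


if __name__ == "__main__":
    average = annotations_per_image(ANNOTATIONS, IMAGES)
    print(f"About {average:.1f} annotations per image on average")
```

Because the inputs are rounded, the result is an approximation of the annotation density, not an exact dataset statistic.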
