
Microsoft's Bing team open-sources "Harrier" embedding model

Microsoft's Bing team (yes, really) has released "Harrier," an open-source embedding model. Harrier supports more than 100 languages, offers a 32,000-token context window, and was trained on over two billion examples plus synthetic data from GPT-5. According to the team, Harrier takes the top spot on the multilingual MTEB v2 benchmark and outperforms proprietary models from OpenAI and Amazon.

| Rank (Borda) | Model | Zero-shot | Active Params (B) | Total Params (B) | Embedding Dim | Max Tokens |
|---|---|---|---|---|---|---|
| 1 | harrier-oss-v1-27b | 78% | 25.6 | 27.0 | 5376 | 131072 |
| 2 | KaLM-Embedding-Gemma3-12B-2511 | 73% | 10.8 | 11.8 | 3840 | 32768 |
| 3 | llama-embed-nemotron-8b | 99% | 7.0 | 7.5 | 4096 | 32768 |
| 4 | Qwen3-Embedding-8B | 99% | 6.9 | 7.6 | 4096 | 32768 |
| 5 | gemini-embedding-001 | 99% | | | 3072 | 2048 |
| 6 | Qwen3-Embedding-4B | 99% | 3.6 | 4.0 | 2560 | 32768 |
| 7 | Octen-Embedding-8B | 99% | 6.9 | 7.6 | 4096 | 32768 |
| 8 | F2LLM-v2-14B | 88% | 13.2 | 14.0 | 5120 | 40960 |
| 9 | F2LLM-v2-8B | 88% | 6.9 | 7.6 | 4096 | 40960 |
| 10 | harrier-oss-v1-0.6b | 78% | 0.440 | 0.596 | 1024 | 32768 |

Alongside the full 27-billion-parameter model, the team released two smaller variants—0.6B and 270M—designed to run on less powerful hardware. All three models are available on Hugging Face under the MIT license. Going forward, the team plans to integrate the technology into Bing and into new grounding services for AI agents.

Embedding models handle the searching, retrieval, and organization of the information AI systems need to produce accurate answers. According to Microsoft, they're becoming increasingly critical as AI agents independently take on more complex, multi-step tasks.
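In practice, retrieval with an embedding model works by mapping a query and each document to vectors, then ranking documents by vector similarity. A minimal sketch of that ranking step, using toy 4-dimensional vectors (the values and dimensions here are illustrative placeholders, not actual Harrier embeddings, which range from 1024 to 5376 dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec: np.ndarray, doc_vecs: np.ndarray, top_k: int = 2) -> list:
    """Return indices of the top_k documents most similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:top_k]

# Toy "embeddings" standing in for a real model's output.
docs = np.array([
    [0.9, 0.1, 0.0, 0.0],  # doc 0: close to the query topic
    [0.0, 0.8, 0.2, 0.0],  # doc 1: loosely related
    [0.0, 0.0, 1.0, 0.0],  # doc 2: unrelated
])
query = np.array([0.85, 0.15, 0.05, 0.0])

print(retrieve(query, docs, top_k=2))  # → [0, 1]
```

A production pipeline would replace the toy vectors with the model's actual outputs and an approximate-nearest-neighbor index, but the ranking principle is the same.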

