AI in practice

Feb 1, 2024

Open source Nomic Embed text embedding model outperforms OpenAI's Ada-002

Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.


Nomic AI has released an open-source text embedding model called Nomic Embed that outperforms OpenAI's Ada-002 and text-embedding-3-small models on both short- and long-context tasks. The model is fully reproducible and auditable, and it supports a context length of 8192 tokens. Nomic Embed beat both OpenAI models on the Massive Text Embedding Benchmark (MTEB) and the LoCo Benchmark, but fell short on the Jina Long Context Benchmark. Nomic AI has published the model weights and the full training data for "complete model auditability". Nomic Embed is also available via the Nomic Atlas Embedding API, which includes one million free tokens for production workloads, and via the Nomic Atlas Enterprise offering.


Sources

Nomic Embed Text v1 Nomic AI Technical report

Matthias Bastian
