
DeepSeek-Coder-V2: Open-source model beats GPT-4 and Claude Opus

Image: Midjourney prompted by THE DECODER

Key Points

  • DeepSeek-AI has released the open-source language model DeepSeek-Coder-V2, which is designed to keep pace with leading commercial models such as GPT-4, Claude, and Gemini in code generation.
  • DeepSeek-Coder-V2 supports 338 programming languages, can handle contexts of up to 128,000 tokens, and has been trained on a total of 10.2 trillion tokens; the additional training corpus consists of 60 percent source code, 10 percent mathematical data, and 30 percent natural language.
  • In benchmarks for code generation, mathematics, and language, DeepSeek-Coder-V2 matches the best commercial models and in some cases exceeds them. It is available for download under an open-source license and can be used for both research and commercial purposes.

The AI company DeepSeek-AI has released the open-source language model DeepSeek-Coder-V2. It aims to compete with leading commercial models such as GPT-4, Claude, and Gemini in code generation capabilities.

DeepSeek-Coder-V2 builds on the previous DeepSeek-V2 model and has been further trained on an additional 6 trillion tokens from a high-quality, multi-source corpus. The model now supports 338 programming languages, up from 86, and can process contexts of up to 128,000 tokens, up from 16,000.

This additional training corpus consists of 60 percent source code, 10 percent mathematical data, and 30 percent natural language. The code portion contains 1.17 trillion tokens from GitHub and CommonCrawl, while the mathematical portion includes 221 billion tokens from CommonCrawl.

DeepSeek-Coder-V2 uses a Mixture-of-Experts architecture and comes in two variants: the 16-billion-parameter model activates only 2.4 billion parameters per token, while the 236-billion-parameter model activates just 21 billion. Both versions have been trained on a total of 10.2 trillion tokens.
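
To illustrate what the "active parameter" figures mean, here is a minimal, generic Mixture-of-Experts routing sketch in PyTorch. It is not DeepSeek's implementation, just a demonstration of the principle: a small router picks a few experts per token, so only a fraction of the total weights participates in each forward pass.

```python
import torch
import torch.nn as nn

def moe_forward(x, experts, router, top_k=2):
    """Route each token to its top-k experts and mix their outputs.

    Generic illustration only: because just `top_k` of the experts run
    per token, the number of "active" parameters stays far below the
    total parameter count of the layer.
    """
    scores = torch.softmax(router(x), dim=-1)            # (tokens, num_experts)
    weights, idx = scores.topk(top_k, dim=-1)            # pick top-k experts per token
    weights = weights / weights.sum(dim=-1, keepdim=True)

    out = torch.zeros_like(x)
    for k in range(top_k):
        for e in range(len(experts)):
            mask = idx[:, k] == e                        # tokens routed to expert e
            if mask.any():
                out[mask] += weights[mask, k].unsqueeze(-1) * experts[e](x[mask])
    return out

# Tiny demo: 8 experts in total, but each token only passes through 2 of them.
hidden = 16
experts = nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(8)])
router = nn.Linear(hidden, len(experts))
tokens = torch.randn(4, hidden)
print(moe_forward(tokens, experts, router).shape)        # torch.Size([4, 16])
```
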


DeepSeek-Coder-V2 breaks the dominance of closed models

In code benchmarks such as HumanEval and MBPP, DeepSeek-Coder-V2 keeps up with the best commercial models, according to DeepSeek-AI. The 236-billion-parameter version achieved an average score of 75.3 percent, slightly below GPT-4o's 76.4 percent but ahead of GPT-4 and Claude 3 Opus.

In mathematical benchmarks such as GSM8K, MATH, and AIME, DeepSeek-Coder-V2 is on par with the leading commercial models. In language tasks, it performs similarly to its predecessor, DeepSeek-V2.

Image: DeepSeek

The DeepSeek-Coder-V2 model is available for download on Hugging Face under an open-source license. It can be used for both research and commercial purposes without restrictions. It is also accessible via an API.
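
For readers who want to try the model locally, the following is a minimal usage sketch with the Hugging Face transformers library. The repository ID (for the 16-billion-parameter "Lite" instruct variant) and the generation settings are assumptions based on the model's Hugging Face listing and the standard transformers API; check the official model card for the recommended setup.

```python
# Minimal sketch: loading a DeepSeek-Coder-V2 instruct variant with transformers.
# The repo ID below is an assumption; consult the model card on Hugging Face.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Write a quicksort function in Python."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```
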

Despite the impressive results, the developers see room for improvement in the model's ability to follow instructions. This is crucial for handling complex real-world programming scenarios, an area DeepSeek-AI plans to address in future work.



Source: GitHub (Paper)