OpenAI Foundry is set to become a new service for enterprises. A leak provides insight into model pricing - and how powerful GPT-4 could become.
OpenAI is launching a new developer product called Foundry, designed for "cutting-edge customers running larger workloads, allowing inference at scale". Foundry is said to give enterprises full control over model configuration and performance profiling. The information comes from screenshots of Foundry's early access program that were shared by a Twitter user.
OpenAI has privately announced a new developer product called Foundry, which enables customers to run OpenAI model inference at scale w/ dedicated capacity.
It also reveals that DV (Davinci; likely GPT-4) will have up to 32k max context length in the public version. pic.twitter.com/5KEsWLqPdc
- Travis Fischer (@transitive_bs) February 21, 2023
Foundry promises static allocation of compute capacity and access to various models and tools that OpenAI itself uses to develop and optimize its own AI models.
OpenAI leak shows a massive leap in next-gen GPT context window
Pricing ranges from $26,000 to $156,000 per month, depending on the model and contract term. The table in the leaked document lists three models: GPT-3.5 Turbo and two variants of DV. GPT-3.5 Turbo presumably corresponds to the Turbo model behind ChatGPT, and DV could stand for Davinci, already the name of the largest variant of GPT-3 and GPT-3.5.
The DV model comes in two versions: one with around 8,000 tokens of context - twice the length of ChatGPT - and one with a massive 32,000 tokens of context. If these numbers are confirmed, it would be a huge leap, suggesting the model could be GPT-4 or a direct precursor. The context length determines how much text a Transformer model like GPT-3 can process as input at once - in ChatGPT's case, the content of the current chat.
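For illustration, here is a minimal sketch of how such limits are measured, using OpenAI's open-source tiktoken tokenizer. That the leaked DV models use the same "cl100k_base" encoding as GPT-3.5 Turbo is our assumption, not something stated in the leak.

```python
# Minimal sketch: counting tokens with OpenAI's open-source tiktoken library.
# "cl100k_base" is the encoding used by the GPT-3.5 Turbo models; whether the
# leaked DV models share it is an assumption.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

text = "OpenAI Foundry promises dedicated capacity for inference at scale."
tokens = encoding.encode(text)

# The prompt plus the model's reply must fit inside the context window,
# e.g. ~8,000 or ~32,000 tokens for the two leaked DV variants.
print(len(tokens))  # tokens this text consumes from the context window
```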
More context could enable new applications for the language models
The largest DV model would thus have eight times the context length of OpenAI's current GPT models and could process roughly 24,000 words - about 50 pages of text - in a single pass, going by the common rule of thumb of about 0.75 English words per token. Such a model could read entire scientific articles, summarize studies, or take on much larger programming tasks. The resulting use cases could make ChatGPT look like an outdated demo.
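As a rough sanity check on those figures, the back-of-the-envelope calculation below uses OpenAI's published rule of thumb of about 0.75 English words per token; the 500-words-per-page figure is our assumption for illustration.

```python
# Back-of-the-envelope arithmetic for the leaked context windows.
# ~0.75 words per token is OpenAI's rule of thumb for English text;
# 500 words per page is an assumed figure for illustration.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

for context_tokens in (4_096, 8_000, 32_000):
    words = context_tokens * WORDS_PER_TOKEN
    pages = words / WORDS_PER_PAGE
    print(f"{context_tokens:>6} tokens ~ {words:>8,.0f} words ~ {pages:>4.0f} pages")
```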
More context could also enable new forms of prompt engineering. For example, in late 2022, researchers at the Université de Montréal and Google Research unveiled "algorithmic prompting", a method that enables large language models to achieve up to ten times lower error rates when solving mathematical and logical tasks. The team developed detailed prompts that describe algorithms for solving the tasks in question. In their paper, they speculate that with longer context lengths, even more extensive algorithmic prompts are possible, which could significantly improve performance in logical reasoning, for example.
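To make the idea concrete, the sketch below shows what an algorithmic prompt might look like for multi-digit addition. The prompt text and the model name are hypothetical illustrations in the spirit of the paper, not examples taken from it.

```python
# Hypothetical illustration of an algorithmic prompt (not from the paper):
# the prompt spells out every step of the column-addition algorithm so the
# model can imitate it on new inputs. Longer context windows would allow far
# more detailed algorithm descriptions and more worked examples.
ALGORITHMIC_PROMPT = """\
Add two numbers digit by digit, from right to left, tracking the carry.

Example: 47 + 85
Step 1: ones digits 7 + 5 = 12. Write 2, carry 1.
Step 2: tens digits 4 + 8 + carry 1 = 13. Write 3, carry 1.
Step 3: no digits left, so write the carry 1.
Answer: 132

Now solve the following the same way, showing every step.
Problem: 68 + 57
"""

# Sketch of sending the prompt with OpenAI's Python client; the model name
# is an assumption - use whichever completion model is available to you.
# import openai
# response = openai.Completion.create(model="text-davinci-003",
#                                     prompt=ALGORITHMIC_PROMPT, max_tokens=200)
# print(response["choices"][0]["text"])
```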
So far, OpenAI has not officially confirmed the plans, but according to the Twitter user, the company has deleted the leaked documents that were previously available on Google Docs.