BloombergGPT is a GPT-3 level language model for the financial industry

Image: Midjourney prompted by THE DECODER

Bloomberg has developed a language model specifically for the financial sector. To train the model, the company used its own financial data and augmented it with online text data. The project shows how companies can build domain-specific language models that are more useful for their industry than generic models.

Bloomberg's AI teams first built a dataset of English-language financial documents: 363 billion finance-specific tokens came from its own data assets, and another 345 billion general-purpose tokens came from the public text datasets The Pile, C4, and Wikipedia.

Using 569 billion tokens from this dataset, the team trained the domain-specific "BloombergGPT," a 50-billion-parameter decoder-only language model optimized for financial tasks. The Bloomberg team used the open-source Bloom language model as its base architecture.
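The token figures above can be put in proportion with a little arithmetic. The following is a minimal sketch that uses only the numbers reported in this article; the function and variable names are illustrative, and it says nothing about how Bloomberg actually sampled the data during training.

```python
# Token counts as reported in the article, in billions.
FINANCIAL_TOKENS_B = 363   # from Bloomberg's own data assets
GENERAL_TOKENS_B = 345     # from The Pile, C4, and Wikipedia
TRAINED_TOKENS_B = 569     # tokens used to train the model


def dataset_mix(financial_b, general_b):
    """Return the combined size and each source's share of the corpus."""
    total = financial_b + general_b
    return {
        "total_b": total,
        "financial_share": financial_b / total,
        "general_share": general_b / total,
    }


mix = dataset_mix(FINANCIAL_TOKENS_B, GENERAL_TOKENS_B)
# Share of the combined corpus that the training run consumed.
fraction_used = TRAINED_TOKENS_B / mix["total_b"]

print(f"Combined corpus: {mix['total_b']} billion tokens")
print(f"Financial share: {mix['financial_share']:.1%}")
print(f"General share: {mix['general_share']:.1%}")
print(f"Fraction used in training: {fraction_used:.1%}")
```

By these figures, the combined corpus holds 708 billion tokens, split roughly evenly between financial and general text (about 51.3 to 48.7 percent), and the training run consumed about 80 percent of it.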

On finance-specific tasks, BloombergGPT outperforms popular open-source language models such as GPT-NeoX, OPT, and Bloom. It also outperforms these models on general language tasks such as summarization, in some cases significantly, and is almost on par with GPT-3 according to Bloomberg's benchmarks.

"The quality of machine learning and NLP models comes down to the data you put into them," explained Gideon Mann, Head of Bloomberg’s ML Product and Research team.

BloombergGPT illustrates the value of domain-specific language models

According to Bloomberg, language models can be used in many areas of financial technology, from sentiment analysis of articles (for example, those about individual companies) to automatic entity recognition and answering financial questions. Bloomberg's news division, for instance, can use the model to automatically generate headlines for newsletters.

News headlines generated by BloombergGPT. | Image: Bloomberg

In addition, the model needed only a few examples to formulate queries in Bloomberg's own query language (BQL) to extract data from a database. You can use natural language to tell the model what data you need, and it'll generate the appropriate BQL.
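Few-shot prompting of this kind usually means placing a handful of request/query pairs in front of the new request and letting the model complete the pattern. The sketch below shows one way such a prompt could be assembled. It is an assumption about the general technique, not Bloomberg's implementation: the function name, the `Input:`/`Output:` labels, and the example pairs are hypothetical, and the query strings are illustrative placeholders rather than verified BQL.

```python
# Hypothetical example pairs. The query strings are illustrative
# placeholders and have not been checked against real BQL.
FEW_SHOT_EXAMPLES = [
    (
        "Get me the last price and market cap for Apple",
        "get(px_last, cur_mkt_cap) for(['AAPL US Equity'])",
    ),
    (
        "Tesla price",
        "get(px_last) for(['TSLA US Equity'])",
    ),
]


def build_few_shot_prompt(examples, request):
    """Concatenate worked examples, then the new request with an open slot."""
    parts = [f"Input: {question}\nOutput: {query}" for question, query in examples]
    # The trailing "Output:" is left empty for the model to complete.
    parts.append(f"Input: {request}\nOutput:")
    return "\n\n".join(parts)


prompt = build_few_shot_prompt(
    FEW_SHOT_EXAMPLES, "Get me the last price for Microsoft"
)
print(prompt)
```

The resulting string would be sent to the language model, and whatever it generates after the final `Output:` is treated as the candidate query. In practice, that output should be validated before it is run against a database.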

The language model quickly learned Bloomberg's BQL database language with just a few examples. | Image: Bloomberg

"For all the reasons generative LLMs are attractive – few-shot learning, text generation, conversational systems, etc. – we see tremendous value in having developed the first LLM focused on the financial domain," said Shawn Edwards, Bloomberg’s Chief Technology Officer.

The domain-specific language model, he said, allows Bloomberg to develop many new types of applications and achieve much higher performance than it could with a custom model for each application, all with a faster time to market.
