Prompt2Model automates the generation of special-purpose NLP models that, in some cases, can outperform GPT-3.5 Turbo while being up to 700 times smaller.
Researchers at Carnegie Mellon University and Tsinghua University have developed a new system called Prompt2Model that can generate custom language models from prompts. The system aims to make the development of specialist AI models accessible to non-experts. Prompt2Model is not meant to be a GPT-4 alternative, but rather an automated pipeline for special-purpose NLP models that perform a particular task very well, are much smaller than large models, and can therefore run locally on weaker hardware.
The system first decomposes the prompt into a structured specification. It then searches for existing datasets that might be useful for the task at hand and uses OpenAI's GPT-3.5 Turbo to generate additional synthetic training data tailored to it. Finally, it selects an appropriate pre-trained model from Hugging Face and fine-tunes it on the collected data.
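The pipeline steps described above can be sketched roughly as follows. All function and field names here are hypothetical illustrations, not Prompt2Model's actual API; the retrieval, generation, and fine-tuning stages are stubbed out.

```python
# Rough, hypothetical sketch of the Prompt2Model pipeline: decompose the
# prompt, retrieve/generate training data, then pick a model to fine-tune.

def decompose_prompt(prompt: str) -> dict:
    """Split a task prompt into an instruction and optional demonstrations."""
    parts = prompt.split("Examples:", 1)
    instruction = parts[0].replace("Instruction:", "").strip()
    demonstrations = parts[1].strip() if len(parts) > 1 else ""
    return {"instruction": instruction, "demonstrations": demonstrations}

def run_pipeline(prompt: str) -> dict:
    spec = decompose_prompt(prompt)
    # 1. Retrieve existing datasets relevant to the instruction (stubbed here;
    #    a real implementation would search a dataset index).
    retrieved = []
    # 2. Generate synthetic training examples with an LLM such as GPT-3.5 Turbo
    #    (stubbed; a real implementation would call the OpenAI API).
    synthetic = [{"input": "example input", "output": "example output"}]
    # 3. Select a pre-trained model from Hugging Face and fine-tune it on the
    #    combined data (model choice below is purely illustrative).
    model_name = "google/flan-t5-base"
    return {"spec": spec, "train_data": retrieved + synthetic, "model": model_name}

result = run_pipeline("Instruction: Answer the question. Examples: Q: ... A: ...")
```

The sketch only shows the control flow; in the real system each stage is a substantial component in its own right.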
After training, Prompt2Model can create a web interface to interact with the model. The modular design allows customization of each pipeline component.
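That modularity can be pictured as a pipeline built from interchangeable callables, so any single component can be swapped without touching the rest. The interfaces below are a hypothetical sketch, not the project's real class or function names.

```python
# Hypothetical sketch of a modular pipeline: each stage (data generator,
# model selector, ...) is a swappable callable. Not the project's actual
# interfaces.
from typing import Callable, Dict, List

Example = Dict[str, str]

def default_generator(instruction: str) -> List[Example]:
    # A real generator would call an LLM; this stub returns a fixed example.
    return [{"input": "2 + 2", "output": "4"}]

def default_selector(instruction: str) -> str:
    # A real selector would search Hugging Face; this stub is hardcoded.
    return "google/flan-t5-base"

def build_pipeline(
    generator: Callable[[str], List[Example]] = default_generator,
    selector: Callable[[str], str] = default_selector,
) -> Callable[[str], Dict]:
    def pipeline(instruction: str) -> Dict:
        return {"data": generator(instruction), "model": selector(instruction)}
    return pipeline

# Swapping in a custom data generator without changing the other stages:
custom = build_pipeline(generator=lambda ins: [{"input": ins, "output": "?"}])
out = custom("Translate English to French")
```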
Prompt2Model shows promising results
The team evaluated Prompt2Model on three benchmarks. On two tasks (SQuAD, Temporal), the resulting Flan-T5 models outperformed GPT-3.5 Turbo, even though the Google model has almost 700 times fewer parameters. On the third benchmark (MCoNaLa), Prompt2Model fell clearly short of the OpenAI model.
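The roughly 700x figure is consistent with commonly cited model sizes, though the GPT-3.5 number is an assumption: OpenAI has not disclosed GPT-3.5 Turbo's parameter count, while it is often estimated at 175 billion, and Flan-T5 Base has about 250 million parameters.

```python
# Back-of-the-envelope check of the ~700x size gap. The 175B figure for
# GPT-3.5 is an assumption (OpenAI has not published it); Flan-T5 Base
# has roughly 250M parameters.
gpt35_params = 175e9
flan_t5_base_params = 250e6
ratio = gpt35_params / flan_t5_base_params
print(f"GPT-3.5 is roughly {ratio:.0f}x larger")
```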
According to the team, Prompt2Model has difficulty supporting tasks in languages other than English, which they attribute to GPT-3.5 Turbo's limited language support.
The reliance on the OpenAI model for data generation is also probably Prompt2Model's biggest limitation: OpenAI prohibits using its models to train models that might compete with it, which makes Prompt2Model unusable for commercial applications. The team is therefore exploring the integration of large open-source language models to remove the dependence on proprietary APIs.
More information and the code are available on GitHub.