Researchers at Sakana AI have introduced Text-to-LoRA (T2L), a method that adapts large language models to new tasks using only a simple text description - no extra training data required.

Large language models typically get specialized using techniques like LoRA (Low-Rank Adaptation). LoRA works by inserting small, low-rank matrices into certain layers of the model, making the adaptation process much more efficient than full fine-tuning. Instead of updating billions of parameters, only a few million need adjustment.
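
As a rough illustration of the underlying idea (a generic sketch, not Sakana AI's code), a LoRA layer keeps the original weight matrix frozen and adds a trainable low-rank update B·A on top of it; the rank and scaling values below are arbitrary choices for the example:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA sketch: frozen base layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                              # original weights stay frozen
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)   # small random init
        self.B = nn.Parameter(torch.zeros(d_out, rank))          # zero init: no change at start
        self.scale = alpha / rank

    def forward(self, x):
        # equivalent to applying W + scale * (B @ A) to the input
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```

Because only A and B are trained, the trainable parameter count per layer drops from d_out × d_in to rank × (d_in + d_out), which is where the efficiency gain over full fine-tuning comes from.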

Still, each new task usually requires its own training data and carefully tuned hyperparameters, making the process time-consuming and resource-intensive. Text-to-LoRA automates this step. The system uses a hypernetwork trained on 479 tasks from the Super Natural Instructions dataset. By learning to map task descriptions to the corresponding LoRA weights, T2L can generate an adapter for a new task in a single forward pass - even for tasks it has never seen before.
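
Conceptually, the hypernetwork maps an embedding of the task description to the LoRA matrices of a target layer in one forward pass. The following sketch only illustrates that idea; the dimensions, the two-head layout, and the class name are assumptions for the example, not the architecture from the paper:

```python
import torch
import torch.nn as nn

class TextToLoRAHypernet(nn.Module):
    """Sketch: map a task-description embedding to LoRA factors A and B for one target layer."""
    def __init__(self, emb_dim=768, hidden=512, d_in=4096, d_out=4096, rank=8):
        super().__init__()
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        self.trunk = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.head_A = nn.Linear(hidden, rank * d_in)   # emits the A factor
        self.head_B = nn.Linear(hidden, rank * d_out)  # emits the B factor

    def forward(self, task_embedding):
        h = self.trunk(task_embedding)
        A = self.head_A(h).view(self.rank, self.d_in)
        B = self.head_B(h).view(self.d_out, self.rank)
        return A, B  # one forward pass yields a ready-to-use adapter

# usage: embed the task description with any text encoder, then generate the adapter
# hypernet = TextToLoRAHypernet()
# A, B = hypernet(torch.randn(768))   # random stand-in for an embedded task description
```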

Video: Sakana AI

Sakana AI developed three T2L variants: T2L-L (55 million parameters) generates both LoRA matrices at once, T2L-M (34 million parameters) shares a single output layer for both matrices, and T2L-S (5 million parameters) generates the adapter one rank at a time.
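
One rough way to picture how the variants differ is by the size of their output heads; the sketch below is purely illustrative, with assumed hidden and weight dimensions:

```python
import torch.nn as nn

rank, d, hidden = 8, 4096, 512  # assumed sizes, for illustration only
# T2L-L: the output head emits both LoRA factors (A and B) at once
head_L = nn.Linear(hidden, 2 * rank * d)
# T2L-M: one shared head is reused for A and B (which factor is produced is signaled by an extra input)
head_M = nn.Linear(hidden, rank * d)
# T2L-S: a smaller head emits a single rank slice per call and is invoked rank times
head_S = nn.Linear(hidden, d)
```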

The team compared two training approaches for T2L: reconstruction training, where the system learns to recreate existing LoRA adapters, and supervised fine-tuning (SFT), where it's trained directly on target tasks.
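
In broad strokes, the two objectives differ in what the generated adapter is compared against: reconstruction matches the weights of an existing, separately trained LoRA, while SFT plugs the generated adapter into the frozen base model and minimizes the task loss directly. A minimal sketch, with toy tensors and hypothetical placeholder names:

```python
import torch
import torch.nn.functional as F

# toy stand-ins for generated and target LoRA factors (assumed rank 8, width 64)
A_gen = torch.randn(8, 64, requires_grad=True)
B_gen = torch.randn(64, 8, requires_grad=True)
A_tgt, B_tgt = torch.randn(8, 64), torch.randn(64, 8)

# reconstruction objective: match an adapter that was already trained for the task
loss_recon = F.mse_loss(A_gen, A_tgt) + F.mse_loss(B_gen, B_tgt)

# SFT objective: run the frozen base model with the generated adapter and
# minimize the task loss directly, backpropagating into the hypernetwork
# (base_model_with_adapter, input_ids and labels are hypothetical placeholders)
# logits = base_model_with_adapter(input_ids, A_gen, B_gen)
# loss_sft = F.cross_entropy(logits, labels)
```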

SFT-trained models outperformed reconstruction-based ones, averaging 66.3 percent zero-shot benchmark performance versus 61.8 percent. The researchers attribute this to SFT grouping similar tasks more effectively.

T2L models trained with supervised fine-tuning achieve higher average zero-shot performance (66.3 percent) than reconstruction-trained models (61.8 percent). | Image: Sakana AI

In tests on ten standard benchmarks, the best T2L model reached 67.7 percent average performance. In a direct comparison on eight benchmarks, T2L scored 74.0 percent, just behind task-specific LoRA adapters at 75.8 percent - about 98 percent of the reference performance, but without any additional training effort.

Adapting to unseen tasks

T2L can handle entirely new tasks, outperforming multi-task LoRA baselines and other methods. Performance, however, hinges on how closely the new task matches the training data - the closer the fit, the better the outcome.

Zero-shot performance of Text-to-LoRA (S/M/L) compared to baselines on ten benchmarks; values marked in green exceed the task-specific LoRAs, values in bold exceed the multi-task LoRA. | Image: Sakana AI

Clear, task-focused prompts deliver results comparable to specialized adapters, while vague descriptions hurt performance.

On a GSM8K sheep-counting task with Mistral-7B, precise, task-aligned descriptions let T2L-generated LoRAs reach the correct answer via different calculation paths, while the base model and a LoRA generated from a vague description fail. This highlights T2L's level of controllability. | Image: Sakana AI

According to the study, T2L is highly efficient, requiring less than a quarter of the computational operations of conventional fine-tuning and no task-specific training data. It also worked reliably with models like Llama-3.1-8B and Gemma-2-2B.

Limitations remain: T2L is sensitive to prompt wording and still trails specialized LoRA adapters on complex, out-of-distribution tasks. Still, the researchers see it as a strong step toward automated model adaptation. The code and installation instructions are available on GitHub.

Summary
  • Japanese startup Sakana AI has presented Text-to-LoRA (T2L), a system that automatically creates adaptation modules for large language models using only a short text description of the desired task, without requiring additional training data.
  • T2L uses a hypernetwork trained on hundreds of tasks to generate LoRA adapters in a single step, achieving up to 98 percent of the performance of specialized adapters while using significantly fewer computing resources and eliminating the need for task-specific training.
  • The system performs best with precise task descriptions and generalizes robustly to unknown tasks, but its results fall short of specialized solutions for very complex or poorly described tasks.