Guardrails for ChatGPT: Nvidia wants to make large language models more secure

Nvidia's NeMo Guardrails is designed to make chatbots like ChatGPT more secure for use in enterprise applications.

Generative AI models permeate our digital infrastructure, whether for images, text, or code. Nvidia has long offered NeMo, an open-source framework for training and deploying large language models. Now the company is adding NeMo Guardrails, another building block that addresses three problems with such models.

How to make ChatGPT fit for enterprise applications?

Chatbots like OpenAI's ChatGPT can be connected to third-party applications via toolkits like LangChain or automation platforms like Zapier to answer questions in enterprise support chat, help with coding, send emails, or schedule appointments.

That's useful because ChatGPT's ability to handle all these tasks in dialog with humans is in many cases far beyond the level of classical solutions. But with the general capabilities come problems.

In theory, ChatGPT can answer questions on any topic - but in the case of an enterprise application, this is not desirable: For example, a support chatbot should not recommend competing products when asked about alternatives, or write an essay on free will when asked.

Another concern is the hallucinations and toxic content that language models can produce. And if a chatbot has access to third-party applications, an attacker could trigger unwanted actions through targeted queries.

Nvidia to use NeMo Guardrails to drive development of security standards

To address these three issues, Nvidia is developing NeMo Guardrails, a Python-based framework that can be placed upstream of toolkits such as LangChain or Zapier to filter and regulate the output that users see and the actions the chatbot can take.

In practice, Guardrails lets enterprises use a programming language developed by Nvidia to specify various rules: what a chatbot like ChatGPT can respond to and how, whether facts should be verified through another model, which APIs the chatbot is allowed to call, and how jailbreak attempts are detected.
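To make the idea concrete, here is a minimal Python sketch of a rule layer sitting in front of a chatbot. It is not the NeMo Guardrails API and does not use Nvidia's rule language: the names `GuardrailPolicy`, `check_user_message`, `is_action_allowed`, and `guarded_reply` are hypothetical, and the keyword matching is a deliberately crude stand-in for real topic and jailbreak detection. Fact verification through a second model is left out.

```python
from dataclasses import dataclass, field

REFUSAL = "Sorry, I can't help with that."


@dataclass
class GuardrailPolicy:
    """Hypothetical rule set for illustration; not the NeMo Guardrails API."""
    blocked_topics: set = field(default_factory=set)
    allowed_actions: set = field(default_factory=set)
    # Assumed example phrase; real jailbreak detection is far more involved.
    jailbreak_markers: tuple = ("ignore previous instructions",)


def check_user_message(policy, message):
    """Return a refusal string if the message trips a rule, else None."""
    text = message.lower()
    if any(marker in text for marker in policy.jailbreak_markers):
        return REFUSAL
    if any(topic in text for topic in policy.blocked_topics):
        return REFUSAL
    return None


def is_action_allowed(policy, action_name):
    """Only actions on the explicit allowlist may be triggered."""
    return action_name in policy.allowed_actions


def guarded_reply(policy, message, model):
    """Apply the input rules first; call the model only if they pass."""
    refusal = check_user_message(policy, message)
    if refusal is not None:
        return refusal
    return model(message)


# Example policy for a support chatbot (all values are made up).
policy = GuardrailPolicy(
    blocked_topics={"competitor", "free will"},
    allowed_actions={"schedule_appointment"},
)


def fake_model(message):
    """Stand-in for a call to a real language model."""
    return f"Answer to: {message}"
```

With this policy, `guarded_reply(policy, "Which competitor products do you recommend?", fake_model)` returns the refusal, while a question about resetting a password is passed through to the model. An explicit allowlist for actions means that anything not named, such as a hypothetical `send_email` action, is blocked by default.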

“Safety, security, and trust are the cornerstones of responsible AI development, and we’re excited about NVIDIA’s proactive approach to embed these guardrails into AI systems,” Reid Robinson, lead product manager for AI at Zapier, said of Guardrails. “We look forward to the good that will come from making AI a dependable and trusted part of the future.”

According to Nvidia, NeMo Guardrails works with all major language models, including GPT-4. NeMo Guardrails is available as open source on GitHub and is also integrated into Nvidia's NeMo Framework as part of the AI Enterprise Software Suite and as a cloud service in Nvidia's AI Foundations.
