
Update from February 13, 2024:

"Chat with RTX" from Nvidia is now available as a free download. The download is about 35 GB.

The chip manufacturer recommends at least a GeForce RTX 30 series graphics card with 8 GB VRAM, 16 GB RAM, and Windows 11. Once installed, you can use the Llama 2 and Mistral language models to make files on your local drive and YouTube video transcripts "chattable" via RAG.

Original post from January 11, 2024:


Nvidia has announced a new demo application called Chat with RTX that allows users to personalize an LLM with their content such as documents, notes, videos, or other data.

The application leverages Retrieval Augmented Generation (RAG), TensorRT-LLM, and RTX acceleration to allow users to query a custom chatbot and receive contextual responses quickly and securely.
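The retrieval step behind RAG can be illustrated with a small, self-contained sketch. This is not Nvidia's implementation: Chat with RTX uses TensorRT-LLM and its own retrieval pipeline, whereas the example below substitutes a simple word-count cosine similarity so it runs without a GPU or a model. The function names `tokenize`, `retrieve`, and `build_prompt` and the prompt layout are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    # Sort by score only, highest first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents, top_k=1):
    """Place the retrieved text in front of the question (hypothetical layout)."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    docs = [
        "Quarterly report due Friday.",
        "Our office cat is named Pixel.",
        "GPU drivers were updated last week.",
    ]
    print(build_prompt("What is the name of the office cat?", docs))
```

In a real system, the resulting prompt would be passed to a language model, and the word-count vectors would typically be replaced by learned embeddings.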

The chatbot runs locally on a Windows RTX PC or workstation, which provides more data protection than a standard cloud chatbot.

Chat with RTX supports various file formats, including text, PDF, doc/docx, and XML. Users can simply point the application to the appropriate folders, and it will load the files into the library.
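As a rough illustration of the "point it at a folder" step, the following sketch collects files with supported extensions from a directory. The extension set is an assumption derived from the formats listed above, and the recursive search and the helper name `find_supported_files` are hypothetical; the article does not say how Chat with RTX actually scans folders.

```python
from pathlib import Path

# Assumed mapping of the listed formats (text, PDF, doc/docx, XML) to extensions.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".doc", ".docx", ".xml"}


def find_supported_files(folder):
    """Return a sorted list of files under `folder` with a supported extension."""
    root = Path(folder)
    return sorted(
        path
        for path in root.rglob("*")  # recursive search is an assumption
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


if __name__ == "__main__":
    for path in find_supported_files("."):
        print(path)
```

Extracting text from PDF or Word files would require additional parsing, which this sketch leaves out.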

Users can also specify the URL of a YouTube playlist, and the application will load the transcripts of the videos in that playlist and make them chattable. Google Bard offers a similar feature, but only with a Google account and in the Google Cloud. Chat with RTX processes the transcripts locally.
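To make the playlist input concrete: YouTube playlist URLs carry the playlist identifier in the `list` query parameter. The sketch below extracts that identifier with Python's standard library. It does not download transcripts, the helper name `extract_playlist_id` is hypothetical, and the example ID is made up; how Chat with RTX handles the URL internally is not described in the article.

```python
from urllib.parse import parse_qs, urlparse


def extract_playlist_id(url):
    """Return the value of the `list` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("list")
    return values[0] if values else None


if __name__ == "__main__":
    # "PLexample123" is a placeholder, not a real playlist ID.
    print(extract_playlist_id("https://www.youtube.com/playlist?list=PLexample123"))
```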


Video: Nvidia

You can register here to be notified when Chat with RTX is available.

Developers can get started right away

The Chat with RTX tech demo is based on the TensorRT-LLM RAG Developer Reference Project available on GitHub. According to Nvidia, developers can use this reference to build and deploy their own RAG-based applications for RTX, accelerated by TensorRT-LLM.

In addition to Chat with RTX, Nvidia used CES to introduce RTX Remix, a platform for creating RTX remasters of classic games that will be available in beta in January, and Nvidia ACE Microservices, which provide games with intelligent and dynamic digital avatars based on generative AI.


Nvidia has also released TensorRT acceleration for Stable Diffusion XL (SDXL) Turbo and Latent Consistency Models (LCM), which is expected to deliver a performance boost of up to 60 percent. An updated version of the Stable Diffusion WebUI TensorRT extension with improved support for SDXL, SDXL Turbo, and LCM Low-Rank Adaptation (LoRA) is now available.

Summary
  • Nvidia announced Chat with RTX, a demo application that allows users to personalize an LLM chatbot with their content and run it locally on a Windows RTX PC or workstation.
  • The application supports multiple file formats and allows YouTube playlist transcripts to be integrated into the chatbot.
  • Developers can use the TensorRT-LLM RAG Developer Reference Project on GitHub to build and deploy their own RAG-based applications for RTX.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.