ChatGPT programs AR app using only natural language

Humanity has been exploring the depths of OpenAI's ChatGPT neural network since early December. One developer got the dialog AI to spit out working AR code.

OpenAI's ChatGPT dialog AI is optimized for generating text and answering questions. But initial tests from early December quickly showed that the system can do more than produce a few neatly worded sentences: it can also write program code.

ChatARKit - AR app generated by ChatGPT

Developer Bart Trzynadlowski wanted to find out whether he could use ChatGPT to develop an AR app that autonomously places digital 3D objects in the environment using only voice commands. The voice commands are transcribed by a second AI model, OpenAI's Whisper, and then passed into the JavaScript environment of the ChatARKit app as an AI prompt.

As a result, ChatGPT selects 3D objects from Sketchfab that match the voice command and places them on the table or floor as prompted. If asked, ChatGPT even scales and rotates the 3D models. The AI system generates the code for this on its own.

These are some working sample prompts according to Trzynadlowski:

"Place a cube on the nearest plane."
"Place a spinning cube on the floor."
"Place a sports car on the table and rotate it 90 degrees."
"Place a school bus on the nearest plane and make it drive back and forth along the surface."
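To make the mechanism more concrete, here is a minimal sketch of how an app could run model-generated code for one of these prompts. Everything in it is an assumption for illustration: the helper names `createEntity` and `getNearestPlane`, the shape of the scene objects, and the hard-coded `generatedCode` string that stands in for a ChatGPT response. None of it is taken from the ChatARKit source.

```javascript
// Hypothetical in-memory stand-in for an AR scene. A real app would
// talk to the AR framework and fetch models from Sketchfab instead.
const scene = { entities: [] };

// Hypothetical helper exposed to generated code: creates an entity from
// a text description (a real app might use it as a search query).
function createEntity(description) {
  const entity = {
    description,
    position: { x: 0, y: 0, z: 0 },
    rotationY: 0, // degrees around the vertical axis
    scale: 1,
  };
  scene.entities.push(entity);
  return entity;
}

// Hypothetical helper: returns the closest detected surface (fixed data here).
function getNearestPlane() {
  return { kind: "table", center: { x: 0.5, y: 0.7, z: -1.0 } };
}

// Made-up stand-in for the model's reply to:
// "Place a sports car on the table and rotate it 90 degrees."
const generatedCode = `
  const car = createEntity("sports car");
  const plane = getNearestPlane();
  car.position = { ...plane.center };
  car.rotationY = 90;
`;

// Run the generated source, passing the helpers in as named parameters.
// Note: new Function is not a security sandbox; the generated code can
// still reach global objects.
function runGenerated(source, api) {
  const names = Object.keys(api);
  const fn = new Function(...names, `"use strict";\n${source}`);
  fn(...names.map((name) => api[name]));
}

runGenerated(generatedCode, { createEntity, getNearestPlane });
console.log(scene.entities[0]);
```

The point of the sketch is the division of labor: the app only offers a handful of functions, and the language model decides how to combine them for each spoken command.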

According to Trzynadlowski, ChatGPT does not work reliably. For identical commands, the AI model generates very different output and sometimes inserts incorrect lines of JavaScript code into the app. Occasionally, ChatGPT turns object descriptions into code identifiers, which means that the 3D models can no longer be retrieved from Sketchfab.
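One way to picture the identifier problem: a description such as "school bus" comes back as something like `schoolBus` or `school_bus`, which is a poor search term. The sketch below shows a possible workaround that converts identifier-style descriptions back into plain phrases. The function name `toSearchQuery` and the approach itself are assumptions for illustration; the source does not say that ChatARKit does anything like this.

```javascript
// Hypothetical mitigation: turn identifier-style descriptions such as
// "schoolBus" or "sports_car" back into plain search phrases.
function toSearchQuery(description) {
  return description
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // split camelCase boundaries
    .replace(/[_-]+/g, " ") // snake_case and kebab-case separators
    .replace(/\s+/g, " ") // collapse repeated whitespace
    .trim()
    .toLowerCase();
}

console.log(toSearchQuery("schoolBus"));
console.log(toSearchQuery("sports_car"));
console.log(toSearchQuery("school bus"));
```

A normalization step like this cannot repair arbitrary bad output, but it would leave ordinary descriptions untouched while recovering the most common identifier spellings.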

Trzynadlowski has made his ChatGPT AR app available for free as open source on GitHub.

Generate 3D objects in VR with natural language

For VR, developer Jasmine Roberts recently demonstrated an implementation of OpenAI's new 3D AI Point-E. Like the image AI DALL-E 2, it can generate content based solely on text input. Instead of images, however, Point-E generates 3D point clouds that represent a 3D model. Point-E takes only about one to two minutes per generation on a single Nvidia V100 GPU. Roberts' demo runs in real time.

thrilled our preliminary research on #generativeai in #vr was accepted into #neurips2022

here is a bare-bones "holodeck" #3d implementation real-time in #unity. spoken user commands are compiled at runtime. 3d objects are created, loaded and manipulated via @openai's gpt-3. pic.twitter.com/4xg0FKYkEk

— jasmine roberts (@jasminezroberts) October 20, 2022
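For readers unfamiliar with the term, a point cloud is simply a list of points in 3D space. The sketch below assumes each point carries a position and an RGB color and shows how an app might compute a bounding box to scale or place such an object. The sample data and the `boundingBox` helper are made up for illustration and are not Point-E's actual output format or API.

```javascript
// A point cloud as a plain array of colored points (made-up sample data).
const pointCloud = [
  { x: -0.5, y: 0.0, z: 0.2, r: 255, g: 0, b: 0 },
  { x: 0.5, y: 1.0, z: -0.2, r: 0, g: 255, b: 0 },
  { x: 0.0, y: 0.5, z: 0.0, r: 0, g: 0, b: 255 },
];

// Axis-aligned bounding box: the smallest and largest coordinate on
// each axis, useful for scaling or positioning the object in a scene.
function boundingBox(points) {
  const min = { x: Infinity, y: Infinity, z: Infinity };
  const max = { x: -Infinity, y: -Infinity, z: -Infinity };
  for (const p of points) {
    for (const axis of ["x", "y", "z"]) {
      min[axis] = Math.min(min[axis], p[axis]);
      max[axis] = Math.max(max[axis], p[axis]);
    }
  }
  return { min, max };
}

const box = boundingBox(pointCloud);
console.log(box);
```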

Point-E is OpenAI's starting point for further work on text-to-3D synthesis. Google, with Dreamfusion, and Nvidia, with Magic3D, also recently introduced text-to-3D systems. Such systems could play an important role in the further spread of 3D content, which is a fundamental assumption of the metaverse thesis.
