Summary

Humanity has been exploring the depths of OpenAI's ChatGPT neural network since early December. One developer got the dialog AI to spit out working AR code.


OpenAI's ChatGPT dialog AI is optimized for generating texts and answering questions. But initial tests from early December quickly showed that there's more to the system than just a few neatly worded sentences. Programming code, for example.

ChatARKit - AR app generated by ChatGPT

Developer Bart Trzynadlowski wanted to find out if he could use ChatGPT to develop an AR app that places digital 3D objects in the environment using only voice commands. The app recognizes those voice commands using another AI model, OpenAI's Whisper, and then passes them into ChatARKit's JavaScript environment as a prompt for ChatGPT.
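A pipeline like the one described could be sketched roughly as follows. Everything here is illustrative: the function names, the system prompt, and the pared-down JavaScript API summary are assumptions for the sake of the example, not ChatARKit's actual code.

```python
# Illustrative sketch of a voice-command-to-prompt pipeline like the one
# described above. The JavaScript API summary and all function names are
# hypothetical stand-ins, not ChatARKit's real interface.

# A short description of the app's scripting API, sent to ChatGPT as
# context so the model writes code against known functions.
API_DESCRIPTION = """\
You write JavaScript for an AR app. Available functions (hypothetical):
- getPlanes(kind): returns detected planes ('floor', 'table', 'nearest')
- placeModel(query, plane): fetches a Sketchfab model matching 'query'
  and anchors it on 'plane'; returns an entity
- entity.rotate(degrees), entity.scale(factor)
Respond with JavaScript only, no explanations."""

def build_messages(transcribed_command: str) -> list[dict]:
    """Wrap a Whisper-transcribed voice command into a chat prompt."""
    return [
        {"role": "system", "content": API_DESCRIPTION},
        {"role": "user", "content": transcribed_command},
    ]

# Usage: the returned list would be sent to a chat completion API, and the
# reply executed inside the app's JavaScript environment.
msgs = build_messages("Place a cube on the nearest plane.")
```

The key design idea is that the model only sees a compact API description plus the spoken command, so its reply can be run directly as script code.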

As a result, ChatGPT selects 3D objects from Sketchfab that match the voice command and places them on the desktop or floor as requested. On request, ChatGPT even scales and rotates the 3D models. The AI system generates the code for all of this on its own.


These are some working sample prompts according to Trzynadlowski:

  • "Place a cube on the nearest plane."
  • "Place a spinning cube on the floor."
  • "Place a sports car on the table and rotate it 90 degrees."
  • "Place a school bus on the nearest plane and make it drive back and forth along the surface."

According to Trzynadlowski, however, ChatGPT does not work reliably. For identical commands, the AI model generates very different output and sometimes inserts faulty JavaScript lines into the app. Occasionally, ChatGPT turns object descriptions into code-style identifiers, which means the 3D models can no longer be found on Sketchfab.
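One generic mitigation for faulty generated code (not described in the article; the allowed names below are hypothetical) is to reject model output that calls identifiers outside the app's scripting API before executing it:

```python
import re

# Generic sanity check for model-generated script code: reject output that
# calls functions outside an allow-list. The allowed names are hypothetical
# examples, not ChatARKit's actual API, and this check is not from the article.
ALLOWED_CALLS = {"getPlanes", "placeModel", "rotate", "scale"}

CALL_PATTERN = re.compile(r"([A-Za-z_]\w*)\s*\(")

def unknown_calls(generated_js: str) -> set[str]:
    """Return function names the generated code calls that are not allowed."""
    return set(CALL_PATTERN.findall(generated_js)) - ALLOWED_CALLS

good = 'placeModel("cube", getPlanes("nearest"))'
bad = 'fetchSketchfab("sports_car")'  # invented identifier, would be rejected
```

A check like this cannot catch every hallucination, but it filters out code that would fail at runtime anyway.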

Trzynadlowski makes his ChatGPT AR app available for free as open source on GitHub.

Generate 3D objects in VR with natural language

For VR, developer Jasmine Roberts recently demonstrated an implementation of OpenAI's new 3D AI Point-E: like the image AI DALL-E 2, it can generate content based solely on text input. Instead of images, however, Point-E generates 3D point clouds that represent a 3D model. Per generation, Point-E takes only about one to two minutes on a single Nvidia V100 GPU. Roberts' demo runs in real time.
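A point cloud of the kind Point-E produces is essentially a list of XYZ positions (often with RGB colors). As a rough illustration of how such output could be handed to other 3D tools, the sketch below writes points to the standard ASCII PLY interchange format; the sample data and helper function are invented for illustration and are not from Roberts' demo.

```python
# Minimal sketch: serialize an XYZ point cloud (the kind of output a
# text-to-3D system like Point-E produces) to ASCII PLY, a common
# interchange format that many 3D/VR tools can import.

def points_to_ply(points: list[tuple[float, float, float]]) -> str:
    """Return an ASCII PLY document containing the given XYZ points."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    body = [f"{x} {y} {z}" for x, y, z in points]
    return "\n".join(header + body) + "\n"

# Tiny example cloud: four corners of a unit square on the ground plane.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
ply_text = points_to_ply(cloud)
```

Real Point-E outputs contain thousands of points and per-point colors, but the structure is the same: a small header followed by one vertex per line.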

OpenAI sees Point-E as a starting point for further work in text-to-3D synthesis. Google (Dreamfusion) and Nvidia (Magic3D) also recently introduced text-to-3D systems. Such tools could play an important role in the future spread of 3D content, a fundamental assumption of the metaverse thesis.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
  • A developer shows how ChatGPT can be integrated into an AR app as a natural language interface.
  • The ChatARKit app is available as open source.
  • Another developer integrates OpenAI's new Point-E 3D AI into a virtual reality environment for real-time 3D generation using voice commands.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.