ChatGPT review: Does OpenAI's text AI beat our editorial team?

We summarize our articles for our readers in short bullet points. But how do we compare to an AI?

When we designed THE DECODER, our team decided to offer our readers the main points of our articles as a short summary. Before launching the website, we also tested how large language models like OpenAI's GPT-3 could do the job. Our result: The AI summaries were okay, but kept missing important points in the text.

But OpenAI has continued to improve the GPT models with human feedback in recent months. The GPT-3.5 models, as OpenAI calls them, which have been available since the beginning of 2022, also form the basis for the live test of ChatGPT.

However, the model was additionally trained on dialog data from humans: OpenAI hired AI trainers who wrote both their own side of a conversation and that of an AI assistant. These AI trainers had access to suggestions that helped them write responses.

Does ChatGPT write better summaries than we do?

ChatGPT has attracted an enormous amount of attention since its release, as we show in our article about some of the best ChatGPT demos. Among these demos are impressive summaries created by ChatGPT. AI tools like Bearly.AI or SumUp put this capability just a click away in any browser.

So it's time to give AI summaries another shot. Here are five of our article summaries and what ChatGPT does with the same text. For all examples, the prompt "Summarize the text in a few bullet points" was placed in front of the article and the first output was used.
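As a minimal sketch of that setup, the snippet below places the instruction in front of the article text. The function name, the blank-line separator, and the whitespace trimming are assumptions made for illustration; the test described here only specifies that the prompt came before the article.

```python
# Instruction quoted from the article; everything else is a hypothetical helper.
DEFAULT_INSTRUCTION = "Summarize the text in a few bullet points"


def build_summary_prompt(article_text: str,
                         instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Return the instruction followed by the article text.

    The blank line between the two parts is an assumption; the article
    does not say how instruction and text were separated.
    """
    return f"{instruction}\n\n{article_text.strip()}"


if __name__ == "__main__":
    # Placeholder article text, used only to show the resulting layout.
    print(build_summary_prompt("Example article text goes here."))
```

The resulting string would then be pasted into ChatGPT once per article, keeping only the first output.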

Deepmind's "Dramatron" can write film and theater scripts

Summary THE DECODER

Dramatron from Deepmind uses a special prompting method to elicit consistent storylines from large language models.
This allows the system to create entire screenplays and theater scripts.
Human testers see Dramatron as a useful creative tool.

Summary ChatGPT

Deepmind has developed a system called Dramatron that is capable of generating coherent scripts and screenplays for theater and film
The system, which is based on a short summary of a story called a log line, uses a method called "prompt chaining" to generate a list of characters, a plot, locations, and dialogue that are all consistent with the log line
Dramatron consists of five hard-coded prompts that build on each other to create the final output
The system was developed using a large language model trained with 70 billion parameters, and was evaluated by 15 professionals from the theater and film industries
The professionals mostly had positive reactions to the output of the system, but there were also criticisms, including the risk of plagiarism and the impact on the creative industry.

Video training for robot hand allows it to learn human movements

Summary THE DECODER

Robots often learn real-world interactions from sample data of those very interactions. But these are hard to come by.
With VideoDex, researchers at Carnegie Mellon University show how a robot can learn from human interactions from Internet videos.
The AI system learns human movements in addition to the usual visual representations. In benchmarks, it performs better than previous methods.

Summary ChatGPT

Researchers at Carnegie Mellon University developed VideoDex, a method for teaching robots through internet videos of human interactions
VideoDex uses algorithms to track hand, wrist, and camera position in space and maps these movements to a 16-DOF robotic hand and a robotic arm
In addition to visual representations, VideoDex uses learned Neural Dynamic Policies to improve robot control
After training, VideoDex can outperform state-of-the-art robot learning methods in real-world interactions with just a few examples demonstrated through human remote control

Multimodal prompting is key for better generative AI

Summary THE DECODER

Current generative AI models generate new images based mainly on text.
German AI company Aleph Alpha is now introducing a multimodal input prompting technique for AI image generation.
The new model, M-Vader, can generate a new image based on multiple images and text that reference each other. AI image models thus become more precisely controllable.

Summary ChatGPT

Aleph Alpha has developed a new technique for generative image models called multimodal prompts
The M-VADER model can process multimodal inputs such as photos, sketches, and textual descriptions to generate new image ideas
M-VADER uses a multimodal decoder called S-MAGMA to guide the image generation process using pre-trained models
M-VADER can generate images based on multimodal context, merge two images together, or create variations of an image
The technology will become part of Aleph Alpha's Luminous offering and is a world first in multimodal image generation.

Nvidia moves into the financial sector with a multi-year AI partnership with Deutsche Bank

Summary THE DECODER

Nvidia and Deutsche Bank form a multi-year AI partnership to improve financial services with AI.
The companies will initially focus on intelligent avatars, large language models for financial texts, and fraud detection.

Summary ChatGPT

Deutsche Bank and Nvidia have announced a multi-year partnership to improve financial services using artificial intelligence (AI)
The two companies have been working together for 18 months and are developing a range of AI applications, including intelligent avatars and large language models for risk management and fraud detection
The partnership will also benefit Deutsche Bank's cloud transformation and enable the bank to extract useful data from unstructured information
Deutsche Bank is also using Nvidia AI Enterprise software and is developing a 3D virtual avatar using Nvidia Omniverse to improve its internal systems and customer experience
Nvidia CEO Jensen Huang said the partnership will "reimagine the way financial services are operated and delivered."

ChatGPT is a GPT-3 chatbot from OpenAI that you can test now

Summary THE DECODER

ChatGPT is a chat model from OpenAI based on GPT-3.5 and refined with dialogs.
It is now freely available online as a demo version. The demo is not sentient.
OpenAI hopes to gain insights from user feedback that can be used in the development of future language models.

Summary ChatGPT

OpenAI has released ChatGPT, a chatbot that has been trained using human feedback
ChatGPT is intended to help OpenAI develop better AI systems through user feedback
The chatbot has been trained using reinforcement learning from human feedback (RLHF), which has resulted in texts that are better evaluated by humans and reduces hate speech and misinformation
ChatGPT has been trained on dialog data collected from humans who wrote both sides of a dialog, their own and that of the AI assistant
The chatbot has limitations, including generating plausible-sounding but incorrect and nonsensical responses, reacting strongly to small changes in prompts, and being too wordy, repetitive, and biased towards more detailed answers
ChatGPT is freely available with an OpenAI account and is an "early demo of what's possible" according to co-founder Sam Altman

ChatGPT summaries: Prompt engineering for "key takeaways"

ChatGPT is "too wordy"? Maybe that explains the verbosity of the summaries, which is especially evident in the article about the cooperation between Nvidia and Deutsche Bank. But as I show below, the cause is my prompt, which doesn't make ChatGPT keep it short.

Using the prompt "List three key takeaways from this text", I can significantly shorten ChatGPT's output. Here is an example from the Nvidia article:

Summary ChatGPT

Deutsche Bank and Nvidia have announced a multi-year partnership to improve financial services using artificial intelligence (AI).
The partnership will include the development of intelligent avatars, large language models for risk management, and fraud detection, as well as benefits to Deutsche Bank's cloud transformation.
Deutsche Bank will be using Nvidia's AI Enterprise software suite and Omniverse platform to create 3D virtual avatars for internal navigation and customer service, and to accelerate computing for risk management and scenario calculations.

That's 78 words instead of 112. With "List two key takeaways from this text", it's down to 43:

Summary ChatGPT

Deutsche Bank and Nvidia have announced a multi-year innovation partnership to improve financial services with AI.
The partnership will focus on developing applications including intelligent avatars, large language models for risk management, and fraud detection, and will also benefit Deutsche Bank's cloud transformation.
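The length comparison for the Nvidia summaries can be checked with a few lines of code. This is a sketch that assumes words are counted by splitting on whitespace, so "multi-year" and "Bank's" each count as one word; other counting rules would give slightly different totals. The bullet texts are copied from the summaries above, and the variable and function names are illustrative.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens in a string."""
    return len(text.split())


def total_words(bullets: list[str]) -> int:
    """Sum the word counts of all bullet points in a summary."""
    return sum(count_words(bullet) for bullet in bullets)


# Output for the prompt "Summarize the text in a few bullet points".
FIVE_BULLETS = [
    "Deutsche Bank and Nvidia have announced a multi-year partnership to improve financial services using artificial intelligence (AI)",
    "The two companies have been working together for 18 months and are developing a range of AI applications, including intelligent avatars and large language models for risk management and fraud detection",
    "The partnership will also benefit Deutsche Bank's cloud transformation and enable the bank to extract useful data from unstructured information",
    "Deutsche Bank is also using Nvidia AI Enterprise software and is developing a 3D virtual avatar using Nvidia Omniverse to improve its internal systems and customer experience",
    'Nvidia CEO Jensen Huang said the partnership will "reimagine the way financial services are operated and delivered."',
]

# Output for the prompt "List three key takeaways from this text".
THREE_TAKEAWAYS = [
    "Deutsche Bank and Nvidia have announced a multi-year partnership to improve financial services using artificial intelligence (AI).",
    "The partnership will include the development of intelligent avatars, large language models for risk management, and fraud detection, as well as benefits to Deutsche Bank's cloud transformation.",
    "Deutsche Bank will be using Nvidia's AI Enterprise software suite and Omniverse platform to create 3D virtual avatars for internal navigation and customer service, and to accelerate computing for risk management and scenario calculations.",
]

if __name__ == "__main__":
    print(total_words(FIVE_BULLETS))     # 112
    print(total_words(THREE_TAKEAWAYS))  # 78
```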

OpenAI's ChatGPT could make my job easier, but in the end it missteps

So, in my sample, ChatGPT seems to work as a tool for our editing. The summaries never miss any key points - at least in this run. Whether ChatGPT or other large language models can make our job easier - and how economical that is - only a long-term test can show.

Here are the three key takeaways ChatGPT gets from this article:

OpenAI's GPT-3 language model was not able to provide accurate summaries of text, but with human feedback and additional training on dialog data, the improved GPT-3.5 model, also known as ChatGPT, has shown impressive summary capabilities in demos.
Carnegie Mellon University researchers have developed a system called VideoDex, which allows robots to learn real-world interactions from internet videos of human interactions, rather than just sample data.
Aleph Alpha's M-Vader generative AI model is able to generate new images based on multiple images and text input, rather than just text input alone. This multimodal prompting technique improves the quality and variety of the generated images.

While those could be key takeaways a human gets when reading this article, they are not what the article is about. Too bad, I'll write the summary myself then.
