AI Tool Tips: Prompt Engineering and first Whisper software

Oct 2, 2022

DALL-E 2 prompted by THE DECODER

Phraser is supposed to help with prompt generation for DALL-E 2 and co., while OpenAI's Whisper enables free audio transcriptions.

Image AIs let even people who can barely hold a pen generate creative art, provided they master so-called "prompt engineering": the art of giving the AI the right image command.

This is not as trivial as it sounds. For one thing, you have to be able to translate an image idea into language that is as visual as possible. For another, generative image AIs such as DALL-E 2, Midjourney, or Stable Diffusion have countless parameters and styles that strongly influence image generation.

The Phraser web software is designed to facilitate prompt engineering. As usual, you have to develop the image idea yourself, but when it comes to finding the style, Phraser guides you through the various parameters of the individual systems.

Through a step-by-step menu, you can

decide on the medium (e.g., photo, template, movie poster),
create a text description with the most important components,
choose color, texture, and resolution,
and decide on camera settings, the mood, and the era.

After logging in, you get the appropriate prompt for the image AI you selected at the start. In addition, the software offers inspiration by showing similar, previously generated images that roughly match your prompt.
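To illustrate how menu choices like these can end up in a single prompt, here is a minimal Python sketch. It is not Phraser's actual implementation: the `PromptSpec` class, the `build_prompt` function, and the comma-separated output format are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the choices made in a step-by-step menu."""
    description: str
    medium: str = ""
    color: str = ""
    texture: str = ""
    resolution: str = ""
    camera: str = ""
    mood: str = ""
    era: str = ""


def build_prompt(spec: PromptSpec) -> str:
    """Join all non-empty choices into one comma-separated prompt string."""
    # Lead with the medium if one was chosen, e.g. "photo of ...".
    if spec.medium:
        lead = f"{spec.medium} of {spec.description}"
    else:
        lead = spec.description
    # Style modifiers follow in a fixed order; empty ones are skipped.
    modifiers = [spec.color, spec.texture, spec.resolution,
                 spec.camera, spec.mood, spec.era]
    parts = [lead] + [m.strip() for m in modifiers if m.strip()]
    return ", ".join(parts)


spec = PromptSpec(
    description="a lighthouse on a cliff",
    medium="photo",
    color="muted blue tones",
    camera="wide-angle lens",
    mood="melancholic",
    era="1970s",
)
print(build_prompt(spec))
```

Each image AI responds differently to wording and order, so a real tool would likely adapt the template per system. The fixed order here is only one possible choice.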

OpenAI Whisper arrives in first tools

With Whisper, OpenAI recently released an open-source model for speech recognition and transcription in various languages. The model is freely available at no cost, and the first developers are already downloading it and integrating it into tools.
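For readers who want to try the model directly, the following Python sketch shows roughly what such an integration can look like. It assumes the open-source `openai-whisper` package and `ffmpeg` are installed. The helper names `format_timestamp`, `format_segments`, and `transcribe_file` are made up for this example, as is the file name in the usage note.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as MM:SS (fractions are truncated)."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def format_segments(segments) -> str:
    """Turn Whisper-style segments into one timestamped line per segment."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_file(path: str, model_name: str = "base") -> str:
    """Transcribe an audio file and return a timestamped transcript."""
    # Imported here so the helpers above work without the package installed.
    import whisper  # third-party: pip install openai-whisper

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(path)         # dict with "text" and "segments"
    return format_segments(result["segments"])
```

Calling `transcribe_file("interview.mp3")` with a local audio file would then return the transcript as text. Larger model names generally trade speed for accuracy.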

With YouTube Whisperer, the cloud platform Hugging Face already hosts an implementation of the model with a simple user interface for transcribing YouTube videos.

Whisper by OpenAI, also on Hugging Face, turns words spoken into a microphone into text within a few seconds. However, the software is only available as a demo that stops after 30 seconds, although you can make several recordings in a row.

Probably the most interesting project at the moment is Stage Whisper, in which a team of volunteers is developing a simple, free transcription app based on Whisper that can be used by people who are less familiar with the technology. A first version is expected within a few weeks. Anyone who wants to get involved can sign up on Stage Whisper's Discord channel.

Another project on GitHub, "Whispering," aims to use Whisper for real-time transcription.
