
AI Tool Tips: Prompt Engineering and first Whisper software

DALL-E 2 prompted by THE DECODER

Key Points

  • The Phraser web software helps create prompts for DALL-E 2 and similar image generators.
  • OpenAI's AI model Whisper enables free audio transcriptions and is already being used in the first tools.

Phraser is designed to help with prompt generation for DALL-E 2 and similar image generators, while OpenAI's Whisper enables free audio transcriptions.

Image AIs let even people who can barely hold a pen generate creative art, provided they master so-called "prompt engineering": the art of giving the AI the right image command.

This is not as trivial as it sounds. For one thing, you have to be able to translate an image idea into language that is as pictorial as possible. For another, generative image AIs such as DALL-E 2, Midjourney, or Stable Diffusion have countless parameters and styles that strongly influence image generation.

The Phraser web software is designed to make prompt engineering easier. You still have to develop the image idea yourself, but when it comes to finding the style, Phraser guides you through the various parameters of the individual systems.


Through a step-by-step menu, you can

  • decide on the medium (e.g., photo, template, movie poster),
  • write a text description with the most important components,
  • choose color, texture, and resolution,
  • and decide on camera settings, the mood, and the era.

After logging in, you get the appropriate prompt for the image AI you selected at the start. In addition, the software offers inspiration in the form of similar, previously generated images that roughly match your prompt.
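The menu steps described above can be sketched as a small prompt builder. This is only an illustration of the idea, not Phraser's actual implementation: the names `PromptSpec` and `build_prompt`, the field order, and the comma-separated output format are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the choices made in a step-by-step menu."""
    medium: str            # e.g. "photo", "movie poster"
    description: str       # the image idea in plain words
    color: str = ""
    texture: str = ""
    resolution: str = ""
    camera: str = ""
    mood: str = ""
    era: str = ""


def build_prompt(spec: PromptSpec) -> str:
    """Join the selected components into one comma-separated prompt string.

    Assumption: the target image AI accepts free-form, comma-separated text.
    Empty optional fields are skipped.
    """
    if not spec.medium or not spec.description:
        raise ValueError("medium and description are required")
    parts = [
        f"{spec.medium} of {spec.description}",
        spec.color,
        spec.texture,
        spec.resolution,
        spec.camera,
        spec.mood,
        spec.era,
    ]
    return ", ".join(part for part in parts if part)


if __name__ == "__main__":
    spec = PromptSpec(
        medium="photo",
        description="a lighthouse at dusk",
        color="muted blues",
        mood="calm",
        era="1970s",
    )
    print(build_prompt(spec))
    # prints: photo of a lighthouse at dusk, muted blues, calm, 1970s
```

Each image AI has its own parameters and style keywords, so a real tool would likely need a separate output format per system.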

OpenAI Whisper arrives in the first tools

With Whisper, OpenAI recently released an open-source model for speech recognition and transcription in various languages. The model is freely available at no charge, and the first developers are already downloading it and integrating it into tools.

With YouTube Whisperer, the cloud platform Hugging Face already hosts an implementation of the model with a simple user interface that can be used to transcribe YouTube videos.


Whisper by OpenAI, also on Hugging Face, turns words spoken into a microphone into text within a few seconds. However, the software is only available as a demo that stops after 30 seconds, although you can record several clips in a row.
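To get a feel for what the 30-second limit means for a longer recording, the following sketch computes how a recording would have to be split into consecutive clips. It does not call Whisper or the demo; the function name `split_into_windows` and its parameters are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import List, Tuple


def split_into_windows(total_seconds: float, window: float = 30.0) -> List[Tuple[float, float]]:
    """Return (start, end) times, in seconds, of consecutive clips.

    Every clip is at most `window` seconds long; the last one may be shorter.
    """
    if total_seconds < 0 or window <= 0:
        raise ValueError("total_seconds must be >= 0 and window must be > 0")
    total = float(total_seconds)
    windows = []
    start = 0.0
    while start < total:
        end = min(start + window, total)
        windows.append((start, end))
        start = end
    return windows


if __name__ == "__main__":
    # A 75-second recording needs three clips under a 30-second limit.
    print(split_into_windows(75))
    # prints: [(0.0, 30.0), (30.0, 60.0), (60.0, 75.0)]
```

Splitting at fixed times can cut a word in half, so the separate transcripts may need manual cleanup where two clips meet.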

Probably the most interesting project at the moment is Stage Whisper: a team of volunteers is developing a simple, free transcription app based on Whisper for people who are less familiar with the technology. A first version is expected within a few weeks. Anyone who wants to get involved can sign up on Stage Whisper's Discord channel.

Another project on GitHub, "Whispering," aims to use Whisper for real-time transcription.
