TechingToday

I tried MidJourney's greatly upgraded Describe feature.


MidJourney has upgraded its Describe feature, which gives you a text description of any image that you can then use as a prompt to turn it into a work of generative AI art.

The upgrade brings Describe up to the same level as the other version 6 features, and it is a significant improvement over the previous version, which described my selfie as a bearded woman named Tina.

The new version of Describe utilizes the same upgrades seen in MidJourney v6 that led to higher realism and improved prompts.

It even remarked on the bookshelf behind me in my selfie and did not call me Tina, which was a nice bonus.

As in the previous version, it offers four possible descriptions of your photo, any or all of which can be turned into what I have chosen to call your MidJourney "me."

To use MidJourney Describe, type /describe in a Discord chat with the MidJourney bot. You are then given the option to upload an image or share a link to one. After a few seconds, four descriptions of the picture appear.

This is a particularly important feature, reversing the usual text-to-image conversion and moving to image-to-text conversion. This feature, a form of computer vision, gives AI insight into the real world and helps it learn more about images.

Sometimes the descriptions can be bizarre. For example, when I uploaded a photo of Jeff Parsons, editor-in-chief of Tom's Guide UK, it gave him two names that appeared completely random and suggested that he works as a computer scientist.

In addition to improved accuracy, the new description is longer. The previous version described my selfie this way: "tina is a woman with a beard and glasses, rtx, dadaistic, classicist, genderless, close-up, webcam, manapunk style".

The new version goes like this: "A bespectacled, bearded, short-haired, gray-shirted, talking to camera, video game content creator with a large head, short dark brown hair, spaced-out black frame glasses, and a chubby face, brown eyes, a medium-length facial beard, and a bookshelf in the style of a 2018 Snapchat post in the background."

Beyond using it to describe and roast someone's photo, Describe is a useful tool for inspiration. If you make something in the real world and want to create a digital version using AI, you can share a photo of it with Describe and use the result to improve your prompts.

Another, arguably more important, use case is accessibility. The ability to generate descriptions of images can improve the quality of the alt text that describes images for people who use audio description tools to browse the web.
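As a rough illustration of the accessibility idea, here is a minimal Python sketch of how a generated description might be dropped into an image tag as alt text. Nothing here is part of MidJourney: the function name `img_tag` and the 150-character limit are my own assumptions, and a person should still review the wording before it is published.

```python
from html import escape


def img_tag(src: str, description: str, max_len: int = 150) -> str:
    """Build an <img> tag whose alt text comes from a generated description.

    Hypothetical helper for illustration; max_len is an assumed limit,
    not a rule taken from MidJourney or from any standard.
    """
    # Collapse line breaks and repeated spaces into single spaces.
    alt = " ".join(description.split())
    if len(alt) > max_len:
        # Cut at the last word boundary inside the limit and mark the cut.
        alt = alt[:max_len].rsplit(" ", 1)[0].rstrip(",;:") + "..."
    # Escape quotes and ampersands so the text is safe inside an attribute.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    print(img_tag("selfie.jpg", "A bearded man with glasses in front of a bookshelf"))
```

Long descriptions like the ones the new Describe produces are trimmed at a word boundary, since very long alt text can be tiring to listen to.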

The ability to obtain image descriptions is not limited to MidJourney. Almost all AI image tools have similar capabilities, and all major chatbots have image analysis skills. However, MidJourney increasingly does the job well and was one of the first to offer this feature.

Claude 3 can read all the text in an image, calculate graph positions, and suggest image prompts from images, among other features discussed in my review. Gemini and ChatGPT have similar capabilities, and other tools can generate new images from an existing image.

One of the fun quirks of MidJourney's Describe is the ability to create alternate versions of oneself. Each description can be used as a prompt to generate an image. I tried this with my boss and his boss.

In my case, my face changes completely, but the result is closer to the version of me you would get if my life had been made into a TV movie and I were played by an actor who sat in front of a laptop, did as little exercise as possible and spent as little time outdoors as possible.