Essay: Artificial intelligence in the visual arts: Danger and opportunity

An image of a turtle outrunning a rabbit generated by SDXL 1.0 using the Anime v2 preset, with seed 1692461269, sampling method K_DPMPP_2M, no CLIP guidance.

It's already happening: human artists are being asked to clean up the mistakes made by so-called "generative" "artificial intelligence." Or, perhaps more insulting, they're being asked to repeat in a traditional medium what an A.I. has already created on the screen.

An A.I. like DALL-E 3 creates something new only in the sense that the exact image it produces almost always differs in a tangible way from any image that has been produced before. So it's original, but that's not the same thing as "good."

The purveyors of A.I. encourage users to think of themselves as "artists." But typing some words into a box (writing a "prompt") and then pressing a Create button doesn't make you an artist. If you need an image for an important purpose, you'll probably need to hire a real artist.

It is impressive that an A.I. can take the words of a prompt and produce something that recognizably corresponds to them. It does also happen sometimes that an A.I. produces an image that looks very good, is in the style you were expecting and correctly corresponds to the prompt.

Much more frequently, however, the images produced look good but don't quite correspond to the prompt, or they do correspond very well to the prompt but are not in the style you were hoping for, or are even just plain bland. It also happens quite often that one or more words in your prompt seem to have been simply ignored. And once in a blue moon, the A.I. seems to just completely ignore your prompt.

Let's say for example that you want a picture of Justin Timberlake playing the violin. The A.I. probably starts by retrieving several images of Justin Timberlake and several images of violins. That's probably where a human artist would start, if he or she is not familiar with Justin Timberlake or violins.

The next part is where human artists and A.I. differ considerably. If the A.I. uses a generation algorithm of the stable diffusion type, the next step might be to gradually add random noise, a little at a time, until the result looks like static. And then, somehow, from the static, the A.I. wrests a recognizable image.
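To make the "static" idea a little more concrete, here is a toy sketch in Python of the noising direction only. It is not how SDXL or any real generator is implemented; the function names, the `strength` value and the tiny row of "pixels" are all made up for illustration. It also shows why the captions in this essay quote a seed: the same seed always produces the same noise.

```python
import random

def add_noise(pixels, steps, seed, strength=0.2):
    """Blend a little Gaussian noise into the pixels at every step.

    Hypothetical toy: 'strength' is an assumed per-step noise amount,
    not a setting taken from any real image generator.
    """
    rng = random.Random(seed)      # same seed -> same sequence of noise
    keep = (1.0 - strength) ** 0.5  # how much of the current image survives a step
    mix = strength ** 0.5           # how much fresh noise is blended in
    noisy = list(pixels)
    for _ in range(steps):
        noisy = [keep * p + mix * rng.gauss(0.0, 1.0) for p in noisy]
    return noisy

def signal_left(steps, strength=0.2):
    """Fraction of the original image still present after 'steps' steps."""
    return (1.0 - strength) ** (steps / 2)

# A made-up row of eight grayscale pixels, dark on the left, bright on the right.
row = [-1.0, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.0]

static_a = add_noise(row, steps=50, seed=1364152073)
static_b = add_noise(row, steps=50, seed=1364152073)
static_c = add_noise(row, steps=50, seed=1692461269)

print(static_a == static_b)  # same seed: identical static
print(static_a == static_c)  # different seed: different static
print(signal_left(50))       # almost nothing of the original row remains
```

The hard part, getting a recognizable image back out of the static, is what a trained model is for, and this sketch does not attempt it.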

Image of Justin Timberlake playing the violin generated by SDXL 1.0 using the Modern Comic preset, with seed 1364152073, sampling method K_DPMPP_2M, no CLIP guidance.

In this example image, you can recognize Justin Timberlake and you can recognize that he's playing the violin. It's actually quite good, if you don't scrutinize where his right hand is relative to the bow. And please don't scrutinize the background!

If a picture is worth a thousand words, then maybe a thousand words are needed to generate an image. The prompts for some of the best A.I.-generated images are very elaborate and intricate, and they use a special syntax that is not all that intuitive even to professional computer programmers.

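Also, there are several decisions for human operators to make before the A.I. starts its work, such as the seed, the sampling method and whether to use CLIP guidance, the settings noted in the captions of the generated images in this essay.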

NightCafé offers a lot of help with these settings, by bundling some of the most popular choices into presets.

When a human artist makes a mistake, the patron can usually explain to the artist what the intended artwork was supposed to look like and how the produced artwork fell short. When the A.I. fails to produce what you want, how do you explain it in terms of random numbers and sampling algorithms? I don't know.

Sometimes you can't even figure out why the A.I. messed up. The first time I noticed the violinists in the background of Justin Timberlake playing the violin, I was completely puzzled as to why they were playing fragments of broken violins.

It wasn't until much later that it occurred to me that the A.I. misunderstood photos of violinists with other violinists behind them. Let's say there are four violinists playing, seated in a row, and behind them there is a row of another four violinists.

From your vantage point, your view of the back row musicians' instruments is obscured, but you understand that they're also playing violins. Or maybe they're playing violas; many people can't tell the difference visually between violins and violas. But you understand that the back row musicians are playing instruments that are very similar if not the same in shape and size.

Look at this stock photo of women violinists. Look at the woman in the center. The camera's view of her instrument was partially obstructed by a music stand with sheet music on it.

Photo by Lucas Oliver of women playing violins in an orchestra. See the external links below to download a higher-resolution copy of this image.

An A.I. looking at this image might actually conclude that the woman in the center and the woman next to her are playing different instruments. The woman in the center would be misunderstood to be playing a violin fragment. Quite likely something similar happened with one of the source images that SDXL used to produce the image of Justin Timberlake playing the violin.

By the way, the stock photo is free from Pexels, but Pexels did suggest I could donate. I obtained a JPEG sized at 6,000 by 4,000 pixels, which I downsampled to 360 by 240 only because I want it to load quickly. A similar image from Adobe Stock or Getty Images could cost me $50 easy, and a lot more on the latter.

If one of your photos or paintings is used as an input to an A.I. image generator, are you entitled to any monetary compensation? Some artists say that any use of an artist's images by A.I. is theft, just another way to deny artists their rightful wages. Though the argument could also be made that it's not all that different from another human artist drawing on your work for inspiration.

The main fear with A.I. image generators is of course that artists are going to be put out of work. Why pay Getty Images $500 for a high quality image made by a human artist when you can maybe get a good enough image for less than $1 from an A.I.?

The danger of human artists being replaced might be overblown. Nevertheless, it has already caused a lot of nuisance, and a cheapening of real art by real artists.

But there's also an opportunity here to educate the public about what makes real art: how it comes about through a combination of a human artist's natural aptitude, lived experiences, training and practice, and an understanding of the work of other artists working in similar media or genres. That can't be simulated by computers rolling dice.

"There is no art without intention," Duke Ellington once said. Maybe that explains why art generated by "artificial intelligence" is so often lacking in certain qualities that we find in the best art by humans and even in the mediocre art of people who would much rather delegate the creation to an A.I.

External links