Over the past few months, the debate over AI-generated art has alternately inflamed, scared, and bored most of the people who pay attention to it. Gleeful venture capitalists say AI means they will never have to pay a living creative person ever again, while defenders of art for art’s sake claim a computer could never understand Truth and Beauty.
But it isn’t just the computers doing the work: There’s also human input. On the one hand, the raw material used to train AI image generators is art made by human artists, which has led to lawsuits. On the other hand, prompting an AI tool to generate an image is more complicated than simply telling the computer to draw something. A good human prompter must learn the ins and outs of how each model works and nudge it into producing the kind of image they want.
Recent research shows that Americans who are learning about AI tools are mostly teaching themselves, often through sources and communities found online. And the best prompt engineers seem to be on Reddit. There’s one big subreddit for each kind of generator, including Midjourney, Stable Diffusion, and DALL-E 3, along with several others where users debate, post, and refine prompts. A connected universe of wikis, YouTube tutorials, and influencers fleshes out the emerging institutional world of AI-generated art.
Take an image known as “Spiral Town,” generated by a user known as Ugleh and posted to the StableDiffusion subreddit this September. Many of the comments on the original Spiral Town post are people telling Ugleh where they first saw the viral image: “a shrooms facebook group,” says one, while another lists other non-AI subreddits. Ugleh seems ambivalent about the attention: “I’m fine with it tbh. I only spent about 10 minutes on this photo.”
But as others praise Ugleh and post links to their own YouTube tutorials on how to make images similar to Spiral Town, some commenters double down on the argument that Ugleh should be treated as a “real” artist. Sure, generating the Spiral Town image may have taken minutes, but that doesn’t mean that creating such works doesn’t require skill — in fact, much of the subreddit’s audience seems to be people trying to develop these very abilities. Almost every post on the Stable Diffusion subreddit has a flair next to its title that says “Workflow Included,” meaning it explains the procedure used to create the image.
An Ugleh piece made three days later seems to have taken more than 10 minutes. The checkerboard image below was created starting with the deceptively simple prompt “Medieval village scene with busy streets and castle in the distance,” followed by fifteen lines of complicated and sometimes indecipherable modifiers, including one instructing the AI not to render the image like a “bad anime.”
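Ugleh’s exact fifteen lines aren’t reproduced here, but prompts like this are typically built the same way: a base description, a comma-separated list of style modifiers (often weighted with the “(term:weight)” syntax popularized by Stable Diffusion front-ends such as AUTOMATIC1111’s web UI), and a negative prompt listing what to avoid. A minimal sketch in Python, with invented modifiers standing in for Ugleh’s actual settings:

```python
# Sketch of how a long Stable Diffusion prompt is typically assembled.
# The modifier list and weights below are invented for illustration,
# not Ugleh's real workflow.

def build_prompt(base, modifiers):
    """Join a base description with (term:weight) style modifiers."""
    weighted = [f"({term}:{weight})" for term, weight in modifiers]
    return ", ".join([base] + weighted)

prompt = build_prompt(
    "Medieval village scene with busy streets and castle in the distance",
    [("masterpiece", 1.2), ("intricate detail", 1.1), ("warm lighting", 1.0)],
)
# Things the model should steer away from, in a separate negative prompt.
negative_prompt = "bad anime, blurry, deformed, low quality"

print(prompt)
# Medieval village scene with busy streets and castle in the distance, (masterpiece:1.2), (intricate detail:1.1), (warm lighting:1.0)
```

A front-end feeds the positive and negative strings to the model separately; the weights tell it how strongly each modifier should pull on the result.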
The two images of Medieval towns were made by Ugleh using QR Monster, an add-on tool to Stable Diffusion. QR Monster is based on ControlNet, a “neural network architecture” released by researchers last February, which lets users condition and control Stable Diffusion’s output far more precisely.
ControlNet allows a user to give Stable Diffusion a second input alongside the text prompt, typically an image whose shape the finished picture must follow, and the model satisfies both at the same time. The developers behind QR Monster used this capability to create a package that makes the AI generate a scannable QR code and a separate image simultaneously: if you were running a pizza restaurant, for example, you could make an AI image of a pizza in which the pepperoni formed a working QR code for your menu.
Ugleh used QR Monster to create “Spiral Town,” but instead of a QR code, he supplied a spiral as the second input to be worked into the image of the medieval town. QR Monster packaged the “make two images at the same time” functionality in a way meant to be easy for businesses to use, but AI art enthusiasts took the tech and ran with it.
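For readers curious what that workflow looks like under the hood, here is a rough sketch of QR-Monster-style generation using the open-source diffusers library. The model IDs, file names, and parameter values are illustrative assumptions, not Ugleh’s actual settings, and running it for real requires a GPU and a download of the model weights:

```python
def generate_spiral_town(prompt, control_image_path, scale=1.3):
    """Sketch: combine a text prompt with a control image (a spiral,
    a QR code, a logo) using Stable Diffusion plus a ControlNet.
    Model IDs and parameter values here are illustrative assumptions."""
    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # The QR Monster ControlNet checkpoint distributed on the
    # Hugging Face Hub (exact repository ID may differ).
    controlnet = ControlNetModel.from_pretrained(
        "monster-labs/control_v1p_sd15_qrcode_monster",
        torch_dtype=torch.float16,
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    control = Image.open(control_image_path)  # the "second prompt"
    result = pipe(
        prompt,
        image=control,
        # How strongly the spiral/QR shape dominates the composition;
        # higher values make the shape legible at the cost of realism.
        controlnet_conditioning_scale=scale,
        num_inference_steps=30,
    )
    return result.images[0]
```

The conditioning scale is the knob Ugleh-style experimentation turns: too low and the spiral vanishes into the village, too high and the village collapses into a spiral.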
This fits into a broader pattern of AI artists turning against anything that smells of business. A week after Ugleh’s projects made the QR Monster add-on viral, a user named Pintjaguar produced a set of images that used the tool to embed company logos for Nike, Bayer, and Exxon inside AI-generated images of deforestation, sweatshops, and oil spills. While the political gesture might be ham-handed, the sentiment is widespread: The tight-knit AI art community is seeking to develop the technology on its own terms and in its own way.
It also goes beyond QR Monster. The user who generated a viral AI image of Pope Francis wearing a white Balenciaga jacket using Midjourney instructed the AI to render the photo as if it had been “taken using a Canon EOS R camera with a 50mm f/1.8 lens, f/2.2 aperture, shutter speed 1/200s, ISO 100 and natural light,” along with a series of other conditions and specific requirements for its appearance. In an interview with BuzzFeed, the user shared that he was a construction worker from the Chicago area who was raised Catholic and got into making AI art while “dealing with grief” after the death of his brother.
Whether all this prompt engineering and crowdsourced code development counts as art is another question. It is, at least, craft: a technical ability spread out among a community of people who aren’t professionals but truly love tinkering with the models for their own enjoyment. The story of AI art isn’t just a story of robots and tech bros — though there certainly are many of those — but of the people behind these strange, futuristic visuals. Nobody is paying Redditors to make random pictures of spirals, but they still do it because they love to do it. That might be the most artful thing about AI art.