How Did You Do On The AI Art Turing Test?
Last month, I challenged 11,000 people to classify fifty pictures as either human art or AI-generated images. I originally planned five human and five AI pictures in each of four styles: Renaissance, 19th Century, Abstract/Modern, and Digital, for a total of forty. After receiving many exceptionally good submissions from local AI artists, I fudged a little and made it fifty. The final set included paintings by Domenichino, Gauguin, Basquiat, and others, plus a host of digital artists and AI hobbyists.

If you want to try the test yourself before seeing the answers, go here. The form doesn't grade you, so before you press "submit" you should check your answers against this key.

Last chance to take the test before seeing the results, which are:

…

…

…

1: Most People Had A Hard Time Identifying AI Art

Since there were two choices (human or AI), blind chance would produce a score of 50%, and perfect skill a score of 100%. The median score on the test was 60%, only a little above chance. The mean was 60.6%. Participants said the task was harder than expected (median difficulty 4 on a 1-5 scale).

How meaningful is this? I tried to make the test as fair as possible by including only the best works from each category; on the human side, that meant taking prestigious works that had survived the test of time; on the AI side, it meant tossing the many submissions that had garbled text, misshapen hands, or some similar deformity. But this makes it unrepresentative of a world where many AI images will have these errors.

I also tried to pick human works with a minimum of "tells" that would reveal their humanity without requiring any subtle artistic discrimination. So I stayed away from text (non-garbled text would be a strong sign that a picture was human), complicated wrestling-like poses (AIs mostly can't do these and end up with limbs emerging from nowhere) and pop art (something about the clean lines and replicated images is a bad match for AI's abilities). Again, this makes the test unrepresentative of a world where some art does have these "tells".

Finally, I avoided most AI art in the DALL-E "house style", since everyone already knows this is AI - or in other similar styles that humans would have trouble replicating, maybe because they do too much with color and lighting, in a way that few human artists would have the talent or patience for.

It might be fairest to say that this test demonstrated that most people have a hard time identifying AI art based on subtle differences in style and quality. But in real life, there will usually be other factors of the type that this test deliberately excluded.

2: Most People Couldn’t Help Judging Art By Its Style

I warned test-takers that I included human and AI art in a variety of styles, and that they shouldn’t judge art as human just because it looked like an oil painting, or judge it as AI just because it looked like a digital image.

Respondents didn’t heed my warning. One reason for their poor performance was clumping of results by style (in reality, each style was near-evenly distributed across the two categories).

The “human bias” term indicates what percent of art in each category test-takers identified as human, normalized to a situation where the correct answer was always 50%. So in a 50-50 mix of AI and human 19th century art, they would incorrectly guess it was 75-25 human; in a 50-50 mix of digital art, they would incorrectly guess it was only 31% human.

Your instincts were worst for Impressionism; you identified every single Impressionist painting as human except the sole actually-human Impressionist work in the dataset (Paul Gauguin’s Entrance To The Village Of Osny).

Likewise, huge majorities voted that several human-generated digital images were by AIs:

3: Most People Slightly Preferred AI Art To Human Art

I asked participants to pick their favorite picture of the fifty. The two best-liked pictures were both by AIs, as were 60% of the top ten.

Could this be an artifact of poorly chosen pictures? Most of the best-loved AI images were Impressionist; by chance, this category was somewhat AI-dominated in my dataset, so this could just reflect a love of Impressionist paintings (or a particular aptitude for AI in this area). But the human Impressionist painting I included (Entrance To The Village Of Osny, above) was actually quite unpopular. And if we remove all Impressionist paintings, then although humans reclaim the top two spots, an AI is still #3, and the machines still take 40% of the new top ten.

4: Even Many People Who Thought They Hated AI Art Preferred It

I asked participants their opinion of AI on a purely artistic level (that is, regardless of their opinion on social questions like whether it was unfairly plagiarizing human artists). They were split: 33% had a negative opinion, 24% neutral, and 43% positive.

The 1278 people who said they utterly loathed AI art (score of 1 on a 1-5 Likert scale) still preferred AI paintings to humans when they didn't know which were which (the #1 and #2 paintings most often selected as their favorite were still AI, as were 50% of their top ten).

These people aren't necessarily deluded; they might mean that they're frustrated wading through heaps of bad AI art, all drawn in an identical DALL-E house style, and this dataset of hand-curated AI art selected for stylistic diversity doesn't capture what bothers them.

5: But Others Might Genuinely Be On A Higher Plane Than The Rest Of Us

I asked a friend (who does digital art under the handle “Ilzo”) to beta-test an early version of the challenge. She wowed me with her ability to correctly identify AI pictures that I considered well-camouflaged. When we got to Piotr Binkowski’s ruined gateway - an AI picture I especially liked, but which she found especially slop-ish - I demanded she explain herself. She said:
And later, after the discussion veered more philosophical:
Her theory gets some support from the data. The average participant scored 60%, but people who hated AI art scored 64%, professional artists scored 66%, and people who were both professional artists and hated AI art scored 68%.

The highest score was 98% (49/50), which 5 out of 10,000 people achieved. Even with 10,000 people, getting scores this high by luck alone is near-impossible. I’m afraid I don’t know enough math to tease out the luck vs. skill contribution here and predict what score we should expect these people to get on a retest. But it feels pretty impressive.

So maybe some people hate AI because they have an artist's eye for small inadequacies and it drives them crazy.

What Did We Learn About Art?

Alan Turing recommended that if 30% of humans couldn’t tell an AI from a human, the AI could be considered to have “passed” the Turing Test. By these standards, AI artists pass the test with room to spare; on average, 40% of humans mistook each AI picture for human.

What does this tell us about AI? Seems like they’re good at art.

I’m more interested in what it tells us about humans. Humans keep insisting that AI art is hideous slop. But also, when you peel off the labels, many of them can’t tell AI art from some of the greatest artists in history. I’ve tried to be as fair as possible to these people, proposing that maybe they’re just expressing frustration with the proliferation of the DALL-E house style. And maybe some really do have an amazing eye for tiny incongruous details. But it also seems very human to venerate sophisticated prestigious people, and to pooh-pooh anything that feels too new or low-status or too easy for ordinary people to access - without either impulse connecting with the actual content of the painting in front of you.

Marcel Duchamp famously tried to put a urinal in an art museum to challenge people’s view of what art was. The administration rejected it, but Duchamp had the last laugh: in 2004, a survey of art professionals judged it the most influential artwork of the 20th century. Art, it seems, is most meaningful when it challenges our very concept of what art is.

By this standard, I submit that Sam Altman is the greatest artist of the 21st century.

. . .

Thanks to everyone who took the test. You can download a .xlsx file of the results (stripped of identifying details) here.

Appendix: Attributions For Test Images

1: Angel Woman
Human. This is “Living Saint Hazel” by LJ Koh, as seen at /r/ImaginaryWarhammer. This was the picture that sparked the strongest disagreement, measured by the sum of people who said it was the most-certainly-human picture in the dataset plus the people who said it was the most-certainly-AI picture. Some of the people who got it right commented that it was from Warhammer and the uniforms had accurate Warhammer symbols - if I had realized this, I would have disqualified it, sorry.

2: Saint In Mountains
Human. This is “St. Anthony Abbot Tempted By A Heap Of Gold”, by the “Osservanza Master”, an unknown Italian Renaissance painter from around 1435. Apparently it used to have a heap of gold in the bottom corner tempting St. Anthony, but this was “scraped out”. If I had known that originally, I would have disqualified this one too, since it might spoil something uniquely human about the integrity of the composition.

3: Blue Hair Anime Girl
Human. This is Hatsune Miku, a “virtual idol” from the late 2000s/early 2010s.

4: Girl In Field
AI. This image was generated by Ryan Wise, an AI art hobbyist who reads ACX and responded to my request for good AI pictures.

5: Double Starship
Human. This is “Malabar”, by Wojtek Kapusta.

6: Bright Jumble Woman
AI, also by Ryan.

7: Cherub
AI. This one was generated by another ACX reader, Jack Galler.

8: Praying In Garden
Human. This is “Agony In The Garden” by Andrea Mantegna, 1455.

9: Tropical Garden
Human. This is “Garden” by David Hockney. A very similar Hockney painting sold for $8 million in 2021.

10: Ancient Gate
AI. This is by Piotr Binkowski, a well-known AI art maker who posts his work on his Twitter.

11: Green Hills
AI. Another one by Jack.

12: Bucolic Scene
Human. This is “Dover Plains” by Asher Durand, painted 1848. It depicts the Hudson Valley in New York.

13: Anime Girl In Black
AI. Sorry, I seem to have lost the original source on this one, let me know if it’s yours.

14: Fancy Car
Human. This is “Ferrari Testarossa Neon Retrowave Synth”, by Arslan Safiullin.

15: Greek Temple
Human. This is “The Apotheosis Of Homer”, by Jean-Auguste-Dominique Ingres (1827). It's also the only one that was (sort of) a trick question: after I selected it for the dataset, I noticed it contained text. Normally that would be disqualifying (correct text is too obviously human). But the most prominent text is the “OMHP” on the temple, which spells “Homer” in Greek but is gibberish in English. I was curious how many people would judge a famous work of art to be AI-generated just because it had seemingly gibberish text on it; the answer was 60%.

16: String Doll
AI. This is “Strings Come Alive” by Nikko P at Nightcafe. This was the picture that people were most confident was AI (they were right).

17: Angry Crosses
AI. This is another one by Ryan.

18: Rainbow Girl
Human. This is “Rainbow Hair” by rjv-ilustracion.

19: Creepy Skull
Human. This is “Untitled (Skull)” by Jean-Michel Basquiat in 1981. A version of this painting sold for $110 million in 2017 and was “the priciest work ever sold by a US artist”.

20: Leafy Lane
AI. This is another one by Jack.

21: Ice Princess
AI. This is “Snow Princess” by Ai Xi, seen at PixAI.

22: Celestial Display
Human. This is “Five Minutes Of Silence” by Hangmoon, seen at DeviantArt. This was the top-rated human picture.

23: Mother And Child
AI. This is “Ukrainian Madonna”, generated by TheLibertarianCatholic.
24: Fractured Lady
AI, another one by Ryan.

25: Giant Ship
Human. This is “Victorian Megaship” by Mitchell Stuart. This was the human picture that people got most wrong (ie were most likely to vote as AI).

26: Muscular Man
AI, another one by Ryan.

27: Minaret Boat
AI. This is “Built For The Princess” by Nikko P at Nightcafe.

28: Purple Squares
Human. This is “Fire At Full Moon” by Paul Klee, and is supposed to be a “Cubist style depiction of a night sky”.

29: People Sitting
Human. This is “Tailor’s Workshop” by Quiringh van Brekelenkam, 1660.

30: Girl In White
Human. This is “Portrait of Charlotte du Val d'Ognes” by Marie-Denise Villers (1801). I messed up adding this to the test, so only about half of you saw it.

31: Riverside Cafe
AI, another one by Jack. This was the most popular picture in the dataset.

32: Serene River
Human. This is “Banks Of The Oise At Auvers”, by Charles-François Daubigny (1863).

33: Turtle House
AI. This is “Mobile Home”, by Bellemia, seen on Nightcafe.

34: Still Life
AI, another one by Jack.

35: Wounded Christ
Human. This is “The Mourning Of Christ” by Giovanni Girolamo Savoldo (1515). This was the picture that people were most confident was human (they were right), but a few people protested and said that the anatomy was so wrong that it must be AI-generated. Sorry, I guess Giovanni Girolamo Savoldo just wasn’t very good at anatomy. Maybe that’s why Michelangelo had to dissect all those corpses.

36: White Blob
Human. This is from “Le Lezard aux Plumes d'Or” by Joan Miro (1971).

37: Weird Bird
AI. This is another one from Ryan. People say AI can’t invent new styles, but I’ve never seen any human make this exact type of weird bird.

38: Ominous Ruin
AI, Ryan again.

39: Vague Figures
Human. This is “Blood Thicker Than Mud”, by Cecily Brown (2021).

40: Dragon Lady
AI. This is “To Me, You’re Perfect” by Ria Hagane on Nightcafe, made with DALL-E3.

41: White Flag
Human. This is “Meeting At Krizky” by Alphonse Mucha (1916). It is part of his Slav Epic, a series of paintings on the history of Eastern Europe, and depicts a meeting of the Hussite sect, whose attempts to found a sort of proto-Protestantism would spark the 15th-century Hussite Wars.

42: Woman Unicorn
Human. This is “The Maiden And The Unicorn” by Domenichino (1602).

43: Rooftops
AI. I managed to lose this one, sorry! If it’s yours, let me know and I’ll give you credit.

44: Paris Scene
AI, another one by Jack. This was the AI picture that people got most wrong (ie were most likely to vote as human).

45: Pretty Lake
AI, Jack again.

46: Landing Craft
AI, Ryan again. Ryan gave me lots of good sci-fi AI images, and I chose this one. People got it pretty easily, and I keep second-guessing myself and wondering if some of the others were better.

47: Flailing Limbs
Human. This is “Replacement Parts” by Kara Walker, who is considered among The 25 Best Collage Artists In The World.

48: Colorful Town
Human. This is “Entrance To The Village Of Osny” by Paul Gauguin, 1882.

49: Mediterranean Town
AI. Another one by Jack.

50: Punk Robot
AI. Another one by Ryan.
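
A note on the luck question from section 5: the "luck" half, at least, can be sketched with a binomial tail. The snippet below is a rough illustration, not part of the original analysis. It assumes a pure guesser answers each of the 50 questions as an independent fair coin flip, and it uses the 10,000-participant figure quoted in that section. The function name `tail_prob` is just an illustrative choice. It says nothing about the harder half of the question, which is what score a skilled person should expect on a retest.

```python
from math import comb


def tail_prob(n: int, k: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k correct by guessing."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_QUESTIONS = 50    # pictures in the test
N_PEOPLE = 10_000   # participant count quoted in section 5 (assumed here)

# Chance that one pure guesser gets 49 or 50 right.
p_49 = tail_prob(N_QUESTIONS, 49)

# Expected number of pure guessers reaching that score among all participants.
expected_lucky = N_PEOPLE * p_49

print(f"P(>= 49/50 by guessing)       = {p_49:.3e}")
print(f"Expected lucky guessers of {N_PEOPLE} = {expected_lucky:.3e}")
```

Under these assumptions the per-person chance works out to 51 / 2^50, on the order of 10^-14, so even with ten thousand guessers the expected number of 49/50 scores from luck alone is far below one. Real test-takers are not coin flips, of course, so this only bounds the pure-luck story.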